Ask HN: Best working file format for writing open source books?

A friend and I are starting to write a book, which we intend to publish and distribute ourselves under a Creative Commons license. Since we're both programmers, we want to work in a plain-text format and keep versions in Git. We'd like to (reasonably) easily output formats like HTML, PDF, print, eBook, etc. while also keeping the files easy to edit. What do you recommend as a working file format? I have considered LaTeX and HTML but wonder if there isn't something better.

  • I used DocBook SGML (and later XML) for my book about Webmin. It wasn't worth the pain.

    I've worked on another large Open Source project that used Word (yes, really!) and OpenOffice with custom templates and conversion tools (they would, for the most part, save as HTML, and then convert from that to PyDoc, .chm, and several other formats).

    I later converted the Webmin book into wiki format (TWiki, specifically), and now we're in the midst of moving it to Markdown (with extensions). I like Markdown best of all, and it is now what we're using for everything (forums, tickets in the issue tracker, and all documentation).

    Don't underestimate the cost of hard-to-deal-with formats. DocBook produced beautiful output in many formats, and was useful for pulling data out automatically. But the effort involved in writing the docs was atrocious. Even with good tools (most competent XML-capable editors can work with DocBook pretty well... I used jEdit in the end), you'll produce half the docs with twice the frustration. With Markdown, when you need more control you can drop to straight HTML.

    Markdown works great in revision control systems, and is easy to generate programmatically (we've started exporting our API doc PODs to Markdown, for example). Nearly everything supports it, as well. Our new CMS is Drupal, and it has great Markdown support, and there are converters from various wikis (we have more than a thousand printed pages' worth of docs in two different wikis to be converted).

    I can wholeheartedly recommend it. Textile is another popular minimal markup that I imagine has pretty much all the same advantages as Markdown.

    Pandoc might be worth a look, too. I haven't had need for it, yet, but it's got an awesome feature set, and can do PDF from Markdown (so I probably will use it soon, as we get pretty frequent PDF requests).
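
    For what it's worth, here is a minimal sketch of what a Pandoc-based build could look like, assuming pandoc is on the PATH and that PDF output has a LaTeX engine available. The book/ and build/ directories, the "book" base name, and both function names are made up for illustration; only the pandoc options themselves (--standalone, --toc, -o) are real.

```python
import shutil
import subprocess
from pathlib import Path

# Formats from the original question; pandoc picks the writer
# from the extension of the -o target.
OUTPUT_FORMATS = ("html", "epub", "pdf")


def pandoc_command(sources, output_format, out_dir="build", name="book"):
    """Return the pandoc argument list for one output format.

    out_dir and name are hypothetical defaults, not pandoc conventions.
    """
    target = Path(out_dir) / f"{name}.{output_format}"
    # --standalone emits a complete document rather than a fragment;
    # --toc adds a table of contents.
    return ["pandoc", "--standalone", "--toc", "-o", str(target),
            *[str(src) for src in sources]]


def build_all(source_dir="book", out_dir="build"):
    """Convert every Markdown chapter in source_dir to each output format."""
    # Sorting keeps chapters in order if they are named 01-intro.md, 02-..., etc.
    sources = sorted(Path(source_dir).glob("*.md"))
    if not sources:
        raise SystemExit(f"no Markdown files found in {source_dir}/")
    if shutil.which("pandoc") is None:
        raise SystemExit("pandoc is not installed or not on the PATH")
    Path(out_dir).mkdir(exist_ok=True)
    for output_format in OUTPUT_FORMATS:
        subprocess.run(pandoc_command(sources, output_format, out_dir), check=True)
```

    Calling build_all() from the repository root would then write build/book.html, build/book.epub and build/book.pdf from the same Markdown sources, which keeps Git as the single source of truth.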

  • I have used reStructuredText (which both "docutils" and "prest" can use to generate documents). The source text doubles as the plain-text "output", and it is still possible to automatically generate HTML, etc. from it.

    For something as complex as a book, reStructuredText is best; though I often fall back on the simple Textile format for really short documents that still need to be converted to HTML.

  • LaTeX is pretty standard for typesetting math/CS texts of any length. You can use your favorite version control system, and write a Makefile to build a PDF/PS document.

    If you want to make it easily readable on a webpage (and not just a PDF download), you may want to use HTML.
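
    The Makefile approach mentioned above mostly comes down to a timestamp check followed by a pdflatex run. A rough Python equivalent is sketched below, assuming pdflatex is installed; book.tex and the function names are placeholders, and a real book with a bibliography or index would need extra steps that this leaves out.

```python
import subprocess
from pathlib import Path


def needs_rebuild(target, sources):
    """Mimic make: rebuild if the target is missing or older than any source."""
    target = Path(target)
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(Path(src).stat().st_mtime > target_mtime for src in sources)


def build_pdf(main_tex="book.tex", source_dir="."):
    """Run pdflatex on main_tex if any .tex file changed; return True if it ran."""
    sources = sorted(Path(source_dir).glob("*.tex"))
    pdf = Path(source_dir) / Path(main_tex).with_suffix(".pdf").name
    if not needs_rebuild(pdf, sources):
        return False
    # Two passes so cross-references and the table of contents can settle.
    for _ in range(2):
        subprocess.run(
            ["pdflatex", "-interaction=nonstopmode", "-halt-on-error", main_tex],
            cwd=source_dir,
            check=True,
        )
    return True
```

    A plain Makefile does the same job in fewer lines; the point is only that the build is a small, scriptable step that sits comfortably next to the sources in version control.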

  • Thanks, everyone, for the insightful answers. You've given me some good ideas to look into.