Semi-Literate JOD



Despite seven decades of programming experience, documenting software remains a challenge. There are many reasons for this sorry state of affairs, the most important being that programmers simply do not agree on the need for documentation. As pathetic as this sounds, it’s not without merit. It all depends on what you call “documentation.”

Writing technical documents for management, marketing or users usually results in excruciating rounds of Dilbertian critiques. Everyone understands your code better than you do. If you provide too much detail, you get complaints. If you use unfamiliar words, you get complaints. If you point out limitations, assumptions or caveats, you get complaints. If you assume basic 8th grade reading levels, you get complaints. If you use nonstandard fonts or unauthorized style templates, you get complaints. No wonder many programmers hate “documentation” and blow off the entire problem by making ludicrous claims about “self-documenting code.” The self-documenting cabal may have fooled management but they’re not fooling the rest of us. The need for illuminating program documentation is as pressing today as it was for ENIAC coders in the 1940s and, when it comes to illuminating documentation, the best overall approach was pioneered by Donald Knuth over twenty-five years ago and goes by the moniker literate programming.

Providing basic literate programming support in JOD has been on my to-do list for ages. I’ve held off until recently because I have never been happy with my markup options. JOD directly supports simple, J scriptdoc-compatible leading comment block formatting. For example, many of my J verbs start with a comment block like:

betweenstrs=:4 : 0

NB.*betweenstrs v-- select sublists between  nonnested delimiters
NB. discarding delimiters.
NB.
NB. dyad:  blcl =. (clStart;clEnd) betweenstrs cl
NB.        blnl =. (nlStart;nlEnd) betweenstrs nl
NB.
NB.   ('start';'end') betweenstrs 'start yada yada end boo hoo start ahh end'
NB.
NB.   NB. also applies to numeric delimiters
NB.   (1 1;2 2) betweenstrs 1 1 66 666 2 2 7 87 1 1 0 2 2

's e'=. x
llst=. ((-#s) (|.!.0) s E. y) +. e E. y
mask=. ~:/\ llst
(mask#llst) <;.1 mask#y
)
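
If you don’t have J handy, a rough Python analogue of the same idea may help. This is only a sketch: the name between is hypothetical, and its handling of edge cases (unterminated or empty delimiters) is an assumption of the sketch, not a claim about how the J verb behaves in every corner.

```python
def between(start, end, seq):
    """Return the pieces of seq lying between non-nested start/end delimiters.

    Works on strings and on lists, mirroring the two examples in the
    J comment block. The delimiters themselves are discarded.
    """
    if len(start) == 0 or len(end) == 0:
        raise ValueError("delimiters must be non-empty")

    pieces = []
    n, ls, le = len(seq), len(start), len(end)
    i = 0
    while i <= n - ls:
        if seq[i:i + ls] == start:
            # Found a start delimiter: scan forward for the matching end.
            j = i + ls
            while j <= n - le and seq[j:j + le] != end:
                j += 1
            if j > n - le:
                break  # unterminated start delimiter: stop (sketch assumption)
            pieces.append(seq[i + ls:j])
            i = j + le  # resume after the end delimiter
        else:
            i += 1
    return pieces


print(between("start", "end", "start yada yada end boo hoo start ahh end"))
# [' yada yada ', ' ahh ']
print(between([1, 1], [2, 2], [1, 1, 66, 666, 2, 2, 7, 87, 1, 1, 0, 2, 2]))
# [[66, 666], [0]]
```

The J verb gets the same result without a loop: it marks the item just after each start delimiter and the first item of each end delimiter, takes a running parity over those marks, and cuts the selected items into boxes.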

Even if you can’t spell J, I bet you have a good idea about what this “program” does and, if you doubt my claims, I’ve left you with some examples to try the next time you find yourself in J. Stupid comments may be for losers but telling comments, especially example-laden ones, really help! And, if you really find comments distracting, JOD has a deal for you!

   ;1{compj 'betweenstrs' 
betweenstrs=:4 :0
's e'=.x
a=.((-#s )(|.!.0)s E.y)+.e E.y
b=.~:/\a
(b#a)<;.1 b#y
)

compj purges pesky comments and reduces tedious long identifiers like mask to pure compact J. Getting rid of comments is trivial; putting them back in, not so much! JOD’s simple comment block formatting has been very effective but it’s hardly literate programming.
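
To show just how trivial the “getting rid of” half is, here is a small Python sketch of comment stripping for J source. It is not how compj is implemented: the function name is hypothetical, it assumes any NB. outside a single-quoted string starts a comment, and it makes no attempt at the identifier shortening that compj also performs.

```python
def strip_j_comments(source: str) -> str:
    """Drop NB. comments and blank lines from J source text.

    Simplifying assumption: every "NB." found outside a single-quoted
    string literal begins a comment that runs to the end of the line.
    """
    kept = []
    for line in source.splitlines():
        in_string = False
        cut = len(line)
        for i, ch in enumerate(line):
            if ch == "'":
                # A doubled quote inside a J string toggles twice,
                # so the in-string state stays correct.
                in_string = not in_string
            elif not in_string and line.startswith("NB.", i):
                cut = i
                break
        code = line[:cut].rstrip()
        if code:  # skip lines that were blank or pure comment
            kept.append(code)
    return "\n".join(kept)
```

Going the other way would mean recovering the author’s intent from bare code, which is why the comments are worth keeping in the first place.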

Literate programming requires more muscle. Knuth used his own TeX. TeX and LaTeX are certainly up to the job, as are many HTML and XML approaches. Unfortunately, all these markup formats suffer from “distracting taggyness.” I can tolerate LaTeX but HTML and XML drive me nuts. Yes, there are perfectly fine editors for all these formats, but remember, we are inserting the resulting text into code that we will be looking at for the rest of our miserable coding lives! We need a markup format that’s stable, readable, versatile, easy to use and, this is very important, easy to ignore! Markdown is such a format. It’s almost ideal for program comments and is capable of much more. I’ve started using markdown in JOD and it’s already paying its way.

jodliterate.ijs is a J utility script that can generate semi-literate LaTeX documents directly from JOD groups. It uses a version of pandoc with J syntax highlighting; see Pandoc based J Syntax Highlighting for details. I consider jodliterate semi-literate because it’s completely at the mercy of the programmer. If you don’t store coherent markdown text fragments in JOD, all you get is a nice syntax-highlighted listing. But, if you actually write about your group, jodliterate can produce essential documents. jodliterate.pdf is an example of this tool being used on itself. Self reference always makes an excellent test case. jodliterate will be included in the next JOD release. Until then you can download the J script from this directory. As always, referenced files are available in the files sidebar. Enjoy!

Turn your Blog into an eBook

If you have worked through the exhausting procedure of converting your blog to LaTeX (see posts (1), (2) and (3)), you will be glad to hear that turning your blog into an image-free eBook is almost effortless. In this post I will describe how I convert my blog into EPUB and MOBI eBooks.

eBooks: how the cool kids are reading

eBook readers like Kindles, Nooks, iPads and many cell phones are optimized for plain old prose. They excel at displaying reflowable text in a variety of fonts, sizes and styles. One eBook reader feature, dear to my old fart eyes, is the ability to increase the size of text. All eBooks are potentially large print editions. There are other advantages: most readers can store hundreds, if not thousands, of books, making them portable libraries. It’s now technically possible to hand a kindergarten student a little tablet that holds every single book he will use from preschool to graduate school. The only obstacle is the rapacious textbook industry and their equally rapacious eBook publishing enablers. But fear not: open source man will save the day. The days of overpriced digital goods are over! I will never pay more than a few bucks for an eBook because I can make my own and so can you! Let’s get together and kill off another industry that so has it coming!

PDFs, EPUBs and MOBIs

Native eBook file formats like EPUB and MOBI do not handle complex page layouts well. If your document contains a lot of mathematics, figures and well-placed illustrations, stick with PDF workflows.[1] You will save yourself and your readers a lot of grief. But, if your document is a prose masterpiece, a veritable great American novel, then “publishing” it as an EPUB or MOBI is a great way to target eBook readers. EPUBs and MOBIs can be compiled from many sources. I start with the LaTeX files I created for the PDF version of this blog because I hate doing the same boring task twice. By far the most time-consuming part of converting WordPress export XML to LaTeX is editing the pandoc-generated *.tex files to resolve figures and fix odd run-together-words and paragraphs. To preserve these edits I use pandoc to convert my edited *.tex to *.markdown files.

Markdown

Markdown is a very simple text-oriented format. A markdown file is completely readable exactly the way it is. All you need is a text editor. Even text editors are overkill. You could compose markdown with early 20th century mechanical typewriters; it’s a low tech format for the ages: perfect for prose.

The J verb MarkdownFrLatex [2] calls pandoc and converts my *.tex files to *.markdown. I place my markdown in the directory

c:/pd/blog/wp2epub

and, to track changes to my markdown files, I keep this directory under Git. MarkdownFrLatex strips out image inclusions and removes typographic flourishes. When it succeeds, it writes a simple markdown file; when it fails, it writes a *.baddown file. Baddown files are *.tex files that contain lstlistings and complex figure environments that are best resolved with manual edits. After removing such problematic LaTeX environments, the J verb FixBaddown calls pandoc and turns baddown files into markdown files.
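
The triage described above can be sketched in a few lines of Python. This is not the J code in TeXfrWpxml.ijs: the function names are hypothetical, the textual check for problem environments is an assumption about what counts as a failure, and the stripping of images and flourishes is left out. The pandoc options used (-f, -t, -o) are the ordinary format and output switches.

```python
from pathlib import Path

# Environments the post names as troublesome. Matching on "\begin{figure"
# also catches starred variants such as figure*.
PROBLEM_MARKERS = (r"\begin{lstlisting}", r"\begin{figure")


def target_suffix(tex_source: str) -> str:
    """Pick .baddown for files needing manual edits, .markdown otherwise."""
    if any(marker in tex_source for marker in PROBLEM_MARKERS):
        return ".baddown"
    return ".markdown"


def pandoc_command(tex_path: Path, out_dir: Path) -> list[str]:
    """Build the pandoc call that converts one LaTeX file to markdown."""
    out_path = out_dir / tex_path.with_suffix(".markdown").name
    return ["pandoc", "-f", "latex", "-t", "markdown",
            "-o", str(out_path), str(tex_path)]
```

A driver would read each *.tex file, call target_suffix on its text, and either run the command from pandoc_command (for example with subprocess.run) or copy the file aside as a *.baddown for hand editing.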

Generating EPUB and MOBI files

When the conversion to markdown is complete I run MainMarkdown to mash all my files into one large markdown file with an eBook header. The eBook header for this blog is:

% Analyze the Data not the Drivel
% John D. Baker

The first few lines of the consolidated bm.markdown file are:

% Analyze the Data not the Drivel
% John D. Baker

#[What’s In it for
Facebook?](https://bakerjd99.wordpress.com/2009/09/05/whats-in-it-for-facebook/)

-------------------------------------------------------------------------------------------------

*Posted: 05 Sep 2009 22:44:50*

[Facebook](http://www.facebook.com) is huge: they brag about a user
count well north of one hundred million. If only 0.5% of their users are
active that’s 500,000 *concurrent users.* How many expensive servers
does it take to support such a load? .....
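
The consolidation step is simple enough to sketch in Python. This is an illustration, not the MainMarkdown verb itself: the function name and the choice to pass posts in as strings are assumptions. The header follows pandoc’s title block convention of leading % lines for title and author.

```python
def build_book(title: str, author: str, posts: list[str]) -> str:
    """Join markdown posts into one document under a pandoc title block."""
    header = f"% {title}\n% {author}\n"
    # Trim each post and separate posts with a blank line so that
    # headings never run into the preceding paragraph.
    body = "\n\n".join(post.strip() for post in posts)
    return header + "\n" + body + "\n"


book = build_book(
    "Analyze the Data not the Drivel",
    "John D. Baker",
    ["# First post\n\nSome text.\n", "# Second post\n\nMore text.\n"],
)
print(book.splitlines()[0])
# % Analyze the Data not the Drivel
```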

Generating an EPUB from bm.markdown is a simple matter of opening up your favorite command line shell and issuing the pandoc command:

pandoc -S --epub-cover-image=bmcover.jpg -o bm.epub bm.markdown

You can read the resulting EPUB file bm.epub on any EPUB eBook reader. Here’s a screenshot of bm.epub on my iPhone.

iPhone loaded with my blog


The last step converts bm.epub to bm.mobi. MOBI is a native Kindle format. Pandoc can generate MOBI from bm.markdown but it inexplicably omits a table of contents. No problemo: I use Calibre to convert bm.epub to bm.mobi. Calibre properly converts the embedded EPUB table of contents to MOBI. Here’s bm.mobi on a Kindle.

Kindle loaded with my blog

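
Both steps can be scripted. The Python sketch below is one way to do it under two assumptions: the pandoc flags are copied verbatim from the command line above, and Calibre’s command-line converter ebook-convert is on the PATH (the post does not say whether the conversion was done from the Calibre GUI or the command line). The function names are hypothetical.

```python
import subprocess


def epub_command(markdown="bm.markdown", cover="bmcover.jpg", epub="bm.epub"):
    """pandoc call for the EPUB, with the flags used in the post.

    Note: -S was pandoc's smart-punctuation switch when the post was
    written; pandoc 2.x and later dropped it, so recent versions will
    reject it.
    """
    return ["pandoc", "-S", f"--epub-cover-image={cover}", "-o", epub, markdown]


def mobi_command(epub="bm.epub", mobi="bm.mobi"):
    """Calibre's ebook-convert picks formats from the file extensions."""
    return ["ebook-convert", epub, mobi]


def build(run=subprocess.run):
    """Run both steps in order, stopping at the first failure."""
    for cmd in (epub_command(), mobi_command()):
        run(cmd, check=True)
```

Calling build() needs pandoc and Calibre installed; passing a different run function lets you check the commands without executing anything.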

All the “published” versions of this blog are available on the Download this Blog page, so please help yourself!


[1] LaTeX is usually compiled to PDF, making it one of hundreds of PDF workflows.

[2] All the J verbs referenced in this post are in the script TeXfrWpxml.ijs.