Literate Programming

Introductions:

Literate Programming is a programming methodology in which a program is its own documentation.

It is a way to organize code so that it matches the way humans think and explain and present ideas, rather than matching the way a compiler expects ideas to be presented to it. --AsimJalis

From one source you can produce both the documentation and the executable code. A good example of what can be done with this methodology can be seen in the book TexTheProgram.
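As a rough illustration of "one source, two outputs", here is a minimal sketch of the extraction step (often called "tangling"). The notation is an assumption made for this example: a simplified, noweb-like syntax in which `<<name>>=` opens a code chunk and a lone `@` closes it. The helper names `parse` and `tangle` are hypothetical and do not reproduce any real tool's implementation.

```python
import re

# A tiny literate source: prose lines mixed with named code chunks.
# "<<name>>=" opens a chunk and a line containing only "@" closes it
# (a simplified, noweb-like notation assumed for this sketch).
SOURCE = """\
The program greets the reader. We present the top level first
and fill in the details afterwards.
<<main>>=
<<define greet>>
print(greet("world"))
@
Greeting is a one-line function; it is explained after its use
because that is the order in which a reader meets it.
<<define greet>>=
def greet(name):
    return "hello, " + name
@
"""

CHUNK_DEF = re.compile(r"^<<(.+)>>=$")      # start of a named chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+)>>$")  # reference to another chunk


def parse(source):
    """Split the source into prose lines and a dict of named code chunks."""
    chunks, prose, current = {}, [], None
    for line in source.splitlines():
        match = CHUNK_DEF.match(line)
        if match:
            current = match.group(1)
            chunks.setdefault(current, [])
        elif line == "@" and current is not None:
            current = None
        elif current is not None:
            chunks[current].append(line)
        else:
            prose.append(line)
    return prose, chunks


def tangle(chunks, name, indent=""):
    """Expand chunk `name` recursively into the order the interpreter needs."""
    out = []
    for line in chunks[name]:
        ref = CHUNK_REF.match(line)
        if ref:
            # Replace the reference with the referenced chunk's lines,
            # keeping the indentation of the referencing line.
            out.extend(tangle(chunks, ref.group(2), indent + ref.group(1)))
        else:
            out.append(indent + line)
    return out


prose, chunks = parse(SOURCE)
code = "\n".join(tangle(chunks, "main"))
print(code)
```

The `prose` list is the raw material for the documentation side; a real "weave" step would also typeset the code chunks and cross-reference them, which this sketch leaves out.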

Note that "documentation" means a description of the implementation, not a user manual. For example, the TeXBook is the user manual for TeX, whereas TeX the Program describes (and contains) the implementation.

[ "a programming product ... is a program that can be run, tested, repaired, and extended by anybody. It is usable in many operating environments, for many sets of data." -- Frederick P. Brooks, in the book The Mythical Man-Month. So how does one test, repair, or extend ("maintain"?) a program? I assume by "description of the implementation" you mean documentation explaining how a maintainer can do those things. ]

Literate Programming is also the name of DonKnuth's book (ISBN 0-937-07381-4 ) which describes the methodology.

One speculation for the reason behind Knuth's pushing of LP is that according to Stanford's intellectual property policy, Stanford would have owned all of Knuth's code, but not his published writing. So the answer's simple: make the code part of the document, and while you're at it, be sure to minimize the appearance and importance of the code itself. Ensuring the document outstrips the code by two to one or more is a great start. One of the tenets of ExtremeProgramming (which I claim is also highly overrated) is to use meaningful identifiers and sensibly factored code, in order to make it self documenting. LP isn't useless; it's really useful for embedding executable code samples when combined with an interactive editor and interpreter, e.g., Elmer in the EeLanguage. (The language is E, I'm just forced to write a god-awful WikiWord.) I just think Knuth was more interested in producing books that just happened to contain code samples, being that LP is really the antithesis of SelfDocumentingCode...

Issues:

(In what sense can LP be called a "methodology"? This _simple_ idea of mixing "code" and "documentation", whatever this dubious distinction meant to Prof. Knuth, is hardly ever used in ways that show any insight into the SDLC in particular, or SE in general, IMHO! A clip I saw from the Professor's book (I don't have the book) was even rather, eh, infantile: the "literacy" of it was almost lip service, a trivial macro-language-like trick. I mean, what's the point?! And the "glue" text between those macros seemed more like a "stream of consciousness" narrative, ugh! Look, Wiki combined with an LP mechanism can really do things, and solve practical (i.e., software engineering) difficulties around what "code" means to people and how it's being used. I suspect much of what's been done with LP is a disservice to the concept. IMHO, please.)

CharlesSimonyi states that with (his) IntentionalProgramming system, LiterateProgramming "will become practical".


IMO, literate programming emphasises a kind of documentation that is often forgotten in these times of document extraction tools: the "why" domain. Between the code with its interfaces ("what this is", handled by documentation extractors) and outside-look documentation such as tutorials, there is a need for code commentary. How are the data structures used? What techniques were considered but _not_ adopted? Why does a particular passage of code look weird? What cases must be taken into account? In what situation is a particular method expected to be used? That kind of thing.

Often, when you ponder a problem and attack it with code, you make notes and code simultaneously, intermingled. Why not put those in the same file, so you can later change some note into code or vice versa? In this way, literate programming is on the same level of being a "methodology" as TestDrivenDevelopment. By the way, UnitTests are a good example of code that would benefit from literate programming. -- PanuKalliokoski


If I recall correctly, the original compiler for TeX was a _Standard_ Pascal compiler, which is just horrible to code in, since the language forces you to present your code in a way that seems quite non-intuitive today, namely strictly organized in sequence (labels-d?-types-variables-procedures-functions), and which had extremely bad StringHandling. So basically Knuth wrote a preprocessor which processed his writings into standard Pascal. His preprocessor was just on a grander scale than usually seen. Literate Programming is just his personal way of writing code for books :) I tried it some years ago, and the verbose commenting just does not flow well for me. -- ThorbjoernRavnAndersen


Related: CodeOrdering, DocumentationBeyondTheSourceCode, LiterateProgrammingBibliography, LiterateProgrammingIdeas, SelfDocumentingCode, TheSourceCodeIsTheDesign, CommentsAreCode, LiterateProgrammingTools, ElementalProgramming

Implementation: HyperPerl, NoWeb, and Leo (http://webpages.charter.net/edreamleo/front.html). Leo is an interesting cross between a light IDE, a programmer's editor, and a LiterateProgramming tool.

(Lots of stale (and crude) LiterateProgrammingIdeas excised from this page 2002.02.05.)

(There's another (redundant?) intro on Wikibase: http://c2.com/cgi/wikibase?LiterateProgramming .)

http://www.literateprogramming.com/ http://www.ross.net/funnelweb/


Re: "Note that "documentation" means a description of the implementation, not a user manual."... I was reading the manual of a project today, of the "instructive examples" kind (not the "reference" kind); it was horribly out of sync with reality, as usual. I got to wondering: why not provide an executable example? A lot of articles provide downloadable code, but I mean something that does exactly what's in the document, so the manual can be tested as being accurate.

The literate systems I've come across represent a single point in time (the finished code); this is slightly different, as it represents the evolution of the code. The documentation would be interleaved with script and diffs, which would be rendered as code blocks or hidden depending on tags in the documentation; 'executing' the document would run shell commands and apply patches. Partial execution should work too, so you can edit in changes in the process, and try the examples at each stage in the document. In noweb terms, there would be no cross references in the code: the documentation would refer to patch chunks instead; and unlike noweb, the same line of code may appear differently several times as it is edited. Anyone seen a system like this? -- BrianEwins
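One modest way to approximate the "manual that can be tested as being accurate" part of this idea is Python's standard doctest module, which runs the examples embedded in a piece of text and compares the results with what the text claims. The sketch below covers only that part; it does not address the evolution-by-patches idea. The `slugify` function, the `MANUAL` text, and the `check_manual` helper are all hypothetical names invented for the illustration.

```python
import doctest


def slugify(title):
    """Turn a page title into a URL-friendly slug (hypothetical example API)."""
    return "-".join(title.lower().split())


# A fragment of an "instructive examples" manual. The lines starting with
# ">>>" are executed, and the line after each one is the claimed result.
MANUAL = """
Slugs are lower-case and use hyphens instead of spaces:

    >>> slugify("Literate Programming")
    'literate-programming'

Extra whitespace is collapsed:

    >>> slugify("  Tex   The Program ")
    'tex-the-program'
"""


def check_manual(text, names):
    """Run every example in `text`; return doctest's failed/attempted counts."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, names, "manual", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)


results = check_manual(MANUAL, {"slugify": slugify})
print(results)
```

If the code changes and the manual is not updated, `results.failed` becomes non-zero, so the check can run alongside the ordinary tests.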

Can somebody post an example of LiterateProgramming?

Check this out: http://rdflib.net/ or rather http://redfoot.net/

Note how the webpages have "source code" intermingled with the (generated) HTML rendering the pages, all generated from a SemanticWeb describing rdflib -- I think. Yep. All the various bits of data, including code, are written in what's being called hypercode -- more documentation on the concept is in the works -- in the meantime, crack into the source ;)

Another example of LiterateProgramming is the lcc retargetable ANSI C compiler by Fraser and Hanson. Like Knuth's MetafontTheProgram and TexTheProgram, a hardcover book was generated from the source code. See http://www.cs.princeton.edu/software/lcc/ and ISBN 0-8053-1670-1 -- TobyThain


CategoryBook CategoryTex CategoryDocumentation CategoryCodingIssues
Last edited November 25, 2005

This page mirrored in ExtremeProgrammingRoadmap as of April 29, 2006