Here are the XML elements that may appear in Leo files:
Leo files start with the following line:
<?xml version="1.0" encoding="UTF-8"?>
An xml-stylesheet line is optional. For example:
<?xml-stylesheet ekr_stylesheet?>
The <leo_header> element specifies version information and other information that affects how Leo parses the file. For example:
<leo_header file_format="2" tnodes="0" max_tnode_index="5725" clone_windows="0"/>
The file_format attribute gives the ‘major’ format number. It is ‘2’ for all 4.x versions of Leo. The tnodes and clone_windows attributes are no longer used. The max_tnode_index attribute is the largest tnode index.
The globals element specifies information relating to the entire file. For example:
<globals body_outline_ratio="0.50">
<global_window_position top="27" left="27" height="472" width="571"/>
<global_log_window_position top="183" left="446" height="397" width="534"/>
</globals>
The <v> element represents a single vnode and has the following form:
<v...><vh>sss</vh> (zero or more nested v elements) </v>
The <vh> element specifies the headline text. sss is the headline text encoded with the usual XML escapes. As shown above, a <v> element may contain nested <v> elements. This nesting indicates outline structure in the obvious way. Zero or more of the following attributes may appear in <v> elements:
t=name.timestamp.n
a="xxx"
The t="Tnnn" attribute specifies the <t> element associated with a <v> element. The a="xxx" attribute specifies vnode attributes. The xxx denotes one or more upper-case letters whose meanings are as follows:
C The vnode is a clone. (Not used in 4.x)
E The vnode is expanded so its children are visible.
M The vnode is marked.
T The vnode is the top visible node.
V The vnode is the current vnode.
For example, a="EM" specifies that the vnode is expanded and is marked.
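The letter codes above can be decoded with a small lookup table. This is a minimal sketch: the names VNODE_FLAGS and decode_vnode_flags are hypothetical, and skipping unknown letters is an assumption of the sketch rather than documented Leo behavior.

```python
# Meanings of the letters in a <v> element's a="..." attribute,
# taken from the list above.
VNODE_FLAGS = {
    "C": "clone",  # not used in 4.x
    "E": "expanded",
    "M": "marked",
    "T": "top visible node",
    "V": "current vnode",
}


def decode_vnode_flags(a):
    """Return the meanings of the flag letters in `a`, in order.

    Unknown letters are skipped (an assumption of this sketch).
    """
    return [VNODE_FLAGS[ch] for ch in a if ch in VNODE_FLAGS]


print(decode_vnode_flags("EM"))  # ['expanded', 'marked']
```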
New in 4.0:
The <t> element represents the body text of the corresponding <v> element. It has this form:
<t tx="<gnx>">sss</t>
The tx attribute is required. The t attribute of <v> elements refers to this tx attribute. sss is the body text encoded with the usual XML escapes.
New in 4.0: Plugins and scripts may add attributes to <v> and <t> elements. See Writing plugins for details.
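The relationship between <v>, <vh> and <t> elements can be illustrated by reading a tiny outline with Python's standard xml.etree.ElementTree module. This is a sketch, not Leo's own read code. The sample text is hand-written for illustration; its wrapper elements (<leo_file>, <vnodes>, <tnodes>) and gnx values are assumptions, and walk and read_outline are hypothetical names.

```python
import xml.etree.ElementTree as ET

# A hand-written sample; the wrapper elements and gnx values are assumed.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<leo_file>
<leo_header file_format="2"/>
<vnodes>
<v t="ekr.20040101120000.1" a="E"><vh>root</vh>
<v t="ekr.20040101120000.2"><vh>child &amp; more</vh></v>
</v>
</vnodes>
<tnodes>
<t tx="ekr.20040101120000.1">root body</t>
<t tx="ekr.20040101120000.2">child body</t>
</tnodes>
</leo_file>
"""


def walk(v, bodies, level=0):
    """Yield (level, headline, body) for v and its nested <v> elements."""
    headline = v.findtext("vh", default="")
    # The t attribute of <v> names the tx attribute of the matching <t>.
    yield level, headline, bodies.get(v.get("t"), "")
    for child in v.findall("v"):
        yield from walk(child, bodies, level + 1)


def read_outline(xml_text):
    """Return a flat list of (level, headline, body) tuples."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    bodies = {t.get("tx"): (t.text or "") for t in root.iter("t")}
    result = []
    # Top-level <v> elements are those whose parent is not itself a <v>.
    for parent in root.iter():
        if parent.tag != "v":
            for v in parent.findall("v"):
                result.extend(walk(v, bodies))
    return result


for level, headline, body in read_outline(SAMPLE):
    print("  " * level + headline, "->", body)
```

The parser undoes the usual XML escapes, so the headline "child &amp; more" comes back as "child & more".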
This section describes the format of external files. Leo’s sentinel lines are comments, and this section describes those comments.
Files derived from @file use gnx’s in @+node sentinels. Such gnx’s permanently and uniquely identify nodes. Gnx’s have the form:
id.yyyymmddhhmmss
id.yyyymmddhhmmss.n
The second form is used if two gnx’s would otherwise be identical.
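Both gnx forms can be recognized with one regular expression. The following is a sketch under stated assumptions: the id part is assumed to contain no period, and GNX_RE and parse_gnx are hypothetical names, not part of Leo.

```python
import re
from datetime import datetime

# id, then a 14-digit timestamp, then an optional integer suffix.
# Assumption: the id itself contains no period.
GNX_RE = re.compile(r"^(?P<id>[^.]+)\.(?P<stamp>\d{14})(?:\.(?P<n>\d+))?$")


def parse_gnx(gnx):
    """Return (id, timestamp, n) or None if gnx matches neither form."""
    m = GNX_RE.match(gnx)
    if not m:
        return None
    try:
        stamp = datetime.strptime(m.group("stamp"), "%Y%m%d%H%M%S")
    except ValueError:
        return None  # 14 digits that do not form a valid date and time
    n = int(m.group("n")) if m.group("n") else None
    return m.group("id"), stamp, n


print(parse_gnx("ekr.20040920101500"))
print(parse_gnx("ekr.20040920101500.3"))
```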
Here are the sentinels used by Leo, in alphabetical order. Unless otherwise noted, the documentation applies to all versions of Leo. In the following discussion, gnx denotes a gnx as described above.
A sentinel of the form @<<section_name>> represents a section reference.
If the reference does not end the line, the sentinel line ending the expansion is followed by the remainder of the reference line. This allows the Read code to recreate the reference line exactly.
The @@ sentinel represents any line starting with @ in body text except @*whitespace*, @doc and @others. Examples:
@@nocolor
@@pagewidth 80
@@tabwidth 4
@@code
@at and @doc
The @+doc and @+at sentinels indicate the start of a doc part.
We use the following trailing whitespace convention to determine where putDocPart has inserted line breaks:
A line in a doc part is followed by an inserted newline if and only if the newline is preceded by whitespace. To make this convention work, Leo’s write code deletes the trailing whitespace of all lines that are followed by a “real” newline.
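One reading of this convention, seen from the reading side, can be sketched as follows. This is an illustration only, not Leo's read code; unwrap_doc_part is a hypothetical name, and the sketch assumes the doc part text has already been stripped of comment delimiters.

```python
def unwrap_doc_part(text):
    """Undo the line breaks inserted when a doc part was written.

    A newline preceded by whitespace is treated as inserted and dropped;
    any other newline is treated as real and kept.
    """
    out = []
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\n")
        if line.endswith("\n") and body[-1:] in (" ", "\t"):
            out.append(body)  # inserted newline: keep the whitespace only
        else:
            out.append(line)  # real newline, or a final line without one
    return "".join(out)


print(repr(unwrap_doc_part("one two \nthree\nfour")))  # 'one two three\nfour'
```

Because the write code strips trailing whitespace before every real newline, trailing whitespace is an unambiguous marker in this scheme.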
The @+leo sentinel marks the start of any external file. This sentinel has the form:
<opening_delim>@leo<closing_delim>
The read code uses single-line comments if <closing_delim> is empty. The write code generates single-line comments if possible.
The @+leo sentinel contains other information. For example:
<opening_delim>@leo-ver=4-thin<closing_delim>
The @+node and @-node sentinels mark the start and end of a node.
@+node:gnx:<headline>
@verbatimAfterRef is generated when a comment following a section reference would otherwise be treated as a sentinel. In Python code, an example would be:
<< ref >> #+others
Leo uses unicode internally for all strings.
Leo converts headline and body text to unicode when reading .leo files and external files. Both .leo files and external files may specify their encoding. The default is utf-8. If the encoding used in an external file is not “utf-8”, it is represented in the @+leo sentinel line. For example:
#@+leo-encoding=iso-8859-1.
The utf-8 encoding is a “lossless” encoding (it can represent all unicode code points), so converting to and from utf-8 plain strings will never cause a problem. When reading or writing a character not in a “lossy” encoding, Leo converts such characters to ‘?’ and issues a warning.
When writing .leo files and external files Leo uses the same encoding used to read the file, again with utf-8 used as a default.
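The difference between a lossless and a lossy encoding can be seen with Python's built-in codecs. This sketch does not show Leo's own conversion code; encode_with_warning is a hypothetical helper, and the boolean it returns stands in for the warning Leo issues.

```python
def encode_with_warning(s, encoding="utf-8"):
    """Encode s, replacing unrepresentable characters with '?'.

    Returns (encoded_bytes, lost), where lost is True if any
    character had to be replaced.
    """
    try:
        return s.encode(encoding), False
    except UnicodeEncodeError:
        return s.encode(encoding, errors="replace"), True


text = "café costs 5€"

# utf-8 can represent every code point, so nothing is lost.
data, lost = encode_with_warning(text, "utf-8")
print(data.decode("utf-8"), lost)  # café costs 5€ False

# iso-8859-1 has no euro sign, so it becomes '?'.
data, lost = encode_with_warning(text, "iso-8859-1")
print(data.decode("iso-8859-1"), lost)  # café costs 5? True
```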
leoSettings.leo contains the following Unicode settings, with the defaults as shown:
default_derived_file_encoding = UTF-8
new_leo_file_encoding = UTF-8
These control the default encodings used when writing external files and .leo files. Changing the new_leo_file_encoding setting is not recommended. See the comments in leoSettings.leo. You may set default_derived_file_encoding to anything that makes sense for you.
The @encoding directive specifies the encoding used in an external file. You can’t mix encodings in a single external file.
Leo checks that the URL is valid before attempting to open it. A valid URL is:
That is, a comma, hyphen and open curly brace may not be the last character.
URL’s in Leo should contain no spaces: use %20 to indicate spaces.
You may use any type of URL that your browser supports: http, mailto, ftp, file, etc.
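The two rules stated above (no spaces, and no comma, hyphen or open curly brace as the last character) can be sketched as a small helper. This is not Leo's actual validity check, which is not reproduced here; prepare_url is a hypothetical name and it tests only these two rules.

```python
def prepare_url(url):
    """Escape spaces and apply the last-character rule.

    Returns (escaped_url, is_valid). Only the rules stated in the
    text are checked; this is not a full URL validator.
    """
    url = url.strip().replace(" ", "%20")
    is_valid = bool(url) and url[-1] not in ",-{"
    return url, is_valid


print(prepare_url("file:///my docs/notes.txt"))
print(prepare_url("http://example.com/a,"))
```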
This section discusses the most important milestones in the history of Leo.
Leo grew out of my efforts to use Donald Knuth’s “CWEB system of Structured documentation.” I had known of literate programming since the mid 1980’s, but I never understood how to make it work for me. In November 1995 I started thinking about programming in earnest. Over the holidays I mused about making programs more understandable. In January 1996 the fog of confusion suddenly cleared. I summarized my thinking with the phrase, webs are outlines in disguise. I suspected that outline views were the key to programming, but many details remained obscure.
March 5, 1996, is the most important date in Leo’s history. While returning from a day of skiing, I discussed my thoughts with Rebecca. During that conversation I realized that I could use the MORE outliner as a prototype for a “programming outliner.” I immediately started work on my first outlined program. It quickly became apparent that outlines work: all my old problems with programming vanished. The @others directive dates from this day. I realized that MORE’s outlines could form the basis for Leo’s screen design. Rather than opening body text within the outline, as MORE does, I decided to use a separate body pane.
I hacked a translator called M2C which allowed me to use MORE to write real code. I would write code in MORE, copy the text to the clipboard in MORE format, then run M2C, which would convert the outline into C code. This process was useful, if clumsy. I called the language used in the outline SWEB, for simplified CWEB. Much later Leo started supporting the noweb language.
Throughout 1996 I created a version of Leo on the Macintosh in plain C and the native Mac Toolbox. This was a poor choice; I wasted a huge amount of time programming with these primitive tools. However, this effort convinced me that Leo was a great way to program.
Late in 1997 I wrote a Print command to typeset an outline. Printing (Weaving) is supposedly a key feature of literate programming. Imagine my surprise when I realized that such a “beautiful” program listing was almost unintelligible; all the structure inherent in the outline was lost! I saw clearly that typesetting, no matter how well done, is no substitute for explicit structure.
In 1998 I created a version of Leo using Apple’s YellowBox environment. Alas, Apple broke its promises to Apple developers. I had to start again.
I rewrote Leo for Borland C++ starting in May 1999. Borland C++ was much better than CodeWarrior C, but it was still C++. This version of Leo was the first version to use xml as the format of .leo files. The last version of Borland Leo, 3.12 Final went out the door July 17, 2003.
I attended the Python conference in early 2001. In May of 2000 I began work on a wxWindows version of Leo. This did not work out, but something good did come from this effort. I spent a lot of time adding Python scripting to the wxWindows code and I became familiar with Python and its internals.
I really started to ‘get’ Python in September 2001. I wrote the white papers at about this time. Python solved all my programming problems. I rewrote Leo in Python in about two months! For the first time in my career I was no longer anxious while programming; it simply isn’t possible to create bad bugs in Python. The Python version of Leo was the first officially OpenSoftware version of Leo. The first functional version of Leo in Python was 0.05 alpha, December 17, 2001.
I registered the Leo project on SourceForge on March 10, 2003. It is certainly no accident that Leo started a new life shortly thereafter. Prior to SourceForge my interest in Leo had been waning.
In the summer of 2001 I began to consider using sentinel lines in external files. Previously I had thought that outline structure must be ‘protected’ by remaining inside .leo files. Accepting the possibility that sentinels might be corrupted opened vast new design possibilities. In retrospect, problems with sentinels almost never happen, but that wasn’t obvious at the time! The result of this design was known at first as Leo2. That terminology is extinct. I think of this version as the first version to support @file and automatic tangling and untangling.
The biggest surprise in Leo’s history was the realization that it is much easier to untangle files derived from @file. Indeed, the old tangle code created all sorts of problems that just disappear when using @file. The new Python version of Leo became fully operational in early 2002. It was probably about this time that I chose noweb as Leo’s preferred markup language. My decision not to support noweb’s escape sequences made Leo’s read code much more robust.
I spent 2002 taking advantage of Python’s tremendous power and safety. Many improvements were at last easy enough to do:
In late 2002 and throughout 2003 I worked on an entirely new file format. 4.0 final went out the door October 17, 2003 after almost a year of intense design work trying to improve the error recovery scheme used while reading external files. In the summer of 2003 I realized that orphan and @ignore’d nodes must be prohibited in @file trees. With this restriction, Leo could finally recreate @file trees in outlines using only the information in external files. This made the read code much more robust, and eliminated all the previous unworkable error recovery schemes. At last Leo was on a completely firm foundation.
Leo first used gnx’s (global node indices) as a foolproof way of associating nodes in .leo files with nodes in external files. At the time, there were still intense discussions about protecting the logical consistency of outlines. @thin was later to solve all those problems, but nobody knew that then.
Leo 4.2 Final went out the door September 20, 2004. This surely is one of the most significant dates in Leo’s history:
This marked the end of worries about consistency of outlines and external files: Leo recreates all essential information from thin external files, so there is nothing left in the .leo file to get out of synch.
makes thin external files more cvs friendly.
A sensational scripting plugin showed how to create script buttons. This has led to improvements in the Execute Script command and other significant improvements in Unit testing.
As if this were not enough, 4.2 marked the ‘great divide’ in Leo’s internal data structures. Before 4.2, every node in the outline had its own vnode. This was a big performance problem: clone operations had to traverse the entire outline! 4.2 represents clones by sharing subtrees. Changing Leo’s fundamental data structures while retaining compatibility with old scripts was engineering work of which the entire Leo community can be proud. Scripting Leo with Python tells how the position class makes this happen. This was a cooperative effort. Kent Tenney and Bernhard Mulder made absolutely crucial contributions. Kent pointed out that it is a tnode, not a vnode that must form the root of the shared data. Bernhard showed that iterators are the way to avoid creating huge numbers of positions.
Leo 4.2 marked so many significant changes. I often find it hard to remember what life with Leo was like before it.
Leo 4.3 corrected many problems with leoConfig.txt. Instead, Leo gets settings from one or more leoSettings.leo files. This version also introduced a way to change settings using a settings dialog. However, the settings dialog proved not to be useful (worse, it inhibited design) and the settings dialog was retired in Leo 4.4.
Leo 4.4 was a year-long effort to incorporate an Emacs-style minibuffer and related commands into Leo. Thinking in terms of minibuffer commands frees my thinking. Leo 4.4 also featured many improvements in how keys are bound to commands, including per-pane bindings and user-defined key-binding modes.
Development on long-delayed projects accelerated after 4.4 final went out the door. Recent projects include:
This series of releases featured hundreds of improvements. The highlights were truly significant:
For a complete list, see the What’s New chapter.
Added support for @shadow files. This was a major breakthrough. See the Using @shadow chapter for full details.
This version of Leo featured more significant improvements:
Leo 4.7 accomplishes something I long thought to be impossible: the unification of vnodes and tnodes. tnodes no longer exist: vnodes contain all data. The Aha that made this possible is that iterators and positions allow a single node to appear in more than one place in a tree traversal.
This is one of the most significant developments in Leo’s history. At last the endless confusion between vnodes and tnodes is gone. At the most fundamental level, Leo’s data structures are as simple as possible. This makes them as general and as powerful as possible!
This version successfully produced a common code base that can run on both Python 2.x and Python 3.x.
Leo 4.8 simplified Leo’s sentinels as much as possible. Leo’s sentinel lines look very much like Emacs org-mode comment lines, except for the addition of gnx’s.
This version also produced a fundamentally important addition to Leo’s error recovery. Leo now shows “Resurrected” and “Recovered” nodes when loading an outline. These nodes protect against data loss, and also implicitly warn when unusual data-changing events occur. Creating this scheme is likely the final chapter in the epic saga of error recovery in Leo.
Leo 4.9 featured the completed transition to the PyQt application framework, the introduction of the viewrendered pane, and autocompletion.
I wrote this soon after discovering Python in 2001. The conclusions are still valid today.
I’ve known for a while that Python was interesting; I attended a Python conference last year and added Python support to Leo. But last week I got that Python is something truly remarkable. I wanted to convert Leo from wxWindows to wxPython, so I began work on c2py, a Python script that would help convert from C++ syntax to Python. While doing so, I had an Aha experience. Python is more than an incremental improvement over Smalltalk or C++ or objective-C; it is “something completely different”. The rest of this post tries to explain this difference.
What struck me first as I converted C++ code to Python is how much less blah, blah, blah there is in Python. No braces, no stupid semicolons and most importantly, no declarations. No more pointless distinctions between const, char *, char const *, char * and wxString. No more wondering whether a variable should be signed, unsigned, short or long.
Declarations add clutter, declarations are never obviously right and declarations don’t prevent memory allocation tragedies. Declarations also hinder prototyping. In C++, if I change the type of something I must change all related declarations; this can be a huge and dangerous task. With Python, I can change the type of an object without changing the code at all! It’s no accident that Leo’s new log pane was created first in Python.
Functions returning tuples are a “minor” feature with a huge impact on code clarity. No more passing pointers to data, no more defining (and allocating and deallocating) temporary structs to hold multiple values.
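As a small illustration of the point about tuples (the function name min_max_mean is hypothetical), a function can hand back several results at once with no struct and no output pointers:

```python
def min_max_mean(values):
    """Return three results at once as a tuple."""
    return min(values), max(values), sum(values) / len(values)


# The caller unpacks the tuple directly into three names.
lo, hi, mean = min_max_mean([3, 1, 4, 1, 5])
print(lo, hi, mean)  # 1 5 2.8
```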
Python can’t check declarations because there aren’t any. However, there is a really nifty tool called pylint that does many of the checks typically done by compilers.
Python is much more powerful than C++, not because Python has more features, but because Python needs fewer features. Some examples:
Before using Python I never fully realized how difficult and dangerous memory allocation is in C++. Try doing:
aList[i:j] = list(aString)
in C. You will write about 20 lines of C code. Any error in this code will create a memory allocation crash or leak.
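For readers unfamiliar with slice assignment, here is what that one line does, using made-up values for aList, aString, i and j. The list grows or shrinks as needed, and Python handles all the allocation:

```python
aList = [0, 1, 2, 3, 4]
aString = "abc"
i, j = 1, 3

# Replace the two items aList[1:3] with the three characters of aString.
aList[i:j] = list(aString)
print(aList)  # [0, 'a', 'b', 'c', 3, 4]
```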
Python is fundamentally safe. C++ is fundamentally unsafe. When I am using Python I am free from worry and anxiety. When I am using C++ I must be constantly “on guard.” A momentary lapse can create a hard-to-find pointer bug. With Python, almost nothing serious can ever go wrong, so I can work late at night, or after a beer. The Python debugger is always available. If an exception occurs, the debugger/interpreter tells me just what went wrong. I don’t have to plan a debugging strategy! Finally, Python recovers from exceptions, so Leo can keep right on going even after a crash!
Python has almost all the speed of C. Other interpretive environments such as icon and Smalltalk have clarity, power and safety similar to Python. What makes Python unique is its seamless way of making C code look like Python code. Python executes at essentially the speed of C code because most Python modules are written in C. The overhead in calling such modules is negligible. Moreover, if code is too slow, one can always create a C module to do the job.
In fact, Python encourages optimization by moving to higher levels of expression. For example, Leo’s Open command reads an XML file. If this command is too slow I can use Python’s XML parser module. This will speed up Leo while at the same time raising the level of the code.
Little of Python is completely new. What stands out is the superb engineering judgment evident in Python’s design. Python is extremely powerful, yet small, simple and elegant. Python allows me to express my intentions clearly and at the highest possible level.
The only hope of making Leo all it can be is to use the best possible tools. I believe Python will allow me to add, at long last, the new features that Leo should have.
Edward K. Ream, October 25, 2001. P.S., September, 2005:
Four years of experience have only added to my admiration for Python. Leo could not possibly be what it is today without Python.