All Yesterday's Tomorrows: The report on, and of, Project Arxana concerning word processing, electronic publishing, hypertext, etc.

Arxana is a free/open hypertext system written in Emacs Lisp, with Common Lisp extensions.

By Joseph Corneli and Raymond S. Puzio

Short version | Gallery | Some inspiring quotes | Video


The idea is to refactor pieces of information into documents in a “holographic” fashion. This addresses the idea that a document is made up of paragraphs, a paragraph is made up of sentences, sentences are made up of words, and words are made up of letters -- but also that the world is not as simple and hierarchical as this picture might make you think.

Newspapers are made up of articles, but they can be folded into hats or taped into Möbius strips. Math papers are made up of definitions, theorems, and proofs, which can turn out to be incorrect. Comic books are made up of panels, art, and speech balloons, but they can also be made into movies.

A tarot spread is made up of cards that are arranged and narrated in a way that s(t)imulates the mind...

Over the years, we've built several prototypes and spin-offs of Arxana, ranging from a simple iPod-like list browser, to a representation language for mathematics that envisioned definitions and theorems as lists of assertions and instantiations.

Our aim for the project is to build something that's useful for authoring complicated documents with lots of interconnected and interrelated bits. Emacs was the natural starting place.

We presented the first really usable prototype of the system at a Symposium on Free Culture and the Digital Library in 2005, along with a paper that proposed a Scholium-Based Document Model for Commons Based Peer Production.


Currently the defining features of an Arxana implementation are: (1) low-level functions that allow you to create a network of texts; (2) a browser to navigate the graph; (3) functions that will assemble documents out of the graph.

In particular, we've created a programming framework for writing programs as graphs. Programs written in this manner can either be run in situ by a graphical interpreter or compiled down into traditional code for running outside the system.

This programming facility allows for literate programming in which programs are built out of chunks by transclusion and comments are attached as scholia. The same idea applies to texts: they can be browsed as hypertext, or specific portions can be compiled into printable documents on the fly.
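As a sketch of how transclusion-based assembly might work (the names here are illustrative, not Arxana's actual API), compiling a printable document amounts to walking the graph and splicing text together:

```elisp
;; Illustrative sketch, not Arxana's real API: assemble a document by
;; recursively splicing in the text of transcluded child nodes.
(defun my-assemble (node get-text get-children)
  "Concatenate the text of NODE with that of its transcluded children.
GET-TEXT and GET-CHILDREN abstract over the storage backend."
  (concat (funcall get-text node)
          (mapconcat (lambda (child)
                       (my-assemble child get-text get-children))
                     (funcall get-children node)
                     "\n")))
```

The same walk, with a different GET-TEXT, could emit runnable code instead of prose.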

Accessing bits and pieces in different ways should be possible with multiple backends (e.g. database, web, memory).

Hypertext nouveau is based on the concept of semantic triples of the form subject, predicate, object. This rather constrained language allows one to say quite a lot. The picture that emerges is something like the constellations that we superimpose over the stars. We've had lots of interesting conversations about how to represent things. Will we need 29-place terms? Probably not...
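For concreteness, here is a minimal triple store sketched in Emacs Lisp; all of the names are hypothetical and merely stand in for Arxana's actual low-level functions:

```elisp
;; Minimal triple-store sketch (hypothetical names, not Arxana's API).
(require 'seq)

(defvar my-triples nil
  "List of (SUBJECT PREDICATE OBJECT) triples.")

(defun my-add-triple (subject predicate object)
  "Record a SUBJECT-PREDICATE-OBJECT link."
  (push (list subject predicate object) my-triples))

(defun my-objects (subject predicate)
  "Return every OBJECT linked from SUBJECT by PREDICATE."
  (mapcar #'caddr
          (seq-filter (lambda (triple)
                        (and (equal (nth 0 triple) subject)
                             (equal (nth 1 triple) predicate)))
                      my-triples)))

;; Attaching a scholium is then just adding a triple:
(my-add-triple "article-1" "has-scholium" "commentary-1")
```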

The previous 2009-era prototype used CLSQL to interface to MySQL and represented links as triples in a handmade triple (or rather, quad) store. We also experimented with using cl-elephant as a storage system to persist nodes as objects.


Frontend features rely on Emacs text properties, and support editing multiple nodes at once, with the changes routed back into the backend properly.
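A rough sketch of the routing idea, using Emacs's real text-property API but a hypothetical property name:

```elisp
;; Tag each region of the buffer with the id of its backing node, so
;; that edits can be routed back to the right place in the backend.
;; `arxana-node-id' is a hypothetical property name.
(defun my-insert-node (id text)
  "Insert TEXT at point, marking it as belonging to node ID."
  (let ((start (point)))
    (insert text)
    (put-text-property start (point) 'arxana-node-id id)))

(defun my-node-at-point ()
  "Return the id of the node whose text is at point, or nil."
  (get-text-property (point) 'arxana-node-id))
```

Because text properties travel with the text as it is edited, a save hook can walk the buffer, group characters by node id, and write each node back separately.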

Previous Emacs-based browsing systems include Help, Info, and Emacs/w3m. Editing and browsing with these systems were essentially completely distinct activities. Although completely silly, M-x doctor provides encouragement that an interactive system could do something interesting.

Various experiments with the social web point in the direction of an open read/write platform, but typically these experiments have had very limited semantics. We're thinking we can do more here.

We're planning middle-end features that will be associated not just with assembling texts on the fly, but doing additional processing, like proof checking.

One reason for choosing our data model is that inference rules can be represented very naturally as networks. The linear representations that we're used to are only one way of depicting the way people actually think about things. At some point in the future, we hope to be able to turn Arxana networks into schematic diagrams automatically!

For now, here's a hand-drawn picture showing how inference rules look when they're presented in a network structure.


What can we do to interact with objects once they've been properly parsed? There will be some interesting experiments in the future, connecting Arxana and PlanetMath, which will be particularly nice now that the latter has been re-built using LaTeXML.

All of the math on PlanetMath is now rendered in Content MathML, which should give us plenty of things to chew on (and in the not-too-distant future, we'll have sTeX and OMDoc to chew on as well).

Whether on PlanetMath or beyond, Emacs and Drupal might be a match made in heaven! One of the most interesting applications we have in mind, which should be available soon, is a connection between Arxana and Drupal, using Drupal's Services module.

As interesting as the mathematics applications are, the proper vehicle for this system is literate programming, or even better, literary programming in Lisp. Why Lisp? is a question that has been asked and answered many times.

Donald Loritz writes: “in Lisp, a Saussurean arbitrary relation holds between signifier and signified [...] it is possible to redefine virtually every classical Lisp command word.”

Our point of view on network programming is similar, but it goes beyond Lisp in several important ways. First, our links are bidirectional, and second, we allow objects to be attached at any point in the network, not just at the tree boundaries. In Lisp, what you have are doublets (cons cells), not triplets. If you want to link to content, you have to put it in the CAR or the CDR, which leaves you only one link -- enough to create chains, but not any more interesting graph structures.
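A quick illustration of the difference, in plain Emacs Lisp (the record layout shown for a link is just one possibility):

```elisp
;; A cons cell has only two slots, CAR and CDR.  With content in the
;; CAR, only the CDR is left for linking -- enough for a chain:
(setq my-chain (cons "a" (cons "b" (cons "c" nil))))  ; same as '("a" "b" "c")

;; A labeled, bidirectional link needs a richer record, for example:
(setq my-link (list :source "a" :label "cites" :sink "b"))
```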


In sum, pretty much every object in this system is annotatable. This is Arxana's biggest strength, which, as is well known, often represents the greatest weakness. In the first place, making this statement precise has been tricky.

Secondly, if you make everything annotatable, you're bound to get a representation that's more complicated than what you might get if you wanted to just compute.

The system can be flexible, but it can also be formal; for example, if we wish, we can implement a type system in links.

How do we intend to connect with other developers? We already have connections to KWARC, and have presented work in progress several times at LISP NYC.

One of the most relevant places to look for connections is the Free Knowledge Incubator, particularly as we think about ways to implement Richard P. Gabriel and Ron Goldman's Mob Software ideas.

One of the most straightforward ways to explain this idea is that we want to build a wiki made of programs and documentation instead of text.


At present, real-time interactions like Etherpad and ShareJS are popular. It would be great to be able to integrate these things into the system. Ted Nelson is also interested in multimedia -- why not annotate video using this system?

If Douglas Adams were still alive, maybe he would be able to use a system like this one to sell his next novel, using a Namecoin distribution method.

If someone is listening to a lecture and taking notes, it would be great to live stream the notes that people are taking along with the video -- and save them so that the video can be read along with the notes later.

Linking is how scholia are attached to other articles; the collection of all articles is “the commons”; and the rules for interacting with this collection define the commons' regulatory system.

The idea of using Arxana to model the commons has some staggering implications. A network is just a way to take a shared system of some form and make it computational. Once you have a good model, you want to be able to interact with the system in some way.

If you want to organize for social change, you need a system of annotations that's open and robust.


We will connect Emacs to Common Lisp via Slime, and Common Lisp to PostgreSQL via CLSQL. CLSQL also talks directly to the Sphinx search engine, which we use for text-based search. Once all of these things are installed and working together, you should be able to begin to use Arxana.
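In outline, the wiring might look like the following (the Lisp implementation, database name, and credentials are placeholders for your local setup):

```elisp
;; In your Emacs init file: point SLIME at a Common Lisp.
(setq inferior-lisp-program "sbcl")
(require 'slime)
(slime-setup '(slime-fancy))

;; Then, in the Common Lisp REPL started with M-x slime, connect
;; CLSQL to PostgreSQL (connection details are placeholders):
;;   (clsql:connect '("localhost" "arxana" "user" "password")
;;                  :database-type :postgresql)
```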

This prototyping work didn't go particularly well, and ended up having several regressions with respect to the previous version of the code. But at the same time, we've been able to re-use code from previous prototypes.

Now that we're making a cleaner separation of backend, middle-end, and frontend, we'll have easier ways to swap things in and out. There will be less concern about which choice to make, because we can use any implementation for a module. This design principle goes fairly deep into the system: anything that implements, say, a 'pow' function can stand in for that module.

The reader will perhaps have noticed the similarities and references to the work and ideas of Ted Nelson. We want to be clear that the system presented here isn't an implementation of the Xanadu™ idea, per se, although it provides some of the features one would expect from a “Xanadu™ implementation.”

One clear difference between this system and the Xanadu™ system is that here articles are not supposed to be presented in a pay-to-access fashion. The Xanadu™ system was intended to be Free as in Freedom (in a certain limited sense), but not Free as in Beer. We might recover some micropayment features eventually, but it's not a core focus.

In any case, the main concern with Xanadu™ is that it's cursed. We didn't want the same fate to befall Arxana, which was why we took a 100% free/open source approach. That said, we probably didn't have the best strategy for outreach and connection with other developers as we were getting started!


We would have also done well to pay more attention to the philosophical foundations of hypertext from the very start. Wittgenstein remarked ironically: “It should be possible to decide a priori whether, for example, I can get into a situation in which I need to symbolize with a sign of a 27-termed relation.” (Tractatus 5.5541)

Another interesting connection from philosophy is Theuth (AKA Thoth), as understood by Derrida, and the associated notions of deconstruction and the pharmakon.

One of the key points to keep in mind are the limits of any system -- we could quote also from Borges or Deleuze, but here's Wittgenstein again: “And how would it be possible that I should have to deal with forms in logic which I can invent: but I must have to deal with that which makes it possible for me to invent them.” (Tractatus 5.555)

How do we deal with content that isn't authored in Arxana? -- especially if it is locked up behind non-free protocols.

In some sense there are circles of Hell here -- something you can't get any reading on whatsoever is a brick, but you can still attach comments to just about anything, even a brick.

There are other things you can model, too.


Looking back over the history of the project, another key point of reference is Dirk Gently's Holistic Detective Agency. Perhaps you remember that the pseudo-protagonist, Richard MacDuff, got into programming when he was working on creating a text editor to edit his English papers with.

Hypertext can often feel like a distraction from a distraction from a distraction. (Dirk Gently author Douglas Adams was famous for his procrastination, and would generally only finish novels when his agent locked him in a hotel room.)

Anyway, the literary, textual, and philosophical references go way back, for instance to the Talmud and its history. For a more contemporary example, Bukowski's books happened to show up in the Xanadu bookstore in Memphis, where one of us was buying cheap differential geometry texts in bulk.

Assembling a document from various disparate sources in the database is reminiscent of the way p2p filesharing works.

There are some interesting developments with Bitcoin and more recently with Namecoin that we could potentially connect up with here.

Apart from Namecoin, an early aim of the system was to use distributed databases to make an alternative hypertext system much more in line with Ted Nelson's ideas. Well, perhaps one day.


There was probably a non-trivial chance of going crazy when working on this system. We created a lot of obscure documents on AsteroidMeta, for example.

The question with this system was always a practical one. So far, it hasn't been all that practical, but all in good time?

“Marry, then, sweet wag, when thou art king, let not us that are squires of the night's body be called thieves of the day's beauty: let us be Diana's foresters, gentlemen of the shade, minions of the moon; and let men say we be men of good government, being governed, as the sea is, by our noble and chaste mistress the moon, under whose countenance we steal.”

If you put together all of the different things that we have here, you might get something like the Hyperreal Dictionary of Mathematics that we've talked about for years. If we get the real-time co-editing system set up, with some nice co-presence markers and use this system to build a real-time editable math MUD, that would be pretty practical even without the Artificial Intelligence part.

Networks and nodes are a really interesting thing to work with here. Our networks are more lively than the typical subject, predicate, object -- we're more inclined to represent beginning, middle, and end, depicting dynamic processes as opposed to static things.

What quotation does is represent, as a thing, some process from a lower-level theory. If A is a meta-theory to B, then you can start to reason about things at a higher level. The term “format shifting” belies the nature and power of simulation, and the true force of medium-as-message.


Ted Nelson's “Literary Machines” and Marvin Minsky's “Society of Mind” are important inspirations. Alfred Korzybski's “Science and Sanity” and Gilles Deleuze's “The Logic of Sense” provided some grounding and encouragement early on. LaTeX and GNU Emacs have been useful not just in prototyping this system, but also as exemplary projects in the genre. John McCarthy's Elephant 2000 was an inspiring thing to look at and think about, and of course Lisp has been a vital ingredient.

More recently, the conceptual artwork Monolyth gave some indication of the form we wanted this document to take -- 14000 characters, a kilotweet, which we will continue to revise as the project develops.

On the formal side, our approach has been informed by various concepts from the foundations of mathematics such as Quine's distinction of use versus mention, Tarski's consequence operator approach to logical theories, Lesniewski's mereology, Makarov's D-logic, Grothendieck's categorical geometry and Lawvere's categorical logic. One of our objectives is to bring some of these ideas down to earth from their lofty perch atop the tower of mathematical abstraction and embody them in computer code which can be applied to practical problems.

In short, Hello World! This document is readable and editable within Arxana, as is the system itself.

We're interested to get other people involved as easily as possible -- for now we've got code on, but that will be followed soon with a Github account, and links to our latest download.

There's even an arxana-talk mailing list, if you want to get in touch with us directly.


Arxana's source code is available under the terms of the GNU Affero GPL 3.0.

Made with LISP | Made with GNU Emacs