
LISP is better than XML, but worse is better


At the dawn of XML, some LISP fans would say that XML was just a crappy LISP. (The clueier LISP fans would use "s-expr" or SEXPR or S-expressions, as the Lots of Irritating Silly Parentheses syntax is known.) But Java plus XML plus the BeanShell interpreter is a pretty nice crappy LISP!

The markup-language-as-bad-sexpr notion predates XML by almost a decade with SGML: indeed, with SGML the comparison is fairer, because SGML does include features for setting delimiters and constructing little languages. SGML's SHORTREF and ENTITY mechanisms can be compared to macros in LISP, for example.

One reason XML was designed with the principle "Terseness is of minimal importance" was to cut SHORTREFs out. (SGML is still in use by people who need SHORTREFs. But vendors who cannot make a buck out of SGML won't tell you that :-)

Syntax aside, LISPers point out that the XML infoset (i.e., the general data structure that applications may see when the text is parsed) is an attribute value tree (AVT), just like modern LISP lists. (AVTs are very convenient to have available. Certainly one of the reasons for XML's success is that it has allowed vendors to add fairly similar AVT APIs to their libraries.) However, LISP has syntactic features to allow the recognition of numbers and symbols in data: XML just has strings. (Both can represent links between nodes, so really the data structure is an AVT with cross links, like a directed, rooted graph.)
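To make the lexical typing point concrete, here is a minimal sketch in Python. The toy reader (read_sexpr, read_atom, the Symbol class) and the point example are my own illustration, not any real LISP reader: it only shows that an S-expression reader can tell integers, decimals, strings and symbols apart from their printed form, while an XML parser hands back every attribute value as a string.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """Hypothetical stand-in for a LISP symbol."""
    name: str


# Tokens: a double-quoted string, a parenthesis, or a bare word.
TOKEN = re.compile(r'"[^"]*"|[()]|[^\s()"]+')
INTEGER = re.compile(r'[+-]?\d+')
DECIMAL = re.compile(r'[+-]?(?:\d+\.\d*|\.\d+)')


def read_atom(token):
    """Decide the datatype of an atom from its lexical form alone."""
    if token.startswith('"'):
        return token[1:-1]
    if INTEGER.fullmatch(token):
        return int(token)
    if DECIMAL.fullmatch(token):
        return float(token)
    return Symbol(token)


def read_sexpr(text):
    """Read one (possibly nested) S-expression into Python lists and atoms."""
    tokens = TOKEN.findall(text)
    position = 0

    def parse():
        nonlocal position
        token = tokens[position]
        position += 1
        if token == "(":
            items = []
            while tokens[position] != ")":
                items.append(parse())
            position += 1  # step over the closing parenthesis
            return items
        return read_atom(token)

    return parse()


# The same record in both notations.
sexpr = read_sexpr('(point 3 4.5 "label")')
element = ET.fromstring('<point x="3" y="4.5" name="label"/>')

print(sexpr)           # a symbol, an int, a float and a string
print(element.attrib)  # every value is a string
```

Recovering the numbers on the XML side needs knowledge from outside the document (a schema, or application code that knows x is numeric), which is the difference being described.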

Paul Prescod has a nice page, "XML is not S-Expressions", on the topic. I would also add that XML's encoding declaration is the only text format that provides a workable (though, of course, fallible) approach to the problem of world-wide variations in text encoding: LISP and probably every other programming language does not even get to first base.
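A small sketch of what the encoding declaration buys you, using Python's standard xml.etree.ElementTree parser. The greeting document and the encode_document helper are made up for the illustration. The same text is serialized in two encodings; the parser is given raw bytes and no hint other than the declaration, and recovers the same characters both times. The last case shows the fallible part: a file whose label does not match its bytes.

```python
import xml.etree.ElementTree as ET

BODY = "<greeting>caf\u00e9</greeting>"


def encode_document(body, encoding):
    """Prefix a matching encoding declaration, then serialize to bytes."""
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>'
    return (declaration + body).encode(encoding)


latin1_bytes = encode_document(BODY, "ISO-8859-1")
utf8_bytes = encode_document(BODY, "UTF-8")

# Different bytes on disk, identical text after parsing.
latin1_text = ET.fromstring(latin1_bytes).text
utf8_text = ET.fromstring(utf8_bytes).text

# Fallible: ISO-8859-1 bytes mislabeled as UTF-8 are rejected.
mislabeled = b'<?xml version="1.0" encoding="UTF-8"?>' + BODY.encode("ISO-8859-1")
try:
    ET.fromstring(mislabeled)
    mislabeled_parsed = True
except ET.ParseError:
    mislabeled_parsed = False

print(latin1_bytes != utf8_bytes, latin1_text == utf8_text, mislabeled_parsed)
```

A plain source file or S-expression file carries no such in-band label, so the reader has to be told the encoding out of band or guess it.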

S-Expressions have no standard equivalent of DTDs for validation. XML DTDs provide a basic unit test for documents, which promotes quality testing, clearer interface definition, a separation of concerns between information providers and information recipients, and the idea of the WWW as a data flow model.

So XML's encoding basis is superior to LISP. Its flexibility for creating little languages is less than LISP. Their data structures are pretty much the same. LISP has marginally richer datatypes. Each has different software engineering qualities. Parenthesis syntax is familiar to programmers; on the other hand, angle-bracket syntax is familiar to web coders.

So XML versus S-Expr is a draw, to me. When character set encoding and markup are important, XML wins. When terseness or recognizing numbers are important, S-Expressions win.

What about XML+Java versus LISP? That is a bit fairer.

I am very affectionate towards LISP. In the early 90s, I briefly worked for Texas Instruments supporting their Explorer LISP systems: wonderful things. TI were closing the Explorer project down at that stage: the belief was that LISP (the language) would not be needed because LISP (the bundle of features) would win. The TI boffins said that in the future (i.e. now) when you opened up a language platform, you would see standard list/AVT structures, garbage collection, object oriented-ness, message passing, dynamic linking, expression parsing, and a whole slew of other features LISPers loved and which were not available in, say, the C APIs. They were right.

But the most characteristic thing of LISP is the eval function. Can I have that in Java+XML? I have been using the BeanShell interpreter for this, to provide interpreted scripts in my company's products. With BeanShell "Users may now freely mix loose, unstructured BeanShell scripts, method closures, and full scripted classes." (At Topologi, we debug using Eclipse and compiled versions, then strip out some header info to generate the scripts when deploying.) I certainly don't want to claim that XML+Java+BeanShell is as beautiful as LISP, but they go a long way towards having the equivalent power of LISP, indeed of other interpreted languages.

LISP had another strong influence on XML, because of Richard Gabriel's paper, usually called "Worse is Better", which should be required reading for anyone who is a big fan evangelizing any language, be it Python, C#, XQuery or ASP. (For more, including "Better is Worse", see Gabriel's site. Sun's Jim Waldo has a recent response, "Worse is still worse", which I think misses Gabriel's fundamental point: Waldo paraphrases Gabriel as "Better depends on your quality metric", while I believe Gabriel's point is the much more challenging "our quality metric can be wrong".)

Read More Entries by Rick Jelliffe.

5 Comments

Rick Jelliffe said:

Hi, thanks for the comments. Nice to see people still passionate about LISP. It is indeed a great language. However, you are a rude bore, if not a troll, and my life is too short to be shouted at by idiots. No wonder LISP languishes when it has uncivil champions like you.

On EVAL. I am not alone in thinking eval is characteristic of LISP. (I don't know what your comment on whether it is good to use has to do with anything, though: I didn't mention its usage.) See for example the first paragraph of "What makes LISP so Great", which even quotes Alan Kay.

On encoding declarations, where is the equivalent of the encoding declaration in any LISP: an algorithmic way of determining reliably whether a file uses EBCDIC, UTF-8, or any of the hundreds of other encodings out there? Your comments on "Ampersand escape characters" suggest you don't actually know what these encoding declarations do. You don't seem to know the difference between Unicode and UTF-* or UCS-2.

On Erik Naggum, thanks for recommending I read a page I provided as a link myself. Too silly, please try to troll harder.

On your lecture on SGML, I started using it almost 20 years ago (in fact, I co-wrote a LISP engine to process an SGML subset then) so I think I am aware of its syntactic features. A language is more than syntax: indeed, it is the non-syntactic parts that are the most important. Just because LISP is much better at parsing home-made little languages doesn't mean that the home-made syntax actually provides any value to anyone. A boring syntax like XML (or S-expressions or JSON) may provide just as much.

However, you are right in picking up my poor expression "marginally richer datatypes": I should have written "lexical typing" I suppose. I mean determining atomic datatypes from lexical form in the printed version: strings, numbers, symbols, etc.

Steve said:

"LISP has marginally richer datatypes."

Do you mean Common Lisp, the ANSI standard Lisp? Because if you do, that statement is ridiculous. CL has numbers, strings, vectors, lists, trees, sets, hashtables, alists, plists, functions (because functions are first class) and a whole object system with classes, objects, and generic functions and methods. XML has strings and lists.

"I would also add that XML's encoding declaration is the only text format that provides a workable (though, of course, fallible) approach to the problem of world-wide variations in text encoding: LISP and probably every other programming language does not even get to first base."

So Unicode is a worse idea than the wonderful ampersand escape characters in *ML? It would be nice if your statement at least acknowledged that this assertion is debatable, rather than just coming out and declaring the XML way to be the only workable one. I mean, since you "would also add" that, perhaps you'd like to add a hint about your reasoning?

Just fyi, things have changed since the TI Explorer. The Lisp I use most often is a CL called SBCL. It has Unicode support. So do all the major free and commercial CL implementations with which I am familiar. So I think Lisp does "get to first base," and even scores a run.

Angle-bracket syntax is more familiar to web developers because HTML is in the same family of markup languages as XML, and I'll lay you odds that angle syntax is more familiar to programmers than parentheses are, just because of HTML. But this has nothing to do with the question of whether it's a good idea. You might want to investigate the reason why explicit end tags were made part of SGML, for instance, so you can think about whether they still make sense in the tag-dense world of the web and nearly every example of an XML document. (I'd tell you, but since you wrote this I thought it only fair that you be asked to do /some/ research, even if it's after the fact.)

You claim XML's "flexibility for creating little languages is less than LISP," when Lisp is generally described even by those who do not favor it (or the DSL approach) as being one of the best tools for creating Domain Specific Languages (DSL) ever created, while XML lacks even the rudimentary features found in SGML. Just look at Peter Seibel's book Practical Common Lisp, Paul Graham's On Lisp, or Rainer Joswig's DSL in Lisp video, all freely downloadable, to see what I am talking about.

You claim that "the most characteristic thing of LISP is the eval function," while contrariwise every CL book I have, and the commonly accepted consensus on comp.lang.lisp and irc.freenode.net #lisp, is that use of eval in code is almost always a mistake. I say "in code" because of course most CL is developed with access to a REPL, and the 'E' in "REPL" means "eval." Lisp has many distinguishing characteristics, but certainly the way the language allows code to be treated as data is arguably the most central. It's what allows the macro system to be so clean, among other things. (And it's why CL doesn't need an entirely new syntax for a DTD to easily validate s-expressions.) Besides, many scripting languages also have "eval." But since you don't bother to produce a reason for your assertion, who can know why you make this claim.

Finally you claim that XML+Java+Beanshell goes "a long way towards having the equivalent power of LISP, indeed of other interpreted languages." It's hard to be sure, because that sentence is very ambiguous, but if you mean to say that Lisp is an interpreted language, you are wrong. Most CLs are not interpreters. Most have the ability to interpret and also to compile native code. Code you have compiled and code you have not all works together in one image, with calls from one to the other and back. Some produce stand-alone executables.

I'd suggest to you that if you want to compare XML to Lisp, you find out what's happened in the nearly twenty years since you apparently looked into it, and do so /before/ deciding to share your views with the world. You might also consider googling "erik naggum sgml" and "erik naggum xml" (you link to Erik Naggum's site above) to get a strong contrary view from a person who is a CL expert and an SGML expert. I quickly found him giving highly technical SGML discussions going back to 1994. But Erik now believes that a number of the decisions behind the *ML markup languages were, though well-intentioned, ultimately mistaken. You don't bring up a single one of the points raised in his generally well-known critiques, relying on simple bromides and straw men.

akhu said:

surprised
You are right, eval (or evaluate, sometimes) is not part of the standard and this is quite unfortunate.

XSLT processors like Saxon do have an eval (and evaluate) extension function to do the job.

It may not be a part of the standard, but still it is available as part of the language, at least with the better processors.

Still, I agree with you, it should be part of the standard language and many interesting applications would not be feasible without it. We do use it extensively.

Thank you.
ac.

rjelliffe said:

surprised
Indeed, the LISP link is direct: XSL is a reworking of ISO DSSSL (Document Style Semantics and Specification Language, pronounced like 'thistle') which used a subset of Scheme, a functional LISP language. (The editor of ISO DSSSL, W3C XPath and W3C XSLT was James Clark, technical lead for the XML Working Group and later editor of the ISO RELAX NG schema language. He also wrote the well-known open source programs groff, sgmls, xp, st, and jing.)

Apparently DSSSL was originally going to have a custom syntax, but they were having trouble making a nice one. (This was more than a decade ago.) Interleaf were using LISP with success; I wrote to comp.text.sgml to report that in Japan I had co-written RISP, a LISP subset for processing an SGML subset, and that it worked well; plus there were other LISP advocates around. James Clark had previously implemented the troff typesetting language, which is as far from elegant as possible, so I think he was keen to adopt a functional/declarative approach. I wasn't involved in ISO at that stage, so I don't know the exact details.

Some purists (and I agree with them) say that the essence of LISP is not functional programming, nor the list structure, nor garbage collection, and so on, but the eval function. XSLT does not have the ability to generate a script then run it (if it did, Schematron could be compiled in one pass, for example!).

But you make a good point: with XSLT, you do have your program code available as a list/tree that can be accessed/created by other processes, just like in LISP. But, unfortunately, not in the same executing program. Java with the BeanShell does let you generate text and execute it as code, but it does not give a neat tree for manipulating the code (e.g. for symbolic computation). So the closest thing for Java is to store the parse tree in an XML tree that can be manipulated, then generate BeanShell scripts. (The alternative to BeanShell, using reflection, is just too horrible to contemplate for maintenance reasons at least.)
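A rough sketch of that last idea, with Python's built-in eval standing in for the BeanShell interpreter (an assumption made so the example is self-contained; it is not how the Topologi products do it). The element names add, mul and num are invented for the illustration. The expression lives in an XML tree, the tree is rewritten as data, and script text is generated from it and evaluated.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: each operator element maps to an infix operator.
OPERATORS = {"add": "+", "mul": "*"}


def to_script(node):
    """Generate script text from an expression stored as an XML tree."""
    if node.tag == "num":
        # Round-trip through int() so only integer literals reach eval.
        return str(int(node.text))
    operator = OPERATORS[node.tag]
    operands = [to_script(child) for child in node]
    return "(" + f" {operator} ".join(operands) + ")"


# The LISP expression (* (+ 1 2) 4), held as an XML parse tree.
tree = ET.fromstring("<mul><add><num>1</num><num>2</num></add><num>4</num></mul>")

script_before = to_script(tree)
value_before = eval(script_before)

# Manipulate the code as data: turn every addition into a multiplication.
for node in list(tree.iter("add")):
    node.tag = "mul"

script_after = to_script(tree)
value_after = eval(script_after)

print(script_before, "=", value_before)
print(script_after, "=", value_after)
```

The tree manipulation and the evaluation are two separate steps joined by generated text, which is exactly the seam that LISP does not have.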

akhu said:

surprised
I am surprised that you have not considered XML and XSLT. I feel that this is a lot like LISP, better in some ways (e.g. standards, web), at least it is functional programming working on graphs. Did I miss something?

Cheers,
ac
