Fast Infoset and Respect-based Standards
Related link: https://fi.dev.java.net/
I'm quite bullish about W3C's "Binary XML Infoset" project, after looking at the java.net Open Source library Fast Infoset, which has been mentioned in a few blogs recently.
The thing I like is that it reminds me of XML's development: I think respect-based standards are a win all around.
Here are some of the similarities with XML:
- Emphasis on fitting in rather than reinventing the wheel: Fast Infoset is using ASN.1, which is mature and has well-understood strengths and weaknesses. Think of XML's efforts to build on SGML, HTTP and Unicode.
- Standards based: Fast Infoset is going through international standardization as ITU-T Rec. X.891 | ISO/IEC 24824-1 (Fast Infoset). For XML, good relations with ISO led to the ISO SGML standard being altered to cope with XML.
- Experience: primitive interchange formats are just too important to adopt without experience and analysis. Again, XML was based on years of experience.
- Use W3C to rebadge the technology: give bigots (people who maintain a dislike long after the ground for disliking has disappeared) a face-maintaining way to adopt a technology they had rejected earlier, for whatever reason.
Many people had rejected SGML as being too complicated, though most people who were using SGML were being quite simple in what they used. But changing the name to XML and the organisation to W3C let them adopt the technology of a standard generalized markup language (oh, this technology is so human friendly!) without having to adopt the Standard Generalized Markup Language (oh, that technology is so human unfriendly!).
- Freeing up a good technology currently locked up in standards: most ISO standards are copyrighted and must be bought, though you can typically find last draft versions on the web with a simple Google search.
This really goes against the WWW revolution. If W3C adopted Fast Infoset, it would be more easily available for SOHO or budget-poor implementers. A free-beer, free spec also allows better Open Source implementations and the all-important API competition, and reduces the technology as a field of competition between the big boys. You can easily imagine that Microsoft would decide not to support something that Sun supports, or that IBM supports, just because of the competitive mindset: hopefully the good experience with TCP/IP, HTTP and XML will reduce pathological competitiveness. It would be great if Microsoft were to adopt Fast Infoset, and whatever W3C goes for, if they are compatible with XML and better.
- Constructive rather than carping: when I read general, rather than technical, criticism of standards or standards bodies, I usually detect strategic sour grapes, where the organization or writer is trying to undermine a process that they cannot influence enough.
XML wasn't based on the mentality "people who don't or won't use this are idiots" but "we want to add to the solution space."
- Sisters are doing it for themselves: the java.net implementation and the ISO/ITU-T projects are proactive. Jon Bosak didn't wait for SGML's next ten-year review or for someone else (W3C?) to do it; he went ahead and energized the process of making a simplified SGML for the Web.
- Adds value: the thing I really like about the Fast Infoset proposal is that integers seem to be a built-in type. This does not prevent numbers being serialized to, say, SAX streams as strings, but does allow a good speed-up for numeric data. XML and SGML simply were not designed to be optimal or useful for numeric data, except perhaps for archiving. XML added value to text/plain, and its internationalization added substantial value to SGML, as it turned out.
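To make the point about numeric data concrete, here is a minimal sketch in Python. It is not Fast Infoset's actual encoding: the fixed-width 32-bit big-endian layout and the sample values are assumptions chosen only to contrast "numbers as text" with "numbers as a built-in binary type".

```python
import struct

# Hypothetical sample data; any list of integers that fit in 32 bits would do.
values = [3, 1415, 92653, 58979, 323846]

# Text route: numbers written as decimal strings, as they would appear
# in an XML text node, then parsed back character by character.
text_bytes = " ".join(str(v) for v in values).encode("utf-8")
parsed_from_text = [int(token) for token in text_bytes.decode("utf-8").split()]

# Binary route: each number stored as a fixed-width 32-bit big-endian integer.
# This layout is illustrative only, not the encoding defined by X.891.
layout = f">{len(values)}i"
binary_bytes = struct.pack(layout, *values)
parsed_from_binary = list(struct.unpack(layout, binary_bytes))

print(len(text_bytes), "bytes as text;", len(binary_bytes), "bytes as binary")
```

Both routes recover the same numbers. The binary route skips the string-to-number conversion entirely, which is where a reader of numeric-heavy documents would expect to save time, while a SAX-style API could still hand the values to an application as strings if it wanted them that way.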
But the biggest way that the W3C Binary project and Fast Infoset remind me of XML's development is respect. Under Sun's Jon Bosak, XML's development was based on respect that people have different legitimate requirements, respect for standards, developers, open source and commercial imperatives, respect for industry experience, respect for internationalization, and so on.
XML is a rebadged standard technology that emerged out of this discipline of basic respect. In some other standards groups I have seen, I was shocked by how little respect there was: acting with no respect for the requirements of anyone other than your own company's current customers seems to be a sure way to guarantee a crappy standard. In developing ISO Schematron, I really tried to accept and correct any "big picture" criticism.
I recently read a criticism of the "Binary XML Infoset" project as polluting the stream. I believe the lesson to be learned from XML is not that "Everyone should use one format, it should be simple, it should be Unicode, it should use angle brackets" but the far more challenging "Respect-driven standards development produces really good and generally applicable results."
The other really nice thing in Fast Infoset is that, apparently, you can define your own datatypes more readily, especially for lists of numbers and so on. XML Schemas datatypes went severely wrong by rejecting the old SGML idea of notations: that there are an infinite number of data formats you might want to embed in XML elements or external documents. Extensible embedded data formats have been resurrected in a better form by Jeni Tennison's extensible XML Datatypes library.
Can't a guy get any respect around here?
Categories: Web
Read more entries by Rick Jelliffe.

It's not W3C
Sun did not bring Fast Infoset to ISO/ITU-T. The Fast Infoset standard is being developed within the ISO/ITU-T ASN.1 standards group, with major contributions from OSS Nokalva, Sun and other people. It has been a group effort from the start.
W3C Banzai! Banzai! Banzai!
Why was it necessary to go to ISO? Because ASN.1 was already at ISO, before XML was a twinkle in Jon Bosak's overalls.
SGML started at ISO and ended at W3C as XML. ASN.1 could do the same thing. It would be a nice role for the W3C to rework/re-target and popularize mature ISO standards. There is no need to assume that because something is an ISO standard the W3C could not adopt a profile of it as well: see ISO as a beginning, not an end.
The simple answer to all those scary "what if" questions is just to say "what if they don't?"
But, more than that, if they have some completely different requirement to XML text or ASN.1 binary, then they should be encouraged to make their own format: their needs should be respected!
For example, take SVG, which uses inheritance a lot: ASN.1 and XML may both be very suboptimal for that, but having an ASN.1 solution in place may politically clear the room for their voices to be heard and for others in the same boat to band together.
Web servers support content negotiation and compression negotiation. MIME supports multiple types. These binary formats can have SAX etc. readers and writers. There is no architectural problem with binary XML infosets, but the human "Not Invented Here" mentality that I see sometimes in some W3C member representatives seems to be based on fear of losing their (non-existent) control rather than embracing the adventure and leaping into the fray.
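As a rough illustration of the content negotiation point, here is a minimal Python sketch of a server choosing between a text and a binary serialization of the same infoset from the client's Accept header. The function names are hypothetical, the media type strings `application/xml` and `application/fastinfoset` are assumptions for the example, and the parser deliberately ignores wildcards such as `*/*`.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header value."""
    prefs = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # HTTP default when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable weight as "not acceptable"
        prefs.append((media_type, q))
    return prefs


def choose_representation(accept_header, available, default="application/xml"):
    """Pick the available media type the client weights highest."""
    best_type, best_q = None, 0.0
    for media_type, q in parse_accept(accept_header):
        if media_type in available and q > best_q:
            best_type, best_q = media_type, q
    return best_type or default


# Hypothetical set of serializations this server can produce.
AVAILABLE = {"application/xml", "application/fastinfoset"}
print(choose_representation("application/fastinfoset, application/xml;q=0.5", AVAILABLE))
```

A client that has never heard of the binary format never asks for it and keeps receiving ordinary XML, which is the sense in which a binary infoset need not disturb the existing architecture.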
Microsoft (and indeed, people who support LDAP structured information, so I presume Linux and Apple as well) already provide ASN.1 libraries as part of the OS. If there is a requirement from customers, if there is a standard, matured and deployed technology that largely fits the bill, if it does not disrupt the general architecture, if it allows better transmission of the PSVI (in particular, for numbers) then, to me, the main argument against is just puritan minimalism. The sky is not falling.
I am particularly wary of vague fear-mongering as an argument in standards-making, because I have seen standards become impractical when committee members bought into "but you cannot guarantee that this will not have unintended consequences so horrible that we cannot even imagine what might cause them: better leave things alone".
W3C Banzai! Banzai! Banzai!
There are multiple standards networks in this game. The Sun FastInfoset is also being developed as part of the W3DC (X3D, VRML) Consortium work. So far, Sun has stepped up to the plate to provide technical support, and at the end of the day, running code still rules. A problematic issue has also been the software patents (yes, someone patented Schema-based binarization). The W3DC has pledged to work with the W3C on this effort.
I believe that when it is all said and done, Rick put his finger on the issue, which is not the technology, but the human behavior. A respect-based standard certainly optimizes for the most players, and Pareto optimality is a sign of stability. All work done on the Internet and with XML can't go through the W3C first. That would turn the W3C into a bottleneck and a hegemonic organization. The best future is for the standards organizations to learn the lessons that made the technology successful: cooperative networks with overlapping members.
It is an ecology of games, as Long put it. Played respectfully, the outcomes are more likely to be optimal.
W3C Banzai! Banzai! Banzai!
Banzai? Interesting subject line. :)
Your history lesson is very interesting, but it seems to reinforce, not contradict, my main point: all those things you mention resulted in various XML specs that *ended up at w3c.* This is the first time that an XML serialization effort is going to another standards organization. I worry that the XML world will be bifurcated if a future W3C WG comes up with a different binary standard. And what about the precedent: suppose the US defense dept comes up with yet another binary standard? Or IEEE comes up with a numeric-xml binary standard?
Why was it necessary to go to ISO? Why wasn't something like the Java Community Process good enough here?
Whatever. We disagree, that's fine. But as for your PS/BTW comment, I'm not the only one who thought that your description of the Java code, the ISO WG, and the W3C binary characterization WG wasn't as clear as you'd hoped.
W3C Banzai! Banzai! Banzai!
What disrespect?
ASN.1 is an ISO standard: Are ISO not allowed to evolve any of their standards? Are Sun not allowed to implement ISO standards that they have experience in? Are W3C WGs not allowed to consider the experience gained from Open Source projects when making their own decisions? Is it improper for a W3C member to promote its favoured solution? Are W3C's processes and position compromised by running code?
As for W3C "creating" XML, I think that is entirely misleading. W3C provided a forum and a badge for assembling XML; the markup community created it; the XML Working Group excelled because it listened to the markup community's experience; the facilities that W3C provide are not guaranteed to produce such good results each time, and I am sure that W3C staff are only too aware of that.
To be more detailed, from my perspective (which is undoubtedly partial):
a big push from Gavin Nicol's use of Unicode in HTML);
(probably from several places, however it was part of my ERCS and part of the initial draft of Annex J of ISO SGML which I wrote, before XML started, but obviously hex references are not novel)
(from me, based on modem autodetection work years before, but obviously magic numbers and so are not novel)
Unless I have forgotten something, everything else in XML is just SGML pared away in the light of monastic SGML, HTML and the insight and good taste of people like Arjun Ray, Tim Bray and James Clark. Since W3C as an institution did not contribute anything new, I don't know what you can mean by "created", unless you think that wrapping a present in paper "creates" a Christmas present; "recommended" or "compiled" or "sponsored" are as much as you could say. I respect W3C because of the integrity of its staff and its commitment to accessibility, to internationalization and the way it has resisted becoming a sales front for companies with software patents. To say that "create" is an incorrect word is not to disrespect them.
Sun's Jon Bosak instigated and nurtured XML; he could have chosen some other forum instead of W3C: W3C was convenient at that time. So that is another similarity: Sun takes an existing ISO technology it already uses, and makes a profile of it for simplified Web use (in this case, through ISO/ITU-T), and works in a W3C WG to get W3C branding kudos on it. (Indeed, W3C owes just as much respect to Sun for "creating" XML as the other way around.)
B.t.w., I think in the blog I maintained a clear distinction between the java.net project, the ISO project, and the W3C project, and I certainly don't want to represent that the draft ITU-T/ISO standard that java.net's Fast Infoset project is implementing is a shoo-in to be also adopted by W3C's Binary XML Infoset project. But it is great that it is there for people who need something like it in the short term.
It's not W3C
The "fast infoset" isn't a w3c effort. It's Sun bringing it directly to ISO. Seems to me that shows a real lack of respect for the institution that created XML...