This is a Free Software DOM Level 3 implementation, supporting these features:
Note that while DOM does not specify its behavior in the face of concurrent access, this implementation does. Specifically:
A number of DOM implementations are available in Java, including commercial ones from Sun, IBM, Oracle, and DataChannel as well as noncommercial ones from Docuverse, OpenXML, and Silfide. Why have another? Some of the goals of this version:
This also works with the GNU Compiler for Java (GCJ). GCJ promises to be quite the environment for programming Java, both directly and from C++ using the new CNI interfaces (which really use C++, unlike JNI).
At this writing:
I ran a profiler a few times and removed some of the performance hotspots, but it's not tuned. Reporting mutation events, in particular, is rather costly -- it started at about a 40% penalty for appendChild calls, and I've got it down to around 12%, but it'll be hard to shrink it much further. The overall code size is relatively small, though you may want to be rid of many of the unused DOM interface classes (HTML, CSS, and so on).
Starting with DOM Level 2, you can really see that DOM is constructed as a bunch of optional modules around a core of either XML or HTML functionality. Different implementations will support different optional modules. This implementation provides a set of features that should be useful if you're not depending on the HTML functionality (lots of convenience functions that mostly don't buy much except API surface area) and user interface support. That is, browsers will want more -- but what they need should be cleanly layered over what's already here.
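Since different implementations support different optional modules, it can help to ask at runtime. Here is a minimal sketch using only the standard DOMImplementation.hasFeature call. The class and method names (FeatureProbe, defaultImplementation, supports) are made up for illustration, and the answers depend on whichever DOM your JAXP factory happens to hand back, not necessarily this package.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

// Hypothetical helper, not part of this package.
final class FeatureProbe {
    /** Returns the DOMImplementation behind the default JAXP factory. */
    static DOMImplementation defaultImplementation() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .getDOMImplementation();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no usable DOM implementation", e);
        }
    }

    /** True if the implementation claims the module, at any version. */
    static boolean supports(DOMImplementation impl, String feature) {
        // A null version asks about "any version" of the feature.
        return impl.hasFeature(feature, null);
    }

    public static void main(String[] args) {
        DOMImplementation impl = defaultImplementation();
        String[] features = {"XML", "Events", "MutationEvents", "Traversal", "HTML"};
        for (String feature : features) {
            System.out.println(feature + ": " + supports(impl, feature));
        }
    }
}
```

Code that depends on an optional module can check for it once at startup and fall back to core-only utilities when it is missing.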
This DOM implementation supports the "XML" feature set, which basically gets you four things over the bare core (which you're officially not supposed to implement except in conjunction with the "XML" or "HTML" feature). In order of decreasing utility, those four things are:
Events may be one of the more interesting new features in Level 2. This package provides the core feature set and exposes mutation events. No gooey events though; if you want that, write a layered implementation!
Three mutation events aren't currently generated:
In addition, certain kinds of attribute modification aren't reported. A fix is known, but it couldn't report the previous value of the attribute. More work could fix all of this (as well as reduce the generally high cost of childful attributes), but that's not been done yet.
Also, note that it is a Bad Thing to have the listener for a mutation event change the ancestry for the target of that event. Or to prevent mutation events from bubbling to where they're needed. Just don't do those, OK?
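As a rough sketch of what a well-behaved mutation listener looks like, the following registers for the standard "DOMNodeInserted" event and only records what it sees, without touching the tree. It uses the org.w3c.dom.events interfaces and assumes the document in hand implements EventTarget and reports mutation events; the MutationLog class and its methods are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.events.EventListener;
import org.w3c.dom.events.EventTarget;

// Hypothetical example class, not part of this package.
final class MutationLog {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no usable DOM implementation", e);
        }
    }

    /** Builds a small tree and returns the names of the nodes reported as inserted. */
    static List<String> insertedNames() {
        Document doc = newDocument();
        Element root = doc.createElement("root");
        doc.appendChild(root);

        List<String> seen = new ArrayList<>();
        // Read-only listener: it records the event target and leaves the tree alone.
        EventListener listener = evt -> seen.add(((Node) evt.getTarget()).getNodeName());
        // Assumption: the implementation's nodes are EventTargets.
        // useCapture = false, so the listener runs at the target and while bubbling.
        ((EventTarget) root).addEventListener("DOMNodeInserted", listener, false);

        Element item = doc.createElement("item");
        root.appendChild(item);                          // reported at root via bubbling
        item.appendChild(doc.createElement("nested"));   // bubbles up through item to root
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(insertedNames());
    }
}
```

The listener sits on the root and relies on bubbling, which is exactly why listeners lower down shouldn't stop propagation.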
As an experimental feature (named "USER-Events"), you can provide your own "user" events. Just name them anything starting with "USER-" and you're set. Dispatch them through bubbling, capturing, or whatever takes your fancy. One important thing you can't currently do is pass any data (like an object) with those events. Maybe later there will be a "UserEvent" interface letting you get some substantial use out of this mechanism even if you're not "inside" of a DOM package.
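A sketch of dispatching such an event through the standard DOM Level 2 calls might look like the following. It assumes that createEvent("Events") yields a plain Event in the implementation you're running against and that arbitrary type names are dispatched; the "USER-ping" name and the UserEventDemo class are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.events.DocumentEvent;
import org.w3c.dom.events.Event;
import org.w3c.dom.events.EventTarget;

// Hypothetical example class, not part of this package.
final class UserEventDemo {
    /** Dispatches a "USER-ping" event from a child and records where it was heard. */
    static List<String> dispatchPing() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no usable DOM implementation", e);
        }
        Element root = doc.createElement("root");
        Element child = doc.createElement("child");
        doc.appendChild(root);
        root.appendChild(child);

        List<String> received = new ArrayList<>();
        // Listen on the root during the bubbling phase.
        ((EventTarget) root).addEventListener("USER-ping", evt -> received.add(
                evt.getType() + "@" + ((Node) evt.getCurrentTarget()).getNodeName()), false);

        // Assumption: the document supports the Events module.
        Event ping = ((DocumentEvent) doc).createEvent("Events");
        ping.initEvent("USER-ping", true, false);   // bubbles, not cancelable
        ((EventTarget) child).dispatchEvent(ping);  // carries no payload, only a name
        return received;
    }

    public static void main(String[] args) {
        System.out.println(dispatchPing());
    }
}
```

Since the event itself can't carry data, a listener that needs more than the name has to find it elsewhere, for instance on the target node.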
You can create and send HTML events. Ditto UIEvents. Since DOM doesn't require a UI, it's the UI's job to send them; perhaps that's part of your application.
This package may be built without the ability to report mutation events, gaining a significant speedup in DOM construction time. However, if that is done then certain other features -- notably node iterators and getElementsByTagName -- will not be available.
Each DOM node has all you need to walk to everything connected to that node. Lightweight, efficient utilities are easily layered on top of just the core APIs.
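For instance, a depth-first walk needs nothing beyond getFirstChild and getNextSibling. The sketch below counts elements that way; CoreWalk and its methods are hypothetical names, and the parse helper is only there to make the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical utility, not part of this package.
final class CoreWalk {
    /** Counts element nodes in the subtree rooted at node, using core APIs only. */
    static int countElements(Node node) {
        int count = node.getNodeType() == Node.ELEMENT_NODE ? 1 : 0;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countElements(child);
        }
        return count;
    }

    /** Parses a string into a Document with the default JAXP parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements(parse("<a><b/><c><d/>text</c></a>")));
    }
}
```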
Traversal APIs are an optional part of DOM Level 2, providing a not-so-lightweight way to walk over DOM trees, if your application didn't already have such utilities for use with data represented via DOM. Implementing this helped debug the (optional) event and mutation event subsystems, so it's provided here.
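If you do want the Traversal module, a NodeIterator over elements can be sketched as below. It assumes the Document in hand implements org.w3c.dom.traversal.DocumentTraversal, which is optional; IteratorDemo and its methods are names made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class, not part of this package.
final class IteratorDemo {
    /** Lists element names in document order using a NodeIterator. */
    static List<String> elementNames(Document doc) {
        // Assumption: the implementation supports the optional Traversal module.
        DocumentTraversal traversal = (DocumentTraversal) doc;
        NodeIterator it = traversal.createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);
        List<String> names = new ArrayList<>();
        for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
            names.add(n.getNodeName());
        }
        it.detach();  // release the iterator once done
        return names;
    }

    /** Parses a small sample document and lists its element names. */
    static List<String> demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader("<a><b/><c><d/></c></a>")));
            return elementNames(doc);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The iterator returns the root first and then its descendants in document order, so this does the same job as a hand-written recursive walk with a bit more machinery behind it.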
At this writing, the "TreeWalker" interface isn't implemented.
For what appears to be a combination of historical and "committee logic" reasons, DOM has a number of features which I strongly advise you to avoid using in your library and application code. These include the following types of DOM nodes; see the documentation for the implementation class for more information:
If you really need to use unparsed entities or notations, use SAX; it offers better support for all DTD-related functionality. It also exposes actual document typing information (such as element content models).
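As a sketch of the SAX route, the handler below picks up notation and unparsed entity declarations through the standard DTDHandler callbacks (which DefaultHandler implements). The sample DTD, the DtdDecls class, and the output format are all invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of this package.
final class DtdDecls {
    /** Returns the notation and unparsed entity declarations seen while parsing. */
    static List<String> declarations(String xml) {
        List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                found.add("notation:" + name);
            }

            @Override
            public void unparsedEntityDecl(String name, String publicId,
                                           String systemId, String notationName) {
                found.add("entity:" + name + "/" + notationName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
        return found;
    }

    /** A made-up document whose internal DTD subset declares one of each. */
    static String sample() {
        return "<!DOCTYPE doc [\n"
                + "<!ELEMENT doc EMPTY>\n"
                + "<!NOTATION gif SYSTEM \"image/gif\">\n"
                + "<!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>\n"
                + "]>\n"
                + "<doc/>";
    }

    public static void main(String[] args) {
        System.out.println(declarations(sample()));
    }
}
```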
Also, when accessing attribute values, use methods that provide their values as single strings, rather than those which expose value substructure (Text and EntityReference nodes). (See the DomAttr documentation for more information.)
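In practice that means calling Element.getAttribute (or Attr.getValue) and ignoring the Attr node's children. A small sketch, with AttrValues and its methods as hypothetical names:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class, not part of this package.
final class AttrValues {
    /** Reads an attribute of the document element as one string. */
    static String rootAttribute(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            // getAttribute returns the whole value as a single string;
            // there is no need to walk the Attr node's Text/EntityReference children.
            return root.getAttribute(name);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootAttribute("<a title=\"x &amp; y\"/>", "title"));
    }
}
```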
Note that many of these features were provided as partial support for editor functionality (including the incomplete DTD access). Full editor functionality requires access to potentially malformed lexical structure, at the level of unparsed tokens and below. Access at such levels is so complex that using it in non-editor applications sacrifices all the benefits of XML; editor applications need extremely specialized APIs.
(This isn't a slam against DTDs, note; only against the broken support for them in DOM. Even despite inclusion of some dubious SGML legacy features such as notations and unparsed entities, and the ongoing proliferation of alternative schema and validation tools, DTDs are still the most widely adopted tool to constrain XML document structure. Alternative schemes generally focus on data transfer style applications; open document architectures comparable to DocBook 4.0 don't yet exist in the schema world. Feel free to use DTDs; just don't expect DOM to help you.)