Diffstat (limited to 'libjava/classpath/gnu/xml/pipeline')
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/CallFilter.java 257
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/DomConsumer.java 967
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/EventConsumer.java 95
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/EventFilter.java 796
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/LinkFilter.java 242
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/NSFilter.java 341
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/PipelineFactory.java 723
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/TeeConsumer.java 417
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/TextConsumer.java 117
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/ValidationConsumer.java 1928
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/WellFormednessFilter.java 363
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/XIncludeFilter.java 579
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/XsltFilter.java 130
-rw-r--r-- libjava/classpath/gnu/xml/pipeline/package.html 255
14 files changed, 7210 insertions, 0 deletions
diff --git a/libjava/classpath/gnu/xml/pipeline/CallFilter.java b/libjava/classpath/gnu/xml/pipeline/CallFilter.java
new file mode 100644
index 000000000..2398b8685
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/CallFilter.java
@@ -0,0 +1,257 @@
+/* CallFilter.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import java.io.IOException;
+import java.io.OutputStreamWriter;
+import java.net.URL;
+import java.net.URLConnection;
+import java.io.Writer;
+
+import org.xml.sax.DTDHandler;
+import org.xml.sax.ErrorHandler;
+import org.xml.sax.InputSource;
+import org.xml.sax.SAXException;
+import org.xml.sax.SAXNotRecognizedException;
+import org.xml.sax.XMLReader;
+import org.xml.sax.helpers.XMLReaderFactory;
+
+import gnu.xml.util.Resolver;
+import gnu.xml.util.XMLWriter;
+
+
+/**
+ * Input is sent as an XML request to the given URI, and the output of this
+ * filter is the parsed response to that request.
+ * A connection is opened to the remote URI when the startDocument call is
+ * issued through this filter, and the request is finished when the
+ * endDocument call is issued. Events should be written quickly enough to
+ * prevent the remote HTTP server from aborting the connection due to
+ * inactivity; you may want to buffer text in an earlier pipeline stage.
+ * If your application requires validity checking of such
+ * outputs, have the output pipeline include a validation stage.
+ *
+ * <p>In effect, this makes a remote procedure call to the URI, with the
+ * request and response document syntax as chosen by the application.
+ * <em>Note that all the input events must be seen, and sent to the URI,
+ * before the first output event can be seen. </em> Clients are delayed
+ * at least by waiting for the server to respond, constraining concurrency.
+ * Services can thus be used to synchronize concurrent activities, and
+ * even to prioritize service among different clients.
+ *
+ * <p> You are advised to avoid restricting yourself to an "RPC" model
+ * for distributed computation. With a World Wide Web, network latencies
+ * and failures (e.g. non-availability)
+ * are significant; adopting a "procedure" model, rather than a workflow
+ * model where bulk requests are sent and worked on asynchronously, is not
+ * generally an optimal system-wide architecture. When the messages may
+ * need authentication, such as with an OpenPGP signature, or when server
+ * loads don't argue in favor of immediate responses, non-RPC models can
+ * be advantageous. (So-called "peer to peer" computing models are one
+ * additional type of model, though too often that term is applied to
+ * systems that still have a centralized control structure.)
+ *
+ * <p> <em>Be strict in what you send, liberal in what you accept,</em> as
+ * the Internet tradition goes. Strictly conformant data should never cause
+ * problems to its receiver; make your request pipeline be very strict, and
+ * don't compromise on that. Make your response pipeline strict as well,
+ * but be ready to tolerate specific mild, temporary, and well-documented
+ * variations from specific communications peers.
+ *
+ * @see XmlServlet
+ *
+ * @author David Brownell
+ */
+final public class CallFilter implements EventConsumer
+{
+ private Requestor req;
+ private EventConsumer next;
+ private URL target;
+ private URLConnection conn;
+ private ErrorHandler errHandler;
+
+
+ /**
+ * Initializes a call filter so that its inputs are sent to the
+ * specified URI, and its outputs are sent to the next consumer
+ * provided.
+ *
+ * @exception IOException if the URI isn't accepted as a URL
+ */
+ // constructor used by PipelineFactory
+ public CallFilter (String uri, EventConsumer next)
+ throws IOException
+ {
+ this.next = next;
+ req = new Requestor ();
+ setCallTarget (uri);
+ }
+
+ /**
+ * Assigns the URI of the call target to be used.
+ * Does not affect calls currently being made.
+ */
+ final public void setCallTarget (String uri)
+ throws IOException
+ {
+ target = new URL (uri);
+ }
+
+ /**
+ * Assigns the error handler used to present most fatal errors.
+ */
+ public void setErrorHandler (ErrorHandler handler)
+ {
+ errHandler = handler;
+ req.setErrorHandler (handler);
+ }
+
+
+ /**
+ * Returns the call target's URI.
+ */
+ final public String getCallTarget ()
+ {
+ return target.toString ();
+ }
+
+ /** Returns the content handler currently in use. */
+ final public org.xml.sax.ContentHandler getContentHandler ()
+ {
+ return req;
+ }
+
+ /** Returns the DTD handler currently in use. */
+ final public DTDHandler getDTDHandler ()
+ {
+ return req;
+ }
+
+
+ /**
+ * Returns the declaration or lexical handler currently in
+ * use, or throws an exception for other properties.
+ */
+ final public Object getProperty (String id)
+ throws SAXNotRecognizedException
+ {
+ if (EventFilter.DECL_HANDLER.equals (id))
+ return req;
+ if (EventFilter.LEXICAL_HANDLER.equals (id))
+ return req;
+ throw new SAXNotRecognizedException (id);
+ }
+
+
+ // JDK 1.1 seems to need it to be done this way, sigh
+ ErrorHandler getErrorHandler () { return errHandler; }
+
+ //
+ // Takes input and echoes to server as POST input.
+ // Then sends the POST reply to the next pipeline element.
+ //
+ final class Requestor extends XMLWriter
+ {
+ Requestor ()
+ {
+ super ((Writer)null);
+ }
+
+ public synchronized void startDocument () throws SAXException
+ {
+ // Connect to remote object and set up to send it XML text
+ try {
+ if (conn != null)
+ throw new IllegalStateException ("call is being made");
+
+ conn = target.openConnection ();
+ conn.setDoOutput (true);
+ conn.setRequestProperty ("Content-Type",
+ "application/xml;charset=UTF-8");
+
+ setWriter (new OutputStreamWriter (
+ conn.getOutputStream (),
+ "UTF8"), "UTF-8");
+
+ } catch (IOException e) {
+ fatal ("can't write (POST) to URI: " + target, e);
+ }
+
+ // NOW base class can safely write that text!
+ super.startDocument ();
+ }
+
+ public void endDocument () throws SAXException
+ {
+ //
+ // Finish writing the request (for HTTP, a POST);
+ // this closes the output stream.
+ //
+ super.endDocument ();
+
+ //
+ // Receive the response.
+ // Produce events for the next stage.
+ //
+ InputSource source;
+ XMLReader producer;
+ String encoding;
+
+ try {
+
+ source = new InputSource (conn.getInputStream ());
+
+// FIXME if status is anything but success, report it!! It'd be good to
+// save the request data just in case we need to deal with a forward.
+
+ encoding = Resolver.getEncoding (conn.getContentType ());
+ if (encoding != null)
+ source.setEncoding (encoding);
+
+ producer = XMLReaderFactory.createXMLReader ();
+ producer.setErrorHandler (getErrorHandler ());
+ EventFilter.bind (producer, next);
+ producer.parse (source);
+ conn = null;
+
+ } catch (IOException e) {
+ fatal ("I/O Exception reading response, " + e.getMessage (), e);
+ }
+ }
+ }
+}
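Reviewer sketch, not part of the commit: the response half of Requestor.endDocument (wrap the reply stream in an InputSource, apply the declared encoding if any, then parse it into the next stage) can be tried with JDK classes alone. Everything named here is hypothetical: ResponseReplaySketch, NameCollector, replay, and elementNames are illustration-only names, a byte array stands in for conn.getInputStream (), a DefaultHandler stands in for the next EventConsumer that the commit binds with EventFilter.bind, and SAXParserFactory is swapped in for XMLReaderFactory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class ResponseReplaySketch {

    /** Hypothetical stand-in for the next pipeline stage: records element names. */
    static final class NameCollector extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            names.add(qName);
        }
    }

    /** Parses a response body and pushes its SAX events into the given handler. */
    static void replay(InputStream body, String encoding, DefaultHandler next)
            throws IOException, SAXException, ParserConfigurationException {
        InputSource source = new InputSource(body);
        // Mirrors the commit: only override the encoding when one was declared.
        if (encoding != null)
            source.setEncoding(encoding);

        // Assumption: the JDK's default (non-namespace-aware) SAX parser.
        XMLReader producer =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        producer.setContentHandler(next);
        producer.setErrorHandler(next);
        producer.parse(source);
    }

    /** Convenience wrapper: returns element names in document order. */
    static List<String> elementNames(String xml) {
        NameCollector next = new NameCollector();
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try {
            replay(new ByteArrayInputStream(bytes), "UTF-8", next);
        } catch (IOException | SAXException | ParserConfigurationException e) {
            throw new RuntimeException(e);
        }
        return next.names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<reply><status>ok</status></reply>"));
    }
}
```

Keeping the stream source and the downstream handler as parameters is a choice made for this sketch only; it lets the parsing step be exercised without opening a network connection.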
diff --git a/libjava/classpath/gnu/xml/pipeline/DomConsumer.java b/libjava/classpath/gnu/xml/pipeline/DomConsumer.java
new file mode 100644
index 000000000..141f36eca
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/DomConsumer.java
@@ -0,0 +1,967 @@
+/* DomConsumer.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import gnu.xml.util.DomParser;
+
+import org.xml.sax.Attributes;
+import org.xml.sax.ContentHandler;
+import org.xml.sax.DTDHandler;
+import org.xml.sax.ErrorHandler;
+import org.xml.sax.Locator;
+import org.xml.sax.SAXException;
+import org.xml.sax.SAXNotRecognizedException;
+import org.xml.sax.SAXParseException;
+import org.xml.sax.ext.DeclHandler;
+import org.xml.sax.ext.LexicalHandler;
+import org.xml.sax.helpers.AttributesImpl;
+import org.w3c.dom.Attr;
+import org.w3c.dom.CDATASection;
+import org.w3c.dom.CharacterData;
+import org.w3c.dom.Document;
+import org.w3c.dom.DOMImplementation;
+import org.w3c.dom.Element;
+import org.w3c.dom.EntityReference;
+import org.w3c.dom.Node;
+import org.w3c.dom.ProcessingInstruction;
+import org.w3c.dom.Text;
+
+/**
+ * This consumer builds a DOM Document from its input, acting either as a
+ * pipeline terminus or as an intermediate buffer. When a document's worth
+ * of events has been delivered to this consumer, that document is read with
+ * a {@link DomParser} and sent to the next consumer. It is also available
+ * as a read-once property.
+ *
+ * <p>The DOM tree is constructed as faithfully as possible. There are some
+ * complications since a DOM should expose behaviors that can't be implemented
+ * without API backdoors into that DOM, and because some SAX parsers don't
+ * report all the information that DOM permits to be exposed. The general
+ * problem areas involve information from the Document Type Declaration (DTD).
+ * DOM only represents a limited subset, but has some behaviors that depend
+ * on much deeper knowledge of a document's DTD. You shouldn't have much to
+ * worry about unless you change handling of "noise" nodes from its default
+ * setting (which ignores them all); note if you use JAXP to populate your
+ * DOM trees, it wants to save "noise" nodes by default. (Such nodes include
+ * ignorable whitespace, comments, entity references and CDATA boundaries.)
+ * Otherwise, your
+ * main worry will be if you use a SAX parser that doesn't flag ignorable
+ * whitespace unless it's validating (few don't).
+ *
+ * <p> The SAX2 events used as input must contain XML Names for elements
+ * and attributes, with original prefixes. In SAX2,
+ * this is optional unless the "namespace-prefixes" parser feature is set.
+ * Moreover, many application components won't provide completely correct
+ * structures anyway. <em>Before you convert a DOM to an output document,
+ * you should plan to postprocess it to create or repair such namespace
+ * information.</em> The {@link NSFilter} pipeline stage does such work.
+ *
+ * <p> <em>Note: changes late in DOM L2 process made it impractical to
+ * attempt to create the DocumentType node in any implementation-neutral way,
+ * much less to populate it (L1 didn't support even creating such nodes).
+ * To create and populate such a node, subclass the inner
+ * {@link DomConsumer.Handler} class and teach it about the backdoors into
+ * whatever DOM implementation you want. It's possible that some revised
+ * DOM API (L3?) will make this problem solvable again. </em>
+ *
+ * @see DomParser
+ *
+ * @author David Brownell
+ */
+public class DomConsumer implements EventConsumer
+{
+ private Class domImpl;
+
+ private boolean hidingCDATA = true;
+ private boolean hidingComments = true;
+ private boolean hidingWhitespace = true;
+ private boolean hidingReferences = true;
+
+ private Handler handler;
+ private ErrorHandler errHandler;
+
+ private EventConsumer next;
+
+ // FIXME: this can't be a generic pipeline stage just now,
+ // since its input became a Class not a String (to be turned
+ // into a class, using the right class loader)
+
+
+ /**
+ * Configures this pipeline terminus to use the specified implementation
+ * of DOM when constructing its result value.
+ *
+ * @param impl class implementing {@link org.w3c.dom.Document Document}
+ * which publicly exposes a default constructor
+ *
+ * @exception SAXException when there is a problem creating an
+ * empty DOM document using the specified implementation
+ */
+ public DomConsumer (Class impl)
+ throws SAXException
+ {
+ domImpl = impl;
+ handler = new Handler (this);
+ }
+
+ /**
+ * This is the hook through which a subclass provides a handler
+ * which knows how to access DOM extensions, specific to some
+ * implementation, to record additional data in a DOM.
+ * Treat this as part of construction; don't call it except
+ * before (or between) parses.
+ */
+ protected void setHandler (Handler h)
+ {
+ handler = h;
+ }
+
+
+ private Document emptyDocument ()
+ throws SAXException
+ {
+ try {
+ return (Document) domImpl.newInstance ();
+ } catch (IllegalAccessException e) {
+ throw new SAXException ("can't access constructor: "
+ + e.getMessage ());
+ } catch (InstantiationException e) {
+ throw new SAXException ("can't instantiate Document: "
+ + e.getMessage ());
+ }
+ }
+
+
+ /**
+ * Configures this consumer as a buffer/filter, using the specified
+ * DOM implementation when constructing its result value.
+ *
+ * <p> This event consumer acts as a buffer and filter, in that it
+ * builds a DOM tree and then writes it out when <em>endDocument</em>
+ * is invoked. Because of the limitations of DOM, much information
+ * will as a rule not be seen in that replay. To get a full fidelity
+ * copy of the input event stream, use a {@link TeeConsumer}.
+ *
+ * @param impl class implementing {@link org.w3c.dom.Document Document}
+ * which publicly exposes a default constructor
+ * @param next receives a "replayed" sequence of parse events when
+ * the <em>endDocument</em> method is invoked.
+ *
+ * @exception SAXException when there is a problem creating an
+ * empty DOM document using the specified DOM implementation
+ */
+ public DomConsumer (Class impl, EventConsumer next)
+ throws SAXException
+ {
+ this (impl);
+ this.next = next;
+ }
+
+
+ /**
+ * Returns the document constructed from the preceding
+ * sequence of events. This method should not be
+ * used again until another sequence of events has been
+ * given to this EventConsumer.
+ */
+ final public Document getDocument ()
+ {
+ return handler.clearDocument ();
+ }
+
+ public void setErrorHandler (ErrorHandler handler)
+ {
+ errHandler = handler;
+ }
+
+
+ /**
+ * Returns true if the consumer is hiding entity reference nodes
+ * (the default), and false if EntityReference nodes should
+ * instead be created. Such EntityReference nodes will normally be
+ * empty, unless an implementation arranges to populate them and then
+ * turn them back into readonly objects.
+ *
+ * @see #setHidingReferences
+ */
+ final public boolean isHidingReferences ()
+ { return hidingReferences; }
+
+ /**
+ * Controls whether the consumer will hide entity expansions,
+ * or will instead mark them with entity reference nodes.
+ *
+ * @see #isHidingReferences
+ * @param flag False if entity reference nodes will appear
+ */
+ final public void setHidingReferences (boolean flag)
+ { hidingReferences = flag; }
+
+
+ /**
+ * Returns true if the consumer is hiding comments (the default),
+ * and false if they should be placed into the output document.
+ *
+ * @see #setHidingComments
+ */
+ public final boolean isHidingComments ()
+ { return hidingComments; }
+
+ /**
+ * Controls whether the consumer is hiding comments.
+ *
+ * @see #isHidingComments
+ */
+ public final void setHidingComments (boolean flag)
+ { hidingComments = flag; }
+
+
+ /**
+ * Returns true if the consumer is hiding ignorable whitespace
+ * (the default), and false if such whitespace should be placed
+ * into the output document as children of element nodes.
+ *
+ * @see #setHidingWhitespace
+ */
+ public final boolean isHidingWhitespace ()
+ { return hidingWhitespace; }
+
+ /**
+ * Controls whether the consumer hides ignorable whitespace.
+ *
+ * @see #isHidingWhitespace
+ */
+ public final void setHidingWhitespace (boolean flag)
+ { hidingWhitespace = flag; }
+
+
+ /**
+ * Returns true if the consumer is hiding CDATA boundaries (the
+ * default), or false if it is saving them.
+ *
+ * @see #setHidingCDATA
+ */
+ final public boolean isHidingCDATA ()
+ { return hidingCDATA; }
+
+ /**
+ * Controls whether the consumer will hide CDATA boundaries.
+ *
+ * @see #isHidingCDATA
+ * @param flag False to treat CDATA text differently from other
+ * text nodes
+ */
+ final public void setHidingCDATA (boolean flag)
+ { hidingCDATA = flag; }
+
+
+
+ /** Returns the content handler being used. */
+ final public ContentHandler getContentHandler ()
+ { return handler; }
+
+ /** Returns the DTD handler being used. */
+ final public DTDHandler getDTDHandler ()
+ { return handler; }
+
+ /**
+ * Returns the lexical or declaration handler being used.
+ * (DOM construction can't really use declaration handlers.)
+ */
+ final public Object getProperty (String id)
+ throws SAXNotRecognizedException
+ {
+ if ("http://xml.org/sax/properties/lexical-handler".equals (id))
+ return handler;
+ if ("http://xml.org/sax/properties/declaration-handler".equals (id))
+ return handler;
+ throw new SAXNotRecognizedException (id);
+ }
+
+ EventConsumer getNext () { return next; }
+
+ ErrorHandler getErrorHandler () { return errHandler; }
+
+ /**
+ * Class used to intercept various parsing events and use them to
+ * populate a DOM document. Subclasses would typically know and use
+ * backdoors into specific DOM implementations, used to implement
+ * DTD-related functionality.
+ *
+ * <p> Note that if this ever throws a DOMException (runtime exception)
+ * that will indicate a bug in the DOM (e.g. doesn't support something
+ * per specification) or the parser (e.g. emitted an illegal name, or
+ * accepted illegal input data). </p>
+ */
+ public static class Handler
+ implements ContentHandler, LexicalHandler,
+ DTDHandler, DeclHandler
+ {
+ protected DomConsumer consumer;
+
+ private DOMImplementation impl;
+ private Document document;
+ private boolean isL2;
+
+ private Locator locator;
+ private Node top;
+ private boolean inCDATA;
+ private boolean mergeCDATA;
+ private boolean inDTD;
+ private String currentEntity;
+
+ private boolean recreatedAttrs;
+ private AttributesImpl attributes = new AttributesImpl ();
+
+ /**
+ * Subclasses may use SAX2 events to provide additional
+ * behaviors in the resulting DOM.
+ */
+ protected Handler (DomConsumer consumer)
+ throws SAXException
+ {
+ this.consumer = consumer;
+ document = consumer.emptyDocument ();
+ impl = document.getImplementation ();
+ isL2 = impl.hasFeature ("XML", "2.0");
+ }
+
+ private void fatal (String message, Exception x)
+ throws SAXException
+ {
+ SAXParseException e;
+ ErrorHandler errHandler = consumer.getErrorHandler ();
+
+ if (locator == null)
+ e = new SAXParseException (message, null, null, -1, -1, x);
+ else
+ e = new SAXParseException (message, locator, x);
+ if (errHandler != null)
+ errHandler.fatalError (e);
+ throw e;
+ }
+
+ /**
+ * Returns and forgets the document produced. If the handler is
+ * reused, a new document may be created.
+ */
+ Document clearDocument ()
+ {
+ Document retval = document;
+ document = null;
+ locator = null;
+ return retval;
+ }
+
+ /**
+ * Returns the document under construction.
+ */
+ protected Document getDocument ()
+ { return document; }
+
+ /**
+ * Returns the current node being populated. This is usually
+ * an Element or Document, but it might be an EntityReference
+ * node if some implementation-specific code knows how to put
+ * those into the result tree and later mark them as readonly.
+ */
+ protected Node getTop ()
+ { return top; }
+
+
+ // SAX1
+ public void setDocumentLocator (Locator locator)
+ {
+ this.locator = locator;
+ }
+
+ // SAX1
+ public void startDocument ()
+ throws SAXException
+ {
+ if (document == null)
+ try {
+ if (isL2) {
+ // couple to original implementation
+ document = impl.createDocument (null, "foo", null);
+ document.removeChild (document.getFirstChild ());
+ } else {
+ document = consumer.emptyDocument ();
+ }
+ } catch (Exception e) {
+ fatal ("DOM create document", e);
+ }
+ top = document;
+ }
+
+ // SAX1
+ public void endDocument ()
+ throws SAXException
+ {
+ try {
+ if (consumer.getNext () != null && document != null) {
+ DomParser parser = new DomParser (document);
+
+ EventFilter.bind (parser, consumer.getNext ());
+ parser.parse ("ignored");
+ }
+ } finally {
+ top = null;
+ }
+ }
+
+ // SAX1
+ public void processingInstruction (String target, String data)
+ throws SAXException
+ {
+ // we can't create populated entity ref nodes using
+ // only public DOM APIs (they've got to be readonly)
+ if (currentEntity != null)
+ return;
+
+ ProcessingInstruction pi;
+
+ if (isL2
+ // && consumer.isUsingNamespaces ()
+ && target.indexOf (':') != -1)
+ namespaceError (
+ "PI target name is namespace nonconformant: "
+ + target);
+ if (inDTD)
+ return;
+ pi = document.createProcessingInstruction (target, data);
+ top.appendChild (pi);
+ }
+
+ /**
+ * Subclasses may override this method to provide a more efficient
+ * way to construct text nodes.
+ * Typically, copying the text into a single character array will
+ * be more efficient than doing that as well as allocating other
+ * objects needed for a String, including an internal StringBuffer.
+ * Those additional memory and CPU costs can be incurred later,
+ * if ever needed.
+ * Unfortunately the standard DOM factory APIs encourage those costs
+ * to be incurred early.
+ */
+ protected Text createText (
+ boolean isCDATA,
+ char ch [],
+ int start,
+ int length
+ ) {
+ String value = new String (ch, start, length);
+
+ if (isCDATA)
+ return document.createCDATASection (value);
+ else
+ return document.createTextNode (value);
+ }
+
+ // SAX1
+ public void characters (char ch [], int start, int length)
+ throws SAXException
+ {
+ // we can't create populated entity ref nodes using
+ // only public DOM APIs (they've got to be readonly
+ // at creation time)
+ if (currentEntity != null)
+ return;
+
+ Node lastChild = top.getLastChild ();
+
+ // merge consecutive text or CDATA nodes if appropriate.
+ if (lastChild instanceof Text) {
+ if (consumer.isHidingCDATA ()
+ // consecutive Text content ... always merge
+ || (!inCDATA
+ && !(lastChild instanceof CDATASection))
+ // consecutive CDATASection content ... don't
+ // merge between sections, only within them
+ || (inCDATA && mergeCDATA
+ && lastChild instanceof CDATASection)
+ ) {
+ CharacterData last = (CharacterData) lastChild;
+ String value = new String (ch, start, length);
+
+ last.appendData (value);
+ return;
+ }
+ }
+ if (inCDATA && !consumer.isHidingCDATA ()) {
+ top.appendChild (createText (true, ch, start, length));
+ mergeCDATA = true;
+ } else
+ top.appendChild (createText (false, ch, start, length));
+ }
+
+ // SAX2
+ public void skippedEntity (String name)
+ throws SAXException
+ {
+ // this callback is useless except to report errors, since
+ // we can't know if the ref was in content, within an
+ // attribute, within a declaration ... only one of those
+ // cases supports more intelligent action than a panic.
+ fatal ("skipped entity: " + name, null);
+ }
+
+ // SAX2
+ public void startPrefixMapping (String prefix, String uri)
+ throws SAXException
+ {
+ // reconstruct "xmlns" attributes deleted by all
+ // SAX2 parsers without "namespace-prefixes" = true
+ if ("".equals (prefix))
+ attributes.addAttribute ("", "", "xmlns",
+ "CDATA", uri);
+ else
+ attributes.addAttribute ("", "", "xmlns:" + prefix,
+ "CDATA", uri);
+ recreatedAttrs = true;
+ }
+
+ // SAX2
+ public void endPrefixMapping (String prefix)
+ throws SAXException
+ { }
+
+ // SAX2
+ public void startElement (
+ String uri,
+ String localName,
+ String qName,
+ Attributes atts
+ ) throws SAXException
+ {
+ // we can't create populated entity ref nodes using
+ // only public DOM APIs (they've got to be readonly)
+ if (currentEntity != null)
+ return;
+
+ // parser discarded basic information; DOM tree isn't writable
+ // without massaging to assign prefixes to all nodes.
+ // the "NSFilter" class does that massaging.
+ if (qName.length () == 0)
+ qName = localName;
+
+
+ Element element;
+ int length = atts.getLength ();
+
+ if (!isL2) {
+ element = document.createElement (qName);
+
+ // first the explicit attributes ...
+ length = atts.getLength ();
+ for (int i = 0; i < length; i++)
+ element.setAttribute (atts.getQName (i),
+ atts.getValue (i));
+ // ... then any recreated ones (DOM deletes duplicates)
+ if (recreatedAttrs) {
+ recreatedAttrs = false;
+ length = attributes.getLength ();
+ for (int i = 0; i < length; i++)
+ element.setAttribute (attributes.getQName (i),
+ attributes.getValue (i));
+ attributes.clear ();
+ }
+
+ top.appendChild (element);
+ top = element;
+ return;
+ }
+
+ // For an L2 DOM when namespace use is enabled, use
+ // createElementNS/createAttributeNS except when
+ // (a) it's an element in the default namespace, or
+ // (b) it's an attribute with no prefix
+ String namespace;
+
+ if (localName.length () != 0)
+ namespace = (uri.length () == 0) ? null : uri;
+ else
+ namespace = getNamespace (getPrefix (qName), atts);
+
+ if (namespace == null)
+ element = document.createElement (qName);
+ else
+ element = document.createElementNS (namespace, qName);
+
+ populateAttributes (element, atts);
+ if (recreatedAttrs) {
+ recreatedAttrs = false;
+ // ... DOM deletes any duplicates
+ populateAttributes (element, attributes);
+ attributes.clear ();
+ }
+
+ top.appendChild (element);
+ top = element;
+ }
+
+ final static String xmlnsURI = "http://www.w3.org/2000/xmlns/";
+
+ private void populateAttributes (Element element, Attributes attrs)
+ throws SAXParseException
+ {
+ int length = attrs.getLength ();
+
+ for (int i = 0; i < length; i++) {
+ String type = attrs.getType (i);
+ String value = attrs.getValue (i);
+ String name = attrs.getQName (i);
+ String local = attrs.getLocalName (i);
+ String uri = attrs.getURI (i);
+
+ // parser discarded basic information, DOM tree isn't writable
+ if (name.length () == 0)
+ name = local;
+
+ // all attribute types other than these three may not
+ // contain scoped names... enumerated attributes get
+ // reported as NMTOKEN, except for NOTATION values
+ if (!("CDATA".equals (type)
+ || "NMTOKEN".equals (type)
+ || "NMTOKENS".equals (type))) {
+ if (value.indexOf (':') != -1) {
+ namespaceError (
+ "namespace nonconformant attribute value: "
+ + "<" + element.getNodeName ()
+ + " " + name + "='" + value + "' ...>");
+ }
+ }
+
+ // xmlns="" is legal (undoes default NS)
+ // xmlns:foo="" is illegal
+ String prefix = getPrefix (name);
+ String namespace;
+
+ if ("xmlns".equals (prefix)) {
+ if ("".equals (value))
+ namespaceError ("illegal null namespace decl, " + name);
+ namespace = xmlnsURI;
+ } else if ("xmlns".equals (name))
+ namespace = xmlnsURI;
+
+ else if (prefix == null)
+ namespace = null;
+ else if (!"".equals(uri) && uri.length () != 0)
+ namespace = uri;
+ else
+ namespace = getNamespace (prefix, attrs);
+
+ if (namespace == null)
+ element.setAttribute (name, value);
+ else
+ element.setAttributeNS (namespace, name, value);
+ }
+ }
+
+ private String getPrefix (String name)
+ {
+ int temp;
+
+ if ((temp = name.indexOf (':')) > 0)
+ return name.substring (0, temp);
+ return null;
+ }
+
+ // used with SAX1-level parser output
+ private String getNamespace (String prefix, Attributes attrs)
+ throws SAXParseException
+ {
+ String namespace;
+ String decl;
+
+ // defaulting
+ if (prefix == null) {
+ decl = "xmlns";
+ namespace = attrs.getValue (decl);
+ if ("".equals (namespace))
+ return null;
+ else if (namespace != null)
+ return namespace;
+
+ // "xmlns" is like a keyword
+ // ... according to the Namespace REC, but DOM L2 CR2+
+ // and Infoset violate that by assigning a namespace.
+ // that conflict is resolved elsewhere.
+ } else if ("xmlns".equals (prefix))
+ return null;
+
+ // "xml" prefix is fixed
+ else if ("xml".equals (prefix))
+ return "http://www.w3.org/XML/1998/namespace";
+
+ // otherwise, expect a declaration
+ else {
+ decl = "xmlns:" + prefix;
+ namespace = attrs.getValue (decl);
+ }
+
+ // if we found a local declaration, great
+ if (namespace != null)
+ return namespace;
+
+
+ // ELSE ... search up the tree we've been building
+ for (Node n = top;
+ n != null && n.getNodeType () != Node.DOCUMENT_NODE;
+ n = n.getParentNode ()) {
+ if (n.getNodeType () == Node.ENTITY_REFERENCE_NODE)
+ continue;
+ Element e = (Element) n;
+ Attr attr = e.getAttributeNode (decl);
+ if (attr != null)
+ return attr.getNodeValue ();
+ }
+ // see above re "xmlns" as keyword
+ if ("xmlns".equals (decl))
+ return null;
+
+ namespaceError ("Undeclared namespace prefix: " + prefix);
+ return null;
+ }
+
+ // SAX2
+ public void endElement (String uri, String localName, String qName)
+ throws SAXException
+ {
+ // we can't create populated entity ref nodes using
+ // only public DOM APIs (they've got to be readonly)
+ if (currentEntity != null)
+ return;
+
+ top = top.getParentNode ();
+ }
+
+ // SAX1 (mandatory reporting if validating)
+ public void ignorableWhitespace (char ch [], int start, int length)
+ throws SAXException
+ {
+ if (consumer.isHidingWhitespace ())
+ return;
+ characters (ch, start, length);
+ }
+
+ // SAX2 lexical event
+ public void startCDATA ()
+ throws SAXException
+ {
+ inCDATA = true;
+ // true except for the first fragment of a cdata section
+ mergeCDATA = false;
+ }
+
+ // SAX2 lexical event
+ public void endCDATA ()
+ throws SAXException
+ {
+ inCDATA = false;
+ }
+
+ // SAX2 lexical event
+ //
+ // this SAX2 callback merges two unrelated things:
+ // - Declaration of the root element type ... belongs with
+ // the other DTD declaration methods, NOT HERE.
+ // - IDs for the optional external subset ... belongs here
+ // with other lexical information.
+ //
+ // ...and it doesn't include the internal DTD subset, desired
+ // both to support DOM L2 and to enable "pass through" processing
+ //
+ public void startDTD (String name, String publicId, String SystemId)
+ throws SAXException
+ {
+ // need to filter out comments and PIs within the DTD
+ inDTD = true;
+ }
+
+ // SAX2 lexical event
+ public void endDTD ()
+ throws SAXException
+ {
+ inDTD = false;
+ }
+
+ // SAX2 lexical event
+ public void comment (char ch [], int start, int length)
+ throws SAXException
+ {
+ Node comment;
+
+ // we can't create populated entity ref nodes using
+ // only public DOM APIs (they've got to be readonly)
+ if (consumer.isHidingComments ()
+ || inDTD
+ || currentEntity != null)
+ return;
+ comment = document.createComment (new String (ch, start, length));
+ top.appendChild (comment);
+ }
+
+ /**
+ * May be overridden by subclasses to return true, indicating
+ * that entity reference nodes can be populated and then made
+ * read-only.
+ */
+ public boolean canPopulateEntityRefs ()
+ { return false; }
+
+ // SAX2 lexical event
+ public void startEntity (String name)
+ throws SAXException
+ {
+ // are we ignoring what would be contents of an
+ // entity ref, since we can't populate it?
+ if (currentEntity != null)
+ return;
+
+ // Are we hiding all entity boundaries?
+ if (consumer.isHidingReferences ())
+ return;
+
+ // SAX2 shows parameter entities; DOM hides them
+ if (name.charAt (0) == '%' || "[dtd]".equals (name))
+ return;
+
+ // Since we can't create a populated entity ref node in any
+ // standard way, we create an unpopulated one.
+ EntityReference ref = document.createEntityReference (name);
+ top.appendChild (ref);
+ top = ref;
+
+ // ... allowing subclasses to populate them
+ if (!canPopulateEntityRefs ())
+ currentEntity = name;
+ }
+
+ // SAX2 lexical event
+ public void endEntity (String name)
+ throws SAXException
+ {
+ if (name.charAt (0) == '%' || "[dtd]".equals (name))
+ return;
+ if (name.equals (currentEntity))
+ currentEntity = null;
+ if (!consumer.isHidingReferences ())
+ top = top.getParentNode ();
+ }
+
+
+ // SAX1 DTD event
+ public void notationDecl (
+ String name,
+ String publicId, String SystemId
+ ) throws SAXException
+ {
+ /* IGNORE -- no public DOM API lets us store these
+ * into the doctype node
+ */
+ }
+
+ // SAX1 DTD event
+ public void unparsedEntityDecl (
+ String name,
+ String publicId, String SystemId,
+ String notationName
+ ) throws SAXException
+ {
+ /* IGNORE -- no public DOM API lets us store these
+ * into the doctype node
+ */
+ }
+
+ // SAX2 declaration event
+ public void elementDecl (String name, String model)
+ throws SAXException
+ {
+ /* IGNORE -- no content model support in DOM L2 */
+ }
+
+ // SAX2 declaration event
+ public void attributeDecl (
+ String eName,
+ String aName,
+ String type,
+ String mode,
+ String value
+ ) throws SAXException
+ {
+ /* IGNORE -- no attribute model support in DOM L2 */
+ }
+
+ // SAX2 declaration event
+ public void internalEntityDecl (String name, String value)
+ throws SAXException
+ {
+ /* IGNORE -- no public DOM API lets us store these
+ * into the doctype node
+ */
+ }
+
+ // SAX2 declaration event
+ public void externalEntityDecl (
+ String name,
+ String publicId,
+ String SystemId
+ ) throws SAXException
+ {
+ /* IGNORE -- no public DOM API lets us store these
+ * into the doctype node
+ */
+ }
+
+ //
+ // These really should offer the option of nonfatal handling,
+ // like other validity errors, though that would cause major
+ // chaos in the DOM data structures. DOM is already spec'd
+ // to treat many of these as fatal, so this is consistent.
+ //
+ private void namespaceError (String description)
+ throws SAXParseException
+ {
+ SAXParseException err;
+
+ err = new SAXParseException (description, locator);
+ throw err;
+ }
+ }
+}
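
Review note on the `getPrefix` helper in DomConsumer.java above: it returns a prefix only when the colon sits at an index greater than zero, so an unprefixed name and a name with a leading colon both yield null. The following is a minimal standalone sketch of that behavior, not code from this commit; the class name `PrefixSketch` and the sample names are assumptions made for illustration.

```java
public class PrefixSketch {
    // Mirrors the helper in the diff: no colon, or a colon at index 0,
    // yields null; otherwise the text before the first colon is returned.
    static String getPrefix(String name) {
        int colon = name.indexOf(':');
        return colon > 0 ? name.substring(0, colon) : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample names, chosen only to exercise each branch.
        System.out.println(getPrefix("xmlns:foo"));
        System.out.println(getPrefix("xmlns"));
        System.out.println(getPrefix(":odd"));
    }
}
```

Because a bare `xmlns` has no prefix under this rule, the attribute loop above needs its separate `"xmlns".equals (name)` branch to place default namespace declarations in the xmlns namespace.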
diff --git a/libjava/classpath/gnu/xml/pipeline/EventConsumer.java b/libjava/classpath/gnu/xml/pipeline/EventConsumer.java
new file mode 100644
index 000000000..a0a8824f7
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/EventConsumer.java
@@ -0,0 +1,95 @@
+/* EventConsumer.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import org.xml.sax.*;
+
+
+/**
+ * Collects the event consumption apparatus of a SAX pipeline stage.
+ * Consumers which permit some handlers or other characteristics to be
+ * configured will provide methods to support that configuration.
+ *
+ * <p> Two important categories of consumers include <em>filters</em>, which
+ * process events and pass them on to other consumers, and <em>terminus</em>
+ * (or <em>terminal</em>) stages, which don't pass events on. Filters are not
+ * necessarily derived from the {@link EventFilter} class, although that
+ * class can substantially simplify their construction by automating the
+ * most common activities.
+ *
+ * <p> Event consumers which follow certain conventions for the signatures
+ * of their constructors can be automatically assembled into pipelines
+ * by the {@link PipelineFactory} class.
+ *
+ * @author David Brownell
+ */
+public interface EventConsumer
+{
+ /** Most stages process these core SAX callbacks. */
+ public ContentHandler getContentHandler ();
+
+ /** Few stages will use unparsed entities. */
+ public DTDHandler getDTDHandler ();
+
+ /**
+ * This method works like the SAX2 XMLReader method of the same name,
+ * and is used to retrieve the optional lexical and declaration handlers
+ * in a pipeline.
+ *
+ * @param id This is a URI identifying the type of property desired.
+ * @return The value of that property, if it is defined.
+ *
+ * @exception SAXNotRecognizedException Thrown if the particular
+ * pipeline stage does not understand the specified identifier.
+ */
+ public Object getProperty (String id)
+ throws SAXNotRecognizedException;
+
+ /**
+ * This method provides a filter stage with a handler that abstracts
+ * presentation of warnings and both recoverable and fatal errors.
+ * Most pipeline stages should share a single policy and mechanism
+ * for such reports, since application components require consistency
+ * in such activities. Accordingly, typical responses to this method
+ * invocation involve saving the handler for use; filters will pass
+ * it on to any other consumers they use.
+ *
+ * @param handler encapsulates error handling policy for this stage
+ */
+ public void setErrorHandler (ErrorHandler handler);
+}
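
Review note on the EventConsumer interface above: a terminus stage can be quite small. The sketch below is an illustration under stated assumptions, not part of this commit. Since `gnu.xml.pipeline` is not on a stock JDK classpath, it declares a local stand-in interface (`Consumer`) with the same four method signatures; the names `CountingTerminus`, `Consumer`, `ElementCounter`, and `countElements` are all hypothetical, and the parser is whatever the JDK's `SAXParserFactory` provides.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.DTDHandler;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class CountingTerminus {
    // Local stand-in (assumption) with the same method signatures as the
    // EventConsumer interface shown in the diff.
    interface Consumer {
        ContentHandler getContentHandler();
        DTDHandler getDTDHandler();
        Object getProperty(String id) throws SAXNotRecognizedException;
        void setErrorHandler(ErrorHandler handler);
    }

    // A terminus: it consumes events and passes nothing downstream.
    static final class ElementCounter extends DefaultHandler implements Consumer {
        private int elements;
        private ErrorHandler errors;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            elements++;
        }

        public ContentHandler getContentHandler() { return this; }

        // This stage has no use for unparsed entities or notations.
        public DTDHandler getDTDHandler() { return null; }

        // No lexical or declaration handlers are offered by this stage.
        public Object getProperty(String id) throws SAXNotRecognizedException {
            throw new SAXNotRecognizedException(id);
        }

        // Save the shared handler; a terminus has no next stage to pass it to.
        public void setErrorHandler(ErrorHandler handler) { errors = handler; }

        int count() { return elements; }
    }

    static int countElements(String xml) throws Exception {
        ElementCounter terminus = new ElementCounter();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(terminus.getContentHandler());
        reader.parse(new InputSource(new StringReader(xml)));
        return terminus.count();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><c/></a>")); // prints 3
    }
}
```

A filter would differ from this terminus mainly in forwarding each callback to a next consumer and in passing the error handler along to it.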
diff --git a/libjava/classpath/gnu/xml/pipeline/EventFilter.java b/libjava/classpath/gnu/xml/pipeline/EventFilter.java
new file mode 100644
index 000000000..b3cc2d654
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/EventFilter.java
@@ -0,0 +1,796 @@
+/* EventFilter.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+
+import org.xml.sax.*;
+import org.xml.sax.ext.*;
+import org.xml.sax.helpers.XMLFilterImpl;
+
+/**
+ * A customizable event consumer, used to assemble various kinds of filters
+ * using SAX handlers and an optional second consumer. It can be constructed
+ * in two ways: <ul>
+ *
+ * <li> To serve as a passthrough, sending all events to a second consumer.
+ * The second consumer may be identified through {@link #getNext}.
+ *
+ * <li> To serve as a dead end, with all handlers null;
+ * {@link #getNext} returns null.
+ *
+ * </ul>
+ *
+ * <p> Additionally, SAX handlers may be assigned, which completely replace
+ * the "upstream" view (through {@link EventConsumer}) of handlers, initially
+ * null or the "next" consumer provided to the constructor. To make
+ * it easier to build specialized filter classes, this class implements
+ * all the standard SAX consumer handlers, and those implementations
+ * delegate "downstream" to the consumer accessed by {@link #getNext}.
+ *
+ * <p> The simplest way to create a custom filter class is to create a
+ * subclass which overrides one or more handler interface methods. The
+ * constructor for that subclass then registers itself as a handler for
+ * those interfaces using a call such as <em>setContentHandler(this)</em>,
+ * so the "upstream" view of event delivery is modified from the state
+ * established in the base class constructor. That way,
+ * the overridden methods intercept those event callbacks
+ * as they go "downstream", and
+ * all other event callbacks will pass events to any next consumer.
+ * Overridden methods may invoke superclass methods (perhaps after modifying
+ * parameters) if they wish to delegate such calls. Such subclasses
+ * should use {@link #getErrorHandler} to report errors using the
+ * common error reporting mechanism.
+ *
+ * <p> Another important technique is to construct a filter consisting
+ * of only a few specific types of handler. For example, one could easily
+ * prune out lexical events or various declarations by providing handlers
+ * which don't pass those events downstream, or by providing null handlers.
+ *
+ * <hr />
+ *
+ * <p> This may be viewed as the consumer oriented analogue of the SAX2
+ * {@link org.xml.sax.helpers.XMLFilterImpl XMLFilterImpl} class.
+ * Key differences include: <ul>
+ *
+ * <li> This fully separates consumer and producer roles: it
+ * does not implement the producer side <em>XMLReader</em> or
+ * <em>EntityResolver</em> interfaces, so it can only be used
+ * in "push" mode (it has no <em>parse()</em> methods).
+ *
+ * <li> "Extension" handlers are fully supported, enabling a
+ * richer set of application requirements.
+ * And it implements {@link EventConsumer}, which groups related
+ * consumer methods together, rather than leaving them separated.
+ *
+ * <li> The chaining which is visible is "downstream" to the next
+ * consumer, not "upstream" to the preceding producer.
+ * It supports "fan-in", where
+ * a consumer can be fed by several producers. (For "fan-out",
+ * see the {@link TeeConsumer} class.)
+ *
+ * <li> Event chaining is set up differently. It is intended to
+ * work "upstream" from terminus towards producer, during filter
+ * construction, as described above.
+ * This is part of an early binding model:
+ * events don't need to pass through stages which ignore them.
+ *
+ * <li> ErrorHandler support is separated, on the grounds that
+ * pipeline stages need to share the same error handling policy.
+ * For the same reason, error handler setup goes "downstream":
+ * when error handlers get set, they are passed to subsequent
+ * consumers.
+ *
+ * </ul>
+ *
+ * <p> The {@link #chainTo chainTo()} convenience routine supports chaining to
+ * an XMLFilterImpl, in its role as a limited functionality event
+ * consumer. Its event producer role ({@link XMLFilter}) is ignored.
+ *
+ * <hr />
+ *
+ * <p> The {@link #bind bind()} routine may be used to associate event pipelines
+ * with any kind of {@link XMLReader} that will produce the events.
+ * Such pipelines don't necessarily need to have any members which are
+ * implemented using this class. That routine has some intelligence
+ * which supports automatic changes to parser feature flags, letting
+ * event pipelines become largely independent of the particular feature
+ * sets of parsers.
+ *
+ * @author David Brownell
+ */
+public class EventFilter
+ implements EventConsumer, ContentHandler, DTDHandler,
+ LexicalHandler, DeclHandler
+{
+ // SAX handlers
+ private ContentHandler docHandler, docNext;
+ private DTDHandler dtdHandler, dtdNext;
+ private LexicalHandler lexHandler, lexNext;
+ private DeclHandler declHandler, declNext;
+ // and ideally, one more for the stuff SAX2 doesn't show
+
+ private Locator locator;
+ private EventConsumer next;
+ private ErrorHandler errHandler;
+
+
+ /** SAX2 URI prefix for standard feature flags. */
+ public static final String FEATURE_URI
+ = "http://xml.org/sax/features/";
+ /** SAX2 URI prefix for standard properties (mostly for handlers). */
+ public static final String PROPERTY_URI
+ = "http://xml.org/sax/properties/";
+
+ /** SAX2 property identifier for {@link DeclHandler} events */
+ public static final String DECL_HANDLER
+ = PROPERTY_URI + "declaration-handler";
+ /** SAX2 property identifier for {@link LexicalHandler} events */
+ public static final String LEXICAL_HANDLER
+ = PROPERTY_URI + "lexical-handler";
+
+ //
+ // These class objects will be null if the relevant class isn't linked.
+ // Small configurations (pJava and some kinds of embedded systems) need
+ // to facilitate smaller executables. So "instanceof" is undesirable
+ // when bind() sees if it can remove some stages.
+ //
+ // SECURITY NOTE: assuming all these classes are part of the same sealed
+ // package, there's no problem saving these in the instance of this class
+ // that's associated with "this" class loader. But that wouldn't be true
+ // for classes in another package.
+ //
+ private static boolean loaded;
+ private static Class nsClass;
+ private static Class validClass;
+ private static Class wfClass;
+ private static Class xincClass;
+
+ static ClassLoader getClassLoader ()
+ {
+ Method m = null;
+
+ try {
+ m = Thread.class.getMethod("getContextClassLoader");
+ } catch (NoSuchMethodException e) {
+ // Assume that we are running JDK 1.1, use the current ClassLoader
+ return EventFilter.class.getClassLoader();
+ }
+
+ try {
+ return (ClassLoader) m.invoke(Thread.currentThread());
+ } catch (IllegalAccessException e) {
+ // assert(false)
+ throw new UnknownError(e.getMessage());
+ } catch (InvocationTargetException e) {
+ // assert(e.getTargetException() instanceof SecurityException)
+ throw new UnknownError(e.getMessage());
+ }
+ }
+
+ static Class loadClass (ClassLoader classLoader, String className)
+ {
+ try {
+ if (classLoader == null)
+ return Class.forName(className);
+ else
+ return classLoader.loadClass(className);
+ } catch (Exception e) {
+ return null;
+ }
+ }
+
+ static private void loadClasses ()
+ {
+ ClassLoader loader = getClassLoader ();
+
+ nsClass = loadClass (loader, "gnu.xml.pipeline.NSFilter");
+ validClass = loadClass (loader, "gnu.xml.pipeline.ValidationConsumer");
+ wfClass = loadClass (loader, "gnu.xml.pipeline.WellFormednessFilter");
+ xincClass = loadClass (loader, "gnu.xml.pipeline.XIncludeFilter");
+ loaded = true;
+ }
+
+
+ /**
+ * Binds the standard SAX2 handlers from the specified consumer
+ * pipeline to the specified producer. These handlers include the core
+ * {@link ContentHandler} and {@link DTDHandler}, plus the extension
+ * {@link DeclHandler} and {@link LexicalHandler}. Any additional
+ * application-specific handlers need to be bound separately.
+ * The {@link ErrorHandler} is handled differently: the producer's
+ * error handler is passed through to the consumer pipeline.
+ * The producer is told to include namespace prefix information if it
+ * can, since many pipeline stages need that Infoset information to
+ * work well.
+ *
+ * <p> At the head of the pipeline, certain standard event filters are
+ * recognized and handled specially. This facilitates construction
+ * of processing pipelines that work regardless of the capabilities
+ * of the XMLReader implementation in use; for example, it permits
+ * validating output of a {@link gnu.xml.util.DomParser}. <ul>
+ *
+ * <li> {@link NSFilter} will be removed if the producer can be
+ * told not to discard namespace data, using the "namespace-prefixes"
+ * feature flag.
+ *
+ * <li> {@link ValidationConsumer} will be removed if the producer
+ * can be told to validate, using the "validation" feature flag.
+ *
+ * <li> {@link WellFormednessFilter} is always removed, on the
+ * grounds that no XMLReader is permitted to produce malformed
+ * event streams and this would just be processing overhead.
+ *
+ * <li> {@link XIncludeFilter} stops the special handling, except
+ * that it's told about the "namespace-prefixes" feature of the
+ * event producer so that the event stream is internally consistent.
+ *
+ * <li> The first consumer which is not one of those classes stops
+ * such special handling. This means that if you want to force
+ * one of those filters to be used, you could just precede it with
+ * an instance of {@link EventFilter} configured as a pass-through.
+ * You might need to do that if you are using an {@link NSFilter}
+ * subclass to fix names found in attributes or character data.
+ *
+ * </ul>
+ *
+ * <p> Other than that, this method works with any kind of event consumer,
+ * not just event filters. Note that in all cases, the standard handlers
+ * are assigned; any previous handler assignments for the handler will
+ * be overridden.
+ *
+ * @param producer will deliver events to the specified consumer
+ * @param consumer pipeline supplying event handlers to be associated
+ * with the producer (may not be null)
+ */
+ public static void bind (XMLReader producer, EventConsumer consumer)
+ {
+ Class klass = null;
+ boolean prefixes;
+
+ if (!loaded)
+ loadClasses ();
+
+ // DOM building, printing, layered validation, and other
+ // things don't work well when prefix info is discarded.
+ // Include it by default, whenever possible.
+ try {
+ producer.setFeature (FEATURE_URI + "namespace-prefixes",
+ true);
+ prefixes = true;
+ } catch (SAXException e) {
+ prefixes = false;
+ }
+
+ // NOTE: This loop doesn't use "instanceof", since that
+ // would prevent compiling/linking without those classes
+ // being present.
+ while (consumer != null) {
+ klass = consumer.getClass ();
+
+ // we might have already changed this problematic SAX2 default.
+ if (nsClass != null && nsClass.isAssignableFrom (klass)) {
+ if (!prefixes)
+ break;
+ consumer = ((EventFilter)consumer).getNext ();
+
+ // the parser _might_ do DTD validation by default ...
+ // if not, maybe we can change this setting.
+ } else if (validClass != null
+ && validClass.isAssignableFrom (klass)) {
+ try {
+ producer.setFeature (FEATURE_URI + "validation",
+ true);
+ consumer = ((ValidationConsumer)consumer).getNext ();
+ } catch (SAXException e) {
+ break;
+ }
+
+ // parsers are required not to have such bugs
+ } else if (wfClass != null && wfClass.isAssignableFrom (klass)) {
+ consumer = ((WellFormednessFilter)consumer).getNext ();
+
+ // stop on the first pipeline stage we can't remove
+ } else
+ break;
+
+ if (consumer == null)
+ klass = null;
+ }
+
+ // the actual setting here doesn't matter as much
+ // as that producer and consumer agree
+ if (xincClass != null && klass != null
+ && xincClass.isAssignableFrom (klass))
+ ((XIncludeFilter)consumer).setSavingPrefixes (prefixes);
+
+ // Some SAX parsers can't handle null handlers -- bleech
+ DefaultHandler2 h = new DefaultHandler2 ();
+
+ if (consumer != null && consumer.getContentHandler () != null)
+ producer.setContentHandler (consumer.getContentHandler ());
+ else
+ producer.setContentHandler (h);
+ if (consumer != null && consumer.getDTDHandler () != null)
+ producer.setDTDHandler (consumer.getDTDHandler ());
+ else
+ producer.setDTDHandler (h);
+
+ try {
+ Object dh;
+
+ if (consumer != null)
+ dh = consumer.getProperty (DECL_HANDLER);
+ else
+ dh = null;
+ if (dh == null)
+ dh = h;
+ producer.setProperty (DECL_HANDLER, dh);
+ } catch (Exception e) { /* ignore */ }
+ try {
+ Object lh;
+
+ if (consumer != null)
+ lh = consumer.getProperty (LEXICAL_HANDLER);
+ else
+ lh = null;
+ if (lh == null)
+ lh = h;
+ producer.setProperty (LEXICAL_HANDLER, lh);
+ } catch (Exception e) { /* ignore */ }
+
+ // this binding goes the other way around
+ if (producer.getErrorHandler () == null)
+ producer.setErrorHandler (h);
+ if (consumer != null)
+ consumer.setErrorHandler (producer.getErrorHandler ());
+ }
+
+ /**
+ * Initializes all handlers to null.
+ */
+ // constructor used by PipelineFactory
+ public EventFilter () { }
+
+
+ /**
+ * Handlers that are not otherwise set will default to those from
+ * the specified consumer, making it easy to pass events through.
+ * If the consumer is null, all handlers are initialized to null.
+ */
+ // constructor used by PipelineFactory
+ public EventFilter (EventConsumer consumer)
+ {
+ if (consumer == null)
+ return;
+
+ next = consumer;
+
+ // We delegate through the "xxNext" handlers, and
+ // report the "xxHandler" ones on our input side.
+
+ // Normally a subclass would both override handler
+ // methods and register itself as the "xxHandler".
+
+ docHandler = docNext = consumer.getContentHandler ();
+ dtdHandler = dtdNext = consumer.getDTDHandler ();
+ try {
+ declHandler = declNext = (DeclHandler)
+ consumer.getProperty (DECL_HANDLER);
+ } catch (SAXException e) { /* leave value null */ }
+ try {
+ lexHandler = lexNext = (LexicalHandler)
+ consumer.getProperty (LEXICAL_HANDLER);
+ } catch (SAXException e) { /* leave value null */ }
+ }
+
+ /**
+ * Treats the XMLFilterImpl as a limited functionality event consumer,
+ * by arranging to deliver events to it; this lets such classes be
+ * "wrapped" as pipeline stages.
+ *
+ * <p> <em>Upstream Event Setup:</em>
+ * If no handlers have been assigned to this EventFilter, then the
+ * handlers from specified XMLFilterImpl are returned from this
+ * {@link EventConsumer}: the XMLFilterImpl is just "wrapped".
+ * Otherwise the specified handlers will be returned.
+ *
+ * <p> <em>Downstream Event Setup:</em>
+ * Subclasses may chain event delivery to the specified XMLFilterImpl
+ * by invoking the appropriate superclass methods,
+ * as if their constructor passed a "next" EventConsumer to the
+ * constructor for this class.
+ * If this EventFilter has an ErrorHandler, it is assigned as
+ * the error handler for the XMLFilterImpl, just as would be
+ * done for a next stage implementing {@link EventConsumer}.
+ *
+ * @param next the next downstream component of the pipeline.
+ * @exception IllegalStateException if the "next" consumer has
+ * already been set through the constructor.
+ */
+ public void chainTo (XMLFilterImpl next)
+ {
+ if (this.next != null)
+ throw new IllegalStateException ();
+
+ docNext = next.getContentHandler ();
+ if (docHandler == null)
+ docHandler = docNext;
+ dtdNext = next.getDTDHandler ();
+ if (dtdHandler == null)
+ dtdHandler = dtdNext;
+
+ try {
+ declNext = (DeclHandler) next.getProperty (DECL_HANDLER);
+ if (declHandler == null)
+ declHandler = declNext;
+ } catch (SAXException e) { /* leave value null */ }
+ try {
+ lexNext = (LexicalHandler) next.getProperty (LEXICAL_HANDLER);
+ if (lexHandler == null)
+ lexHandler = lexNext;
+ } catch (SAXException e) { /* leave value null */ }
+
+ if (errHandler != null)
+ next.setErrorHandler (errHandler);
+ }
+
+ /**
+ * Records the error handler that should be used by this stage, and
+ * passes it "downstream" to any subsequent stage.
+ */
+ final public void setErrorHandler (ErrorHandler handler)
+ {
+ errHandler = handler;
+ if (next != null)
+ next.setErrorHandler (handler);
+ }
+
+ /**
+ * Returns the error handler assigned to this filter stage, or null
+ * if no such assignment has been made.
+ */
+ final public ErrorHandler getErrorHandler ()
+ {
+ return errHandler;
+ }
+
+
+ /**
+ * Returns the next event consumer in sequence; or null if there
+ * is no such handler.
+ */
+ final public EventConsumer getNext ()
+ { return next; }
+
+
+ /**
+ * Assigns the content handler to use; a null handler indicates
+ * that these events will not be forwarded.
+ * This overrides the previous setting for this handler, which was
+ * probably pointed to the next consumer by the base class constructor.
+ */
+ final public void setContentHandler (ContentHandler h)
+ {
+ docHandler = h;
+ }
+
+ /** Returns the content handler being used. */
+ final public ContentHandler getContentHandler ()
+ {
+ return docHandler;
+ }
+
+ /**
+ * Assigns the DTD handler to use; a null handler indicates
+ * that these events will not be forwarded.
+ * This overrides the previous setting for this handler, which was
+ * probably pointed to the next consumer by the base class constructor.
+ */
+ final public void setDTDHandler (DTDHandler h)
+ { dtdHandler = h; }
+
+ /** Returns the DTD handler being used. */
+ final public DTDHandler getDTDHandler ()
+ {
+ return dtdHandler;
+ }
+
+ /**
+ * Stores the property, normally a handler; a null handler indicates
+ * that these events will not be forwarded.
+ * This overrides the previous handler setting, which was probably
+ * pointed to the next consumer by the base class constructor.
+ */
+ final public void setProperty (String id, Object o)
+ throws SAXNotRecognizedException, SAXNotSupportedException
+ {
+ try {
+ Object value = getProperty (id);
+
+ if (value == o)
+ return;
+ if (DECL_HANDLER.equals (id)) {
+ declHandler = (DeclHandler) o;
+ return;
+ }
+ if (LEXICAL_HANDLER.equals (id)) {
+ lexHandler = (LexicalHandler) o;
+ return;
+ }
+ throw new SAXNotSupportedException (id);
+
+ } catch (ClassCastException e) {
+ throw new SAXNotSupportedException (id);
+ }
+ }
+
+ /** Retrieves a property of unknown intent (usually a handler) */
+ final public Object getProperty (String id)
+ throws SAXNotRecognizedException
+ {
+ if (DECL_HANDLER.equals (id))
+ return declHandler;
+ if (LEXICAL_HANDLER.equals (id))
+ return lexHandler;
+
+ throw new SAXNotRecognizedException (id);
+ }
+
+ /**
+ * Returns any locator provided to the next consumer, if this class
+ * (or a subclass) is handling {@link ContentHandler } events.
+ */
+ public Locator getDocumentLocator ()
+ { return locator; }
+
+
+ // CONTENT HANDLER DELEGATIONS
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void setDocumentLocator (Locator locator)
+ {
+ this.locator = locator;
+ if (docNext != null)
+ docNext.setDocumentLocator (locator);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void startDocument () throws SAXException
+ {
+ if (docNext != null)
+ docNext.startDocument ();
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void skippedEntity (String name) throws SAXException
+ {
+ if (docNext != null)
+ docNext.skippedEntity (name);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void processingInstruction (String target, String data)
+ throws SAXException
+ {
+ if (docNext != null)
+ docNext.processingInstruction (target, data);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void characters (char ch [], int start, int length)
+ throws SAXException
+ {
+ if (docNext != null)
+ docNext.characters (ch, start, length);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void ignorableWhitespace (char ch [], int start, int length)
+ throws SAXException
+ {
+ if (docNext != null)
+ docNext.ignorableWhitespace (ch, start, length);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void startPrefixMapping (String prefix, String uri)
+ throws SAXException
+ {
+ if (docNext != null)
+ docNext.startPrefixMapping (prefix, uri);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void startElement (
+ String uri, String localName,
+ String qName, Attributes atts
+ ) throws SAXException
+ {
+ if (docNext != null)
+ docNext.startElement (uri, localName, qName, atts);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void endElement (String uri, String localName, String qName)
+ throws SAXException
+ {
+ if (docNext != null)
+ docNext.endElement (uri, localName, qName);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void endPrefixMapping (String prefix) throws SAXException
+ {
+ if (docNext != null)
+ docNext.endPrefixMapping (prefix);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void endDocument () throws SAXException
+ {
+ if (docNext != null)
+ docNext.endDocument ();
+ locator = null;
+ }
+
+
+ // DTD HANDLER DELEGATIONS
+
+ /** <b>SAX1:</b> passes this callback to the next consumer, if any */
+ public void unparsedEntityDecl (
+ String name,
+ String publicId,
+ String systemId,
+ String notationName
+ ) throws SAXException
+ {
+ if (dtdNext != null)
+ dtdNext.unparsedEntityDecl (name, publicId, systemId, notationName);
+ }
+
+ /** <b>SAX1:</b> passes this callback to the next consumer, if any */
+ public void notationDecl (String name, String publicId, String systemId)
+ throws SAXException
+ {
+ if (dtdNext != null)
+ dtdNext.notationDecl (name, publicId, systemId);
+ }
+
+
+ // LEXICAL HANDLER DELEGATIONS
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void startDTD (String name, String publicId, String systemId)
+ throws SAXException
+ {
+ if (lexNext != null)
+ lexNext.startDTD (name, publicId, systemId);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void endDTD ()
+ throws SAXException
+ {
+ if (lexNext != null)
+ lexNext.endDTD ();
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void comment (char ch [], int start, int length)
+ throws SAXException
+ {
+ if (lexNext != null)
+ lexNext.comment (ch, start, length);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void startCDATA ()
+ throws SAXException
+ {
+ if (lexNext != null)
+ lexNext.startCDATA ();
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void endCDATA ()
+ throws SAXException
+ {
+ if (lexNext != null)
+ lexNext.endCDATA ();
+ }
+
+ /**
+ * <b>SAX2:</b> passes this callback to the next consumer, if any.
+ */
+ public void startEntity (String name)
+ throws SAXException
+ {
+ if (lexNext != null)
+ lexNext.startEntity (name);
+ }
+
+ /**
+ * <b>SAX2:</b> passes this callback to the next consumer, if any.
+ */
+ public void endEntity (String name)
+ throws SAXException
+ {
+ if (lexNext != null)
+ lexNext.endEntity (name);
+ }
+
+
+ // DECLARATION HANDLER DELEGATIONS
+
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void elementDecl (String name, String model)
+ throws SAXException
+ {
+ if (declNext != null)
+ declNext.elementDecl (name, model);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void attributeDecl (String eName, String aName,
+ String type, String mode, String value)
+ throws SAXException
+ {
+ if (declNext != null)
+ declNext.attributeDecl (eName, aName, type, mode, value);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void externalEntityDecl (String name,
+ String publicId, String systemId)
+ throws SAXException
+ {
+ if (declNext != null)
+ declNext.externalEntityDecl (name, publicId, systemId);
+ }
+
+ /** <b>SAX2:</b> passes this callback to the next consumer, if any */
+ public void internalEntityDecl (String name, String value)
+ throws SAXException
+ {
+ if (declNext != null)
+ declNext.internalEntityDecl (name, value);
+ }
+}
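Reviewer's sketch (not part of the patch): the delegation methods above all follow one pattern, forwarding each callback to the next handler only when one is set. The stand-alone example below reproduces that pattern using only the SAX classes shipped with the JDK. The class names `ForwardingHandler`, `ElementCounter`, and `ForwardingDemo` are hypothetical and do not exist in `gnu.xml.pipeline`. Only three callbacks are forwarded, to keep the sketch short.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the null-guarded forwarding used by EventFilter.
class ForwardingHandler extends DefaultHandler {
    private final ContentHandler next;   // may be null: then events are dropped

    ForwardingHandler(ContentHandler next) { this.next = next; }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (next != null)
            next.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (next != null)
            next.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        if (next != null)
            next.characters(ch, start, length);
    }
}

// A filter that observes an event, then passes it along unmodified.
class ElementCounter extends ForwardingHandler {
    int count;

    ElementCounter(ContentHandler next) { super(next); }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        count++;
        super.startElement(uri, localName, qName, atts);
    }
}

class ForwardingDemo {
    // Parses the text through a two-stage chain; returns both stage counts.
    static int[] countThroughChain(String xml) {
        ElementCounter tail = new ElementCounter(null);
        ElementCounter head = new ElementCounter(tail);
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(head);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return new int[] { head.count, tail.count };
    }

    public static void main(String[] args) {
        int[] counts = countThroughChain("<a><b/><c/></a>");
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

Because the last stage is built with a null successor, the chain ends without needing a special terminus class; that is the same effect the `if (docNext != null)` guards give in the patch.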
diff --git a/libjava/classpath/gnu/xml/pipeline/LinkFilter.java b/libjava/classpath/gnu/xml/pipeline/LinkFilter.java
new file mode 100644
index 000000000..e11a5eca6
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/LinkFilter.java
@@ -0,0 +1,242 @@
+/* LinkFilter.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import java.io.IOException;
+import java.net.URL;
+import java.util.Enumeration;
+import java.util.Vector;
+
+import org.xml.sax.Attributes;
+import org.xml.sax.SAXException;
+
+
+/**
+ * Pipeline filter to remember XHTML links found in a document,
+ * so they can later be crawled. Fragments are not counted, and duplicates
+ * are ignored. Callers are responsible for filtering out URLs they aren't
+ * interested in. Events are passed through unmodified.
+ *
+ * <p> Input MUST include a setDocumentLocator() call, as it's used to
+ * resolve relative links in the absence of a "base" element. Input MUST
+ * also include namespace identifiers, since it is the XHTML namespace
+ * identifier which is used to identify the relevant elements.
+ *
+ * <p><em>FIXME:</em> handle xml:base attribute ... in association with
+ * a stack of base URIs. Similarly, recognize/support XLink data.
+ *
+ * @author David Brownell
+ */
+public class LinkFilter extends EventFilter
+{
+ // for storing URIs
+ private Vector vector = new Vector ();
+
+ // struct for "full" link record (tbd)
+ // these for troubleshooting original source:
+ // original uri
+ // uri as resolved (base, relative, etc)
+ // URI of originating doc
+ // line #
+ // original element + attrs (img src, desc, etc)
+
+ // XLink model of the link ... for inter-site pairups ?
+
+ private String baseURI;
+
+ private boolean siteRestricted = false;
+
+ //
+ // XXX leverage blacklist info (like robots.txt)
+ //
+ // XXX constructor w/param ... pipeline for sending link data
+ // probably XHTML --> XLink, providing info as sketched above
+ //
+
+
+ /**
+ * Constructs a new event filter, which collects links in private data
+ * structure for later enumeration.
+ */
+ // constructor used by PipelineFactory
+ public LinkFilter ()
+ {
+ super.setContentHandler (this);
+ }
+
+
+ /**
+ * Constructs a new event filter, which collects links in private data
+ * structure for later enumeration and passes all events, unmodified,
+ * to the next consumer.
+ */
+ // constructor used by PipelineFactory
+ public LinkFilter (EventConsumer next)
+ {
+ super (next);
+ super.setContentHandler (this);
+ }
+
+
+ /**
+ * Returns an enumeration of the links found since the filter
+ * was constructed, or since removeAllLinks() was called.
+ *
+ * @return enumeration of strings.
+ */
+ public Enumeration getLinks ()
+ {
+ return vector.elements ();
+ }
+
+ /**
+ * Removes records about all links reported to the event
+ * stream, as if the filter were newly created.
+ */
+ public void removeAllLinks ()
+ {
+ vector = new Vector ();
+ }
+
+
+ /**
+ * Collects URIs for (X)HTML content from elements which hold them.
+ */
+ public void startElement (
+ String uri,
+ String localName,
+ String qName,
+ Attributes atts
+ ) throws SAXException
+ {
+ String link;
+
+ // Recognize XHTML links.
+ if ("http://www.w3.org/1999/xhtml".equals (uri)) {
+
+ if ("a".equals (localName) || "base".equals (localName)
+ || "area".equals (localName))
+ link = atts.getValue ("href");
+ else if ("iframe".equals (localName) || "frame".equals (localName))
+ link = atts.getValue ("src");
+ else if ("blockquote".equals (localName) || "q".equals (localName)
+ || "ins".equals (localName) || "del".equals (localName))
+ link = atts.getValue ("cite");
+ else
+ link = null;
+ link = maybeAddLink (link);
+
+ // "base" modifies designated baseURI
+ if ("base".equals (localName) && link != null)
+ baseURI = link;
+
+ if ("iframe".equals (localName) || "img".equals (localName))
+ maybeAddLink (atts.getValue ("longdesc"));
+ }
+
+ super.startElement (uri, localName, qName, atts);
+ }
+
+ private String maybeAddLink (String link)
+ {
+ int index;
+
+ // ignore empty links and fragments inside docs
+ if (link == null)
+ return null;
+ if ((index = link.indexOf ("#")) >= 0)
+ link = link.substring (0, index);
+ if (link.equals (""))
+ return null;
+
+ try {
+ // get the real URI
+ URL base = new URL ((baseURI != null)
+ ? baseURI
+ : getDocumentLocator ().getSystemId ());
+ URL url = new URL (base, link);
+
+ link = url.toString ();
+
+ // ignore duplicates
+ if (vector.contains (link))
+ return link;
+
+ // other than what "base" does, stick to original site:
+ if (siteRestricted) {
+ // don't switch protocols
+ if (!base.getProtocol ().equals (url.getProtocol ()))
+ return link;
+ // don't switch servers
+ if (base.getHost () != null
+ && !base.getHost ().equals (url.getHost ()))
+ return link;
+ }
+
+ vector.addElement (link);
+
+ return link;
+
+ } catch (IOException e) {
+ // bad URLs we don't want
+ }
+ return null;
+ }
+
+ /**
+ * Reports an error if no Locator has been made available.
+ */
+ public void startDocument () throws SAXException
+ {
+ if (getDocumentLocator () == null)
+ throw new SAXException ("no Locator!");
+ super.startDocument ();
+ }
+
+ /**
+ * Forgets about any base URI information that may be recorded.
+ * Applications will often want to call removeAllLinks(), likely
+ * after examining the links which were reported.
+ */
+ public void endDocument ()
+ throws SAXException
+ {
+ baseURI = null;
+ super.endDocument ();
+ }
+}
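Reviewer's sketch (not part of the patch): the normalization done by `maybeAddLink` above (drop the fragment, resolve against a base, ignore duplicates) can be tried in isolation. The example below is an assumption-laden approximation: it uses `java.net.URI.resolve` rather than the `new URL (base, link)` call in the patch, it omits the site restriction, and the class name `LinkCollector` is hypothetical.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper; mirrors the fragment/resolve/dedupe steps only.
final class LinkCollector {
    // Insertion-ordered set, so duplicates are ignored automatically.
    private final Set<String> links = new LinkedHashSet<>();

    /** Returns the resolved link, or null when the link is ignored. */
    String maybeAdd(String base, String link) {
        if (link == null)
            return null;
        int hash = link.indexOf('#');
        if (hash >= 0)
            link = link.substring(0, hash);   // fragments are not counted
        if (link.isEmpty())
            return null;                      // pure same-document reference
        try {
            String resolved = URI.create(base).resolve(link).toString();
            links.add(resolved);
            return resolved;
        } catch (IllegalArgumentException e) {
            return null;                      // malformed URIs are skipped
        }
    }

    Set<String> links() { return links; }

    public static void main(String[] args) {
        LinkCollector c = new LinkCollector();
        String base = "http://example.org/docs/index.html";
        c.maybeAdd(base, "guide.html#intro");
        c.maybeAdd(base, "guide.html");
        c.maybeAdd(base, "#top");
        System.out.println(c.links());
    }
}
```

A set replaces the `Vector.contains` check from the patch; the observable behavior (each resolved URI recorded once, in first-seen order) is intended to be the same.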
diff --git a/libjava/classpath/gnu/xml/pipeline/NSFilter.java b/libjava/classpath/gnu/xml/pipeline/NSFilter.java
new file mode 100644
index 000000000..0fa4621d3
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/NSFilter.java
@@ -0,0 +1,341 @@
+/* NSFilter.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import java.util.Enumeration;
+import java.util.Stack;
+
+import org.xml.sax.Attributes;
+import org.xml.sax.ErrorHandler;
+import org.xml.sax.Locator;
+import org.xml.sax.SAXException;
+import org.xml.sax.SAXParseException;
+import org.xml.sax.helpers.AttributesImpl;
+import org.xml.sax.helpers.NamespaceSupport;
+
+/**
+ * This filter ensures that element and attribute names are properly prefixed,
+ * and that such prefixes are declared. Such data is critical for operations
+ * like writing XML text, and validating against DTDs: names or their prefixes
+ * may have been discarded, although they are essential to the exchange of
+ * information using XML. There are various common ways that such data
+ * gets discarded: <ul>
+ *
+ * <li> By default, SAX2 parsers must discard the "xmlns*"
+ * attributes, and may also choose not to report properly prefixed
+ * names for elements or attributes. (Some parsers may support
+ * changing the <em>namespace-prefixes</em> value from the default
+ * to <em>true</em>, effectively eliminating the need to use this
+ * filter on their output.)
+ *
+ * <li> When event streams are generated from a DOM tree, they may
+ * never have had prefixes or declarations for namespaces; or
+ * the existing prefixes or declarations may have been invalidated
+ * by structural modifications to that DOM tree.
+ *
+ * <li> Other software writing SAX event streams won't necessarily
+ * be worrying about prefix management, and so they will need to
+ * have a transparent solution for managing them.
+ *
+ * </ul>
+ *
+ * <p> This filter uses a heuristic to choose the prefix to assign to any
+ * particular name which wasn't already correctly prefixed. The associated
+ * namespace will be correct, and the prefix will be declared. Original
+ * structures facilitating text editing, such as conventions about use of
+ * mnemonic prefix names or the scoping of prefixes, can't always be
+ * reconstructed after they are discarded, as strongly encouraged by the
+ * current SAX2 defaults.
+ *
+ * <p> Note that this filter can't possibly know whether attribute values
+ * or document content involve prefixed names. If your application
+ * requires using prefixed names in such locations you'll need to add some
+ * appropriate logic (perhaps adding additional heuristics in a subclass).
+ *
+ * @author David Brownell
+ */
+public class NSFilter extends EventFilter
+{
+ private NamespaceSupport nsStack = new NamespaceSupport ();
+ private Stack elementStack = new Stack ();
+
+ private boolean pushedContext;
+ private String nsTemp [] = new String [3];
+ private AttributesImpl attributes = new AttributesImpl ();
+ private boolean usedDefault;
+
+ // gensymmed prefixes use this root name
+ private static final String prefixRoot = "prefix-";
+
+
+ /**
+ * Passes events through to the specified consumer, after first
+ * processing them.
+ *
+ * @param next the next event consumer to receive events.
+ */
+ // constructor used by PipelineFactory
+ public NSFilter (EventConsumer next)
+ {
+ super (next);
+
+ setContentHandler (this);
+ }
+
+ private void fatalError (String message)
+ throws SAXException
+ {
+ SAXParseException e;
+ ErrorHandler handler = getErrorHandler ();
+ Locator locator = getDocumentLocator ();
+
+ if (locator == null)
+ e = new SAXParseException (message, null, null, -1, -1);
+ else
+ e = new SAXParseException (message, locator);
+ if (handler != null)
+ handler.fatalError (e);
+ throw e;
+ }
+
+
+ public void startDocument () throws SAXException
+ {
+ elementStack.removeAllElements ();
+ nsStack.reset ();
+ pushedContext = false;
+ super.startDocument ();
+ }
+
+ /**
+ * This call is not passed to the next consumer in the chain.
+ * Prefix declarations and scopes are only exposed in the form
+ * of attributes; this callback just records a declaration that
+ * will be exposed as an attribute.
+ */
+ public void startPrefixMapping (String prefix, String uri)
+ throws SAXException
+ {
+ if (pushedContext == false) {
+ nsStack.pushContext ();
+ pushedContext = true;
+ }
+
+ // this check is awkward, but the paranoia prevents big trouble
+ for (Enumeration e = nsStack.getDeclaredPrefixes ();
+ e.hasMoreElements ();
+ /* NOP */ ) {
+ String declared = (String) e.nextElement ();
+
+ if (!declared.equals (prefix))
+ continue;
+ if (uri.equals (nsStack.getURI (prefix)))
+ return;
+ fatalError ("inconsistent binding for prefix '" + prefix
+ + "' ... " + uri + " (was " + nsStack.getURI (prefix) + ")");
+ }
+
+ if (!nsStack.declarePrefix (prefix, uri))
+ fatalError ("illegal prefix declared: " + prefix);
+ }
+
+ private String fixName (String ns, String l, String name, boolean isAttr)
+ throws SAXException
+ {
+ if ("".equals (name) || name == null) {
+ name = l;
+ if ("".equals (name) || name == null)
+ fatalError ("empty/null name");
+ }
+
+ // can we correctly process the name as-is?
+ // handles "element scope" attribute names here.
+ if (nsStack.processName (name, nsTemp, isAttr) != null
+ && nsTemp [0].equals (ns)
+ ) {
+ return nsTemp [2];
+ }
+
+ // nope, gotta modify the name or declare a default mapping
+ int temp;
+
+ // get rid of any current prefix
+ if ((temp = name.indexOf (':')) >= 0) {
+ name = name.substring (temp + 1);
+
+ // ... maybe that's enough (use/prefer default namespace) ...
+ if (!isAttr && nsStack.processName (name, nsTemp, false) != null
+ && nsTemp [0].equals (ns)
+ ) {
+ return nsTemp [2];
+ }
+ }
+
+ // must we define and use the default/undefined prefix?
+ if ("".equals (ns)) {
+ if (isAttr)
+ fatalError ("processName bug");
+ if (attributes.getIndex ("xmlns") != -1)
+ fatalError ("need to undefine default NS, but it's bound: "
+ + attributes.getValue ("xmlns"));
+
+ nsStack.declarePrefix ("", "");
+ attributes.addAttribute ("", "", "xmlns", "CDATA", "");
+ return name;
+ }
+
+ // is there at least one non-null prefix we can use?
+ for (Enumeration e = nsStack.getDeclaredPrefixes ();
+ e.hasMoreElements ();
+ /* NOP */) {
+ String prefix = (String) e.nextElement ();
+ String uri = nsStack.getURI (prefix);
+
+ if (uri == null || !uri.equals (ns))
+ continue;
+ return prefix + ":" + name;
+ }
+
+ // no such luck. create a prefix name, declare it, use it.
+ for (temp = 0; temp >= 0; temp++) {
+ String prefix = prefixRoot + temp;
+
+ if (nsStack.getURI (prefix) == null) {
+ nsStack.declarePrefix (prefix, ns);
+ attributes.addAttribute ("", "", "xmlns:" + prefix,
+ "CDATA", ns);
+ return prefix + ":" + name;
+ }
+ }
+ fatalError ("too many prefixes genned");
+ // NOTREACHED
+ return null;
+ }
+
+ public void startElement (
+ String uri, String localName,
+ String qName, Attributes atts
+ ) throws SAXException
+ {
+ if (!pushedContext)
+ nsStack.pushContext ();
+ pushedContext = false;
+
+ // make sure we have all NS declarations handy before we start
+ int length = atts.getLength ();
+
+ for (int i = 0; i < length; i++) {
+ String aName = atts.getQName (i);
+
+ if (!aName.startsWith ("xmlns"))
+ continue;
+
+ String prefix;
+
+ if ("xmlns".equals (aName))
+ prefix = "";
+ else if (aName.indexOf (':') == 5)
+ prefix = aName.substring (6);
+ else // "xmlnsfoo" etc.
+ continue;
+ startPrefixMapping (prefix, atts.getValue (i));
+ }
+
+ // put namespace decls at the start of our regenned attlist
+ attributes.clear ();
+ for (Enumeration e = nsStack.getDeclaredPrefixes ();
+ e.hasMoreElements ();
+ /* NOP */) {
+ String prefix = (String) e.nextElement ();
+
+ attributes.addAttribute ("", "",
+ ("".equals (prefix)
+ ? "xmlns"
+ : "xmlns:" + prefix),
+ "CDATA",
+ nsStack.getURI (prefix));
+ }
+
+ // name fixups: element, then attributes.
+ // fixName may declare a new prefix or, for the element,
+ // redeclare the default (if element name needs it).
+ qName = fixName (uri, localName, qName, false);
+
+ for (int i = 0; i < length; i++) {
+ String aName = atts.getQName (i);
+ String aNS = atts.getURI (i);
+ String aLocal = atts.getLocalName (i);
+ String aType = atts.getType (i);
+ String aValue = atts.getValue (i);
+
+ if (aName.startsWith ("xmlns"))
+ continue;
+ aName = fixName (aNS, aLocal, aName, true);
+ attributes.addAttribute (aNS, aLocal, aName, aType, aValue);
+ }
+
+ elementStack.push (qName);
+
+ // pass event along, with cleaned-up names and decls.
+ super.startElement (uri, localName, qName, attributes);
+ }
+
+ public void endElement (String uri, String localName, String qName)
+ throws SAXException
+ {
+ nsStack.popContext ();
+ qName = (String) elementStack.pop ();
+ super.endElement (uri, localName, qName);
+ }
+
+ /**
+ * This call is not passed to the next consumer in the chain.
+ * Prefix declarations and scopes are only exposed in their
+ * attribute form.
+ */
+ public void endPrefixMapping (String prefix)
+ throws SAXException
+ { }
+
+ public void endDocument () throws SAXException
+ {
+ elementStack.removeAllElements ();
+ nsStack.reset ();
+ super.endDocument ();
+ }
+}
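Reviewer's sketch (not part of the patch): the prefix heuristic in `fixName` above reuses a prefix already bound to the namespace and otherwise generates one from the `prefix-` root. The example below shows that idea with the JDK's `org.xml.sax.helpers.NamespaceSupport`. It is a simplification under stated assumptions: it looks prefixes up with `getPrefix`, which searches all enclosing scopes, whereas the patch enumerates `getDeclaredPrefixes`; it does not handle the default namespace or attributes; and the class name `PrefixChooser` is hypothetical.

```java
import org.xml.sax.helpers.NamespaceSupport;

// Hypothetical helper illustrating "reuse a prefix, else generate one".
final class PrefixChooser {
    private static final String PREFIX_ROOT = "prefix-";

    /** Returns a qualified name whose prefix is bound to the given URI. */
    static String qualify(NamespaceSupport ns, String uri, String localName) {
        if (uri.isEmpty())
            return localName;                 // no namespace: no prefix

        String prefix = ns.getPrefix(uri);    // never returns the default prefix
        if (prefix == null) {
            // Generate prefix-0, prefix-1, ... until one is unbound.
            int n = 0;
            do {
                prefix = PREFIX_ROOT + n++;
            } while (ns.getURI(prefix) != null);
            ns.declarePrefix(prefix, uri);
        }
        return prefix + ":" + localName;
    }

    public static void main(String[] args) {
        NamespaceSupport ns = new NamespaceSupport();
        ns.pushContext();
        ns.declarePrefix("xhtml", "http://www.w3.org/1999/xhtml");

        System.out.println(qualify(ns, "http://www.w3.org/1999/xhtml", "a"));
        System.out.println(qualify(ns, "urn:example:other", "item"));
        System.out.println(qualify(ns, "urn:example:other", "item2"));
    }
}
```

A real filter would also have to emit the matching `xmlns:prefix-N` attribute, as the patch does through its `attributes` list; the sketch only tracks the binding.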
diff --git a/libjava/classpath/gnu/xml/pipeline/PipelineFactory.java b/libjava/classpath/gnu/xml/pipeline/PipelineFactory.java
new file mode 100644
index 000000000..c2adab021
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/PipelineFactory.java
@@ -0,0 +1,723 @@
+/* PipelineFactory.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import java.io.File;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.OutputStreamWriter;
+import java.lang.reflect.Constructor;
+import java.util.StringTokenizer;
+
+import org.xml.sax.*;
+import org.xml.sax.ext.*;
+
+
+/**
+ * This provides static factory methods for creating simple event pipelines.
+ * These pipelines are specified by strings, suitable for passing on
+ * command lines or embedding in element attributes. For example, one way
+ * to write a pipeline that restores namespace syntax, validates (stopping
+ * the pipeline on validity errors) and then writes valid data to standard
+ * output is this: <pre>
+ * nsfix | validate | write ( stdout )</pre>
+ *
+ * <p> In this syntax, the tokens are always separated by whitespace, and each
+ * stage of the pipeline may optionally have a parameter (which can be a
+ * pipeline) in parentheses. Interior stages are called filters, and the
+ * rightmost end of a pipeline is called a terminus.
+ *
+ * <p> Stages are usually implemented by a single class, which may not be
+ * able to act as both a filter and a terminus; but any terminus can be
+ * automatically turned into a filter, through use of a {@link TeeConsumer}.
+ * The stage identifiers are either class names, or are one of the following
+ * short identifiers built into this class. (Most of these identifiers are
+ * no more than aliases for classes.) The built-in identifiers include:</p>
+
+ <table border="1" cellpadding="3" cellspacing="0">
+ <tr bgcolor="#ccccff" class="TableHeadingColor">
+ <th align="center" width="5%">Stage</th>
+ <th align="center" width="9%">Parameter</th>
+ <th align="center" width="1%">Terminus</th>
+ <th align="center">Description</th>
+ </tr>
+
+ <tr valign="top" align="center">
+ <td><a href="../dom/Consumer.html">dom</a></td>
+ <td><em>none</em></td>
+ <td> yes </td>
+ <td align="left"> Application code can access a DOM Document built
+ from the input event stream. When used as a filter, this buffers
+ data up to an <em>endDocument</em> call, and then uses a DOM parser
+ to report everything that has been recorded (which can easily be
+ less than what was reported to it). </td>
+ </tr>
+ <tr valign="top" align="center">
+ <td><a href="NSFilter.html">nsfix</a></td>
+ <td><em>none</em></td>
+ <td>no</td>
+ <td align="left">This stage ensures that the XML element and attribute
+ names in its output use namespace prefixes and declarations correctly.
+ That is, so that they match the "Namespace plus LocalName" naming data
+ with which each XML element and attribute is already associated. </td>
+ </tr>
+ <tr valign="top" align="center">
+ <td><a href="EventFilter.html">null</a></td>
+ <td><em>none</em></td>
+ <td>yes</td>
+ <td align="left">This stage ignores all input event data.</td>
+ </tr>
+ <tr valign="top" align="center">
+ <td><a href="CallFilter.html">server</a></td>
+ <td><em>required</em><br> server URL </td>
+ <td>no</td>
+ <td align="left">Sends its input as XML request to a remote server,
+ normally a web application server using the HTTP or HTTPS protocols.
+ The output of this stage is the parsed response from that server.</td>
+ </tr>
+ <tr valign="top" align="center">
+ <td><a href="TeeConsumer.html">tee</a></td>
+ <td><em>required</em><br> first pipeline</td>
+ <td>no</td>
+ <td align="left">This sends its events down two paths; its parameter
+ is a pipeline descriptor for the first path, and the second path
+ is the output of this stage.</td>
+ </tr>
+
+ <tr valign="top" align="center">
+ <td><a href="ValidationConsumer.html">validate</a></td>
+ <td><em>none</em></td>
+ <td>yes</td>
+ <td align="left">This checks for validity errors, and reports them
+ through its error handler. The input must include declaration events
+ and some lexical events. </td>
+ </tr>
+ <tr valign="top" align="center">
+ <td><a href="WellFormednessFilter.html">wf</a></td>
+ <td><em>none</em></td>
+ <td>yes</td>
+ <td align="left"> This class provides some basic "well formedness"
+ tests on the input event stream, and reports a fatal error if any
+ of them fail. One example: start/end calls for elements must match.
+ No SAX parser is permitted to produce malformed output, but other
+ components can easily do so.</td>
+ </tr>
+ <tr valign="top" align="center">
+ <td>write</td>
+ <td><em>required</em><br> "stdout", "stderr", or filename</td>
+ <td>yes</td>
+ <td align="left"> Writes its input to the specified output, as pretty
+ printed XML text encoded using UTF-8. Input events must be well
+ formed and "namespace fixed", else the output won't be XML (or possibly
+ namespace) conformant. The symbolic names represent
+ <em>System.out</em> and <em>System.err</em> respectively; names must
+ correspond to files which don't yet exist.</td>
+ </tr>
+ <tr valign="top" align="center">
+ <td>xhtml</td>
+ <td><em>required</em><br> "stdout", "stderr", or filename</td>
+ <td>yes</td>
+ <td align="left"> Like <em>write</em> (above), except that XHTML rules
+ are followed. The XHTML 1.0 Transitional document type is declared,
+ and only ASCII characters are written (for interoperability). Other
+ characters are written as entity or character references; the text is
+ pretty printed.</td>
+ </tr>
+ <tr valign="top" align="center">
+ <td><a href="XIncludeFilter.html">xinclude</a></td>
+ <td><em>none</em></td>
+ <td>no</td>
+ <td align="left">This stage handles XInclude processing.
+ This is like entity inclusion, except that the included content
+ is declared in-line rather than in the DTD at the beginning of
+ a document.
+ </td>
+ </tr>
+ <tr valign="top" align="center">
+ <td><a href="XsltFilter.html">xslt</a></td>
+ <td><em>required</em><br> XSLT stylesheet URI</td>
+ <td>no</td>
+ <td align="left">This stage handles XSLT transformation
+ according to a stylesheet.
+ The implementation of the transformation may not actually
+ stream data, although if such an XSLT engine is in use
+ then that can happen.
+ </td>
+ </tr>
+
+ </table>
+
+ * <p> Note that {@link EventFilter#bind} can automatically eliminate
+ * some filters by setting SAX2 parser features appropriately. This means
+ * that you can routinely put filters like "nsfix", "validate", or "wf" at the
+ * front of a pipeline (for components that need inputs conditioned to match
+ * that level of correctness), and know that they won't actually be used unless
+ * they're absolutely necessary.
+ *
+ * @author David Brownell
+ */
+public class PipelineFactory
+{
+ /**
+ * Creates a simple pipeline according to the description string passed in.
+ */
+ public static EventConsumer createPipeline (String description)
+ throws IOException
+ {
+ return createPipeline (description, null);
+ }
+
+ /**
+ * Extends an existing pipeline by prepending the filter pipeline to the
+ * specified consumer. Some pipelines need more customization than can
+ * be done through this simplified syntax. When they are set up with
+ * direct API calls, use this method to merge more complex pipeline
+ * segments with easily configured ones.
+ */
+ public static EventConsumer createPipeline (
+ String description,
+ EventConsumer next
+ ) throws IOException
+ {
+ // tokens are (for now) what's separated by whitespace;
+ // very easy to parse, but IDs never have spaces.
+
+ StringTokenizer tokenizer;
+ String tokens [];
+
+ tokenizer = new StringTokenizer (description);
+ tokens = new String [tokenizer.countTokens ()];
+ for (int i = 0; i < tokens.length; i++)
+ tokens [i] = tokenizer.nextToken ();
+
+ PipelineFactory factory = new PipelineFactory ();
+ Pipeline pipeline = factory.parsePipeline (tokens, next);
+
+ return pipeline.createPipeline ();
+ }
+
+
+ private PipelineFactory () { /* NYET */ }
+
+
+ /**
+ * Extends an existing pipeline by prepending a pre-tokenized filter
+ * pipeline to the specified consumer. Tokens are class names (or the
+ * predefined aliases), left and right parentheses, and the vertical bar.
+ */
+ public static EventConsumer createPipeline (
+ String tokens [],
+ EventConsumer next
+ ) throws IOException
+ {
+ PipelineFactory factory = new PipelineFactory ();
+ Pipeline pipeline = factory.parsePipeline (tokens, next);
+
+ return pipeline.createPipeline ();
+ }
+
+
+ private String tokens [];
+ private int index;
+
+ private Pipeline parsePipeline (String toks [], EventConsumer next)
+ {
+ tokens = toks;
+ index = 0;
+
+ Pipeline retval = parsePipeline (next);
+
+ if (index != toks.length)
+ throw new ArrayIndexOutOfBoundsException (
+ "extra token: " + tokens [index]);
+ return retval;
+ }
+
+ // pipeline ::= stage | stage '|' pipeline
+ private Pipeline parsePipeline (EventConsumer next)
+ {
+ Pipeline retval = new Pipeline (parseStage ());
+
+ // minimal pipelines: "stage" and "... | id"
+ if (index > (tokens.length - 2)
+ || !"|".equals (tokens [index])
+ ) {
+ retval.next = next;
+ return retval;
+ }
+ index++;
+ retval.rest = parsePipeline (next);
+ return retval;
+ }
+
+ // stage ::= id | id '(' pipeline ')'
+ private Stage parseStage ()
+ {
+ Stage retval = new Stage (tokens [index++]);
+
+ // minimal stages: "id" and "id ( id )"
+ if (index > (tokens.length - 2)
+ || !"(".equals (tokens [index]) /*)*/
+ )
+ return retval;
+
+ index++;
+ retval.param = parsePipeline (null);
+ if (index >= tokens.length)
+ throw new ArrayIndexOutOfBoundsException (
+ "missing right paren");
+ if (/*(*/ !")".equals (tokens [index++]))
+ throw new ArrayIndexOutOfBoundsException (
+ "required right paren, not: " + tokens [index - 1]);
+ return retval;
+ }
+
+
+ //
+ // these classes obey the conventions for constructors, so they're
+ // only built in to this table of shortnames
+ //
+ // - filter (one or two types of arglist)
+ // * last constructor parameter is 'next' element
+ // * optional (first) string parameter
+ //
+ // - terminus (one or two types of arglist)
+ // * optional (only) string parameter
+ //
+ // terminus stages are transformed into filters if needed, by
+ // creating a "tee". filter stages aren't turned to terminus
+ // stages though; either eliminate such stages, or add some
+ // terminus explicitly.
+ //
+ private static final String builtinStages [][] = {
+ { "dom", "gnu.xml.dom.Consumer" },
+ { "nsfix", "gnu.xml.pipeline.NSFilter" },
+ { "null", "gnu.xml.pipeline.EventFilter" },
+ { "server", "gnu.xml.pipeline.CallFilter" },
+ { "tee", "gnu.xml.pipeline.TeeConsumer" },
+ { "validate", "gnu.xml.pipeline.ValidationConsumer" },
+ { "wf", "gnu.xml.pipeline.WellFormednessFilter" },
+ { "xinclude", "gnu.xml.pipeline.XIncludeFilter" },
+ { "xslt", "gnu.xml.pipeline.XsltFilter" },
+
+// XXX want: option for validate, to preload external part of a DTD
+
+ // xhtml, write ... nyet generic-ready
+ };
+
+ private static class Stage
+ {
+ String id;
+ Pipeline param;
+
+ Stage (String name)
+ { id = name; }
+
+ public String toString ()
+ {
+ if (param == null)
+ return id;
+ return id + " ( " + param + " )";
+ }
+
+ private void fail (String message)
+ throws IOException
+ {
+ throw new IOException ("in '" + id
+ + "' stage of pipeline, " + message);
+ }
+
+ EventConsumer createStage (EventConsumer next)
+ throws IOException
+ {
+ String name = id;
+
+ // most builtins are just class aliases
+ for (int i = 0; i < builtinStages.length; i++) {
+ if (id.equals (builtinStages [i][0])) {
+ name = builtinStages [i][1];
+ break;
+ }
+ }
+
+ // Save output as XML or XHTML text
+ if ("write".equals (name) || "xhtml".equals (name)) {
+ String filename;
+ boolean isXhtml = "xhtml".equals (name);
+ OutputStream out = null;
+ TextConsumer consumer;
+
+ if (param == null)
+ fail ("parameter is required");
+
+ filename = param.toString ();
+ if ("stdout".equals (filename))
+ out = System.out;
+ else if ("stderr".equals (filename))
+ out = System.err;
+ else {
+ File f = new File (filename);
+
+/*
+ if (!f.isAbsolute ())
+ fail ("require absolute file paths");
+ */
+ if (f.exists ())
+ fail ("file already exists: " + f.getName ());
+
+// XXX this races against the existence test
+ out = new FileOutputStream (f);
+ }
+
+ if (!isXhtml)
+ consumer = new TextConsumer (out);
+ else
+ consumer = new TextConsumer (
+ new OutputStreamWriter (out, "8859_1"),
+ true);
+
+ consumer.setPrettyPrinting (true);
+ if (next == null)
+ return consumer;
+ return new TeeConsumer (consumer, next);
+
+ } else {
+ //
+ // Here go all the builtins that are just aliases for
+ // classes, and all stage IDs that started out as such
+ // class names. The following logic relies on several
+ // documented conventions for constructor invocation.
+ //
+ String msg = null;
+
+ try {
+ Class klass = Class.forName (name);
+ Class argTypes [] = null;
+ Constructor constructor = null;
+ boolean filter = false;
+ Object params [] = null;
+ Object obj = null;
+
+ // do we need a filter stage?
+ if (next != null) {
+ // "next" consumer is always passed, with
+ // or without the optional string param
+ if (param == null) {
+ argTypes = new Class [1];
+ argTypes [0] = EventConsumer.class;
+
+ params = new Object [1];
+ params [0] = next;
+
+ msg = "no-param filter";
+ } else {
+ argTypes = new Class [2];
+ argTypes [0] = String.class;
+ argTypes [1] = EventConsumer.class;
+
+ params = new Object [2];
+ params [0] = param.toString ();
+ params [1] = next;
+
+ msg = "one-param filter";
+ }
+
+
+ try {
+ constructor = klass.getConstructor (argTypes);
+ } catch (NoSuchMethodException e) {
+ // try creating a filter from a
+ // terminus and a tee
+ filter = true;
+ msg += " built from ";
+ }
+ }
+
+ // build from a terminus stage, with or
+ // without the optional string param
+ if (constructor == null) {
+ String tmp;
+
+ if (param == null) {
+ argTypes = new Class [0];
+ params = new Object [0];
+
+ tmp = "no-param terminus";
+ } else {
+ argTypes = new Class [1];
+ argTypes [0] = String.class;
+
+ params = new Object [1];
+ params [0] = param.toString ();
+
+ tmp = "one-param terminus";
+ }
+ if (msg == null)
+ msg = tmp;
+ else
+ msg += tmp;
+ constructor = klass.getConstructor (argTypes);
+ // NOT creating terminus by dead-ending
+ // filters ... users should think about
+ // that one, something's likely wrong
+ }
+
+ obj = constructor.newInstance (params);
+
+ // return EventConsumers directly, perhaps after
+ // turning them into a filter
+ if (obj instanceof EventConsumer) {
+ if (filter)
+ return new TeeConsumer ((EventConsumer) obj, next);
+ return (EventConsumer) obj;
+ }
+
+ // if it's not a handler, it's an error
+ // we can wrap handlers in a filter
+ EventFilter retval = new EventFilter ();
+ boolean updated = false;
+
+ if (obj instanceof ContentHandler) {
+ retval.setContentHandler ((ContentHandler) obj);
+ updated = true;
+ }
+ if (obj instanceof DTDHandler) {
+ retval.setDTDHandler ((DTDHandler) obj);
+ updated = true;
+ }
+ if (obj instanceof LexicalHandler) {
+ retval.setProperty (
+ EventFilter.PROPERTY_URI + "lexical-handler",
+ obj);
+ updated = true;
+ }
+ if (obj instanceof DeclHandler) {
+ retval.setProperty (
+ EventFilter.PROPERTY_URI + "declaration-handler",
+ obj);
+ updated = true;
+ }
+
+ if (!updated)
+ fail ("class is neither Consumer nor Handler");
+
+ if (filter)
+ return new TeeConsumer (retval, next);
+ return retval;
+
+ } catch (IOException e) {
+ throw e;
+
+ } catch (NoSuchMethodException e) {
+ fail (name + " constructor missing -- " + msg);
+
+ } catch (ClassNotFoundException e) {
+ fail (name + " class not found");
+
+ } catch (Exception e) {
+ // e.printStackTrace ();
+ fail ("stage not available: " + e.getMessage ());
+ }
+ }
+ // NOTREACHED
+ return null;
+ }
+ }
+
+ private static class Pipeline
+ {
+ Stage stage;
+
+ // rest may be null
+ Pipeline rest;
+ EventConsumer next;
+
+ Pipeline (Stage s)
+ { stage = s; }
+
+ public String toString ()
+ {
+ if (rest == null && next == null)
+ return stage.toString ();
+ if (rest != null)
+ return stage + " | " + rest;
+ throw new IllegalArgumentException ("next");
+ }
+
+ EventConsumer createPipeline ()
+ throws IOException
+ {
+ if (next == null) {
+ if (rest == null)
+ next = stage.createStage (null);
+ else
+ next = stage.createStage (rest.createPipeline ());
+ }
+ return next;
+ }
+ }
+
+/*
+ public static void main (String argv [])
+ {
+ try {
+ // three basic terminus cases
+ createPipeline ("null");
+ createPipeline ("validate");
+ createPipeline ("write ( stdout )");
+
+ // four basic filters
+ createPipeline ("nsfix | write ( stderr )");
+ createPipeline ("wf | null");
+ createPipeline ("null | null");
+ createPipeline (
+"call ( http://www.example.com/services/xml-1a ) | xhtml ( stdout )");
+
+ // tee junctions
+ createPipeline ("tee ( validate ) | write ( stdout )");
+ createPipeline ("tee ( nsfix | write ( stdout ) ) | validate");
+
+ // longer pipeline
+ createPipeline ("nsfix | tee ( validate ) | write ( stdout )");
+ createPipeline (
+ "null | wf | nsfix | tee ( validate ) | write ( stdout )");
+
+ // try some parsing error cases
+ try {
+ createPipeline ("null ("); // extra token '('
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+
+ try {
+ createPipeline ("nsfix |"); // extra token '|'
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+
+ try {
+ createPipeline ("xhtml ( foo"); // missing right paren
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+
+ try {
+ createPipeline ("xhtml ( foo bar"); // required right paren
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+
+ try {
+ createPipeline ("tee ( nsfix | validate");// missing right paren
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+
+ // try some construction error cases
+
+ try {
+ createPipeline ("call"); // missing param
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+ try {
+ createPipeline ("call ( foobar )"); // broken param
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+ try {
+ createPipeline ("nsfix ( foobar )"); // illegal param
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+ try {
+ createPipeline ("null ( foobar )"); // illegal param
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+ try {
+ createPipeline ("wf ( foobar )"); // illegal param
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+ try {
+ createPipeline ("xhtml ( foobar.html )");
+ new File ("foobar.html").delete ();
+ // now supported
+ } catch (Exception e) {
+ System.err.println ("** err: " + e.getMessage ()); }
+ try {
+ createPipeline ("xhtml"); // missing param
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+ try {
+ createPipeline ("write ( stdout ) | null"); // nonterminal
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+ try {
+ createPipeline ("validate | null");
+ // now supported
+ } catch (Exception e) {
+ System.err.println ("** err: " + e.getMessage ()); }
+ try {
+ createPipeline ("validate ( foo )"); // illegal param
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+ try {
+ createPipeline ("tee"); // missing param
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+ try {
+ // only builtins so far
+ createPipeline ("com.example.xml.FilterClass");
+ System.err.println ("** didn't report error");
+ } catch (Exception e) {
+ System.err.println ("== err: " + e.getMessage ()); }
+
+ } catch (Exception e) {
+ e.printStackTrace ();
+ }
+ }
+/**/
+
+}
diff --git a/libjava/classpath/gnu/xml/pipeline/TeeConsumer.java b/libjava/classpath/gnu/xml/pipeline/TeeConsumer.java
new file mode 100644
index 000000000..3ac860575
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/TeeConsumer.java
@@ -0,0 +1,417 @@
+/* TeeConsumer.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import org.xml.sax.Attributes;
+import org.xml.sax.ContentHandler;
+import org.xml.sax.DTDHandler;
+import org.xml.sax.ErrorHandler;
+import org.xml.sax.Locator;
+import org.xml.sax.SAXException;
+import org.xml.sax.SAXNotRecognizedException;
+import org.xml.sax.ext.DeclHandler;
+import org.xml.sax.ext.LexicalHandler;
+
+/**
+ * Fans its events out to two other consumers, a "tee" filter stage in an
+ * event pipeline. Networks can be assembled with multiple output points.
+ *
+ * <p> Error handling should be simple if you remember that exceptions
+ * you throw will cancel later stages in that callback's pipeline, and
+ * generally the producer will stop if it sees such an exception. You
+ * may want to protect your pipeline against such backflows, making a
+ * kind of reverse filter (or valve?) so that certain exceptions thrown by
+ * your pipeline will be caught and handled before the producer sees them.
+ * Just use a "try/catch" block, remembering that really important
+ * cleanup tasks should be in "finally" clauses.
+ *
+ * <p> That issue isn't unique to "tee" consumers, but tee consumers have
+ * the additional twist that exceptions thrown by the first consumer
+ * will cause the second consumer not to see the callback (except for
+ * the endDocument callback, which signals state cleanup).
+ *
+ * @author David Brownell
+ */
+final public class TeeConsumer
+ implements EventConsumer,
+ ContentHandler, DTDHandler,
+ LexicalHandler,DeclHandler
+{
+ private EventConsumer first, rest;
+
+ // cached to minimize time overhead
+ private ContentHandler docFirst, docRest;
+ private DeclHandler declFirst, declRest;
+ private LexicalHandler lexFirst, lexRest;
+
+
+ /**
+ * Constructs a consumer which sends all its events to the first
+ * consumer, and then the second one. If the first consumer throws
+ * an exception, the second one will not see the event which
+ * caused that exception to be reported.
+ *
+ * @param car The first consumer to get the events
+ * @param cdr The second consumer to get the events
+ */
+ public TeeConsumer (EventConsumer car, EventConsumer cdr)
+ {
+ if (car == null || cdr == null)
+ throw new NullPointerException ();
+ first = car;
+ rest = cdr;
+
+ //
+ // Cache the handlers.
+ //
+ docFirst = first.getContentHandler ();
+ docRest = rest.getContentHandler ();
+ // DTD handler isn't cached (rarely needed)
+
+ try {
+ declFirst = null;
+ declFirst = (DeclHandler) first.getProperty (
+ EventFilter.DECL_HANDLER);
+ } catch (SAXException e) {}
+ try {
+ declRest = null;
+ declRest = (DeclHandler) rest.getProperty (
+ EventFilter.DECL_HANDLER);
+ } catch (SAXException e) {}
+
+ try {
+ lexFirst = null;
+ lexFirst = (LexicalHandler) first.getProperty (
+ EventFilter.LEXICAL_HANDLER);
+ } catch (SAXException e) {}
+ try {
+ lexRest = null;
+ lexRest = (LexicalHandler) rest.getProperty (
+ EventFilter.LEXICAL_HANDLER);
+ } catch (SAXException e) {}
+ }
+
+/* FIXME
+ /**
+ * Constructs a pipeline, and is otherwise a shorthand for the
+ * two-consumer constructor for this class.
+ *
+ * @param first Description of the first pipeline to get events,
+ * which will be passed to {@link PipelineFactory#createPipeline}
+ * @param rest The second pipeline to get the events
+ * /
+ // constructor used by PipelineFactory
+ public TeeConsumer (String first, EventConsumer rest)
+ throws IOException
+ {
+ this (PipelineFactory.createPipeline (first), rest);
+ }
+*/
+
+ /** Returns the first pipeline to get event calls. */
+ public EventConsumer getFirst ()
+ { return first; }
+
+ /** Returns the second pipeline to get event calls. */
+ public EventConsumer getRest ()
+ { return rest; }
+
+ /** Returns the content handler being used. */
+ final public ContentHandler getContentHandler ()
+ {
+ if (docRest == null)
+ return docFirst;
+ if (docFirst == null)
+ return docRest;
+ return this;
+ }
+
+ /** Returns the dtd handler being used. */
+ final public DTDHandler getDTDHandler ()
+ {
+ // not cached (hardly used)
+ if (rest.getDTDHandler () == null)
+ return first.getDTDHandler ();
+ if (first.getDTDHandler () == null)
+ return rest.getDTDHandler ();
+ return this;
+ }
+
+ /** Returns the declaration or lexical handler being used. */
+ final public Object getProperty (String id)
+ throws SAXNotRecognizedException
+ {
+ //
+ // in degenerate cases, we have no work to do.
+ //
+ Object firstProp = null, restProp = null;
+
+ try { firstProp = first.getProperty (id); }
+ catch (SAXNotRecognizedException e) { /* ignore */ }
+ try { restProp = rest.getProperty (id); }
+ catch (SAXNotRecognizedException e) { /* ignore */ }
+
+ if (restProp == null)
+ return firstProp;
+ if (firstProp == null)
+ return restProp;
+
+ //
+ // we've got work to do; handle two builtin cases.
+ //
+ if (EventFilter.DECL_HANDLER.equals (id))
+ return this;
+ if (EventFilter.LEXICAL_HANDLER.equals (id))
+ return this;
+
+ //
+ // non-degenerate, handled by both consumers, but we don't know
+ // how to handle this.
+ //
+ throw new SAXNotRecognizedException ("can't tee: " + id);
+ }
+
+ /**
+ * Provides the error handler to both subsequent nodes of
+ * this filter stage.
+ */
+ public void setErrorHandler (ErrorHandler handler)
+ {
+ first.setErrorHandler (handler);
+ rest.setErrorHandler (handler);
+ }
+
+
+ //
+ // ContentHandler
+ //
+ public void setDocumentLocator (Locator locator)
+ {
+ // this call is not made by all parsers
+ docFirst.setDocumentLocator (locator);
+ docRest.setDocumentLocator (locator);
+ }
+
+ public void startDocument ()
+ throws SAXException
+ {
+ docFirst.startDocument ();
+ docRest.startDocument ();
+ }
+
+ public void endDocument ()
+ throws SAXException
+ {
+ try {
+ docFirst.endDocument ();
+ } finally {
+ docRest.endDocument ();
+ }
+ }
+
+ public void startPrefixMapping (String prefix, String uri)
+ throws SAXException
+ {
+ docFirst.startPrefixMapping (prefix, uri);
+ docRest.startPrefixMapping (prefix, uri);
+ }
+
+ public void endPrefixMapping (String prefix)
+ throws SAXException
+ {
+ docFirst.endPrefixMapping (prefix);
+ docRest.endPrefixMapping (prefix);
+ }
+
+ public void skippedEntity (String name)
+ throws SAXException
+ {
+ docFirst.skippedEntity (name);
+ docRest.skippedEntity (name);
+ }
+
+ public void startElement (String uri, String localName,
+ String qName, Attributes atts)
+ throws SAXException
+ {
+ docFirst.startElement (uri, localName, qName, atts);
+ docRest.startElement (uri, localName, qName, atts);
+ }
+
+ public void endElement (String uri, String localName, String qName)
+ throws SAXException
+ {
+ docFirst.endElement (uri, localName, qName);
+ docRest.endElement (uri, localName, qName);
+ }
+
+ public void processingInstruction (String target, String data)
+ throws SAXException
+ {
+ docFirst.processingInstruction (target, data);
+ docRest.processingInstruction (target, data);
+ }
+
+ public void characters (char ch [], int start, int length)
+ throws SAXException
+ {
+ docFirst.characters (ch, start, length);
+ docRest.characters (ch, start, length);
+ }
+
+ public void ignorableWhitespace (char ch [], int start, int length)
+ throws SAXException
+ {
+ docFirst.ignorableWhitespace (ch, start, length);
+ docRest.ignorableWhitespace (ch, start, length);
+ }
+
+
+ //
+ // DTDHandler
+ //
+ public void notationDecl (String name, String publicId, String systemId)
+ throws SAXException
+ {
+ DTDHandler l1 = first.getDTDHandler ();
+ DTDHandler l2 = rest.getDTDHandler ();
+
+ l1.notationDecl (name, publicId, systemId);
+ l2.notationDecl (name, publicId, systemId);
+ }
+
+ public void unparsedEntityDecl (String name,
+ String publicId, String systemId,
+ String notationName
+ ) throws SAXException
+ {
+ DTDHandler l1 = first.getDTDHandler ();
+ DTDHandler l2 = rest.getDTDHandler ();
+
+ l1.unparsedEntityDecl (name, publicId, systemId, notationName);
+ l2.unparsedEntityDecl (name, publicId, systemId, notationName);
+ }
+
+
+ //
+ // DeclHandler
+ //
+ public void attributeDecl (String eName, String aName,
+ String type,
+ String mode, String value)
+ throws SAXException
+ {
+ declFirst.attributeDecl (eName, aName, type, mode, value);
+ declRest.attributeDecl (eName, aName, type, mode, value);
+ }
+
+ public void elementDecl (String name, String model)
+ throws SAXException
+ {
+ declFirst.elementDecl (name, model);
+ declRest.elementDecl (name, model);
+ }
+
+ public void externalEntityDecl (String name,
+ String publicId, String systemId)
+ throws SAXException
+ {
+ declFirst.externalEntityDecl (name, publicId, systemId);
+ declRest.externalEntityDecl (name, publicId, systemId);
+ }
+
+ public void internalEntityDecl (String name, String value)
+ throws SAXException
+ {
+ declFirst.internalEntityDecl (name, value);
+ declRest.internalEntityDecl (name, value);
+ }
+
+
+ //
+ // LexicalHandler
+ //
+ public void comment (char ch [], int start, int length)
+ throws SAXException
+ {
+ lexFirst.comment (ch, start, length);
+ lexRest.comment (ch, start, length);
+ }
+
+ public void startCDATA ()
+ throws SAXException
+ {
+ lexFirst.startCDATA ();
+ lexRest.startCDATA ();
+ }
+
+ public void endCDATA ()
+ throws SAXException
+ {
+ lexFirst.endCDATA ();
+ lexRest.endCDATA ();
+ }
+
+ public void startEntity (String name)
+ throws SAXException
+ {
+ lexFirst.startEntity (name);
+ lexRest.startEntity (name);
+ }
+
+ public void endEntity (String name)
+ throws SAXException
+ {
+ lexFirst.endEntity (name);
+ lexRest.endEntity (name);
+ }
+
+ public void startDTD (String name, String publicId, String systemId)
+ throws SAXException
+ {
+ lexFirst.startDTD (name, publicId, systemId);
+ lexRest.startDTD (name, publicId, systemId);
+ }
+
+ public void endDTD ()
+ throws SAXException
+ {
+ lexFirst.endDTD ();
+ lexRest.endDTD ();
+ }
+}
diff --git a/libjava/classpath/gnu/xml/pipeline/TextConsumer.java b/libjava/classpath/gnu/xml/pipeline/TextConsumer.java
new file mode 100644
index 000000000..13dcfa7f6
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/TextConsumer.java
@@ -0,0 +1,117 @@
+/* TextConsumer.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import java.io.*;
+
+import org.xml.sax.*;
+
+import gnu.xml.util.XMLWriter;
+
+
+/**
+ * Terminates a pipeline, consuming events to print them as well formed
+ * XML (or XHTML) text.
+ *
+ * <p> Input must be well formed, and must include XML names (e.g. the
+ * prefixes and prefix declarations must be present), or the output of
+ * this class is undefined.
+ *
+ * @see NSFilter
+ * @see WellFormednessFilter
+ *
+ * @author David Brownell
+ */
+public class TextConsumer extends XMLWriter implements EventConsumer
+{
+ /**
+ * Constructs an event consumer which echoes its input as text,
+ * optionally adhering to some basic XHTML formatting options
+ * which increase interoperability with old (v3) browsers.
+ *
+ * <p> For the best interoperability, when writing as XHTML only
+ * ASCII characters are emitted; other characters are turned to
+ * entity or character references as needed, and no XML declaration
+ * is provided in the document.
+ */
+ public TextConsumer (Writer w, boolean isXhtml)
+ throws IOException
+ {
+ super (w, isXhtml ? "US-ASCII" : null);
+ setXhtml (isXhtml);
+ }
+
+ /**
+ * Constructs a consumer that writes its input as XML text.
+ * XHTML rules are not followed.
+ */
+ public TextConsumer (Writer w)
+ throws IOException
+ {
+ this (w, false);
+ }
+
+ /**
+ * Constructs a consumer that writes its input as XML text,
+ * encoded in UTF-8. XHTML rules are not followed.
+ */
+ public TextConsumer (OutputStream out)
+ throws IOException
+ {
+ this (new OutputStreamWriter (out, "UTF8"), false);
+ }
+
+ /** <b>EventConsumer</b> Returns the document handler being used. */
+ public ContentHandler getContentHandler ()
+ { return this; }
+
+ /** <b>EventConsumer</b> Returns the dtd handler being used. */
+ public DTDHandler getDTDHandler ()
+ { return this; }
+
+ /** <b>XMLReader</b> Retrieves a property (lexical and decl handlers) */
+ public Object getProperty (String propertyId)
+ throws SAXNotRecognizedException
+ {
+ if (EventFilter.LEXICAL_HANDLER.equals (propertyId))
+ return this;
+ if (EventFilter.DECL_HANDLER.equals (propertyId))
+ return this;
+ throw new SAXNotRecognizedException (propertyId);
+ }
+}
diff --git a/libjava/classpath/gnu/xml/pipeline/ValidationConsumer.java b/libjava/classpath/gnu/xml/pipeline/ValidationConsumer.java
new file mode 100644
index 000000000..0346984d3
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/ValidationConsumer.java
@@ -0,0 +1,1928 @@
+/* ValidationConsumer.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import java.io.IOException;
+import java.io.StringReader;
+import java.io.StringWriter;
+import java.util.EmptyStackException;
+import java.util.Enumeration;
+import java.util.Hashtable;
+import java.util.Stack;
+import java.util.StringTokenizer;
+import java.util.Vector;
+
+import org.xml.sax.Attributes;
+import org.xml.sax.EntityResolver;
+import org.xml.sax.ErrorHandler;
+import org.xml.sax.InputSource;
+import org.xml.sax.Locator;
+import org.xml.sax.SAXException;
+import org.xml.sax.SAXParseException;
+import org.xml.sax.XMLReader;
+import org.xml.sax.helpers.XMLReaderFactory;
+
+/**
+ * This class checks SAX2 events to report validity errors; it works as
+ * both a filter and a terminus on an event pipeline. It relies on the
+ * producer of SAX events to: </p> <ol>
+ *
+ * <li> Conform to the specification of a non-validating XML parser that
+ * reads all external entities, reported using SAX2 events. </li>
+ *
+ * <li> Report ignorable whitespace as such (through the ContentHandler
+ * interface). This is, strictly speaking, optional for nonvalidating
+ * XML processors. </li>
+ *
+ * <li> Make SAX2 DeclHandler callbacks, with default
+ * attribute values already normalized (and without "&lt;").</li>
+ *
+ * <li> Make SAX2 LexicalHandler startDTD() and endDTD ()
+ * callbacks. </li>
+ *
+ * <li> Act as if the <em>(URI)/namespace-prefixes</em> property were
+ * set to true, by providing XML 1.0 names and all <code>xmlns*</code>
+ * attributes (rather than omitting either or both). </li>
+ *
+ * </ol>
+ *
+ * <p> At this writing, the major SAX2 parsers (such as &AElig;lfred2,
+ * Crimson, and Xerces) meet these requirements, and this validation
+ * module is used by the optional &AElig;lfred2 validation support.
+ * </p>
+ *
+ * <p> Note that because this is a layered validator, it has to duplicate some
+ * work that the parser is doing; there are also other costs to layering.
+ * However, <em>because of layering it doesn't need a parser</em> in order
+ * to work! You can use it with anything that generates SAX events, such
+ * as an application component that wants to detect invalid content in
+ * a changed area without validating an entire document, or which wants to
+ * ensure that it doesn't write invalid data to a communications partner.</p>
+ *
+ * <p> Also, note that because this is a layered validator, the line numbers
+ * reported for some errors may seem strange. For example, if an element does
+ * not permit character content, the validator
+ * will use the locator provided to it.
+ * That might reflect the last character of a <em>characters</em> event
+ * callback, rather than the first non-whitespace character. </p>
+ *
+ * <hr />
+ *
+ * <!--
+ * <p> Of interest is the fact that unlike most currently known XML validators,
+ * this one can report some cases of non-determinism in element content models.
+ * It is a compile-time option, enabled by default. This will only report
+ * such XML errors if they relate to content actually appearing in a document;
+ * content models aren't aggressively scanned for non-deterministic structure.
+ * Documents which trigger such non-deterministic transitions may be handled
+ * differently by different validating parsers, without losing conformance
+ * to the XML specification. </p>
+ * -->
+ *
+ * <p> Current limitations of the validation performed are in roughly three
+ * categories. </p>
+ *
+ * <p> The first category represents constraints which demand violations
+ * of software layering: exposing lexical details, one of the first things
+ * that <em>application</em> programming interfaces (APIs) hide. These
+ * invariably relate to XML entity handling, and to historical oddities
+ * of the XML validation semantics. Curiously,
+ * recent (Autumn 1999) conformance testing showed that these constraints are
+ * among those handled worst by existing XML validating parsers. Arguments
+ * have been made that each of these VCs should be turned into WFCs (most
+ * of them) or discarded (popular for the standalone declaration); in short,
+ * that these are bugs in the XML specification (not all via SGML): </p><ul>
+ *
+ * <li> The <em>Proper Declaration/PE Nesting</em> and
+ * <em>Proper Group/PE Nesting</em> VCs can't be tested because they
+ * require access to particularly low-level lexical information.
+ * In essence, the reason XML isn't a simple thing to parse is that
+ * it's not a context free grammar, and these constraints elevate that
+ * SGML-derived context sensitivity to the level of a semantic rule.
+ *
+ * <li> The <em>Standalone Document Declaration</em> VC can't be
+ * tested. This is for two reasons. First, this flag isn't made
+ * available through SAX2. Second, it also requires breaking that
+ * lexical layering boundary. (If you ever wondered why classes
+ * in compiler construction or language design barely mention the
+ * existence of context-sensitive grammars, it's because of messy
+ * issues like these.)
+ *
+ * <li> The <em>Entity Declared</em> VC can't be tested, because it
+ * also requires breaking that lexical layering boundary! There's also
+ * another issue: the VC wording (and seemingly intent) is ambiguous.
+ * (This is still true in the "Second edition" XML spec.)
+ * Since there is a WFC of the same name, everyone's life would be
+ * easier if references to undeclared parsed entities were always well
+ * formedness errors, regardless of whether they're parameter entities
+ * or not. (Note that nonvalidating parsers are not required
+ * to report all such well formedness errors if they don't read external
+ * parameter entities, although currently most XML parsers read them
+ * in an attempt to avoid problems from inconsistent parser behavior.)
+ *
+ * </ul>
+ *
+ * <p> The second category of limitations on this validation represent
+ * constraints associated with information that is not guaranteed to be
+ * available (or in one case, <em>is guaranteed not to be available</em>)
+ * through the SAX2 API: </p><ul>
+ *
+ * <li> The <em>Unique Element Type Declaration</em> VC may not be
+ * reportable, if the underlying parser happens not to expose
+ * multiple declarations. (&AElig;lfred2 reports these validity
+ * errors directly.)</li>
+ *
+ * <li> Similarly, the <em>Unique Notation Name</em> VC, added in the
+ * 14-January-2000 XML spec errata to restrict typing models used by
+ * elements, may not be reportable. (&AElig;lfred2 reports these
+ * validity errors directly.) </li>
+ *
+ * </ul>
+ *
+ * <p> A third category relates to ease of implementation. (Think of this
+ * as "bugs".) The most notable issue here is character handling. Rather
+ * than attempting to implement the voluminous character tables in the XML
+ * specification (Appendix B), Unicode rules are used directly from
+ * the java.lang.Character class. Recent JVMs have begun to diverge from
+ * the original specification for that class (Unicode 2.0), meaning that
+ * different JVMs may handle that aspect of conformance differently.
+ * </p>
+ *
+ * <p> Note that for some of the validity errors that SAX2 does not
+ * expose, a nonvalidating parser is permitted (by the XML specification)
+ * to report validity errors. When used with a parser that does so for
+ * the validity constraints mentioned above (or any other SAX2 event
+ * stream producer that does the same thing), overall conformance is
+ * substantially improved.
+ *
+ * @see gnu.xml.aelfred2.SAXDriver
+ * @see gnu.xml.aelfred2.XmlReader
+ *
+ * @author David Brownell
+ */
+public final class ValidationConsumer extends EventFilter
+{
+ // report error if we happen to notice a non-deterministic choice?
+ // we won't report buggy content models; just buggy instances
+ private static final boolean warnNonDeterministic = false;
+
+ // for tracking active content models
+ private String rootName;
+ private Stack contentStack = new Stack ();
+
+ // flags for "saved DTD" processing
+ private boolean disableDeclarations;
+ private boolean disableReset;
+
+ //
+ // most VCs get tested when we see element start tags. the per-element
+ // info (including attributes) recorded here duplicates that found inside
+ // many nonvalidating parsers, hence dual lookups etc ... that's why a
+ // layered validator isn't going to be as fast as a non-layered one.
+ //
+
+ // key = element name; value = ElementInfo
+ private Hashtable elements = new Hashtable ();
+
+ // some VCs relate to ID/IDREF/IDREFS attributes
+ // key = id; value = boolean true (defd) or false (refd)
+ private Hashtable ids = new Hashtable ();
+
+ // we just record declared notation and unparsed entity names.
+ // the implementation here is simple/slow; these features
+ // are seldom used, one hopes they'll wither away soon
+ private Vector notations = new Vector (5, 5);
+ private Vector nDeferred = new Vector (5, 5);
+ private Vector unparsed = new Vector (5, 5);
+ private Vector uDeferred = new Vector (5, 5);
+
+ // note: DocBk 3.1.7 XML defines over 2 dozen notations,
+ // used when defining unparsed entities for graphics
+ // (and maybe in other places)
+
+
+
+ /**
+ * Creates a pipeline terminus which consumes all events passed to
+ * it; this will report validity errors as if they were fatal errors,
+ * unless an error handler is assigned.
+ *
+ * @see #setErrorHandler
+ */
+ // constructor used by PipelineFactory
+ // ... and want one taking system ID of an external subset
+ public ValidationConsumer ()
+ {
+ this (null);
+ }
+
+ /**
+ * Creates a pipeline filter which reports validity errors and then
+ * passes events on to the next consumer if they were not fatal.
+ *
+ * @see #setErrorHandler
+ */
+ // constructor used by PipelineFactory
+ // ... and want one taking system ID of an external subset
+ // (which won't send declaration events)
+ public ValidationConsumer (EventConsumer next)
+ {
+ super (next);
+
+ setContentHandler (this);
+ setDTDHandler (this);
+ try { setProperty (DECL_HANDLER, this); }
+ catch (Exception e) { /* "can't happen" */ }
+ try { setProperty (LEXICAL_HANDLER, this); }
+ catch (Exception e) { /* "can't happen" */ }
+ }
+
+
+ private static final String fakeRootName
+ = ":Nobody:in:their_Right.Mind_would:use:this-name:1x:";
+
+ /**
+ * Creates a validation consumer which is preloaded with the DTD provided.
+ * It does this by constructing a document with that DTD, then parsing
+ * that document and recording its DTD declarations. Then it arranges
+ * not to modify that information.
+ *
+ * <p> The resulting validation consumer will only validate against
+ * the specified DTD, regardless of whether some other DTD is found
+ * in a document being parsed.
+ *
+ * @param rootName The name of the required root element; if this is
+ * null, any root element name will be accepted.
+ * @param publicId If non-null and there is a non-null systemId, this
+ * identifier provides an alternate access identifier for the DTD's
+ * external subset.
+ * @param systemId If non-null, this is a URI (normally URL) that
+ * may be used to access the DTD's external subset.
+ * @param internalSubset If non-null, holds literal markup declarations
+ * comprising the DTD's internal subset.
+ * @param resolver If non-null, this will be provided to the parser for
+ * use when resolving parameter entities (including any external subset).
+ * @param minimalDocument If non-null, a minimal valid document to be
+ * parsed after the synthesized DOCTYPE declaration; if null, an empty
+ * root element is generated instead.
+ *
+ * @exception SAXNotSupportedException If the default SAX parser does
+ * not support the standard lexical or declaration handlers.
+ * @exception SAXParseException If the specified DTD has either
+ * well-formedness or validity errors
+ * @exception IOException If the specified DTD can't be read for
+ * some reason
+ */
+ public ValidationConsumer (
+ String rootName,
+ String publicId,
+ String systemId,
+ String internalSubset,
+ EntityResolver resolver,
+ String minimalDocument
+ ) throws SAXException, IOException
+ {
+ this (null);
+
+ disableReset = true;
+ if (rootName == null)
+ rootName = fakeRootName;
+
+ //
+ // Synthesize document with that DTD; is it possible to do
+ // better for the declaration of the root element?
+ //
+ // NOTE: can't use SAX2 to write internal subsets.
+ //
+ StringWriter writer = new StringWriter ();
+
+ writer.write ("<!DOCTYPE ");
+ writer.write (rootName);
+ if (systemId != null) {
+ writer.write ("\n ");
+ if (publicId != null) {
+ writer.write ("PUBLIC '");
+ writer.write (publicId);
+ writer.write ("'\n\t'");
+ } else
+ writer.write ("SYSTEM '");
+ writer.write (systemId);
+ writer.write ("'");
+ }
+ writer.write (" [ ");
+ if (rootName == fakeRootName) {
+ writer.write ("\n<!ELEMENT ");
+ writer.write (rootName);
+ writer.write (" EMPTY>");
+ }
+ if (internalSubset != null)
+ writer.write (internalSubset);
+ writer.write ("\n ]>");
+
+ if (minimalDocument != null) {
+ writer.write ("\n");
+ writer.write (minimalDocument);
+ writer.write ("\n");
+ } else {
+ writer.write (" <");
+ writer.write (rootName);
+ writer.write ("/>\n");
+ }
+ minimalDocument = writer.toString ();
+
+ //
+ // OK, load it
+ //
+ XMLReader producer;
+
+ producer = XMLReaderFactory.createXMLReader ();
+ bind (producer, this);
+
+ if (resolver != null)
+ producer.setEntityResolver (resolver);
+
+ InputSource in;
+
+ in = new InputSource (new StringReader (minimalDocument));
+ producer.parse (in);
+
+ disableDeclarations = true;
+ if (rootName == fakeRootName)
+ this.rootName = null;
+ }
+
+ private void resetState ()
+ {
+ if (!disableReset) {
+ rootName = null;
+ contentStack.removeAllElements ();
+ elements.clear ();
+ ids.clear ();
+
+ notations.removeAllElements ();
+ nDeferred.removeAllElements ();
+ unparsed.removeAllElements ();
+ uDeferred.removeAllElements ();
+ }
+ }
+
+
+ private void warning (String description)
+ throws SAXException
+ {
+ ErrorHandler errHandler = getErrorHandler ();
+ Locator locator = getDocumentLocator ();
+ SAXParseException err;
+
+ if (errHandler == null)
+ return;
+
+ if (locator == null)
+ err = new SAXParseException (description, null, null, -1, -1);
+ else
+ err = new SAXParseException (description, locator);
+ errHandler.warning (err);
+ }
+
+ // package private (for ChildrenRecognizer)
+ private void error (String description)
+ throws SAXException
+ {
+ ErrorHandler errHandler = getErrorHandler ();
+ Locator locator = getDocumentLocator ();
+ SAXParseException err;
+
+ if (locator == null)
+ err = new SAXParseException (description, null, null, -1, -1);
+ else
+ err = new SAXParseException (description, locator);
+ if (errHandler != null)
+ errHandler.error (err);
+ else // else we always treat it as fatal!
+ throw err;
+ }
+
+ private void fatalError (String description)
+ throws SAXException
+ {
+ ErrorHandler errHandler = getErrorHandler ();
+ Locator locator = getDocumentLocator ();
+ SAXParseException err;
+
+ if (locator != null)
+ err = new SAXParseException (description, locator);
+ else
+ err = new SAXParseException (description, null, null, -1, -1);
+ if (errHandler != null)
+ errHandler.fatalError (err);
+ // we always treat this as fatal, regardless of the handler
+ throw err;
+ }
+
+
+ private static boolean isExtender (char c)
+ {
+ // [88] Extender ::= ...
+ return c == 0x00b7 || c == 0x02d0 || c == 0x02d1 || c == 0x0387
+ || c == 0x0640 || c == 0x0e46 || c == 0x0ec6 || c == 0x3005
+ || (c >= 0x3031 && c <= 0x3035)
+ || (c >= 0x309d && c <= 0x309e)
+ || (c >= 0x30fc && c <= 0x30fe);
+ }
+
+
+ // use augmented Unicode rules, not full XML rules
+ private boolean isName (String name, String context, String id)
+ throws SAXException
+ {
+ char buf [] = name.toCharArray ();
+ boolean pass = true;
+
+ if (buf.length == 0 || (!Character.isUnicodeIdentifierStart (buf [0])
+ && ":_".indexOf (buf [0]) == -1)) // an empty string is never a name
+ pass = false;
+ else {
+ int max = buf.length;
+ for (int i = 1; pass && i < max; i++) {
+ char c = buf [i];
+ if (!Character.isUnicodeIdentifierPart (c)
+ && ":-_.".indexOf (c) == -1
+ && !isExtender (c))
+ pass = false;
+ }
+ }
+
+ if (!pass)
+ error ("In " + context + " for " + id
+ + ", '" + name + "' is not a name");
+ return pass; // true == OK
+ }
+
+ // use augmented Unicode rules, not full XML rules
+ private boolean isNmtoken (String nmtoken, String context, String id)
+ throws SAXException
+ {
+ char buf [] = nmtoken.toCharArray ();
+ boolean pass = true;
+ int max = buf.length;
+
+ // XXX make this share code with isName
+
+ for (int i = 0; pass && i < max; i++) {
+ char c = buf [i];
+ if (!Character.isUnicodeIdentifierPart (c)
+ && ":-_.".indexOf (c) == -1
+ && !isExtender (c))
+ pass = false;
+ }
+
+ if (!pass)
+ error ("In " + context + " for " + id
+ + ", '" + nmtoken + "' is not a name token");
+ return pass; // true == OK
+ }
+
+ private void checkEnumeration (String value, String type, String name)
+ throws SAXException
+ {
+ if (!hasMatch (value, type))
+ // VC: Enumeration
+ error ("Value '" + value
+ + "' for attribute '" + name
+ + "' is not permitted: " + type);
+ }
+
+ // used to test enumerated attributes and mixed content models
+ // package private
+ static boolean hasMatch (String value, String orList)
+ {
+ int len = value.length ();
+ int max = orList.length () - len;
+
+ for (int start = 0;
+ (start = orList.indexOf (value, start)) != -1;
+ start++) {
+ char c;
+
+ if (start >= max) // must leave room for a trailing '|' or ')'
+ break;
+ c = (start == 0) ? '\0' : orList.charAt (start - 1);
+ if (c != '|' && c != '('/*)*/)
+ continue;
+ c = orList.charAt (start + len);
+ if (c != '|' && /*(*/ c != ')')
+ continue;
+ return true;
+ }
+ return false;
+ }
+
+ /**
+ * <b>LexicalHandler</b> Records the declaration of the root
+ * element, so it can be verified later.
+ * Passed to the next consumer, unless this one was
+ * preloaded with a particular DTD.
+ */
+ public void startDTD (String name, String publicId, String systemId)
+ throws SAXException
+ {
+ if (disableDeclarations)
+ return;
+
+ rootName = name;
+ super.startDTD (name, publicId, systemId);
+ }
+
+ /**
+ * <b>LexicalHandler</b> Verifies that all referenced notations
+ * and unparsed entities have been declared.
+ * Passed to the next consumer, unless this one was
+ * preloaded with a particular DTD.
+ */
+ public void endDTD ()
+ throws SAXException
+ {
+ if (disableDeclarations)
+ return;
+
+ // this is a convenient hook for end-of-dtd checks, but we
+ // could also trigger it in the first startElement call.
+ // locator info is more appropriate here though.
+
+ // VC: Notation Declared (NDATA can refer to them before decls,
+ // as can NOTATION attribute enumerations and defaults)
+ int length = nDeferred.size ();
+ for (int i = 0; i < length; i++) {
+ String notation = (String) nDeferred.elementAt (i);
+ if (!notations.contains (notation)) {
+ error ("A declaration referred to notation '" + notation
+ + "' which was never declared");
+ }
+ }
+ nDeferred.removeAllElements ();
+
+ // VC: Entity Name (attribute values can refer to them
+ // before they're declared); VC Attribute Default Legal
+ length = uDeferred.size ();
+ for (int i = 0; i < length; i++) {
+ String entity = (String) uDeferred.elementAt (i);
+ if (!unparsed.contains (entity)) {
+ error ("An attribute default referred to entity '" + entity
+ + "' which was never declared");
+ }
+ }
+ uDeferred.removeAllElements ();
+ super.endDTD ();
+ }
+
+
+ // These are interned, so we can rely on "==" to find the type of
+ // all attributes except enumerations ...
+ // "(this|or|that|...)" and "NOTATION (this|or|that|...)"
+ static final String types [] = {
+ "CDATA",
+ "ID", "IDREF", "IDREFS",
+ "NMTOKEN", "NMTOKENS",
+ "ENTITY", "ENTITIES"
+ };
+
+
+ /**
+ * <b>DeclHandler</b> Records attribute declaration for later use
+ * in validating document content, and checks validity constraints
+ * that are applicable to attribute declarations.
+ * Passed to the next consumer, unless this one was
+ * preloaded with a particular DTD.
+ */
+ public void attributeDecl (
+ String eName,
+ String aName,
+ String type,
+ String mode,
+ String value
+ ) throws SAXException
+ {
+ if (disableDeclarations)
+ return;
+
+ ElementInfo info = (ElementInfo) elements.get (eName);
+ AttributeInfo ainfo = new AttributeInfo ();
+ boolean checkOne = false;
+ boolean interned = false;
+
+ // cheap interning of type names and #FIXED, #REQUIRED
+ // for faster startElement (we can use "==")
+ for (int i = 0; i < types.length; i++) {
+ if (types [i].equals (type)) {
+ type = types [i];
+ interned = true;
+ break;
+ }
+ }
+ if ("#FIXED".equals (mode))
+ mode = "#FIXED";
+ else if ("#REQUIRED".equals (mode))
+ mode = "#REQUIRED";
+
+ ainfo.type = type;
+ ainfo.mode = mode;
+ ainfo.value = value;
+
+ // we might not have seen the content model yet
+ if (info == null) {
+ info = new ElementInfo (eName);
+ elements.put (eName, info);
+ }
+ if ("ID" == type) {
+ checkOne = true;
+ if (!("#REQUIRED" == mode || "#IMPLIED".equals (mode))) {
+ // VC: ID Attribute Default
+ error ("ID attribute '" + aName
+ + "' must be #IMPLIED or #REQUIRED");
+ }
+
+ } else if (!interned && type.startsWith ("NOTATION ")) {
+ checkOne = true;
+
+ // VC: Notation Attributes (notations must be declared)
+ StringTokenizer tokens = new StringTokenizer (
+ type.substring (10, type.lastIndexOf (')')),
+ "|");
+ while (tokens.hasMoreTokens ()) {
+ String token = tokens.nextToken ();
+ if (!notations.contains (token))
+ nDeferred.addElement (token);
+ }
+ }
+ if (checkOne) {
+ for (Enumeration e = info.attributes.keys ();
+ e.hasMoreElements ();
+ /* NOP */) {
+ String name;
+ AttributeInfo ainfo2;
+
+ name = (String) e.nextElement ();
+ ainfo2 = (AttributeInfo) info.attributes.get (name);
+ if (type == ainfo2.type || (!interned && ainfo2.type.startsWith ("NOTATION "))) {
+ // VC: One ID per Element Type
+ // VC: One Notation per Element Type
+ error ("Element '" + eName
+ + "' already has an attribute of type "
+ + (interned ? type : "NOTATION")
+ + " ('" + name
+ + "') so '" + aName
+ + "' is a validity error");
+ }
+ }
+ }
+
+ // VC: Attribute Default Legal
+ if (value != null) {
+
+ if ("CDATA" == type) {
+ // event source rejected '<'
+
+ } else if ("NMTOKEN" == type) {
+ // VC: Name Token (is a nmtoken)
+ isNmtoken (value, "attribute default", aName);
+
+ } else if ("NMTOKENS" == type) {
+ // VC: Name Token (is a nmtoken; at least one value)
+ StringTokenizer tokens = new StringTokenizer (value);
+ if (!tokens.hasMoreTokens ())
+ error ("Default for attribute '" + aName
+ + "' must have at least one name token.");
+ else do {
+ String token = tokens.nextToken ();
+ isNmtoken (token, "attribute default", aName);
+ } while (tokens.hasMoreTokens ());
+
+ } else if ("IDREF" == type || "ENTITY" == type) {
+ // VC: Entity Name (is a name)
+ // VC: IDREF (is a name) (is declared)
+ isName (value, "attribute default", aName);
+ if ("ENTITY" == type && !unparsed.contains (value))
+ uDeferred.addElement (value);
+
+ } else if ("IDREFS" == type || "ENTITIES" == type) {
+ // VC: Entity Name (is a name; at least one value)
+ // VC: IDREF (is a name; at least one value)
+ StringTokenizer names = new StringTokenizer (value);
+ if (!names.hasMoreTokens ())
+ error ("Default for attribute '" + aName
+ + "' must have at least one name.");
+ else do {
+ String name = names.nextToken ();
+ isName (name, "attribute default", aName);
+ if ("ENTITIES" == type && !unparsed.contains (name))
+ uDeferred.addElement (name);
+ } while (names.hasMoreTokens ());
+
+ } else if (type.charAt (0) == '(' /*)*/ ) {
+ // VC: Enumeration (must match)
+ checkEnumeration (value, type, aName);
+
+ } else if (!interned && checkOne) { /* NOTATION */
+ // VC: Notation attributes (must be names)
+ isName (value, "attribute default", aName);
+
+ // VC: Notation attributes (must be declared)
+ if (!notations.contains (value))
+ nDeferred.addElement (value);
+
+ // VC: Enumeration (must match)
+ checkEnumeration (value, type, aName);
+
+ } else if ("ID" != type)
+ throw new RuntimeException ("illegal attribute type: " + type);
+ }
+
+ if (info.attributes.get (aName) == null)
+ info.attributes.put (aName, ainfo);
+ /*
+ else
+ warning ("Element '" + eName
+ + "' already has an attribute named '" + aName + "'");
+ */
+
+ if ("xml:space".equals (aName)) {
+ if (!("(default|preserve)".equals (type)
+ || "(preserve|default)".equals (type)
+ // these next two are arguable; XHTML's DTD doesn't
+ // deserve errors. After all, it's not like any
+ // illegal _value_ could pass ...
+ || "(preserve)".equals (type)
+ || "(default)".equals (type)
+ ))
+ error (
+ "xml:space attribute type must be like '(default|preserve)'"
+ + " not '" + type + "'"
+ );
+
+ }
+ super.attributeDecl (eName, aName, type, mode, value);
+ }
+
+ /**
+ * <b>DeclHandler</b> Records the element declaration for later use
+ * when checking document content, and checks validity constraints that
+ * apply to element declarations. Passed to the next consumer, unless
+ * this one was preloaded with a particular DTD.
+ */
+ public void elementDecl (String name, String model)
+ throws SAXException
+ {
+ if (disableDeclarations)
+ return;
+
+ ElementInfo info = (ElementInfo) elements.get (name);
+
+ // we might have seen an attribute decl already
+ if (info == null) {
+ info = new ElementInfo (name);
+ elements.put (name, info);
+ }
+ if (info.model != null) {
+ // NOTE: not all parsers can report such duplicates.
+ // VC: Unique Element Type Declaration
+ error ("Element type '" + name
+ + "' was already declared.");
+ } else {
+ info.model = model;
+
+ // VC: No Duplicate Types (in mixed content models)
+ if (model.charAt (1) == '#') // (#PCDATA...
+ info.getRecognizer (this);
+ }
+ super.elementDecl (name, model);
+ }
+
+ /**
+ * <b>DeclHandler</b> passed to the next consumer, unless this
+ * one was preloaded with a particular DTD
+ */
+ public void internalEntityDecl (String name, String value)
+ throws SAXException
+ {
+ if (!disableDeclarations)
+ super.internalEntityDecl (name, value);
+ }
+
+ /**
+ * <b>DeclHandler</b> passed to the next consumer, unless this
+ * one was preloaded with a particular DTD
+ */
+ public void externalEntityDecl (String name,
+ String publicId, String systemId)
+ throws SAXException
+ {
+ if (!disableDeclarations)
+ super.externalEntityDecl (name, publicId, systemId);
+ }
+
+
+ /**
+ * <b>DTDHandler</b> Records the notation name, for checking
+ * NOTATION attribute values and declarations of unparsed
+ * entities. Passed to the next consumer, unless this one was
+ * preloaded with a particular DTD.
+ */
+ public void notationDecl (String name, String publicId, String systemId)
+ throws SAXException
+ {
+ if (disableDeclarations)
+ return;
+
+ notations.addElement (name);
+ super.notationDecl (name, publicId, systemId);
+ }
+
+ /**
+ * <b>DTDHandler</b> Records the entity name, for checking
+ * ENTITY and ENTITIES attribute values; records the notation
+ * name if it hasn't yet been declared. Passed to the next consumer,
+ * unless this one was preloaded with a particular DTD.
+ */
+ public void unparsedEntityDecl (
+ String name,
+ String publicId,
+ String systemId,
+ String notationName
+ ) throws SAXException
+ {
+ if (disableDeclarations)
+ return;
+
+ unparsed.addElement (name);
+ if (!notations.contains (notationName))
+ nDeferred.addElement (notationName);
+ super.unparsedEntityDecl (name, publicId, systemId, notationName);
+ }
+
+
+ /**
+ * <b>ContentHandler</b> Ensures that state from any previous parse
+ * has been deleted.
+ * Passed to the next consumer.
+ */
+ public void startDocument ()
+ throws SAXException
+ {
+ resetState ();
+ super.startDocument ();
+ }
+
+
+ private static boolean isAsciiLetter (char c)
+ {
+ return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
+ }
+
+
+ /**
+ * <b>ContentHandler</b> Reports a fatal exception. Validating
+ * XML processors may not skip any entities.
+ */
+ public void skippedEntity (String name)
+ throws SAXException
+ {
+ fatalError ("may not skip entities");
+ }
+
+ /*
+ * SAX2 doesn't expand non-PE refs in attribute defaults...
+ */
+ private String expandDefaultRefs (String s)
+ throws SAXException
+ {
+ if (s.indexOf ('&') < 0)
+ return s;
+
+// FIXME: handle &#nn; &#xnn; &name;
+ String message = "Can't expand refs in attribute default: " + s;
+ warning (message);
+
+ return s;
+ }
+
+ /**
+ * <b>ContentHandler</b> Performs validity checks against element
+ * (and document) content models, and attribute values.
+ * Passed to the next consumer.
+ */
+ public void startElement (
+ String uri,
+ String localName,
+ String qName,
+ Attributes atts
+ ) throws SAXException
+ {
+ //
+ // First check content model for the enclosing scope.
+ //
+ if (contentStack.isEmpty ()) {
+ // VC: Root Element Type
+ if (!qName.equals (rootName)) {
+ if (rootName == null)
+ warning ("This document has no DTD, can't be valid");
+ else
+ error ("Root element type '" + qName
+ + "' was declared to be '" + rootName + "'");
+ }
+ } else {
+ Recognizer state = (Recognizer) contentStack.peek ();
+
+ if (state != null) {
+ Recognizer newstate = state.acceptElement (qName);
+
+ if (newstate == null)
+ error ("Element type '" + qName
+ + "' in element '" + state.type.name
+ + "' violates content model " + state.type.model
+ );
+ if (newstate != state) {
+ contentStack.pop ();
+ contentStack.push (newstate);
+ }
+ }
+ }
+
+ //
+ // Then check that this element was declared, and push the
+ // object used to validate its content model onto our stack.
+ //
+ // This is where the recognizer gets created, if needed; if
+ // it's a "children" (elements) content model, an NDFA is
+ // created. (One recognizer is used per content type, no
+ // matter how complex that recognizer is.)
+ //
+ ElementInfo info;
+
+ info = (ElementInfo) elements.get (qName);
+ if (info == null || info.model == null) {
+ // VC: Element Valid (base clause)
+ error ("Element type '" + qName + "' was not declared");
+ contentStack.push (null);
+
+ // for less diagnostic noise, fake a declaration.
+ elementDecl (qName, "ANY");
+ } else
+ contentStack.push (info.getRecognizer (this));
+
+ //
+ // Then check each attribute present
+ //
+ int len;
+ String aname;
+ AttributeInfo ainfo;
+
+ if (atts != null)
+ len = atts.getLength ();
+ else
+ len = 0;
+
+ for (int i = 0; i < len; i++) {
+ aname = atts.getQName (i);
+
+ if (info == null
+ || (ainfo = (AttributeInfo) info.attributes.get (aname))
+ == null) {
+ // VC: Attribute Value Type
+ error ("Attribute '" + aname
+ + "' was not declared for element type " + qName);
+ continue;
+ }
+
+ String value = atts.getValue (i);
+
+ // note that "==" for type names and "#FIXED" is correct
+ // (and fast) since we've interned those literals.
+
+ if ("#FIXED" == ainfo.mode) {
+ String expanded = expandDefaultRefs (ainfo.value);
+
+ // VC: Fixed Attribute Default
+ if (!value.equals (expanded)) {
+ error ("Attribute '" + aname
+ + "' must match " + expanded
+ );
+ continue;
+ }
+ }
+
+ if ("CDATA" == ainfo.type)
+ continue;
+
+ //
+ // For all other attribute types, there are various
+ // rules to follow.
+ //
+
+ if ("ID" == ainfo.type) {
+ // VC: ID (must be a name)
+ if (isName (value, "ID attribute", aname)) {
+ if (Boolean.TRUE == ids.get (value))
+ // VC: ID (appears once)
+ error ("ID attribute " + aname
+ + " uses an ID value '" + value
+ + "' which was already declared.");
+ else
+ // any forward refs are no longer problems
+ ids.put (value, Boolean.TRUE);
+ }
+ continue;
+ }
+
+ if ("IDREF" == ainfo.type) {
+ // VC: IDREF (value must be a name)
+ if (isName (value, "IDREF attribute", aname)) {
+ // VC: IDREF (must match some ID attribute)
+ if (ids.get (value) == null)
+ // new -- assume it's a forward ref
+ ids.put (value, Boolean.FALSE);
+ }
+ continue;
+ }
+
+ if ("IDREFS" == ainfo.type) {
+ StringTokenizer tokens = new StringTokenizer (value, " ");
+
+ if (!tokens.hasMoreTokens ()) {
+ // VC: IDREF (one or more values)
+ error ("IDREFS attribute " + aname
+ + " must have at least one ID ref");
+ } else do {
+ String id = tokens.nextToken ();
+
+ // VC: IDREF (value must be a name)
+ if (isName (id, "IDREFS attribute", aname)) {
+ // VC: IDREF (must match some ID attribute)
+ if (ids.get (id) == null)
+ // new -- assume it's a forward ref
+ ids.put (id, Boolean.FALSE);
+ }
+ } while (tokens.hasMoreTokens ());
+ continue;
+ }
+
+ if ("NMTOKEN" == ainfo.type) {
+ // VC: Name Token (is a name token)
+ isNmtoken (value, "NMTOKEN attribute", aname);
+ continue;
+ }
+
+ if ("NMTOKENS" == ainfo.type) {
+ StringTokenizer tokens = new StringTokenizer (value, " ");
+
+ if (!tokens.hasMoreTokens ()) {
+ // VC: Name Token (one or more values)
+ error ("NMTOKENS attribute " + aname
+ + " must have at least one name token");
+ } else do {
+ String token = tokens.nextToken ();
+
+ // VC: Name Token (is a name token)
+ isNmtoken (token, "NMTOKENS attribute", aname);
+ } while (tokens.hasMoreTokens ());
+ continue;
+ }
+
+ if ("ENTITY" == ainfo.type) {
+ if (!unparsed.contains (value))
+ // VC: Entity Name
+ error ("Value of attribute '" + aname
+ + "' refers to unparsed entity '" + value
+ + "' which was not declared.");
+ continue;
+ }
+
+ if ("ENTITIES" == ainfo.type) {
+ StringTokenizer tokens = new StringTokenizer (value, " ");
+
+ if (!tokens.hasMoreTokens ()) {
+ // VC: Entity Name (one or more values)
+ error ("ENTITIES attribute " + aname
+ + " must have at least one entity name");
+ } else do {
+ String entity = tokens.nextToken ();
+
+ if (!unparsed.contains (entity))
+ // VC: Entity Name
+ error ("Value of attribute '" + aname
+ + "' refers to unparsed entity '" + entity
+ + "' which was not declared.");
+ } while (tokens.hasMoreTokens ());
+ continue;
+ }
+
+ //
+ // check for enumerations last; more expensive
+ //
+ if (ainfo.type.charAt (0) == '(' /*)*/
+ || ainfo.type.startsWith ("NOTATION ")
+ ) {
+ // VC: Enumeration (value must be defined)
+ checkEnumeration (value, ainfo.type, aname);
+ continue;
+ }
+ }
+
+ //
+ // Last, check that all #REQUIRED attributes were provided
+ //
+ if (info != null) {
+ Hashtable table = info.attributes;
+
+ if (table.size () != 0) {
+ Enumeration e = table.keys ();
+
+ // XXX table.keys uses the heap, bleech -- slows things
+
+ while (e.hasMoreElements ()) {
+ aname = (String) e.nextElement ();
+ ainfo = (AttributeInfo) table.get (aname);
+
+ // "#REQUIRED" mode was interned in attributeDecl
+ if ("#REQUIRED" == ainfo.mode
+ && atts.getValue (aname) == null) {
+ // VC: Required Attribute
+ error ("Attribute '" + aname + "' must be specified "
+ + "for element type " + qName);
+ }
+ }
+ }
+ }
+ super.startElement (uri, localName, qName, atts);
+ }
+
+ /**
+ * <b>ContentHandler</b> Reports a validity error if the element's content
+ * model does not permit character data.
+ * Passed to the next consumer.
+ */
+ public void characters (char ch [], int start, int length)
+ throws SAXException
+ {
+ Recognizer state;
+
+ if (contentStack.empty ())
+ state = null;
+ else
+ state = (Recognizer) contentStack.peek ();
+
+ // NOTE: if this ever supports SAX parsers that don't
+ // report ignorable whitespace as such (only XP?), this class
+ // needs to morph it into ignorableWhitespace() as needed ...
+
+ if (state != null && !state.acceptCharacters ())
+ // VC: Element Valid (clauses three, four -- see recognizer)
+ error ("Character content not allowed in element "
+ + state.type.name);
+
+ super.characters (ch, start, length);
+ }
+
+
+ /**
+ * <b>ContentHandler</b> Reports a validity error if the element's content
+ * model does not permit end-of-element yet, or a well formedness error
+ * if there was no matching startElement call.
+ * Passed to the next consumer.
+ */
+ public void endElement (String uri, String localName, String qName)
+ throws SAXException
+ {
+ try {
+ Recognizer state = (Recognizer) contentStack.pop ();
+
+ if (state != null && !state.completed ())
+ // VC: Element valid (clauses two, three, four; see Recognizer)
+ error ("Premature end for element '"
+ + state.type.name
+ + "', content model "
+ + state.type.model);
+
+ // could insist on match of start element, but that's
+ // something the input stream must guarantee.
+
+ } catch (EmptyStackException e) {
+ fatalError ("endElement without startElement: " + qName
+ + ((uri == null)
+ ? ""
+ : ( " { '" + uri + "', " + localName + " }")));
+ }
+ super.endElement (uri, localName, qName);
+ }
+
+ /**
+ * <b>ContentHandler</b> Checks whether all ID values that were
+ * referenced have been declared, and releases all resources.
+ * Passed to the next consumer.
+ *
+ * @see #setDocumentLocator
+ */
+ public void endDocument ()
+ throws SAXException
+ {
+ for (Enumeration idNames = ids.keys ();
+ idNames.hasMoreElements ();
+ /* NOP */) {
+ String id = (String) idNames.nextElement ();
+
+ if (Boolean.FALSE == ids.get (id)) {
+ // VC: IDREF (must match ID)
+ error ("Undeclared ID value '" + id
+ + "' was referred to by an IDREF/IDREFS attribute");
+ }
+ }
+
+ resetState ();
+ super.endDocument ();
+ }
+
+
+ /** Holds per-element declarations */
+ static private final class ElementInfo
+ {
+ String name;
+ String model;
+
+ // key = attribute name; value = AttributeInfo
+ Hashtable attributes = new Hashtable (11);
+
+ ElementInfo (String n) { name = n; }
+
+ private Recognizer recognizer;
+
+ // for validating content models: one per type, shared,
+ // and constructed only on demand ... so unused elements do
+ // not need to consume resources.
+ Recognizer getRecognizer (ValidationConsumer consumer)
+ throws SAXException
+ {
+ if (recognizer == null) {
+ if ("ANY".equals (model))
+ recognizer = ANY;
+ else if ("EMPTY".equals (model))
+ recognizer = new EmptyRecognizer (this);
+ else if ('#' == model.charAt (1))
+ // n.b. this constructor does a validity check
+ recognizer = new MixedRecognizer (this, consumer);
+ else
+ recognizer = new ChildrenRecognizer (this, consumer);
+ }
+ return recognizer;
+ }
+ }
+
+ /** Holds per-attribute declarations */
+ static private final class AttributeInfo
+ {
+ String type;
+ String mode; // #REQUIRED, etc (or null)
+ String value; // or null
+ }
+
+
+ //
+ // Content model validation
+ //
+
+ static private final Recognizer ANY = new Recognizer (null);
+
+
+ // Base class defines the calls used to validate content,
+ // and supports the "ANY" content model
+ static private class Recognizer
+ {
+ final ElementInfo type;
+
+ Recognizer (ElementInfo t) { type = t; }
+
+ // return true iff character data is legal here
+ boolean acceptCharacters ()
+ throws SAXException
+ // VC: Element Valid (third and fourth clauses)
+ { return true; }
+
+ // null return = failure
+ // otherwise, next state (like an FSM)
+ // prerequisite: tested that name was declared
+ Recognizer acceptElement (String name)
+ throws SAXException
+ // VC: Element Valid (fourth clause)
+ { return this; }
+
+ // return true iff model is completed, can finish
+ boolean completed ()
+ throws SAXException
+ // VC: Element Valid (fourth clause)
+ { return true; }
+
+ public String toString ()
+ // n.b. "children" is the interesting case!
+ { return (type == null) ? "ANY" : type.model; }
+ }
+
+ // "EMPTY" content model -- no characters or elements
+ private static final class EmptyRecognizer extends Recognizer
+ {
+ public EmptyRecognizer (ElementInfo type)
+ { super (type); }
+
+ // VC: Element Valid (first clause)
+ boolean acceptCharacters ()
+ { return false; }
+
+ // VC: Element Valid (first clause)
+ Recognizer acceptElement (String name)
+ { return null; }
+ }
+
+ // "Mixed" content model -- ANY, but restricts elements
+ private static final class MixedRecognizer extends Recognizer
+ {
+ private String permitted [];
+
+ // N.B. constructor tests for duplicated element names (VC)
+ public MixedRecognizer (ElementInfo t, ValidationConsumer v)
+ throws SAXException
+ {
+ super (t);
+
+ // (#PCDATA...)* or (#PCDATA) ==> ... or empty
+ // with the "..." being "|elname|..."
+ StringTokenizer tokens = new StringTokenizer (
+ t.model.substring (8, t.model.lastIndexOf (')')),
+ "|");
+ Vector vec = new Vector ();
+
+ while (tokens.hasMoreTokens ()) {
+ String token = tokens.nextToken ();
+
+ if (vec.contains (token))
+ v.error ("element " + token
+ + " is repeated in mixed content model: "
+ + t.model);
+ else
+ vec.addElement (token.intern ());
+ }
+ permitted = new String [vec.size ()];
+ for (int i = 0; i < permitted.length; i++)
+ permitted [i] = (String) vec.elementAt (i);
+
+ // in one large machine-derived DTD sample, most of about
+ // 250 mixed content models were empty, and 25 had ten or
+ // more entries. 2 had over a hundred elements. Linear
+ // search isn't obviously wrong.
+ }
+
+ // VC: Element Valid (third clause)
+ Recognizer acceptElement (String name)
+ {
+ int length = permitted.length;
+
+ // first pass -- optimistic w.r.t. event source interning
+ // (and document validity)
+ for (int i = 0; i < length; i++)
+ if (permitted [i] == name)
+ return this;
+ // second pass -- pessimistic w.r.t. event source interning
+ for (int i = 0; i < length; i++)
+ if (permitted [i].equals (name))
+ return this;
+ return null;
+ }
+ }
+
+
+ // recognizer loop flags, see later
+ private static final int F_LOOPHEAD = 0x01;
+ private static final int F_LOOPNEXT = 0x02;
+
+ // for debugging -- used to label/count nodes in toString()
+ private static int nodeCount;
+
+ /**
+ * "Children" content model -- these are nodes in NDFA state graphs.
+ * They work in fixed space. Note that these graphs commonly have
+ * cycles, handling features such as zero-or-more and one-or-more.
+ *
+ * <p>It's readonly, so only one copy is ever needed. The content model
+ * stack may have any number of pointers into each graph, when a model
+ * happens to be needed more than once due to element nesting. Since
+ * traversing the graph just moves to another node, and never changes
+ * it, traversals never interfere with each other.
+ *
+ * <p>There is an option to report non-deterministic models. These are
+ * always XML errors, but ones which are not often reported despite the
+ * fact that they can lead to different validating parsers giving
+ * different results for the same input. (The XML spec doesn't require
+ * them to be reported.)
+ *
+ * <p><b>FIXME</b> There's currently at least one known bug here, in that
+ * it's not actually detecting the non-determinism it tries to detect.
+ * (Of the "optional.xml" test, the once-or-twice-2* tests are all non-D;
+ * maybe some others.) This may relate to the issue flagged below as
+ * "should not" happen (but it was), which showed up when patching the
+ * graph to have one exit node (or more EMPTY nodes).
+ */
+ private static final class ChildrenRecognizer extends Recognizer
+ implements Cloneable
+ {
+ // for reporting non-deterministic content models
+ // ... a waste of space if we're not reporting those!
+ // ... along with the 'model' member (in base class)
+ private ValidationConsumer consumer;
+
+ // for CHOICE nodes -- each component is an arc that
+ // accepts a different NAME (or is EMPTY indicating
+ // NDFA termination).
+ private Recognizer components [];
+
+ // for NAME/SEQUENCE nodes -- accepts that NAME and
+ // then goes to the next node (CHOICE, NAME, EMPTY).
+ private String name;
+ private Recognizer next;
+
+ // loops always point back to a CHOICE node. we mark such choice
+ // nodes (F_LOOPHEAD) for diagnostics and faster deep cloning.
+ // We also mark nodes before back pointers (F_LOOPNEXT), to ensure
+ // termination when we patch sequences and loops.
+ private int flags;
+
+
+ // prevent a needless indirection between 'this' and 'node'
+ private void copyIn (ChildrenRecognizer node)
+ {
+ // model & consumer are already set
+ components = node.components;
+ name = node.name;
+ next = node.next;
+ flags = node.flags;
+ }
+
+ // used to construct top level "children" content models
+ public ChildrenRecognizer (ElementInfo type, ValidationConsumer vc)
+ {
+ this (vc, type);
+ populate (type.model.toCharArray (), 0);
+ patchNext (new EmptyRecognizer (type), null);
+ }
+
+ // used internally; populating is separate
+ private ChildrenRecognizer (ValidationConsumer vc, ElementInfo type)
+ {
+ super (type);
+ consumer = vc;
+ }
+
+
+ //
+ // When rewriting some graph nodes we need deep clones in one case;
+ // mostly shallow clones (what the JVM handles for us) are fine.
+ //
+ private ChildrenRecognizer shallowClone ()
+ {
+ try {
+ return (ChildrenRecognizer) clone ();
+ } catch (CloneNotSupportedException e) {
+ throw new Error ("clone");
+ }
+ }
+
+ private ChildrenRecognizer deepClone ()
+ {
+ return deepClone (new Hashtable (37));
+ }
+
+ private ChildrenRecognizer deepClone (Hashtable table)
+ {
+ ChildrenRecognizer retval;
+
+ if ((flags & F_LOOPHEAD) != 0) {
+ retval = (ChildrenRecognizer) table.get (this);
+ if (retval != null)
+ return retval;
+
+ retval = shallowClone ();
+ table.put (this, retval);
+ } else
+ retval = shallowClone ();
+
+ if (next != null) {
+ if (next instanceof ChildrenRecognizer)
+ retval.next = ((ChildrenRecognizer)next)
+ .deepClone (table);
+ else if (!(next instanceof EmptyRecognizer))
+ throw new RuntimeException ("deepClone");
+ }
+
+ if (components != null) {
+ retval.components = new Recognizer [components.length];
+ for (int i = 0; i < components.length; i++) {
+ Recognizer temp = components [i];
+
+ if (temp == null)
+ retval.components [i] = null;
+ else if (temp instanceof ChildrenRecognizer)
+ retval.components [i] = ((ChildrenRecognizer)temp)
+ .deepClone (table);
+ else if (!(temp instanceof EmptyRecognizer))
+ throw new RuntimeException ("deepClone");
+ }
+ }
+
+ return retval;
+ }
+
+ // connect subgraphs, first to next (sequencing)
+ private void patchNext (Recognizer theNext, Hashtable table)
+ {
+ // backpointers must not be repatched or followed
+ if ((flags & F_LOOPNEXT) != 0)
+ return;
+
+ // XXX this table "shouldn't" be needed, right?
+ // but some choice nodes looped if it isn't there.
+ if (table != null && table.get (this) != null)
+ return;
+ if (table == null)
+ table = new Hashtable ();
+
+ // NAME/SEQUENCE
+ if (name != null) {
+ if (next == null)
+ next = theNext;
+ else if (next instanceof ChildrenRecognizer) {
+ ((ChildrenRecognizer)next).patchNext (theNext, table);
+ } else if (!(next instanceof EmptyRecognizer))
+ throw new RuntimeException ("patchNext");
+ return;
+ }
+
+ // CHOICE
+ for (int i = 0; i < components.length; i++) {
+ if (components [i] == null)
+ components [i] = theNext;
+ else if (components [i] instanceof ChildrenRecognizer) {
+ ((ChildrenRecognizer)components [i])
+ .patchNext (theNext, table);
+ } else if (!(components [i] instanceof EmptyRecognizer))
+ throw new RuntimeException ("patchNext");
+ }
+
+ if (table != null && (flags & F_LOOPHEAD) != 0)
+ table.put (this, this);
+ }
+
+ /**
+ * Parses a 'children' spec (or recursively 'cp') and makes this
+ * become a regular graph node.
+ *
+ * @return index after this particle
+ */
+ private int populate (char parseBuf [], int startPos)
+ {
+ int nextPos = startPos + 1;
+ char c;
+
+ if (nextPos < 0 || nextPos >= parseBuf.length)
+ throw new IndexOutOfBoundsException ();
+
+ // Grammar of the string is from the XML spec, but
+ // with whitespace removed by the SAX parser.
+
+ // children ::= (choice | seq) ('?' | '*' | '+')?
+ // cp ::= (Name | choice | seq) ('?' | '*' | '+')?
+ // choice ::= '(' cp ('|' cp)+ ')'
+ // seq ::= '(' cp (',' cp)* ')'
+
+ // interior nodes only
+ // cp ::= name ...
+ if (parseBuf [startPos] != '('/*)*/) {
+ boolean done = false;
+ do {
+ switch (c = parseBuf [nextPos]) {
+ case '?': case '*': case '+':
+ case '|': case ',':
+ case /*(*/ ')':
+ done = true;
+ continue;
+ default:
+ nextPos++;
+ continue;
+ }
+ } while (!done);
+ name = new String (parseBuf, startPos, nextPos - startPos);
+
+ // interior OR toplevel nodes
+ // cp ::= choice ..
+ // cp ::= seq ..
+ } else {
+ // collect everything as a separate list, and merge it
+ // into "this" later if we can (SEQUENCE or singleton)
+ ChildrenRecognizer first;
+
+ first = new ChildrenRecognizer (consumer, type);
+ nextPos = first.populate (parseBuf, nextPos);
+ c = parseBuf [nextPos++];
+
+ if (c == ',' || c == '|') {
+ ChildrenRecognizer current = first;
+ char separator = c;
+ Vector v = null;
+
+ if (separator == '|') {
+ v = new Vector ();
+ v.addElement (first);
+ }
+
+ do {
+ ChildrenRecognizer link;
+
+ link = new ChildrenRecognizer (consumer, type);
+ nextPos = link.populate (parseBuf, nextPos);
+
+ if (separator == ',') {
+ current.patchNext (link, null);
+ current = link;
+ } else
+ v.addElement (link);
+
+ c = parseBuf [nextPos++];
+ } while (c == separator);
+
+ // choice ... collect everything into one array.
+ if (separator == '|') {
+ // assert v.size() > 1
+ components = new Recognizer [v.size ()];
+ for (int i = 0; i < components.length; i++) {
+ components [i] = (Recognizer)
+ v.elementAt (i);
+ }
+ // assert flags == 0
+
+ // sequence ... merge into "this" to be smaller.
+ } else
+ copyIn (first);
+
+ // treat singletons like one-node sequences.
+ } else
+ copyIn (first);
+
+ if (c != /*(*/ ')')
+ throw new RuntimeException ("corrupt content model");
+ }
+
+ //
+ // Arity is optional, and the root of all fun. We keep the
+ // FSM state graph simple by only having NAME/SEQUENCE and
+ // CHOICE nodes (or EMPTY to terminate a model), easily
+ // evaluated. So we rewrite each node that has arity, using
+ // those primitives. We create loops here, if needed.
+ //
+ if (nextPos < parseBuf.length) {
+ c = parseBuf [nextPos];
+ if (c == '?' || c == '*' || c == '+') {
+ nextPos++;
+
+ // Rewrite 'zero-or-one' "?" arity to a CHOICE:
+ // - SEQUENCE (clone, what's next)
+ // - or, what's next
+ // Size cost: N --> N + 1
+ if (c == '?') {
+ Recognizer once = shallowClone ();
+
+ components = new Recognizer [2];
+ components [0] = once;
+ // components [1] initted to null
+ name = null;
+ next = null;
+ flags = 0;
+
+
+ // Rewrite 'zero-or-more' "*" arity to a CHOICE.
+ // - LOOP (clone, back to this CHOICE)
+ // - or, what's next
+ // Size cost: N --> N + 1
+ } else if (c == '*') {
+ ChildrenRecognizer loop = shallowClone ();
+
+ loop.patchNext (this, null);
+ loop.flags |= F_LOOPNEXT;
+ flags = F_LOOPHEAD;
+
+ components = new Recognizer [2];
+ components [0] = loop;
+ // components [1] initted to null
+ name = null;
+ next = null;
+
+
+ // Rewrite 'one-or-more' "+" arity to a SEQUENCE.
+ // Basically (a)+ --> ((a),(a)*).
+ // - this
+ // - CHOICE
+ // * LOOP (clone, back to the CHOICE)
+ // * or, whatever's next
+ // Size cost: N --> 2N + 1
+ } else if (c == '+') {
+ ChildrenRecognizer loop = deepClone ();
+ ChildrenRecognizer choice;
+
+ choice = new ChildrenRecognizer (consumer, type);
+ loop.patchNext (choice, null);
+ loop.flags |= F_LOOPNEXT;
+ choice.flags = F_LOOPHEAD;
+
+ choice.components = new Recognizer [2];
+ choice.components [0] = loop;
+ // choice.components [1] initted to null
+ // choice.name, choice.next initted to null
+
+ patchNext (choice, null);
+ }
+ }
+ }
+
+ return nextPos;
+ }
+
+ // VC: Element Valid (second clause)
+ boolean acceptCharacters ()
+ { return false; }
+
+ // VC: Element Valid (second clause)
+ Recognizer acceptElement (String type)
+ throws SAXException
+ {
+ // NAME/SEQUENCE
+ if (name != null) {
+ if (name.equals (type))
+ return next;
+ return null;
+ }
+
+ // CHOICE ... optionally reporting nondeterminism we
+ // run across. we won't check out every transition
+ // for nondeterminism; only the ones we follow.
+ Recognizer retval = null;
+
+ for (int i = 0; i < components.length; i++) {
+ Recognizer temp = components [i].acceptElement (type);
+
+ if (temp == null)
+ continue;
+ else if (!warnNonDeterministic)
+ return temp;
+ else if (retval == null)
+ retval = temp;
+ else if (retval != temp)
+ consumer.error ("Content model " + this.type.model
+ + " is non-deterministic for " + type);
+ }
+ return retval;
+ }
+
+ // VC: Element Valid (second clause)
+ boolean completed ()
+ throws SAXException
+ {
+ // expecting a specific element
+ if (name != null)
+ return false;
+
+ // choice, some sequences
+ for (int i = 0; i < components.length; i++) {
+ if (components [i].completed ())
+ return true;
+ }
+
+ return false;
+ }
+
+/** /
+ // FOR DEBUGGING ... flattens the graph for printing.
+
+ public String toString ()
+ {
+ StringBuffer buf = new StringBuffer ();
+
+ // only one set of loop labels can be generated
+ // at a time...
+ synchronized (ANY) {
+ nodeCount = 0;
+
+ toString (buf, new Hashtable ());
+ return buf.toString ();
+ }
+ }
+
+ private void toString (StringBuffer buf, Hashtable table)
+ {
+ // When we visit a node, label and count it.
+ // Nodes are never visited/counted more than once.
+ // For small models labels waste space, but if arity
+ // mappings were used the savings are substantial.
+ // (Plus, the output can be more readily understood.)
+ String temp = (String) table.get (this);
+
+ if (temp != null) {
+ buf.append ('{');
+ buf.append (temp);
+ buf.append ('}');
+ return;
+ } else {
+ StringBuffer scratch = new StringBuffer (15);
+
+ if ((flags & F_LOOPHEAD) != 0)
+ scratch.append ("loop");
+ else
+ scratch.append ("node");
+ scratch.append ('-');
+ scratch.append (++nodeCount);
+ temp = scratch.toString ();
+
+ table.put (this, temp);
+ buf.append ('[');
+ buf.append (temp);
+ buf.append (']');
+ buf.append (':');
+ }
+
+ // NAME/SEQUENCE
+ if (name != null) {
+ // n.b. some output encodings turn some name chars into '?'
+ // e.g. with Japanese names and ASCII output
+ buf.append (name);
+ if (components != null) // bug!
+ buf.append ('$');
+ if (next == null)
+ buf.append (",*");
+ else if (next instanceof EmptyRecognizer) // patch-to-next
+ buf.append (",{}");
+ else if (next instanceof ChildrenRecognizer) {
+ buf.append (',');
+ ((ChildrenRecognizer)next).toString (buf, table);
+ } else // bug!
+ buf.append (",+");
+ return;
+ }
+
+ // CHOICE
+ buf.append ("<");
+ for (int i = 0; i < components.length; i++) {
+ if (i != 0)
+ buf.append ("|");
+ if (components [i] instanceof EmptyRecognizer) {
+ buf.append ("{}");
+ } else if (components [i] == null) { // patch-to-next
+ buf.append ('*');
+ } else {
+ ChildrenRecognizer r;
+
+ r = (ChildrenRecognizer) components [i];
+ r.toString (buf, table);
+ }
+ }
+ buf.append (">");
+ }
+/**/
+ }
+}
diff --git a/libjava/classpath/gnu/xml/pipeline/WellFormednessFilter.java b/libjava/classpath/gnu/xml/pipeline/WellFormednessFilter.java
new file mode 100644
index 000000000..7a3db6593
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/WellFormednessFilter.java
@@ -0,0 +1,363 @@
+/* WellFormednessFilter.java --
+ Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import java.util.EmptyStackException;
+import java.util.Stack;
+
+import org.xml.sax.Attributes;
+import org.xml.sax.ErrorHandler;
+import org.xml.sax.Locator;
+import org.xml.sax.SAXException;
+import org.xml.sax.SAXParseException;
+
+/**
+ * This filter reports fatal exceptions in the case of event streams that
+ * are not well formed. The rules currently tested include: <ul>
+ *
+ * <li>setDocumentLocator ... may be called only before startDocument
+ *
+ * <li>startDocument/endDocument ... must be paired, and all other
+ * calls (except setDocumentLocator) must be nested within these.
+ *
+ * <li>startElement/endElement ... must be correctly paired, and
+ * may never appear within CDATA sections.
+ *
+ * <li>comment ... can't contain "--"
+ *
+ * <li>character data ... can't contain "]]&gt;"
+ *
+ * <li>whitespace ... can't contain CR
+ *
+ * <li>whitespace and character data must be within an element
+ *
+ * <li>processing instruction ... can't contain "?&gt;" or CR
+ *
+ * <li>startCDATA/endCDATA ... must be correctly paired.
+ *
+ * </ul>
+ *
+ * <p> Other checks for event stream correctness may be provided in
+ * the future. For example, insisting that
+ * entity boundaries nest correctly,
+ * namespace scopes nest correctly,
+ * namespace values never contain relative URIs,
+ * attributes don't have "&lt;" characters;
+ * and more.
+ *
+ * @author David Brownell
+ */
+public final class WellFormednessFilter extends EventFilter
+{
+ private boolean startedDoc;
+ private Stack elementStack = new Stack ();
+ private boolean startedCDATA;
+ private String dtdState = "before";
+
+
+ /**
+ * Swallows all events after performing well formedness checks.
+ */
+ // constructor used by PipelineFactory
+ public WellFormednessFilter ()
+ { this (null); }
+
+
+ /**
+ * Passes events through to the specified consumer, after first
+ * processing them.
+ */
+ // constructor used by PipelineFactory
+ public WellFormednessFilter (EventConsumer consumer)
+ {
+ super (consumer);
+
+ setContentHandler (this);
+ setDTDHandler (this);
+
+ try {
+ setProperty (LEXICAL_HANDLER, this);
+ } catch (SAXException e) { /* can't happen */ }
+ }
+
+ /**
+ * Resets state as if any preceding event stream was well formed.
+ * Particularly useful if it ended through some sort of error,
+ * and the endDocument call wasn't made.
+ */
+ public void reset ()
+ {
+ startedDoc = false;
+ startedCDATA = false;
+ elementStack.removeAllElements ();
+ }
+
+
+ private SAXParseException getException (String message)
+ {
+ SAXParseException e;
+ Locator locator = getDocumentLocator ();
+
+ if (locator == null)
+ return new SAXParseException (message, null, null, -1, -1);
+ else
+ return new SAXParseException (message, locator);
+ }
+
+ private void fatalError (String message)
+ throws SAXException
+ {
+ SAXParseException e = getException (message);
+ ErrorHandler handler = getErrorHandler ();
+
+ if (handler != null)
+ handler.fatalError (e);
+ throw e;
+ }
+
+ /**
+ * Throws an exception when called after startDocument.
+ *
+ * @param locator the locator, to be used in error reporting or relative
+ * URI resolution.
+ *
+ * @exception IllegalStateException when called after the document
+ * has already been started
+ */
+ public void setDocumentLocator (Locator locator)
+ {
+ if (startedDoc)
+ throw new IllegalStateException (
+ "setDocumentLocator called after startDocument");
+ super.setDocumentLocator (locator);
+ }
+
+ public void startDocument () throws SAXException
+ {
+ if (startedDoc)
+ fatalError ("startDocument called more than once");
+ startedDoc = true;
+ startedCDATA = false;
+ elementStack.removeAllElements ();
+ super.startDocument ();
+ }
+
+ public void startElement (
+ String uri, String localName,
+ String qName, Attributes atts
+ ) throws SAXException
+ {
+ if (!startedDoc)
+ fatalError ("callback outside of document?");
+ if ("inside".equals (dtdState))
+ fatalError ("element inside DTD?");
+ else
+ dtdState = "after";
+ if (startedCDATA)
+ fatalError ("element inside CDATA section");
+ if (qName == null || "".equals (qName))
+ fatalError ("startElement name missing");
+ elementStack.push (qName);
+ super.startElement (uri, localName, qName, atts);
+ }
+
+ public void endElement (String uri, String localName, String qName)
+ throws SAXException
+ {
+ if (!startedDoc)
+ fatalError ("callback outside of document?");
+ if (startedCDATA)
+ fatalError ("element inside CDATA section");
+ if (qName == null || "".equals (qName))
+ fatalError ("endElement name missing");
+
+ try {
+ String top = (String) elementStack.pop ();
+
+ if (!qName.equals (top))
+ fatalError ("<" + top + " ...>...</" + qName + ">");
+ // XXX could record/test namespace info
+ } catch (EmptyStackException e) {
+ fatalError ("endElement without startElement: </" + qName + ">");
+ }
+ super.endElement (uri, localName, qName);
+ }
+
+ public void endDocument () throws SAXException
+ {
+ if (!startedDoc)
+ fatalError ("callback outside of document?");
+ dtdState = "before";
+ startedDoc = false;
+ super.endDocument ();
+ }
+
+
+ public void startDTD (String root, String publicId, String systemId)
+ throws SAXException
+ {
+ if (!startedDoc)
+ fatalError ("callback outside of document?");
+ if ("before" != dtdState)
+ fatalError ("two DTDs?");
+ if (!elementStack.empty ())
+ fatalError ("DTD must precede root element");
+ dtdState = "inside";
+ super.startDTD (root, publicId, systemId);
+ }
+
+ public void notationDecl (String name, String publicId, String systemId)
+ throws SAXException
+ {
+// FIXME: not all parsers will report startDTD() ...
+// we'd rather insist we're "inside".
+ if ("after" == dtdState)
+ fatalError ("not inside DTD");
+ super.notationDecl (name, publicId, systemId);
+ }
+
+ public void unparsedEntityDecl (String name,
+ String publicId, String systemId, String notationName)
+ throws SAXException
+ {
+// FIXME: not all parsers will report startDTD() ...
+// we'd rather insist we're "inside".
+ if ("after" == dtdState)
+ fatalError ("not inside DTD");
+ super.unparsedEntityDecl (name, publicId, systemId, notationName);
+ }
+
+ // FIXME: add the four DeclHandler calls too
+
+ public void endDTD ()
+ throws SAXException
+ {
+ if (!startedDoc)
+ fatalError ("callback outside of document?");
+ if ("inside" != dtdState)
+ fatalError ("DTD ends without start?");
+ dtdState = "after";
+ super.endDTD ();
+ }
+
+ public void characters (char ch [], int start, int length)
+ throws SAXException
+ {
+ int here = start, end = start + length;
+ if (elementStack.empty ())
+ fatalError ("characters must be in an element");
+ // count consecutive ']' so that runs like "]]]>" are caught too
+ int brackets = 0;
+ while (here < end) {
+ char c = ch [here++];
+ // n.b. a "]]>" split across two calls is not detected
+ if (c == ']')
+ brackets++;
+ else if (c == '>' && brackets >= 2)
+ fatalError ("character data can't contain \"]]>\"");
+ else
+ brackets = 0;
+ }
+ super.characters (ch, start, length);
+ }
+
+ public void ignorableWhitespace (char ch [], int start, int length)
+ throws SAXException
+ {
+ int here = start, end = start + length;
+ if (elementStack.empty ())
+ fatalError ("whitespace must be in an element");
+ while (here < end) {
+ if (ch [here++] == '\r')
+ fatalError ("whitespace can't contain CR");
+ }
+ super.ignorableWhitespace (ch, start, length);
+ }
+
+ public void processingInstruction (String target, String data)
+ throws SAXException
+ {
+ if (data.indexOf ('\r') >= 0)
+ fatalError ("PIs can't contain CR");
+ if (data.indexOf ("?>") >= 0)
+ fatalError ("PIs can't contain \"?>\"");
+ }
+
+ public void comment (char ch [], int start, int length)
+ throws SAXException
+ {
+ if (!startedDoc)
+ fatalError ("callback outside of document?");
+ if (startedCDATA)
+ fatalError ("comments can't nest in CDATA");
+ int here = start, end = start + length;
+ while (here < end) {
+ if (ch [here] == '\r')
+ fatalError ("comments can't contain CR");
+ if (ch [here++] != '-')
+ continue;
+ if (here == end)
+ fatalError ("comments can't end with \"--->\"");
+ if (ch [here++] == '-')
+ fatalError ("comments can't contain \"--\"");
+ }
+ super.comment (ch, start, length);
+ }
+
+ public void startCDATA ()
+ throws SAXException
+ {
+ if (!startedDoc)
+ fatalError ("callback outside of document?");
+ if (startedCDATA)
+ fatalError ("CDATA starts can't nest");
+ startedCDATA = true;
+ super.startCDATA ();
+ }
+
+ public void endCDATA ()
+ throws SAXException
+ {
+ if (!startedDoc)
+ fatalError ("callback outside of document?");
+ if (!startedCDATA)
+ fatalError ("CDATA end without start?");
+ startedCDATA = false;
+ super.endCDATA ();
+ }
+}
diff --git a/libjava/classpath/gnu/xml/pipeline/XIncludeFilter.java b/libjava/classpath/gnu/xml/pipeline/XIncludeFilter.java
new file mode 100644
index 000000000..a1445fa0c
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/XIncludeFilter.java
@@ -0,0 +1,579 @@
+/* XIncludeFilter.java --
+ Copyright (C) 2001,2002 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.InputStreamReader;
+import java.net.URL;
+import java.net.URLConnection;
+import java.util.Hashtable;
+import java.util.Stack;
+import java.util.Vector;
+
+import org.xml.sax.Attributes;
+import org.xml.sax.ErrorHandler;
+import org.xml.sax.InputSource;
+import org.xml.sax.Locator;
+import org.xml.sax.SAXException;
+import org.xml.sax.SAXParseException;
+import org.xml.sax.XMLReader;
+import org.xml.sax.helpers.XMLReaderFactory;
+
+import gnu.xml.util.Resolver;
+
+
+
+/**
+ * Filter to process an XPointer-free subset of
+ * <a href="http://www.w3.org/TR/xinclude">XInclude</a>, supporting its
+ * use as a kind of replacement for parsed general entities.
+ * XInclude works much like the <code>#include</code> of C/C++ but
+ * works for XML documents as well as unparsed text files.
+ * Restrictions from the 17-Sept-2002 CR draft of XInclude are as follows:
+ *
+ * <ul>
+ *
+ * <li> URIs must not include fragment identifiers.
+ * The CR specifies support for XPointer <em>element()</em> fragment IDs,
+ * which is not currently implemented here.
+ *
+ * <li> <em>xi:fallback</em> handling of resource errors is not
+ * currently supported.
+ *
+ * <li> DTDs are not supported in included files, since the SAX DTD events
+ * must have completely preceded any included file.
+ * The CR explicitly allows the DTD related portions of the infoset to
+ * grow as an effect of including XML documents.
+ *
+ * <li> <em>xml:base</em> fixup isn't done.
+ *
+ * </ul>
+ *
+ * <p> XML documents that are included will normally be processed using
+ * the default SAX namespace rules, meaning that prefix information may
+ * be discarded. This may be changed with {@link #setSavingPrefixes
+ * setSavingPrefixes()}. <em>You are strongly advised to do this.</em>
+ *
+ * <p> Note that XInclude allows highly incompatible implementations, which
+ * are specialized to handle application-specific infoset extensions. Some
+ * such implementations can be implemented by subclassing this one, but
+ * they may only be substituted in applications at "user option".
+ *
+ * <p>TBD: "IURI" handling.
+ *
+ * @author David Brownell
+ */
+public class XIncludeFilter extends EventFilter implements Locator
+{
+ private Hashtable extEntities = new Hashtable (5, 5);
+ private int ignoreCount;
+ private Stack uris = new Stack ();
+ private Locator locator;
+ private Vector inclusions = new Vector (5, 5);
+ private boolean savingPrefixes;
+
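+ /** Creates a filter that expands XInclude elements and passes the
+ * resulting events to the next consumer. */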
+ public XIncludeFilter (EventConsumer next)
+ throws SAXException
+ {
+ super (next);
+ setContentHandler (this);
+ // DTDHandler callbacks pass straight through
+ setProperty (DECL_HANDLER, this);
+ setProperty (LEXICAL_HANDLER, this);
+ }
+
+ private void fatal (SAXParseException e) throws SAXException
+ {
+ ErrorHandler eh;
+
+ eh = getErrorHandler ();
+ if (eh != null)
+ eh.fatalError (e);
+ throw e;
+ }
+
+ /**
+ * Passes "this" down the filter chain as a proxy locator.
+ */
+ public void setDocumentLocator (Locator locator)
+ {
+ this.locator = locator;
+ super.setDocumentLocator (this);
+ }
+
+ /** Used for proxy locator; do not call directly. */
+ public String getSystemId ()
+ { return (locator == null) ? null : locator.getSystemId (); }
+ /** Used for proxy locator; do not call directly. */
+ public String getPublicId ()
+ { return (locator == null) ? null : locator.getPublicId (); }
+ /** Used for proxy locator; do not call directly. */
+ public int getLineNumber ()
+ { return (locator == null) ? -1 : locator.getLineNumber (); }
+ /** Used for proxy locator; do not call directly. */
+ public int getColumnNumber ()
+ { return (locator == null) ? -1 : locator.getColumnNumber (); }
+
+ /**
+ * Assigns the flag controlling the setting of the SAX2
+ * <em>namespace-prefixes</em> flag when parsing included documents.
+ */
+ public void setSavingPrefixes (boolean flag)
+ { savingPrefixes = flag; }
+
+ /**
+ * Returns the flag controlling the setting of the SAX2
+ * <em>namespace-prefixes</em> flag when parsing included documents.
+ * The default value is the SAX2 default (false), which discards
+ * information that can be useful.
+ */
+ public boolean isSavingPrefixes ()
+ { return savingPrefixes; }
+
+ //
+ // Two mechanisms are interacting here.
+ //
+ // - XML Base implies a stack of base URIs, updated both by
+ // "real entity" boundaries and element boundaries.
+ //
+ // - Active "Real Entities" (for document and general entities,
+ // and for xincluded files) are tracked to prevent circular
+ // inclusions.
+ //
+ private String addMarker (String uri)
+ throws SAXException
+ {
+ if (locator != null && locator.getSystemId () != null)
+ uri = locator.getSystemId ();
+
+ // guard against InputSource objects without system IDs
+ if (uri == null)
+ fatal (new SAXParseException ("Entity URI is unknown", locator));
+
+ try {
+ URL url = new URL (uri);
+
+ uri = url.toString ();
+ if (inclusions.contains (uri))
+ fatal (new SAXParseException (
+ "XInclude, circular inclusion", locator));
+ inclusions.addElement (uri);
+ uris.push (url);
+ } catch (IOException e) {
+ // guard against illegal relative URIs (Xerces)
+ fatal (new SAXParseException ("parser bug: relative URI",
+ locator, e));
+ }
+ return uri;
+ }
+
+ private void pop (String uri)
+ {
+ inclusions.removeElement (uri);
+ uris.pop ();
+ }
+
+ //
+ // Document entity boundaries get both treatments.
+ //
+ public void startDocument () throws SAXException
+ {
+ ignoreCount = 0;
+ addMarker (null);
+ super.startDocument ();
+ }
+
+ public void endDocument () throws SAXException
+ {
+ inclusions.setSize (0);
+ extEntities.clear ();
+ uris.setSize (0);
+ super.endDocument ();
+ }
+
+ //
+ // External general entity boundaries get both treatments.
+ //
+ public void externalEntityDecl (String name,
+ String publicId, String systemId)
+ throws SAXException
+ {
+ if (name.charAt (0) == '%')
+ return;
+ try {
+ URL url = new URL (locator.getSystemId ());
+ systemId = new URL (url, systemId).toString ();
+ } catch (IOException e) {
+ // what could we do?
+ }
+ extEntities.put (name, systemId);
+ }
+
+ public void startEntity (String name)
+ throws SAXException
+ {
+ if (ignoreCount != 0) {
+ ignoreCount++;
+ return;
+ }
+
+ String uri = (String) extEntities.get (name);
+ if (uri != null)
+ addMarker (uri);
+ super.startEntity (name);
+ }
+
+ public void endEntity (String name)
+ throws SAXException
+ {
+ if (ignoreCount != 0) {
+ if (--ignoreCount != 0)
+ return;
+ }
+
+ String uri = (String) extEntities.get (name);
+
+ if (uri != null)
+ pop (uri);
+ super.endEntity (name);
+ }
+
+ //
+ // element boundaries only affect the base URI stack,
+ // unless they're XInclude elements.
+ //
+ public void
+ startElement (String uri, String localName, String qName, Attributes atts)
+ throws SAXException
+ {
+ if (ignoreCount != 0) {
+ ignoreCount++;
+ return;
+ }
+
+ URL baseURI = (URL) uris.peek ();
+ String base;
+
+ base = atts.getValue ("http://www.w3.org/XML/1998/namespace", "base");
+ if (base == null)
+ uris.push (baseURI);
+ else {
+ URL url;
+
+ if (base.indexOf ('#') != -1)
+ fatal (new SAXParseException (
+ "xml:base with fragment: " + base,
+ locator));
+
+ try {
+ baseURI = new URL (baseURI, base);
+ uris.push (baseURI);
+ } catch (Exception e) {
+ fatal (new SAXParseException (
+ "xml:base with illegal uri: " + base,
+ locator, e));
+ }
+ }
+
+ if (!"http://www.w3.org/2001/XInclude".equals (uri)) {
+ super.startElement (uri, localName, qName, atts);
+ return;
+ }
+
+ if ("include".equals (localName)) {
+ String href = atts.getValue ("href");
+ String parse = atts.getValue ("parse");
+ String encoding = atts.getValue ("encoding");
+ URL url = (URL) uris.peek ();
+ SAXParseException x = null;
+
+ if (href == null)
+ fatal (new SAXParseException (
+ "XInclude missing href",
+ locator));
+ if (href.indexOf ('#') != -1)
+ fatal (new SAXParseException (
+ "XInclude with fragment: " + href,
+ locator));
+
+ if (parse == null || "xml".equals (parse))
+ x = xinclude (url, href);
+ else if ("text".equals (parse))
+ x = readText (url, href, encoding);
+ else
+ fatal (new SAXParseException (
+ "unknown XInclude parsing mode: " + parse,
+ locator));
+ if (x == null) {
+ // strip out all child content
+ ignoreCount++;
+ return;
+ }
+
+ // FIXME the 17-Sept-2002 CR of XInclude says we "must"
+ // use xi:fallback elements to handle resource errors,
+ // if they exist.
+ fatal (x);
+
+ } else if ("fallback".equals (localName)) {
+ fatal (new SAXParseException (
+ "illegal top level XInclude 'fallback' element",
+ locator));
+ } else {
+ ErrorHandler eh = getErrorHandler ();
+
+ // CR doesn't say this is an error
+ if (eh != null)
+ eh.warning (new SAXParseException (
+ "unrecognized toplevel XInclude element: " + localName,
+ locator));
+ super.startElement (uri, localName, qName, atts);
+ }
+ }
+
+ public void endElement (String uri, String localName, String qName)
+ throws SAXException
+ {
+ if (ignoreCount != 0) {
+ if (--ignoreCount != 0)
+ return;
+ }
+
+ uris.pop ();
+ if (!("http://www.w3.org/2001/XInclude".equals (uri)
+ && "include".equals (localName)))
+ super.endElement (uri, localName, qName);
+ }
+
+ //
+ // ignore all content within non-empty xi:include elements
+ //
+ public void characters (char ch [], int start, int length)
+ throws SAXException
+ {
+ if (ignoreCount == 0)
+ super.characters (ch, start, length);
+ }
+
+ public void processingInstruction (String target, String value)
+ throws SAXException
+ {
+ if (ignoreCount == 0)
+ super.processingInstruction (target, value);
+ }
+
+ public void ignorableWhitespace (char ch [], int start, int length)
+ throws SAXException
+ {
+ if (ignoreCount == 0)
+ super.ignorableWhitespace (ch, start, length);
+ }
+
+ public void comment (char ch [], int start, int length)
+ throws SAXException
+ {
+ if (ignoreCount == 0)
+ super.comment (ch, start, length);
+ }
+
+ public void startCDATA () throws SAXException
+ {
+ if (ignoreCount == 0)
+ super.startCDATA ();
+ }
+
+ public void endCDATA () throws SAXException
+ {
+ if (ignoreCount == 0)
+ super.endCDATA ();
+ }
+
+ public void startPrefixMapping (String prefix, String uri)
+ throws SAXException
+ {
+ if (ignoreCount == 0)
+ super.startPrefixMapping (prefix, uri);
+ }
+
+ public void endPrefixMapping (String prefix) throws SAXException
+ {
+ if (ignoreCount == 0)
+ super.endPrefixMapping (prefix);
+ }
+
+ public void skippedEntity (String name) throws SAXException
+ {
+ if (ignoreCount == 0)
+ super.skippedEntity (name);
+ }
+
+ // JDK 1.1 seems to need it to be done this way, sigh
+ void setLocator (Locator l) { locator = l; }
+ Locator getLocator () { return locator; }
+
+
+ //
+ // for XIncluded entities, manage the current locator and
+ // filter out events that would be incorrect to report
+ //
+ private class Scrubber extends EventFilter
+ {
+ Scrubber (EventFilter f)
+ throws SAXException
+ {
+ // delegation passes to next in chain
+ super (f);
+
+ // process all content events
+ super.setContentHandler (this);
+ super.setProperty (LEXICAL_HANDLER, this);
+
+ // drop all DTD events
+ super.setDTDHandler (null);
+ super.setProperty (DECL_HANDLER, null);
+ }
+
+ // maintain proxy locator
+ // only one startDocument()/endDocument() pair per event stream
+ public void setDocumentLocator (Locator l)
+ { setLocator (l); }
+ public void startDocument ()
+ { }
+ public void endDocument ()
+ { }
+
+ private void reject (String message) throws SAXException
+ { fatal (new SAXParseException (message, getLocator ())); }
+
+ // only the DTD from the "base document" gets reported
+ public void startDTD (String root, String publicId, String systemId)
+ throws SAXException
+ { reject ("XIncluded DTD: " + systemId); }
+ public void endDTD ()
+ throws SAXException
+ { reject ("XIncluded DTD"); }
+ // ... so this should never happen
+ public void skippedEntity (String name) throws SAXException
+ { reject ("XInclude skipped entity: " + name); }
+
+ // since we rejected DTDs, only builtin entities can be reported
+ }
+
+ // <xi:include parse='xml' ...>
+ // relative to the base URI passed
+ private SAXParseException xinclude (URL url, String href)
+ throws SAXException
+ {
+ XMLReader helper;
+ Scrubber scrubber;
+ Locator savedLocator = locator;
+
+ // start with a parser acting just like our input
+ // modulo DTD-ish stuff (validation flag, entity resolver)
+ helper = XMLReaderFactory.createXMLReader ();
+ helper.setErrorHandler (getErrorHandler ());
+ helper.setFeature (FEATURE_URI + "namespace-prefixes", savingPrefixes);
+
+ // Set up the proxy locator and event filter.
+ scrubber = new Scrubber (this);
+ locator = null;
+ bind (helper, scrubber);
+
+ // Merge the included document, except its DTD
+ try {
+ url = new URL (url, href);
+ href = url.toString ();
+
+ if (inclusions.contains (href))
+ fatal (new SAXParseException (
+ "XInclude, circular inclusion", locator));
+
+ inclusions.addElement (href);
+ uris.push (url);
+ helper.parse (new InputSource (href));
+ return null;
+ } catch (java.io.IOException e) {
+ return new SAXParseException (href, locator, e);
+ } finally {
+ pop (href);
+ locator = savedLocator;
+ }
+ }
+
+ // <xi:include parse='text' ...>
+ // relative to the base URI passed
+ private SAXParseException readText (URL url, String href, String encoding)
+ throws SAXException
+ {
+ InputStream in = null;
+
+ try {
+ URLConnection conn;
+ InputStreamReader reader;
+ char buf [] = new char [4096];
+ int count;
+
+ url = new URL (url, href);
+ conn = url.openConnection ();
+ in = conn.getInputStream ();
+ if (encoding == null)
+ encoding = Resolver.getEncoding (conn.getContentType ());
+ if (encoding == null) {
+ ErrorHandler eh = getErrorHandler ();
+ if (eh != null)
+ eh.warning (new SAXParseException (
+ "guessing text encoding for URL: " + url,
+ locator));
+ reader = new InputStreamReader (in);
+ } else
+ reader = new InputStreamReader (in, encoding);
+
+ while ((count = reader.read (buf, 0, buf.length)) != -1)
+ super.characters (buf, 0, count);
+ in.close ();
+ return null;
+ } catch (IOException e) {
+ return new SAXParseException (
+ "can't XInclude text",
+ locator, e);
+ }
+ }
+}
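Illustration (not part of the patch): the circular-inclusion guard in `addMarker()` and `xinclude()` amounts to resolving each reference to an absolute URI and refusing to re-enter one that is still active. The sketch below is a minimal version under stated assumptions: it uses `java.net.URI` for resolution where the patch uses `java.net.URL`, and `InclusionGuard`, `enter` and `leave` are hypothetical names.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper; not part of gnu.xml.pipeline. */
final class InclusionGuard {
    // absolute URIs of the entities currently being processed
    private final Deque<String> active = new ArrayDeque<>();

    /** Resolves href against base, marks it active, and returns it. */
    String enter(String base, String href) {
        // like the filter above, this sketch rejects fragment identifiers
        if (href.indexOf('#') != -1)
            throw new IllegalArgumentException("fragment not supported: " + href);
        String resolved = URI.create(base).resolve(href).toString();
        if (active.contains(resolved))
            throw new IllegalStateException("circular inclusion: " + resolved);
        active.push(resolved);
        return resolved;
    }

    /** Marks the most recently entered entity as finished. */
    void leave() {
        active.pop();
    }
}
```

A caller would pair each `enter` with a `leave` in a `finally` block, as `xinclude()` does with `pop()`.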
diff --git a/libjava/classpath/gnu/xml/pipeline/XsltFilter.java b/libjava/classpath/gnu/xml/pipeline/XsltFilter.java
new file mode 100644
index 000000000..86b6190c5
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/XsltFilter.java
@@ -0,0 +1,130 @@
+/* XsltFilter.java --
+ Copyright (C) 2001 Free Software Foundation, Inc.
+
+This file is part of GNU Classpath.
+
+GNU Classpath is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+GNU Classpath is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GNU Classpath; see the file COPYING. If not, write to the
+Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA.
+
+Linking this library statically or dynamically with other modules is
+making a combined work based on this library. Thus, the terms and
+conditions of the GNU General Public License cover the whole
+combination.
+
+As a special exception, the copyright holders of this library give you
+permission to link this library with independent modules to produce an
+executable, regardless of the license terms of these independent
+modules, and to copy and distribute the resulting executable under
+terms of your choice, provided that you also meet, for each linked
+independent module, the terms and conditions of the license of that
+module. An independent module is a module which is not derived from
+or based on this library. If you modify this library, you may extend
+this exception to your version of the library, but you are not
+obligated to do so. If you do not wish to do so, delete this
+exception statement from your version. */
+
+package gnu.xml.pipeline;
+
+import java.io.IOException;
+
+import javax.xml.transform.TransformerFactory;
+import javax.xml.transform.TransformerConfigurationException;
+import javax.xml.transform.sax.*;
+import javax.xml.transform.stream.StreamSource;
+
+import org.xml.sax.SAXException;
+import org.xml.sax.ext.LexicalHandler;
+
+
+/**
+ * Packages an XSLT transform as a pipeline component.
+ * Note that all DTD events (callbacks to DeclHandler and DTDHandler
+ * interfaces) are discarded, although XSLT transforms may be set up to
+ * use the LexicalHandler to write DTDs with only an external subset.
+ * Not every XSLT engine will necessarily be usable with this filter,
+ * but current versions of
+ * <a href="http://saxon.sourceforge.net">SAXON</a> and
+ * <a href="http://xml.apache.org/xalan-j">Xalan</a> should work well.
+ *
+ * @see TransformerFactory
+ *
+ * @author David Brownell
+ */
+final public class XsltFilter extends EventFilter
+{
+ /**
+ * Creates a filter that performs the specified transform.
+ * Uses the JAXP 1.1 interfaces to access the default XSLT
+ * engine configured for the current execution context,
+ * and parses the stylesheet without custom EntityResolver
+ * or ErrorHandler support.
+ *
+ * @param stylesheet URI for the stylesheet specifying the
+ * XSLT transform
+ * @param next provides the ContentHandler and LexicalHandler
+ * to receive XSLT output.
+ * @exception SAXException if the stylesheet can't be parsed
+ * @exception IOException if there are difficulties
+ * bootstrapping the XSLT engine, such as it not supporting
+ * SAX well enough to use this way.
+ */
+ public XsltFilter (String stylesheet, EventConsumer next)
+ throws SAXException, IOException
+ {
+ // First, get a transformer with the stylesheet preloaded
+ TransformerFactory tf = null;
+ TransformerHandler th;
+
+ try {
+ SAXTransformerFactory stf;
+
+ tf = TransformerFactory.newInstance ();
+ if (!tf.getFeature (SAXTransformerFactory.FEATURE) // sax inputs
+ || !tf.getFeature (SAXResult.FEATURE) // sax outputs
+ || !tf.getFeature (StreamSource.FEATURE) // stylesheet
+ )
+ throw new IOException ("XSLT factory ("
+ + tf.getClass ().getName ()
+ + ") does not support SAX");
+ stf = (SAXTransformerFactory) tf;
+ th = stf.newTransformerHandler (new StreamSource (stylesheet));
+ } catch (TransformerConfigurationException e) {
+ throw new IOException ("XSLT factory ("
+ + (tf == null
+ ? "none available"
+ : tf.getClass ().getName ())
+ + ") configuration error, "
+ + e.getMessage ()
+ );
+ }
+
+ // Hook its outputs up to the pipeline ...
+ SAXResult out = new SAXResult ();
+
+ out.setHandler (next.getContentHandler ());
+ try {
+ LexicalHandler lh;
+ lh = (LexicalHandler) next.getProperty (LEXICAL_HANDLER);
+ out.setLexicalHandler (lh);
+ } catch (Exception e) {
+ // ignore
+ }
+ th.setResult (out);
+
+ // ... and make sure its inputs look like ours.
+ setContentHandler (th);
+ setProperty (LEXICAL_HANDLER, th);
+ }
+}
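Illustration (not part of the patch): the constructor above relies on the JAXP `TransformerHandler`, which accepts SAX events as transform input and delivers the result to a `SAXResult`. The sketch below shows that wiring using only JDK classes, under stated assumptions: `XsltStageSketch`, `run` and the inline stylesheet are hypothetical, and a plain `DefaultHandler` stands in for the next pipeline stage.

```java
import java.io.StringReader;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo class; not part of gnu.xml.pipeline. */
final class XsltStageSketch {
    // wraps the text of the <in> root element in an <out> element
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/'>"
        + "<out><xsl:value-of select='/in'/></out>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Pushes &lt;in&gt;text&lt;/in&gt; through the transform as SAX events. */
    static String run(String text) {
        try {
            TransformerFactory tf = TransformerFactory.newInstance();
            if (!tf.getFeature(SAXTransformerFactory.FEATURE)
                    || !tf.getFeature(SAXResult.FEATURE))
                throw new IllegalStateException("XSLT factory does not support SAX");

            TransformerHandler th = ((SAXTransformerFactory) tf)
                .newTransformerHandler(new StreamSource(new StringReader(STYLESHEET)));

            // stand-in for the next stage: collect character data only
            final StringBuilder seen = new StringBuilder();
            th.setResult(new SAXResult(new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    seen.append(ch, start, length);
                }
            }));

            // the handler is now an ordinary ContentHandler for input events
            th.startDocument();
            th.startElement("", "in", "in", new AttributesImpl());
            th.characters(text.toCharArray(), 0, text.length());
            th.endElement("", "in", "in");
            th.endDocument();
            return seen.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The feature checks mirror the ones in the constructor above; whether a given XSLT engine passes them depends on the engine configured in the running JVM.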
diff --git a/libjava/classpath/gnu/xml/pipeline/package.html b/libjava/classpath/gnu/xml/pipeline/package.html
new file mode 100644
index 000000000..352f4c87c
--- /dev/null
+++ b/libjava/classpath/gnu/xml/pipeline/package.html
@@ -0,0 +1,255 @@
+<html><head><title>
+blah
+<!--
+/*
+ * Copyright (C) 1999-2001 The Free Software Foundation, Inc.
+ */
+-->
+</title></head><body>
+
+<p>This package exposes a kind of XML processing pipeline, based on sending
+SAX events, which can be used as a component of application architectures.
+Pipelines are used to convey streams of processing events from a producer
+to one or more consumers, and to let each consumer control the data seen by
+later consumers.
+
+<p> There is a <a href="PipelineFactory.html">PipelineFactory</a> class which
+accepts a syntax describing how to construct some simple pipelines. Strings
+describing such pipelines can be used in command line tools (see the
+<a href="../util/DoParse.html">DoParse</a> class)
+and in other places where it is
+useful to let processing be easily reconfigured. Pipelines can of course
+be constructed programmatically, providing access to options that the
+factory won't.
+
+<p> Web applications are supported by making it easy for servlets (or
+non-Java web application components) to be part of a pipeline. They can
+originate XML (or XHTML) data through an <em>InputSource</em> or in
+response to XML messages sent from clients using <em>CallFilter</em>
+pipeline stages. Such facilities are available using the simple syntax
+for pipeline construction.
+
+
+<h2> Programming Models </h2>
+
+<p> Pipelines should be simple to understand.
+
+<ul>
+ <li> XML content, typically entire documents,
+ is pushed through consumers by producers.
+
+ <li> Pipelines are basically about consuming SAX2 callback events,
+ where the events encapsulate XML infoset-level data.<ul>
+
+ <li> Pipelines are constructed by taking one or more consumer
+ stages and combining them to produce a composite consumer.
+
+ <li> A pipeline is presumed to have pending tasks and state from
+ the beginning of its ContentHandler.startDocument() callback until
+ it has returned from its ContentHandler.endDocument() callback.
+
+ <li> Pipelines may have multiple output stages ("fan-out")
+ or multiple input stages ("fan-in") when appropriate.
+
+ <li> Pipelines may be long-lived, but need not be.
+
+ </ul>
+
+ <li> There is flexibility about event production. <ul>
+
+ <li> SAX2 XMLReader objects are producers, which
+ provide a high level "pull" model: documents (text or DOM) are parsed,
+ and the parser pushes individual events through the pipeline.
+
+ <li> Events can be pushed directly to event consumer components
+ by application modules, if they invoke SAX2 callbacks directly.
+ That is, application modules use the XML Infoset as exposed
+ through SAX2 event callbacks.
+
+ </ul>
+
+ <li> Multiple producer threads may concurrently access a pipeline,
+ if they coordinate appropriately.
+
+ <li> Pipeline processing is not the only framework applications
+ will use.
+
+ </ul>
+
+
+<h3> Producers: XMLReader or Custom </h3>
+
+<p> Many producers will be SAX2 XMLReader objects, and
+will read (pull) data which is then written (pushed) as events.
+Typically these will parse XML text (acquired from
+<code>org.xml.sax.helpers.XMLReaderFactory</code>) or a DOM tree
+(using a <code><a href="../util/DomParser.html">DomParser</a></code>).
+These may be bound to an event consumer using a convenience routine,
+<em><a href="EventFilter.html">EventFilter</a>.bind()</em>.
+Once bound, these producers may be given additional documents to
+send through their pipelines.
+
+<p> In other cases, you will write producers yourself. For example, some
+data structures might know how to write themselves out using one or
+more XML models, expressed as sequences of SAX2 event callbacks.
+An application module might
+itself be a producer, issuing startDocument and endDocument events
+and then asking those data structures to write themselves out to a
+given EventConsumer, or walking data structures (such as JDBC query
+results) and applying its own conversion rules. WAP format XML
+(WBXML) can be directly converted to producer output.
+
+<p> SAX2 introduced an "XMLFilter" interface, which is a kind of XMLReader.
+It is most useful in conjunction with its XMLFilterImpl helper class;
+see the <em><a href="EventFilter.html">EventFilter</a></em> javadoc
+for information contrasting that XMLFilterImpl approach with the
+relevant parts of this pipeline framework. Briefly, such XMLFilterImpl
+children can be either producers or consumers, and are more limited in
+configuration flexibility. In this framework, the focus of filters is
+on the EventConsumer side; see the section on
+<a href="#fitting">pipe fitting</a> below.
+
+
+<h3> Consume to Standard or Custom Data Representations </h3>
+
+<p> Many consumers will be used to create standard representations of XML
+data. The <a href="TextConsumer.html">TextConsumer</a> takes its events
+and writes them as text for a single XML document,
+using an internal <a href="../util/XMLWriter.html">XMLWriter</a>.
+The <a href="DomConsumer.html">DomConsumer</a> takes its events and uses
+them to create and populate a DOM Document.
+
+<p> In other cases, you will write consumers yourself. For example,
+you might use a particular unmarshaling filter to produce objects
+that fit your application's requirements, instead of using DOM.
+Such consumers work at the level of XML data models, rather than with
+specific representations such as XML text or a DOM tree. You could
+convert your output directly to WAP format data (WBXML).
+
+
+<h3><a name="fitting">Pipe Fitting</a></h3>
+
+<p> Pipelines are composite event consumers, with each stage having
+the opportunity to transform the data before delivering it to any
+subsequent stages.
+
+<p> The <a href="PipelineFactory.html">PipelineFactory</a> class
+provides access to much of this functionality through a simple syntax.
+See the table in that class's javadoc describing a number of standard
+components. Direct API calls are still needed for many of the most
+interesting pipeline configurations, including ones leveraging actual
+or logical concurrency.
+
+<p> Four basic types of pipe fitting are directly supported. These may
+be used to construct complex pipeline networks. <ul>
+
+ <li> <a href="TeeConsumer.html">TeeConsumer</a> objects split event
+ flow so it goes to two different consumers, one before the other.
+ This is a basic form of event fan-out; you can use this class to
+ copy events to any number of output pipelines.
+
+ <li> Clients can call remote components through HTTP or HTTPS using
+ the <a href="CallFilter.html">CallFilter</a> component, and Servlets
+ can implement such components by extending the
+ <a href="XmlServlet.html">XmlServlet</a> component. Java is not
+ required on either end, and transport protocols other than HTTP may
+ also be used.
+
+ <li> <a href="EventFilter.html">EventFilter</a> objects selectively
+ provide handling for callbacks, and can pass unhandled ones to a
+ subsequent stage. They are often subclassed, since much of the
+ basic filtering machinery is already in place in the base class.
+
+ <li> Applications can merge two event flows by just using the same
+ consumer in each one. If multiple threads are in use, synchronization
+ needs to be addressed by the appropriate application level policy.
+
+ </ul>
+
+<p> Note that filters can be as complex as
+<a href="XsltFilter.html">XSLT transforms</a>
+(when available) on input data, or as simple as removing simple syntax data
+such as ignorable whitespace, comments, and CDATA delimiters.
+Some simple "built-in" filters are part of this package.
+
+
+<h3> Coding Conventions: Filter and Terminus Stages</h3>
+
+<p> If you follow these coding conventions, your classes may be used
+directly (give the full class name) in pipeline descriptions as understood
+by the PipelineFactory. There are four constructors the factory may
+try to use; in order of decreasing numbers of parameters, these are: <ul>
+
+ <li> Filters that need a single String setup parameter should have
+ a public constructor with two parameters: that string, then the
+ EventConsumer holding the "next" consumer to get events.
+
+ <li> Filters that don't need setup parameters should have a public
+ constructor that accepts a single EventConsumer holding the "next"
+ consumer to get events when they are done.
+
+ <li> Terminus stages may have a public constructor taking a single
+ String setup parameter.
+
+ <li> Terminus stages may have a public no-parameters constructor.
+
+ </ul>
+
+<p> Of course, classes may support more than one such usage convention;
+if they do, they can automatically be used in multiple modes. If you
+try to use a terminus class as a filter, and that terminus has a constructor
+with the appropriate number of arguments, it is automatically wrapped in
+a "tee" filter.
+
+
+<h2> Debugging Tip: "Tee" Joints can Snapshot Data</h2>
+
+<p> It can sometimes be hard to see what's happening, when something
+goes wrong. Easily fixed: just snapshot the data. Then you can find
+out where things start to go wrong.
+
+<p> If you're using pipeline descriptors so that they're easily
+administered, just stick a <em>write&nbsp;(&nbsp;filename&nbsp;)</em>
+filter into the pipeline at an appropriate point.
+
+<p> Inside your programs, you can do the same thing directly: perhaps
+by saving a Writer (such as a StringWriter) in a variable, using that
+to create a TextConsumer, and making that the first part of a tee,
+then splicing that into your pipeline at a convenient location.
+
+<p> You can also use a DomConsumer to buffer the data, but remember
+that DOM doesn't save all the information that XML provides, so that DOM
+snapshots are relatively low fidelity. They also are substantially more
+expensive in terms of memory than a StringWriter holding similar data.
+
+<h2> Debugging Tip: Non-XML Producers</h2>
+
+<p> Producers in pipelines don't need to start from XML
+data structures, such as text in XML syntax (likely coming
+from some <em>XMLReader</em> that parses XML) or a
+DOM representation (perhaps with a
+<a href="../util/DomParser.html">DomParser</a>).
+
+<p> One common type of event producer will instead make
+direct calls to SAX event handlers returned from an
+<a href="EventConsumer.html">EventConsumer</a>.
+For example, making <em>ContentHandler.startElement</em>
+calls and matching <em>ContentHandler.endElement</em> calls.
+
+<p> Applications making such calls can catch certain
+common "syntax errors" by using a
+<a href="WellFormednessFilter.html">WellFormednessFilter</a>.
+That filter will detect (and report) erroneous input data
+such as mismatched document, element, or CDATA start/end calls.
+Use such a filter near the head of the pipeline that your
+producer feeds, at least while debugging, to help ensure that
+you're providing legal XML Infoset data.
+
+<p> You can also arrange to validate data on the fly.
+For DTD validation, you can configure a
+<a href="ValidationConsumer.html">ValidationConsumer</a>
+to work as a filter, using any DTD you choose.
+Other validation schemes can be handled with other
+validation filters.
+
+</body></html>
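Illustration (not part of the patch): the "Non-XML Producers" tip in package.html describes producers that call SAX handlers directly, with a checking stage near the head of the pipeline. The sketch below shows the idea using only JDK classes, under stated assumptions: `NestingCheck` and `produce` are hypothetical names, and the checker covers only document and element nesting, far less than the package's WellFormednessFilter.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical checking stage; not part of gnu.xml.pipeline. */
final class NestingCheck extends DefaultHandler {
    private final Deque<String> open = new ArrayDeque<>();
    private boolean started;

    @Override
    public void startDocument() throws SAXException {
        if (started)
            throw new SAXException("startDocument called twice");
        started = true;
    }

    @Override
    public void startElement(String uri, String local, String qName,
                             Attributes atts) throws SAXException {
        if (!started)
            throw new SAXException("callback outside of document");
        open.push(qName);
    }

    @Override
    public void endElement(String uri, String local, String qName)
            throws SAXException {
        if (open.isEmpty() || !open.pop().equals(qName))
            throw new SAXException("mismatched end tag: " + qName);
    }

    @Override
    public void endDocument() throws SAXException {
        if (!open.isEmpty())
            throw new SAXException("unclosed element: " + open.peek());
        started = false;
    }

    /**
     * A hand-written producer: emits one element around some text.
     * Returns null on success, else the message of the reported error.
     */
    static String produce(ContentHandler out, String openName, String closeName) {
        try {
            out.startDocument();
            out.startElement("", openName, openName, new AttributesImpl());
            out.characters("hi".toCharArray(), 0, 2);
            out.endElement("", closeName, closeName);
            out.endDocument();
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }
}
```

Calling `NestingCheck.produce(new NestingCheck(), "doc", "body")` reports the mismatched end tag at the point where the producer made the bad call, which is the benefit the debugging tip describes.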