Wednesday, October 1, 2008

Java, XDK, DOM trees and "instanceof"

Ran into an interesting bug today where code would run one way on the plain xml parser that comes with Java and fail when running with the Oracle XML Parser (XDK).:

public void collectNamespaces(Element element, Map collection)
{
   NamedNodeMap nam = element.getAttributes();
   assert nam!=null : "Attribute map can't be null";

   // Do something with the attributes
   //

   Node parent = element.getParentNode();
   if (parent!=null && parent instanceof Element)
   {
      collectNamespaces((Element)parent, collection);
   }
}

If you ran with the XDK, the last turn of the recursion would have element being an instance of XMLDocument, the XDK implementation of Document, which returns null for getAttributes. Now getAttributes should never return null for an Element and a quick look at org.w3c.dom.Document suggests it extends "org.w3c.dom.Node". So how did we get to element being assigned with a document instance?

It turns out that internally XMLDocument extends XMLElement which implements "org.w3c.dom.Element". It should be free to do this internally and only the instanceof is showing up this problem. But how would the code have been written to not have seen this problem? Well it turns on that Node.getNodeType provide you with enough information to write this code without "instanceof" or casting. (Turns out that getAttributes() is on the Node interface).

public void collectNamespaces(Node element, Map collection)
{
   assert element.getNodeType() == Node.ELEMENT_NODE : "Should only accept Element for namespace lookup"
   NamedNodeMap nam = element.getAttributes();
   assert nam!=null : "Attribute map can't be null";

   // Do something with the attributes
   //

   Node parent = element.getParentNode();
   if (parent!=null && parent.getNodeType()==Node.ELEMENT_NODE)
   {
      collectNamespaces(parent, collection);
   }
}

In general the lesson here is to use instanceof only where there isn't another way of doing things. It can tell you a little bit too much about the class you are working with. Sometimes you really don't need to know it's ancestry in that much detail.

No comments: