Tuesday, 10 January 2012

Approaches to XML - Part 3 - JAXB

If you remember from my previous blogs, I’m covering different approaches to parsing XML messages using the outrageously corny scenario of Pete’s Perfect Pizza, the pizza company with big ideas. In this story, you are an employee of Pete’s and have been asked to implement a system for sending orders from the front desk to the kitchen and you came up with the idea of using XML. You’ve just got your SAX Parser working, but Pete’s going global, opening kitchens around the world taking orders using the Internet. He’s hired some consultants who’ve come up with a plan for extending your cosy XML message and they’ve specified it using a schema. They’ve also enhanced your message by adding in one of there own customer schemas. The result is that the following XSD files land in your inbox and you need to get busy...

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSpy v2011 sp1 (http://www.altova.com) by Roger Hughes (Marin Solutions Ltd) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ppp="http://www.petesperfectpizza.com" xmlns:cust="http://customer.dets" targetNamespace="http://www.petesperfectpizza.com" elementFormDefault="qualified" attributeFormDefault="unqualified" version="1.00">
 <!-- Import the Namespaces required -->
 <xs:import namespace="http://customer.dets" schemaLocation="customer.xsd"/>
 <!-- The Root Node -->
 <xs:element name="PizzaOrder">
  <xs:annotation>
   <xs:documentation>A wrapper around the customer and the pizza order</xs:documentation>
  </xs:annotation>
  <xs:complexType>
   <xs:sequence>
    <xs:element name="orderID" type="ppp:CorrelationIdentifierType"/>
    <xs:element name="date" type="ppp:DateType"/>
    <xs:element name="time" type="ppp:TimeType"/>
    <xs:element name="Customer" type="cust:CustomerType"/>
    <xs:element ref="ppp:pizzas"/>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
 <!-- The Pizza Order-->
 <xs:element name="pizzas">
  <xs:annotation>
   <xs:documentation>This is a list of pizzas ordered by the customer</xs:documentation>
  </xs:annotation>
  <xs:complexType>
   <xs:sequence>
    <xs:element name="pizza" type="ppp:PizzaType"  minOccurs="1" maxOccurs="unbounded"/>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
 <xs:complexType name="PizzaType">
  <xs:sequence>
   <xs:element name="name" type="ppp:PizzaNameType">
    <xs:annotation>
     <xs:documentation>The type of pizza on the menu</xs:documentation>
    </xs:annotation>
   </xs:element>
   <xs:element name="base" type="ppp:BaseType">
    <xs:annotation>
     <xs:documentation>type of base</xs:documentation>
    </xs:annotation>
   </xs:element>
   <xs:element name="quantity" type="ppp:QuantityType">
    <xs:annotation>
     <xs:documentation>quantity of pizzas</xs:documentation>
    </xs:annotation>
   </xs:element>
  </xs:sequence>
 </xs:complexType>
 <xs:simpleType name="PizzaNameType">
  <xs:restriction base="xs:token">
   <xs:enumeration value="Margherita">
    <xs:annotation>
     <xs:documentation>Plain and Simple</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
   <xs:enumeration value="Marinara">
    <xs:annotation>
     <xs:documentation>Garlic Pizza...</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
   <xs:enumeration value="Prosciutto e Funghi">
    <xs:annotation>
     <xs:documentation>Ham and Musheroom</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
   <xs:enumeration value="Capricciosa">
    <xs:annotation>
     <xs:documentation>with an egg</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="BaseType">
  <xs:restriction base="xs:token">
   <xs:enumeration value="thin">
    <xs:annotation>
     <xs:documentation>thin base traditional</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
   <xs:enumeration value="thick">
    <xs:annotation>
     <xs:documentation>Thick base</xs:documentation>
    </xs:annotation>
   </xs:enumeration>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="QuantityType">
  <xs:restriction base="xs:nonNegativeInteger"/>
 </xs:simpleType>
 <xs:simpleType name="CorrelationIdentifierType">
  <xs:restriction base="xs:token">
   <xs:maxLength value="44"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="DateType">
  <xs:annotation>
   <xs:documentation>The date is in the Common Era (minus sign in years is not permitted)</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:date">
   <xs:pattern value="\d{4}-\d{2}-\d{2}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="TimeType">
  <xs:annotation>
   <xs:documentation>The time zone although not included UTC is implied</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:time">
   <xs:pattern value="\d{2}:\d{2}:\d{2}(\.\d+)?"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSpy v2011 sp1 (http://www.altova.com) by Roger Hughes (Marin Solutions Ltd) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:cust="http://customer.dets" targetNamespace="http://customer.dets" elementFormDefault="qualified" attributeFormDefault="unqualified">
 <xs:element name="Customer" type="cust:CustomerType">
  <xs:annotation>
   <xs:documentation>Generic Customer Definition</xs:documentation>
  </xs:annotation>
 </xs:element>
 <xs:complexType name="CustomerType">
  <xs:sequence>
   <xs:element name="name" type="cust:NameType"/>
   <xs:element name="phone" type="cust:PhoneNumberType"/>
   <xs:element name="address" type="cust:AddressType"/>
  </xs:sequence>
 </xs:complexType>
 <xs:complexType name="NameType">
  <xs:sequence>
   <xs:element name="firstName" type="cust:FirstNameType"/>
   <xs:element name="lastName" type="cust:LastNameType"/>
  </xs:sequence>
 </xs:complexType>
 <xs:simpleType name="FirstNameType">
  <xs:annotation>
   <xs:documentation>The Customer's first name</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:token">
   <xs:maxLength value="16"/>
   <xs:pattern value=".{1,16}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="LastNameType">
  <xs:annotation>
   <xs:documentation>The Customer's surname</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:token">
   <xs:pattern value=".{1,48}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:complexType name="AddressType">
  <xs:sequence>
   <xs:element name="houseNumber" type="cust:HouseNumberType"/>
   <xs:element name="street" type="cust:AddressLineType"/>
   <xs:element name="town" type="cust:AddressLineType" minOccurs="0"/>
   <xs:element name="area" type="cust:AddressLineType" minOccurs="0"/>
   <xs:element name="postCode" type="cust:PostCodeType"/>
  </xs:sequence>
 </xs:complexType>
 <xs:simpleType name="HouseNumberType">
  <xs:annotation>
   <xs:documentation>The house number</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:nonNegativeInteger"/>
 </xs:simpleType>
 <xs:simpleType name="AddressLineType">
  <xs:annotation>
   <xs:documentation>A line of an address</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:token">
   <xs:pattern value=".{1,100}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="PhoneNumberType">
  <xs:restriction base="xs:token">
   <xs:maxLength value="18"/>
   <xs:pattern value=".{1,18}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="PostCodeType">
  <xs:restriction base="xs:token">
   <xs:maxLength value="10"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>

You realise that with this level of complexity, you’ll be messing around with SAX for a long time, and you could also make a few mistakes. There must be a better way right? After-all XML has been around for some time, so there must be a few frameworks around that could be useful. After a bit more Googling you come across JAXB and realise that there are...

JAXB or the Java Architecture for XML Binding uses the JAXB Binding Compiler xjc to covert an XML schema into a bunch of related Java classes. These define the types required to access the XML elements, attributes and other content in a type-safe way. This blog isn’t a tutorial covering the ins and outs of JAXB, that can be found here, from Oracle and here, from the Glassfish, Project Metro, JAXB pages, which also includes a tutorial; except to say the the key idea for parsing, or unmarshalling, XML is that you compile your Java classes using the xjc compiler and then use those classes inconjunction with the JAXB API to grab hold of XML elements and attributes.

In using any XML schema to Java class compiler, the neatest approach is to put all your schemas and their compiled classes into a separate JAR file. You can mix them in with your application’s source code, but that usually clouds the code base making maintenance more difficult. In creating a JAXB JAR file, you may come up with a POM file that looks something like this:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
 <modelVersion>4.0.0</modelVersion>
 <groupId>com.captaindebug</groupId>
 <artifactId>xml-tips-jaxb</artifactId>
 <packaging>jar</packaging>
 <version>1.0-SNAPSHOT</version>
 <name>Jaxb for Pete's Perfect Pizza</name>
 <dependencies>
  <dependency>
   <groupId>javax.xml.bind</groupId>
   <artifactId>jaxb-api</artifactId>
   <version>2.0</version>
  </dependency>
 </dependencies>
 <build>
  <plugins>
   <plugin>
    <groupId>com.sun.tools.xjc.maven2</groupId>
    <artifactId>maven-jaxb-plugin</artifactId>
    <executions>
     <execution>
      <goals>
       <goal>generate</goal>
      </goals>
     </execution>
    </executions>
    <configuration>
     <generatePackage>com.captaindebug.jaxb</generatePackage>
     <includeSchemas>
      <includeSchema>**/*.xsd</includeSchema>
     </includeSchemas>
     <strict>true</strict>
     <verbose>true</verbose>
    </configuration>
   </plugin>
   <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>2.3.2</version>
    <configuration>
     <source>1.6</source>
     <target>1.6</target>
    </configuration>
   </plugin>
  </plugins>
 </build>
</project>

...which is very straight forward. So, getting back to Pete’s Perfect Pizza, you’ve created your JAXB JAR file and all that’s left to do is to explore how it works, as demonstrated in the JUnit tests below:

  @Test
 
public void testLoadPizzaOrderXml() throws JAXBException, IOException {

   
InputStream is = loadResource("/pizza-order1.xml");

   
// Load the file
   
JAXBContext context = JAXBContext.newInstance(PizzaOrder.class);
    Unmarshaller um = context.createUnmarshaller
();
    PizzaOrder pizzaOrder =
(PizzaOrder) um.unmarshal(is);

    String orderId = pizzaOrder.getOrderID
();
    assertEquals
("123w3454r5", orderId);

   
// Check the customer details...
   
CustomerType customerType = pizzaOrder.getCustomer();

    NameType nameType = customerType.getName
();
    String firstName = nameType.getFirstName
();
    assertEquals
("John", firstName);
    String lastName = nameType.getLastName
();
    assertEquals
("Miggins", lastName);

    AddressType address = customerType.getAddress
();
    assertEquals
(new BigInteger("15"), address.getHouseNumber());
    assertEquals
("Credability Street", address.getStreet());
    assertEquals
("Any Town", address.getTown());
    assertEquals
("Any Where", address.getArea());
    assertEquals
("AW12 3WS", address.getPostCode());

    Pizzas pizzas = pizzaOrder.getPizzas
();
    List<PizzaType> pizzasOrdered = pizzas.getPizza
();

    assertEquals
(3, pizzasOrdered.size());

   
// Check the pizza order...
   
for (PizzaType pizza : pizzasOrdered) {

     
PizzaNameType pizzaName = pizza.getName();
     
if ((PizzaNameType.CAPRICCIOSA == pizzaName) || (PizzaNameType.MARINARA == pizzaName)) {
       
assertEquals(BaseType.THICK, pizza.getBase());
        assertEquals
(new BigInteger("1"), pizza.getQuantity());
     
} else if (PizzaNameType.PROSCIUTTO_E_FUNGHI == pizzaName) {
       
assertEquals(BaseType.THIN, pizza.getBase());
        assertEquals
(new BigInteger("2"), pizza.getQuantity());
     
} else {
       
fail("Whoops, can't find pizza type");
     
}
    }

  }

 
private InputStream loadResource(String filename) throws IOException {

   
InputStream is = getClass().getResourceAsStream(filename);
   
if (is == null) {
     
throw new IOException("Can't find the file: " + filename);
   
}

   
return is;
 
}

The code above may look long and complex, but it really only comprises of three steps:
  1. turn the test file into an input stream. This can then be processed by the JAXB API.
  2. create a JAXB context and its associate unmarshaller. This is then used to read the XML and convert it into the element and, if available, attribute objects
  3. Use the returned list of classes to test the contents are what we'd expected them to be (this is by far the largest step).
Once you’re happy with the largely boilerplate usage of JAXB, you add it into your kitchen XML parser code and distribute it around the world to Pete’s many pizza kitchens.

One of the strong points of using a framework like JAXB is that if there’s ever a change to the schema, then all that’s required to incorporate those changes is to recompile using XJC and then fix up your client code accordingly. This may seem a bit of a headache, but it’s a much smaller headache than trying re-work a SAX parser. On the downside, JAXB has been criticised for being slow, but I’ve never had too many problems. It will theoretically use more memory that SAX - this may or may not be true. It does build sets of classes, but then again so do some SAX ContentHandler derived classes.

It should be remembered that JAXB isn’t the only tool that takes this approach, and to demonstrate this, my next blog tells the same story, but using XMLBeans...


Other blogs in this series...
  1. XML is not a String
  2. What about Sax
  3. JAXB
  4. XMLBeans

The source code is available from GitHub at:

git://github.com/roghughe/captaindebug.git


No comments: