Monday, December 04, 2006

How to write an XSD

Web services are all about communicating with XML messages. The great benefit of XML is that it is a platform-neutral technology, so these messages shouldn't depend on a particular web service implementation technology (such as .NET or Java). Unfortunately, many of the implementation toolkits (especially ASP.NET) encourage you to think of web services as Remote Procedure Calls (RPC), which can inject unwanted dependencies on the toolkit and often leads to sub-optimal 'chatty' interfaces. That's why it's best to define your messages using XSD rather than getting your implementation toolkit (such as Visual Studio) to spit out type definitions based on your technology-specific types (such as .NET classes).

The question then becomes how to write effective XSDs. In this document I'd like to give a few pointers. The example for this demonstration is the following XML document:

<Order 
  xmlns="uri:ace-ina.com:schemas:order" 
  xmlns:prd="uri:ace-ina.com:schemas:product" 
  Id="0">
	<OrderLines>
		<OrderLine Id="0">
			<Product Id="0">
				<prd:Name>Bread</prd:Name>
				<prd:Price>0.79</prd:Price>
			</Product>
			<Quantity>2</Quantity>
			<Total>1.58</Total>
		</OrderLine>
		<OrderLine Id="1">
			<Product Id="2">
				<prd:Name>Milk</prd:Name>
				<prd:Price>0.48</prd:Price>
			</Product>
			<Quantity>1</Quantity>
			<Total>0.48</Total>
		</OrderLine>
	</OrderLines>
	<Total>2.06</Total>
</Order>

It's a simple order with an id and a collection of order lines. Each order line defines a product and gives the quantity and total. The namespace of the order is 'uri:ace-ina.com:schemas:order'. A bit of added complication is introduced by defining the product in a separate namespace: 'uri:ace-ina.com:schemas:product'.

Now let's create an XSD that defines the schema for this XML document. The XSD meta-schema is defined in the namespace: 'http://www.w3.org/2001/XMLSchema', and an XSD's root element is always 'schema', so let's start with that:

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema'>
</xs:schema>

We also want to define the namespace of the target document, which in this case is 'uri:ace-ina.com:schemas:order'. We declare it as the default namespace and reference it in the targetNamespace attribute. Globally declared elements always belong to the target namespace; to require that locally declared elements (those nested inside a type) are namespace-qualified as well, we need to set elementFormDefault to 'qualified'.

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
</xs:schema>
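
To see what 'qualified' buys us, here is a rough sketch of how an instance document would have to look if elementFormDefault were left at its default of 'unqualified'. It assumes Order and Total are declared as global elements and OrderLines as a local one; the 'ord' prefix is an arbitrary choice for illustration:

```xml
<!-- Sketch only: instance shape under elementFormDefault='unqualified' -->
<ord:Order xmlns:ord="uri:ace-ina.com:schemas:order" Id="0">
	<!-- Locally declared element: must be in no namespace -->
	<OrderLines/>
	<!-- Globally declared element: always in the target namespace -->
	<ord:Total>0</ord:Total>
</ord:Order>
```

Mixing qualified and unqualified elements like this is easy to get wrong, which is why 'qualified' is usually the more comfortable setting.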

Next we should define our types. Think of types in your XSD as entities, in the same way as you would think of classes in a .NET application or tables in a database. In the order document there are two primary types: 'Order' and 'OrderLine'. 'Product' belongs to a separate namespace and XSD file, and we'll look at that later. Types that contain attributes and/or elements are known as 'complex types' and are defined in a 'complexType' element. I like to name complex types '<name of target element>Type'. So let's add two complex types to our XSD, OrderType and OrderLineType:

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<xs:complexType name="OrderType">
	</xs:complexType>
	<xs:complexType name="OrderLineType">
	</xs:complexType>
</xs:schema>

We can add attributes to our types using the 'attribute' element. Both OrderType and OrderLineType have id attributes which we want to be required integer types:

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<xs:complexType name="OrderType">
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
</xs:schema>
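
Attributes are optional unless use is set to 'required'. As a small sketch, an optional attribute with a default value could be declared like this. The 'Currency' attribute and its default of 'GBP' are made up for illustration and are not part of the Order example; the fragment is assumed to sit inside the xs:schema element shown above:

```xml
<xs:complexType name="OrderType">
	<!-- Required: the instance document must supply Id -->
	<xs:attribute name="Id" type="xs:integer" use="required" />
	<!-- Hypothetical optional attribute: 'GBP' is assumed when it is absent -->
	<xs:attribute name="Currency" type="xs:string" default="GBP" />
</xs:complexType>
```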

Child elements are defined inside a 'sequence', 'choice' or 'all' group. 'sequence' requires its elements to appear in the given order in the target document, 'choice' allows exactly one of its child elements to appear, and 'all' allows its child elements to appear in any order, each at most once. Repeating elements are not allowed in an 'all' group (in XSD 1.0). minOccurs and maxOccurs are used to define optional and repeating elements. In this case we want to define 'OrderLines' and 'Total' for 'OrderType', and 'Product', 'Quantity' and 'Total' for 'OrderLineType'. They are all required, non-repeating elements, so we don't need to specify minOccurs and maxOccurs (the default for both is '1'), and we'll use 'sequence' for all of them. We also need to define the type of each element: the Total of both OrderType and OrderLineType is defined as 'double' and Quantity as 'integer'. We'll leave the types of 'OrderLines' and 'Product' empty until later (the schema won't validate until they are filled in):

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<xs:complexType name="OrderType">
		<xs:sequence>
			<xs:element name="OrderLines" type=""/>
			<xs:element name="Total" type="xs:double"/>
		</xs:sequence>
		<!-- attribute declarations must come after the sequence -->
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:sequence>
			<xs:element name="Product" type=""/>
			<xs:element name="Quantity" type="xs:integer"/>
			<xs:element name="Total" type="xs:double"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
</xs:schema>
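
The Order schema only needs 'sequence', so here is a minimal sketch of the other two groups. PaymentType, AddressType and their child elements are hypothetical names invented for this illustration; the fragment is assumed to sit inside an xs:schema element like the one above:

```xml
<!-- Hypothetical types, not part of the Order schema -->
<xs:complexType name="PaymentType">
	<!-- choice: exactly one of Card or Invoice may appear -->
	<xs:choice>
		<xs:element name="Card" type="xs:string"/>
		<xs:element name="Invoice" type="xs:string"/>
	</xs:choice>
</xs:complexType>
<xs:complexType name="AddressType">
	<!-- all: children may appear in any order, each at most once -->
	<xs:all>
		<xs:element name="Street" type="xs:string"/>
		<xs:element name="City" type="xs:string"/>
		<!-- minOccurs="0" makes this child optional -->
		<xs:element name="Postcode" type="xs:string" minOccurs="0"/>
	</xs:all>
</xs:complexType>
```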

Because Total has the same name and type in both OrderType and OrderLineType, we can factor out a global element called Total and reference it from inside OrderType and OrderLineType:

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<xs:element name="Total" type="xs:double" />
	<xs:complexType name="OrderType">
		<xs:sequence>
			<xs:element name="OrderLines" type=""/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:sequence>
			<xs:element name="Product" type=""/>
			<xs:element name="Quantity" type="xs:integer"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
</xs:schema>

Now let's consider the OrderLines element in OrderType. In the target document, OrderLines contains a collection of OrderLine elements, so we need to create a collection type for OrderLines. We can create a new complex type 'OrderLinesType' with a single repeating element 'OrderLine'. An element repeats when maxOccurs is greater than '1'; setting minOccurs to '0' and maxOccurs to 'unbounded' allows any number of OrderLine elements, including none. We can then set the type of OrderLines to 'OrderLinesType'.

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<xs:element name="Total" type="xs:double" />
	<xs:complexType name="OrderType">
		<xs:sequence>
			<xs:element name="OrderLines" type="OrderLinesType"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:sequence>
			<xs:element name="Product" type=""/>
			<xs:element name="Quantity" type="xs:integer"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLinesType">
		<xs:sequence>
			<xs:element name="OrderLine" type="OrderLineType" minOccurs="0" maxOccurs="unbounded"/>
		</xs:sequence>
	</xs:complexType>
</xs:schema>
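
If an order with no lines should be rejected, one possible variation (a sketch, not what the rest of this walkthrough uses) is to drop minOccurs so that it falls back to its default of '1':

```xml
<xs:complexType name="OrderLinesType">
	<xs:sequence>
		<!-- minOccurs defaults to 1, so an empty OrderLines element is invalid -->
		<xs:element name="OrderLine" type="OrderLineType" maxOccurs="unbounded"/>
	</xs:sequence>
</xs:complexType>
```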

We're still missing the product type. This is defined in a separate namespace, 'uri:ace-ina.com:schemas:product', in a separate XSD document:

<xs:schema 
	xmlns:xs="http://www.w3.org/2001/XMLSchema" 
	xmlns="uri:ace-ina.com:schemas:product" 
	targetNamespace="uri:ace-ina.com:schemas:product" 
	elementFormDefault="qualified">
	<xs:complexType name="ProductType">
		<xs:sequence>
			<xs:element name="Name" type="xs:string"/>
			<xs:element name="Price" type="xs:double"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required"/>
	</xs:complexType>
</xs:schema>

To reference a schema from another schema with a different namespace we use 'import'. To reference another XSD with the same namespace, use 'include'. Here the namespace is different so we need to add an 'import' element to our Order XSD. We also need to define the product namespace and give it a prefix, since we already have a default namespace (uri:ace-ina.com:schemas:order). We'll use 'prd' here. We can now define the Product element's type as 'prd:ProductType':

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	xmlns:prd='uri:ace-ina.com:schemas:product'
	elementFormDefault='qualified'>
	<xs:import namespace="uri:ace-ina.com:schemas:product" schemaLocation="Product.xsd" />
	<xs:element name="Total" type="xs:double" />
	<xs:complexType name="OrderType">
		<xs:sequence>
			<xs:element name="OrderLines" type="OrderLinesType"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:sequence>
			<xs:element name="Product" type="prd:ProductType"/>
			<xs:element name="Quantity" type="xs:integer"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLinesType">
		<xs:sequence>
			<xs:element name="OrderLine" type="OrderLineType" minOccurs="0" maxOccurs="unbounded"/>
		</xs:sequence>
	</xs:complexType>
</xs:schema>
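
For comparison, 'include' might look like the sketch below. 'OrderCommon.xsd' is a hypothetical file, assumed to declare the same targetNamespace as the Order schema:

```xml
<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace='uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<!-- Same namespace, so include takes only a schemaLocation (no namespace attribute) -->
	<xs:include schemaLocation="OrderCommon.xsd" />
	<!-- Types from OrderCommon.xsd can now be referenced without a prefix -->
</xs:schema>
```

Because the included definitions share the target namespace, no extra prefix is needed, unlike the 'prd' prefix required for the imported product schema.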

The last remaining task is to define our top level global element 'Order' with type 'OrderType':

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	xmlns:prd='uri:ace-ina.com:schemas:product'
	elementFormDefault='qualified'>
	<xs:import namespace="uri:ace-ina.com:schemas:product" schemaLocation="Product.xsd" />
	<xs:element name="Order" type="OrderType" />
	<xs:element name="Total" type="xs:double" />
	<xs:complexType name="OrderType">
		<xs:sequence>
			<xs:element name="OrderLines" type="OrderLinesType"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:sequence>
			<xs:element name="Product" type="prd:ProductType"/>
			<xs:element name="Quantity" type="xs:integer"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLinesType">
		<xs:sequence>
			<xs:element name="OrderLine" type="OrderLineType" minOccurs="0" maxOccurs="unbounded"/>
		</xs:sequence>
	</xs:complexType>
</xs:schema>
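
A quick way to exercise the finished schema is to hand a validating parser a document that breaks a few rules on purpose. The sketch below assumes the schema has been saved as 'Order.xsd' next to the document (the file name is an assumption, and xsi:schemaLocation is only a hint that a validator may ignore):

```xml
<!-- Deliberately invalid: a validator using the Order schema should reject it -->
<Order 
  xmlns="uri:ace-ina.com:schemas:order" 
  xmlns:prd="uri:ace-ina.com:schemas:product" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xsi:schemaLocation="uri:ace-ina.com:schemas:order Order.xsd">
	<!-- Error: the required Id attribute is missing from Order -->
	<OrderLines>
		<OrderLine Id="0">
			<Product Id="0">
				<prd:Name>Bread</prd:Name>
				<prd:Price>0.79</prd:Price>
			</Product>
			<!-- Error: 'two' is not a valid xs:integer -->
			<Quantity>two</Quantity>
			<Total>1.58</Total>
		</OrderLine>
	</OrderLines>
	<!-- Error: the Total element required by OrderType is absent -->
</Order>
```

The example document at the top of this post, by contrast, should validate cleanly against the same schema.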

Defining your web service message types in terms of XSD decouples your web service from a particular implementation technology and aids interoperability. Also, understanding the XSD syntax allows you to read and understand WSDL, create your own client proxies and control the serialization between your implementation types and the XSD.

For a much more complete and extensive discussion on writing XSD schemas, see the World Wide Web Consortium's XML Schema Part 0: Primer Second Edition.

9 comments:

Anonymous said...

Thank you, really useful! I have to create a rather simple WS but i had no ideaa where to start from.. Let's just pray that xsd.exe will generate something useful from my xsd file...

Mike Hadlow said...

Thanks anonymous, I'm glad it was useful. If you don't like what xsd.exe produces, have a look at my post Writing your own XSD.exe. Happy coding!

Unknown said...

Mike,
Thanks for this. Your XSD sample really helped me to take the plethora of information about XSD from W3 and filter it into a useful example.
Do you have any suggestions for how you would test the use of the XSD?

Gregory

Mike Hadlow said...

Thanks Gregory.

Testing? Well I guess the obvious thing would be to create some valid and invalid XML documents and run them through the validating reader.

sandy said...

Great thanks to mike......
really it's helped me a lot i was searching for XSD information.
i am very new to XSD's so it is helped me to explore the knowledge on XSD.

Anonymous said...

For so long, I even avoided learning XSD since it seemed too difficult. Your tutorial helped and I wrote my first XSD and that too very big.

Anonymous said...

Mike, thanks for your tutorial. Its a great stuff to start with.

Pankaj said...

Hi Mike,

I am not sure weather my question is silly or not but i am seriously looking for an answer to it.

my que : i want to write a.xsd in such a way that i am able to use the attributes/elements in b.xsd as tags/tag-parameters in a.xsd.

something like writing a meta xsd

is that possible , if yes then how ?

Anonymous said...

Very helpful, Thanks.