Tree/Customization guide
From Code Synthesis Wiki
Revision as of 12:55, 28 September 2006 by Boris
Note: this guide is a work in progress
Introduction
XSD provides you with mechanisms to customize the generated type system for the C++/Tree mapping. Common customization examples include:
- using a different type for one of the built-in types (e.g., boost::gregorian::date from the Boost libraries for xsd:date)
- adding a member function or data member to one or more generated types (e.g., add a print() function)
- adding virtual functions to the base type of a type hierarchy and implementing them in the derived types (e.g., when using xsi:type dynamic typing and/or substitution groups)
XSD provides two command-line options, --custom-type
and --custom-type-regex
, that allow you to specify which types should be customized and how these types will be customized. The format for the --custom-type
option is as follows:
--custom-type name[=type[/base]]
The name component specifies the name of a type as defined in XML Schema. The name is unqualified since the target namespace of the schema being compiled is always assumed. The optional type component is a C++ type name that should be used instead of the generated type. If type is empty or not specified then the same name as the XML Schema type name is assumed. Finally, if the optional base component is provided then the standard mapping for the type is still generated but with this name. A few examples should make all this clear. Suppose we have the following schema:
<schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.example.com/hello"> <complexType name="names"> <sequence> <element name="name" type="string"/> </sequence> </complexType> </schema>
Compiling this schema without specifying any options will result in the following C++ code:
namespace hello { class names { ... }; }
Now we compile the above schema with the following option:
--custom-type names
The generated code will look like this:
namespace hello { class names; }
The compiler simply generated a forward declaration for names
expecting you to provide the implementation. Let's say we want to use the my_names
type instead. Then we can use the following option:
--custom-type names=my_names
The generated code will look like this:
namespace hello { typedef my_names names; }
What if we wanted to use class template names
which is defined in namespace templates
? Then we could use the following option:
--custom-type names=::templates::names<char>
The resulting code would look like this:
namespace hello { typedef ::templates::names<char> names; }
Now what if all we want to do is add a simple function to the names
type? It would be too much work if we had to implement all the code that gets generated ourselves. Fortunately we don't have to. We can ask the compiler to generate the standard mapping with a different name and then inherit our custom type from the generated one:
--custom-type names=/names_base
This will result in the following C++ code:
namespace hello { class names_base; class names; class names_base { ... }; }
In our implementation of the names
class we can use names_base
as the base. Here is another example:
--custom-type names=::templates::names<names_base>/names_base
This results in the following C++ code being generated:
namespace hello { class names_base; typedef ::templates::names<names_base> names; class names_base { ... }; }
While this example may seem a bit far-fetched, this technique is actually used in some special cases as will be shown later.
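The option format described above can be summarized with a small parser. This is only a sketch of the name[=type[/base]] rules as stated in this guide; the custom_type_spec structure and the parse_custom_type function are hypothetical names and are not part of XSD.

```cpp
#include <string>

// Hypothetical helper, not part of XSD: holds the three components of
// a --custom-type value written as name[=type[/base]].
struct custom_type_spec
{
  std::string name; // XML Schema type name (unqualified)
  std::string type; // C++ type to use instead; defaults to name
  std::string base; // name for the generated mapping; empty if not requested
};

inline custom_type_spec
parse_custom_type (const std::string& value)
{
  custom_type_spec s;

  // Everything before the first '=' is the schema type name.
  std::string::size_type eq (value.find ('='));
  s.name = value.substr (0, eq);

  if (eq != std::string::npos)
  {
    // The remainder is type[/base].
    std::string rest (value.substr (eq + 1));
    std::string::size_type slash (rest.find ('/'));

    s.type = rest.substr (0, slash);

    if (slash != std::string::npos)
      s.base = rest.substr (slash + 1);
  }

  // An empty or missing type means "same as the schema type name".
  if (s.type.empty ())
    s.type = s.name;

  return s;
}
```

For example, parsing names=/names_base yields the type names and the base names_base, which corresponds to the forward-declared names class plus the generated names_base class shown earlier.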
The --custom-type-regex
option is similar to --custom-type
except it allows you to use regular expressions
to match several names at once. For more information on this option refer to the XSD command line interface documentation (man pages).
Customizing the generated type - the simple case
Let's now look at the complete set of steps necessary to do a simple customization. All code in this section is taken from the contacts
example which can be found in the examples/cxx/tree/custom/contacts
directory of the XSD distribution. Our schema looks like this:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:cts="http://www.codesynthesis.com/contacts" targetNamespace="http://www.codesynthesis.com/contacts"> <xsd:complexType name="contact"> <xsd:sequence> <xsd:element name="name" type="xsd:string"/> <xsd:element name="email" type="xsd:string"/> <xsd:element name="phone" type="xsd:string"/> </xsd:sequence> </xsd:complexType> <xsd:complexType name="catalog"> <xsd:sequence> <xsd:element name="contact" type="cts:contact" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> <xsd:element name="catalog" type="cts:catalog"/> </xsd:schema>
We would like to add the print()
function to the generated contact
type. The first step is to compile our schema. We will use the following XSD options (explained below):
--custom-type contact=/contact_base --hxx-epilogue '#include "contacts-custom.hxx"'
The first option tells the compiler that we are providing our own implementation for the contact
type and also instructs it to generate the standard mapping with the name contact_base
. We will use the generated contact_base
class as a base for our implementation of contact
. The second option instructs the compiler to add #include "contacts-custom.hxx"
at the end of the generated header file. The contacts-custom.hxx
file is where we will define our own contact
class:
// contacts-custom.hxx

#include <iosfwd> // std::ostream

namespace contacts
{
  class contact: public contact_base
  {
    // The following constructor signatures are copied from
    // contact_base except for the copy constructor and the
    // _clone function where we had to change the type from
    // contact_base to contact.
    //
  public:
    contact (const name::type&,
             const email::type&,
             const phone::type&);

    contact (const xercesc::DOMElement&,
             xml_schema::flags = 0,
             xml_schema::type* = 0);

    contact (const contact&,
             xml_schema::flags = 0,
             xml_schema::type* = 0);

    virtual contact*
    _clone (xml_schema::flags = 0,
            xml_schema::type* = 0) const;

    // Our customizations.
    //
  public:
    void
    print (std::ostream&) const;
  };
}
The implementation of our contact
class is placed into the contacts-custom.cxx
file:
// contacts-custom.cxx

#include <ostream>

#include "contacts.hxx"

namespace contacts
{
  contact::
  contact (const name::type& n,
           const email::type& e,
           const phone::type& p)
      : contact_base (n, e, p)
  {
  }

  contact::
  contact (const xercesc::DOMElement& e,
           xml_schema::flags f,
           xml_schema::type* container)
      : contact_base (e, f, container)
  {
  }

  contact::
  contact (const contact& c,
           xml_schema::flags f,
           xml_schema::type* container)
      : contact_base (c, f, container)
  {
  }

  contact* contact::
  _clone (xml_schema::flags f, xml_schema::type* container) const
  {
    return new contact (*this, f, container);
  }

  void contact::
  print (std::ostream& os) const
  {
    os << name () << " e| " << email () << " t| " << phone () << std::endl;
  }
}
There are two things worth noting about this implementation. First, we include contacts.hxx instead of contacts-custom.hxx. This is important since contacts-custom.hxx is not self-sufficient; for example, it references but does not define contact_base. Second, all constructors are implemented by forwarding to the corresponding contact_base constructors.
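The pattern used by the custom contact class (forward construction to the generated base, override _clone with a covariant return type, then add new members) can be tried out without XSD. The sketch below is a simplification: contact_base here is a hand-written stand-in with only two fields and no DOM constructor, and print_clone is a hypothetical helper added for illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for the generated class. The real contact_base is produced by
// the XSD compiler and has more constructors and members.
class contact_base
{
public:
  contact_base (const std::string& n, const std::string& e)
      : name_ (n), email_ (e) {}

  virtual ~contact_base () {}

  virtual contact_base* _clone () const { return new contact_base (*this); }

  const std::string& name () const { return name_; }
  const std::string& email () const { return email_; }

private:
  std::string name_, email_;
};

// The custom type: constructors forward to the base, _clone is overridden
// so that copies keep the custom type, and print() is the customization.
class contact: public contact_base
{
public:
  contact (const std::string& n, const std::string& e)
      : contact_base (n, e) {}

  virtual contact* _clone () const { return new contact (*this); }

  void print (std::ostream& os) const
  {
    os << name () << " e| " << email ();
  }
};

// Hypothetical helper: clones through a base reference and prints the
// copy, which only works if the copy is still a contact.
inline std::string
print_clone (const contact_base& b)
{
  contact_base* copy (b._clone ());
  std::ostringstream os;

  if (const contact* c = dynamic_cast<const contact*> (copy))
    c->print (os);

  delete copy;
  return os.str ();
}
```

If contact did not override _clone, the copy made inside print_clone would be a plain contact_base and the customization would be lost, which is why the guide copies the _clone signature and changes its return type.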
And that's pretty much it. The only piece left is some client code that uses our customizations:
// driver.cxx

#include <memory>   // std::auto_ptr
#include <iostream>

#include "contacts.hxx"

using std::cerr;
using std::endl;

int
main (int argc, char* argv[])
{
  using namespace contacts;

  std::auto_ptr<catalog> c (catalog_ (argv[1]));

  for (catalog::contact::const_iterator i (c->contact ().begin ());
       i != c->contact ().end (); ++i)
  {
    i->print (cerr);
  }
}
At this stage you may be wondering why all this works. After all, the contact class is used in the generated code even though it hasn't been defined; its definition only becomes available at the end of the generated header file (remember the --hxx-epilogue option). There are two reasons why this works:
- The compiler generates forward declarations for the types that we customize.
- The C++/Tree mapping is designed in such a way that a forward declaration is sufficient for all uses of a type in a header file except for inheritance.
As a result, the customization technique outlined in this section works as long as you are not customizing a type that is also a base for some other type in the same schema file. The alternative approach that works even in the case of inheritance is described in the next section.
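The difference between uses that tolerate an incomplete type and inheritance, which does not, can be seen in a few lines of plain C++. The sketch below is not generated code: catalog, its first() accessors, and describe() are hypothetical and much simpler than what the C++/Tree mapping produces.

```cpp
#include <string>

// ---- what the generated header can rely on ----

class contact; // forward declaration only, as emitted for a customized type

class catalog
{
public:
  catalog (): first_ (0) {}

  // Pointers, references, and function declarations are all fine with
  // an incomplete type.
  void first (contact* c) { first_ = c; }
  const contact* first () const { return first_; }

  std::string describe () const; // body needs the definition; see below

private:
  contact* first_;
};

// This, however, would not compile at this point because a base class
// must be a complete type:
//
//   class vip_contact: public contact {};

// ---- what the include at the end of the header provides ----

class contact
{
public:
  explicit contact (const std::string& n): name_ (n) {}
  const std::string& name () const { return name_; }

private:
  std::string name_;
};

// Function bodies that access members of contact come after its definition.
inline std::string catalog::
describe () const
{
  if (first_ == 0)
    return "<empty>";

  return first_->name ();
}
```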
Customizing the generated type - the complex case
This section presents the complex case where the types being customized are inherited from in the same schema. All code in this section is taken from the taxonomy
example which can be found in the examples/cxx/tree/custom/taxonomy
directory of the XSD distribution. The schema for this example looks as follows:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:ppl="http://www.codesynthesis.com/people" targetNamespace="http://www.codesynthesis.com/people"> <xsd:complexType name="person"> <xsd:sequence> <xsd:element name="name" type="xsd:string"/> </xsd:sequence> </xsd:complexType> <xsd:complexType name="superman"> <xsd:complexContent> <xsd:extension base="ppl:person"> <xsd:attribute name="can-fly" type="xsd:boolean" use="required"/> </xsd:extension> </xsd:complexContent> </xsd:complexType> <xsd:complexType name="batman"> <xsd:complexContent> <xsd:extension base="ppl:superman"> <xsd:attribute name="wing-span" type="xsd:unsignedInt" use="required"/> </xsd:extension> </xsd:complexContent> </xsd:complexType> <xsd:complexType name="catalog"> <xsd:sequence> <xsd:element name="person" type="ppl:person" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> <xsd:element name="catalog" type="ppl:catalog"/> </xsd:schema>
We would like to add the virtual print()
function to the base of the hierarchy (the person
type) and override it in each derived type (superman
and batman
). The first step is to compile our schema. We will use the following XSD options (explained below):
--generate-polymorphic --custom-type "person=person_impl<person_base>/person_base" --custom-type "superman=superman_impl<superman_base>/superman_base" --custom-type "batman=batman_impl<batman_base>/batman_base" --generate-forward --fwd-prologue '#include "people-custom-fwd.hxx"' --hxx-prologue '#include "people-custom.hxx"'
We use the --generate-polymorphic
option because our instance documents will be using the xsi:type
-based dynamic typing.
The set of the --custom-type
options should look familiar by now. They tell the compiler that we are customizing the person
, superman
, and batman
types, that the *_impl<*_base>
class template instantiations should be used instead, and that the original mapping should still be generated but renamed to *_base
.
The --generate-forward
option triggers generation of the people-fwd.hxx
forward declaration file. The content of the people-fwd.hxx
file will look along these lines:
// people-fwd.hxx - automatically generated

namespace xml_schema
{
  ... // Declarations for the XML Schema built-in types.
}

namespace people
{
  class person_base;
  typedef person_impl<person_base> person;

  class superman_base;
  typedef superman_impl<superman_base> superman;

  class batman_base;
  typedef batman_impl<batman_base> batman;

  class catalog;
}
This header file won't compile unless we supply the definitions - or at least forward declarations - of the person_impl
, superman_impl
, and batman_impl
class templates. The --fwd-prologue
option does exactly that:
it adds the #include "people-custom-fwd.hxx"
line at the beginning of the generated people-fwd.hxx
. The content of the people-custom-fwd.hxx
file is straightforward:
// people-custom-fwd.hxx

namespace people
{
  template <typename base>
  class person_impl;

  template <typename base>
  class superman_impl;

  template <typename base>
  class batman_impl;
}
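To see why declarations alone are enough for the forward declaration file, here is a self-contained sketch of the same arrangement. The person_base and person_impl definitions and the describe() function are simplified stand-ins written for this example, not the generated code.

```cpp
#include <string>

// From the hand-written *-custom-fwd.hxx: a declaration only.
template <typename base>
class person_impl;

// From the generated forward declaration file.
class person_base;
typedef person_impl<person_base> person; // OK: nothing is instantiated yet

// Other declarations may already mention the typedef.
std::string describe (const person&);

// Later headers supply the definitions.
class person_base
{
public:
  explicit person_base (const std::string& n): name_ (n) {}
  const std::string& name () const { return name_; }

private:
  std::string name_;
};

template <typename base>
class person_impl: public base
{
public:
  explicit person_impl (const std::string& n): base (n) {}
};

// By now person is a complete type, so its members can be used.
std::string
describe (const person& p)
{
  return "person " + p.name ();
}
```

Naming a template specialization in a typedef does not require the template or its argument to be defined, so the typedef compiles as long as the template has at least been declared.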
Unlike the simple case, when the types being customized are inherited from in the same schema file, we cannot include the definitions of the custom types at the end of the generated header file. That is why we are using the --hxx-prologue option instead of --hxx-epilogue to include people-custom.hxx. Since the custom type definitions are included before the generated base types (person_base, superman_base, and batman_base) are defined, we have to use class templates instead of the plain classes we used in the previous section. The content of people-custom.hxx is presented below:
// people-custom.hxx

#include <iosfwd> // std::ostream

#include "people-fwd.hxx"

namespace people
{
  //
  //
  template <typename base>
  class person_impl: public base
  {
  public:
    person_impl (const xml_schema::string& name);

    person_impl (const xercesc::DOMElement&,
                 xml_schema::flags = 0,
                 xml_schema::type* = 0);

    person_impl (const person_impl&,
                 xml_schema::flags = 0,
                 xml_schema::type* = 0);

    virtual person_impl*
    _clone (xml_schema::flags = 0,
            xml_schema::type* = 0) const;

  public:
    virtual void
    print (std::ostream&) const;
  };

  template <typename base>
  class superman_impl: public base
  {
  public:
    superman_impl (const xml_schema::string& name, bool can_fly);

    superman_impl (const xercesc::DOMElement&,
                   xml_schema::flags = 0,
                   xml_schema::type* = 0);

    superman_impl (const superman_impl&,
                   xml_schema::flags = 0,
                   xml_schema::type* = 0);

    virtual superman_impl*
    _clone (xml_schema::flags = 0,
            xml_schema::type* = 0) const;

  public:
    virtual void
    print (std::ostream&) const;
  };

  template <typename base>
  class batman_impl: public base
  {
  public:
    batman_impl (const xml_schema::string& name,
                 bool can_fly,
                 unsigned int wing_span);

    batman_impl (const xercesc::DOMElement&,
                 xml_schema::flags = 0,
                 xml_schema::type* = 0);

    batman_impl (const batman_impl&,
                 xml_schema::flags = 0,
                 xml_schema::type* = 0);

    virtual batman_impl*
    _clone (xml_schema::flags = 0,
            xml_schema::type* = 0) const;

  public:
    virtual void
    print (std::ostream&) const;
  };
}
Definitions of custom types look very similar to the one we saw in the previous section. The only two differences are the use of templates with the base class as a parameter and the inclusion of the people-fwd.hxx
file. The latter allows us to refer to the generated as well as XML Schema built-in types, e.g., xml_schema::string
.
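The ordering constraint is easier to follow in a reduced form. In the sketch below the custom templates are defined first, as they would be with --hxx-prologue, and hand-written stand-ins for the generated classes follow. It assumes that the generated superman_base derives from the customized person, and it leaves out the DOM constructors, _clone, and batman; to_string is a hypothetical helper.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// 1. Custom templates come first. They can be defined before their
//    future base classes exist because the base is only a template
//    parameter at this point.
template <typename base>
class person_impl: public base
{
public:
  explicit person_impl (const std::string& n): base (n) {}
  virtual ~person_impl () {}

  virtual void print (std::ostream& os) const { os << this->name (); }
};

template <typename base>
class superman_impl: public base
{
public:
  superman_impl (const std::string& n, bool f): base (n, f) {}

  virtual void print (std::ostream& os) const
  {
    os << (this->can_fly () ? "Flying superman " : "Superman ")
       << this->name ();
  }
};

// 2. Stand-ins for the generated code. Custom and generated types
//    alternate in the resulting inheritance chain:
//    person_base <- person <- superman_base <- superman.
class person_base
{
public:
  explicit person_base (const std::string& n): name_ (n) {}
  const std::string& name () const { return name_; }

private:
  std::string name_;
};

typedef person_impl<person_base> person;

class superman_base: public person
{
public:
  superman_base (const std::string& n, bool f): person (n), can_fly_ (f) {}
  bool can_fly () const { return can_fly_; }

private:
  bool can_fly_;
};

typedef superman_impl<superman_base> superman;

// Hypothetical helper: calls print() through a reference to the root of
// the customized hierarchy.
inline std::string
to_string (const person& p)
{
  std::ostringstream os;
  p.print (os);
  return os.str ();
}
```

A plain class named person could not be written before person_base is defined, since a base class must be complete. A class template avoids the problem because its base is only resolved when the template is instantiated.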
The implementation of our class templates is placed into the people-custom.cxx file:
// people-custom.cxx

#include <ostream>

#include "people.hxx"

namespace people
{
  // person_impl
  //
  template <typename base>
  person_impl<base>::
  person_impl (const xml_schema::string& name)
      : base (name)
  {
  }

  template <typename base>
  person_impl<base>::
  person_impl (const xercesc::DOMElement& e,
               xml_schema::flags f,
               xml_schema::type* container)
      : base (e, f, container)
  {
  }

  template <typename base>
  person_impl<base>::
  person_impl (const person_impl& p,
               xml_schema::flags f,
               xml_schema::type* container)
      : base (p, f, container)
  {
  }

  template <typename base>
  person_impl<base>* person_impl<base>::
  _clone (xml_schema::flags f, xml_schema::type* container) const
  {
    return new person_impl (*this, f, container);
  }

  template <typename base>
  void person_impl<base>::
  print (std::ostream& os) const
  {
    os << this->name () << std::endl;
  }

  // Explicitly instantiate person_impl class template for person_base.
  //
  template class person_impl<person_base>;

  // superman_impl
  //
  template <typename base>
  superman_impl<base>::
  superman_impl (const xml_schema::string& name, bool can_fly)
      : base (name, can_fly)
  {
  }

  template <typename base>
  superman_impl<base>::
  superman_impl (const xercesc::DOMElement& e,
                 xml_schema::flags f,
                 xml_schema::type* container)
      : base (e, f, container)
  {
  }

  template <typename base>
  superman_impl<base>::
  superman_impl (const superman_impl& s,
                 xml_schema::flags f,
                 xml_schema::type* container)
      : base (s, f, container)
  {
  }

  template <typename base>
  superman_impl<base>* superman_impl<base>::
  _clone (xml_schema::flags f, xml_schema::type* container) const
  {
    return new superman_impl (*this, f, container);
  }

  template <typename base>
  void superman_impl<base>::
  print (std::ostream& os) const
  {
    if (this->can_fly ())
      os << "Flying superman ";
    else
      os << "Superman ";

    os << this->name () << std::endl;
  }

  // Explicitly instantiate superman_impl class template for superman_base.
  //
  template class superman_impl<superman_base>;

  // batman_impl
  //
  template <typename base>
  batman_impl<base>::
  batman_impl (const xml_schema::string& name,
               bool can_fly,
               unsigned int wing_span)
      : base (name, can_fly, wing_span)
  {
  }

  template <typename base>
  batman_impl<base>::
  batman_impl (const xercesc::DOMElement& e,
               xml_schema::flags f,
               xml_schema::type* container)
      : base (e, f, container)
  {
  }

  template <typename base>
  batman_impl<base>::
  batman_impl (const batman_impl& s,
               xml_schema::flags f,
               xml_schema::type* container)
      : base (s, f, container)
  {
  }

  template <typename base>
  batman_impl<base>* batman_impl<base>::
  _clone (xml_schema::flags f, xml_schema::type* container) const
  {
    return new batman_impl (*this, f, container);
  }

  template <typename base>
  void batman_impl<base>::
  print (std::ostream& os) const
  {
    os << "Batman " << this->name () << " with "
       << this->wing_span () << "m wing span" << std::endl;
  }

  // Explicitly instantiate batman_impl class template for batman_base.
  //
  template class batman_impl<batman_base>;
}
Again, the implementation should look familiar. The two differences compared to the previous section that are worth mentioning are explicit template instantiations and qualification of member functions with this->
. Explicit template instantiation is a useful optimization that allows us to keep the implementation details in the source file (instead of putting them into the header) and prevents the object code bloat that could result from instantiating our member functions in several translation units. This technique works well in our case because for each class template we know the only template argument it will ever be instantiated with (i.e., the generated base class).
The qualification of member functions (name
, can_fly
, wing_span
) with this->
is necessary because they are declared in the base type, which depends on the template argument, while the calls themselves have no arguments that depend on it. Without the qualification the compiler would not look for these names in the base class. For more information on this aspect of C++ templates see your favorite C++ book (e.g., "C++ Templates" by Vandevoorde and Josuttis).
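Both points can be reproduced in isolation. The following sketch uses a hand-written person_base stand-in and a hypothetical greeting() member; the comments mark which part would live in the header and which in the source file.

```cpp
#include <string>

// ---- header part (person_base is a stand-in for the generated class) ----

class person_base
{
public:
  explicit person_base (const std::string& n): name_ (n) {}
  const std::string& name () const { return name_; }

private:
  std::string name_;
};

template <typename base>
class person_impl: public base
{
public:
  explicit person_impl (const std::string& n);

  std::string
  greeting () const;
};

// ---- source part ----

template <typename base>
person_impl<base>::
person_impl (const std::string& n)
    : base (n)
{
}

template <typename base>
std::string person_impl<base>::
greeting () const
{
  // Writing just name () here is rejected by conforming compilers:
  // unqualified names are not looked up in a base class that depends on
  // a template parameter. this-> defers the lookup until instantiation.
  //
  return "Hello, " + this->name ();
}

// Explicit instantiation: the member function bodies above are compiled
// once, in this translation unit, for the only argument we ever use.
//
template class person_impl<person_base>;
```

Other translation units only need the header part. They link against the members instantiated here, which is what keeps the implementation out of the header.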
The client code that uses our customizations is presented below:
#include <memory>   // std::auto_ptr
#include <iostream>

#include "people.hxx"

using std::cerr;
using std::endl;

int
main (int argc, char* argv[])
{
  using namespace people;

  std::auto_ptr<catalog> c (catalog_ (argv[1]));

  for (catalog::person::const_iterator i (c->person ().begin ());
       i != c->person ().end (); ++i)
  {
    i->print (cerr);
  }
}
To summarize, the customization technique presented in this section consists of the following steps:
- Write *-custom-fwd.hxx with forward declarations of all your class templates.
- Request generation of the forward declaration file with the --generate-forward option.
- Use the --fwd-prologue option to include *-custom-fwd.hxx at the beginning of the generated forward declaration file.
- Write *-custom.hxx with definitions of all your class templates. Include the generated forward declaration file to gain access to the generated and XML Schema built-in types.
- Use the --hxx-prologue option to include *-custom.hxx at the beginning of the generated header file.
- Use the --custom-type and/or --custom-type-regex options to request type customizations.
- Write *-custom.cxx with implementations of all your class templates. Include the generated header file at the beginning. Use explicit template instantiation to instantiate each template with the corresponding base class.
Customizing the XML Schema built-in types
This section shows how to use custom types for the XML Schema built-in types. All code in this section is taken from the calendar
example which can be found in the examples/cxx/tree/custom/calendar
directory of the XSD distribution. The schema for this example looks as follows:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:cal="http://www.codesynthesis.com/calendar" targetNamespace="http://www.codesynthesis.com/calendar"> <xsd:complexType name="event"> <xsd:simpleContent> <xsd:extension base="xsd:string"> <xsd:attribute name="title" type="xsd:string" use="required"/> <xsd:attribute name="date" type="xsd:date" use="required"/> </xsd:extension> </xsd:simpleContent> </xsd:complexType> <xsd:complexType name="events"> <xsd:sequence> <xsd:element name="event" type="cal:event" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> <xsd:element name="events" type="cal:events"/> </xsd:schema>
In this example we would like to map the xsd:date built-in type to the boost::gregorian::date class from the Boost date_time library.
Normally, when you compile your schema, code for the XML Schema built-in types is generated inline in the generated header file. In order to customize the built-in types we will need to generate this code into a separate header file (remember that we can only customize types from the current schema's target namespace). To achieve this we will use the --generate-xml-schema
option. This option instructs the compiler to generate code for the XML Schema namespace. The fake schema file provided to the compiler does not have to exist and is only used to derive the name of the generated header file. For more information on this option see the XSD command line interface documentation (man pages). We will use the following command line to generate a header for the XML Schema namespace:
xsd cxx-tree --generate-xml-schema --custom-type date --hxx-epilogue '#include "xml-schema-custom.hxx"' xml-schema.xsd
This will result in the xml-schema.hxx file. We also used the --custom-type option to customize the xsd:date type and the --hxx-epilogue option to include xml-schema-custom.hxx at the end of the generated xml-schema.hxx. The xml-schema-custom.hxx file is where we define our custom date type:
#include <boost/date_time/gregorian/gregorian.hpp> // boost::gregorian::date

namespace xml_schema
{
  class date: public simple_type,
              public boost::gregorian::date
  {
  public:
    date (const xercesc::DOMElement&, flags = 0, type* = 0);
    date (const xercesc::DOMAttr&, flags = 0, type* = 0);
    date (const date&, flags = 0, type* = 0);

    virtual date*
    _clone (flags = 0, type* = 0) const;
  };
}
Our date
class inherits from both xml_schema::simple_type
(all simple types should inherit from xml_schema::simple_type
and complex types from xml_schema::type
) and boost::gregorian::date
.
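The double inheritance can be illustrated without Boost or the XSD runtime. In the sketch below simple_type and plain_date are hand-written stand-ins for xml_schema::simple_type and boost::gregorian::date, and parse_date is a hypothetical helper that handles only the plain YYYY-MM-DD form, without time zones or range checks.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Stand-in for xml_schema::simple_type.
class simple_type
{
public:
  virtual ~simple_type () {}
  virtual simple_type* _clone () const { return new simple_type (*this); }
};

// Stand-in for a third-party value type such as boost::gregorian::date.
class plain_date
{
public:
  plain_date (int y, int m, int d): year_ (y), month_ (m), day_ (d) {}

  int year () const { return year_; }
  int month () const { return month_; }
  int day () const { return day_; }

private:
  int year_, month_, day_;
};

// Hypothetical helper: parses YYYY-MM-DD and throws on malformed input.
inline plain_date
parse_date (const std::string& s)
{
  int y, m, d;

  if (std::sscanf (s.c_str (), "%d-%d-%d", &y, &m, &d) != 3)
    throw std::invalid_argument ("invalid date: " + s);

  return plain_date (y, m, d);
}

// The custom type is both a node in the object model (simple_type) and
// a value of the third-party type (plain_date).
class date: public simple_type, public plain_date
{
public:
  explicit date (const std::string& text)
      : simple_type (), plain_date (parse_date (text)) {}

  virtual date* _clone () const { return new date (*this); }
};
```

Code that works with the object model can treat date as a simple_type, while application code can use it wherever the value type is expected.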
The implementation of the date
class is placed into the xml-schema-custom.cxx
file:
#include <xsd/cxx/xml/string.hxx> // xml::transcode

#include "xml-schema.hxx"

using namespace boost;
using namespace boost::gregorian;

namespace xml = xsd::cxx::xml;

namespace xml_schema
{
  date::
  date (const xercesc::DOMElement& e, flags f, type* container)
      : simple_type (e, f, container),
        gregorian::date (
          from_simple_string (
            xml::transcode<char> (e.getTextContent ())))
  {
  }

  date::
  date (const xercesc::DOMAttr& a, flags f, type* container)
      : simple_type (a, f, container),
        gregorian::date (
          from_simple_string (
            xml::transcode<char> (a.getValue ())))
  {
  }

  date::
  date (const date& d, flags f, type* container)
      : simple_type (d, f, container),
        gregorian::date (d)
  {
  }

  date* date::
  _clone (flags f, type* container) const
  {
    return new date (*this, f, container);
  }
}
As in the previous sections we include xml-schema.hxx
, not xml-schema-custom.hxx
. We also use the transcode
function provided by the XSD runtime in order to convert element and attribute values from Xerces-C++
encoding (UTF-16) to the current code page.
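To give an idea of what such a conversion involves, here is a self-contained sketch that converts UTF-16 to UTF-8. It is not the XSD runtime's transcode and it targets UTF-8 rather than the current code page; the function name utf16_to_utf8 is hypothetical, and the code assumes a C++11 compiler for std::u16string.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for xml::transcode<char>: converts UTF-16 to
// UTF-8 instead of the current code page.
inline std::string
utf16_to_utf8 (const std::u16string& in)
{
  std::string out;

  for (std::u16string::size_type i (0); i < in.size (); ++i)
  {
    unsigned long cp (in[i]);

    if (cp >= 0xD800 && cp <= 0xDBFF) // high surrogate
    {
      if (i + 1 == in.size () || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
        throw std::invalid_argument ("unpaired high surrogate");

      // Combine the surrogate pair into a single code point.
      cp = 0x10000 + ((cp - 0xD800) << 10) + (in[++i] - 0xDC00);
    }
    else if (cp >= 0xDC00 && cp <= 0xDFFF)
      throw std::invalid_argument ("unpaired low surrogate");

    // Encode the code point as one to four UTF-8 bytes.
    if (cp < 0x80)
      out += static_cast<char> (cp);
    else if (cp < 0x800)
    {
      out += static_cast<char> (0xC0 | (cp >> 6));
      out += static_cast<char> (0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
      out += static_cast<char> (0xE0 | (cp >> 12));
      out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char> (0x80 | (cp & 0x3F));
    }
    else
    {
      out += static_cast<char> (0xF0 | (cp >> 18));
      out += static_cast<char> (0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char> (0x80 | (cp & 0x3F));
    }
  }

  return out;
}
```

Date values such as 2006-09-28 contain only ASCII characters, so for them the narrow result is the same text regardless of the target encoding.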