Tree/FAQ

From Code Synthesis Wiki

Revision as of 14:38, 23 November 2006

General

Parsing

How do I parse an XML instance to a Xerces-C++ DOM document?

Although this question is not strictly about XSD or the C++/Tree mapping, and it is covered in the Xerces-C++ Programming Guide, this step is a prerequisite for some of the more advanced techniques covered in this FAQ. In addition, the XSD runtime provides some utilities that make the code a little more palatable.

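#include <istream>

#include <xercesc/dom/DOM.hpp>
#include <xercesc/util/XMLUni.hpp>     // XMLUni::fg*
#include <xercesc/util/XMLUniDefs.hpp> // chLatin_*, chNull
#include <xercesc/framework/Wrapper4InputSource.hpp>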

#include <xsd/cxx/xml/dom/elements.hxx>
#include <xsd/cxx/xml/dom/bits/error-handler-proxy.hxx>
#include <xsd/cxx/xml/sax/std-input-source.hxx>

#include <xsd/cxx/tree/exceptions.hxx>
#include <xsd/cxx/tree/error-handler.hxx>

xsd::cxx::xml::dom::auto_ptr<xercesc::DOMDocument>
parse (std::istream& is)
{
  using namespace xercesc;
  namespace xml = xsd::cxx::xml;
  namespace tree = xsd::cxx::tree;

  const bool validate (true);
  const XMLCh ls_id [] = {chLatin_L, chLatin_S, chNull};

  // Get an implementation of the Load-Store (LS) interface.
  //
  DOMImplementation* impl (
    DOMImplementationRegistry::getDOMImplementation (ls_id));

  // Create a DOMBuilder.
  //
  xml::dom::auto_ptr<DOMBuilder> parser (
    impl->createDOMBuilder(DOMImplementationLS::MODE_SYNCHRONOUS, 0));

  // Discard comment nodes in the document.
  //
  parser->setFeature (XMLUni::fgDOMComments, false);

  // Enable datatype normalization.
  //
  parser->setFeature (XMLUni::fgDOMDatatypeNormalization, true);

  // Do not create EntityReference nodes in the DOM tree. No
  // EntityReference nodes will be created, only the nodes
  // corresponding to their fully expanded substitution text
  // will be created.
  //
  parser->setFeature (XMLUni::fgDOMEntities, false);

  // Perform Namespace processing.
  //
  parser->setFeature (XMLUni::fgDOMNamespaces, true);

  // Do not include ignorable whitespace in the DOM tree.
  //
  parser->setFeature (XMLUni::fgDOMWhitespaceInElementContent, false);

  // Enable/Disable validation.
  //
  parser->setFeature (XMLUni::fgDOMValidation, validate);
  parser->setFeature (XMLUni::fgXercesSchema, validate);
  parser->setFeature (XMLUni::fgXercesSchemaFullChecking, validate);

  // We will release the DOM document ourselves.
  //
  parser->setFeature (XMLUni::fgXercesUserAdoptsDOMDocument, true);

  // Set error handler.
  //
  tree::error_handler<char> eh;
  xml::dom::bits::error_handler_proxy<char> ehp (eh);
  parser->setErrorHandler (&ehp);

  // Prepare input stream.
  //
  xml::sax::std_input_source isrc (is);
  Wrapper4InputSource wrap (&isrc, false);

  xml::dom::auto_ptr<DOMDocument> doc (parser->parse (wrap));

  eh.throw_if_failed<tree::parsing<char> > ();

  return doc;
}

Below is a simple program that uses the above code.

#include <fstream>

#include <xercesc/util/PlatformUtils.hpp> // XMLPlatformUtils

int
main (int argc, char* argv[])
{
  using namespace xercesc;
  namespace xml = xsd::cxx::xml;

  // Expect the XML file name as the only argument.
  //
  if (argc != 2)
    return 1;

  XMLPlatformUtils::Initialize ();

  {
    std::ifstream ifs (argv[1]);
    xml::dom::auto_ptr<DOMDocument> doc (parse (ifs));
  }
 
  XMLPlatformUtils::Terminate ();
}
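The inner braces in main () above matter: the DOM document has to be released before XMLPlatformUtils::Terminate () is called. The same ordering can be obtained from a guard object, since local objects are destroyed in the reverse order of their construction. The following is only a sketch of that idea; runtime_guard, document, and event_log are hypothetical stand-ins that record events, not part of Xerces-C++ or the XSD runtime.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of events so that it can be inspected afterwards.
typedef std::vector<std::string> event_log;

// Hypothetical stand-in for the Initialize/Terminate pair.
struct runtime_guard
{
  explicit runtime_guard (event_log& l): log_ (l)
  {
    log_.push_back ("initialize");
  }

  ~runtime_guard () {log_.push_back ("terminate");}

private:
  runtime_guard (const runtime_guard&);            // Not copyable.
  runtime_guard& operator= (const runtime_guard&);

  event_log& log_;
};

// Hypothetical stand-in for a DOM document held by a smart pointer.
struct document
{
  explicit document (event_log& l): log_ (l)
  {
    // The runtime must already be initialized at this point.
    assert (!log_.empty () && log_.front () == "initialize");
    log_.push_back ("parse");
  }

  ~document () {log_.push_back ("release");}

private:
  event_log& log_;
};

// Mirrors the structure of main () above without the inner braces.
void
run (event_log& log)
{
  runtime_guard guard (log); // Initialize now, terminate on scope exit.
  document doc (log);        // Constructed last, so destroyed first.
}
```

With a guard like this the release still happens before termination even if parsing throws, because stack unwinding destroys the locals in the same reverse order.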

How do I handle XML data of an unknown type?

Here we assume that you need to handle XML instances that can be of several predefined types. There is no information that distinguishes one instance from another other than the root element name.

Suppose we have two root elements defined in our schema: foo and bar with types Foo and Bar, respectively. There are two ways to handle this situation. The first is quite straightforward but slow. It boils down to calling each parsing function in sequence, expecting all but one to fail. It is slow because the XML has to be re-parsed to DOM on each invocation. The following code outlines this approach:

while (true)
{
  try
  {
    std::auto_ptr<Foo> f (foo ("instance.xml")); // Try to parse as Foo.

    // Do something useful with f.

    break;
  }
  catch (xml_schema::unexpected_element const&)
  {
    // Try the next function.
  }

  try
  {
    std::auto_ptr<Bar> b (bar ("instance.xml")); // Try to parse as Bar.

    // Do something useful with b.

    break;
  }
  catch (xml_schema::unexpected_element const&)
  {
    // Try the next function.
  }

  // This instance is of some other type. Leave the loop; otherwise
  // we would re-parse the same instance forever.
  //
  break;
}
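When there are more than two candidate types, the repeated try/catch blocks above can be folded into a loop over a list of handlers. The following is a minimal sketch of that idea, assuming a C++11 compiler for std::function. The names handler and try_each are hypothetical, and unexpected_element is a local stand-in for xml_schema::unexpected_element so that the sketch does not depend on generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for xml_schema::unexpected_element.
struct unexpected_element: std::runtime_error
{
  explicit unexpected_element (const std::string& name)
    : std::runtime_error ("unexpected element: " + name) {}
};

// A handler parses the instance and processes the result, or throws
// unexpected_element if the root element is not the one it expects.
typedef std::function<void (const std::string& uri)> handler;

// Call each handler in turn. Return the index of the first handler
// that succeeds, or handlers.size () if none of them matched.
std::size_t
try_each (const std::vector<handler>& handlers, const std::string& uri)
{
  for (std::size_t i (0); i != handlers.size (); ++i)
  {
    assert (handlers[i]); // Every entry must be callable.

    try
    {
      handlers[i] (uri);
      return i;
    }
    catch (const unexpected_element&)
    {
      // Not this type; try the next handler.
    }
  }

  return handlers.size ();
}
```

Each handler would wrap one parsing function, for example a call to foo ("instance.xml") followed by the code that uses the result. This only tidies up the control flow: every attempt still re-parses the XML, so the approach remains slow.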

The second approach involves splitting the parsing process into two stages: XML to DOM and DOM to Tree. After the XML to DOM stage we peek at the root element and decide which parsing function to call:

#include <memory> // std::auto_ptr
#include <string>

#include <xercesc/dom/DOM.hpp>
#include <xsd/cxx/xml/string.hxx>

using namespace xercesc;

DOMDocument* dom = ... // Parse XML into DOM.
DOMElement* root = dom->getDocumentElement ();
std::string name (xsd::cxx::xml::transcode<char> (root->getLocalName ()));

if (name == "foo")
{
  std::auto_ptr<Foo> f (foo (*dom)); // Parse dom to Foo.

  // Do something useful with f.
}
else if (name == "bar")
{
  std::auto_ptr<Bar> b (bar (*dom)); // Parse dom to Bar.

  // Do something useful with b.
}

For more information on parsing XML to DOM, see "How do I parse an XML instance to a Xerces-C++ DOM document?" above.
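As the number of root elements grows, the if/else chain in the second approach can be replaced with a lookup table keyed on the root element name. The following is a minimal sketch of that idea, assuming a C++11 compiler for std::function. The document struct is a hypothetical stand-in for xercesc::DOMDocument that carries only the root element's local name, and handler_map and dispatch are illustrative names rather than part of the XSD runtime.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for xercesc::DOMDocument.
struct document
{
  std::string root_name; // Local name of the root element.
};

// A handler converts the DOM document to an object model and uses it.
typedef std::function<void (const document&)> handler;
typedef std::map<std::string, handler> handler_map;

// Look up the handler for the document's root element and call it.
// Return false if no handler is registered for that name.
bool
dispatch (const handler_map& handlers, const document& doc)
{
  handler_map::const_iterator i (handlers.find (doc.root_name));

  if (i == handlers.end ())
    return false;

  assert (i->second); // Registered handlers must be callable.

  i->second (doc);
  return true;
}
```

In real code each handler would call the corresponding parsing function on the DOM document, for example foo (*dom), and the key would come from the transcoded local name of the root element. A false return corresponds to an instance of some other type.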

Serialization

How do I create an empty Xerces-C++ DOM document?

Although this question is not strictly about XSD or the C++/Tree mapping, and it is covered in the Xerces-C++ Programming Guide, this step is a prerequisite for some of the more advanced techniques covered in this FAQ. In addition, the XSD runtime provides some utilities that make the code a little more palatable.

#include <string>

#include <xercesc/dom/DOM.hpp>
#include <xercesc/util/XMLUniDefs.hpp> // chLatin_*, chNull

#include <xsd/cxx/xml/string.hxx>
#include <xsd/cxx/xml/dom/elements.hxx>

xsd::cxx::xml::dom::auto_ptr<xercesc::DOMDocument>
create (const std::string& root_element_name,
        const std::string& root_element_namespace = "",
        const std::string& root_element_namespace_prefix = "");

xsd::cxx::xml::dom::auto_ptr<xercesc::DOMDocument>
create (const std::string& name,
        const std::string& ns,
        const std::string& prefix)
{
  using namespace xercesc;
  namespace xml = xsd::cxx::xml;

  const XMLCh ls_id [] = {chLatin_L, chLatin_S, chNull};

  // Get an implementation of the Load-Store (LS) interface.
  //
  DOMImplementation* impl (
    DOMImplementationRegistry::getDOMImplementation (ls_id));

  xml::dom::auto_ptr<DOMDocument> doc (
    impl->createDocument (
      (ns.empty () ? 0 : xml::string (ns).c_str ()),
      xml::string ((prefix.empty () ? name : prefix + ':' + name)).c_str (),
      0));

  return doc;
}

You can use this function like this:

#include <xercesc/util/PlatformUtils.hpp> // XMLPlatformUtils

int
main (int argc, char* argv[])
{
  using namespace xercesc;
  namespace xml = xsd::cxx::xml;

  XMLPlatformUtils::Initialize ();

  {
    xml::dom::auto_ptr<DOMDocument> doc (
      create ("example",
              "http://www.example.com/xmlns/example",
              "e"));
  }

  XMLPlatformUtils::Terminate ();
}

The call to create above creates a DOM document with the example element as its root. The example element is in the http://www.example.com/xmlns/example namespace to which we assigned the e namespace prefix.
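The second argument that create passes to createDocument is the qualified name of the root element: the prefix and the local name joined with a colon, or just the local name when no prefix is given. The following sketch isolates that rule in a helper. The name qualified_name is hypothetical; the logic mirrors the conditional expression in create above.

```cpp
#include <cassert>
#include <string>

// Build the qualified name for the root element: "prefix:name" if
// a prefix is given, otherwise just "name".
std::string
qualified_name (const std::string& name, const std::string& prefix)
{
  assert (!name.empty ()); // An element must have a local name.

  return prefix.empty () ? name : prefix + ':' + name;
}
```

For the call shown above, with the name example and the prefix e, the helper yields e:example. With an empty prefix it yields example.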
