Apache Axis2/C AXIOM Tutorial

Introduction

What is AXIOM?

AXIOM stands for AXis Object Model and refers to the XML infoset model that is developed for Apache Axis2. XML infoset refers to the information included inside the XML. For programmatic manipulation, it is convenient to have a representation of this XML infoset in a language-specific manner. DOM and JDOM are two such XML models. AXIOM is conceptually similar to such an XML model in its external behavior, but internally it is very different.

The objective of this tutorial is to introduce the basics of AXIOM/C and explain the best practices while using AXIOM.

AXIOM/C is a C equivalent of AXIOM/Java. We have done our best to provide almost the same API in C.

For whom is this tutorial?

This tutorial can be used by anybody who is interested in AXIOM/C and wants to go deeper into it. Knowledge of similar object models such as DOM will be helpful in understanding AXIOM, but such knowledge is not assumed. Several links are listed in the links section that will help you understand the basics of XML.

What is Pull Parsing?

Pull parsing is a new trend in XML processing. The previously popular XML processing frameworks such as DOM were "push-based", which means that the parser itself controlled the parsing. This approach is easy to use, but it is inefficient for large XML documents because a complete model of the document is built in memory. Pull parsing inverts the control, so the parser proceeds only at the user's command. The user can decide to store or discard the events generated by the parser. AXIOM is based on pull parsing. To learn more about XML pull parsing, see the XML pull parsing introduction.
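To make the inversion of control concrete, here is a minimal sketch of a pull-style reader in plain C. It is not the axiom_xml_reader API: the names pull_reader_t, pull_reader_init, pull_reader_next and the EV_* event kinds are hypothetical, and the tokenizer only understands bare start tags, end tags, and text (no attributes, namespaces, or entities).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event kinds for this sketch; not part of AXIOM. */
typedef enum { EV_START, EV_END, EV_TEXT, EV_EOF } pull_event_t;

typedef struct {
    const char *cur;     /* next unread character of the input */
    const char *tok;     /* start of the most recent token */
    size_t      tok_len; /* length of the most recent token */
} pull_reader_t;

static void pull_reader_init(pull_reader_t *r, const char *xml)
{
    assert(r && xml);
    r->cur = xml;
    r->tok = xml;
    r->tok_len = 0;
}

/* Advance by exactly one event, and only when the caller asks for it. */
static pull_event_t pull_reader_next(pull_reader_t *r)
{
    const char *end;
    assert(r);
    if (*r->cur == '\0')
        return EV_EOF;
    if (*r->cur == '<') {
        int closing = (r->cur[1] == '/');
        r->tok = r->cur + (closing ? 2 : 1);
        end = strchr(r->tok, '>');
        if (!end) {
            /* malformed input: give up and report end of input */
            r->cur += strlen(r->cur);
            return EV_EOF;
        }
        r->tok_len = (size_t)(end - r->tok);
        r->cur = end + 1;
        return closing ? EV_END : EV_START;
    }
    /* character data runs up to the next tag or the end of the input */
    r->tok = r->cur;
    end = strchr(r->cur, '<');
    r->tok_len = end ? (size_t)(end - r->cur) : strlen(r->cur);
    r->cur += r->tok_len;
    return EV_TEXT;
}
```

The caller drives the loop by calling pull_reader_next() until it returns EV_EOF, and is free to stop early, in which case the rest of the input is never examined.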

Features of AXIOM

AXIOM is a lightweight, deferred-built XML infoset representation based on the StAX API derived from JSR 173, which is the standard streaming pull parser API. AXIOM can be manipulated as flexibly as any other object model such as JDOM, but underneath, the objects are created only when they are absolutely required. This leads to much less memory-intensive programming.

The following is a short feature overview of AXIOM.

  • Lightweight: AXIOM is specifically targeted to be lightweight. This is achieved by reducing the depth of the hierarchy, the number of methods, and the attributes enclosed in the objects. This makes the objects less memory intensive.
  • Deferred building: By far, this is the most important feature of AXIOM. The objects are not made unless a need arises for them. This passes the control of building to the object model itself, rather than an external builder.
  • Pull based: For a deferred building mechanism, a pull-based parser is required. AXIOM is based on StAX, which is the standard pull parser API.

    Since different XML parsers offer different kinds of pull parser APIs, we define an API derived from StAX. That API is defined in axiom_xml_reader.h. Similarly, we define an XML writer API in axiom_xml_writer.h. These two APIs work as an abstraction layer between any XML parser and AXIOM. So any parser that is going to be used for AXIOM should implement the axiom_xml_reader API and the axiom_xml_writer API using a wrapper layer.

    Currently we use Libxml2 as our default XML parser.

The AXIOM Builder wraps the raw XML character stream through the axiom_xml_reader API. Hence the complexities of the pull event stream are hidden from the user.
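The idea of deferred building can be illustrated without any XML at all. The following sketch assumes a hypothetical lazy_model_t whose items are copied from a source array only when they are first requested. The source array stands in for the unread parser stream, and built_count plays the role of how far the builder has advanced. None of these names come from AXIOM.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS 8

/* Hypothetical lazily built model; an illustration only. */
typedef struct {
    const int *source;           /* stands in for the unread XML stream */
    size_t     source_len;
    int        built[MAX_ITEMS]; /* items materialized so far */
    size_t     built_count;      /* how far "building" has progressed */
} lazy_model_t;

static void lazy_model_init(lazy_model_t *m, const int *source, size_t len)
{
    assert(m && source && len <= MAX_ITEMS);
    m->source = source;
    m->source_len = len;
    m->built_count = 0;
}

/* Return a pointer to item idx, building only as far as needed.
   Returns NULL when idx is past the end of the source. */
static const int *lazy_model_get(lazy_model_t *m, size_t idx)
{
    assert(m);
    if (idx >= m->source_len)
        return NULL;
    while (m->built_count <= idx) {
        m->built[m->built_count] = m->source[m->built_count];
        m->built_count++;
    }
    return &m->built[idx];
}
```

Asking for the second item builds two items and leaves the rest of the source untouched, which is the memory saving that deferred building aims for.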

Where does SOAP come into play?

In a nutshell, SOAP is an information exchange protocol based on XML. SOAP has a defined set of XML elements that should be used in messages. Since Axis2 is a "SOAP Engine" and AXIOM is designed for Axis2, a SOAP specific API was implemented on top of AXIOM. We have defined a number of structs to represent SOAP constructs, which wrap general AXIOM structures. Learn more about SOAP.

Working with AXIOM

Axis2/C Environment

Before starting the discussion on AXIOM, it is necessary to get a good understanding of the basics of Axis2/C. Axis2/C is designed to be pluggable to any system written in C or C++. Therefore, Axis2/C has abstracted the functionalities that differ from system to system into a structure axutil_env_t, which we refer to as the Axis2 environment. The environment holds axutil_allocator_t, which is used for memory allocation and deallocation, axutil_error_t, which is used for error reporting, axutil_log_t, which is used for logging mechanisms, and axutil_thread_t which is used for threading mechanisms.

When creating the Axis2 environment, the first thing is to create the allocator.

axutil_allocator_t *allocator = NULL;

allocator = axutil_allocator_init(NULL);

We pass NULL to the above function in order to use the default allocator functions. The allocator functions will then use malloc and free for memory management. If you have your own allocator structure, with custom malloc and free functions, you can pass it instead.

Convenient macros AXIS2_MALLOC and AXIS2_FREE are defined to use allocator functions (please have a look at axutil_allocator.h for more information).
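The pattern behind a pluggable allocator can be sketched in a few lines of standalone C. This is an assumption-laden illustration, not the layout of axutil_allocator_t: the struct my_allocator_t, its fields, and the MY_MALLOC and MY_FREE macros are hypothetical, and the real field names and signatures should be taken from axutil_allocator.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator: a pair of function pointers plus bookkeeping. */
typedef struct my_allocator {
    void *(*malloc_fn)(struct my_allocator *self, size_t size);
    void  (*free_fn)(struct my_allocator *self, void *ptr);
    size_t live_blocks; /* only maintained by the counting variant */
} my_allocator_t;

static void *default_malloc(my_allocator_t *self, size_t size)
{
    (void)self;
    return malloc(size);
}

static void default_free(my_allocator_t *self, void *ptr)
{
    (void)self;
    free(ptr);
}

/* A custom variant that counts outstanding blocks. */
static void *counting_malloc(my_allocator_t *self, size_t size)
{
    void *p = malloc(size);
    if (p)
        self->live_blocks++;
    return p;
}

static void counting_free(my_allocator_t *self, void *ptr)
{
    if (ptr) {
        self->live_blocks--;
        free(ptr);
    }
}

/* Passing NULL selects the malloc/free defaults. */
static my_allocator_t my_allocator_init(const my_allocator_t *custom)
{
    my_allocator_t a;
    if (custom)
        return *custom;
    a.malloc_fn = default_malloc;
    a.free_fn = default_free;
    a.live_blocks = 0;
    return a;
}

/* Convenience macros that route every call through the allocator. */
#define MY_MALLOC(a, size) ((a)->malloc_fn((a), (size)))
#define MY_FREE(a, ptr)    ((a)->free_fn((a), (ptr)))
```

Because all allocation goes through the function pointers, swapping in the counting variant changes memory behavior for every caller without touching their code.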

In a similar fashion, you can create the error and log structures.

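axutil_log_t *log = NULL;
axutil_error_t *error = NULL;

error = axutil_error_create(allocator);

/* Create the log without naming a log file ... */
log = axutil_log_create(allocator, NULL, NULL);

/* ... or, alternatively, name the log file explicitly: */
/* log = axutil_log_create(allocator, NULL, "mylog.log"); */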

Now we can create the environment by passing the allocator, error and log to the axutil_env_create_with_error_log() function.

axutil_env_t *env = NULL;

env = axutil_env_create_with_error_log(allocator, error, log);

Apart from the above abstraction, all the other library functions used are ANSI C compliant. Further, platform dependent functions are also abstracted.

As a rule of thumb, all create functions take a pointer to the environment as their first argument, and all the other functions take a pointer to the struct they operate on as the first argument, and a pointer to the environment as the second argument. (Please refer to our coding convention page to learn more about this.)

Example

axiom_node_t *node = NULL;

axiom_node_t *child = NULL;

node = axiom_node_create(env);

child = axiom_node_get_first_child(node, env);

Note that we are passing the node (pointer to axiom_node_t) as the first argument and the pointer to the environment as the second.

Building AXIOM

This section explains how AXIOM can be built either from an existing document or programmatically. AXIOM provides a notion of a builder to create objects. Since AXIOM is tightly bound to StAX, a StAX compliant reader should be created first with the desired input stream.

In our AXIOM implementation, we define a struct axiom_node_t which acts as the container of the other structs. axiom_node_t maintains the links that form the linked list used to hold the AXIOM structure.

To traverse this structure, the functions defined in axiom_node.h must be used. To access XML information, the 'data element' struct stored in axiom_node_t must be obtained using the axiom_node_get_data_element function. The type of the struct stored in the axiom_node_t struct can be obtained with the axiom_node_get_node_type function. When we create axiom_element_t, axiom_text_t etc., it is required to pass a double pointer to the node struct as the last parameter of the create function, so that the corresponding node struct can be referenced using that pointer.

Example

axiom_node_t *my_node = NULL;

axiom_element_t *my_ele = NULL;

my_ele = axiom_element_create(env, NULL, "MY_ELEMENT", NULL, &my_node);

Now if we call the axiom_node_get_node_type function on the my_node pointer, it will return AXIOM_ELEMENT.
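The container relationship described above can be sketched in standalone C. The types and functions below (node_t, element_t, element_create, node_get_type, node_get_data, node_free) are hypothetical stand-ins chosen for illustration. They only mimic the shape of the AXIOM calls, in particular the out-parameter through which a create function hands back the container node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum { NODE_ELEMENT, NODE_TEXT } node_type_t;

/* The "data element": holds the XML information itself. */
typedef struct {
    const char *localname;
} element_t;

/* The container: holds the type tag, the data element and the tree links. */
typedef struct node {
    node_type_t  type;
    void        *data;
    struct node *parent;
    struct node *first_child;
    struct node *next_sibling;
} node_t;

/* Create an element together with its container node. The node is returned
   through node_out, in the same way a double pointer is used above. */
static element_t *element_create(const char *localname, node_t **node_out)
{
    element_t *e;
    node_t *n;
    assert(localname && node_out);
    e = malloc(sizeof *e);
    n = calloc(1, sizeof *n);
    if (!e || !n) {
        free(e);
        free(n);
        *node_out = NULL;
        return NULL;
    }
    e->localname = localname;
    n->type = NODE_ELEMENT;
    n->data = e;
    *node_out = n;
    return e;
}

static node_type_t node_get_type(const node_t *n)
{
    assert(n);
    return n->type;
}

static void *node_get_data(const node_t *n)
{
    assert(n);
    return n->data;
}

/* Freeing the container also frees the data element it owns. */
static void node_free(node_t *n)
{
    if (n) {
        free(n->data);
        free(n);
    }
}
```

A caller checks node_get_type() first and only then casts the result of node_get_data() to the matching struct type.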

Code Listing 1

axiom_xml_reader_t *xml_reader = NULL;
axiom_stax_builder_t *om_builder = NULL;
axiom_soap_builder_t *soap_builder = NULL;
axiom_soap_envelope_t *soap_envelope = NULL;

xml_reader = axiom_xml_reader_create_for_file(env, "test_soap.xml", NULL);

om_builder = axiom_stax_builder_create(env, xml_reader);

soap_builder = axiom_soap_builder_create(env, om_builder, AXIOM_SOAP11_SOAP_ENVELOPE_NAMESPACE_URI);

soap_envelope = axiom_soap_builder_get_soap_envelope(soap_builder, env);

As the example shows, creating an AXIOM tree from an xml_reader is straightforward. Elements and nodes can also be created programmatically to modify the structure. Currently AXIOM has two builders, namely axiom_stax_builder_t and axiom_soap_builder_t. These builders provide the necessary information to the XML infoset model to build the AXIOM tree.

Code Listing 2

axiom_namespace_t *ns1 = NULL;
axiom_namespace_t *ns2 = NULL;

axiom_element_t* root_ele = NULL;
axiom_node_t*    root_ele_node = NULL;

axiom_element_t *ele1      = NULL;
axiom_node_t *ele1_node = NULL;

ns1 = axiom_namespace_create(env, "bar", "x");
ns2 = axiom_namespace_create(env, "bar1", "y");

root_ele = axiom_element_create(env, NULL, "root", ns1, &root_ele_node);
ele1     = axiom_element_create(env, root_ele_node, "foo1", ns2, &ele1_node);

Several differences exist between a programmatically created axiom_node_t and a conventionally built axiom_node_t. The most important difference is that the latter has a pointer to its builder, whereas the former does not have a builder.

The SOAP struct hierarchy is made in the most natural way for a programmer. It acts as a wrapper layer on top of the AXIOM implementation. The SOAP structs wrap the corresponding axiom_node_t structs to store XML information.

Adding and Detaching Nodes

Addition and removal methods are defined in the axiom_node.h header file.

Code Listing 3

Add child operation

axis2_status_t
axiom_node_add_child(axiom_node_t *om_node,  
    const axutil_env_t *env, 
    axiom_node_t *child_node);

Detach operation

axiom_node_t *
axiom_node_detach(axiom_node_t *om_node, 
    const axutil_env_t *env);

The detach operation resets the links and removes a node from the AXIOM tree.

This code segment shows how child addition can be done.

Code Listing 4

axiom_node_t *foo_node = NULL;
axiom_element_t *foo_ele = NULL;
axiom_node_t *bar_node = NULL;
axiom_element_t *bar_ele = NULL;

foo_ele = axiom_element_create(env, NULL, "FOO", NULL, &foo_node);
bar_ele = axiom_element_create(env, NULL, "BAR", NULL, &bar_node);
axiom_node_add_child(foo_node, env, bar_node);

Alternatively, we can pass the foo_node as the parent node at the time of creating the bar_ele as follows.

 bar_ele = axiom_element_create(env, foo_node, "BAR", NULL, &bar_node);
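What "resetting the links" means for add and detach can be shown on a small standalone tree. This is a sketch under assumptions: tnode_t and its functions are hypothetical, and the actual link layout inside axiom_node_t may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node with parent, child and sibling links. */
typedef struct tnode {
    const char   *name;
    struct tnode *parent;
    struct tnode *first_child;
    struct tnode *last_child;
    struct tnode *prev_sibling;
    struct tnode *next_sibling;
} tnode_t;

static void tnode_init(tnode_t *n, const char *name)
{
    assert(n);
    n->name = name;
    n->parent = NULL;
    n->first_child = NULL;
    n->last_child = NULL;
    n->prev_sibling = NULL;
    n->next_sibling = NULL;
}

/* Append child as the last child of parent. */
static void tnode_add_child(tnode_t *parent, tnode_t *child)
{
    assert(parent && child && !child->parent);
    child->parent = parent;
    child->prev_sibling = parent->last_child;
    child->next_sibling = NULL;
    if (parent->last_child)
        parent->last_child->next_sibling = child;
    else
        parent->first_child = child;
    parent->last_child = child;
}

/* Unlink n from its parent and siblings; n keeps its own children. */
static tnode_t *tnode_detach(tnode_t *n)
{
    assert(n);
    if (n->parent) {
        if (n->prev_sibling)
            n->prev_sibling->next_sibling = n->next_sibling;
        else
            n->parent->first_child = n->next_sibling;
        if (n->next_sibling)
            n->next_sibling->prev_sibling = n->prev_sibling;
        else
            n->parent->last_child = n->prev_sibling;
    }
    n->parent = NULL;
    n->prev_sibling = NULL;
    n->next_sibling = NULL;
    return n;
}
```

Note that a detached node is only unlinked, not freed, so in this sketch the caller still owns it and its subtree after the call.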

The following shows important methods available in axiom_element to be used to deal with namespaces.

Code Listing 5

axiom_namespace_t * 
axiom_element_declare_namespace(axiom_element_t *om_ele,  
    const axutil_env_t *env, 
    axiom_node_t *om_node, 
    axiom_namespace_t *om_ns);

axiom_namespace_t * 
axiom_element_find_namespace(axiom_element_t *om_ele,
    const axutil_env_t *env, 
    axiom_node_t *om_node, 
    axis2_char_t *uri, 
    axis2_char_t *prefix);

axiom_namespace_t *
axiom_element_find_declared_namespace(axiom_element_t *om_element,
    const axutil_env_t *env,
    axis2_char_t *uri,
    axis2_char_t *prefix);

axis2_status_t
axiom_element_set_namespace(axiom_element_t *om_element,
    const axutil_env_t *env,
    axiom_namespace_t *ns,
    axiom_node_t *element_node);

An axiom_element keeps a list of its declared namespaces and a pointer to its own namespace, if one exists.

The axiom_element_declare_namespace function is straightforward. It adds a namespace to the declared namespace list. Note that a namespace that is already declared will not be declared again.

axiom_element_find_namespace is a very handy method to locate a namespace in the AXIOM tree. It searches for a matching namespace in its own declared namespace list and jumps to the parent if it's not found. The search progresses up the tree until a matching namespace is found or the root has been reached.

axiom_element_find_declared_namespace can be used to search for a namespace in the current element's declared namespace list.

axiom_element_set_namespace sets axiom_element's own namespace. Note that an element's own namespace should be declared in its own namespace declaration list or in one of its parent elements. This method first searches for a matching namespace using axiom_element_find_namespace and if a matching namespace is not found, a namespace is declared to this axiom_element's namespace declarations list before setting the own namespace reference.
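The difference between searching only the current element and searching up the tree can be sketched as follows. The types ns_t and elem_t and the elem_* functions are hypothetical, and the matching rule (same URI, and same prefix when a prefix is given) is a simplifying assumption rather than the exact rule AXIOM applies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DECLARED 4

typedef struct {
    const char *uri;
    const char *prefix;
} ns_t;

/* Hypothetical element: a parent link and a list of declared namespaces. */
typedef struct elem {
    struct elem *parent;
    const ns_t  *declared[MAX_DECLARED];
    size_t       declared_count;
} elem_t;

static void elem_init(elem_t *e, elem_t *parent)
{
    assert(e);
    e->parent = parent;
    e->declared_count = 0;
}

/* Search only this element's own declarations. */
static const ns_t *elem_find_declared_ns(const elem_t *e, const char *uri,
                                         const char *prefix)
{
    size_t i;
    assert(e && uri);
    for (i = 0; i < e->declared_count; i++) {
        const ns_t *ns = e->declared[i];
        if (strcmp(ns->uri, uri) == 0 &&
            (!prefix || strcmp(ns->prefix, prefix) == 0))
            return ns;
    }
    return NULL;
}

/* Search this element, then each ancestor, until a match or the root. */
static const ns_t *elem_find_ns(const elem_t *e, const char *uri,
                                const char *prefix)
{
    for (; e; e = e->parent) {
        const ns_t *ns = elem_find_declared_ns(e, uri, prefix);
        if (ns)
            return ns;
    }
    return NULL;
}

/* Declare ns on e unless it is already declared there.
   Returns 1 on success, 0 when the list is full. */
static int elem_declare_ns(elem_t *e, const ns_t *ns)
{
    assert(e && ns);
    if (elem_find_declared_ns(e, ns->uri, ns->prefix))
        return 1;
    if (e->declared_count == MAX_DECLARED)
        return 0;
    e->declared[e->declared_count++] = ns;
    return 1;
}
```

With a namespace declared on the root only, the upward search from a child finds it, while the declared-only search on that child returns NULL.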

The following sample code segment shows how the namespaces are dealt with in AXIOM.

Code Listing 6

axiom_namespace_t *ns1 = NULL;
axiom_namespace_t *ns2 = NULL;
axiom_namespace_t *ns3 = NULL;

axiom_node_t *root_node = NULL;
axiom_element_t *root_ele = NULL;

axiom_node_t *ele1_node = NULL;
axiom_element_t *ele1   = NULL;

axiom_node_t *text_node = NULL;
axiom_text_t *om_text   = NULL;

ns1 = axiom_namespace_create(env, "bar", "x");
ns2 = axiom_namespace_create(env, "bar1", "y");

root_ele = axiom_element_create(env, NULL , "root", ns1, &root_node);
ele1     = axiom_element_create(env, root_node, "foo", ns2, &ele1_node);
om_text  = axiom_text_create(env, ele1_node, "blah", &text_node);

Serialization of the root element produces the following XML:

<x:root xmlns:x="bar">
  <y:foo xmlns:y="bar1">blah</y:foo>
</x:root>

If we want to produce

<x:foo xmlns:x="bar" xmlns:y="bar1">Test</x:foo>

we can use the axiom_element_set_namespace and axiom_element_declare_namespace functions as follows.

axiom_node_t *foo_node = NULL;
axiom_element_t *foo_ele  = NULL;
axiom_namespace_t *ns1 = NULL;
axiom_namespace_t *ns2 = NULL;

foo_ele = axiom_element_create(env, NULL,"foo" ,NULL, &foo_node);

ns1 = axiom_namespace_create(env, "bar", "x");
ns2 = axiom_namespace_create(env, "bar1","y");

axiom_element_set_namespace(foo_ele, env, ns1, foo_node);
axiom_element_declare_namespace(foo_ele, env, foo_node, ns2);
axiom_element_set_text(foo_ele, env, "Test", foo_node);

Traversing

Traversing the AXIOM structure can be done by obtaining an iterator struct. You can either call the appropriate function on an AXIOM element or create the iterator manually. AXIOM/C offers three iterators to traverse the AXIOM structure. They are:

  • axiom_children_iterator_t
  • axiom_child_element_iterator_t
  • axiom_children_qname_iterator_t

The iterator supports the 'AXIOM way' of accessing elements and is more convenient than a list for sequential access. The following code sample shows how the children can be accessed. The children can be of type AXIOM_TEXT or AXIOM_ELEMENT.

Code Listing 7

axiom_children_iterator_t *children_iter = NULL;
children_iter = axiom_element_get_children(om_ele, env, om_node);
if(NULL != children_iter )
{
    while(axiom_children_iterator_has_next(children_iter, env))
    {
        axiom_node_t *node = NULL;
        node = axiom_children_iterator_next(children_iter, env);
        if(NULL != node)
        {
           if(axiom_node_get_node_type(node, env) == AXIOM_ELEMENT)
           {
               /* processing logic goes here */
           }
        } 

    }
}

Apart from this, every axiom_node_t struct has links to its siblings. If a thorough navigation is needed, the axiom_node_get_next_sibling() and axiom_node_get_previous_sibling() functions can be used. A restricted set can be chosen by using the axiom_element_xxx_with_qname() methods. The axiom_element_get_first_child_with_qname() method returns the first child that matches the given axutil_qname_t, and axiom_element_get_children_with_qname() returns an axiom_children_qname_iterator_t which can be used to traverse all the matching children. The advantage of these iterators is that they won't build the whole object structure at once; they build only what is required.

Internally, all iterator implementations stay one step ahead of their apparent location to provide the correct value for the has_next() function. This hidden advancement can build elements that were not intended to be built at all.
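The effect of an iterator staying one step ahead can be demonstrated with a small standalone sketch. The names stream_t, lookahead_iter_t and iter_* are hypothetical. The consumed counter stands in for the number of items the builder has had to pull from the parser.

```c
#include <assert.h>
#include <stddef.h>

/* Stands in for the parser stream; consumed counts items pulled so far. */
typedef struct {
    const int *source;
    size_t     len;
    size_t     consumed;
} stream_t;

/* Iterator that always holds the next value before it is asked for. */
typedef struct {
    stream_t *s;
    int       have_next;
    int       next_val;
} lookahead_iter_t;

/* Pull one item ahead so that has_next can answer without guessing. */
static void iter_prefetch(lookahead_iter_t *it)
{
    if (it->s->consumed < it->s->len) {
        it->next_val = it->s->source[it->s->consumed++];
        it->have_next = 1;
    } else {
        it->have_next = 0;
    }
}

static void iter_init(lookahead_iter_t *it, stream_t *s)
{
    assert(it && s);
    it->s = s;
    iter_prefetch(it);
}

static int iter_has_next(const lookahead_iter_t *it)
{
    assert(it);
    return it->have_next;
}

/* Return the prefetched value and immediately fetch the one after it. */
static int iter_next(lookahead_iter_t *it)
{
    int v;
    assert(it && it->have_next);
    v = it->next_val;
    iter_prefetch(it);
    return v;
}
```

Right after iter_init() one item has already been consumed, and after the first iter_next() two have been, even though the caller has seen only one value. This is the hidden advancement described above.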

Serialization

AXIOM can be serialized using the axiom_node_serialize function. The serialization uses axiom_xml_writer.h and axiom_output.h APIs.

Here is an example that shows how to write the output to the console (we have serialized the SOAP envelope created in code listing 1).

Code Listing 8

axiom_xml_writer_t *xml_writer = NULL;
axiom_output_t *om_output = NULL;
axis2_char_t *buffer = NULL;

..............

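/* Write into an in-memory buffer so the XML can be fetched as a string */
xml_writer = axiom_xml_writer_create_for_memory(env, NULL, AXIS2_TRUE, 0, AXIS2_XML_PARSER_TYPE_BUFFER);
om_output = axiom_output_create(env, xml_writer);

axiom_soap_envelope_serialize(soap_envelope, env, om_output);
buffer = (axis2_char_t*)axiom_xml_writer_get_xml(xml_writer, env);
printf("%s ", buffer);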

An easy way to serialize is to use the axiom_element_to_string function.

Code Listing 9

axis2_char_t *xml_output = NULL; 
axiom_node_t *foo_node = NULL;
axiom_element_t *foo_ele = NULL;
axiom_namespace_t* ns = NULL;

ns = axiom_namespace_create(env, "bar", "x");

foo_ele = axiom_element_create(env, NULL, "foo", ns, &foo_node);

axiom_element_set_text(foo_ele, env, "EASY SERIALIZATION", foo_node);

xml_output = axiom_element_to_string(foo_ele, env, foo_node);

printf("%s", xml_output);
AXIS2_FREE(env->allocator, xml_output);

Note that freeing the returned buffer is the user's responsibility.

Using axiom_xml_reader and axiom_xml_writer

axiom_xml_reader provides three create functions that can be used for different XML input sources.

  • axiom_xml_reader_create_for_file can be used to read from a file
  • axiom_xml_reader_create_for_io uses a user defined callback function to pull XML
  • axiom_xml_reader_create_for_memory can be used to read from an XML string that is in a character buffer

Similarly, axiom_xml_writer provides two create functions.

  • axiom_xml_writer_create_for_file can be used to write to a file
  • axiom_xml_writer_create_for_memory can be used to write to an internal memory buffer and obtain the XML string as a character buffer

Please refer to axiom_xml_reader.h and axiom_xml_writer.h for more information.

How to Avoid Memory Leaks and Double Frees When Using AXIOM

You have to be careful when using AXIOM in order to avoid memory leaks and double free errors. The following guidelines will help:

1. The axiom_element struct keeps a list of attributes and a list of namespaces. When an axiom_namespace pointer or an axiom_attribute pointer is added to these lists, it will be freed when the axiom_element is freed. Therefore a pointer to a namespace or an attribute should not be freed once it has been used with an axiom_element.

To avoid any inconvenience, clone functions have been implemented for both the axiom_namespace and axiom_attribute structures.

2. AXIOM returns shallow references to its string values. Therefore, when you want deep copies of returned values, the axutil_strdup() function should be used to avoid double free errors.

Example

axiom_namespace_t *ns = NULL;
axis2_char_t *uri = NULL;

ns = axiom_namespace_create(env, "http://ws.apache.org", "AXIOM");
uri = axiom_namespace_get_uri(ns, env);

/* Now uri points to the same place where the namespace struct's uri
   pointer is pointing. Therefore the following will cause a double free. */
AXIS2_FREE(env->allocator, uri);
axiom_namespace_free(ns, env);

3. When creating AXIOM programmatically, if you are declaring a namespace with an axiom_element, it is advisable to check whether the namespace is already available in the element's scope using the axiom_element_find_namespace function. If it is available, that pointer can be used instead of creating another namespace struct instance, which minimizes memory usage.
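Guideline 2 can be illustrated with a standalone sketch of the safe pattern: treat the returned pointer as borrowed, and make a private copy when the value must outlive its owner. The names owner_t, owner_get_uri and dup_string are hypothetical. In AXIOM code the copy would be made with axutil_strdup(), as stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct that owns its string, like a namespace owns its uri. */
typedef struct {
    char *uri;
} owner_t;

/* Deep copy of a string; the caller owns and must free the result. */
static char *dup_string(const char *s)
{
    size_t n;
    char *copy;
    assert(s);
    n = strlen(s) + 1;
    copy = malloc(n);
    if (copy)
        memcpy(copy, s, n);
    return copy;
}

static owner_t *owner_create(const char *uri)
{
    owner_t *o = malloc(sizeof *o);
    if (!o)
        return NULL;
    o->uri = dup_string(uri);
    if (!o->uri) {
        free(o);
        return NULL;
    }
    return o;
}

/* Shallow getter: returns the owner's own pointer. Do not free it. */
static const char *owner_get_uri(const owner_t *o)
{
    assert(o);
    return o->uri;
}

/* Frees the struct and the string it owns. */
static void owner_free(owner_t *o)
{
    if (o) {
        free(o->uri);
        free(o);
    }
}
```

A caller that needs the URI after owner_free() first calls dup_string(owner_get_uri(o)) and later frees only that copy, so each block of memory is freed exactly once.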

Complete Code for the AXIOM Based Document Building and Serialization

The following code segment shows how to use AXIOM for building a document completely and then serializing it into text, pushing the output to the console.

Code Listing 10

#include <axiom.h>
#include <axis2_util.h>
#include <axutil_env.h>
#include <axutil_log_default.h>
#include <axutil_error_default.h>
#include <stdio.h>

FILE *f = NULL;

/* Read callback: copies up to 'size' bytes from the input file into
   'buffer' and returns the number of bytes read */
int read_input_callback(char *buffer, int size, void *ctx)
{
    return (int)fread(buffer, sizeof(char), size, f);
}

/* Close callback: closes the input file */
int close_input_callback(void *ctx)
{
    return fclose(f);
}

axutil_env_t *create_environment()
{
    axutil_allocator_t *allocator = NULL;
    axutil_env_t *env = NULL;
    axutil_log_t *log = NULL;
    axutil_error_t *error = NULL;

    allocator = axutil_allocator_init(NULL);
    log = axutil_log_create(allocator, NULL, NULL);
    error = axutil_error_create(allocator);
    env = axutil_env_create_with_error_log(allocator, error, log);
    return env;
}

int build_and_serialize_om(axutil_env_t *env)
{
    axiom_node_t *root_node = NULL;
    axiom_element_t *root_ele = NULL;
    axiom_document_t *document = NULL;
    axiom_stax_builder_t *om_builder = NULL;
    axiom_xml_reader_t *xml_reader = NULL;
    axiom_xml_writer_t *xml_writer = NULL;
    axiom_output_t *om_output = NULL;
    axis2_char_t *buffer = NULL;

    f = fopen("test.xml", "r");
    if(!f)
        return AXIS2_FAILURE;

    xml_reader = axiom_xml_reader_create_for_io(env, read_input_callback,
                                                    close_input_callback, NULL, NULL);
    if(!xml_reader)
    {
        fclose(f);
        return AXIS2_FAILURE;
    }

    om_builder = axiom_stax_builder_create(env, xml_reader);
    if(!om_builder)
    {
        axiom_xml_reader_free(xml_reader, env);
        return AXIS2_FAILURE;
    }

    document = axiom_stax_builder_get_document(om_builder, env);
    if(!document)
    {
        axiom_stax_builder_free(om_builder, env);
        return AXIS2_FAILURE;
    }

    root_node = axiom_document_get_root_element(document, env);
    if(!root_node)
    {
        axiom_stax_builder_free(om_builder, env);
        return AXIS2_FAILURE;
    }

    /* Print the local name of the root element */
    if(axiom_node_get_node_type(root_node, env) == AXIOM_ELEMENT)
    {
        root_ele = (axiom_element_t*)axiom_node_get_data_element(root_node, env);
        if(root_ele)
        {
            printf(" %s", axiom_element_get_localname(root_ele, env));
        }
    }

    /* Build the rest of the document before serializing it */
    axiom_document_build_all(document, env);

    xml_writer = axiom_xml_writer_create_for_memory(env, NULL, AXIS2_TRUE, 0, AXIS2_XML_PARSER_TYPE_BUFFER);
    om_output = axiom_output_create(env, xml_writer);

    axiom_node_serialize(root_node, env, om_output);
    buffer = (axis2_char_t*)axiom_xml_writer_get_xml(xml_writer, env);

    printf("The output XML is ->>>>\n %s ", buffer);

    /* When om_output is freed, xml_writer is also freed */
    axiom_output_free(om_output, env);

    /* When om_builder is freed, the document and the AXIOM tree it built
       are freed as well */
    axiom_stax_builder_free(om_builder, env);

    return AXIS2_SUCCESS;
}

int main()
{
    int status = AXIS2_SUCCESS;
    axutil_env_t *env = NULL;

    env = create_environment();

    status = build_and_serialize_om(env);
    if(status == AXIS2_FAILURE)
    {
        printf(" build AXIOM failed");
    }

    axutil_env_free(env);

    return 0;
}