Handling Binary Data with Axis2 (MTOM/SwA)

This document describes how to use the Axis2 functionality to send/receive binary data with SOAP.

Introduction

Despite the flexibility, interoperability, and global acceptance of XML, there are times when serializing data into XML does not make sense. Web services users may want to transmit binary attachments of various sorts like images, drawings, XML docs, etc., together with a SOAP message. Such data is often in a particular binary format.

Traditionally, two techniques have been used to deal with opaque data in XML:

  1. "By value"

    Sending binary data by value is achieved by embedding opaque data (after some form of encoding) as element or attribute content of the XML component of the data. The main advantage of this technique is that it gives applications the ability to process and describe data based only on the XML component of the data.

    XML supports opaque data as content through the use of either base64 or hexadecimal text encoding. Both techniques bloat the size of the data. With UTF-8 as the underlying text encoding, base64 encoding increases the size of the binary data by a factor of about 1.33, while hexadecimal encoding expands it by a factor of 2. These factors are doubled if UTF-16 text encoding is used. Also of concern is the processing overhead (both real and perceived) of these formats, especially when decoding back into raw binary.

  2. "By reference"

    Sending binary data by reference is achieved by attaching pure binary data as external unparsed general entities outside the XML document and then embedding reference URIs to those entities as elements or attribute values. This prevents the unnecessary bloating of data and wasting of processing power. The primary obstacle for using these unparsed entities is their heavy reliance on DTDs, which impedes modularity as well as the use of XML namespaces.

    There were several specifications introduced in the Web services world to deal with this binary attachment problem using the "by reference" technique. SOAP with Attachments is one such example. Since SOAP prohibits document type declarations (DTD) in messages, this leads to the problem of not representing data as part of the message infoset, therefore creating two data models. This scenario is like sending attachments with an e-mail message. Even though those attachments are related to the message content they are not inside the message. This causes the technologies that process and describe the data based on the XML component of the data to malfunction. One example is WS-Security.
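
The expansion factors quoted for the "by value" encodings can be checked with a few lines of plain Java. This is only an illustrative sketch that uses the JDK, not Axis2 or AXIOM; the class and method names are made up for this example.

```java
import java.util.Base64;

public class EncodingOverhead {

    // Number of characters needed to carry the data as base64 text.
    public static int base64Length(byte[] data) {
        return Base64.getEncoder().encodeToString(data).length();
    }

    // Number of characters needed to carry the data as hexadecimal text.
    public static int hexLength(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.length();
    }

    public static void main(String[] args) {
        byte[] data = new byte[3000]; // stand-in for an opaque binary payload
        System.out.println("raw bytes:    " + data.length);
        System.out.println("base64 chars: " + base64Length(data));
        System.out.println("hex chars:    " + hexLength(data));
    }
}
```

Every character produced here is ASCII, so it occupies one byte in UTF-8 and two bytes in UTF-16, which is where the doubling comes from.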

Where Does MTOM Come In?

MTOM (SOAP Message Transmission Optimization Mechanism) is another specification that focuses on solving the "attachments" problem. MTOM tries to combine the advantages of the two techniques above. On the wire, MTOM is a "by reference" method: the wire format of an MTOM optimized message is the same as that of a SOAP with Attachments message, which also makes it backward compatible with SwA endpoints. The most notable feature of MTOM is the use of the xop:Include element, which is defined in the XML-binary Optimized Packaging (XOP) specification, to reference the binary attachments (external unparsed general entities) of the message. With the use of this element, the attached binary content logically becomes inline (by value) with the SOAP document even though it is actually attached separately. This merges the two realms by making it possible to work with only one data model, and it allows applications to process and describe the data by looking only at the XML part, which removes the reliance on DTDs. On a lighter note, MTOM has standardized the referencing mechanism of SwA. The following is an extract from the XOP specification.

At the conceptual level, this binary data can be thought of as being base64-encoded in the XML Document. As this conceptual form might be needed during some processing of the XML document (e.g., for signing the XML document), it is necessary to have a one-to-one correspondence between XML Infosets and XOP Packages. Therefore, the conceptual representation of such binary data is as if it were base64-encoded, using the canonical lexical form of the XML Schema base64Binary datatype (see [XML Schema Part 2: Datatypes Second Edition] 3.2.16 base64Binary). In the reverse direction, XOP is capable of optimizing only base64-encoded Infoset data that is in the canonical lexical form.
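
The idea that attached content is conceptually base64 text inside the XML can be illustrated with a small sketch. It replaces each xop:Include reference in a string with the base64 encoding of the matching attachment. This is a simplified illustration using only the JDK and a regular expression, not the way Axis2 or AXIOM process messages; the class name, the method name, and the map of attachments are assumptions made for this example, and percent-escaping of cid: URIs is ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XopInlineSketch {

    // Matches an xop:Include element (self-closing or with an empty body)
    // whose href is a cid: reference. Group 1 is the content-id.
    private static final Pattern INCLUDE = Pattern.compile(
            "<xop:Include\\s+href=\"cid:([^\"]+)\"[^>]*(?:/>|>\\s*</xop:Include>)");

    // Returns the XML with every xop:Include replaced by base64 text.
    public static String inline(String xml, Map<String, byte[]> attachments) {
        Matcher m = INCLUDE.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            byte[] data = attachments.get(m.group(1));
            if (data == null) {
                throw new IllegalArgumentException("No attachment for cid:" + m.group(1));
            }
            // java.util.Base64's basic encoder emits no line breaks,
            // matching the whitespace-free canonical form.
            String b64 = Base64.getEncoder().encodeToString(data);
            m.appendReplacement(out, Matcher.quoteReplacement(b64));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, byte[]> parts = new HashMap<>();
        parts.put("1@apache.org", "hello".getBytes(StandardCharsets.UTF_8));
        String xml = "<image><xop:Include href=\"cid:1@apache.org\" "
                + "xmlns:xop=\"http://www.w3.org/2004/08/xop/include\"/></image>";
        System.out.println(inline(xml, parts));
    }
}
```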

Apache Axis2 supports Base64 encoding, SOAP with Attachments and MTOM (SOAP Message Transmission Optimization Mechanism).

MTOM with Axis2

Programming Model

AXIOM is an object model (and may be the first) that has the ability to hold binary data. It has this ability because OMText can hold raw binary content in the form of a javax.activation.DataHandler. OMText was chosen for this purpose for two reasons. One is that XOP (MTOM) is capable of optimizing only base64-encoded Infoset data that is in the canonical lexical form of the XML Schema base64Binary datatype. The other is to preserve the infoset at both the sender and the receiver, that is, to store the binary content in the same kind of object regardless of whether it is optimized or not.

MTOM allows you to selectively encode portions of the message, so that a single SOAP message can carry base64-encoded data as well as externally attached raw binary data referenced by the xop:Include element (optimized content). You can specify whether an OMText node that contains raw binary data or base64-encoded binary data is qualified to be optimized, either at the time of construction of that node or later. For optimum efficiency of MTOM, a user is advised to send smaller binary attachments using base64 encoding (non-optimized) and larger attachments as optimized content.

        OMElement imageElement = fac.createOMElement("image", omNs);

        // Creating the Data Handler for the file.  Any implementation of
        // javax.activation.DataSource interface can fit here.
        javax.activation.DataHandler dataHandler = new javax.activation.DataHandler(new FileDataSource("SomeFile"));
      
        //create an OMText node with the above DataHandler and set optimized to true
        OMText textData = fac.createOMText(dataHandler, true);

        imageElement.addChild(textData);

        //User can set optimized to false by using the following
        //textData.doOptimize(false);
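
The advice to keep small attachments as base64 and optimize large ones can be turned into a rough rule of thumb. The sketch below compares the bytes that base64 adds with an assumed fixed cost of a separate MIME part (boundary line plus part headers). The class, the method names, and the 200-byte overhead figure are assumptions made for this example, not values defined by Axis2.

```java
public class OptimizeDecision {

    // Extra bytes added when n raw bytes are sent as base64 text
    // (one byte per character, as in UTF-8).
    public static long base64Overhead(long n) {
        long encoded = 4 * ((n + 2) / 3); // 4 characters per 3 input bytes, padded
        return encoded - n;
    }

    // True when base64 would cost more than the assumed per-part MIME overhead.
    public static boolean worthOptimizing(long n, long mimePartOverhead) {
        return base64Overhead(n) > mimePartOverhead;
    }

    public static void main(String[] args) {
        long assumedPartOverhead = 200; // hypothetical figure for this sketch
        for (long size : new long[] {100, 3000, 1000000}) {
            System.out.println(size + " bytes -> optimize: "
                    + worthOptimizing(size, assumedPartOverhead));
        }
    }
}
```

The resulting boolean could be passed as the second argument of fac.createOMText(dataHandler, ...) when the node is created.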

Also, a user can create an optimizable binary content node using a base64 encoded string, which contains encoded binary content, given with the MIME type of the actual binary representation.

        String base64String = "some_base64_encoded_string";
        OMText binaryNode = fac.createOMText(base64String, "image/jpeg", true);

Axis2 uses javax.activation.DataHandler to handle the binary data. All the optimized binary content nodes will be serialized as Base64 strings if MTOM is not enabled. You can also create binary content nodes that will not be optimized in any case. They will always be serialized and sent as Base64 strings.

        //create an OMText node with the above DataHandler and set "optimized" to false
        //This data will be sent as a Base64 encoded string regardless of whether MTOM is enabled or not
        javax.activation.DataHandler dataHandler = new javax.activation.DataHandler(new FileDataSource("SomeFile"));
        OMText textData = fac.createOMText(dataHandler, false);
        imageElement.addChild(textData);

Enabling MTOM Optimization on the Client Side

In Options, set the "enableMTOM" property to True when sending messages.

        ServiceClient serviceClient = new ServiceClient();
        Options options = new Options();
        options.setTo(targetEPR);
        options.setProperty(Constants.Configuration.ENABLE_MTOM, Constants.VALUE_TRUE);
        serviceClient.setOptions(options);

When this property is set to True, any SOAP envelope, regardless of whether it contains optimizable content or not, will be serialized as an MTOM optimized MIME message.

If the property is not set to True, Axis2 serializes all binary content nodes as Base64 encoded strings, regardless of whether they are qualified to be optimized or not.

The user does not have to specify anything in order for Axis2 to receive MTOM optimized messages. Axis2 automatically identifies an incoming MTOM message and de-serializes it accordingly.
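
As a rough illustration of content-type based detection: an MTOM message is a multipart/related message whose type parameter is application/xop+xml, while a SwA message is multipart/related with another root type such as text/xml. The sketch below classifies a Content-Type header value on that basis. It is a simplified JDK-only example, not Axis2's actual implementation; the class, enum, and method names are made up, and the naive split on ';' does not handle semicolons inside quoted parameter values.

```java
import java.util.Locale;

public class ContentTypeSniffer {

    public enum Kind { MTOM, SWA, NOT_MULTIPART }

    public static Kind classify(String contentType) {
        String[] fields = contentType.split(";");
        String mediaType = fields[0].trim().toLowerCase(Locale.ROOT);
        if (!mediaType.equals("multipart/related")) {
            return Kind.NOT_MULTIPART;
        }
        // Look for the "type" parameter that names the root part's media type.
        for (int i = 1; i < fields.length; i++) {
            String[] kv = fields[i].trim().split("=", 2);
            if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("type")) {
                String value = kv[1].trim().replace("\"", "").toLowerCase(Locale.ROOT);
                if (value.equals("application/xop+xml")) {
                    return Kind.MTOM;
                }
            }
        }
        return Kind.SWA;
    }

    public static void main(String[] args) {
        System.out.println(classify(
                "multipart/related; boundary=MIMEBoundary; type=\"application/xop+xml\""));
        System.out.println(classify(
                "multipart/related; type=\"text/xml\"; boundary=\"----=_Part_0\""));
        System.out.println(classify("text/xml; charset=UTF-8"));
    }
}
```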

Enabling MTOM Optimization on the Server Side

The Axis2 server automatically identifies incoming MTOM optimized messages based on the content-type and de-serializes them accordingly. The user can enable MTOM on the server side for outgoing messages.

To enable MTOM globally for all services, users can set the "enableMTOM" parameter to True in axis2.xml. When it is set, all outgoing messages will be serialized and sent as MTOM optimized MIME messages. If it is not set, all the binary data in the binary content nodes will be serialized as Base64 encoded strings. This configuration can be overridden in services.xml on a per-service and per-operation basis.

<parameter name="enableMTOM">true</parameter>

You must restart the server after setting this parameter.

Accessing Received Binary Data (Sample Code)

  • Service

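import javax.activation.DataHandler;

import org.apache.axiom.om.OMElement;
import org.apache.axiom.om.OMText;

public class MTOMService {
    public void uploadFileUsingMTOM(OMElement element) throws Exception {

       // The binary content is the first child node of the first child element
       OMText binaryNode = (OMText) (element.getFirstElement()).getFirstOMChild();
       DataHandler actualDH;
       actualDH = (DataHandler) binaryNode.getDataHandler();

       // ... Do whatever you need with the DataHandler ...
    }
}

  • Client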

        ServiceClient sender = new ServiceClient();        
        Options options = new Options();
        options.setTo(targetEPR); 
        // enabling MTOM
        options.setProperty(Constants.Configuration.ENABLE_MTOM, Constants.VALUE_TRUE);
        ............

        OMElement result = sender.sendReceive(payload);
        OMElement ele = result.getFirstElement();
        OMText binaryNode = (OMText) ele.getFirstOMChild();
        
        // Retrieve the DataHandler, then process the data as needed
        DataHandler actualDH;
        actualDH = binaryNode.getDataHandler();
        .............

MTOM Databinding

You can define a binary element in the schema using the schema type="xsd:base64Binary". Having an element with the type "xsd:base64Binary" is enough for the Axis2 code generators to identify possible MTOM attachments, and to generate code accordingly.

Going a little further, you can use the xmime schema (http://www.w3.org/2005/05/xmlmime) to describe the binary content more precisely. With the xmime schema, you can indicate the type of content in the element at runtime using an MTOM attribute extension, xmime:contentType. Furthermore, you can identify what type of data might be expected in the element using xmime:expectedContentTypes. Putting it all together, our example element becomes:

      <element name="MyBinaryData" xmime:expectedContentTypes='image/jpeg' >
        <complexType>
          <simpleContent>
            <extension base="base64Binary" >
              <attribute ref="xmime:contentType" use="required"/>
            </extension>
          </simpleContent>
        </complexType>
      </element>

You can also use the xmime:base64Binary type to express the above more clearly.

      <element name="MyBinaryData" xmime:expectedContentTypes='image/jpeg' type="xmime:base64Binary"/>

MTOM Databinding Using ADB

Let's define a full, validated doc/lit style WSDL that uses the xmime schema and has a service that receives a file and saves it on the server using the given path.

<wsdl:definitions xmlns:tns="http://ws.apache.org/axis2/mtomsample/"
        xmlns:mime="http://schemas.xmlsoap.org/wsdl/mime/"
        xmlns:http="http://schemas.xmlsoap.org/wsdl/http/"
        xmlns:soap12="http://schemas.xmlsoap.org/wsdl/soap12/"
        xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
        xmlns:wsaw="http://www.w3.org/2006/05/addressing/wsdl"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
        xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
        xmlns="http://schemas.xmlsoap.org/wsdl/"
        targetNamespace="http://ws.apache.org/axis2/mtomsample/">

        <wsdl:types>
                <xsd:schema xmlns="http://schemas.xmlsoap.org/wsdl/"
                        attributeFormDefault="qualified" elementFormDefault="qualified"
                        targetNamespace="http://ws.apache.org/axis2/mtomsample/">

                        <xsd:import namespace="http://www.w3.org/2005/05/xmlmime"
                                schemaLocation="http://www.w3.org/2005/05/xmlmime" />
                        <xsd:complexType name="AttachmentType">
                                <xsd:sequence>
                                        <xsd:element minOccurs="0" name="fileName"
                                                type="xsd:string" />
                                        <xsd:element minOccurs="0" name="binaryData"
                                                type="xmime:base64Binary" />
                                </xsd:sequence>
                        </xsd:complexType>
                        <xsd:element name="AttachmentRequest" type="tns:AttachmentType" />
                        <xsd:element name="AttachmentResponse" type="xsd:string" />
                </xsd:schema>
        </wsdl:types>
        <wsdl:message name="AttachmentRequest">
                <wsdl:part name="part1" element="tns:AttachmentRequest" />
        </wsdl:message>
        <wsdl:message name="AttachmentResponse">
                <wsdl:part name="part1" element="tns:AttachmentResponse" />
        </wsdl:message>
        <wsdl:portType name="MTOMServicePortType">
                <wsdl:operation name="attachment">
                        <wsdl:input message="tns:AttachmentRequest"
                                wsaw:Action="attachment" />
                        <wsdl:output message="tns:AttachmentResponse"
                                wsaw:Action="http://schemas.xmlsoap.org/wsdl/MTOMServicePortType/AttachmentResponse" />
                </wsdl:operation>
        </wsdl:portType>
        <wsdl:binding name="MTOMServiceSOAP11Binding"
                type="tns:MTOMServicePortType">
                <soap:binding transport="http://schemas.xmlsoap.org/soap/http"
                        style="document" />
                <wsdl:operation name="attachment">
                        <soap:operation soapAction="attachment" style="document" />
                        <wsdl:input>
                                <soap:body use="literal" />
                        </wsdl:input>
                        <wsdl:output>
                                <soap:body use="literal" />
                        </wsdl:output>
                </wsdl:operation>
        </wsdl:binding>
        <wsdl:binding name="MTOMServiceSOAP12Binding"
                type="tns:MTOMServicePortType">
                <soap12:binding transport="http://schemas.xmlsoap.org/soap/http"
                        style="document" />
                <wsdl:operation name="attachment">
                        <soap12:operation soapAction="attachment" style="document" />
                        <wsdl:input>
                                <soap12:body use="literal" />
                        </wsdl:input>
                        <wsdl:output>
                                <soap12:body use="literal" />
                        </wsdl:output>
                </wsdl:operation>
        </wsdl:binding>
        <wsdl:service name="MTOMSample">
                <wsdl:port name="MTOMSampleSOAP11port_http"
                        binding="tns:MTOMServiceSOAP11Binding">
                        <soap:address
                                location="http://localhost:8080/axis2/services/MTOMSample" />
                </wsdl:port>
                <wsdl:port name="MTOMSampleSOAP12port_http"
                        binding="tns:MTOMServiceSOAP12Binding">
                        <soap12:address
                                location="http://localhost:8080/axis2/services/MTOMSample" />
                </wsdl:port>
        </wsdl:service>
</wsdl:definitions>

The important point here is we import http://www.w3.org/2005/05/xmlmime and define the element 'binaryData' that utilizes MTOM.

The next step is to use the Axis2 tool 'WSDL2Java' to generate Java source files from this WSDL. See the 'Code Generator Tool' guide for more information. Here, we define an Ant task that chooses ADB (Axis2 Data Binding) as the databinding implementation. We name the WSDL above MTOMSample.wsdl and set the package name for the generated source files to 'sample.mtom.service'. Our Ant task for this example is:

        
<target name="generate.service">
        <java classname="org.apache.axis2.wsdl.WSDL2Java">
                <arg value="-uri" />
                <arg value="${basedir}/resources/MTOMSample.wsdl" />
                <arg value="-ss" />
                <arg value="-sd" />
                <arg value="-g" />
                <arg value="-p" />
                <arg value="sample.mtom.service" />
                <arg value="-o" />
                <arg value="${service.dir}" />
                <classpath refid="class.path" />
        </java>
</target>

Now we are ready to code. Let's edit output/src/sample/mtom/service/MTOMSampleSkeleton.java and fill in the business logic. Here is an example:

        public org.apache.ws.axis2.mtomsample.AttachmentResponse attachment(
                        org.apache.ws.axis2.mtomsample.AttachmentRequest param0) throws Exception
        {
                AttachmentType attachmentRequest = param0.getAttachmentRequest();
                Base64Binary binaryData = attachmentRequest.getBinaryData();
                DataHandler dataHandler = binaryData.getBase64Binary();
                File file = new File(
                                attachmentRequest.getFileName());
                FileOutputStream fileOutputStream = new FileOutputStream(file);
                dataHandler.writeTo(fileOutputStream);
                fileOutputStream.flush();
                fileOutputStream.close();
                
                AttachmentResponse response = new AttachmentResponse();
                response.setAttachmentResponse("File saved successfully.");
                return response;
        }

The code above receives a file and writes it to the disk using the given file name. It returns a message once it is successful. Now let's define the client:

        public static void transferFile(File file, String destination)
                        throws RemoteException {
                MTOMSampleStub serviceStub = new MTOMSampleStub();

                // Enable MTOM in the client side
                serviceStub._getServiceClient().getOptions().setProperty(
                                Constants.Configuration.ENABLE_MTOM, Constants.VALUE_TRUE);
                //Increase the time out when sending large attachments
                serviceStub._getServiceClient().getOptions().setTimeOutInMilliSeconds(10000);

                // Populating the code generated beans
                AttachmentRequest attachmentRequest = new AttachmentRequest();
                AttachmentType attachmentType = new AttachmentType();
                Base64Binary base64Binary = new Base64Binary();

                // Creating a javax.activation.FileDataSource from the input file.
                FileDataSource fileDataSource = new FileDataSource(file);

                // Create a dataHandler using the fileDataSource. Any implementation of
                // javax.activation.DataSource interface can fit here.
                DataHandler dataHandler = new DataHandler(fileDataSource);
                base64Binary.setBase64Binary(dataHandler);
                base64Binary.setContentType(dataHandler.getContentType());
                attachmentType.setBinaryData(base64Binary);
                attachmentType.setFileName(destination);
                attachmentRequest.setAttachmentRequest(attachmentType);

                AttachmentResponse response = serviceStub.attachment(attachmentRequest);
                System.out.println(response.getAttachmentResponse());
        }

The last step is to create an AAR with our Skeleton and the services.xml and then deploy the service. You can find the completed sample in the Axis2 standard binary distribution under the samples/mtom directory.

SOAP with Attachments (SwA) with Axis2

Receiving SwA Type Attachments

Axis2 automatically identifies SwA messages based on the content type. Axis2 stores references to the received attachment parts (MIME parts) in the MessageContext, preserving the order in which the attachments were received. Users can access binary attachments through the attachment API of the MessageContext, using the content-id of the MIME part as the key. Care must be taken to strip the "cid:" prefix when the content-id is taken from an "href" attribute. Users can access the message context from within a service implementation class using "MessageContext.getCurrentMessageContext()", as shown in the following example.

Note: Axis2 supports content-id based referencing only. Axis2 does not support Content Location based referencing of MIME parts.

  • Sample service which accesses a received SwA type attachment

public class SwA {
    public SwA() {
    }
    
    public void uploadAttachment(OMElement omEle) throws AxisFault {
        OMElement child = (OMElement) omEle.getFirstOMChild();
        OMAttribute attr = child.getAttribute(new QName("href"));
        
        //Content ID processing
        String contentID = attr.getAttributeValue();
        contentID = contentID.trim();
        if (contentID.substring(0, 3).equalsIgnoreCase("cid")) {
            contentID = contentID.substring(4);
        }
        
        MessageContext msgCtx = MessageContext.getCurrentMessageContext();
        Attachments attachment = msgCtx.getAttachmentMap();
        DataHandler dataHandler = attachment.getDataHandler(contentID);
        ...........
    }
}
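
The content-id handling in the sample above calls substring(0, 3) directly, which throws an exception for an href value shorter than three characters. A slightly more defensive variant is sketched below using only the JDK. It also shows how the value of a Content-Id MIME header (which is wrapped in angle brackets) maps to the same key. The class and method names are hypothetical and are not part of the Axis2 API.

```java
public class ContentIdKeys {

    // Turns an href value such as "cid:abc@apache.org" into "abc@apache.org".
    public static String fromHref(String href) {
        String value = href.trim();
        // Case-insensitive check for the "cid:" scheme; safe for short strings.
        if (value.regionMatches(true, 0, "cid:", 0, 4)) {
            value = value.substring(4);
        }
        return value;
    }

    // Turns a Content-Id header value such as "<abc@apache.org>" into "abc@apache.org".
    public static String fromHeader(String headerValue) {
        String value = headerValue.trim();
        if (value.length() >= 2 && value.startsWith("<") && value.endsWith(">")) {
            value = value.substring(1, value.length() - 1);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fromHref("cid:3936AE19FBED55AE4620B81C73BDD76E"));
        System.out.println(fromHeader("<3936AE19FBED55AE4620B81C73BDD76E>"));
    }
}
```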

Sending SwA Type Attachments

The user needs to set the "enableSwA" property to True in order to be able to send SwA messages. The Axis2 user is not expected to enable MTOM and SwA together. In such a situation, MTOM will get priority over SwA.

This can be set using the axis2.xml as follows.

  
        <parameter name="enableSwA">true</parameter>

"enableSwA" can also be set using the client side Options as follows

  
        options.setProperty(Constants.Configuration.ENABLE_SWA, Constants.VALUE_TRUE);

Users are expected to use the attachment API provided in the MessageContext to specify the binary attachments to be attached to the outgoing message as SwA type attachments. Client side SwA capability can be used only with the OperationClient API, since the user needs the ability to access the MessageContext.

  • Sample client which sends a message with SwA type attachments

   public void uploadFileUsingSwA(String fileName) throws Exception {

        Options options = new Options();
        options.setTo(targetEPR);
        options.setProperty(Constants.Configuration.ENABLE_SWA, Constants.VALUE_TRUE);
        options.setTransportInProtocol(Constants.TRANSPORT_HTTP);
        options.setSoapVersionURI(SOAP11Constants.SOAP_ENVELOPE_NAMESPACE_URI);
  
        ServiceClient sender = new ServiceClient(null,null);
        sender.setOptions(options);
        OperationClient mepClient = sender.createClient(ServiceClient.ANON_OUT_IN_OP);
        
        MessageContext mc = new MessageContext();   
        mc.setEnvelope(createEnvelope());
        // Attach the given file under the content-id "FirstAttachment"
        FileDataSource fileDataSource = new FileDataSource(fileName);
        DataHandler dataHandler = new DataHandler(fileDataSource);
        mc.addAttachment("FirstAttachment",dataHandler);
       
        mepClient.addMessageContext(mc);
        mepClient.execute(true);
    }

MTOM Backward Compatibility with SwA

The MTOM specification is designed to be backward compatible with the SOAP with Attachments specification. Even though the representation is different, both technologies have the same wire format. We can safely assume that any SOAP with Attachments endpoint can accept MTOM optimized messages and treat them as SOAP with Attachments messages, since any MTOM optimized message is a valid SwA message.

Note: The above backward compatibility was successfully tested against Axis 1.x.

  • A sample SwA message from Axis 1.x

Content-Type: multipart/related; type="text/xml"; 
          start="<9D645C8EBB837CE54ABD027A3659535D>";
                boundary="----=_Part_0_1977511.1123163571138"

------=_Part_0_1977511.1123163571138
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: binary
Content-Id: <9D645C8EBB837CE54ABD027A3659535D>

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="...."....>
    ........
                <source href="cid:3936AE19FBED55AE4620B81C73BDD76E" xmlns=""/>
    ........
</soapenv:Envelope>
------=_Part_0_1977511.1123163571138
Content-Type: text/plain
Content-Transfer-Encoding: binary
Content-Id: <3936AE19FBED55AE4620B81C73BDD76E>

Binary Data.....
------=_Part_0_1977511.1123163571138--

  • Corresponding MTOM message from Axis2

Content-Type: multipart/related; boundary=MIMEBoundary4A7AE55984E7438034;
                         type="application/xop+xml"; start="<0.09BC7F4BE2E4D3EF1B@apache.org>";
                         start-info="text/xml; charset=utf-8"

--MIMEBoundary4A7AE55984E7438034
content-type: application/xop+xml; charset=utf-8; type="application/soap+xml;"
content-transfer-encoding: binary
content-id: <0.09BC7F4BE2E4D3EF1B@apache.org>

<?xml version='1.0' encoding='utf-8'?>
<soapenv:Envelope xmlns:soapenv="...."....>
  ........
         <xop:Include href="cid:1.A91D6D2E3D7AC4D580@apache.org" 
                        xmlns:xop="http://www.w3.org/2004/08/xop/include">
         </xop:Include>
  ........

</soapenv:Envelope>
--MIMEBoundary4A7AE55984E7438034
content-type: application/octet-stream
content-transfer-encoding: binary
content-id: <1.A91D6D2E3D7AC4D580@apache.org>

Binary Data.....
--MIMEBoundary4A7AE55984E7438034--
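
To make the shared wire format more concrete, the following sketch splits a textual multipart/related body on its boundary and indexes the parts by content-id, which is the key a cid: reference resolves to. It is a deliberately simplified, JDK-only illustration (text only, no streaming, no binary safety) and is not how Axis2 parses MIME messages; the class and method names are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class MultipartSketch {

    // Returns part bodies keyed by Content-Id (angle brackets removed),
    // in the order the parts appear in the message.
    public static Map<String, String> partsByContentId(String body, String boundary) {
        Map<String, String> parts = new LinkedHashMap<>();
        String delimiter = "--" + boundary;
        for (String chunk : body.split(Pattern.quote(delimiter))) {
            String part = chunk.replace("\r\n", "\n");
            // Skip the preamble and the "--" left over from the closing delimiter.
            if (part.trim().isEmpty() || part.startsWith("--")) {
                continue;
            }
            int headerEnd = part.indexOf("\n\n"); // blank line ends the part headers
            if (headerEnd < 0) {
                continue;
            }
            String content = part.substring(headerEnd + 2).trim();
            for (String line : part.substring(0, headerEnd).split("\n")) {
                int colon = line.indexOf(':');
                if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("content-id")) {
                    String id = line.substring(colon + 1).trim();
                    if (id.startsWith("<") && id.endsWith(">")) {
                        id = id.substring(1, id.length() - 1);
                    }
                    parts.put(id, content);
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        String b = "MIMEBoundary4A7AE55984E7438034";
        String msg = "--" + b + "\r\n"
                + "content-type: application/xop+xml; charset=utf-8\r\n"
                + "content-id: <0.09BC7F4BE2E4D3EF1B@apache.org>\r\n\r\n"
                + "<soapenv:Envelope/>\r\n"
                + "--" + b + "\r\n"
                + "content-type: application/octet-stream\r\n"
                + "content-id: <1.A91D6D2E3D7AC4D580@apache.org>\r\n\r\n"
                + "Binary Data.....\r\n"
                + "--" + b + "--\r\n";
        System.out.println(partsByContentId(msg, b).keySet());
    }
}
```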

Advanced Topics

File Caching for Attachments

Axis2 comes with a handy file caching mechanism for incoming attachments, which gives Axis2 the ability to handle very large attachments without buffering them in memory at any time. Axis2 file caching streams the incoming MIME parts directly into files, after reading the MIME part headers.

Also, a user can specify a size threshold for file caching (in bytes). When this threshold value is specified, only the attachments whose size is bigger than the threshold value will be cached in files. Smaller attachments will remain in memory.

Note: You must specify a directory in which to temporarily store the attachments. Care should also be taken to clean that directory from time to time.
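
The threshold behaviour described above can be sketched in plain Java: read up to one byte more than the threshold, keep the data in memory if the stream ended, and otherwise spill everything to a temporary file. This is only an illustration of the idea under stated assumptions (Java 11 or later, hypothetical class and method names); it is not the Axis2 implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class ThresholdCache {

    // Result of storing one attachment: exactly one of the two fields is non-null.
    public static final class Stored {
        public final byte[] inMemory;
        public final Path onDisk;

        Stored(byte[] inMemory, Path onDisk) {
            this.inMemory = inMemory;
            this.onDisk = onDisk;
        }
    }

    public static Stored store(InputStream in, int sizeThreshold, Path tempDir)
            throws IOException {
        // Reading threshold + 1 bytes tells us whether the attachment is bigger
        // than the threshold without reading all of it first.
        byte[] head = in.readNBytes(sizeThreshold + 1);
        if (head.length <= sizeThreshold) {
            return new Stored(head, null); // small attachment stays in memory
        }
        Path file = Files.createTempFile(tempDir, "attachment", ".tmp");
        try (OutputStream out = Files.newOutputStream(file)) {
            out.write(head);    // bytes already consumed from the stream
            in.transferTo(out); // stream the remainder straight to the file
        }
        return new Stored(null, file);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("attachments");
        Stored small = store(new ByteArrayInputStream(new byte[100]), 4000, dir);
        Stored large = store(new ByteArrayInputStream(new byte[10000]), 4000, dir);
        System.out.println("small in memory: " + (small.inMemory != null));
        System.out.println("large on disk:   " + (large.onDisk != null));
    }
}
```

The temporary files created this way are not deleted automatically, which mirrors the note about cleaning the attachment directory.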

The following parameters need to be set in axis2.xml in order to enable file caching.

<axisconfig name="AxisJava2.0">

    <!-- ================================================= -->
    <!-- Parameters -->
    <!-- ================================================= -->
    <parameter name="cacheAttachments">true</parameter>
    <parameter name="attachmentDIR">temp directory</parameter>

    <parameter name="sizeThreshold">4000</parameter>
    .........
    .........
</axisconfig>

File caching for client side receiving can be enabled by setting the Options as follows.

options.setProperty(Constants.Configuration.CACHE_ATTACHMENTS, Constants.VALUE_TRUE);
options.setProperty(Constants.Configuration.ATTACHMENT_TEMP_DIR, TempDir);
options.setProperty(Constants.Configuration.FILE_SIZE_THRESHOLD, "4000");