Thursday, January 7, 2016

Converting XML To Native Objects

The Problem

Sometimes, we are forced to work with XML. Sometimes, all we want to do is load that data into an object so we can use it. Usually, we do not even care about most of the attributes; we simply want the element names turned into fields, and the values placed in the appropriate fields. It turns out that writing long, convoluted functions that are specific to each XML format is incredibly fragile, incredibly hard to troubleshoot when things go wrong, and generally very CPU intensive.

The Solution

I will show you a way to effortlessly change XML into a Map, then into JSON, and finally into a native object. The converter does this in fewer than 100 lines of code, no matter how big your XML file is (not counting the Apex classes you are deserializing into, of course). There are some inherent limitations you have to watch out for in this demo version, but I hope that somebody finds this useful. Feel free to tweak this code any way that you see fit for your particular purpose. Also note that this version does not handle just strings; it also handles Boolean values, numbers, dates, and times. They will be converted to a native format on a best-effort basis.
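To make the three stages concrete, here is a minimal sketch of how the class shown later in this post might be called. The XmlToJsonDemo and InvoiceData classes and the sample XML are hypothetical and exist only for illustration; the only real calls are the three public XmlToJson functions.

```apex
public class XmlToJsonDemo {
    // Hypothetical target class; field names must match the XML element names
    public class InvoiceData {
        public String customer;
        public Decimal total;
        public Boolean paid;
    }

    public static InvoiceData run() {
        Dom.Document doc = new Dom.Document();
        doc.load('<invoice><customer>Acme</customer>' +
                 '<total>19.95</total><paid>true</paid></invoice>');

        // Stage 1: XML to Map (the root element itself is not stored as a key)
        Map<Object, Object> asMap = XmlToJson.parseDocumentToMap(doc);
        System.debug(asMap.get('customer'));

        // Stage 2: XML to JSON string
        String asJson = XmlToJson.parseDocumentToJson(doc);
        System.debug(asJson);

        // Stage 3: XML to native object
        return (InvoiceData)XmlToJson.parseDocumentToObject(doc, InvoiceData.class);
    }
}
```

You can stop at whichever stage gives you what you need; the Map is handy when you want to inspect or adjust the data before it becomes an object.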

The Code

This code is heavily commented so that readers can get a feel for how it works, so I am not going to spend a lot of time explaining it. Please leave a comment if you have any questions.
public class XmlToJson {
    // Try to determine some data types by pattern
    static Pattern
        boolPat = Pattern.compile('^(true|false)$'),
        decPat = Pattern.compile('^[-+]?\\d+(\\.\\d+)?$'),
        datePat = Pattern.compile('^\\d{4}.\\d{2}.\\d{2}$'),
        timePat = Pattern.compile('^\\d{4}.\\d{2}.\\d{2} '+
            '(\\d{2}:\\d{2}:\\d{2} ([-+]\\d{2}:\\d{2})?)?$');

    // Primary function to decode XML
    static Map<Object, Object> parseNode(Dom.XmlNode node, Map<Object, Object> parent) {
        // Iterate over all child elements for a given node
        for(Dom.XmlNode child: node.getChildElements()) {
            // Pull out some information
            String nodeText = child.getText().trim(), name = child.getName();
            // Determine data type
            Object value =
                // Nothing
                String.isBlank(nodeText)? null:
                // Try boolean
                boolPat.matcher(nodeText).find()? (Object)Boolean.valueOf(nodeText):
                // Try decimals
                decPat.matcher(nodeText).find()? (Object)Decimal.valueOf(nodeText):
                // Try dates
                datePat.matcher(nodeText).find()? (Object)Date.valueOf(nodeText):
                // Try times
                timePat.matcher(nodeText).find()? (Object)DateTime.valueOf(nodeText):
                // Give up, use plain text
                (Object)nodeText;
            // We have some text to process
            if(value != null) {
                // We already have a value here, convert it to a list
                if(parent.containsKey(name)) {
                    try {
                        // We already have a list, so just add it
                        ((List<Object>)parent.get(name)).add(value);
                    } catch(Exception e) {
                        // We don't have a list, so convert to a list
                        parent.put(name, new List<Object>{parent.get(name), value});
                    }
                } else {
                    // Store a new value
                    parent.put(name, value);
                }
            } else if(child.getNodeType() == Dom.XmlNodeType.ELEMENT) {
                // If it's not a comment or text, recursively process the data
                Map<Object, Object> temp = parseNode(child, new Map<Object, Object>());
                // If at least one node, add a new element into the array
                if(!temp.isEmpty()) {
                    // Again, create or update a list if we have a value
                    if(parent.containsKey(name)) {
                        try {
                            // If it's already a list, add it
                            ((List<Object>)parent.get(name)).add(temp);
                        } catch(Exception e) {
                            // Otherwise, convert the element into a list
                            parent.put(name, new List<Object> { parent.get(name), temp });
                        }
                    } else {
                        // New element
                        parent.put(name, temp);
                    }
                }
            }
        }
        return parent;
    }

    // This function converts XML into a Map
    public static Map<Object, Object> parseDocumentToMap(Dom.Document doc) {
        return parseNode(doc.getRootElement(), new Map<Object, Object>());
    }

    // This function converts XML into a JSON string
    public static String parseDocumentToJson(Dom.Document doc) {
        return JSON.serialize(parseDocumentToMap(doc));
    }

    // This function converts XML into a native object
    // If arrays are expected, but not converted automatically, this call may fail
    // If so, use the parseDocumentToMap function instead and fix any problems
    public static Object parseDocumentToObject(Dom.Document doc, Type klass) {
        return JSON.deserialize(parseDocumentToJson(doc), klass);
    }
}

The Unit Test

Of course, no code would be worth having without a unit test, so here is the related unit test that you would use to deploy this code to production. It has 100% coverage, and demonstrates the proper way to write a unit test.
@isTest
class XmlToJsonTest {
    @isTest
    static void test() {
        Dom.Document doc = new Dom.Document();
        doc.load(
            '<a>'+
            '<b><c>Hello World</c><d>2016-05-01</d><e>2016-05-01 '+
            '11:29:00 +03:00</e><f>true</f><g>3.1415</g><h>Two</h><h>Parts</h></b>'+
            '<b><c>Hello World</c><d>2016-05-01</d><e>2016-05-01 '+
            '11:29:00 +03:00</e><f>true</f><g>3.1415</g><h>Two</h><h>Parts</h></b>'+
            '</a>'
        );
        A r = (A)XmlToJson.parseDocumentToObject(doc, A.class);
        System.assertNotEquals(null, r);
        System.assertNotEquals(null, r.b);
        for(Integer i = 0; i != 2; i++) {
            System.assertNotEquals(null, r.b[i].c);
            System.assertNotEquals(null, r.b[i].d);
            System.assertNotEquals(null, r.b[i].e);
            System.assertNotEquals(null, r.b[i].f);
            System.assertNotEquals(null, r.b[i].g);
            System.assertNotEquals(null, r.b[i].h);
        }
    }

    class A {
        public B[] b;
    }

    class B {
        public String c;
        public Date d;
        public DateTime e;
        public Boolean f;
        public Decimal g;
        public String[] h;
    }
}

Warnings

This code may behave oddly if your XML is not structured like the example above. Mixing text, comments, or CDATA in with child elements could cause problems. Also, as commented in the code, the JSON parser may fail if it expects an array and does not find one; a repeating element that happens to appear only once is stored as a single value rather than a list. This means you may need to do some post-processing; that is why there are three functions, so that a developer can stop at any stage of the processing to do some manipulation.
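As a rough illustration of that kind of post-processing, the sketch below stops at the Map stage, forces selected keys into lists, and only then deserializes. The XmlToJsonListFix class, its forceList helper, and the Root and Child classes are hypothetical names I made up for this example; they are not part of the code above.

```apex
public class XmlToJsonListFix {
    // Hypothetical target classes that expect arrays for "b" and "h"
    public class Child { public String c; public String[] h; }
    public class Root { public Child[] b; }

    // Wrap the value stored under key in a list, unless it already is one
    static void forceList(Map<Object, Object> node, String key) {
        if (node.containsKey(key) && !(node.get(key) instanceof List<Object>)) {
            node.put(key, new List<Object>{ node.get(key) });
        }
    }

    public static Root parse(String xml) {
        Dom.Document doc = new Dom.Document();
        doc.load(xml);

        // Stop at the Map stage so the structure can be adjusted
        Map<Object, Object> data = XmlToJson.parseDocumentToMap(doc);
        forceList(data, 'b');

        // Apply the same fix to "h" inside every "b" entry
        if (data.containsKey('b')) {
            for (Object item : (List<Object>)data.get('b')) {
                if (item instanceof Map<Object, Object>) {
                    forceList((Map<Object, Object>)item, 'h');
                }
            }
        }

        // Finish the remaining two stages by hand
        return (Root)JSON.deserialize(JSON.serialize(data), Root.class);
    }
}
```

With this in place, a call such as XmlToJsonListFix.parse('<a><b><c>Hello</c><h>One</h></b></a>') should deserialize even though both b and h appear only once.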

Future Enhancements

Here are a few things I thought of while writing this code, but did not implement simply because I wanted the code to be as straightforward as possible:
  • Improve empty element support
  • Assume that members ending in "s" are plural (and thus, automatically create a list)
  • Add CDATA support
  • Provide additional data types, like Blobs
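
For what it is worth, the plural idea from the list above could be sketched roughly like this. PluralListSketch and wrapPlurals are hypothetical names, and the heuristic is deliberately naive: it would also wrap singular names such as "address" or "status", so treat it as a starting point only.

```apex
public class PluralListSketch {
    // Recursively wrap any value whose key ends in "s" in a list,
    // unless the value already is a list
    public static void wrapPlurals(Map<Object, Object> node) {
        // Iterate over a copy of the keys so the map can be updated safely
        for (Object key : new List<Object>(node.keySet())) {
            Object value = node.get(key);
            if (value instanceof Map<Object, Object>) {
                // Descend into a single nested element
                wrapPlurals((Map<Object, Object>)value);
            } else if (value instanceof List<Object>) {
                // Descend into each nested element of an existing list
                for (Object item : (List<Object>)value) {
                    if (item instanceof Map<Object, Object>) {
                        wrapPlurals((Map<Object, Object>)item);
                    }
                }
            }
            if (String.valueOf(key).endsWith('s') && !(value instanceof List<Object>)) {
                node.put(key, new List<Object>{ value });
            }
        }
    }
}
```

You would call it on the result of parseDocumentToMap before serializing the Map to JSON.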

Conclusion

Just a few lines of code can help you parse a variety of XML formats into native objects. The code runs reasonably fast, and it is far easier to read and maintain than functions spanning hundreds or thousands of lines of conditional branches and loops. If you found this code useful, please let me know. If you have any suggestions for improvements (within the limited scope of making this post more useful), I'd like to hear them as well.

6 comments:

  1. Nice post, Brian! Your code looks promising. Perhaps i’ll use the tooling api to get XML and then your code to put it into a salesforce object
