Friday, March 29, 2024

Calculate Values Dynamically In Lightning Web Components With Proxy

If you've ever worked with Lightning Web Components, you know that they don't support expressions, like Aura does. The usual recommendation is to copy the data and add any properties that you need to calculate. For example, one common use case is to create a record link from a record Id.

  @track accountRecord;
  @wire(getRecord, { recordId: '$recordId', fields: FIELDS }) handleGetRecord({ error, data }) {
    if(data) {
      this.accountRecord = {
        ...data,
        accountUrl: `/${data.id}`
      };
    } else {
      this.accountRecord = undefined;
    }
    this.error = error;
  }

This works perfectly fine, but what about the case where you need to handle dozens of values? If the only tool you have is the one above, you'll need to write at least one line per property, and you'll have to remember to add and remove properties as your markup changes.

  @track accountRecord;
  @wire(getRecord, { recordId: '$recordId', fields: FIELDS }) handleGetRecord({ error, data }) {
    if(data) {
      this.accountRecord = {
        ...data,
        accountUrl: `/${data.id}`,
        lastModifiedByIdUrl: `/${data.lastModifiedById}`,
        recordTypeIdUrl: `/${data.recordTypeId}`
      };
    } else {
      this.accountRecord = undefined;
    }
    this.error = error;
  }

What if there was a better way?

Introducing Proxy. With Proxy, you can intercept certain types of operations, such as reading or writing a property, and react to them. Most developers have only met Proxy when calling console.log(someProperty) and finding a Proxy object in the console instead of the data.

However, we can make Proxy work for us. By writing a custom proxy, we can automatically generate links for fields just by adding a prefix or suffix. Doing this requires only a few lines of code. They can even be nested on top of each other if you wanted to create a library of functions.

For our example, we're going to make a small proxy handler that will create a link given a field value. In this case, we'll say that it will always be a link to a Salesforce ID.

To do this, we just need two small adjustments. First, we need to create a proxy handler. In our case, we just need to override the get method, so we'll write this:

const linkify = {
  get(target, key) {
    // Property exists
    if(Object.prototype.hasOwnProperty.call(target, key)) {
      return target[key];
    }
    // Property can be turned into a link:
    if(typeof key === 'string' && key.indexOf('_toLink') > -1) {
      return `/${target[key.substring(0, key.indexOf('_toLink'))]}`;
    }
    // We could add other functions here
    return null;
  }
}
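
As a quick check outside of a component, here is a minimal sketch that applies the same get trap to a plain object and stacks a second handler on top of it. The upperCase handler, its _toUpper suffix, and the sample record are assumptions made up for this illustration; they are not part of LWC or of the wire adapter.

```javascript
// Condensed copy of the get trap above, so this sketch runs on its own.
const linkify = {
  get(target, key) {
    if (Object.prototype.hasOwnProperty.call(target, key)) {
      return target[key];
    }
    if (typeof key === 'string' && key.indexOf('_toLink') > -1) {
      return `/${target[key.substring(0, key.indexOf('_toLink'))]}`;
    }
    return null;
  }
};

// Hypothetical second handler: "<field>_toUpper" returns the field in uppercase.
const upperCase = {
  get(target, key) {
    if (typeof key === 'string' && key.endsWith('_toUpper')) {
      const value = target[key.slice(0, -'_toUpper'.length)];
      return typeof value === 'string' ? value.toUpperCase() : value;
    }
    // Anything else is passed down to the wrapped object (here, the inner proxy).
    return target[key];
  }
};

// Sample data standing in for a record; the values are invented.
const sample = { id: '001000000000001AAA', name: 'Acme' };

// Proxies can wrap other proxies, so the handlers stack.
const record = new Proxy(new Proxy(sample, linkify), upperCase);

console.log(record.id_toLink);    // handled by linkify
console.log(record.name_toUpper); // handled by upperCase
console.log(record.name);         // plain property, returned unchanged
```

Reading record.id_toLink falls through the outer handler to linkify, while record.name_toUpper is answered by the outer handler, which still reads the raw value through the inner proxy.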

Now, we just need to wrap our data with the Proxy:

  @wire(getRecord, { recordId: '$recordId', fields: FIELDS }) handleGetRecord({ error, data }) {
    if(data) {
      this.accountRecord = new Proxy(data, linkify);
    } else {
      this.accountRecord = undefined;
    }
    this.error = error;
  }

Now, we don't need to worry about adding or removing any more JavaScript. We can just refer to these new properties in our markup:


  <a href={accountRecord.id_toLink}>{accountRecord.fields.Name.value}</a>

As you can see, we can create links for any field by adding a suffix. We could do other things as well, such as calculating dates or numbers, transforming text to lower- or uppercase. We can also use the set function to perform validation or data cleanup. As we've demonstrated in this article, Proxy can be a useful tool to reduce the amount of code we have to write.
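
The set idea can be sketched the same way. This is only an illustration on a plain object (records delivered by a wire adapter are read-only, so in a component you would likely wrap your own copy or a draft object). The cleanup handler and the rule that Name is required are hypothetical.

```javascript
// Hypothetical handler: trims strings on write and rejects a blank Name.
const cleanup = {
  set(target, key, value) {
    const cleaned = typeof value === 'string' ? value.trim() : value;
    if (key === 'Name' && !cleaned) {
      throw new TypeError('Name is required');
    }
    target[key] = cleaned;
    // A set trap reports success by returning true.
    return true;
  }
};

// A plain draft object standing in for editable form data.
const draft = new Proxy({}, cleanup);

draft.Name = '  Acme  ';   // stored without the surrounding spaces
draft.Employees = 25;      // non-string values are stored as-is

let rejected = false;
try {
  draft.Name = '   ';      // blank after trimming, so the trap throws
} catch (e) {
  rejected = true;
}
console.log(draft.Name, draft.Employees, rejected);
```

Because the trap throws before assigning, the earlier valid value of Name is left in place when a bad value is rejected.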


Tuesday, January 25, 2022

Is HTML A Programming Language?

There is a lot of debate on the Internet about whether or not HTML is, in fact, a programming language. On one side, you have those that state that HTML is a declarative programming language, and those that state otherwise are either gate-keeping, or downplaying the importance of HTML. On the other side, you have those that state that HTML is not a programming language, because it is a markup language. That's as far as the debate usually goes, but most people don't know exactly why one side is correct. There is only one correct answer, but to get there, we need to understand how computers work.

A computer uses RAM to store the data it is working with in a randomly addressable area of storage. In most computers, this memory is just a contiguous area of storage, though some devices, like certain video game consoles, might use different banks of memory that may not actually be logically contiguous. For simplicity, we'll just assume that memory is linear and can be used for any type of data. In most programs, memory is used in one of three ways.

The first classification of memory is known as "code." This type of memory is loaded by the OS or virtual machine, then typically marked read-only so it cannot be modified during runtime. Historically, viruses would use the fact that code memory was writeable, so it could modify itself while it ran, defeating the anti-viruses of the time. Later, various mechanisms were created so that executing code could not be modified, and data could not be executed as code. Any error in code typically results in a program halting.

After this, we have an area of memory called the stack. It typically grows from a fixed address downwards towards zero. As such, this type of memory is a fixed size, and usually cannot be changed once set up. The processor uses the stack to store local variables, and also to remember which functions were called so that control can return to the calling function when a "return" is executed. Going past address 0 results in a stack overflow, and going past the fixed address is a stack underflow. Either of these conditions can cause the program to crash, since it no longer remembers what came before. Stacks use a First In, Last Out design, and are typically modified through push, pop, call, and return instructions.

Finally, we have the third area of memory, called the heap. This is dynamically allocated memory, and is therefore virtually unlimited in size, given that free memory exists. When a program allocates space on the heap, the memory address of that heap block is assigned to a local variable on the stack, often called a pointer, or a reference. When the program is done with the data, that memory then needs to be freed. In many languages, this is automatic, but in some older languages the program has to specifically free the memory that is no longer used. If the addresses are "lost" before the heap memory is freed, this results in a behavior known as a memory leak.

At this point, we should briefly talk about the concept of a "virtual machine." A virtual machine emulates a physical processor, but can execute its own set of instructions that may not be native to the physical processor. It has a section defined as code, a stack, and heap memory. For purposes of deciding if a language is a programming language, a virtual machine is the same as a physical processor. In other words, even though the instructions are technically data as far as the physical processor is concerned, they are still considered machine instructions in the context of a virtual machine, as those instructions are directly converted to physical instructions.

We can also talk about interpreters. They are essentially virtual machines, but have the added ability that a developer can pause the machine at any time, change the instructions as they want, or even run adhoc functions without first compiling any code. Many modern languages have some sort of interpreter that can be paused, inspected, and modified at any time. This helps facilitate shorter development cycles than traditional languages that need to be compiled and debugged.

Now, let's discuss what a programming language is. It is source code that is somehow directly converted into instructions that the processor uses to execute some algorithm. These algorithms use data on the stack and the heap in order to process input, perform calculations, and then generate some kind of output. For example, given a simple program that adds two numbers, the algorithm reads two inputs, performs a mathematical calculation, then outputs the result.

Alan Turing was one of the first to describe computer logic, in the form of the Turing Machine. This machine had three parts: the main processor, registers that informed the processor of its state, and an infinitely long tape used to read and store the data used in its operation. This basic description of a Turing Machine is the same general philosophy that all modern devices are built on. The code, stack, and heap memory areas are, more or less, the three parts of the Turing Machine.

Given this, we can now examine the evidence for or against HTML as a programming language. There is no access to a code section, there is no stack, no control statements, no allocation of dynamic memory objects, no addresses or references, or anything else remotely approaching the description of a virtual machine or processor. Some people will make an argument that a collection of pages forms a Turing Machine, but that only points to the file structure being a finite automaton. The HTML itself is still just data.

In reality, the HTML is parsed into a data structure called the Document Object Model. This DOM is loaded into the browser's heap memory, and then rendered as output on the screen or destined for a PDF or printer. At no point does the DOM implement any algorithms. It is used as input for the rendering engine, a part of the browser's code, in order to generate output. This is not the same as being a program whose instructions are executed one at a time in order.

As further evidence against HTML being a programming language, one should note that code cannot modify code. You can generate and execute code on the fly, but you cannot directly modify code from code in virtually any language. If HTML were a programming language, then JavaScript should not be able to modify it directly, as this would be a major security risk. HTML already has plenty of potential security vulnerabilities without also being an executable set of instructions.

We have different terms to define a job's skill set. Designers work only with things considered data, including HTML, images, and so on. Programmers work only with programming languages, typically text apps and server-side code. Developers take on the role of both programmers and designers, utilizing CSS, HTML, and JavaScript in equal parts. In each of these three roles, significantly skilled people earn a very respectable living.

The area of Information Technology is that of Computer Science. Science is a collection of knowledge, and that includes classifying various types of data. We cannot call HTML a programming language and preserve the purity of that scientific knowledge. HTML is not, by any technical definition, a programming language. It doesn't compile to code, is not executed in a real or virtual processor, it can be modified in real-time by JavaScript, and it cannot perform even the most simple calculations or conditional branches without CSS or JavaScript. It is truly a document format, much like a Word Document or PDF.

In conclusion, we should now understand two things. HTML is not a programming language, as it has none of the hallmarks of what we consider a programming language, and HTML is a vitally important tool for any developer and designer working on web-based applications, as anything with a web-based UI must use HTML at some level. Those that use HTML effectively are definitely in high demand, and will continue to be so for the foreseeable future. Nobody is trying to "gate keep" anybody by defining HTML as not-a-programming-language. They are simply using the pure definition of what a programming language is, and wish to see the term preserved.

Wednesday, March 25, 2020

Small Framework for Server-Side Calls in Aura

Promises in Aura

In Aura, we spend a lot of time writing the same method over and over again: calling the server. Now, we can include a small script that calls the server and uses Promises to promote small, reusable code. This one method means that we only have to write about three lines of code to call the server, including error handling. This requires minimal adaptation, but it should make it far easier to write server calls in a consistent manner.

The Static Resource

We simply need a small piece of code that we will load in our components.

aura.js


    window.server = function({component, method, params}) {
        return new Promise(
            $A.getCallback((resolve, reject) => 
                serverSync({component:component, 
                            method:method, 
                            params:params,
                            success:resolve,
                            error:reject}))
        );
    }
    window.serverSync = function({component, method, params, success, error}) {
        var action = component.get(method);
        params && action.setParams(params);
        action.setCallback(this, result => {
            switch (result.getState()) {
                case "DRAFT": case "SUCCESS":
                    success.call(this, result.getReturnValue());
                    break;
                default:
                    error.call(this, result.getError());
            }
        });
        $A.enqueueAction(action);
    }

These two simple functions allow us to write small code for all of our components. We can add further features later, but that is outside the scope of this post. For now, we are simply interested in calling the server as efficiently as possible.

Using in a Component or App

With this small script, we now need to write our components a little differently. Instead of handling the aura:valueInit handler like we used to, we now need to wait for our simple script to load. It is small and loads nearly instantly, so there should not be any visible lag time. Here is what a component should start off with now:


    <aura:component controller="staticDemo">
    <ltng:require scripts="{!$Resource.aura}" afterScriptsLoaded="{!c.init}" />

Calling the Server

Now that we have loaded our script, we can call the server in several different ways.

Calling Once

In the most usual case, we can call the server with just one method and get the result.


    server({component: component, method: "c.myMethod", params: {key: value}})
    .then(result => component.set("v.someAttribute", result))
    .catch(error => helper.displayError(error));

Calling Multiple at Once

Similarly, we can call more than one method at a time, and continue when we're done:


    Promise.all([
        server({component: component, method: "c.method1"})
            .then(result => component.set("v.value1", result)),
        server({component: component, method: "c.method2"})
            .then(result => component.set("v.value2", result)),
        server({component: component, method: "c.method3"})
            .then(result => component.set("v.value3", result))
    ])
    .then(results => {})
    .catch(error => helper.displayError(error));

Calling Multiple in Serial

Sometimes, we need the result of a previous value to call the next step. We can do this with a Promise chain:


    server({component:component, method:"c.method1"})
    .then(result => (component.set("v.value1", result),
          server({component:component, method:"c.method2", params: {someValue: result}})))
    .then(result => component.set("v.value2", result))
    .catch(error => helper.displayError(error));

Supporting Cacheable Methods

Cacheable (a.k.a. Storable) methods may end up calling the resolve/reject method more than once. This is not compatible with Promises. If you find yourself in a situation where you need to call those methods, use the serverSync method. While it is not actually a synchronous method, it bears a different name to remind you that you need to pass in your own success and error handlers, and that you cannot chain them with promises. This is still arguably more legible than writing the same fifteen or so lines of code every time.


    serverSync({component:component, 
                method:"c.serverAction",
                success: result => component.set("v.value1", result),
                error: error => helper.displayError(error)});
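
To see why a callback that fires twice does not fit a Promise, here is a small sketch in plain JavaScript. The fakeStorableAction function is a made-up stand-in that simply invokes its callback twice (once for a cached value, once for a fresh one); it is not the Aura API.

```javascript
// Hypothetical stand-in for a storable action: the callback runs twice,
// first with a cached value and then with a fresh one.
function fakeStorableAction(callback) {
  callback('cached value');
  callback('fresh value');
}

// A plain callback sees both values.
const seenByCallback = [];
fakeStorableAction(value => seenByCallback.push(value));

// A Promise settles only once, so the second call to resolve is ignored
// and the fresh value never reaches the then() handler.
const seenByPromise = [];
const done = new Promise(resolve => fakeStorableAction(resolve))
  .then(value => seenByPromise.push(value));

done.then(() => {
  console.log(seenByCallback); // both values
  console.log(seenByPromise);  // only the first value
});
```

This is the reason the callback-style serverSync is the safer choice for cacheable methods: the handler runs for every response instead of only the first.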

The Commas

You might notice the use of (param) => (operation1, operation2) in the examples above. This design is used to take advantage of the comma operator. I feel that this is marginally more legible than using code blocks ({}). With the comma operator, the expression on the left is evaluated and its result is discarded, then the expression on the right is evaluated and its result becomes the value of the whole expression. This allows us to chain Promises together while storing the intermediate results back to our component. Do not forget the parentheses, as they are necessary to make sure the comma operator works as we want it to.
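
A minimal sketch of the comma operator next to its code-block equivalent, in plain JavaScript. The store function and the log array are placeholders standing in for component.set.

```javascript
// Placeholder for component.set: records every value it is given.
const log = [];
const store = value => log.push(value);

// Comma form: store(x) runs first and its result is discarded,
// then x * 2 is evaluated and returned by the arrow function.
const withComma = x => (store(x), x * 2);

// Equivalent form using a code block and an explicit return.
const withBlock = x => { store(x); return x * 2; };

const a = withComma(3);
const b = withBlock(4);
console.log(a, b, log);
```

Without the parentheses, x => store(x), x * 2 would be parsed as two separate expressions, and the arrow function would return the result of store(x) instead.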

Conclusion

I found this pattern to be immensely useful, and I wanted to share it with others, as well as document this for my future self, should I ever forget to use it.

Thursday, March 2, 2017

The Aggregate Query Update Pattern

Introduction

We have all read the bulkification post on the developerforce board, but there are very few, if any, official documentation examples of how to actually go about this in practice. I have posted about the Aggregate Query Update pattern on several questions on the official forums, but it still seems to come up from time to time. So, I decided to write this quick post to describe what it is, and why a developer should use it. I named this pattern after the three steps one needs to bulkify any type of trigger or method. Most problems with governor limits can be resolved simply by following this pattern. We'll go over the steps in detail, and then look at an example.

Aggregate

In the first step, we aggregate together values that we want to process or query. The specific type of aggregate will depend on exactly what we are trying to do. Generally speaking, this will consist of creating a Set, and then using a loop to get the values we want. In some cases, we can skip this step, because we can use Trigger.new or Trigger.old to get the Id values of the records, particularly useful when you want to query child records. So, in general, the aggregate method will look like the following code.

Set<Object> values = new Set<Object>();
for(SObject record: records) {
    values.add(record.Field);
}
values.remove(null);

We should replace Object with the type of data we're using (typically String, Id, or Date, but we can use any type that's required), replace SObject with the type of record we're using, and Field with the name of the field. Here's a concrete example using the above pattern.

Set<String> emails = new Set<String>();
for(Contact record: Trigger.new) {
    emails.add(record.Email);
}
emails.remove(null);

We generally want to remove null values, because we cannot efficiently query null values, and we generally want to ignore those values anyways. In some cases, we can skip this step, such as when we're getting the Id or Name value of a record, because these fields are never null.

Query

In the second step, we typically need to get data from the database. This will generally either be the parents, children, or matching records we're interested in. This typically means that we will use the values we aggregated before. Generally, we need to have a Map to do this. In the general case, that means we will follow one of the following patterns.

Parent Records or Unique Values

Map<Object, SObject> records = new Map<Object, SObject>();
for(SObject record: [SELECT Field FROM SObject WHERE Field = :values]) {
    records.put(record.Field, record);
}

Records by Id

Map<Id, SObject> records = new Map<Id, SObject>(
    [SELECT Field FROM SObject WHERE Id = :values]
);

Child Records or Non-Unique Values

Map<Object, SObject[]> records = new Map<Object, SObject[]>();
for(Object value: values) {
    records.put(value, new SObject[0]);
}
for(SObject record: [SELECT Field FROM SObject WHERE Field = :values]) {
    records.get(record.Field).add(record);
}

Once we have the results from the query, we can then perform some action using the results.

Update

The third and final step is to perform an update. This involves using the Map to update values as appropriate. This usually follows one of two designs, as follows.

Copying Parent Data to Children from Child Records

// "parents" is the Map of parent records built in the Query step
for(SObject record: records) {
    if(record.ParentId != null) {
        record.Field = parents.get(record.ParentId).Field;
    }
}

Copying Parent Data to Children from Parent Records

// "children" is the Map of parent Id to child records built in the Query step
for(SObject parent: records) {
    for(SObject child: children.get(parent.Id)) {
        child.Field = parent.Field;
    }
}

Updating Parents with Children Data

for(SObject record: parents.values()) {
    record.Field = 0;
}
for(SObject record: records) {
    parents.get(record.ParentId).Field += record.Field;
}

Uses for the Aggregate Query Update Pattern

There are many things we can do with this pattern. We can detect and prevent or merge duplicate records, implement our own roll-up summary calculations, and copy values between children and parent records. In other words, we can perform almost all basic business functions using this pattern. It should be the most common tool in our arsenal. By using this one simple tool, we can reduce the number of DML statements and queries we use, and speed up bulk data loads.

Examples

Copy Account Address to Contact Address

trigger updateContactAddresses on Account (after update) {
    Map<Id, Contact[]> contacts = new Map<Id, Contact[]>();
    Contact[] updates = new Contact[0];
    // Aggregate: start every account with an empty list of contacts
    for(Account record: Trigger.new) {
        contacts.put(record.Id, new Contact[0]);
    }
    // Query: group the child contacts by their parent account
    for(Contact record: [SELECT AccountId FROM Contact WHERE AccountId = :Trigger.new]) {
        contacts.get(record.AccountId).add(record);
    }
    // Update: copy the billing address to each contact
    for(Account accountRecord: Trigger.new) {
        for(Contact contactRecord: contacts.get(accountRecord.Id)) {
            contactRecord.MailingStreet = accountRecord.BillingStreet;
            contactRecord.MailingCity = accountRecord.BillingCity;
            contactRecord.MailingState = accountRecord.BillingState;
            contactRecord.MailingPostalCode = accountRecord.BillingPostalCode;
            contactRecord.MailingCountry = accountRecord.BillingCountry;
        }
        updates.addAll(contacts.get(accountRecord.Id));
    }
    update updates;
}

Copy Account Phone when Account Changes

trigger copyPhoneOnContactCreate on Contact (before insert, before update) {
    Set<Id> accountIds = new Set<Id>();
    Contact[] changes = new Contact[0];
    // Aggregate: collect contacts that are new or whose account changed
    for(Contact record: Trigger.new) {
        if(record.AccountId != null && (Trigger.isInsert ||
            Trigger.oldMap.get(record.Id).AccountId <> record.AccountId)) {
            changes.add(record);
            accountIds.add(record.AccountId);
        }
    }
    if(accountIds.isEmpty()) {
        return;
    }
    // Query: load the parent accounts by Id
    Map<Id, Account> accounts = new Map<Id, Account>(
        [SELECT Phone FROM Account WHERE Id = :accountIds]
    );
    // Update: copy the phone number to each changed contact
    for(Contact record: changes) {
        record.Phone = accounts.get(record.AccountId).Phone;
    }
}

Sum All Contacts on an Account

trigger updateContactCount on Contact (after insert, after update, after delete, after undelete) {
    Set<Id> accountIds = new Set<Id>();
    // Aggregate: collect the accounts affected by new and old versions of the records
    if(Trigger.isInsert || Trigger.isUpdate || Trigger.isUndelete) {
        for(Contact record: Trigger.new) {
            accountIds.add(record.AccountId);
        }
    }
    if(Trigger.isUpdate || Trigger.isDelete) {
        for(Contact record: Trigger.old) {
            accountIds.add(record.AccountId);
        }
    }
    accountIds.remove(null);
    // Default every account to zero, in case it has no contacts left
    Map<Id, Account> updates = new Map<Id, Account>();
    for(Id accountId: accountIds) {
        updates.put(accountId, new Account(Id=accountId, ContactCount__c=0));
    }
    // Query: count the contacts per account
    for(AggregateResult result: [SELECT AccountId Id, COUNT(Id) sum FROM Contact WHERE
        AccountId = :accountIds GROUP BY AccountId]) {
        updates.get((Id)result.get('Id')).ContactCount__c = (Decimal)result.get('sum');
    }
    // Update: save the new counts
    update updates.values();
}

Conclusion

I hope that this post will prove useful to future visitors. This is the most common pattern we use, and one of the most simple ones to learn. Using this pattern can save us governor limits and keep our code easy to read. Of course, I've left out some parts, such as error handling, minor optimizations, and so on, in order to avoid cluttering the code too much, but I hope that this code will help people get started using more efficient code.

Monday, January 11, 2016

With Sharing, Without Sharing, and You

Using Sharing Responsibly

The "with sharing" and "without sharing" keywords seem to be a poorly understood feature of the salesforce.com platform. Even I have made mistakes in some of the answers I've previously written on the topic. This post will document what the "with sharing" and "without sharing" keywords actually do, when you should use a particular mode, and the consequences of abusing these seemingly simple keywords.

What Is Sharing?

Sharing is what determines if a user can do something with a particular record, based solely on the Share tables. When sharing is enabled in code, all DML operations will be checked against the Share table for any affected records to see if the user is allowed to update that record based solely on the entries in the Share table for the object. This means that a user who tries to update a record when they only have read access to the record won't be able to; they'll get an error.

What Sharing Isn't

Other than a precious few profile permissions, which you could accurately describe as a "sharing rule" granted to that one user, the "with sharing" and "without sharing" keywords mean nothing in regards to enforcing profile permissions, such as the ability for a user to read or write to a particular object or field. In other words, a user that could edit a record if the profile allowed it will still be able to update that record from Apex Code. As developers, we need to make sure that any such updates are either "system actions" that should never fail, or we need to check to see if a user can access a particular object before performing an action on their behalf.

Why Not Use "With Sharing" All The Time?

There are several compelling reasons, but it really boils down to this: most of the time, the system will protect data that needs protecting without us doing anything special to protect the data. In most cases, using neither keyword will result in the correct behavior. There are a few exceptions, however, when you must use "with sharing" or "without sharing," in order to guarantee the correct behavior. Using these keywords all the time has an associated penalty in terms of CPU time, so they should be used sparingly.

When Do I Use "With Sharing"?

If you have code that is called from a user-facing interface, such as a Visualforce page, and it performs a query or performs any DML on behalf of a user, use "With Sharing." Without this keyword, it is possible for users to view or update records they do not have access to. There are a few times when this is desirable, of course, but those are really the exceptions to the rule. If a class does not perform any DML operation or query, do not specify either mode.

When Do I Use "Without Sharing"?

Usually, never. Since the default mode for code is usually "without sharing," there's rarely an opportunity to use this keyword. You'd use it to "break out" of sharing mode (say, to update a normally read-only record) while in sharing mode. This is far less common than you'd think, because usually code called from within sharing mode needs to be with sharing, and code called from without sharing mode needs to be without sharing. If you do need to use this keyword, it may indicate that something else is wrong in your code. Of course, sometimes it's completely unavoidable, but every attempt should be made to find an alternative before resorting to "Without Sharing."

When Can I Omit The Sharing Mode?

You generally only need to specify the sharing mode for classes that actually perform a query or DML operation. Classes that don't fall in either category should not be marked as either with sharing or without sharing. This means they will inherit their permissions from the current mode. The same is also true of trigger utility/helper classes as well as Visualforce helper classes. Generally speaking, there's no need to ever mark a utility class as "with sharing" or "without sharing." It should be able to operate correctly based on its current sharing mode.

Should I Use Either Mode?

If you're not sure if a specific mode should be used, simply ask yourself these questions:

Am I using any DML operations or queries?

If the answer is no, leave the sharing as the default mode, because otherwise you're simply wasting CPU time for nothing. This is true even if the class happens to be a Visualforce page controller.

Does this class act as a controller for a page or component?

If the answer is yes, you should always use "with sharing" to prevent unauthorized updates.

Do I need to expose potentially restricted data, or update records the user may not have access to?

If the answer is yes, you should usually use "without sharing" to allow the unauthorized access. This should only be used as a last resort, because usually the default model is correct.

Summary

Except for Visualforce controllers, most classes should actually be written using the default sharing model. Visualforce pages, custom REST API calls, and the like should specify "with sharing," while most other classes should use the default model. "Without Sharing" should only be used when the default model is causing issues that can't be resolved by fixing sharing rules, profile permissions, etc. In a typical project, the majority of your classes will use the default model, the majority of page controllers will use "with sharing," and a precious few will use "without sharing."

Thursday, January 7, 2016

Converting XML To Native Objects

The Problem

Sometimes, we are forced to work with XML. Sometimes, what we want to do is to load that data into an object so we can use it. Usually, we do not even care about most of the attributes, but simply want the names turned into fields, and the values placed in the appropriate fields. It turns out that writing long, convoluted functions that are specific to each XML format is incredibly fragile, incredibly hard to troubleshoot when things go wrong, and generally very CPU intensive.

The Solution

I will show you a way that you can effortlessly change XML into a Map, then into JSON, and finally into a native object. It does this in less than 100 lines of code, no matter how big your XML file is (not including the cost of writing the Apex Code you are transforming into, of course). There are some inherent limitations you have to watch out for in this demo version, but I hope that somebody finds this useful. Feel free to tweak this code any way that you see fit for your particular purpose. Also note that this version doesn't handle just strings, but also handles Boolean values, numbers, dates, and times. They will be converted to a native format on a best-effort basis.

The Code

This code has been heavily documented so readers can get a feel for how it works, so I am not going to spend a lot of time explaining it. Please leave comments if you have any questions.


public class XmlToJson {
    // Try to determine some data types by pattern
    static Pattern 
        boolPat = Pattern.compile('^(true|false)$'),
        decPat = Pattern.compile('^[-+]?\\d+(\\.\\d+)?$'), 
        datePat = Pattern.compile('^\\d{4}.\\d{2}.\\d{2}$'), 
        timePat = Pattern.compile('^\\d{4}.\\d{2}.\\d{2} '+
                                  '(\\d{2}:\\d{2}:\\d{2} ([-+]\\d{2}:\\d{2})?)?$');
    // Primary function to decode XML
    static Map<Object, Object> parseNode(Dom.XmlNode node, Map<Object, Object> parent) {
        // Iterate over all child elements for a given node
        for(Dom.XmlNode child: node.getChildElements()) {
            // Pull out some information
            String nodeText = child.getText().trim(), name = child.getName();
            // Determine data type
            Object value = 
                // Nothing
                String.isBlank(nodeText)? null:
            // Try boolean
            boolPat.matcher(nodeText).find()? 
                (Object)Boolean.valueOf(nodeText):
            // Try decimals
            decPat.matcher(nodeText).find()?
                (Object)Decimal.valueOf(nodeText):
            // Try dates
            datePat.matcher(nodeText).find()?
                (Object)Date.valueOf(nodeText):
            // Try times
            timePat.matcher(nodeText).find()? 
                (Object)DateTime.valueOf(nodeText):
            // Give up, use plain text
            (Object)nodeText;
            // We have some text to process
            if(value != null) {
                // We already have a value here, convert it to a list
                if(parent.containsKey(name)) {
                    try {
                        // We already have a list, so just add it
                        ((List<Object>)parent.get(name)).add(value);
                    } catch(Exception e) {
                        // We don't have a list, so convert to a list
                        parent.put(name, new List<Object>{parent.get(name), value});
                    }
                } else {
                    // Store a new value
                    parent.put(name, value);
                }
            } else if(child.getNodeType() == Dom.XmlNodeType.ELEMENT) {
                // If it's not a comment or text, recursively process the data
                Map<Object, Object> temp = parseNode(child, new Map<Object, Object>());
                // If at least one node, add a new element into the array
                if(!temp.isEmpty()) {
                    // Again, create or update a list if we have a value
                    if(parent.containsKey(name)) {
                        try {
                            // If it's already a list, add it
                            ((List<Object>)parent.get(name)).add(temp);
                        } catch(Exception e) {
                            // Otherwise, convert the element into a list
                            parent.put(name, new List<Object> { parent.get(name), temp });
                        }
                    } else {
                        // New element
                        parent.put(name, temp);
                    }
                }
            }
        }
        return parent;
    }
    // This function converts XML into a Map
    public static Map<Object, Object> parseDocumentToMap(Dom.Document doc) {
        return parseNode(doc.getRootElement(), new Map<Object, Object>());
    }
    // This function converts XML into a JSON string
    public static String parseDocumentToJson(Dom.Document doc) {
        return JSON.serialize(parseDocumentToMap(doc));
    }
    // This function converts XML into a native object
    // If arrays are expected, but not converted automatically, this call may fail
    // If so, use the parseDocumentToMap function instead and fix any problems
    public static Object parseDocumentToObject(Dom.Document doc, Type klass) {
        return JSON.deserialize(parseDocumentToJson(doc), klass);
    }
}
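
As a rough sketch of how the class above might be called when you only need the intermediate Map, the snippet below parses a small document and reads the values back out. The XML and its element names (order, id, paid, item) are made up for illustration; the casts assume the best-effort type detection picks Decimal for numbers and Boolean for true/false, and that repeated elements are collected into a list.

```apex
// Hypothetical sample input; the element names are invented for this example
Dom.Document doc = new Dom.Document();
doc.load('<order><id>42</id><paid>true</paid><item>Pen</item><item>Ink</item></order>');

// Stop at the Map stage so individual values can be inspected
Map<Object, Object> parsed = XmlToJson.parseDocumentToMap(doc);

// Numbers are detected by pattern and stored as Decimal
Decimal orderId = (Decimal)parsed.get('id');
// true/false values are stored as Boolean
Boolean paid = (Boolean)parsed.get('paid');
// Repeated element names are collected into a List<Object>
List<Object> items = (List<Object>)parsed.get('item');
```

Note that the root element itself (order here) does not appear as a key; the Map holds only its children.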

The Unit Test

Of course, no code would be worth having without a unit test, so here is the related unit test that you would use to deploy this code to production. It has 100% coverage, and demonstrates the proper way to write a unit test.


@isTest class XmlToJsonTest {
    @isTest static void test() {
        Dom.Document doc = new Dom.Document();
        doc.load(
            '<a>'+
            '<b><c>Hello World</c><d>2016-05-01</d><e>2016-05-01 '+
               '11:29:00 +03:00</e><f>true</f><g>3.1415</g><h>Two</h><h>Parts</h></b>'+
            '<b><c>Hello World</c><d>2016-05-01</d><e>2016-05-01 '+
               '11:29:00 +03:00</e><f>true</f><g>3.1415</g><h>Two</h><h>Parts</h></b>'+
            '</a>'
        );
        A r = (A)XmlToJson.parseDocumentToObject(doc, A.class);
        System.assertNotEquals(null, r);
        System.assertNotEquals(null, r.b);
        for(Integer i = 0; i != 2; i++) {
            System.assertNotEquals(null, r.b[i].c);
            System.assertNotEquals(null, r.b[i].d);
            System.assertNotEquals(null, r.b[i].e);
            System.assertNotEquals(null, r.b[i].f);
            System.assertNotEquals(null, r.b[i].g);
            System.assertNotEquals(null, r.b[i].h);
        }
    }
    class A {
        public B[] b;
    }
    class B {
        public String c;
        public Date d;
        public DateTime e;
        public Boolean f;
        public Decimal g;
        public String[] h;
    }
}

Warnings

This code may behave oddly if you use anything other than XML formatted as in the example above. Mixing in text, comments, or CDATA in places could cause problems. Also, as commented in the code, the JSON parser may fail if it expects an array and does not find one, which happens when an element that can repeat appears only once. This means you may need to do some post-processing; that is the reason why there are three functions, so that a developer can stop at any stage of the processing to do some manipulation.
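
One possible shape for that post-processing step, as a minimal sketch: the helper below forces a key to hold a list before the Map is serialized. XmlListFixer and ensureList are hypothetical names made up for this illustration and are not part of the class above.

```apex
public class XmlListFixer {
    // Hypothetical helper: wraps a lone value in a list so that the JSON
    // parser finds the array it expects. Missing keys are left untouched.
    public static void ensureList(Map<Object, Object> node, Object key) {
        Object value = node.get(key);
        if(value != null && !(value instanceof List<Object>)) {
            node.put(key, new List<Object> { value });
        }
    }
}
```

To use it, call XmlListFixer.ensureList on the Map returned by parseDocumentToMap for each key that should be an array, then run the Map through JSON.serialize and JSON.deserialize yourself.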

Future Enhancements

Here are a few things I thought of while writing this code, but did not implement simply because I wanted the code to be as straightforward as possible:

Improve empty element support
Assume that members ending in "s" are plural (and thus, automatically create a list)
Add CDATA support
Provide additional data types, like Blobs

Conclusion

Just a few lines of code can help you parse a variety of XML formats into native objects. The code runs reasonably fast, and is far easier to read and maintain than functions that may span hundreds or thousands of lines of conditional branches and loops. If you found this code useful, please let me know. If you have any suggestions for improvements (within the limited scope of making this post more useful), I'd like to hear them as well.

Monday, January 4, 2016

Avoid Checking For Non-Existent Nulls

Introduction

At some point, every developer who has worked in Apex Code has received a System.NullPointerException. It's inevitable. Sometimes the cause is figured out quickly; other times it is not, and developers start sprinkling null guards everywhere to try and keep it from happening again. Eventually, code may end up with so many of these guards that it is less readable, more verbose than it should be, and, worst of all, running slower than it could be, sometimes by a significant amount.

In this post, we're going to explore things that are never null, so readers can avoid the most common null checks I've observed in code, which will improve the code's performance. Since we only have so much time to execute code, known as the Apex CPU limit, it's in our best interest to reduce the amount of time our code takes to run. Besides, your users will thank you for a faster, more responsive user interface and API.

Queries

One common theme that I see in code is that developers will check to make sure a query result is not null before trying to access it. These checks carry a penalty in the form of Apex CPU time wasted on a check that will never prevent a System.NullPointerException. While it is true that fields returned from the database may be null, depending on their data type, the list returned from a query will never be null; a query that matches no rows returns an empty list. This code only serves to confuse newer developers into thinking that a query might return null instead of an empty list, thus perpetuating null checks.

Example


Account[] myAccounts = [SELECT Id, Name FROM Account WHERE OwnerId = :UserInfo.getUserId()];
// The query results will never be null. Why did I check this?
if(myAccounts != null && myAccounts.size() > 0) { ...
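
For comparison, here is a sketch of the same query without the redundant guard. It assumes nothing beyond the query shown above; the System.debug calls are placeholders for whatever work the real code would do.

```apex
Account[] myAccounts = [SELECT Id, Name FROM Account WHERE OwnerId = :UserInfo.getUserId()];
// The list is never null, so an emptiness check is all that is needed
if(!myAccounts.isEmpty()) {
    System.debug(myAccounts.size() + ' accounts owned by the current user');
}
// Iterating an empty list is safe without any guard at all
for(Account record: myAccounts) {
    System.debug(record.Name);
}
```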

Query Result Elements

Similarly, there seems to be some confusion about what the results in the array may look like. I have actually seen code similar to the following:


for(Account record: [SELECT Id, Name FROM Account]) {
    if(record != null) { ...

This condition will always be true. A single record from a query will not be null. We do not need to check that individual records are not somehow null before trying to do something with them.

Query Standard Universally Required Fields

When we query records normally, such as a non-aggregate SOQL call, or any SOSL call, we always get the Id field back in the query, and it will never be null. There's never any reason to check the Id field to see if it is null when it's returned from a query. Obviously, the Id may be null if we cloned the record, cleared the fields, constructed a new record, or accessed a related Id via a parent relationship (e.g. a query that returns Contact records and Account.Id will have a null Id if AccountId is also null). However, unless we've manipulated the records in some way, we can safely assume that the Id is present on the record directly returned from the query.

Query Relationships

One subtle point about the system is that relationships may be null, but do not throw System.NullPointerException if you do not check for them. You can safely avoid checking if a relationship is null if it came from a query, although you will want to check if the field you accessed was null. This only applies to statically compiled references, however, so if you're using dynamic record navigation via SObject.getSObject or SObject.getSObjects, you will want to check for nulls, because you can get an exception.

Examples


// Example 1
for(Contact record: [SELECT Account.Name FROM Contact]) {
    if(record.Account.Name != null) { ...
// Example 2
Account[] accounts = new Account[0];
for(Account record: [SELECT (SELECT Id FROM Contacts) FROM Account]) {
    record.No_of_Contacts__c = record.Contacts.size();
    accounts.add(record);
}
update accounts;

Note

Even though relationships are protected against nulls, individual fields are not. Do not assume this code is safe:


// If Account or Account.Name is null, Name evaluates to null and toUpperCase will fail with an exception.
if(contactRecord.Account.Name.toUpperCase().equals('HELLO WORLD')) { ...
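
The dynamic navigation case mentioned earlier in this section does need a guard. As a minimal sketch, assuming a Contact query that includes Account.Name, the check might look like this:

```apex
for(Contact record: [SELECT Account.Name FROM Contact]) {
    // Unlike the static reference record.Account.Name, getSObject hands back
    // a null reference when the contact has no account
    SObject parent = record.getSObject('Account');
    // Guard before calling get, which would otherwise throw on a null parent
    String accountName = parent == null ? null : (String)parent.get('Name');
    System.debug(accountName);
}
```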

Checking Your Own Variables

Generally speaking, you should avoid having null variables. You should be able to tell which values are null or not null based simply on their origins. You should never have to guess whether your own variables are null, especially for sets, lists, and maps. Conversely, you should generally assume that any field that comes from the database, Visualforce bindings, or callouts may contain nulls, unless it's obvious that a value cannot possibly be null. Fields that will never be null include "Id", "CreatedDate", "CreatedById", "LastModifiedDate", "LastModifiedById", and "Name", as well as any field that is Boolean or required by the system.
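
One way to put this into practice, sketched below, is to initialize collections where they are declared and reserve null checks for values that come from outside your code. ContactCollector and its members are hypothetical names used only for this illustration.

```apex
public class ContactCollector {
    // Initialized at declaration, so these are never null
    private Set<Id> accountIds = new Set<Id>();
    private Map<Id, List<Contact>> contactsByAccount = new Map<Id, List<Contact>>();

    public void add(Contact record) {
        // AccountId comes from the database and may be null, so check it
        if(record.AccountId == null) {
            return;
        }
        accountIds.add(record.AccountId);
        if(!contactsByAccount.containsKey(record.AccountId)) {
            contactsByAccount.put(record.AccountId, new List<Contact>());
        }
        // No null guard needed: the list was created just above if missing
        contactsByAccount.get(record.AccountId).add(record);
    }

    public Integer accountCount() {
        // No null guard needed on our own set
        return accountIds.size();
    }
}
```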

Learning The System Library

The system library prefers not to return null values. For example, you may safely assume that List.size will always return a non-negative, non-null value. There is never a need to check if the value returned is null. Generally speaking, any function that will not accept a null value will not return a null value. The following code may safely be run without checking for nulls:


Date firstSundayOfMonth = Date.today().toStartOfMonth().addDays(6).toStartOfWeek();

Each function in that chain returns another Date, so we are guaranteed to receive a valid date value in firstSundayOfMonth. Most other functions also follow this behavior; the usual way to signal a bad input value is by way of exceptions, so as long as you're checking the parameters you pass to the system library, most functions will never return a null value. Those functions that do are more of an exception than the rule. In fact, those methods are usually explicitly documented as returning a null value when possible.
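
Map.get is one example of a method that does return null, in this case when the key is not present. The sketch below uses a made-up map of counts to show the null and one way to avoid it.

```apex
Map<String, Integer> counts = new Map<String, Integer>{ 'a' => 1 };
// A key that exists returns its value
Integer present = counts.get('a');
// A key that does not exist returns null rather than throwing
Integer missing = counts.get('b');
// Checking containsKey first lets us substitute a default instead
Integer safe = counts.containsKey('b') ? counts.get('b') : 0;
```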

Conclusion

Apex Code is strongly typed, but not as optimized as Java, so taking a few extra moments to learn which functions return a null value, and which do not, will go a long way toward writing code that is easier to read and maintain, is less likely to run into governor limits, and is easier to maintain code coverage for.