#servicenow #integration

Importing a million records daily in ServiceNow (Pre-Quebec)

Lessons learned building a high-performance scripted REST endpoint and import set pipeline for million-record daily imports in ServiceNow.

Note: Make sure to check out part 2 of this series, as this is no longer the most performant solution.

I had a client ask me to create an integration with a service that will create/update 1 million records a day and store that information in ServiceNow.

I am not used to putting my performance hat on when working with ServiceNow, because most of my previous projects focused on client-side optimization (utilizing JavaScript functions or reducing the number of calls to the backend).

So I took this as an opportunity to jump into the deep end.

ServiceNow is great about documentation, but it can be hard to find official best practices. This post is dedicated to my lessons learned and experiences. I hope it helps someone else in the future.

Architecture

Let’s start by deciding on the architecture. There are a couple of questions we have to ask ourselves:

  1. How is the data getting into ServiceNow? Is it going to be a push or pull model?
  2. What technology are we going to use to import the records?
  3. What performance considerations do we need to keep in mind?
  4. What scope are we doing this in?

In this example, we are going to answer them as follows:

  1. The external application is going to push data into ServiceNow. The external application wants to keep a standard format (JSON) for pushing to multiple applications. This narrows down our options:
    1. We won’t be using Data Sources to pull the data.
    2. We can’t utilize ServiceNow’s standard Import Set APIs.
    3. We could explore some of the Event Management APIs, but we want to consider licensing / modules. In this case the client didn’t have ITOM.
    4. Data streaming would be a nice way to do this (and probably another blog post), but the client isn’t on Paris.
    5. This leaves us with a Scripted REST endpoint, which is the solution I chose.
  2. Since we now know how data will be received, we have a few options on how to import the data.
    1. We could just use a standard GlideRecord and inspect the payload and set the fields for each record sent.
      1. I chose not to do this because I don’t want a ServiceNow admin to have to write a lot of code if the payload changes in the future.
      2. Another reason is performance. If we are opening 1 million GlideRecords, it could be painful for ServiceNow to handle.
    2. I went with the Import Set JavaScript API.
  3. The first performance question that comes to mind is cleaning up the data. Since we are using a staging table and transform map, once we successfully import the data, we need to clean up the staged rows; we don’t want a million records growing daily. The second performance question is: will import sets survive having 1 million records in a single import set?
    1. So what is the most performant way to clean up the million records? I chose the Table Cleanup Policy (Auto-flush) over a scheduled job.
    2. So what is the best structure for large import sets? Smaller batches: 100 import sets with 10,000 records each are handled far better than 1 import set with 1 million records.
  4. This one is always tricky, but it relates to number 2. In this example, the client wanted it in the HR-related scope.

Now that we understand the architecture, we can draw it and start building it.
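Before jumping into the build, here is the shape of the payload the endpoint below expects, matching the {data:[]} spec agreed on with the external application. The field names (field1, field2, field3) are just placeholders for the real attributes:

{
  "data": [
    { "field1": "value", "field2": "value", "field3": "value" },
    { "field1": "value", "field2": "value", "field3": "value" }
  ]
}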

The Magic

Alright. Enough talk. Show me the magic.

Scripted REST Endpoint

var staging_table = 'some_table';

var requestBody = request.body.data; //parsed JSON body: {data:[...]}
var responseBody = {};
var data = requestBody.data; //the array of records to stage
var sID, impSet;
//Make sure the payload exists in the request
if (data && typeof data !== 'undefined' && data.length > 0) {
    var size = 10000; //batch size per import set (drop to 10 for testing)
    var i;
    var len = data.length; //capture the total before splice() empties the array

    var gr = new GlideRecord(staging_table);
    //Check if the payload needs to be split across multiple import sets
    if (data.length > size) {
        //multiple import sets
        while (data.length) {
            var arr = data.splice(0, size);

            for (i = 0; i < arr.length; i++) {
                gr.initialize();
                gr.u_field1 = arr[i].field1;
                gr.u_field2 = arr[i].field2;
                gr.u_field3 = arr[i].field3;
                sID = gr.insert();
                //retrieve Import Set sys_id from the last record inserted in the batch
                if (i == arr.length - 1) {
                    impSet = getImportSet(sID);
                }
            }
            //close the active import set
            closeImportSet(impSet);
        }
    } else {
        //just 1 import set
        for (i = 0; i < data.length; i++) {
            gr.initialize();
            gr.u_field1 = data[i].field1;
            gr.u_field2 = data[i].field2;
            gr.u_field3 = data[i].field3;
            sID = gr.insert();

            //retrieve Import Set sys_id from the last record inserted in the batch
            if (i == data.length - 1) {
                impSet = getImportSet(sID);
            }
        }
        //close the active import set
        closeImportSet(impSet);
    }
    response.setStatus(201);
    responseBody.size = len;
    response.setBody(responseBody);
} else {
    response.setStatus(400);
    responseBody.error = 'Please provide a payload in the following format: {data:[]}.';
    response.setBody(responseBody);
}

//Return the import set sys_id (as a string) for a given staging-table row
function getImportSet(sID) {
    var grStaging = new GlideRecord(staging_table);
    if (grStaging.get(sID)) {
        return String(grStaging.sys_import_set);
    }
}
// Close the import set (mark it processed) so the next insert creates a new one
function closeImportSet(impSet) {
    var util = new global.HRIntegrationsUtils();
    util.setImportSetState(impSet, 'processed');
}
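One piece not shown above is the HRIntegrationsUtils script include that closeImportSet calls. Here is a minimal sketch of what its setImportSetState method could look like, assuming it simply updates the state field on the sys_import_set record so the next staging-table insert starts a fresh import set:

var HRIntegrationsUtils = Class.create();
HRIntegrationsUtils.prototype = {
    initialize: function() {},

    //Set the state of an import set (e.g. 'processed')
    setImportSetState: function(importSetSysId, state) {
        var grImpSet = new GlideRecord('sys_import_set');
        if (grImpSet.get(importSetSysId)) {
            grImpSet.state = state;
            grImpSet.update();
        }
    },

    type: 'HRIntegrationsUtils'
};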

Auto-flush Record (make sure it is in the same scope as the staging table)

Table name: <the name of the staging table>
Match field: sys_created_on
Age in seconds: 86,400 (1 day)
Active: true
Cascade delete: false
Conditions: sys_import_stateINprocess,cancelled^EQ
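If you would rather create the cleanup policy from a background script than through the form, a sketch like the following should do it. I am assuming the stock sys_auto_flush table (which backs Table Cleanup policies) and its standard field names here; run it in the same scope as the staging table:

var flush = new GlideRecord('sys_auto_flush');
flush.initialize();
flush.tablename = 'some_table'; //the staging table
flush.matchfield = 'sys_created_on';
flush.age_in_seconds = 86400; //1 day
flush.cascade_delete = false;
flush.active = true;
flush.conditions = 'sys_import_stateINprocess,cancelled^EQ';
flush.insert();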