Track Users Who Are Offline in Google Analytics

Use this solution to automatically collect data from users who are offline, and send the data to Google Analytics once they are back online.

The steady increase in mobile use over the last few years has introduced some new challenges for web analytics. It’s not just about mismatches in the tracking model (the concept of sessions is even more absurd for apps than it is for desktop browsing), but about something more fundamental. Think about it: if a visitor browses the website using a mobile device, there’s a significant chance of them losing internet connectivity and unintentionally going offline.

Actually, it’s enough for them to simply traverse an area with poor coverage - if the HTTP requests sent by the browser don’t complete in time, they time out, and the hits are lost.

For Google Analytics pageviews it’s not such a big deal, because if the user sees the web page, it’s very likely the first pageview has completed. However, what about all the other interactions that we want to track while the user has no internet connection? That’s right - we’ll lose these interactions, since the requests are dropped by the browser and never picked up, even when the user gets their connection back.

In this article, the brilliant David Vallejo and I will offer a solution for retrieving these hits initially dropped by the browser due to the internet connection being offline. OK, who am I kidding, this is all David. I’m just a glorified editor at this point.

Anyway, let’s frame it like this: the visitor is viewing our contact page, and we have an event queued up for when they click on our email contact information. However, the visitor is also on the subway, and the moment they click the email, they enter a tunnel. Oh noes! They lose their internet connection (damn subways without WiFi), and we miss our vital conversion tracking.

That’s the premise. Here’s the execution.

In this article, we’ll touch upon a number of fairly technical concepts, but we’ll try to frame them so that they make sense in the big picture.

  1. Universal Analytics Tasks API

  2. The browser’s Storage API

  3. Sending Google Analytics hits with a delay (the &qt parameter)

  4. Sending custom POST and HEAD requests (HTTP protocol)

  5. Batching Universal Analytics Hits

All of these concepts are useful if you want to understand the mechanisms that the web browser employs to compile and dispatch requests to Universal Analytics.

Let’s go!

1. Universal Analytics Tasks API

Each time a send command is dispatched to the ga() global method, a series of tasks are run in sequence by the analytics.js library. These tasks do a number of things, such as construct the payload, validate it, sample the requests, and finally dispatch the requests to the Universal Analytics endpoint.

The neat thing about these tasks is that we can intercept them and modify them using the API provided by analytics.js.

You can find a list of all available tasks in the API here. However, in this article we will focus on just a very special, very significant task: customTask. It’s a new task introduced very recently (here’s a guide by Simo).

This task is true to its name - it’s entirely customizable. It runs first in the task queue, so you can use it to configure the tracker object or even to set other tasks up for customization.

In this guide, we’ll use customTask to modify the sendHitTask. This way we can check if the user has internet connectivity when sendHitTask is run, and we can do a number of things if the user has dropped their connection. We use customTask instead of directly accessing sendHitTask simply because this way is much more Google Tag Manager-friendly.
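The wrapping pattern can be tried out without analytics.js at all. The sketch below is only an illustration: createFakeModel, wrapSendHitTask and the log array are hypothetical names, and the fake model implements nothing but the get and set methods that the real model object exposes.

```javascript
// Hypothetical stand-in for the analytics.js model: only get() and set().
function createFakeModel(fields) {
  return {
    get: function(name) { return fields[name]; },
    set: function(name, value) { fields[name] = value; }
  };
}

var log = [];

// Pretend this is the library's own sendHitTask.
var model = createFakeModel({
  hitPayload: 'v=1&t=event&ec=contact&ea=click',
  sendHitTask: function(m) { log.push('sent: ' + m.get('hitPayload')); }
});

// A customTask that wraps sendHitTask instead of replacing it.
function wrapSendHitTask(customTaskModel) {
  var originalSendHitTask = customTaskModel.get('sendHitTask');
  customTaskModel.set('sendHitTask', function(sendModel) {
    originalSendHitTask(sendModel);           // native behaviour first
    log.push('our code ran after the send');  // then our own logic
  });
}

wrapSendHitTask(model);           // analytics.js would run customTask first
model.get('sendHitTask')(model);  // ...and the (now wrapped) sendHitTask later
```

Because the original function is kept in a closure, the hit is still dispatched exactly as before, and our own logic simply runs right after it.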

In short, here’s the process:

customTask modifies sendHitTask with our code before the hit is sent.

In order to detect if the user has internet connectivity, we could simply poll our own web server endpoint. However, that wouldn’t be a good litmus test since it could be just Google’s servers that are not responding. That’s why we’ll actually poll Google’s endpoint to check if the user has the connectivity required for Google Analytics tracking.

2. The offline tracker

The solution is that with every single request to Universal Analytics, we’ll send an HTTP HEAD request to the Universal Analytics endpoint. If the endpoint isn’t responding, we can infer that the user does not have connectivity for communicating with Google Analytics, and so we’ll store the hit into browser storage until such a time that internet coverage is restored.

We’ll use localStorage for the queue, but we’ll need to cheat a little. localStorage itself doesn’t introduce any structure or a deeper data model - it just processes key-value pairs. So, to give us some additional flexibility, we’ll use the Lockr open-source database layer. It’s a simple and efficient framework, and it has pretty solid browser support.

This solution picks up the Universal Analytics hit payload from sendHitTask, and if there is no internet connection, this hit is stored in localStorage with the timestamp of the request. Thus, when we later do manage to send the stored hit, we can send it at its original timestamp to Universal Analytics.
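To make the storage step concrete, here is a rough sketch of what queuing a hit boils down to. It does not use Lockr itself: createMemoryStorage, readQueue and queueHit are hypothetical helpers, and the in-memory object only imitates the getItem/setItem shape of localStorage so that the logic can be run anywhere. The key name and the {data: [...]} wrapper mirror what the Lockr code in this article writes when the prefix is ga_ and the set is called hits.

```javascript
// Minimal in-memory object with the getItem/setItem shape of localStorage (illustrative only).
function createMemoryStorage() {
  var data = {};
  return {
    getItem: function(key) { return data.hasOwnProperty(key) ? data[key] : null; },
    setItem: function(key, value) { data[key] = String(value); }
  };
}

var QUEUE_KEY = 'ga_hits'; // prefix 'ga_' + set name 'hits'

// Read the queued hits, or an empty list if nothing has been stored yet.
function readQueue(storage) {
  var raw = storage.getItem(QUEUE_KEY);
  return raw ? JSON.parse(raw).data : [];
}

// Store the payload together with the time the hit was generated.
function queueHit(storage, payload, nowMs) {
  var hits = readQueue(storage);
  var entry = payload + '&qt=' + nowMs;
  if (hits.indexOf(entry) === -1) { // a set holds each value only once
    hits.push(entry);
    storage.setItem(QUEUE_KEY, JSON.stringify({ data: hits }));
  }
  return hits.length;
}

var storage = createMemoryStorage();
queueHit(storage, 'v=1&t=event&ea=click', 1500000000000);
queueHit(storage, 'v=1&t=event&ea=click', 1500000000000); // duplicate, ignored
var queued = readQueue(storage);
```

Note that at this point the qt value is still an absolute timestamp. It only becomes a real queue time when the hit is eventually sent.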

2.1. The &qt parameter

The &qt Measurement Protocol parameter stands for queue time. Basically, you can set a number in milliseconds in that parameter, and the hit will be sent with a timestamp that many milliseconds in the past. For example, if I know that the hit I want to send actually happened 45 minutes ago, I can set the parameter to:

&qt=2700000

There’s just an odd little quirk you need to know about &qt. The latest you can send the displaced hit is at 03:59:59 the following day (in the timezone of the Google Analytics view the hit is being sent to). Thus, the maximum value for &qt is 27 hours, 59 minutes, and 59 seconds (in milliseconds), if the hit occurred at exactly midnight, and you then send it the following morning, just before 4 AM.

Yes, it might be difficult to grasp, so we’ll go with the official recommendation: avoid sending hits more than 4 hours in the past, since there’s no guarantee they will get sent.
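The arithmetic is simple enough to sketch. The helpers below (minutesToQueueTime, toQueueTime and isWithinRecommendedWindow) are hypothetical names, and the sketch assumes the hit was stored with an absolute millisecond timestamp in its qt parameter, which is then swapped for the elapsed time when the hit is finally sent.

```javascript
var FOUR_HOURS_MS = 4 * 60 * 60 * 1000;

// Convert a duration in minutes to a &qt value in milliseconds.
function minutesToQueueTime(minutes) {
  return minutes * 60 * 1000;
}

// Swap the stored timestamp for the time elapsed since the hit was generated.
function toQueueTime(storedHit, nowMs) {
  var match = storedHit.match(/&qt=(\d+)/);
  if (!match) return storedHit; // nothing to convert
  var elapsed = Math.max(0, nowMs - Number(match[1]));
  return storedHit.replace(/&qt=\d+/, '&qt=' + elapsed);
}

// The recommendation is to avoid hits that are more than four hours old.
function isWithinRecommendedWindow(queueTimeMs) {
  return queueTimeMs <= FOUR_HOURS_MS;
}

var stored = 'v=1&t=event&ea=click&qt=1500000000000';
var resent = toQueueTime(stored, 1500000000000 + minutesToQueueTime(45));
// resent is 'v=1&t=event&ea=click&qt=2700000'
```

A check like isWithinRecommendedWindow could be used to drop stale hits from the queue instead of sending them with no guarantee of being processed.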

3. The HTTP HEAD request

So what is this HEAD request and why are we using it? Well, it’s identical to GET, except it only returns the HTTP headers (and associated metadata), never any content.

It’s thus a great method to use if we only want to test an endpoint, and not get into the expensive process of actually retrieving data from it.

Since we are only interested in knowing if the Universal Analytics endpoint responds, the HTTP HEAD request is perfect for this purpose. It is also lighter than POST and GET, because no response body is transferred.
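As a minimal sketch of such a connectivity probe, consider the following. classifyProbe and probeEndpoint are hypothetical names, and treating anything other than a 200 status as "offline" is our assumption, in line with the rest of this article. A request that fails on the network level completes with status 0.

```javascript
var XHR_DONE = 4; // numeric value of XMLHttpRequest.DONE

// Returns 'pending', 'online' or 'offline' for a readyState/status pair.
function classifyProbe(readyState, status) {
  if (readyState !== XHR_DONE) return 'pending'; // request still in flight
  return status === 200 ? 'online' : 'offline';
}

// Browser-only: fire a HEAD request and report the outcome exactly once.
function probeEndpoint(url, callback) {
  var http = new XMLHttpRequest();
  http.open('HEAD', url);
  http.onreadystatechange = function() {
    var state = classifyProbe(this.readyState, this.status);
    if (state !== 'pending') callback(state === 'online');
  };
  http.send();
}
```

In the browser you could call probeEndpoint('https://www.google-analytics.com/collect', function(isOnline) { ... }) and decide inside the callback whether to store the hit or to process the queue.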

4. The JavaScript code

The code comes in three parts. First is the library for extending the database: Lockr. Next we have the customTask execution, and finally we’ll chain the HTTP HEAD and batch requests together to make the whole thing click.

4.1. Lockr download

To get started, go ahead and load Lockr on your site in any way you want. If you’re using Google Tag Manager, we recommend loading the following code in a Custom HTML Tag that fires on All Pages with the highest possible Tag Priority. Alternatively, if you accept some overhead and redundancy, you can just add the library to the top of the Custom JavaScript Variable itself, as in the example in the next chapter.

Here’s the minified JavaScript code - it should be executed by the browser before the offline tracking solution is run:

!function(e,t){"undefined"!=typeof exports?"undefined"!=typeof module&&module.exports&&(exports=module.exports=t(e,exports)):"function"==typeof define&&define.amd?define(["exports"],function(r){e.Lockr=t(e,r)}):e.Lockr=t(e,{})}(this,function(e,t){"use strict";return Array.prototype.indexOf||(Array.prototype.indexOf=function(e){var t=this.length>>>0,r=Number(arguments[1])||0;for((r=r<0?Math.ceil(r):Math.floor(r))<0&&(r+=t);r<t;r++)if(r in this&&this[r]===e)return r;return-1}),t.prefix="",t._getPrefixedKey=function(e,t){return(t=t||{}).noPrefix?e:this.prefix+e},t.set=function(e,t,r){var o=this._getPrefixedKey(e,r);try{localStorage.setItem(o,JSON.stringify({data:t}))}catch(r){console&&console.warn("Lockr didn't successfully save the '{"+e+": "+t+"}' pair, because the localStorage is full.")}},t.get=function(e,t,r){var o,n=this._getPrefixedKey(e,r);try{o=JSON.parse(localStorage.getItem(n))}catch(e){o=localStorage[n]?{data:localStorage.getItem(n)}:null}return o?"object"==typeof o&&void 0!==o.data?o.data:void 0:t},t.sadd=function(e,r,o){var n,a=this._getPrefixedKey(e,o),i=t.smembers(e);if(i.indexOf(r)>-1)return null;try{i.push(r),n=JSON.stringify({data:i}),localStorage.setItem(a,n)}catch(t){console.log(t),console&&console.warn("Lockr didn't successfully add the "+r+" to "+e+" set, because the localStorage is full.")}},t.smembers=function(e,t){var r,o=this._getPrefixedKey(e,t);try{r=JSON.parse(localStorage.getItem(o))}catch(e){r=null}return r&&r.data?r.data:[]},t.sismember=function(e,r,o){return t.smembers(e).indexOf(r)>-1},t.keys=function(){var e=[],r=Object.keys(localStorage);return 0===t.prefix.length?r:(r.forEach(function(r){-1!==r.indexOf(t.prefix)&&e.push(r.replace(t.prefix,""))}),e)},t.getAll=function(e){var r=t.keys();return e?r.reduce(function(e,r){var o={};return o[r]=t.get(r),e.push(o),e},[]):r.map(function(e){return t.get(e)})},t.srem=function(e,r,o){var 
n,a,i=this._getPrefixedKey(e,o),c=t.smembers(e,r);(a=c.indexOf(r))>-1&&c.splice(a,1),n=JSON.stringify({data:c});try{localStorage.setItem(i,n)}catch(t){console&&console.warn("Lockr couldn't remove the "+r+" from the set "+e)}},t.rm=function(e){var t=this._getPrefixedKey(e);localStorage.removeItem(t)},t.flush=function(){t.prefix.length?t.keys().forEach(function(e){localStorage.removeItem(t._getPrefixedKey(e))}):localStorage.clear()},t});

After this code puke, we’re ready to jump in the deep end with some offline hit tracking!

4.2. Offline hit tracker for on-page Universal Analytics

The following JavaScript runs with the default Universal Analytics tracker, and thus any hits sent with ga('send', '...') will be included in the process.

To make the whole thing work, you should set up your code in the following order:

<head>
  ...
  <script>
    // Put the Lockr code here first
  </script>
  <script>
    var _offlineTracker = function(customTaskModel) {
      // _offlineTracker (see below) here
    };
  </script>
  <script>
    // Universal Analytics snippet here
    ga('create', 'UA-12345-1');
    // Add the following line AFTER the 'create' command and BEFORE the first 'send' command
    ga('set', 'customTask', _offlineTracker);
    ga('send', 'pageview');
  </script>
  ...
</head>

And here’s the code for the _offlineTracker callback function.

var _offlineTracker = function(customTaskModel) {

  Lockr.prefix = 'ga_';
  // Grab the original sendHitTask function from the tracker, so that we can keep the original hit sending functionality
  var originalSendHitTask = customTaskModel.get('sendHitTask');
  customTaskModel.set('sendHitTask', function(model) {
    // Let's send the original hit using the native functionality
    originalSendHitTask(model);
    // Grab the hit Payload
    var payload_lz = model.get('hitPayload');
    // Check if GA Endpoint is Ready
    var http = new XMLHttpRequest();
    http.open('HEAD', 'https://www.google-analytics.com/collect');
    http.onreadystatechange = function() {
      // Ignore intermediate states; only act once the HEAD request has completed
      if (this.readyState !== this.DONE) return;
      // Google Analytics endpoint is not reachable, let's save the hit
      if (this.status !== 200) {
        Lockr.sadd('hits', payload_lz + "&qt=" + (new Date() * 1));
      } else {
        // Google Analytics endpoint is available, let's check if there are any unsent hits
        if (Lockr.smembers("hits").length > 0) {                        
          // Process hits in queue
          var current_ts = new Date() * 1 / 1000;
          var hits = Lockr.smembers("hits");

          // ./batch endpoint only allows 20 hits per batch, let's chunk the hits array. 
          var chunk_size = 20;
          var chunked_hits = Lockr.smembers("hits").map(function(e, i) {
            return i % chunk_size === 0 ? hits.slice(i, i + chunk_size) : null;
          }).filter(function(e) {
            return e;
          });
          // Let's loop thru the chunks array and send the hits to GA
          for (var i = 0; i < chunked_hits.length; i++) {
            var xhr = new XMLHttpRequest();
            xhr.open('POST', 'https://www.google-analytics.com/batch', true);
            // Build the Batch Payload and Take care of calculating the Queue Time 
            xhr.send(chunked_hits[i].map(function(x) {
              if (x.indexOf("&qt=") > -1) {
                return x.replace(/qt=([^&]*)/, "qt=" + Math.round(current_ts - x.match(/qt=([^&]*)/)[1] / 1000) * 1000);
              } else return x;
            }).join("\n"));
          }
          //Hits sent, flush the Storage
          Lockr.flush();
        }
      }
    };
    http.send();
  });
};

Once you create this _offlineTracker and invoke it in the ga('set', 'customTask', _offlineTracker) command, every single hit that uses this tracker will be stored in the queue if there is no internet connectivity. Once a hit is sent with a solid connection, all hits in the queue are sent as well.

4.3. Offline hit tracker for Google Tag Manager

With Google Tag Manager, you can get by with a single Custom JavaScript variable. This variable can be configured to include the Lockr code as well, so it’s completely self-contained. Give the variable a descriptive name, e.g. {{JS - customTask Offline Hit Tracker}} and put the following code within:

function() {
  return function(customTaskModel) {
    // Load Lockr if it hasn't already been loaded
    if (!window.Lockr) {
      !function(e,t){"undefined"!=typeof exports?"undefined"!=typeof module&&module.exports&&(exports=module.exports=t(e,exports)):"function"==typeof define&&define.amd?define(["exports"],function(r){e.Lockr=t(e,r)}):e.Lockr=t(e,{})}(this,function(e,t){"use strict";return Array.prototype.indexOf||(Array.prototype.indexOf=function(e){var t=this.length>>>0,r=Number(arguments[1])||0;for((r=r<0?Math.ceil(r):Math.floor(r))<0&&(r+=t);r<t;r++)if(r in this&&this[r]===e)return r;return-1}),t.prefix="",t._getPrefixedKey=function(e,t){return(t=t||{}).noPrefix?e:this.prefix+e},t.set=function(e,t,r){var o=this._getPrefixedKey(e,r);try{localStorage.setItem(o,JSON.stringify({data:t}))}catch(r){console&&console.warn("Lockr didn't successfully save the '{"+e+": "+t+"}' pair, because the localStorage is full.")}},t.get=function(e,t,r){var o,n=this._getPrefixedKey(e,r);try{o=JSON.parse(localStorage.getItem(n))}catch(e){o=localStorage[n]?{data:localStorage.getItem(n)}:null}return o?"object"==typeof o&&void 0!==o.data?o.data:void 0:t},t.sadd=function(e,r,o){var n,a=this._getPrefixedKey(e,o),i=t.smembers(e);if(i.indexOf(r)>-1)return null;try{i.push(r),n=JSON.stringify({data:i}),localStorage.setItem(a,n)}catch(t){console.log(t),console&&console.warn("Lockr didn't successfully add the "+r+" to "+e+" set, because the localStorage is full.")}},t.smembers=function(e,t){var r,o=this._getPrefixedKey(e,t);try{r=JSON.parse(localStorage.getItem(o))}catch(e){r=null}return r&&r.data?r.data:[]},t.sismember=function(e,r,o){return t.smembers(e).indexOf(r)>-1},t.keys=function(){var e=[],r=Object.keys(localStorage);return 0===t.prefix.length?r:(r.forEach(function(r){-1!==r.indexOf(t.prefix)&&e.push(r.replace(t.prefix,""))}),e)},t.getAll=function(e){var r=t.keys();return e?r.reduce(function(e,r){var o={};return o[r]=t.get(r),e.push(o),e},[]):r.map(function(e){return t.get(e)})},t.srem=function(e,r,o){var 
n,a,i=this._getPrefixedKey(e,o),c=t.smembers(e,r);(a=c.indexOf(r))>-1&&c.splice(a,1),n=JSON.stringify({data:c});try{localStorage.setItem(i,n)}catch(t){console&&console.warn("Lockr couldn't remove the "+r+" from the set "+e)}},t.rm=function(e){var t=this._getPrefixedKey(e);localStorage.removeItem(t)},t.flush=function(){t.prefix.length?t.keys().forEach(function(e){localStorage.removeItem(t._getPrefixedKey(e))}):localStorage.clear()},t});
    }
    Lockr.prefix = 'ga_';
    // Grab the original sendHitTask function from the tracker, so that we can keep the original hit sending functionality
    var originalSendHitTask = customTaskModel.get('sendHitTask');
    customTaskModel.set('sendHitTask', function(model) {
      // Let's send the original hit using the native functionality
      originalSendHitTask(model);
      // Grab the hit Payload
      var payload_lz = model.get('hitPayload');
      // Check if GA Endpoint is Ready
      var http = new XMLHttpRequest();
      http.open('HEAD', 'https://www.google-analytics.com/collect');
      http.onreadystatechange = function() {
        // Ignore intermediate states; only act once the HEAD request has completed
        if (this.readyState !== this.DONE) return;
        // Google Analytics endpoint is not reachable, let's save the hit
        if (this.status !== 200) {
          Lockr.sadd('hits', payload_lz + "&qt=" + (new Date() * 1));
        } else {
          // Google Analytics endpoint is available, let's check if there are any unsent hits
          if (Lockr.smembers("hits").length > 0) {                        
            // Process hits in queue
            var current_ts = new Date() * 1 / 1000;
            var hits = Lockr.smembers("hits");

            // ./batch endpoint only allows 20 hits per batch, let's chunk the hits array. 
            var chunk_size = 20;
            var chunked_hits = Lockr.smembers("hits").map(function(e, i) {
              return i % chunk_size === 0 ? hits.slice(i, i + chunk_size) : null;
            }).filter(function(e) {
              return e;
            });
            // Let's loop thru the chunks array and send the hits to GA
            for (var i = 0; i < chunked_hits.length; i++) {
              var xhr = new XMLHttpRequest();
              xhr.open('POST', 'https://www.google-analytics.com/batch', true);
              // Build the Batch Payload and Take care of calculating the Queue Time 
              xhr.send(chunked_hits[i].map(function(x) {
                if (x.indexOf("&qt=") > -1) {
                  return x.replace(/qt=([^&]*)/, "qt=" + Math.round(current_ts - x.match(/qt=([^&]*)/)[1] / 1000) * 1000);
                } else return x;
              }).join("\n"));
            }
            //Hits sent, flush the Storage
            Lockr.flush();
          }
        }
      };
      http.send();
    });
  };
}

Add this code to all your Universal Analytics tags by scrolling to More settings -> Fields to set. Here you add a new field with:

Field name: customTask
Value: {{JS - customTask Offline Hit Tracker}}

Once you’ve done this, then all your tags with this customTask setting are protected against poor connectivity, and whenever the connection is restored, the batch queue is processed.

4.4. About batching

Universal Analytics lets you send hits to its endpoint in batches. This is used mostly by the iOS and Android SDKs. The main point in using the batch endpoint (/batch) is to have as few HTTP requests dispatched as possible. Here, batching means that we can send multiple Universal Analytics payloads in a single HTTP request.

Batching does have some limitations we’ll need to consider:

  • A maximum of 20 hits can be specified per request.

  • The combined size of all hit payloads cannot be greater than 16K bytes.

  • No single hit payload can be greater than 8K bytes.

  • The request must be sent using POST.

For our solution, each time there is at least one stored hit in the queue, we send the payloads using the batch endpoint. In case there are too many hits stored in the queue, we’re chunking them so that multiple batch requests are sent in succession until the entire queue is processed.
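The chunking step can also be written as a plain loop, which some may find easier to read than a map and filter chain. This is a sketch with hypothetical helper names (chunkHits, buildBatchBody, fitsBatchLimits). The size check assumes one byte per character, which is a reasonable approximation for URL-encoded payloads but an assumption nonetheless.

```javascript
var MAX_HITS_PER_BATCH = 20;
var MAX_HIT_BYTES = 8 * 1024;
var MAX_BATCH_BYTES = 16 * 1024;

// Split the queue into arrays of at most `size` hits.
function chunkHits(hits, size) {
  var chunks = [];
  for (var i = 0; i < hits.length; i += size) {
    chunks.push(hits.slice(i, i + size));
  }
  return chunks;
}

// One payload per line in the body of the POST request to /batch.
function buildBatchBody(chunk) {
  return chunk.join('\n');
}

// Check the per-hit and per-batch size limits (assuming 1 byte per character).
function fitsBatchLimits(chunk) {
  var total = 0;
  for (var i = 0; i < chunk.length; i++) {
    if (chunk[i].length > MAX_HIT_BYTES) return false;
    total += chunk[i].length;
  }
  return total <= MAX_BATCH_BYTES;
}

// 45 queued hits end up in three requests: 20 + 20 + 5.
var hits = [];
for (var n = 1; n <= 45; n++) {
  hits.push('v=1&t=event&ea=click&el=' + n);
}
var bodies = chunkHits(hits, MAX_HITS_PER_BATCH).map(buildBatchBody);
```

The tracker code in this article chunks by hit count only. If your payloads are large, a check along the lines of fitsBatchLimits would help keep each request within the size limits as well.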

5. Improvements

Keep in mind that this current post is meant to show how to track the hits that may happen while the user is offline (due to a connectivity gap). At the same time, we’ve taken the opportunity to showcase some cool and relatively little-known Google Analytics JavaScript API functionalities.

A solid improvement would be to skip the originalSendHitTask part, and just overwrite the sendHitTask task entirely. This way you can skip the HTTP HEAD request, because you can just check if the initial hit to Google Analytics is dispatched successfully.
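A rough sketch of that idea follows. Everything here is hypothetical: createSendHitTask is our own helper, postHit stands for whatever transport function you write (it should call its callback with true when the request succeeds), and the queue is a plain array instead of browser storage.

```javascript
// postHit(payload, callback) is a hypothetical transport function that
// calls callback(true) on success and callback(false) on failure.
function createSendHitTask(postHit, queue) {
  return function(model) {
    var payload = model.get('hitPayload');
    var createdAt = new Date().getTime(); // remember when the hit was generated
    postHit(payload, function(ok) {
      // No separate HEAD request: the outcome of the hit itself tells us
      // whether the endpoint was reachable.
      if (!ok) queue.push(payload + '&qt=' + createdAt);
    });
  };
}
```

The function returned by createSendHitTask could then be set as the sendHitTask in your customTask, with the queue processing logic hooked into the success branch.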

The one thing you need to keep in mind if you want to overwrite the sendHitTask is that you’ll need to replicate the transport logic for the request. The analytics.js library supports three different ways to dispatch the requests:

  1. 'image' - max 2K payload size sent with a GET request to a pixel endpoint

  2. 'xhr' - max 8K payload size sent as an HTTP POST request

  3. 'beacon' - POST request that uses the navigator.sendBeacon() to make sure your requests are sent even if the user has already navigated away from the page

So you’ll need to replicate this logic in your custom sendHitTask method. It’s not exactly trivial to do, but an able developer should be able to do it, especially once the constraints (see the list above) are known.
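To give an idea of what the selection logic might look like, here is a simplified sketch. It is not a reproduction of what analytics.js does internally: chooseTransport is a hypothetical function, the size limits are taken from the list above, and the payload length is used as an approximation of its size in bytes.

```javascript
var IMAGE_LIMIT = 2048; // 2K
var XHR_LIMIT = 8192;   // 8K

// Pick a transport for the payload, or null if it is too large to send.
// `requested` is the preferred transport, if any ('image', 'xhr' or 'beacon').
function chooseTransport(payload, requested, beaconSupported) {
  var size = payload.length; // roughly one byte per character for URL-encoded payloads
  if (size > XHR_LIMIT) return null;
  if (requested === 'beacon' && beaconSupported) return 'beacon';
  if (requested === 'xhr') return 'xhr';
  // Fall back on hit size: small hits go out as an image request, larger ones as POST.
  return size <= IMAGE_LIMIT ? 'image' : 'xhr';
}
```

In the browser, beaconSupported could be derived from typeof navigator.sendBeacon === 'function'.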

Another thing you might want to do is add a Custom Dimension to all the payloads that are stored in the queue. This Custom Dimension could be named something like Offline hit, and you should set the value to true if the hit was sent from the offline queue. Thus you can monitor how many hits in your data were initially aborted due to poor internet connectivity.
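In Measurement Protocol terms, a Custom Dimension is just a cd parameter followed by the index of the dimension, so flagging a stored payload could look something like the sketch below. markAsOfflineHit is a hypothetical helper, and index 5 is only an example. You would use the index of the Custom Dimension you created in your own property.

```javascript
// Add (or overwrite) a Custom Dimension flag in a hit payload.
function markAsOfflineHit(payload, cdIndex) {
  var param = 'cd' + cdIndex;
  var pattern = new RegExp('(^|&)' + param + '=[^&]*');
  if (pattern.test(payload)) {
    // The dimension is already in the payload, so replace its value
    return payload.replace(pattern, '$1' + param + '=true');
  }
  return payload + '&' + param + '=true';
}

var flagged = markAsOfflineHit('v=1&t=event&ea=click', 5);
// flagged is 'v=1&t=event&ea=click&cd5=true'
```

The flag would be added at the moment the hit is written to the queue, so that only hits replayed from storage carry it.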

6. Summary

I’m really glad to have David guest star on this blog - it’s been a long time coming! This solution is great for two reasons. First, it’s actually very usable, especially if your site caters to mobile visitors. Second, it showcases a number of features of the web browser and the analytics.js library that can be extended to other purposes, too.

The Tasks API is really interesting, as it allows you to manipulate the payloads dispatched by the website. And with the introduction of customTask, we finally have a very handy way of accessing tasks with Google Tag Manager.

Note that if you have a web application with offline capabilities, and you are using service workers to manage this functionality, Google has released a useful library for employing service workers to do precisely the same thing we’re doing in this article.

We hope you enjoyed this solution! Let us know in the comments