Recently I published an article on how to set up an impact test for the “flicker effect” omnipresent in client-side A/B-testing tools. Be sure to check out that article first to get some context for what we’re going to be talking about here.

In this short follow-up, I’ll show you how to measure how long, on average, the anti-flicker snippet delays page visibility, if you choose to deploy the snippet. The methodology is very similar.

If you recall, the anti-flicker snippet hides the entire page while waiting for the Optimize container to load. So we’ll be measuring how long it takes for the page to become visible again. Visibility is restored when the Optimize container loads successfully, or when the wait ends in a timeout (4 seconds by default).

The test below is run by splitting 50% of traffic to the asynchronous Optimize snippet and 50% of the traffic to the Google Tag Manager Optimize tag.

We’re using Google Analytics: App + Web with its wonderful BigQuery export for the analysis. We’ll be using Google Tag Manager to collect and send the data forward.

Modify the page template

You need to edit the page template. The anti-flicker snippet must be added directly to the page template, and we also need to write the logic that determines whether the user should see the Optimize snippet or whether Optimize should be loaded via Google Tag Manager.

At the very top of the <head> element on your experiment pages, add the following HTML block:

<!-- The style declaration for the anti-flicker snippet -->
<style>.async-hide { opacity: 0 !important} </style>

<script>
  (function() {
    // Modify the optimizeId to match your Optimize container ID, gtmId
    // to match your GTM container ID, and dataLayerName to match the name
    // of the dataLayer array on your site.
    var optimizeId = 'GTM-NGM64B',
        gtmId = 'GTM-PZ7GMV9',
        dataLayerName = 'dataLayer',
        hideObj = {},
        hideGTMId = Math.random() < 0.5 ? optimizeId : gtmId;
    
    hideObj[hideGTMId] = true;
    
    // Helper to handle the dataLayer.push()
    var dPush = function(status) {
      window[dataLayerName].push({
        event: 'optimize_anti_flicker_test',
        milestones: {
          antiFlickerStart: window[dataLayerName].hide.start,
          antiFlickerEnd: new Date().getTime(),
          testStatus: status
        }
      });
    };

    // MODIFIED anti-flicker snippet
    (function(a,s,y,n,c,h,i,d,e) {
      s.className += ' ' + y;
      h.start = 1 * new Date;
      h.end = i = function(){
        clearTimeout(t);
        s.className = s.className.replace(RegExp(' ?' + y), '');
      };
      (a[n] = a[n] || []).hide = h;
      var t = setTimeout(function() {
        dPush('timeout');
        i();
        h.end = null;
      }, c);
      h.timeout = c;
    })(window, document.documentElement, 'async-hide', dataLayerName, 4000, hideObj);

    // Determine where to load Optimize from (inline vs. GTM)
    if (hideGTMId === optimizeId) {
      var el = document.createElement('script');
      el.src = 'https://www.googleoptimize.com/optimize.js?id=' + optimizeId;
      el.addEventListener('error', function() {
        dPush('optimizeSnippetError');
        window[dataLayerName].hide.end && window[dataLayerName].hide.end();
      });
      document.head.appendChild(el);
    } else {
      window[dataLayerName].push({
        gtmOptimize: true
      });
    }
    
    // Configure the Optimize callback
    function gtag() { window[dataLayerName].push(arguments); }
    gtag('event', 'optimize.callback', {
      callback: function() {
        dPush(hideGTMId === optimizeId ? 'optimizeSnippet' : 'gtmTag');
      }
    });
  })();
</script>

You should only add this snippet on pages that are actually running the experiment, to make sure you don’t accidentally collect measurements from pages that aren’t actually running Optimize.

In the first block of variables, make sure you update optimizeId, gtmId, and dataLayerName to reflect your Optimize ID, Google Tag Manager container ID, and name of the dataLayer array, respectively.

var hideGTMId = Math.random() < 0.5 ? optimizeId : gtmId; chooses randomly (50% chance) whether to load Optimize using the asynchronous inline snippet or through a Google Tag Manager tag.
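To make the split concrete, here is a minimal sketch of the same draw pulled out into a function. The names chooseLoader and simulateSplit, the injectable random parameter, and the placeholder IDs are illustrative only and not part of the snippet above.

```javascript
// Hypothetical helper: returns the ID of the loading method a page load is
// assigned to. `random` is injectable so the draw can be checked deterministically.
function chooseLoader(optimizeId, gtmId, random) {
  var draw = (random || Math.random)();
  return draw < 0.5 ? optimizeId : gtmId;
}

// Simulate a number of page loads and count how often each method is chosen.
// 'OPT-ID' and 'GTM-ID' are placeholders, not real container IDs.
function simulateSplit(pageLoads) {
  var counts = { snippet: 0, gtm: 0 };
  for (var i = 0; i < pageLoads; i++) {
    var chosen = chooseLoader('OPT-ID', 'GTM-ID');
    counts[chosen === 'OPT-ID' ? 'snippet' : 'gtm'] += 1;
  }
  return counts;
}

// The two counts should land close to an even split.
console.log(simulateSplit(10000));
```

Note that the draw happens on every page load, so the same visitor can end up in either group on different page loads.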

The anti-flicker snippet is modified in this version. The main change is that if the timeout happens (by default 4000 milliseconds after the page was hidden), dataLayer.push() is called with this information. A second modification follows from the first: the timeout counter is cleared as soon as the page is unhidden, so that a timeout isn’t erroneously reported to dataLayer after the page is already visible:

h.end = i = function() {
  clearTimeout(t);
  ...
}
var t = setTimeout(function() {
  dPush('timeout');
  ...
}, c);

The following block checks if Optimize should be loaded via the snippet or via GTM.

// Determine where to load Optimize from (inline vs. GTM)
if (hideGTMId === optimizeId) {
  var el = document.createElement('script');
  el.src = 'https://www.googleoptimize.com/optimize.js?id=' + optimizeId;
  el.addEventListener('error', function() {
    dPush('optimizeSnippetError');
    window[dataLayerName].hide.end && window[dataLayerName].hide.end();
  });
  document.head.appendChild(el);
} else {
  window[dataLayerName].push({
    gtmOptimize: true
  });
}

If the inline snippet wins the draw, the Optimize element is added to the page together with an error listener that unhides the page in case there’s an error loading Optimize (e.g. user is blocking the script load).

If Optimize is loaded via Google Tag Manager, then the key gtmOptimize is pushed to dataLayer with the value true. This is then later used as a trigger condition for the Optimize tag.

As soon as the Optimize container loads (via the inline snippet or via Google Tag Manager), or the anti-flicker snippet runs into its timeout, or there’s an error in loading Optimize, a dataLayer.push() happens with the following content:

{
  event: 'optimize_anti_flicker_test',
  milestones: {
    antiFlickerStart: window[dataLayerName].hide.start,
    antiFlickerEnd: new Date().getTime(),
    testStatus: status
  }
}

Here, status is one of gtmTag (if Optimize loaded via Google Tag Manager), optimizeSnippet (if it loaded via the inline Optimize snippet), timeout (if the timeout was reached), or optimizeSnippetError (if the inline Optimize snippet ran into an error while loading).
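The number we are ultimately after is the difference between the two timestamps in this payload. As a rough sketch, assuming both values are millisecond timestamps, the calculation looks like this. The hiddenDuration helper and the example values are made up for illustration.

```javascript
// Hypothetical helper: how long the page was hidden, in milliseconds.
// Returns null when a timestamp is missing or the pair is inconsistent.
function hiddenDuration(milestones) {
  var start = milestones.antiFlickerStart;
  var end = milestones.antiFlickerEnd;
  if (typeof start !== 'number' || typeof end !== 'number' ||
      isNaN(start) || isNaN(end) || end < start) {
    return null;
  }
  return end - start;
}

// Made-up payload shaped like the milestones object pushed to dataLayer.
var exampleMilestones = {
  antiFlickerStart: 1000,
  antiFlickerEnd: 1750,
  testStatus: 'optimizeSnippet'
};

console.log(exampleMilestones.testStatus, hiddenDuration(exampleMilestones)); // optimizeSnippet 750
```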

One thing to note is that this setup does not test if Google Tag Manager is blocked. This is something you might want to test for as well, if you want to get an even more comprehensive idea of what’s going on with your experiment implementations.
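If you do want to look into that, one possible starting point is to check whether the container ever registered itself on the page. The sketch below assumes that Google Tag Manager creates a global google_tag_manager object with one key per loaded container ID, so verify that on your own site before relying on it. The isGtmLoaded helper is hypothetical.

```javascript
// Sketch only: checks whether a GTM container has registered itself.
// Assumption: GTM exposes a global `google_tag_manager` object keyed by
// container ID once it has loaded. `win` is passed in so the check can be
// exercised outside a browser.
function isGtmLoaded(win, gtmId) {
  var registry = win && win.google_tag_manager;
  return Boolean(registry && registry[gtmId]);
}

// Example with stand-in objects instead of the real window.
console.log(isGtmLoaded({ google_tag_manager: { 'GTM-PZ7GMV9': {} } }, 'GTM-PZ7GMV9'));
console.log(isGtmLoaded({}, 'GTM-PZ7GMV9'));
```

In the browser you would call isGtmLoaded(window, gtmId) some time after the anti-flicker timeout has passed. Since Google Tag Manager itself isn’t available in the blocked case, the result would have to be collected through some other channel.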

Google Tag Manager setup

In Google Tag Manager, we’ll need to create an App + Web tag (because we want to do the analysis in BigQuery again). We’ll also need a Custom Event trigger and some Data Layer variables.

The trigger

This is what the Custom Event trigger looks like.

This trigger will fire whenever the dataLayer.push() with the snippet test data is executed on the page. There’s also a condition to only fire the trigger on the homepage, which you can and should modify or remove if you’re running experiments elsewhere as well!

The variables

You’re going to need four dataLayer variables.

Variable name | Value of the “Data Layer Variable Name” field | Purpose
DLV - milestones.testStatus | milestones.testStatus | One of gtmTag, optimizeSnippet, timeout, or optimizeSnippetError.
DLV - milestones.antiFlickerStart | milestones.antiFlickerStart | Timestamp of when page hiding began.
DLV - milestones.antiFlickerEnd | milestones.antiFlickerEnd | Timestamp of when page hiding ended.
DLV - gtmOptimize | gtmOptimize | Is true if Optimize should be loaded in a GTM tag.

This is what a variable would look like, per the specification in the table above:

The tags

First, you’ll need to create the App + Web Event tag. Make sure you have a base tag as well!

The tag is set to trigger with the Custom Event trigger we created above, and it sends the values of the three milestones variables to App + Web as custom parameters. Feel free to change the keys and event names as you wish.
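Conceptually, the tag turns the milestones object into an event with three parameters. The sketch below shows that mapping as a plain function, using the event name and parameter keys that the BigQuery queries later in this article look for. The toAppWebEvent function itself is hypothetical, since in practice the mapping lives in the tag’s configuration in Google Tag Manager, and your own keys may differ.

```javascript
// Sketch of the mapping the tag performs. The event name and parameter keys
// match those used in the BigQuery queries in this article; the function is
// illustrative only and does not send anything.
function toAppWebEvent(milestones) {
  return {
    name: 'optimize_anti_flicker_snippet_test',
    params: {
      test_status: milestones.testStatus,
      anti_flicker_start: milestones.antiFlickerStart,
      anti_flicker_end: milestones.antiFlickerEnd
    }
  };
}

// Made-up input shaped like the milestones object in dataLayer.
console.log(toAppWebEvent({
  testStatus: 'gtmTag',
  antiFlickerStart: 1000,
  antiFlickerEnd: 1900
}));
```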

Next, we need to fire the Google Optimize tag conditionally, depending on whether the gtmOptimize key is in dataLayer with the value true.

Since Optimize needs to run in a tag sequence, this is actually pretty convoluted to do. In addition to the Google Optimize tag itself, you need a new Universal Analytics Page View tag which only fires when gtmOptimize is true, and you need to block your regular Page View tag in this circumstance as well.

Don’t worry about the Tag Sequencing setting here. You set it up in the new Page View tag that you’ll also need to create:

Take note of the new trigger: “All Pages - Anti-flicker” is a Page View trigger with a single condition: {{DLV - gtmOptimize}} equals true.

Finally, make sure you block your regular Page View tag to avoid double-counting on pages where the Optimize-specific Page View already fired:

The blocking trigger is a Custom Event trigger that blocks any event if gtmOptimize is true:

Test the setup

Once you’ve set everything up, try loading the page with the page template modification in Preview mode. Make sure you see a request to App + Web with your custom parameters in place.

If it doesn’t work, check the browser console for errors. Also, make sure that the Optimize container actually loads, and that you have an experiment running on the page where you are testing!

Dig deep with BigQuery

Once the data starts flowing into BigQuery, you should find your events with a query like this:

SELECT
  *
FROM 
  `project.dataset.events_202006*`
WHERE
  event_name = 'optimize_anti_flicker_snippet_test'

We’re simply loading all the hits with the optimize_anti_flicker_snippet_test data to get an overview of what those hits look like.

To get a count of different test types, you can run a query like this:

SELECT
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'test_status') as test_status,
  COUNT(*) as count
FROM
  `project.dataset.events_202006*`
WHERE
  event_name = 'optimize_anti_flicker_snippet_test'
GROUP BY 1
ORDER BY 2 DESC

This query pulls in the value of test_status from the events, and does an aggregate count of each status. This is what the end result would look like:

Finally, to get some averages in place, you can modify the query to look like this:

WITH milestones AS (
  SELECT 
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'test_status') as test_status,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'anti_flicker_start') as anti_flicker_start,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'anti_flicker_end') as anti_flicker_end
  FROM 
    `project.dataset.events_202006*`
  WHERE
    event_name = 'optimize_anti_flicker_snippet_test'
)
SELECT 
  test_status,
  COUNT(*) as count,
  ROUND(AVG(anti_flicker_end - anti_flicker_start), 2) as average_delay_in_ms
FROM 
  milestones
WHERE anti_flicker_end - anti_flicker_start < 5000
GROUP BY 1
ORDER BY 3 DESC

The WITH...AS block creates a source table with just the test status and the flicker start and end times in place. We can then query this common table expression (CTE) to get our counts and averages properly.

As you can see, I have a WHERE clause in place where I make sure the delta is no more than 5000 milliseconds. This is because sometimes the experiment resulted in abnormally high deltas, probably due to the Optimize container being extremely slow to load, and thus producing the end time much later than the timeout.

With the WHERE clause, we ignore such abnormal deltas. We can do that because we are only interested in how long the page was hidden, and if the page is hidden more than 4 seconds (and change), it’s automatically revealed by the anti-flicker snippet itself.
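If the SQL is hard to follow, the same logic can be sketched in JavaScript: compute the delta per row, drop the rows that fail the 5000 millisecond filter, then count and average per status. The averageDelayByStatus function and the sample rows are made up for illustration and are not real measurements.

```javascript
// Mirrors the query: filter on the delta, group by status, count and average,
// then sort by the average in descending order.
function averageDelayByStatus(rows, maxDeltaMs) {
  var groups = {};
  rows.forEach(function (row) {
    var delta = row.anti_flicker_end - row.anti_flicker_start;
    // Same effect as the WHERE clause; rows with missing timestamps
    // (a NaN delta) are dropped as well.
    if (!(delta < maxDeltaMs)) return;
    var group = groups[row.test_status] ||
      (groups[row.test_status] = { count: 0, total: 0 });
    group.count += 1;
    group.total += delta;
  });
  return Object.keys(groups).map(function (status) {
    var group = groups[status];
    return {
      test_status: status,
      count: group.count,
      average_delay_in_ms: Math.round((group.total / group.count) * 100) / 100
    };
  }).sort(function (a, b) {
    return b.average_delay_in_ms - a.average_delay_in_ms;
  });
}

// Made-up rows shaped like the output of the milestones CTE.
var sampleRows = [
  { test_status: 'gtmTag', anti_flicker_start: 0, anti_flicker_end: 900 },
  { test_status: 'gtmTag', anti_flicker_start: 0, anti_flicker_end: 1100 },
  { test_status: 'optimizeSnippet', anti_flicker_start: 0, anti_flicker_end: 500 },
  { test_status: 'optimizeSnippet', anti_flicker_start: 0, anti_flicker_end: 9000 }
];

console.log(averageDelayByStatus(sampleRows, 5000));
```

The last sample row has a delta of 9000 milliseconds, so it is excluded from both the count and the average, just like the abnormal deltas in the query.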

Anyway, this query produces the following result for my dataset:

The average time for the page to be hidden if the snippet runs into its timeout is 3.8 seconds. That’s kinda weird, as the timeout shouldn’t happen before 4000 milliseconds have passed. I’d dig deeper into it, but I can also just assume that the page was hidden for the full 4 seconds in every timeout case.

When Optimize is loaded using a Google Tag Manager tag, the page is hidden for an average of 964 milliseconds. When loaded with the asynchronous inline snippet, it’s hidden for 581 milliseconds on average.

Summary

It’s a fairly convoluted setup, but the purpose isn’t to do this for every single one of your experiments. It’s to satisfy your curiosity about whether the anti-flicker snippet is degrading user experience, and this gives you just one variable to work with (average delay of page unhiding).

It’s not without its flaws - the timer starts when the anti-flicker snippet is executed at the top of <head>, when realistically it should only start when the page would normally produce the first visible element (First Contentful Paint).

The analysis is also detached - just knowing the delay isn’t that interesting. What would make it more significant is to see if the length of the anti-flicker effect had an impact on how the user interacts with the site. It’s possible a lengthy delay in painting the page could result in the user bouncing back to the previous page, as they might assume the page does not work.

Just to belabor the point: this article showed you a methodology you could potentially employ to measure the effect of flicker mitigation efforts in client-side A/B-testing. The flicker is a problem, and knowing just how much of a problem is the first step in solving it.

Let me know in the comments if you have suggestions for improving the experiment!