Updated 15 April 2020: Fix the message forwarder to properly clone objects before they are passed to postMessage

Here I am, back with <iframe> and cross-domain tracking. I've published a couple of articles before on the topic, with my upgraded solution being the most recent one. These articles tackle the general problem of passing the Client ID from the parent to the <iframe>.

By doing so, the <iframe> can take the Client ID from the frame URL and create the _ga cookie in the <iframe>, allowing hits from the parent and the <iframe> to use the same Client ID. Great.

However, there's a catch that's become more and more troubling with the increase of browsers’ tracking protection mechanisms: if the <iframe> loads content cross-site, the browser must allow cookie access in third-party context. In other words, if the browser blocks third-party cookies, the <iframe> will not be able to write the cookie, and Google Analytics tracking will fail.

This article offers two solutions to this problem.

The first solution sends the Google Analytics Client ID from the parent to the <iframe> using window.postMessage. The child frame can poll for this information on every page, meaning cookies are not needed to store the Client ID.

The second solution is that the child frame actually sends every single dataLayer message to the parent, so that the child frame doesn't have to track anything by itself. The parent handles the tracking for the child frame instead.

Using window.postMessage is more robust than the link decoration of my earlier proposals, because it doesn't force a reload of the frame, nor does it require the child frame to support cookies. It's a bit more elaborate to set up, however.

Warning! I'm not kidding about that last statement. Setting this up requires quite a bit of custom code, and you're working with bilateral communication between two windows. Please read this article carefully so that you understand what's going on before copy-pasting any code to your Google Tag Manager container(s).


What exactly is the problem?

Glad you asked!

When a page loads an <iframe> from a cross-site origin, that frame is loaded in a third-party context, and any access to browser storage from within that <iframe> will require the browser to allow third-party cookies for the <iframe> site.

A key term here is cross-site. This means that the top privately-controlled domain name of the <iframe> is different from that of the parent page.

In the examples below, the top privately-controlled domain name is given in parentheses.

  • www.simoahava.com (simoahava.com)
  • www.gtmtools.com (gtmtools.com)
  • www.ebay.co.uk (ebay.co.uk)
  • sahava.github.io (sahava.github.io)
  • analytics.google.com (google.com)

If any one of the sites above loaded any other site from the list in an <iframe>, that content would be loaded cross-site, and any cookie operations within the embedded page would require third-party cookie access.

The following examples are all same-site, even if they are cross-origin:

  • www.simoahava.com
  • simoahava.com
  • blog.simoahava.com
  • tracking.endpoint.simoahava.com

Any communication between pages from the list above would happen in a same-site (or first-party) context, and cookie access would not be restricted.
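To make the distinction concrete, here's a minimal sketch of a same-site check. It is not how browsers do it: they consult the full Public Suffix List, whereas the PUBLIC_SUFFIXES array below is a hypothetical, hard-coded stand-in that only covers the hostnames used in this article. The function names getSite and isSameSite are likewise just illustrative.

```javascript
// ASSUMPTION: a tiny stand-in for the real Public Suffix List,
// covering only the example hostnames in this article.
var PUBLIC_SUFFIXES = ['com', 'co.uk', 'github.io'];

// Return the top privately-controlled domain (public suffix plus one label)
function getSite(hostname) {
  var labels = hostname.toLowerCase().split('.');
  // Candidate suffixes get shorter on each pass, so "co.uk" is tried before "uk"
  for (var i = 1; i < labels.length; i++) {
    var suffix = labels.slice(i).join('.');
    if (PUBLIC_SUFFIXES.indexOf(suffix) > -1) {
      return labels.slice(i - 1).join('.');
    }
  }
  // Unknown suffix: fall back to the full hostname
  return hostname.toLowerCase();
}

// Two hostnames are same-site if they share the same privately-controlled domain
function isSameSite(hostA, hostB) {
  return getSite(hostA) === getSite(hostB);
}
```

With this sketch, isSameSite('www.simoahava.com', 'blog.simoahava.com') is true, while isSameSite('www.simoahava.com', 'www.gtmtools.com') is false.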

As I mention in the introduction, this is only going to spell doom for tracking within embedded content due to how browsers implement tracking protections.

From www.cookiestatus.com

Even though Google recently made a veritable non-announcement by saying they'll phase out third-party cookies by 2022, Google Chrome will actually make things harder for cross-site cookie access much, much sooner.

Chrome v80 (released on February 4, 2020) enforces SameSite cookie restrictions, which means that if a cookie should be accessible in third-party context, it requires the SameSite=None and Secure flags to be set. By default, these are not set. Unfortunately, the _ga cookie used by Google Analytics does not have these flags set, and currently there is no timeline for when support for these flags will be added.
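For reference, here's a rough sketch of what a cookie string carrying those two flags looks like. The function name and the cookie name are hypothetical, and this is not how analytics.js writes the _ga cookie. Also keep in mind that browsers which block third-party cookies outright will reject the cookie regardless of these flags.

```javascript
// Build a cookie string that could be assigned to document.cookie.
// All names here are illustrative, not part of any library.
function buildCrossSiteCookie(name, value, maxAgeSeconds) {
  return [
    name + '=' + encodeURIComponent(value),
    'Max-Age=' + maxAgeSeconds,
    'Path=/',
    'SameSite=None', // allow the cookie in third-party context
    'Secure'         // mandatory companion of SameSite=None; HTTPS only
  ].join('; ');
}

// In a browser, on an HTTPS page:
// document.cookie = buildCrossSiteCookie('example_id', 'abc 123', 3600);
```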

So that's the problem! On the majority of browsers, the _ga cookie does not function (or will soon stop functioning) in third-party context, which applies to all cross-site <iframe> embeds.

Luckily, there's a way around this.

We can ignore cookies altogether.

Solution 1: Pass the Client ID from the parent to the child

Let's take a look at another celebrated illustration from yours truly:

There are many potential race conditions in the mix, so some precautions need to be taken. Here's how the parent page works:

  1. The parent page starts listening for messages from the <iframe> as soon as Google Tag Manager loads.
  2. Once the parent page receives the childReady message, it starts polling for the Google Analytics tracker until a maximum timeout is reached.
  3. As soon as the Google Analytics tracker is available, the parent page sends the Client ID back to the <iframe>.

And here's how the <iframe> page reacts:

  1. The <iframe> page starts sending the childReady message as soon as Google Tag Manager loads.
  2. Once the parent page responds with the Client ID (or a timeout is reached), the child page stops sending the message.
  3. The child page writes the Client ID into dataLayer.

So now we know how the Client ID will be passed to the <iframe>, but that's not enough, yet.

Every single Google Analytics tag in the <iframe> must be configured to work with this setup!

More specifically, they all need two fields set, preferably in a Google Analytics Settings variable:

  • storage set to none to avoid the tracker failing if it can't write the Client ID cookie.
  • clientId set to the value from the dataLayer.

We'll get to these shortly, don't worry.

Solution 2: Forward all Data Layer messages from child to parent

If you don't want to do any tracking within the <iframe> (I don't blame you), you can actually delegate tracking to the parent by sending all dataLayer messages to the parent for processing.

This means the parent page would manage tags for both the parent page's native interactions as well as those that happen within the <iframe>.

The process is almost the same as with the first solution. Here's how the parent page works:

  1. The parent page starts listening for messages from the <iframe> as soon as Google Tag Manager loads.
  2. Once the parent page receives the childReady message, it responds with a parentReady message.
  3. If the child frame sends a message in dataLayer-compatible format, the parent page pushes this message into its own dataLayer.

On the <iframe>, this is what happens:

  1. The <iframe> page starts sending the childReady message as soon as Google Tag Manager loads.
  2. Once the parent page responds with parentReady, the child frame “hijacks” the dataLayer.push() method, and sends all the messages passed to it over to the parent page.

The messages from the child frame are namespaced to keep them separate from the parent's “native” dataLayer messages, and they include some metadata about the <iframe> (mainly the URL and title).

The parent page setup

This article combines both solutions into a single set of Custom HTML tags, one for the parent page and one for the child <iframe>.

On the parent page, i.e. the page sending the Client ID to the <iframe> and waiting for the messages sent from the child, you need to create a Custom HTML tag that fires on a Page View trigger. You can use the All Pages trigger if you wish, but you might as well create a Page View trigger that only fires on pages where you know the <iframe> to exist.

The Custom HTML tag

The Custom HTML tag itself should contain the following code:

<script>
  (function() {
    // Tracking ID whose _ga cookie to use
    var trackingId = 'UA-40669554-1';
    
    // Maximum time in milliseconds to wait for GA tracker to load
    var maxGATime = 2000;
    
    // Set to the origin ("https://www.domain.com") of the iframe you want to communicate with
    var childOrigin = 'https://www.gtmtools.com';
    
    // Don't touch anything that follows
    var pollInterval = 200;
    
    var postCallback = function(event) {
      if (event.origin !== childOrigin) return;
      if (event.data !== 'childReady' && !event.data.event) return;
      
      if (event.data === 'childReady') {
        // Send event that parent is ready
        event.source.postMessage('parentReady', event.origin);

        var pollCallback = function() {
          // Stop polling if maxTime reached
          maxGATime -= pollInterval;
          if (maxGATime <= 0) window.clearInterval(poll);
          
          // Only proceed if GA loaded and tracker accessible
          var ga = window[window['GoogleAnalyticsObject']];
          if (ga && ga.getAll) {
            // Get tracker that matches the Tracking ID you provided
            var tracker = ga.getAll().filter(function(t) { 
              return t.get('trackingId') === trackingId; 
            }).shift();
            
            // Send message back to frame with Client ID
            if (tracker) {
              event.source.postMessage({
                event: 'clientId',
                clientId: tracker.get('clientId')
              }, event.origin);
            }
            // Stop polling if not already done so
            window.clearInterval(poll);
          }
        };
        
        // Start polling for Google Analytics tracker
        var poll = window.setInterval(pollCallback, pollInterval);
      }
      
      // Push dataLayer message from iframe to dataLayer of parent
      if (event.data.event) {
        window.dataLayer.push(event.data);
      }
    };
    
    // Start listening for messages from child frame
    window.addEventListener('message', postCallback);
  })();
</script>

There's quite a lot happening here, so let's walk through the code! If you don't care about this deep-dive, you can skip right to how you might need to configure the parent page container to support the message forwarding setup.

Configuration

First, there's the configuration stuff:

// Tracking ID whose _ga cookie to use
var trackingId = 'UA-40669554-1';

// Maximum time in milliseconds to wait for GA tracker to load
var maxGATime = 2000;

// Set to the origin ("https://www.domain.com") of the iframe you want to communicate with
var childOrigin = 'https://www.gtmtools.com';

The trackingId should be set to the Google Analytics tracking ID of the tracker whose _ga cookie you want to use. This is because there might be multiple GA cookies each storing the Client ID for a different Tracking ID. If you're unsure, just type your regular tracking ID as the value of the trackingId variable.

Set maxGATime to the maximum amount of time that the page waits for Google Analytics to load. You can certainly set this a lot higher than 2000 (2 seconds) if you want, but I would recommend against indefinite polling.

Set the childOrigin to the origin of the <iframe> you want to send the data to. The origin is everything in the URL up to the first path slash. So, if the URL is https://www.simoahava.com/my-home-page/, the origin would be https://www.simoahava.com.

It's important to not have the trailing slash, as that's part of the path component and not the origin.
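If you're unsure what the origin of a given URL is, you can let the URL parser work it out. This is a small helper you could run in the browser console; the name getOrigin is just something I made up for the example, and it assumes a browser (or runtime) that supports the URL constructor.

```javascript
// Return the origin (scheme, host, and non-default port) of a URL.
// The result never includes a trailing slash or any part of the path.
function getOrigin(url) {
  return new URL(url).origin;
}
```

For example, getOrigin('https://www.simoahava.com/my-home-page/') returns 'https://www.simoahava.com', which is the format that childOrigin expects.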

The listener

On the last line of the main code block, we add the message listener:

window.addEventListener('message', postCallback);

This means that when an <iframe> sends a postMessage to the parent page, the listener fires and executes the postCallback function.

var postCallback = function(event) {
  if (event.origin !== childOrigin) return;
  if (event.data !== 'childReady' && !event.data.event) return;

  if (event.data === 'childReady') {
    // Send event that parent is ready
    event.source.postMessage('parentReady', event.origin);

    var pollCallback = function() {
      // Stop polling if maxTime reached
      maxGATime -= pollInterval;
      if (maxGATime <= 0) window.clearInterval(poll);

      // Only proceed if GA loaded and tracker accessible
      var ga = window[window['GoogleAnalyticsObject']];
      if (ga && ga.getAll) {
        // Get tracker that matches the Tracking ID you provided
        var tracker = ga.getAll().filter(function(t) { 
          return t.get('trackingId') === trackingId; 
        }).shift();

        // Send message back to frame with Client ID
        if (tracker) {
          event.source.postMessage({
            event: 'clientId',
            clientId: tracker.get('clientId')
          }, event.origin);
        }
        // Stop polling if not already done so
        window.clearInterval(poll);
      }
    };

    // Start polling for Google Analytics tracker
    var poll = window.setInterval(pollCallback, pollInterval);
  }
  
  // Push dataLayer message from iframe to dataLayer of parent
  if (event.data.event) {
    window.dataLayer.push(event.data);
  }
};

First, if the message received was not expected (e.g. it comes from a different <iframe> or has the wrong content), the callback stops execution.

If the message was from the <iframe>, the first thing that's done is notify the child frame that the parent has received the message and is ready to start bilateral communication.

Next, a window.setInterval starts polling the parent page every 200 milliseconds up to the default of 2 full seconds. With each poll, the script checks if the Google Analytics tracker has been created. If it has, the script takes the clientId from the tracker, and sends it back to the <iframe> using event.source.postMessage(). After this is done, the polling is manually stopped to avoid sending the Client ID to the <iframe> multiple times.

In essence, the parent page needs to wait for two things:

  1. For the child page to signal it is ready to receive messages.
  2. For the Google Analytics tracker to be available, so that the Client ID can be grabbed from it.

The last code block checks if the message from the child frame contains an object with the event property, in which case it pushes this entire object into the parent page dataLayer.
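One thing to be aware of with the guard at the top of the callback: the expression event.data.event throws if the child ever posts null or undefined. Here's a slightly more defensive variant of the same checks, written as a standalone function so that it can be tested without a browser. The name classifyMessage and its return values are my own invention for this sketch, not part of the tag above.

```javascript
// Decide what to do with an incoming message event.
// Returns 'handshake', 'dataLayer', or 'ignore'.
function classifyMessage(event, childOrigin) {
  // Reject anything that does not come from the expected origin
  if (!event || event.origin !== childOrigin) return 'ignore';
  var data = event.data;
  if (data === 'childReady') return 'handshake';
  // typeof null is 'object', so null has to be excluded explicitly
  if (data !== null && typeof data === 'object' && typeof data.event === 'string') {
    return 'dataLayer';
  }
  return 'ignore';
}
```

In the listener, 'handshake' would map to the parentReady response and the tracker polling, and 'dataLayer' to the window.dataLayer.push(event.data) call.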

Configuring the message forwarding system

The parent page listens for dataLayer messages forwarded from the embedded <iframe>. You can create tags, triggers, and variables that react to these messages.

All triggers will be of type Custom Event trigger. That's because all the events sent from the frame will be prefixed with iframe. (including the dot), even those without an event value (they'll be sent as iframe.Message). So, if you want to fire a tag when a link is clicked in the <iframe>, the trigger could look like this:

It requires quite a bit of manual configuration to get the whole thing up and running. But it can make your whole setup run much smoother, as you won't need to embed any extra code within the <iframe>.

Each message sent from the <iframe> is automatically enhanced with some page-level information:

{
  iframe: {
    pageData: {
      url: 'https://www.iframe-domain.com/iframe-page?iframequery=iframevalue#iframeHash',
      title: 'The Iframe Page Title | Iframe Company'
    }
  }
}

If you want to update your parent page tags to use the <iframe> page-level data, you need to create Data Layer variables for iframe.pageData.url (the URL of the <iframe> page) and iframe.pageData.title (the page title).
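If the dot notation is unfamiliar, this sketch approximates how a key such as iframe.pageData.url maps onto the nested message object. It is only an illustration: Google Tag Manager resolves Data Layer variables against its own internal data model, not against a single message, and the getByPath helper is a name I've picked for this example.

```javascript
// Resolve a dot-separated path against a nested object.
// Returns undefined if any step along the path is missing.
function getByPath(obj, path) {
  return path.split('.').reduce(function(current, key) {
    return (current !== null && current !== undefined) ? current[key] : undefined;
  }, obj);
}

// Example message as forwarded by the child frame
var message = {
  event: 'iframe.gtm.js',
  iframe: {
    pageData: {
      url: 'https://www.iframe-domain.com/iframe-page',
      title: 'The Iframe Page Title | Iframe Company'
    }
  }
};

var iframeUrl = getByPath(message, 'iframe.pageData.url');
var iframeTitle = getByPath(message, 'iframe.pageData.title');
```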

Once you have all these configured, you're ready to configure the <iframe> page!

The embedded (child) page setup

This is where things get tricky. In this guide, we're covering a fairly typical use case where the <iframe> content is only ever interacted with as an embed. So there's no scenario where the user would visit the page loaded in the <iframe> in a first-party or top-frame context. I'll briefly discuss that scenario later as well, but for now let's assume that the page in the <iframe> is only accessed as an embedded element.

You'll need to do three things in the Google Tag Manager container of the <iframe> page.

  1. Create a Custom HTML tag that communicates with the parent page.
  2. Update the settings for all of your Universal Analytics (and App+Web while you're at it) tags.
  3. Update the triggers for your Universal Analytics tags so they don't fire until they've received the Client ID from the parent.

The Custom HTML tag

Create a new Custom HTML tag, and set it to fire on the All Pages trigger. If the <iframe> is a single-page app, you should still only fire the Custom HTML tag on the All Pages trigger, and not with every SPA page change, for example.

Here's what you should copy-paste into the tag:

<script>
  (function() {
    // If not in iframe, do nothing
    try {
      if (window.top === window.self) return;
    } catch(e) {}
    
    // Set to false to prevent dataLayer messages from being sent to parent
    var sendDataLayerMessages = true;
    
    // Set the prefix that will be used in the event name, and under which all
    // the dataLayer properties will be embedded
    var dataLayerMessagePrefix = 'iframe';
    
    // Set to parent origin ("https://www.domain.com")
    var parentOrigin = 'https://www.simoahava.com';

    // Maximum time in milliseconds to poll the parent frame for ready signal
    var maxTime = 2000;
    
    // Don't touch anything that follows
    var pollInterval = 200;
    var parentReady = false;
    
    var postCallback = function(event) {
      if (event.origin !== parentOrigin) return;
      if (event.data.event !== 'clientId' && event.data !== 'parentReady') return;
      
      if (event.data.event === 'clientId') {
        window.dataLayer.push({
          event: 'clientId',
          clientId: event.data.clientId
        });
      }
      
      if (event.data === 'parentReady' && !parentReady) {
        window.clearInterval(poll);
        if (sendDataLayerMessages) startDataLayerMessageCollection();
        parentReady = true;
      }
    };
    
    var pollCallback = function() {
      // If maximum time is reached, stop polling
      maxTime -= pollInterval;
      if (maxTime <= 0) window.clearInterval(poll);
      // Send message to parent that iframe is ready to retrieve Client ID
      window.top.postMessage('childReady', parentOrigin);
    };
    
    var createMessage = function(obj) {
      if (!Array.isArray(obj) && typeof obj === 'object') {
        var flattenObj = JSON.parse(JSON.stringify(obj));
        var message = {};
        // Add metadata about the page into the message
        message[dataLayerMessagePrefix] = {
          pageData: {
            url: document.location.href,
            title: document.title
          }
        };
        for (var prop in flattenObj) {
          if (flattenObj.hasOwnProperty(prop) && prop !== 'gtm.uniqueEventId') {
            if (prop === 'event') {
              message.event = dataLayerMessagePrefix + '.' + flattenObj[prop];
            } else {
              message[dataLayerMessagePrefix][prop] = flattenObj[prop];
            }
          }
        }
        if (!message.event) message.event = dataLayerMessagePrefix + '.Message';
        return message;
      }
      return false;
    };
    
    var startDataLayerMessageCollection = function() {
      // Send the current dataLayer content to top frame, flatten the object
      window.dataLayer.forEach(function(obj) {
        var message = createMessage(obj);
        if (message) window.top.postMessage(message, parentOrigin);
      });
      // Create the push listener for future messages
      var oldPush = window.dataLayer.push;
      window.dataLayer.push = function() {
        var states = [].slice.call(arguments, 0);
        states.forEach(function(arg) {
          var message = createMessage(arg);
          if (message) window.top.postMessage(message, parentOrigin);
        });
        return oldPush.apply(window.dataLayer, states);
      };
    };
    
    // Start polling the parent page with "childReady" message
    var poll = window.setInterval(pollCallback, pollInterval);
    
    // Start listening for messages from the parent page
    window.addEventListener('message', postCallback);
  })();
</script>

The following chapters will walk through this code. You can skip right to configuring the Google Analytics tags if you don't care about the walkthrough.

Configuration

This is the first code block:

// If not in iframe, do nothing
try {
  if (window.top === window.self) return;
} catch(e) {}

// Set to false to prevent dataLayer messages from being sent to parent
var sendDataLayerMessages = true;

// Set the prefix that will be used in the event name, and under which all
// the dataLayer properties will be embedded
var dataLayerMessagePrefix = 'iframe';

// Set to parent origin ("https://www.domain.com")
var parentOrigin = 'https://www.simoahava.com';

// Maximum time in milliseconds to poll the parent frame for ready signal
var maxTime = 2000;

First of all, if the page is not loaded in an <iframe>, the code does not and should not execute at all. There's no parent page to communicate with, so executing any of the following code would be unnecessary.

You can set sendDataLayerMessages to false if you don't want to forward the dataLayer messages from the child to the parent. This is useful if you're comfortable with tracking everything within the <iframe> itself, and don't want to bother with setting up the corresponding tags in the parent page.

The dataLayerMessagePrefix value is what will be used to namespace the dataLayer messages that are forwarded from the child to the parent (if you haven't disabled forwarding per the previous paragraph).

The parentOrigin should be set to the origin of the parent page, i.e. the page to which the messages are sent. Remember, origin is everything from the protocol to the first slash of the path component. In other words, the origin of https://www.simoahava.com/some-page is https://www.simoahava.com.

Finally, the maxTime is how long the <iframe> tries to send the childReady message to the parent before it stops.

The poller

The child frame polls the parent page with the childReady message to signal it's ready for the bilateral communication to start.

var pollCallback = function() {
  // If maximum time is reached, stop polling
  maxTime -= pollInterval;
  if (maxTime <= 0) window.clearInterval(poll);
  // Send message to parent that iframe is ready to retrieve Client ID
  window.top.postMessage('childReady', parentOrigin);
};

// Start polling the parent page with "childReady" message
var poll = window.setInterval(pollCallback, pollInterval);

It runs every 200 milliseconds until the maximum poll time (2000 milliseconds by default) is reached.
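To see what those defaults mean in practice, here's a small simulation of the countdown in pollCallback, assuming the parent never answers. Note that the callback sends the message even on the tick where it clears the interval. The function countPollMessages is purely illustrative and not part of the tag.

```javascript
// Count how many childReady messages are sent if the parent never responds
function countPollMessages(maxTime, pollInterval) {
  var sent = 0;
  var polling = true;
  while (polling) {
    // Mirrors pollCallback: use up the time budget first...
    maxTime -= pollInterval;
    if (maxTime <= 0) polling = false;
    // ...and still send the message on the final tick
    sent += 1;
  }
  return sent;
}

var messagesWithDefaults = countPollMessages(2000, 200);
```

With the default values, the child frame gives up after ten attempts.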

The listener

At the same time as it starts polling, a postMessage listener is also initiated. This listener waits for two things:

  1. The parent to signal parentReady, so that the <iframe> can start forwarding its dataLayer messages to the parent.
  2. The parent to return a Client ID string, so that the <iframe> can use this in Google Analytics tags.

var postCallback = function(event) {
  if (event.origin !== parentOrigin) return;
  if (event.data.event !== 'clientId' && event.data !== 'parentReady') return;

  if (event.data.event === 'clientId') {
    window.dataLayer.push({
      event: 'clientId',
      clientId: event.data.clientId
    });
  }

  if (event.data === 'parentReady' && !parentReady) {
    window.clearInterval(poll);
    if (sendDataLayerMessages) startDataLayerMessageCollection();
    parentReady = true;
  }
};

// Start listening for messages from the parent page
window.addEventListener('message', postCallback);

If the parent sends a clientId message, then the page pushes this data into dataLayer, and Google Analytics tags firing in the <iframe> can utilize this information in their settings.

If the parent sends a parentReady message, the <iframe> stops sending the childReady message. Then, it fires up the dataLayer message forwarder (unless the user has chosen to prevent this).

The message forwarder

The forwarder comprises two methods: createMessage(obj) and startDataLayerMessageCollection().

Without going into too much detail, the basic setup is this:

  1. The createMessage() method is a utility that wraps the dataLayer message from the <iframe> page with the prefix configured in the beginning of the Custom HTML tag ("iframe" by default). This prefix is used with the event name (so gtm.js becomes iframe.gtm.js) as well as with the object itself (so {key: 'value'} becomes {iframe: {key: 'value'}}).
  2. The createMessage() method also automatically adds a pageData object which contains the url and title of the page in the <iframe>.
  3. The startDataLayerMessageCollection() method first sends everything in the dataLayer array collected thus far to the parent.
  4. Then the method rewrites the dataLayer.push function to send everything pushed into dataLayer to the parent page as well.

In other words, everything added to the dataLayer array within the <iframe> is forwarded to the parent page, so that the parent page can handle tracking of interactions and events within the <iframe> as well.
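The push "hijack" in step 4 is easier to digest in isolation. Below is a stripped-down sketch of the same pattern applied to a plain array, with a generic forward callback standing in for window.top.postMessage. The function name forwardPushes is hypothetical.

```javascript
// Wrap the push method of an array so that every pushed item
// is also handed to the forward callback.
function forwardPushes(dataLayer, forward) {
  // Keep a reference to whatever push method is currently in place
  var originalPush = dataLayer.push;
  dataLayer.push = function() {
    var args = [].slice.call(arguments, 0);
    args.forEach(function(arg) {
      forward(arg);
    });
    // Delegate to the original method so the array still gets the items
    return originalPush.apply(dataLayer, args);
  };
}

// Usage: collect forwarded messages into a separate array
var exampleDataLayer = [{ event: 'gtm.js' }];
var forwarded = [];
forwardPushes(exampleDataLayer, function(message) {
  forwarded.push(message);
});
var newLength = exampleDataLayer.push({ event: 'userLogin' }, { event: 'purchase' });
```

Because the original push is called at the end, anything that had already patched dataLayer.push (such as Google Tag Manager itself) keeps working.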

For example, here's how the forwarder transforms a dataLayer message before sending it to the parent page.

// This is what's pushed into dataLayer:
{
  event: 'userLogin',
  user: {
    id: 'abcd-1234',
    status: 'gold'
  }
}

// This is what's sent to the parent page
{
  event: 'iframe.userLogin',
  iframe: {
    pageData: {
      url: 'https://www.iframe-domain.com/iframe-page/',
      title: 'Iframe Page Title'
    },
    user: {
      id: 'abcd-1234',
      status: 'gold'
    }
  }
}
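If you want to try the transformation without loading a page, here's a standalone variant of createMessage that takes the prefix and the page metadata as parameters instead of reading them from the tag configuration and the document. This is a sketch for experimentation rather than a drop-in replacement, and it differs from the tag in one small way: it returns false for null instead of producing a message.

```javascript
// Standalone variant of createMessage; prefix and pageData are passed in
function createMessage(obj, prefix, pageData) {
  // Only plain objects are forwarded; arrays, null, and primitives are skipped
  if (Array.isArray(obj) || obj === null || typeof obj !== 'object') return false;
  // Clone via JSON so that only serializable values are passed to postMessage
  var clone = JSON.parse(JSON.stringify(obj));
  var message = {};
  message[prefix] = { pageData: pageData };
  Object.keys(clone).forEach(function(prop) {
    if (prop === 'gtm.uniqueEventId') return;
    if (prop === 'event') {
      // The event name is prefixed and kept at the top level
      message.event = prefix + '.' + clone[prop];
    } else {
      // Everything else is nested under the prefix
      message[prefix][prop] = clone[prop];
    }
  });
  if (!message.event) message.event = prefix + '.Message';
  return message;
}

var result = createMessage(
  { event: 'userLogin', user: { id: 'abcd-1234', status: 'gold' } },
  'iframe',
  { url: 'https://www.iframe-domain.com/iframe-page/', title: 'Iframe Page Title' }
);
```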

Google Analytics tags

If you do want to collect data to Google Analytics from within the <iframe> page, you need to configure all your Google Analytics tags with the following settings:

As the <iframe> will no longer use cookies to persist the Client ID, you need to set the storage field value to none.

Then, because the child frame uses the Client ID pushed into dataLayer by the message listener, you'll need to update the clientId field to use a Data Layer variable for clientId. It should look like this:

None of your tags should fire if the Client ID is not yet available. You can use the clientId in a Custom Event trigger to fire your tags when Client ID has been pushed to dataLayer.

You can also update your existing triggers to not fire until clientId has a valid value.
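For those who think in analytics.js terms rather than in Google Tag Manager fields, the two settings correspond to the storage and clientId fields of the tracker. Here's a hedged sketch of how they could be assembled, including the "don't fire without a Client ID" rule. The buildTrackerFields helper and the clientIdFromParent variable are hypothetical, and 'UA-XXXXX-Y' is a placeholder.

```javascript
// Build the tracker fields for a cookieless setup.
// Returns null if no usable Client ID is available yet.
function buildTrackerFields(clientId) {
  if (typeof clientId !== 'string' || clientId === '') {
    // No Client ID yet: the caller should hold back its hits
    return null;
  }
  return {
    storage: 'none',    // do not read or write the _ga cookie
    clientId: clientId  // use the Client ID received from the parent
  };
}

// On a page that loads analytics.js directly, the fields could be used like this:
// var fields = buildTrackerFields(clientIdFromParent);
// if (fields) ga('create', 'UA-XXXXX-Y', fields);
```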

Multi-purpose container

One important caveat to note is that if you set up all your tags per the instructions in the previous chapter, then the container running in the <iframe> will not function properly if the page is visited directly as a top-level page, i.e. not as an embed.

I would generally recommend against mixing too many use cases in a single container. It might be wiser to create a different container for scenarios where the page is visited directly, and have the developers update the site code to load the correct container, depending on whether the page is embedded in an <iframe> or not.

However, if you do want the container to cater to different use cases, then a good idea is to create a separate set of tags for when the page is accessed in the top frame vs. when the page is accessed as an embed.

You can use a simple utility variable to check if the page is accessed as an <iframe> or not. This Custom JavaScript variable returns true if the page is NOT in an <iframe> and false otherwise.

function() {
  try {
    return window.top === window.self;
  } catch(e) { return false; }
}

You can then check for this variable value in your tags, triggers, and variables, to make sure that tracking is configured correctly depending on how the page is accessed.

Summary

This has been a difficult article to write, and I feel a bit embarrassed to leave so much work to you, my noble reader.

The thing is - working with this type of a bilateral messaging setup requires both the parent page and the <iframe> page to be in sync. The Custom HTML tags I've designed have been built to naturally reject race conditions, but it's possible you'll need to modify one or the other (or both) to make things work on your site.

Over the course of my career, and especially over the time I've been blogging about Google Tag Manager, trying to perfect tracking of <iframe> elements has been my personal Mount Everest. Cross-site tracking of embedded content has taken over my life to such an extent that I'm losing sleep and seriously considering hunting down Ibrahim or Isabella Frame, or whoever the person is who thought the <iframe> is a great addition to the HTML spec.

Regardless, embedding cross-site content is still such a common use case on the SaaS- and microservice-rich web, and there's naturally such an incentive to know what happens in these <iframe> blocks, that I consider the time I've spent trying to solve this particular riddle to be almost worthwhile.

Anyway, I expect and hope that you have comments and/or questions about this setup. Let's continue the discussion in the comments of this article, thank you!