Simple RegEx Table For Google Tag Manager
When our good friends in the Google Tag Manager developer team first introduced the Lookup Table Macro, we were excited. For many of us, it soon became the weapon of choice, especially when used as a management and optimization tool for the container itself.
However, the macro wasn’t considered perfect. In fact, the most frequently heard request had to do with the core functionality of the feature itself: the macro should support operations, that is, predicate logic. It’s not enough to just have equal match lookups; people wanted support for operations such as “is x larger than y” or “does y contain x”.
The thing is, I don’t agree with changing the Lookup Table Macro to support these types of operations. Sure, a table whose values you can query with more complex operations than simple lookups would be awesome, but it wouldn’t be a lookup table anymore. We’d need a different variable type for those use cases.
Look it up!
Even though I’m a product of the unsurpassed Finnish education system, I suck at ornithology. So you’ll excuse me for the following, clumsy metaphor.
Consider the homing pigeon. It has an intimate knowledge of a location, and it flies to that location. If there is nothing there, it gets confused and poops. If it does find a recipient or a message, it does its thing and coos happily.
Well, when you have a lookup table, it’s the same thing. You use a variable reference to point to a specific cell in a table. If this cell exists, any value stored within is returned. If the cell doesn’t exist, the script gets confused and poops an undefined or an error.
This is what makes lookup tables so incredibly efficient. It’s all based on binary logic.
There are no complex operations, no predicates to be evaluated. It’s just a question of “does table X have a value under label Y”.
If you use a plain object as a lookup table, it’s common to call it an associative array or a hash table, but we’ll call them lookup tables here for clarity.
So, you can perform lookups on all Array-like structures. A plain object, an Array, and a String can all be used for lookups like this:
// Plain object
var newValue = objectTable['key'];
// Array (index 0 is illustrative)
var anotherValue = arrayTable[0];
// String (index 0 is illustrative)
var newestValue = 'String'[0];
With the plain object, you can also use dot notation in some cases.
As you can see, you’re directly requesting a specifically labelled value in the table, and if it exists, it’s returned to you without any further operations.
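To make the three lookups concrete, here is a minimal sketch. The tables and their contents (objectTable, arrayTable, stringTable) are made up for illustration; only the lookup syntax matters.

```javascript
// Hypothetical tables, invented for this example
var objectTable = { key: 'value' };
var arrayTable = ['zero', 'one', 'two'];
var stringTable = 'String';

// Plain object: bracket notation, or dot notation when the label
// is a valid identifier
var newValue = objectTable['key'];   // 'value'
var sameValue = objectTable.key;     // 'value'

// Array: the label is the index
var anotherValue = arrayTable[1];    // 'one'

// String: the label is the character position
var newestValue = stringTable[0];    // 'S'

// A label that doesn't exist: nothing is searched, you just get undefined
var missing = objectTable['Key'];    // undefined (labels are case-sensitive)
```

Note that the last lookup doesn't fall back to 'key'. The label either exists exactly as written or it doesn't.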
Now, if you were to introduce predicate logic into the mix, with something like
table['[Kk]ey'] (fictional example), the lookup would have to check the cells of the table one by one to see whether each is labelled either ‘Key’ or ‘key’.
That traversal is unavoidable, because you can’t label cells with regular expressions or dynamic values (e.g. ‘Key/key’).
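Here is a rough sketch of what that traversal looks like in practice. The table contents and the helper findByLabelPattern are hypothetical; JavaScript has no built-in way to look up a label by pattern, which is exactly the point.

```javascript
// Hypothetical table, invented for this example
var table = { home: 'Home page', Key: 'Upper', key: 'lower' };

// Direct lookups: one exact label each
var direct = table['key'];        // 'lower'
var literal = table['[Kk]ey'];    // undefined, the pattern is treated as a plain label

// Hypothetical helper: test each label in turn until the pattern matches
function findByLabelPattern(table, pattern) {
  var labels = Object.keys(table);
  for (var i = 0; i < labels.length; i += 1) {
    if (pattern.test(labels[i])) {
      return table[labels[i]];
    }
  }
  return undefined;
}

// Checks 'home', then 'Key', and stops at the first match
var matched = findByLabelPattern(table, /^[Kk]ey$/);   // 'Upper'
```

The helper returns the first matching cell, so the order of the cells now matters as well, which is never the case with a plain lookup.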
The difference between the binary check of the lookup vs. the traversal of a more complex operation becomes clearer when thinking in terms of performance.
Queries on a lookup table are said to work in constant time. Since you’re querying for a specific label in a table of arbitrary size, the complexity of the operation will always be the same. Either the label exists or it doesn’t. The table can be huge or it can be minuscule, the performance is always the same.
Performance is usually indicated with Big O notation. The notation for constant time (i.e. the lookup) would be O(1).
When using predicate logic, you achieve O(1) only if you match the query with the first cell that is checked. Every additional cell that has to be checked adds to the cost, so the work grows linearly with the number of cells checked. Thus, comparison logic is said to work in linear time.
Describing linear time operations with O(1) would be fairly optimistic. For this reason, Big O notation tends to describe the worst-case scenario. The worst-case scenario of linear time would be that the value is in the very last cell that is queried. Thus, the notation would be O(n), where n is the number of cells in the table.
This also means that the larger the table, the more expensive the operation becomes, in terms of performance.
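One way to see the difference is to count how many cells a pattern scan has to check. The sketch below uses made-up helpers (buildTable, countChecks) and made-up labels. It counts checks rather than measuring time, so it only illustrates how the work grows.

```javascript
// Hypothetical helper: build a table with n labels, 'label0' ... 'label(n-1)'
function buildTable(n) {
  var table = {};
  for (var i = 0; i < n; i += 1) {
    table['label' + i] = i;
  }
  return table;
}

// Hypothetical helper: count the labels a pattern scan checks before it matches
function countChecks(table, pattern) {
  var labels = Object.keys(table);
  var checks = 0;
  for (var i = 0; i < labels.length; i += 1) {
    checks += 1;
    if (pattern.test(labels[i])) {
      break;
    }
  }
  return checks;
}

var small = buildTable(10);
var large = buildTable(1000);

// Best case: the match is in the first cell
var bestCase = countChecks(large, /^label0$/);         // 1

// Worst case: the match is in the last cell, so every cell is checked
var worstSmall = countChecks(small, /^label9$/);       // 10
var worstLarge = countChecks(large, /^label999$/);     // 1000

// The direct lookup is a single operation regardless of table size
var directValue = large['label999'];                   // 999
```

The worst case grows with the size of the table (10 checks versus 1000), which is what O(n) expresses, while the direct lookup stays a single step.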
With small tables this difference is pretty trivial, but with large tables and multitudes of chained variables, the performance hit can be significant, especially if it takes time to make the match, and the labels are arbitrary enough that you can’t use facilitating data structures or search algorithms.
So if you’re concerned about performance, and you should be if it’s a web page, always use the Lookup Table variable.
Well, I know you’re not satisfied with my explanation, and you’re still craving a more flexible way to fetch values from a table.
I hope the GTM developers will, at some point, introduce another variable type that’s essentially a lookup table but where you can specify the predicate logic used row-by-row.
Until then, you can make do with workarounds such as the script below.
Here’s how the variable works:
First, you give it the input: some variable or value that you want to assess in the table rows.
Next, you insert the rows. Rows are actually Arrays within the tableArray, and each row has up to three cells:
1. The first cell contains the regular expression you want to evaluate against the input (note, you will need to _double_ escape reserved characters!)
2. The second cell contains the value that is returned if a match is made
3. The third cell is optional, and can contain any regular expression flags (e.g. 'g', 'i') you might want to use
Finally, there’s a little for-loop which loops through each row of the tableArray, checking the regular expression against the input variable. If and when a match is made, the specified return value is returned by the function.
If no match is made, the specified default value is returned.
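The steps above can be sketched roughly as follows. This is a reconstruction from the description, not necessarily the exact script: the function name regexTable, its parameters, and the example rows are assumptions. In a GTM Custom JavaScript variable the logic would live inside an anonymous function() { ... }, with the input coming from a variable reference. A named function is used here so the sketch runs anywhere.

```javascript
// Sketch of the RegEx table logic (names and rows are hypothetical)
function regexTable(input, tableArray, defaultValue) {
  for (var i = 0; i < tableArray.length; i += 1) {
    var row = tableArray[i];
    // row[0]: pattern as a string, row[1]: return value, row[2]: optional flags
    var regex = new RegExp(row[0], row[2] || '');
    if (regex.test(input)) {
      return row[1];
    }
  }
  // No row matched
  return defaultValue;
}

// Example rows, invented for illustration
var tableArray = [
  ['^/products/', 'Products'],
  ['^/search\\?q=', 'Search'],    // "?" is reserved, so it is double escaped
  ['^/ABOUT', 'About', 'i']       // third cell: case-insensitive flag
];

var products = regexTable('/products/shoes', tableArray, 'Other');   // 'Products'
var search = regexTable('/search?q=gtm', tableArray, 'Other');       // 'Search'
var about = regexTable('/about-us', tableArray, 'Other');            // 'About'
var fallback = regexTable('/contact', tableArray, 'Other');          // 'Other'
```

Because the loop returns on the first match, the order of the rows matters: put the more specific patterns before the more general ones. The double escape is needed because the pattern is written as a string, so '\\?' reaches the RegExp constructor as \? and matches a literal question mark.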
As always, this solution is educational first, a proof-of-concept second, and a usable, out-of-the-box workaround last. So feel free to modify it to your own purposes, or just ditch it completely.
The key takeaway from this article should be an understanding of how Lookup Tables work, and how much more complicated they would get if operational logic were introduced as well. For that reason, my feature request remains that the Lookup Table be kept as it is, and that a new variable type be introduced, where you can specify the operation on a row-by-row basis. This way, everyone wins.
The book has the basics summed up really well. The next step would be to grab a book about design patterns and more complex data structures. It’s all very educational.