In my intense love affair with the Google Cloud Platform, I’ve never felt more inspired to write content and try things out. After starting with a Snowplow Analytics setup guide, and continuing with a Lighthouse audit automation tutorial, I’m going to show you yet another cool thing you can do with GCP. In this guide, I’ll show you how to use an open-source web crawler running in a Google Compute Engine virtual machine (VM) instance to scrape all the internal and external links of a given domain, and write the results into a BigQuery table.
Google Cloud Platform is very, very cool. It’s a fully capable, enterprise-grade, scalable cloud ecosystem which lets even total novices get started with building their first cloud applications. I wrote a long guide for installing Snowplow on the GCP, and you might want to read that if you want to see how you can build your own analytics tool using some nifty open-source modules. But this guide will not be about Snowplow.
This article is a collaboration between Simo and Dan Wilkerson. Dan’s one of the smartest analytics developers out there, and he’s already contributed a great #GTMTips guest post. It’s great to have him back here sharing his insight on working with Accelerated Mobile Pages (AMP). So, we’re back on AMP! Simo wrote a long, sprawling AMP for Google Tag Manager guide a while ago, and Dan has also contributed to the space with his guide for AMP and Google Analytics.