Performing a website audit to identify tag issues and understand overall data quality is a crucial element of proper data governance.
But how big should a website audit be? Should you audit everything?
If “yes” is your answer of choice, then you’re likely thinking of data governance as a form of omniscient data surveillance—you want to be able to see everything and fix everything as it occurs.
Unfortunately, this isn’t realistic. Auditing every single page on a website all the time poses significant challenges, even with automation. The main challenges are:
- Comprehensive audits take a lot of time to execute, process and analyze.
- Websites change frequently, so the data from a site-wide audit becomes obsolete relatively quickly.
- Once you pass a certain threshold of pages audited, you’re mostly just going to see more of the same tag behavior (especially for template-based websites).
Not to mention we humans—the end consumers of data—are not wired to ingest vast amounts of data, structured or unstructured.
Think about it this way: even a single vendor with one account may set 50 variables. On a 100,000-page site, that adds up to 5 million data points (50 × 100,000).
That is so much data that you’re not doing yourself any favors by auditing everything.
So what’s the answer? How do you identify serious issues in a timely manner without checking everything?
The answer is sample audits.
The Sample Website Audit
Performing a sample audit means validating a portion of the pages in each section of a website, where a section is a group of pages with comparable tagging profiles (e.g. product pages or blog pages). A sample audit gives you a pulse on what’s going on in each section of your site. The size of a sample audit will depend on the size of the corresponding section being audited.
You might be a bit leery about the idea of not auditing everything, because what if you miss something? Site-wide website audits do play a role in tag governance, but they should be an infrequent occurrence—they’re just too bulky and inefficient.
Sample audits are a highly effective method for quickly identifying systemic issues in sections of your site, keeping your data governance agile and efficient.
Sample Audit Best Practices
One of the principal challenges with smaller, section-specific audits is sampling error. Just as with sampling a human population for a survey, you want the pages in your sample audit to accurately represent the whole section (without over-auditing). You can avoid sampling error and “missing an issue” by:
- Starting with the “why”
- Optimizing your audit size
- Randomizing your audit pool
Start with the “why”
The urge to audit everything comes from a false belief that once you collect Big Data, the answers to your problems simply reveal themselves. As any analyst will tell you, the real value comes from segmenting the data into more consumable pieces for more targeted analysis. So it is with website audits.
When creating a website audit, you should have specific questions or business requirements in mind. These questions will usually be section-specific, corresponding to the scope of your audit.
A basic example would be to validate whether or not a campaign variable is being set on all of your landing pages. This is a very simple question, with a relatively narrow scope, and will easily help you align your auditing priorities.
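As a rough illustration of how narrow such a question can be, the sketch below checks a set of audited landing pages for a missing campaign variable. The data structure, the variable name "campaign" and the example URLs are all assumptions made for illustration; a real audit tool will expose its results in its own format.

```python
from typing import Dict, List

# Hypothetical audit output: each URL maps to the tag variables
# that were captured on that page.
PageVariables = Dict[str, str]


def pages_missing_variable(
    audit_results: Dict[str, PageVariables], variable: str
) -> List[str]:
    """Return the URLs where `variable` is absent or empty."""
    return sorted(
        url
        for url, variables in audit_results.items()
        if not variables.get(variable, "").strip()
    )


# Illustrative data only; "campaign" is an assumed variable name.
sample_results = {
    "https://example.com/landing/a": {"campaign": "spring_sale"},
    "https://example.com/landing/b": {"campaign": ""},
    "https://example.com/landing/c": {"page_name": "landing c"},
}

missing = pages_missing_variable(sample_results, "campaign")
print(missing)
```

A single question like this maps to a single check, which keeps both the audit scope and the resulting report small.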
The optimal audit size
The goal of a sample audit is to audit enough but not too much. There’s a sweet spot for pinning down the best audit size to avoid sampling error, and the size will vary from audit to audit.
The recommended upper bound of a sample audit is 10,000 pages. You should rarely exceed the 10,000-page audit ceiling, even if a section of your website has 100,000+ pages, as you’ll just get more of the same and drown yourself in data.
A good rule of thumb is to start off with a 500-page sample audit (with a specific question in mind). If your 500-page audit comes back completely clean, then you can bump up your audit in 500-page increments until you have answered your question or feel satisfied with your data quality.
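The rule of thumb above could be sketched roughly as follows. The `run_audit` callback is a hypothetical stand-in for whatever tool actually validates the pages, and stopping at the first batch that reports issues is an assumption about what “answering your question” means; adjust the stopping rule to fit your own question.

```python
from typing import Callable, List

STEP = 500        # starting size and increment from the rule of thumb
CEILING = 10_000  # recommended upper bound for a sample audit


def incremental_audit(
    urls: List[str], run_audit: Callable[[List[str]], List[str]]
) -> List[str]:
    """Audit in 500-page batches.

    Stops at the first batch that reports issues, when the section is
    exhausted, or once 10,000 pages have been audited.
    `run_audit` is a hypothetical callback that takes a batch of URLs
    and returns a list of issue descriptions.
    """
    limit = min(len(urls), CEILING)
    audited = 0
    while audited < limit:
        batch = urls[audited:min(audited + STEP, limit)]
        issues = run_audit(batch)
        audited += len(batch)
        if issues:
            return issues
    return []  # every audited batch came back clean
```

Each pass only audits pages that have not been checked yet, so a clean run over a very large section still ends after 20 batches.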
Note: You can also take a more statistical approach to determining audit size by using a sample size calculation formula.
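The article does not name a specific formula. One common choice is Cochran’s sample size formula with a finite population correction, sketched below. The defaults (95% confidence via a z-score of 1.96, a 5% margin of error, and a worst-case proportion of 0.5) are assumptions. The formula also treats pages as independent, which template-based sites only approximate, so read the result as a starting point.

```python
import math


def sample_size(
    population: int,
    margin_of_error: float = 0.05,
    z_score: float = 1.96,
    proportion: float = 0.5,
) -> int:
    """Cochran's formula with finite population correction.

    population: number of pages in the section being audited.
    proportion: expected share of pages with the issue; 0.5 is the
    most conservative choice and yields the largest sample.
    """
    # Sample size for an effectively infinite population.
    n0 = (z_score ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    # Shrink the sample for a finite number of pages.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


print(sample_size(100_000))
print(sample_size(1_000))
```

With these defaults, the required sample grows very slowly as the section gets larger, which is consistent with the advice to start small.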
Randomizing your audit pool
This principle is simple: when performing sample audits, don’t audit the same pages every time. Doing so could cause you to miss fatal issues on other pages. Audit different pages each time.
If you’re using an automated solution to validate your site, you can use the starting URL and include/exclude filters to limit where the audit goes. Varying these settings between runs gives you a fairly randomized pool to test.
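If you already have a list of URLs for a section (from a sitemap, for example), drawing a fresh random sample for each run is another way to vary the pool. The sketch below is a minimal illustration: the function name, the prefix-based include filter and the URL list are assumptions, not features of any particular audit tool.

```python
import random
from typing import List, Optional


def pick_audit_pool(
    urls: List[str],
    include_prefix: str,
    size: int,
    seed: Optional[int] = None,
) -> List[str]:
    """Randomly pick up to `size` URLs that start with `include_prefix`.

    Leave `seed` unset for a different pool on each run; pass a seed
    only when you need to reproduce a specific audit.
    """
    candidates = [u for u in urls if u.startswith(include_prefix)]
    rng = random.Random(seed)
    # Sampling without replacement, so no page appears twice.
    return rng.sample(candidates, min(size, len(candidates)))


# Illustrative URL list only.
all_urls = (
    ["https://example.com/blog/%d" % i for i in range(50)]
    + ["https://example.com/products/%d" % i for i in range(2000)]
)
pool = pick_audit_pool(all_urls, "https://example.com/products/", 500)
print(len(pool))
```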
The Power of the Sample Audit
Sample audits make tag validation agile, targeted, timely, actionable and relevant. With a wide portfolio of rules-governed audits, you’ll be able to stay on top of your data governance efforts without overexerting your resources on bulky site-wide audits.