How we get around data sampling issues in Google Analytics without spending 150k

Kevin McCarthy
2 min readFeb 18, 2021

TL:DR Create a bunch of individual reports in Google Sheets underneath the sampling limit and stitch the different reports together using a query. Example given is how we stitch together four reports from four worksheets into one sheet.

=QUERY({report_1!A16:D; report_2!A16:D; report_3!A16:D; report_4!A16:D}, "where Col1 <>''")

When Google Analytics Samples?

Google Analytics samples data anytime your looking at daterange with over 500k sessions. Per About data sampling

Ad-hoc queries of your data are subject to the following general thresholds for sampling:

Analytics Standard: 500k sessions at the property level for the date range you are using

Analytics 360: 100M sessions at the view level for the date range you are using

So you think okay, maybe I need to upgrade to Analytics 360. Unfortunately this upgrade starts at about $150,000 a year. So not really an option for us.

What’s wrong with sampled data?

We found it can’t be trusted. We saw wildly different results between sampled reports vs unsampled reports

Steps to get around the sampling using Google Sheets

  1. Install the Google Analytics add-on for Google Sheets (here’s a guide)
  2. Create your report for the dataset you want
  3. Check out the “Contains Sampled Data” field in the output data
  4. Change the date range until you are no longer being sampled
  5. Copy and paste the report settings across and update the title and the date range so you cover the entire daterange you want

eg

report_1 7daysAgo to 1daysAgo,

report_2 14daysAgo to 8 daysAgo,

report_3 21daysAgo to 15daysAgo,

report_4 28daysAgo to 22daysAgo

6. Create a sheet which will bring together all the data from all the reports using query like

=QUERY({report_1!A16:D; report_2!A16:D; report_3!A16:D; report_4!A16:D}, "where Col1 <>''")

That sounds tedious — is there another way?

These people say they can generate unsampled reports for you http://www.unsampler.io/ but I’ve never used them

If anyone has a more elegant solution to this problem please do let me know in the comments. This does work but is very time-consuming.

--

--