Google Analytics is the most popular freemium web analytics service in the world and it is estimated that today over 50 million websites across hundreds of countries use the tool. Google Analytics today is nearly twelve years old and has come in several different shapes and sizes over the years adding ever increasing levels of insight.
The unrivaled level of data and tracking that it offers today means that if you don’t have it installed on your website you’re missing out on such crucial statistics as:
- What user demographics are visiting your site
- How they are reaching your site
- What pages on your site are performing well or badly
… and much more!
What webmaster in their right mind wouldn’t want access to such crucial data that can help them grow their business for free?
None, of course.
But what happens when you skew the data with spam? What happens when the statistics that you thought were giving you such in-depth insights into user behavior on your website are actually bogus? Well, it completely devalues the data and renders it almost useless, especially if you’re a smaller brand with much less website traffic than a huge company.
Whether you are a small business owner conducting landing page A/B tests or part of a large company and need to present data to your boss, analytics spam can cause you a mighty headache.
But don’t fear! This article will fully explain each type of spam that may affect your Google Analytics profile and exactly how you can remove it to prevent spam from ruining your data in the future.
How Does Analytics Spam Work & Why Would Anyone Bother?
There are several types of spam that affect Google Analytics but by far and away the most popular is referral spam, sometimes referred to as ghost spam, as often it never actually reaches the affected website. Apart from the contamination of statistics, luckily this spam technique never actually harms the affected sites.
Referral spam is the process that spammers use where they utilize web spiders or bots to send one or more fake hits to a Google Analytics account (a hit can take the form of anything from a page view to a transaction).
The spammers will target thousands of websites per day, presumably at random, so don’t be led to believe that you or your agency has committed an SEO crime which has plagued you with this problem!
In some cases, advanced spammers can make bots read your Google Analytics script that is embedded on your web page. They will then extract your unique Property ID, add it to their database and continually spam your site, making it look like they have sent huge amounts of referral traffic to your site and putting them top of your referral list in Google Analytics.
Sneaky.
"Okay, I get it so far, but why would anyone bother to create Google Analytics spam?" I hear you ask. Well, apart from the pure fun of it, there are a number of reasons people are doing this:
- Generate traffic of their own. Arguably the main reason for referral/ghost spam. Not only are spammers inflating and blemishing your traffic stats, they are also increasing their own through people’s natural curiosity to click their links, and it’s reported that they are in fact increasing sales with this tactic. Whether they redirect the link to a site where they are selling physical products or services or simply take you to sites covered in ads, the additional traffic is profitable.
- Get commission. Affiliates often get commission through increasing traffic statistics and this is an easy way for corrupt sites to do so.
- Propaganda. Believe it or not, some users are even creating referral spam to spread their own personal political beliefs and propaganda. The most notorious case being the pro-Trump spam spread by Russian spammer Vitaly Popov.
- Black Hat SEOs. There have been rumors that in some cases SEOs or marketing agencies have used referral spam as a way of bloating their own clients’ traffic statistics to give a false impression of success.
- Harvesting emails. Spammers collect email addresses and then sell them to 3rd parties who can use them for bulk email campaigns.
- Spreading malware. There’s nothing spammers like more than causing havoc for the sake of it.
Now we’ve established what exactly Analytics spam is, why anyone bothers to create it, and how harmful it can be for your Google Analytics account and business, let’s talk about how we can prevent it.
Know the Different Types of Google Analytics Spam Out There (& How to Prevent Them)
Google Analytics spam comes in different shapes and forms. Here are some common varieties to be aware of.
Referral Spam
Referral Spam is the main method here and one we’ve touched on already. It started a few years ago with the main culprits being sites such as "semalt" and "buttons-for-websites" and now it’s rare to see an analytics account that doesn’t have "abc.xyz", "free-traffic.xyz" or "ilovevitaly.com" as a referrer.
How can you stop it? One of the first suggestions that surfaced on forums, social media and SEO news sites was blocking the related URLs through your .htaccess file in the root directory of your domain.
How to Block Spam Bots in .htaccess
This method involves copying and pasting a bunch of code into your site and can be dangerous when done incorrectly, as the .htaccess file is extremely important and defines how your server behaves. Entering the code incorrectly can take down the whole site, so exercise this technique with extreme caution.
If you do feel skilled enough at having a go at this method then check out this handy guide on blocking spam bots in .htaccess.
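To give a feel for what such blocking rules look like, here is a minimal Python sketch that generates them from a list of spam domains. It assumes an Apache server with mod_rewrite enabled; the function name `build_htaccess_rules` and the sample domains are illustrative only, and the output should be reviewed (and your existing .htaccess backed up) before anything is pasted in.

```python
def build_htaccess_rules(spam_domains):
    """Return mod_rewrite lines that answer 403 Forbidden to the given referrers.

    Assumes Apache with mod_rewrite enabled; this is a sketch, not a drop-in file.
    """
    # De-duplicate and normalise the input so the output is stable.
    domains = sorted({d.strip().lower() for d in spam_domains if d.strip()})
    if not domains:
        return ""

    lines = ["RewriteEngine On"]
    for index, domain in enumerate(domains):
        # Escape dots so "abc.xyz" only matches a literal dot.
        pattern = domain.replace(".", r"\.")
        # Every condition except the last is chained with OR; NC ignores case.
        flags = "[NC]" if index == len(domains) - 1 else "[NC,OR]"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {pattern} {flags}")
    # "-" means no substitution; the F flag returns 403 Forbidden.
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_htaccess_rules(["semalt.com", "abc.xyz"]))
```

Generating the rules from a list keeps the `[NC,OR]` chaining consistent, which is the part that is easiest to get wrong when editing the file by hand.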
Although in a lot of cases this works, as mentioned previously, most of the bots now do not even visit the website, rendering blocking their URL in your .htaccess useless. It’s also extremely time-consuming to block all the URLs, especially as there are new ones seemingly popping up every day. There must be a better solution…
Filtering Spam Bots with Custom Filters in Google Analytics
Utilizing Custom Filters was the most effective workaround to the Google Analytics referral spam problem for a long while and it’s still a handy solution to know. It’s a much easier method to implement than blocking spam domains through .htaccess and you don’t need any coding skill. The only potential hazard here is filtering out the wrong set of data and further polluting your reports by entering an incorrect filter string.
Follow the guide below and you won’t have any problems:
Step 1: Head over to your Google Analytics profile and click Acquisition > All Traffic > Referrals:
Step 2: Sort Referral Traffic by Bounce Rate:
And make sure that you’ve set the timeframe to at least a couple of months:
A 0% or 100% bounce rate with at least 20 sessions is a strong indicator that the referrer domain is spammy. If you’re unsure whether the domain you’re looking at is, in fact, spammy and you want to make sure you’re not excluding legitimate traffic sources, then you can check it against our trusted Ultimate Referral List (a 128 domain strong list of referral spammers).
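If you export the Referrals report, the rule of thumb above is easy to apply in bulk. The sketch below is only an illustration: the row format (dictionaries with `source`, `sessions` and `bounce_rate` keys) is an assumption about how you might load your export, and the sample figures are made up.

```python
def flag_suspicious_referrers(rows, min_sessions=20):
    """Return referrer sources whose bounce rate is exactly 0% or 100%.

    `rows` is assumed to be a list of dicts with the hypothetical keys
    "source", "sessions" and "bounce_rate" (a percentage from 0 to 100).
    """
    suspicious = []
    for row in rows:
        enough_traffic = row["sessions"] >= min_sessions
        extreme_bounce = row["bounce_rate"] in (0.0, 100.0)
        if enough_traffic and extreme_bounce:
            suspicious.append(row["source"])
    return suspicious


if __name__ == "__main__":
    # Made-up sample rows, purely for illustration.
    report = [
        {"source": "free-traffic.xyz", "sessions": 140, "bounce_rate": 100.0},
        {"source": "abc.xyz", "sessions": 35, "bounce_rate": 0.0},
        {"source": "reddit.com", "sessions": 60, "bounce_rate": 54.2},
        {"source": "tiny-blog.example", "sessions": 3, "bounce_rate": 100.0},
    ]
    print(flag_suspicious_referrers(report))
```

Anything this flags is a candidate for review rather than a verdict, so it is still worth checking each domain before excluding it.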
Step 3: What to do if you can't find the domain?
If you can’t find the domain in our list or other user-generated “Referral Spam Domain Lists” that you can find online, such as ...
- https://github.com/piwik/referrer-spam-blacklist
- https://referrerspamblocker.com/blacklist
- https://perishablepress.com/4g-ultimate-referrer-blacklist/
... then consider searching for the site’s domain on Google or social media to see what others are saying about it. Remember to err on the side of caution and not visit the potentially spammy sites, as we already know they can be malicious and full of viruses.
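Checking a long list of referrers against a blacklist by eye gets tedious, so here is a small sketch of how the lookup could be automated. It assumes you have already loaded a blacklist into a Python set of bare domains (one entry per domain); the helper names `normalize_host` and `is_blacklisted` are made up for this example.

```python
from urllib.parse import urlparse


def normalize_host(referrer):
    """Reduce a referrer URL or bare domain to a lowercase hostname without "www."."""
    # urlparse only recognises the host part when the string contains "//".
    if "//" not in referrer:
        referrer = "//" + referrer
    host = urlparse(referrer).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_blacklisted(referrer, blacklist):
    """True if the referrer's host, or any parent domain of it, is in the set."""
    parts = normalize_host(referrer).split(".")
    # For "forum.abc.xyz" this checks "forum.abc.xyz", "abc.xyz" and "xyz".
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))


if __name__ == "__main__":
    blacklist = {"abc.xyz", "semalt.com"}  # illustrative entries only
    for candidate in ["https://www.semalt.com/crawler", "forum.abc.xyz", "reddit.com"]:
        print(candidate, is_blacklisted(candidate, blacklist))
```

Matching on parent domains is a deliberate choice here, since spammers often rotate subdomains while keeping the same root domain.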
Step 4: Now that you have collated your final list of spammy referring domains, we can block them using “Custom Filters”. Head over to your Google Analytics profile and hit the “Admin” tab.
Then under the View column, you will want to click the Filters button:
Once you’ve navigated to the “Filters” page click the big red Add Filter button:
Step 5: Once you’re on the Edit Filter page you can enter the name that you’d like to call the filter. In the example below I have simply labeled it “Spam Referrers” but you may wish to make this more descriptive or specific, especially if you have a lot of spam sites that you are wishing to filter out.
Select the Custom Filter Type and check the Exclude option from the options below. Under Filter Field, you will want to select “Campaign Source”.
Next, you’ll want to add in the Filter Pattern as you see it in the example below.
“example.com|example2.com|example3.com” etc.
The pipe symbol “|” is used as an "or" operator, meaning that traffic from example.com or example2.com or example3.com will be excluded. However, don’t end the filter pattern with a pipe, as Google themselves warn that this “will exclude ALL referral sources”.
Remember that you can only add in around ten or so spam referring domains per filter, as there is a 255 character limit for each filter pattern. To get around this, simply add in more filters until you’ve blocked all of the desired domains.
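Splitting a long domain list into patterns that respect the 255 character limit is fiddly by hand, so here is a rough Python sketch of one way to do it. The function name `build_filter_patterns` is made up for this example, and escaping the dots is an optional tightening that the pattern shown above does not use (an unescaped dot matches any character, which is usually harmless here but slightly looser).

```python
MAX_PATTERN_LENGTH = 255  # the per-filter limit mentioned in the article


def build_filter_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Group domains into pipe-separated patterns no longer than max_length."""
    patterns = []
    current = ""
    for domain in domains:
        escaped = domain.strip().lower().replace(".", r"\.")
        if not escaped:
            continue
        if len(escaped) > max_length:
            raise ValueError(f"domain too long for a single filter: {domain}")
        # The pipe is only ever placed between two domains, never at the end.
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) <= max_length:
            current = candidate
        else:
            patterns.append(current)
            current = escaped
    if current:
        patterns.append(current)
    return patterns


if __name__ == "__main__":
    for number, pattern in enumerate(
        build_filter_patterns(["example.com", "example2.com", "example3.com"]), start=1
    ):
        print(f"Spam Referrers {number}: {pattern}")
```

Each returned string can be pasted into its own filter, and because the pipe is only added between domains, a pattern never ends with one.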
Referral Exclusion List
The final method of blocking spam referral domains in Google Analytics that we’re going to look at is called the “Referral Exclusion List” method and is the first solution that Google themselves have created for excluding rogue domains.
The first thing that needs mentioning about this method is that it only works if you’re using the newer Universal Analytics rather than Classic Google Analytics.
When Google released this new feature within Universal Analytics they highlighted two common uses for it:
- Third Party Referrals. Especially payment processors. After you have added a payment processor into your Referral Exclusion List, when a user leaves your site to the payment processor then returns to your site following a purchase, the processor won’t show as the referring domain.
A great example of this is if you use PayPal and you constantly see that as a referring source. Add PayPal as a referral exclusion and this will no longer skew the data.
- Self-Referrals. If you have several subdomains that a user may jump back and forth from you can add your own site into the exclusion list to prevent artificially inflating traffic data.
Your site is automatically added to this when you set up your Analytics property so needs no implementation.
However, Google make no mention of using this technique to block spam and some will argue that there is a reason behind this!
Google states:
“When you exclude a referral source, traffic that arrives to your site from the excluded domain doesn’t trigger a new session.”
But this doesn’t mean that the visit has been completely removed from your Analytics data.
In some cases, it has been reported that Google Analytics merely attributes the source of the visit differently, from a referral to “direct” traffic. It has removed the referral spam attribution but the data can still be ruined. Again, exercise with caution!
With all that in mind, here’s how to access the Referral Exclusion List and block single domains:
Step 1: Head over to the Admin tab in your Google Analytics account then click Tracking Info within the Property column:
Step 2: Under the “Tracking Info” dropdown select “Referral Exclusion List”
...then click the “+ ADD REFERRAL EXCLUSION” button.
Step 3: Simply add in the domains that you wish to exclude from your referral traffic. One downside to this technique is that you can’t add domains in bulk, which is another reason that the filter method (mentioned previously) may be more efficient.
Language Spam & The Future of Spam
Just when we thought we had it all figured out, along came a brand new breed of spam ready to pollute our beautiful Analytics data once more!
Towards the back end of 2016, a new wave of Analytics spam hit many accounts: Language Spam. Ever notice sentences similar to this in your Language Report on Analytics?
How about these:
- o-o-8-o-o.com search shell is much better than google!
- Vitaly rules google ☆*:｡゜ﾟ･*ヽ(^ᴗ^)丿*･゜ﾟ｡:*☆ ¯\_(ツ)_/¯(ಠ益ಠ)(ಥ‿ಥ)(ʘ‿ʘ)ლ(ಠ_ಠლ)( ͡° ͜ʖ ͡°)ヽ(ﾟДﾟ)ﾉʕ•̫͡•ʔᶘ ᵒᴥᵒᶅ(=^ ^=)oO
The Language Spam problem was combined with yet more referral spam and often saw sites such as brateg.xyz, begalka.xyz, bukleteg.xyz etc. in your data. Eventually, the referral spam was managing to use legitimate websites such as reddit.com, twitter.com & thenextweb.com to disguise itself.
Reports suggest that Language Spam works in much the same way as the standard referral spam we saw previously, which means that it’s easy enough to block.
Once the Language Spam has been recorded there is no way to permanently remove it from your historical data but you can filter it to report accurate historical statistics. You can also utilize the same methods to block Language Spam as you can with Referral Spam as we explain in detail further up the page.
Here’s how you can utilize Custom Segments to view accurate historical data even after you’ve been hit by Language Spam:
Step 1: On report view in your Google Analytics account click “+ Add Segment” at the top of the page ...
... then click the big, red “+ New Segment” button.
Step 2: Staying on the “Demographics” tab that you default to, give your new Segment a simple, easy to identify name:
Step 3: Next up, you want to select “does not match regex” from the drop-down next to “Language”, enter the following expression and then hit save:
.{15,}|\s[^\s]*\s|\.|,|!|/
The expression above excludes any traffic where the language value contains 15 or more characters. Genuine language codes, en-us, en-gb or fr for example, don’t go anywhere near that limit, yet the Language Spam entries often do.
It also excludes values containing symbols which are invalid in the language field, such as dots, commas, exclamation marks and slashes, which are used to construct these spammy messages and domain names.
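Before saving the segment, it can be reassuring to test the expression against a few language values. The sketch below does that in Python. It assumes the expression is written with explicit backslash escapes (`\s` for whitespace, `\.` for a literal dot) and that the segment condition matches anywhere in the value, which is why `re.search` is used. Python's regex engine is not the one Google Analytics uses, so treat this as a sanity check only.

```python
import re

# The segment expression, with whitespace and the literal dot escaped (assumption).
LANGUAGE_SPAM = re.compile(r".{15,}|\s[^\s]*\s|\.|,|!|/")


def is_language_spam(value):
    """True if the language value would be dropped by a "does not match regex" segment."""
    return LANGUAGE_SPAM.search(value) is not None


if __name__ == "__main__":
    samples = [
        "en-us",
        "fr",
        "o-o-8-o-o.com search shell is much better than google!",
        "Vitaly rules google",
    ]
    for value in samples:
        verdict = "spam" if is_language_spam(value) else "ok"
        print(f"{verdict:4}  {value}")
```

Short, well-formed codes such as en-us pass through untouched, while anything long, multi-word or containing a dot, comma, exclamation mark or slash is flagged.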
Once saved, you can now look back at data in your Google Analytics account over any period of time and analyze non-infiltrated stats. Lovely!
Google Analytics spam is not a new thing and has been frustrating website owners and marketers alike for quite a while now. Whilst these techniques let you block most of the spam coming through, there will no doubt be something else affecting your account somewhere along the line. Keep an eye out for any suspicious-looking referrers, as the methods in this article may well still help you deal with them.