How to Crush Google Analytics Spam in 2017

Google Analytics is the most popular freemium web analytics service in the world, and it is estimated that over 50 million websites across hundreds of countries use the tool. Now nearly twelve years old, it has come in several different shapes and sizes over the years, adding ever-increasing levels of insight.

The unrivaled level of data and tracking that it offers today means that if you don’t have it installed on your website you’re missing out on such crucial statistics as:

  • What user demographics are visiting your site
  • How they are reaching your site
  • What pages on your site are performing well or badly

… and much more!

What webmaster in their right mind wouldn’t want access to such crucial data that can help them grow their business for free?

None, of course.

But what happens when you skew the data with spam? What happens when the statistics that you thought were giving you such in-depth insights into user behavior on your website are actually bogus? Well, it completely devalues the data and renders it almost useless, especially if you’re a smaller brand with much less website traffic than a huge company.

Whether you are a small business owner conducting landing page A/B tests or part of a large company needing to present data to your boss, analytics spam can cause you a mighty headache.

But don’t fear! This article will fully explain each type of spam that may affect your Google Analytics profile and exactly how you can remove it to prevent spam from ruining your data in the future.

Before We Continue: Get Two Free Google Analytics Bonus Guides + 5 Custom Reports

Google Analytics can be complicated enough without worrying about spam. Once you’ve filtered out all the junk from your account, learn how to leverage it for success with these free resources:

  • Five Questions About Your Site Google Analytics Can Answer: Learn how Google Analytics can help you better understand your site’s performance.
  • Write Smarter Content With Google Analytics: Learn how to write content your audience wants, backed by real data.
  • Best Times Google Analytics Custom Reports: Get pre-built one-click dashboards to help you identify the best times to post on social media, send email, and more.

How Does Analytics Spam Work & Why Would Anyone Bother?

There are several types of spam that affect Google Analytics, but by far the most common is referral spam, sometimes referred to as ghost spam because it often never actually reaches the affected website. Luckily, apart from contaminating your statistics, this spam technique doesn’t actually harm the affected sites.

Referral spam is a technique in which spammers use web spiders or bots to send one or more fake hits to a Google Analytics account (a hit can take the form of anything from a page view to a transaction).

The spammers will target thousands of websites per day, presumably at random, so don’t be led to believe that you or your agency has committed an SEO crime which has plagued you with this problem!

In some cases, advanced spammers can make bots read your Google Analytics script that is embedded on your web page. They will then extract your unique Property ID, add it to their database and continually spam your site, making it look like they have sent huge amounts of referral traffic to your site and putting them top of your referral list in Google Analytics.

Sneaky.
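To make the mechanics a little more concrete, here is a minimal Python sketch of what a single hit looks like as data. It assumes the parameter names from Google’s publicly documented Measurement Protocol for Universal Analytics (tid for the Property ID, dr for the referrer, ul for the user language). The sample payload and every value in it are invented for illustration, and the helper name describe_hit is hypothetical.

```python
from urllib.parse import parse_qs

# Hypothetical hit payload; every value here is made up for illustration.
SAMPLE_HIT = (
    "v=1&tid=UA-XXXXX-Y&cid=555&t=pageview&dp=%2F"
    "&dr=http%3A%2F%2Fspam.example%2F&ul=en-us"
)


def describe_hit(payload: str) -> dict:
    """Return the fields of a hit payload as a flat dict of strings."""
    parsed = parse_qs(payload)
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parsed.items()}


hit = describe_hit(SAMPLE_HIT)
print("Property ID:", hit["tid"])  # the account the hit is credited to
print("Referrer:", hit["dr"])      # a free-text field chosen by the sender
print("Language:", hit["ul"])      # also free text, not verified
```

The point of the sketch is that the referrer and language are simply strings supplied by whoever sends the hit, which is why a spammer who knows your Property ID never needs to load your pages.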

“Okay, I get it so far, but why would anyone bother to create Google Analytics spam?” I hear you ask. Well, apart from the pure fun of it, there are a number of reasons people are doing this:

  1. Generate traffic of their own. Arguably the main reason for referral/ghost spam. Not only are spammers inflating and blemishing your traffic stats, they are also increasing their own through people’s natural curiosity to click their links, and it’s reported that they are in fact increasing sales with this tactic. Whether they redirect the link to a site selling physical products or services or simply take you to sites covered in ads, the additional traffic is profitable.
  2. Get commission. Affiliates often get commission through increasing traffic statistics and this is an easy way for corrupt sites to do so.
  3. Propaganda. Believe it or not, some users are even creating referral spam to spread their own personal political beliefs and propaganda. The most notorious case is the pro-Trump spam spread by Russian spammer Vitaly Popov.
  4. Black Hat SEOs. There have been rumors of cases where SEOs or marketing agencies have used referral spam to bloat their own clients’ traffic statistics and give a false impression of success.
  5. Harvesting emails. The spammers then sell the harvested email addresses to third parties, who can use them for bulk email campaigns.
  6. Spreading malware. There’s nothing spammers like more than causing havoc for the sake of it.

Now that we’ve established what exactly Analytics spam is, why anyone bothers to create it, and how harmful it can be for your Google Analytics account and business, let’s talk about how we can prevent it.

Know the Different Types of Google Analytics Spam Out There (& How to Prevent Them)

Google Analytics spam comes in different shapes and forms. Here are some common varieties to be aware of.

Referral Spam

Referral Spam is the main method here and one we’ve touched on already. It started a few years ago with the main culprits being sites such as “semalt” and “buttons-for-websites” and now it’s rare to see an analytics account that doesn’t have “abc.xyz”, “free-traffic.xyz” or “ilovevitaly.com” as a referrer.

How can you stop it? One of the first suggestions that surfaced on forums, social media and SEO news sites was blocking the related URLs through the .htaccess file in the root directory of your domain.

How to Block Spam Bots in .htaccess

This method involves copying and pasting a bunch of code into your site and can be dangerous when done incorrectly, as the .htaccess file is extremely important and defines how your server behaves. Entering the code incorrectly can take down the whole site, so exercise this technique with extreme caution.

If you do feel skilled enough to have a go at this method then check out this handy guide on blocking spam bots in .htaccess.
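For a sense of what the pasted code typically looks like, the Python sketch below prints a set of referrer-blocking rules for a list of domains. It assumes an Apache server with mod_rewrite enabled, and the helper name build_htaccess_rules is hypothetical. Treat the output as a starting point to review and test on a staging copy, not as a finished configuration.

```python
def build_htaccess_rules(domains):
    """Build mod_rewrite lines that answer 403 Forbidden to the given referrers."""
    if not domains:
        return ""
    lines = ["RewriteEngine On"]
    for index, domain in enumerate(domains):
        # Escape dots so they match a literal "." instead of any character.
        pattern = domain.replace(".", r"\.")
        # Every condition except the last is chained with OR.
        is_last = index == len(domains) - 1
        flags = "[NC]" if is_last else "[NC,OR]"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {pattern} {flags}")
    # "-" leaves the URL untouched; [F] sends the Forbidden response.
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines)


print(build_htaccess_rules(["ilovevitaly.com", "free-traffic.xyz"]))
```

Generating the rules from a list keeps the OR chaining consistent, which is the part that tends to go wrong when conditions are added by hand.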

Although in a lot of cases this works, as mentioned previously, most of the bots now do not even visit the website, which renders blocking their URLs in your .htaccess useless. It’s also extremely time-consuming to block all the URLs, especially as there are new ones seemingly popping up every day. There must be a better solution…

Filtering Spam Bots with Custom Filters in Google Analytics

Utilizing Custom Filters was the most effective workaround to the Google Analytics referral spam problem for a long while and it’s still a handy solution to know. It’s a much easier method to implement than blocking spam domains through .htaccess and you don’t need any coding skill. The only potential hazard here is filtering the wrong set of data and further polluting your data by entering an incorrect filter string.

Follow the guide below and you won’t have any problems:

Step 1: Head over to your Google Analytics profile and click Acquisition > All Traffic > Referrals:

Acquisition > All Traffic > Referrals

Step 2: Sort Referral Traffic by Bounce Rate:

Sort referral traffic by bounce rate

And make sure that you’ve set the timeframe to at least a couple of months:

Adjust the date range

A zero or 100% bounce rate with at least 20 sessions is a strong indicator that the referrer domain is spammy. If you’re unsure whether the domain you’re looking at is, in fact, spammy and you want to make sure you’re not excluding legitimate traffic sources, then you can check it against our trusted Ultimate Referral List (a 128-domain-strong list of referral spammers).
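If you export the referral report, the rule of thumb above is easy to apply in bulk. The following Python sketch is one possible way to do it; the row layout (source, sessions, bounce rate in percent), the sample figures and the function name flag_suspicious_referrers are all assumptions made for illustration.

```python
def flag_suspicious_referrers(rows, min_sessions=20):
    """Return sources with a bounce rate of exactly 0% or 100% and enough sessions."""
    flagged = []
    for source, sessions, bounce_rate in rows:
        if sessions >= min_sessions and bounce_rate in (0.0, 100.0):
            flagged.append(source)
    return flagged


# Hypothetical rows: (source, sessions, bounce rate in percent).
report = [
    ("ilovevitaly.com", 45, 100.0),
    ("free-traffic.xyz", 30, 0.0),
    ("partner.example", 120, 42.5),
    ("tiny.example", 3, 100.0),
]

print(flag_suspicious_referrers(report))
```

Anything the function flags is only a candidate. It still needs checking against a spam list before you exclude it, since a small legitimate referrer can also show an extreme bounce rate.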

Step 3: What to do if you can’t find the domain?

If you can’t find the domain in our list or other user-generated ‘Referral Spam Domain Lists’ that you can find online, such as …

… then consider searching for the site’s domain on Google or social media to see what others are saying about it. Remember to err on the side of caution and not visit the potentially spammy sites, as we already know they can be malicious and full of viruses.

Step 4: Now that you have collated your final list of spammy referring domains we can then block them using ‘Custom Filters’. Head over to your Google Analytics profile and hit the ‘Admin’ tab.

Click the Admin gear

Then under the View column, you will want to click the Filters button:

Under View column, click Filters

Once you’ve navigated to the ‘Filters’ page click the big red Add Filter button:

Adding a filter

Step 5: Once you’re on the Edit Filter page you can enter the name that you’d like to call the filter. In the example below I have simply labeled them ‘Spam Referrers’ but you may wish to make this more descriptive or specific, especially if you have a lot of spam sites that you are wishing to filter out.

Select the Custom Filter Type and check the Exclude option from the options below. Under Filter Field, you will want to select ‘Campaign Source’.

Under the Filter field, select Campaign Source

Next, you’ll want to add in the Filter Pattern as you see it in the example below.

‘example.com|example2.com|example3.com’ etc.

The pipe symbol ‘|’ is used as an “or” operator, meaning that traffic from example.com or example2.com or example3.com will be excluded. However, don’t end the filter pattern with a pipe, as Google themselves warn that this “will exclude ALL referral sources”.

Remember that you can only add in around ten or so spam referring domains as there is a 255 character limit for each filter. To get around this simply add in more filters until you’ve blocked all of the desired domains.
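With a long list of domains, splitting it across several filters by hand gets tedious. The Python sketch below groups domains into pipe-separated patterns that each stay within the 255 character limit and never end with a pipe. It escapes the dots on the assumption that the filter pattern is treated as a regular expression, and the function name build_filter_patterns is hypothetical.

```python
MAX_PATTERN_LENGTH = 255  # the per-filter character limit mentioned above


def build_filter_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Group domains into pipe-separated patterns no longer than max_length."""
    patterns = []
    current = ""
    for domain in domains:
        domain = domain.strip()
        if not domain:
            # Skip blanks: an empty alternative would match every source.
            continue
        escaped = domain.replace(".", r"\.")
        if len(escaped) > max_length:
            raise ValueError(f"domain too long for one filter: {domain}")
        candidate = escaped if not current else f"{current}|{escaped}"
        if len(candidate) <= max_length:
            current = candidate
        else:
            # Current pattern is full; start a new one with this domain.
            patterns.append(current)
            current = escaped
    if current:
        patterns.append(current)
    return patterns


for number, pattern in enumerate(
    build_filter_patterns(["example.com", "example2.com", "example3.com"]), start=1
):
    print(f"Filter {number}: {pattern}")
```

Each returned string would go into the Filter Pattern box of its own filter. Because the patterns are built by joining, a stray leading or trailing pipe cannot creep in.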

Referral Exclusion List

The final method of blocking spam referral domains in Google Analytics that we’re going to look at is called the ‘Referral Exclusion List’ method and is the first solution that Google themselves have created for excluding rogue domains.

The first thing that needs mentioning about this method is that it only works if you’re using the newer Universal Analytics rather than Classic Google Analytics.

When Google released this new feature within Universal Analytics they highlighted two common uses for it:

  • Third Party Referrals. Especially payment processors. After you have added a payment processor into your Referral Exclusion List, when a user leaves your site to the payment processor then returns to your site following a purchase, the processor won’t show as the referring domain.

A great example of this is if you use PayPal and you constantly see that as a referring source. Add PayPal as a referral exclusion and this will no longer skew the data.

  • Self-Referrals. If you have several subdomains that a user may jump back and forth from you can add your own site into the exclusion list to prevent artificially inflating traffic data.

Your site is automatically added to this when you set up your Analytics property so needs no implementation.

However, Google make no mention of using this technique to block spam and some will argue that there is a reason behind this!

Google states:

“When you exclude a referral source, traffic that arrives to your site from the excluded domain doesn’t trigger a new session.”

But, this doesn’t mean that the visit has been completely removed from your Analytics data.

In some cases, it has been reported that Google Analytics merely attributes the source of the visit differently, changing it from a referral to ‘direct’ traffic. The referral spam attribution is removed, but the data can still be ruined. Again, proceed with caution!

With all that in mind, here’s how to access the Referral Exclusion List and block single domains:

Step 1: Head over to the Admin tab in your Google Analytics account then click Tracking Info within the Property column:

Step 2: Under the ‘Tracking Info’ dropdown select ‘Referral Exclusion List’

…then click the ‘+ ADD REFERRAL EXCLUSION’ button.

Step 3: Simply add in the domains that you wish to exclude from your referral traffic. One downside to this technique is that you can’t add domains in bulk which is another reason that the filter method (mentioned previously) may be more efficient.

Language Spam & The Future of Spam

Just when we thought we had it all figured out, along came a brand new breed of spam ready to pollute our beautiful Analytics data once more!

Towards the end of 2016, a new wave of Analytics spam hit many accounts: Language Spam. Ever notice entries like these in your Language report in Analytics?

  • o-o-8-o-o.com search shell is much better than google!
  • Vitaly rules google ☆*:。゜゚・*ヽ(^ᴗ^)ノ*・゜゚。:*☆ ¯_(ツ)_/¯(ಠ益ಠ)(ಥ‿ಥ)(ʘ‿ʘ)ლ(ಠ_ಠლ)( ͡° ͜ʖ ͡°)ヽ(゚Д゚)ノʕ•̫͡•ʔᶘ ᵒᴥᵒᶅ(=^ ^=)oO

The Language Spam problem was combined with yet more referral spam and often saw sites such as brateg.xyz, begalka.xyz, bukleteg.xyz etc. in your data. Eventually, the referral spam was managing to use legitimate websites such as reddit.com, twitter.com & thenextweb.com to disguise itself.

According to reports, it works in much the same way as the standard referral spam we saw previously, which means that it’s easy enough to block.

Once the Language Spam has been recorded there is no way to permanently remove it from your historical data but you can filter it to report accurate historical statistics. You can also utilize the same methods to block Language Spam as you can with Referral Spam as we explain in detail further up the page.

Here’s how you can utilize Custom Segments to view accurate historical data even after you’ve been hit by Language Spam:

Step 1: On report view in your Google Analytics account click ‘+ Add Segment’ at the top of the page …

… then click the big, red ‘+New Segment’ button.

Step 2: Staying on the ‘Demographics’ tab that you default to, give your new Segment a simple, easy to identify name:

Step 3: Next up, you want to select ‘does not match regex’ from the drop down tab next to ‘Language’ and enter the following expression and then hit save:

.{15,}|\s[^\s]*\s|\.|,|\!|\/

The filter above excludes any traffic where the language value contains 15 or more characters: genuine language codes (en-us, en-gb or fr, for example) don’t go over that limit, yet the Language Spam entries often do. It also excludes values containing two or more spaces, which no real language code has.

Finally, it excludes symbols which are invalid for use in the language field, such as periods, commas, exclamation marks and slashes, that are used to construct these spammy messages.
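Before saving the segment, it can be reassuring to try the expression against a few language values. The Python sketch below does that, on two assumptions: that the expression is written with its backslash escapes (an unescaped dot would match every value), and that Analytics treats a match anywhere in the value as a match. The function name looks_like_language_spam is hypothetical.

```python
import re

# Segment expression with backslash escapes (assumed to be the intended form).
LANGUAGE_SPAM = re.compile(r".{15,}|\s[^\s]*\s|\.|,|\!|\/")


def looks_like_language_spam(value: str) -> bool:
    """True if any alternative of the expression matches somewhere in the value."""
    return LANGUAGE_SPAM.search(value) is not None


samples = [
    "en-us",
    "fr",
    "o-o-8-o-o.com search shell is much better than google!",
]
for value in samples:
    verdict = "spam" if looks_like_language_spam(value) else "keep"
    print(f"{verdict}: {value}")
```

Because the segment uses ‘does not match regex’, the values the sketch labels as keep are the ones that remain in your reports.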

Once saved, you can now look back at data in your Google Analytics account over any period of time and analyze non-infiltrated stats. Lovely!

Google Analytics spam is not a new thing and has been frustrating website owners and marketers alike for quite a while now. Whilst these techniques can block most of the spam coming through, there will no doubt be something new affecting your account somewhere along the line. Keep an eye out for any suspicious-looking websites, and the methods in this article may well still help you.