So you’re looking to do the unthinkable - block Google from indexing your WordPress site.
Most webmasters are pulling their hair out trying to get Google to rank their site higher, so kudos to you for breaking the trend and going the opposite direction.
But here’s the thing about Google and other search engines:
They want to index the entire web. That’s the entire point of their existence. So if you want to stop Google from indexing your WordPress site, you need to be proactive about it.
Unless you take steps to block it, Google will sooner or later find your site, crawl it, and add it to the index.
If you have a development site - that’s bad. If you have a private site - that’s bad.
I’m not going to try to understand why you want to block Google from indexing your WordPress site. I’ll assume you have your reasons. Here are a few ways in which you can do it…
Have WordPress Tell Google to Stop Crawling Your Site
To get started, the easiest way to keep Google away from your site is to use a core WordPress function.
WordPress has a built-in Search Engine Visibility setting under Settings → Reading, with a checkbox labeled Discourage search engines from indexing this site.
If you check that box, WordPress does two things:
First, WordPress adds two short lines to the robots.txt file it serves for your site (newer WordPress releases may skip this step and rely on the header tag alone):
User-agent: *
Disallow: /
Here’s what each part of the code means:
- User-agent: * - this line means the rule that follows applies to every crawler, rather than just a specific one.
- Disallow: / - this line tells all the robots matched by the first line not to visit any pages on your site.
While WordPress can’t “force” robots to follow these directions, the major search engines should all adhere to your request.
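If you want to see how a well-behaved crawler reads those two lines, here’s a rough sketch using Python’s standard urllib.robotparser module. The example.com URL and the bot names are placeholders I picked for illustration, and real crawlers may handle edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The two lines discussed above, as a crawler would download them.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder bot names and URL, used only for illustration.
    for bot in ("Googlebot", "Bingbot", "SomeOtherBot"):
        allowed = is_allowed(ROBOTS_TXT, bot, "https://example.com/any-page/")
        print(f"{bot}: allowed={allowed}")
```

Because the wildcard matches every user agent and the rule covers the whole site, each bot in the loop is reported as not allowed.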
The second thing WordPress does is add another short line of code to your site’s HTML head: a robots meta tag carrying the noindex directive.
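The robots.txt line told robots to stay away. But it didn’t specifically tell them not to index your site. That’s what this does.
The noindex tag, as applied by WordPress, tells every robot that reads it not to index your site. Keep in mind that a crawler has to be able to fetch a page to see the tag, so it only takes effect on pages the crawler can still request.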
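To confirm the tag is actually on a page, you can scan the HTML for it. Here’s a minimal sketch with Python’s built-in html.parser. The sample markup is my own assumption for illustration; the exact attributes WordPress emits can differ between versions.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # content looks like "noindex,nofollow"; split it into directives.
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    # Hypothetical page head, not copied from any real WordPress site.
    sample = '<head><meta name="robots" content="noindex,nofollow"></head>'
    print(has_noindex(sample))
```

You could feed this the HTML of your own homepage (saved from your browser’s view-source, for example) to double-check that the checkbox did its job.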
Put the two together and WordPress is essentially telling search engines:
- Stay away from my site
- Don’t index my site
You could always manually add these two code snippets to your robots.txt file and header.php file...but given that all you need to do is check a box in the WordPress settings, I don’t think there’s any benefit to doing it that way.
What About Removing Pages That Are Already Indexed?
The above method will stop Google and other search engines from indexing new pages of your site. But if Google already indexed your site, visitors will still be able to see your site in the search results for a period of time. And because your site is otherwise public, they’ll also be able to click through from the search results to your site.
Google will eventually get around to removing your site from the index - but there will be a gap period where your site is still indexed and available.
If you need your site out of Google quickly, you can speed up the process by submitting a request to temporarily remove URLs.
To temporarily remove URLs from Google’s index (while you wait for Google to drop them permanently on its own), you can use Google Search Console. Before you can access the tool, you’ll need to have added your website to Google Search Console.
Then, open the Search Console property for the site you want to remove and go to Google Index → Remove URLs (labeled Removals in newer versions of Search Console).
Unfortunately, you’ll need to process each URL individually. But you should be able to delist all your important pages using this tool.
Going Further - Password Protecting Your Site
If you absolutely want to ensure that Google is completely, 100% blocked from indexing your site, the most foolproof way is to set up some sort of password or IP restriction for your site.
Of course, this will also make it difficult for the general public to access your site. But I’m assuming that isn’t a major concern if you want your site removed from Google…
If I’m working on a development or staging site, I always add a password to prevent anyone random from gaining access.
To password protect your site, you have a few different options:
- Install a plugin that helps you password protect WordPress.
- Use .htaccess to restrict access to WordPress.
- Use cPanel’s password tool.
Because we’ve already covered the first two in separate guides, I’ll only show you how to handle cPanel’s tool in this section.
You can access the tool from your cPanel dashboard by looking for the Password Protect Directories icon (called Directory Privacy in newer cPanel versions).
Click on it. Then, select the site you want to password protect from the drop-down and hit Go.
On the next screen, click on the URL for your site.
And finally, you need to check the box to Password protect this directory, enter a name, and create a username and password.
Once you do that, cPanel will block anyone without the username and password.
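If you’d like to verify the lock from the outside, here’s a hedged sketch of what an anonymous visitor (or a crawler) should run into. It assumes the protection answers with a 401 status and a WWW-Authenticate header, which is how standard HTTP authentication behaves; the function names are my own, and the staging URL in the comment is a placeholder.

```python
import urllib.error
import urllib.request


def looks_password_protected(status: int, headers: dict) -> bool:
    """True when a response is a 401 that asks for HTTP authentication."""
    header_names = {name.lower() for name in headers}
    return status == 401 and "www-authenticate" in header_names


def probe(url: str, timeout: float = 10.0) -> bool:
    """Request url without credentials and report whether it is locked."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return looks_password_protected(
                response.status, dict(response.headers.items())
            )
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses, including 401.
        return looks_password_protected(error.code, dict(error.headers.items()))


if __name__ == "__main__":
    # Simulated responses, so the example runs without network access.
    print(looks_password_protected(401, {"WWW-Authenticate": 'Basic realm="Staging"'}))
    print(looks_password_protected(200, {"Content-Type": "text/html"}))
    # To test a real site, call probe("https://staging.example.com/") instead.
```

If probe comes back False for your own site, anonymous visitors can still get in, and so can Google.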
As omniscient as Google tries to be, I don’t think they’re able to figure out your password...yet! For that reason, you can now fully rest easy knowing that Google can’t get into your site and start indexing it.
If you already have pages indexed, they’ll slowly fall out of the index as Google recrawls them. Additionally, anyone who happens to click on your site in the search results won’t be able to access any content without the username and password.
Don’t Forget That You Blocked Google From Indexing Your Site
As I mentioned, blocking Google is a common feature of development sites. But if you ever push your development site to your live server, don’t forget to remove the search engine block and allow Google to index your site again.
It may seem like a stupidly simple SEO mistake, but I actually know of a hundred-million-dollar startup that pushed its development site live with the noindex tag still intact. Talk about a surprise when their homepage dropped out of Google’s search results!
That’s my final warning - always keep track of which sites you’ve blocked Google from so that you can undo the block when needed.
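One way to avoid that mistake is a quick pre-launch check. The sketch below takes the robots.txt text and homepage HTML of a site you’re about to push live and lists anything that would still keep Google out. The function name, the messages, and the example.com URL are all hypothetical, and you’d need to fetch the two inputs yourself.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class _MetaRobots(HTMLParser):
    """Collects the content attribute of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.contents.append((attr.get("content") or "").lower())


def launch_blockers(robots_txt: str, homepage_html: str, site_url: str) -> list:
    """Return a list of leftover search engine blocks (empty means clear)."""
    problems = []

    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if not robots.can_fetch("Googlebot", site_url):
        problems.append("robots.txt disallows Googlebot")

    meta = _MetaRobots()
    meta.feed(homepage_html)
    if any("noindex" in content for content in meta.contents):
        problems.append("homepage has a noindex meta tag")

    return problems


if __name__ == "__main__":
    # Hypothetical inputs representing a dev site that was never unblocked.
    issues = launch_blockers(
        "User-agent: *\nDisallow: /\n",
        '<head><meta name="robots" content="noindex,nofollow"></head>',
        "https://example.com/",
    )
    for issue in issues:
        print("WARNING:", issue)
```

Running something like this as part of your deployment routine won’t catch a password wall, but it should flag the two WordPress-level blocks described earlier.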