
How to Block Unwanted Links on Your Website by Blocking Google from Accessing Your Site's Search Results

There are a lot of websites out there that will do almost anything for exposure. Some stick to legitimate methods like guest posting on popular sites, while others employ dirty tactics, commonly known as black hat SEO, which are really annoying by the way.

For example, someone leaves a comment on one of your posts that is not relevant to the topic. You can stop that by deleting the comment. But what if links are added to your website that you don't know of? What if you receive this message:
Googlebot encountered extremely large numbers of links on your site. This may indicate a problem with your site’s URL structure… As a result Googlebot may consume much more bandwidth than necessary, or may be unable to completely index all of the content on your site.
This means that tons of new links have been added to your website without your permission.

How did this happen?

Some spam domains may have linked to the search page of your website using search queries in a certain language that obviously returned no results. Each search link is technically a separate web page, since each has a unique address, and hence Googlebot was trying to crawl them all, thinking they were different pages.
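
For example (the URLs below are made up for illustration), the spammy links might look like this, with Googlebot treating each one as a distinct page:

http://yoursite.com/?s=random+junk+query+1
http://yoursite.com/?s=random+junk+query+2
http://yoursite.com/?s=random+junk+query+3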

And when thousands of such fake links are generated, Googlebot assumes that thousands of new pages have suddenly been added to the site.

The Fix: Block Google from Accessing Your Site's Search Results

The fix is to prevent Googlebot from crawling these non-existent search pages. Open the robots.txt file, which you'll find in the root folder of your website, in any text editor (Vim, for example) and add the directive below at the top.

Block Search pages from Google with robots.txt
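
The exact line isn't reproduced here, but on a default WordPress install, where search URLs use the "s" query parameter, the rule would look something like this:

User-agent: *
Disallow: /?s=

Since robots.txt rules are prefix matches, Disallow: /?s= covers every URL that begins with /?s=, no matter what query follows. If your robots.txt already has a User-agent: * section, just add the Disallow line under it.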

The directive essentially prevents Googlebot, and any other search engine bot that honors robots.txt, from crawling links that have the "s" parameter in the URL query string. If your site uses "q" or "search" or something else for the search variable, you may have to replace "s" with that variable.
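
For instance, if your site's search URLs looked like /?q=term instead (a hypothetical setup; check your own search URLs), the rule would become:

Disallow: /?q=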

The other option is to add a NOINDEX meta tag to the search results template, but that wouldn't be as effective a solution, since Google would still have to crawl each page before deciding not to index it. Also, this is a WordPress-specific issue because the Blogger robots.txt already blocks search engines from crawling the results pages.
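For completeness, here is a minimal sketch of that meta-tag alternative for a WordPress theme. It assumes you can edit the theme's header.php, and uses is_search(), WordPress's built-in conditional for search result pages:

<?php // Output a noindex tag only on search result pages. ?>
<?php if ( is_search() ) : ?>
  <meta name="robots" content="noindex, follow">
<?php endif; ?>

This would go inside the <head> section of header.php. But as noted above, unlike the robots.txt rule, Googlebot still spends crawl bandwidth fetching each page before it ever sees the tag.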
