The Difference Between 404s And Soft 404s And Why It Is A Problem
Posted: 21st February 2013
A 404 status code indicates that the server was reached but could not find the page that was requested. This can happen for many reasons, such as the page having been moved or deleted, or the URL being mistyped. Of course, this is not to be confused with an error indicating that the server itself cannot be found.
Seeing a page with a 404 Not Found message on it doesn’t necessarily mean that the server actually returned a 404. Some website owners prefer to serve custom error pages, or to redirect requests for missing pages to their home page. When this is set up so that the response code is 200 OK rather than 404 Not Found, search engines like Google have a hard time understanding that the page doesn’t actually exist. This is what’s known as a soft 404, or “crypto” 404: Google thinks there is content on the page when there is actually nothing.
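To make the distinction concrete, here is a minimal Python sketch that labels a response using only its status code and body text. The function name `classify_response` and the phrase list are hypothetical and chosen for illustration; this is a rough heuristic, not a description of how Google actually detects soft 404s.

```python
# Phrases that suggest an error page. An illustrative, incomplete list (assumption).
ERROR_PHRASES = ("404", "not found", "does not exist")


def classify_response(status_code, body):
    """Label a response as 'hard 404', 'soft 404', or 'ok'.

    status_code: the HTTP status code the server returned.
    body: the HTML or text of the page, as a string.
    """
    if status_code == 404:
        # The server said the page is missing, which is the correct behaviour.
        return "hard 404"
    text = body.lower()
    if status_code == 200 and any(phrase in text for phrase in ERROR_PHRASES):
        # The page reads like an error page, but the server reported success.
        return "soft 404"
    return "ok"


if __name__ == "__main__":
    print(classify_response(404, "<h1>404 Not Found</h1>"))
    print(classify_response(200, "<h1>404 Not Found</h1>"))
    print(classify_response(200, "<h1>Welcome to our site</h1>"))
```

A check like this only catches error pages served with a 200 OK. A missing page that silently redirects to the home page looks like ordinary content, so it would need a different check, such as comparing the requested URL with the final URL.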
Why They Cause Problems
If Google treats this page as a real page with real content, it will index it, and the page may rank well enough to show up in search results. This is problematic because a page with no real content does nothing to bring useful traffic to your site.
Your crawl coverage may also be affected: the page that should really be returning a 404 gets crawled and indexed instead of your pages with unique content.
For users, this can be a confusing experience, and it can decrease the amount of traffic to your site. The status code itself is never displayed to visitors; a 200 OK simply tells browsers and search engines that the page loaded successfully. A visitor who follows a search result to a soft 404 therefore lands on an error message or an unrelated home page, with no clear indication of what went wrong.
Avoiding Soft 404s
The best way to avoid any of this hassle is to use 404 codes the way they were meant to be used. Refrain from redirecting requests for missing pages, and provide an explanation within the 404 Not Found page itself. When the appropriate code is returned, both search engines and users can understand exactly what is happening.
Provide links that users can follow to other parts of your site, rather than redirecting them automatically. This way, search engines understand that there is no content here and can leave the page out of the index. The rest of your site will get ranked the way it should be, and the content you want to be crawled by Google will be crawled.
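The advice in this section can be sketched with Python's built-in `http.server` module. The `PAGES` dictionary, the `resolve` helper, the `SiteHandler` class, and the port number are all hypothetical names and values chosen for illustration; a real site would apply the same idea in its own web server or framework configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: URL path -> HTML body (assumption for this sketch).
PAGES = {"/": "<h1>Home</h1>"}

# A custom error page with an explanation and a link, but no automatic redirect.
NOT_FOUND_HTML = (
    "<h1>404 Not Found</h1>"
    "<p>The page you requested does not exist.</p>"
    '<p><a href="/">Go to the home page</a></p>'
)


def resolve(path):
    """Return (status_code, html) for a requested path."""
    if path in PAGES:
        return 200, PAGES[path]
    # Missing page: keep the custom content, but report 404, not 200.
    return 404, NOT_FOUND_HTML


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, html = resolve(self.path)
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Serve on localhost:8000 until interrupted (port is an arbitrary choice).
    HTTPServer(("127.0.0.1", 8000), SiteHandler).serve_forever()
```

The point of the design is that the custom page and the status code are set independently: visitors still see a friendly explanation and a link, while search engines see a 404 and know there is nothing to index.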
Mike is the SEO Manager at TechWyse Internet Marketing. Staying on top of the latest SEO trends isn’t only part of his job description, but also a passion of his. To keep up with Mike, follow @Techwyse on Twitter.