Many sites carry a lot of duplicate content, and most of the time for entirely legitimate reasons: it is simply something that happens on bigger sites and ecommerce sites. For example, if you sell a can of cat food, visitors may be able to reach the page for that product through any number of different URLs, especially if your URLs contain user IDs or other parameters.
If you tell Google to ignore, for instance, the session ID, you are in effect asking it to index only one version of that page. This will help you avoid duplicate content problems and could improve your crawl efficiency, so that Googlebot can crawl your site more easily and, hopefully, more regularly.
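To make the idea concrete, here is a minimal Python sketch of what "ignoring a parameter" amounts to: several URLs that differ only in a session or user ID collapse to a single address. This is an illustration of the effect, not a description of how Google actually processes URLs, and the parameter names (`sessionid`, `userid`), the `canonical_url` helper and the example.com addresses are all made up for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not change what the page shows.
IGNORED_PARAMS = {"sessionid", "userid"}


def canonical_url(url, ignored=IGNORED_PARAMS):
    """Return the URL with the ignored query parameters removed."""
    parts = urlsplit(url)
    # Keep every query parameter except the ones we were told to ignore.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    # Rebuild the URL; the fragment is dropped because it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Three ways of reaching the same (hypothetical) cat food page.
urls = [
    "https://example.com/cat-food?sessionid=abc123",
    "https://example.com/cat-food?userid=42&sessionid=xyz",
    "https://example.com/cat-food",
]

# Once the IDs are ignored, only one distinct URL remains.
unique = {canonical_url(u) for u in urls}
print(unique)
```

Parameters that do change the page, such as a size or colour option, are kept, so those pages would still be treated as separate URLs.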
As a fringe benefit, it could also lead to cleaner URLs across your site, which will satisfy your inner stickler. In the long term, it should also mean that Google can crawl the web more efficiently and present cleaner, more accurate results, which is good for all of us.