{"id":16134,"date":"2009-11-06T11:21:13","date_gmt":"2009-11-06T16:21:13","guid":{"rendered":"52002 at http:\/\/www.webpronews.com"},"modified":"2009-11-06T11:21:13","modified_gmt":"2009-11-06T16:21:13","slug":"duplicate-content-on-google-bing-yahoo","status":"publish","type":"post","link":"https:\/\/mereja.media\/index\/16134","title":{"rendered":"Duplicate Content on Google, Bing &amp; Yahoo"},"content":{"rendered":"<p>Duplicate content is a common occurrence on the web and in many cases can hurt search engine rankings. While the search engines may not always technically penalize webmasters for duplicate content, there are still a lot of ways it can hurt. <\/p>\n<p>WebProNews is covering the <a href=\"http:\/\/searchmarketingexpo.com\/east\">Search Marketing Expo (SMX) East<\/a> in New York, where representatives from the three major search engines (Google, Yahoo, and Bing) discussed how their respective web properties handle duplicate content issues. Following are some takeaways from each.<\/p>\n<p><strong>Duplicate Content in Google<\/strong><\/p>\n<p><a href=\"http:\/\/research.google.com\/pubs\/author3778.html\"><img decoding=\"async\" align=\"right\" alt=\"Duplicate Content on Google - Joachim Kupke\" title=\"Duplicate Content on Google - Joachim Kupke\" style=\"margin: 10px;\" src=\"http:\/\/images.ientrymail.com\/webpronews\/article_pics\/google-duplicate-content.jpg\" \/><\/a>The way Google handles duplicate content has been discussed a lot recently. This is largely due to <a href=\"http:\/\/www.webpronews.com\/topnews\/2009\/09\/16\/google-busts-the-duplicate-content-myth\">a video<\/a> Google&#8217;s Greg Grothaus uploaded, in which he discusses at length how Google handles various aspects of duplicate content. <\/p>\n<p>Joachim Kupke, Sr. Software Engineer on Google&#8217;s Indexing Team, reiterated much of what Grothaus said. 
He also said that Google has a ton of infrastructure for eliminating duplicate content:<\/p>\n<blockquote><p>&#8211; redirects<br \/>\n&#8211; detection of recurrent URL patterns (the ability to &#8216;learn&#8217; recurrent URL patterns to find duplicated content)<br \/>\n&#8211; actual contents<br \/>\n&#8211; most recently crawled version <br \/>\n&#8211; earlier content<br \/>\n&#8211; contents minus things that don&rsquo;t change on a site<\/p><\/blockquote>\n<p>Kupke said to avoid dynamic URLs when possible (although Google is &quot;rather good&quot; at eliminating dupes). If all else fails, use the canonical link element. Kupke calls this a &quot;Swiss Army Knife&quot; for duplicate content issues. <\/p>\n<p>Google says the canonical link element has been tremendously successful. It didn&#8217;t even exist a year ago, and its use has grown exponentially. It has had a huge impact on Google&#8217;s canonicalization decisions, and <strong>2 out of 3 times, the canonical tag actually alters the organic decision in Google<\/strong>. <\/p>\n<p>Google says a common mistake is designating a 404 as canonical, and this is typically caused by unnecessary relative links. Also avoid changing rel=&quot;canonical&quot; designations, and avoid designating permanent redirects as canonical. <\/p>\n<p>Also, <strong>do not use disallow directives in robots.txt to annotate duplicate content.<\/strong> Doing so makes it harder to detect dupes, and disallowed 404s are a nuisance. There is one exception, however: interstitial login pages may be a good candidate to &quot;robot out,&quot; according to Kupke. <\/p>\n<p>Kupke says that canonical works, but indexing takes time. &quot;Be patient and we WILL use your designated canonicals.&quot; Cleaning up an existing part of the index takes even longer, and this may leave dupes serving for a while despite rel=canonical, Kupke adds. 
<\/p>\n<p>At SMX, Google announced that <strong>cross-domain rel=canonical is coming within the year.<\/strong> For example, if a Chicago Tribune article also appears on the New York Times site, and its rel=canonical points to the Chicago Tribune, then Google will credit only the Chicago Tribune with the content.<\/p>\n<p><strong>Duplicate Content in Bing<\/strong><\/p>\n<p><a href=\"http:\/\/www.searchenginestrategies.com\/sanjose\/sasi-parthasarathy.php\"><img decoding=\"async\" align=\"left\" alt=\"Sasi Parthasarathy\" title=\"Sasi Parthasarathy\" style=\"margin: 10px;\" src=\"http:\/\/images.ientrymail.com\/webpronews\/article_pics\/sasi.jpg\" \/><\/a><\/p>\n<p>As far as how Bing views duplicate content, intention is key. If your intent is to manipulate the search engine, you will be penalized. <\/p>\n<p>Sasi Parthasarathy, Program Manager at Bing, says to consolidate all versions of a page under one URL. &quot;Less is more, in terms of duplicate content.&quot; If possible, use only one URL per piece of content.<\/p>\n<p><strong>Bing isn&#8217;t supporting the canonical link element<\/strong> (as a ranking factor) yet, but support is coming. Bing still recommends using it, even though it is not yet a ranking factor there. Bing says that there has been an increase in the usage of canonical tags in the past 6 months, but adoption issues still exist. According to Parthasarathy, 30% of canonical tags point to the same domain (which is fine), and 9% point to other domains. Pointing to another domain could be a mistake or it could be manipulative. Bing says it will look at other factors to try to determine which it is. <\/p>\n<p>Bing says <strong>canonical tags are hints and not directives<\/strong>. &quot;Use it with caution,&quot; and not as an alternative to good web design. <\/p>\n<p>With regard to www vs non-www, just pick one and stick with it consistently. Remove default filenames at the end of your URLs. 
Bing also says that 301 redirects are your best friend for redirecting, and recommends using rel=&quot;nofollow&quot; on useless pages and robots.txt to keep out content you don&#8217;t want crawled. <\/p>\n<p><strong>Duplicate Content in Yahoo<\/strong><\/p>\n<p><a href=\"http:\/\/searchmarketingexpo.com\/bio.php?id=251\"><img decoding=\"async\" align=\"right\" alt=\"Cris Pierry\" title=\"Cris Pierry\" style=\"margin: 10px;\" src=\"http:\/\/images.ientrymail.com\/webpronews\/article_pics\/cris.jpg\" \/><\/a><\/p>\n<p>If everything goes according to plan, anyone worried about how Yahoo handles duplicate content will need to worry about how Bing handles it, but Yahoo&#8217;s Cris Pierry, Sr. Director of Search, offered a few additional tips. <\/p>\n<p>Pierry says descriptive URLs should be easily readable, and it&#8217;s not a good idea to change URLs every year. In addition, use canonical, <strong>avoid case sensitivity, and avoid session IDs and parameters. <\/strong><\/p>\n<p>Pierry also says to use sitemaps, and submit them to Yahoo Site Explorer. Improve indexing through proper robots.txt usage, and use Site Explorer to delete URLs that you don&#8217;t want Yahoo to index. 
Finally, provide feeds to Yahoo Site Explorer, and report spam sites linking to you in Site Explorer.<\/p>\n<p>Yahoo says metadata and SearchMonkey are enhancing presentation.<\/p>\n<p><em>WebProNews reporter Mike McDonald contributed to this article from SMX East.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Duplicate content is a common occurrence on the web and in many cases can hurt search engine rankings. While the search engines may not always technically penalize webmasters for duplicate content, there are still a lot of ways it can hurt. 
WebProNews is covering the Search Marketing Expo (SMX) East in New York, where representatives [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-16134","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/16134","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/comments?post=16134"}],"version-history":[{"count":0,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/16134\/revisions"}],"wp:attachment":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/media?parent=16134"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/categories?post=16134"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/tags?post=16134"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}