What is duplicate content?

Duplicate content generally refers to multiple versions of the same content that exist on different pages, either within one domain or across different domains. These blocks of similar information may match each other completely or merely appear similar.

Duplicate Content Issues

We have already stated that both Internet users and search engines want fresh, unique, quality content. Nevertheless, online businesses often have to contend with repetitive Web page content, created either accidentally or deliberately. That is why duplicate content has become a major topic of discussion recently. The harm caused by duplicated content has become increasingly apparent thanks to the new filters search engines have implemented.

Many webmasters and SEO/SEM experts speculate about how similar pages can be, and try to predict the percentages that may lead to penalties for duplicated content. However, it is very difficult to draw a line between pages that are duplicated enough to trigger a duplicate content filter and pages that are only slightly similar. In fact, the detection of duplicate content involves more than simple, direct comparison: when comparing two similar pages, search engines consider other factors such as site authority, link popularity, and domain age.

Types of duplicate content

In its Webmasters/Site Owners Help, Google identifies the following types of non-malicious duplicate content:

- Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices
- Store items shown or linked via multiple distinct URLs
- Printer-only versions of Web pages

In addition, Google is already able to evaluate navigation panels, common header text, ads, footer text, and repetitive page links. These instances of duplicate content are not penalized but are ignored.
Other types of content, deliberately duplicated across domains and created to manipulate search engine rankings, are considered malicious. These may include similar landing pages created to attract more visitors to your site, subdomains or domains with substantially duplicated content, and pages with stolen content. In most cases, you are very unlikely to run the risk of being penalized for duplicate content if you do not create it deliberately. However, you should be armed with enough knowledge to make sure you do not use malicious duplicate content or accidentally trigger a search engine's filter.

How search engines treat duplicate content

Most webmasters have already learned that search engines do not like duplicate content. The problem is that multiple pages with the same content confuse search engines, which aim to list the most relevant, unique, and original results, not clutter. Thus, in an effort to provide more varied results to their users, search engines filter out pages that appear too similar to each other: the most relevant result is kept, and similar results are excluded.

As stated in Google's Webmasters/Site Owners Help, "Google tries hard to index and show pages with distinct information. This filtering means, for instance, that if your site has a "regular" and "printer" version of each article, and neither of these is blocked in robots.txt or with a noindex meta tag, we'll choose one of them to list."

In other words, duplicate content filters are algorithms designed to compare one page against another. If the filter considers two or more pages to be substantially similar, it simply keeps the most trusted one in the primary index, while moving the others to the supplemental index.

Penalties may arise when you start copying hundreds or thousands of pages from other domains or create exact replicas of existing sites. Moreover, you run the risk of being penalized if the ratio of unique content to borrowed content on your site is too low.
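To make the idea of "comparing one page against another" more concrete, here is a minimal sketch in Python. It is only an illustration: the actual filters used by search engines are not public and, as noted earlier, weigh other factors such as site authority and link popularity. The technique shown (word shingles compared with Jaccard similarity) and the names shingles and jaccard_similarity are assumptions chosen for this example, not anything the search engines have documented.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical page texts, invented for this example.
article = "Search engines filter pages that appear too similar to each other."
printer_version = "Search engines filter pages that appear too similar to each other."
unrelated = "Our store sells handmade oak furniture and ships it worldwide."

print(jaccard_similarity(article, printer_version))  # identical text
print(jaccard_similarity(article, unrelated))        # no shared phrases
```

A score close to 1.0 means two pages share almost all of their three-word phrases, and a score close to 0.0 means they share almost none. No fixed threshold is implied here, since the search engines do not publish one.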
Search engines' initiative with the new "canonical tag"

As you can see, search engines face the hard task of excluding duplicate results from their indexes. Duplicate content arises in many situations, such as article publishing, blog posts, and different URLs of the same site leading to the same content. Site owners needed help, especially those running e-commerce sites with several pages listing the same set of products. That is why the top search engines agreed on a new standard. They proposed a new "canonical tag":

<link rel="canonical" href="http://www.example.com/product.php?item=product-name" />

This tag is placed in the head section of a page to give search engines a canonicalization suggestion. The specified "canonical" page, http://www.example.com/product.php?item=product-name, then becomes the preferred version of a set of pages with highly similar content. The canonical tag is useful when multiple URLs point at the same page, but it can also be used when multiple versions of a page exist. The tag operates in a similar way to a 301 redirect for all URLs that display the page with this tag. You can use relative or absolute links, but search engines recommend absolute links.

Here are some more recommendations from Google's Webmasters/Site Owners Help: "To migrate to a completely different domain, permanent (301) redirects are more appropriate. Google currently will take canonicalization suggestions into account across subdomains (or within a domain), but not across domains. So site owners can specify a canonical page on www.example.com from a set of pages on example.com or help.example.com, but not on example-widgets.com."

Search engines hope this tag will help to regulate and simplify the duplicate content problem. If the tag is not implemented, they will keep using algorithms that compare one page against another to determine the canonical version.
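For sites that generate their pages from templates, the canonical tag can be produced by a small helper. The sketch below is one possible approach in Python, not a prescribed implementation. The function name canonical_link_tag is hypothetical, and rejecting relative URLs is a choice made here to follow the recommendation to use absolute links. The markup itself, a link element with rel="canonical" placed in the page head, is the standard form of the tag.

```python
from html import escape
from urllib.parse import urlsplit


def canonical_link_tag(url: str) -> str:
    """Build the canonical <link> element for the head section of a page.

    Only absolute http(s) URLs are accepted, because absolute links are
    the form recommended for the canonical tag.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    # Escape the URL so characters such as & and " are safe in an attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


tag = canonical_link_tag("http://www.example.com/product.php?item=product-name")
print(tag)
```

Every URL variant that shows the same product (for example with extra sorting or tracking parameters) would output this same tag in its head section, all pointing at the one preferred URL.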