Someone recently asked me:
I read this overview of what you said at an SES conference:
Matt Cutts – Google Not prepared, but informal remarks. High order nits: what do people worry about? He often finds that honest webmasters worry about dupe content when they don’t need to. G tries to always return the “best” version of a page. Some people are less conscious. The person claimed he was having problems with dupe content and not appearing in both G and Y. Turns out he had 2500 domains. A lot of people ask about articles split into parts and then printable versions. Do not worry about G penalizing for this. Different top level domains: if you own a .com and a .fr, for example, don’t worry about dupe content in this case. General rule of thumb: think of SE’s as a sort of a hyperactive 4 year old kid that is smart in some ways and not so in others: use KISS rule and keep it simple. Pick a preferred host and stick with it…such as domain.com or www.domain.com.
If this is an accurate summary, and I’m reading what you’re saying, then there’s no need to worry about duplicate content issues when submitting articles. Is that correct?
What I was saying was: I often get questions from whitehat sites who are worried that they might receive duplicate content penalties because they have the same article in different formats (e.g. a paginated version and a printer-ready version). While it’s helpful to try to pick one of those articles and exclude the other version from indexing, typically a whitehat site doesn’t need to worry about 1-3 versions of an article on their own site. However, I would be mindful that taking all your articles and submitting them for syndication all over the place can make it more difficult to determine how much the site wrote its own content vs. just used syndicated content. My advice would be 1) to avoid over-syndicating the articles that you write, and 2) if you do syndicate content, make sure that you include a link to the original content. That will help ensure that the original content has more PageRank, which will aid in picking the best documents in our index.
We use additional heuristics of course, but I figured other people might want to hear that take.
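For anyone who wants to act on the "pick one version and exclude the other" suggestion, here is a minimal sketch of one way to check that it is working. It assumes the printer-ready copies live under a hypothetical /print/ path and that you exclude them with a robots.txt rule; the path, the example.com URLs, and the is_crawlable helper are all made up for illustration. Note that robots.txt controls crawling, so a noindex meta tag on the printer-ready page is another option you might prefer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: printer-ready copies are assumed to live under /print/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/articles/my-article",  # preferred version
        "http://example.com/print/my-article",     # printer-ready duplicate
    ):
        print(url, "crawlable:", is_crawlable(ROBOTS_TXT, url))
```

Running the script against your own rules and a few sample URLs is a quick sanity check that only the version you prefer stays open to crawlers.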