Counting the ways rev="canonical" helps the Web and a rel="short*" rebuttal

I am very excited to write this post, but I concede that if the title did not pique your interest, this may be as exciting as watching grass grow. If you are confused at this point, I should mention that a lot has led up to this post, so I'm going to list a bunch of links in chronological order below before getting into my thoughts, to ensure you are on my wavelength.

Timeline

  1. 2005-12-??: Link Relationships - Web Authoring Statistics by Google.
  2. 2006-01-04: SEO advice: url canonicalization by Matt Cutts.
  3. 2006-06-05: [whatwg] Where did the "rev" attribute go? by Ian Hickson.
  4. 2008-05-02: Domain Canonicalization by Nathan Buggia.
  5. 2008-08-20: The Difference Between REL and REV attributes of the A Tag (or REL vs. REV) by Erik Vold.
  6. 2008-11-18: [whatwg] Absent rev? by Ian Hickson.
  7. 2008-11-30: URL Referrer Tracking by Nathan Buggia.
  8. 2009-02-12: Partnering to help solve duplicate content issues by Nathan Buggia.
  9. 2009-02-12: Specify your canonical by Joachim Kupke and Maile Ohye.
  10. 2009-02-12: Fighting Duplication: Adding more arrows to your quiver by Priyank Garg.
  11. 2009-03-11: The Rev Attribute, Link Types, and Vote Links Explained by Erik Vold.
  12. 2009-03-17: How To Use The Rev Attribute by Erik Vold.
  13. 2009-04-01: Short URL Auto-Discovery by Robert Spychala.
  14. 2009-04-02: DiggBar Launches Today! by Kevin Rose.
  15. 2009-04-03: on url shorteners by Joshua Schachter.
  16. 2009-04-03: URL Shortening Hinting by Kellan Elliott-McCrea.
  17. 2009-04-09: Google Juice & Page Views: Or How I Learned to Stop Worrying and Love the DiggBar by John Quinn.
  18. 2009-04-10: Save the Internet with rev="canonical" by Chris Shiflett.
  19. 2009-04-11: A rev="canonical" Rebuttal by Ben Ramsey.
  20. 2009-04-11: rev="canonical": DiggBar outrage causes bad ideas to come out of the wood work by Dare Obasanjo.
  21. 2009-04-11: Summarizing My rev="canonical" Argument by Ben Ramsey.
  22. 2009-04-11: A rev="canonical" HTTP Header by Chris Shiflett.
  23. 2009-04-11: rev=canonical bookmarklet and designing shorter URLs by Simon Willison.
  24. 2009-04-11: Revving up by Jeremy Keith.
  25. 2009-04-12: RevCanonial blog turns 10! by Kellan Elliott-McCrea.
  26. 2009-04-12: rev=canonical considered harmful (complete with sensible solution) by Sam Johnston.
  27. 2009-04-12: Specifying rev="canonical" With HTTP by Ben Ramsey.
  28. 2009-04-13: Introducing rel="shortlink" - a better alternative to URL shorteners by Sam Johnston.
  29. 2009-04-13/14: I (used to) like rev="canonical" by Leslie Michael.
  30. 2009-04-14: rev=canonical by Anne van Kesteren.
  31. 2009-04-14: Rev-canonical should be handled with care by Ciaran McNulty.
  32. 2009-04-14: Counting the ways that rev="canonical" hurts the Web by Mark Nottingham.
  33. 2009-04-15: (Yet) Another DiggBar Update by John Quinn.

My Thoughts

As you can see, a lot of confusion over a few seemingly separate things has come to a head this month. I think the confusion stems from:

The definition of "canonical", which a lot of people seem to misuse.

The term "canonical", as it is used in computer science, comes from set theory, where it identifies an element as the representative of a set. Back in 2006, Matt Cutts described "canonicalization" as the process of picking the best url when there are several choices. Nathan Buggia said "The practice of consolidating all versions of a page under one URL is referred to as 'canonicalization' (because you collapse all versions under the 'canonical' or true version)." and then went on to describe how to use 301s to achieve canonicalization (along with other techniques). So you can see that "canonical" is an adjective, and that in the SEO world it initially referred to URLs, 301 redirects included.

However, since Google, Yahoo, and Microsoft all announced support for rel="canonical", it seems that some people have gotten confused and think that the canonicalization process starts after redirects occur and only applies to documents (as opposed to urls). They thus conclude that rel="canonical" cannot be used on a 301 redirect, which is incorrect.

I also want to mention a couple more points here. Firstly, the search engines at this point do not support cross domain links that use rel=canonical, nor any rev=canonical links. Matt Cutts described the latter here and Andy Mabbett described the solution to both problems here. What Andy was saying is this: if page A on domain A has a rev=canonical link to page B on domain B, and page B has a rel=canonical link to page A, then both Matt Cutts's rev=canonical concern and the cross domain rel=canonical concern are taken care of (2 ways).
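To make the reciprocal check concrete, here is a minimal sketch of how a consumer might verify it, assuming it already has the HTML of both pages in hand. The names LinkCollector, links_of and is_reciprocal are made up for this illustration, and the href comparison is a plain string match with no URL normalization, which a real crawler would almost certainly need.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (attribute, link type, href) triples from <link> and <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        for direction in ("rel", "rev"):
            # rel and rev each hold a space-separated list of link types.
            for link_type in (attrs.get(direction) or "").lower().split():
                self.links.append((direction, link_type, href))


def links_of(html, direction, link_type):
    """Return the set of hrefs carrying the given rel/rev link type."""
    parser = LinkCollector()
    parser.feed(html)
    return {h for d, t, h in parser.links if d == direction and t == link_type}


def is_reciprocal(url_a, html_a, url_b, html_b):
    """True if page A has rev=canonical to B and page B has rel=canonical to A."""
    return (url_b in links_of(html_a, "rev", "canonical")
            and url_a in links_of(html_b, "rel", "canonical"))
```

A consumer that only trusts the pair when is_reciprocal returns True would never have to take a single page's word for it, which is the point of Andy's suggestion as I read it.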

The purpose of the rev attribute.

The rev attribute was originally meant to represent the reverse of the rel attribute. However, with time multiple arguments popped up against its use (which has resulted in rev being removed from the html 5 draft), such as:

  1. It is barely used at all.
  2. It is misunderstood and misused as a result.
  3. It is misspelled a lot (one letter of difference between rel and rev).
  4. Every possible rev attribute value can be expressed as a rel attribute instead.
  5. It would be a benefit to authors for HTML validators to complain about the attribute rather than support it.

Now allow me to tear these arguments apart once and for all:

  1. First of all, that study is from 2005, and I'm sure the results would be different today. With the rise of rev=canonical and votelinks, we now have two uses that I would bet show up in results today. Last point here: with the rev attribute, existing link types such as author and help could easily be reused in the reverse direction, whereas without rev each would require a separate rel value.
  2. rev is the reverse of rel. This is really easy, basic stuff; people may not have gotten it before, but they would with time, no doubt. Humans are capable of overcoming the great complexity of rel and rev, trust me.
  3. We are talking about code, so I say so what.
  4. Not without adding a lot more values, so sure, I agree. But since rev is already in the html4 spec and is obviously useful, why deprecate it just when people were starting to get it, and cause a bunch of useless confusion and work, both of which waste time, which is money (3rd way)?
  5. omg..

The use cases described for rev=canonical.

Since the rev=canonical idea came to be, there has been opposition to it. The opposition first claims that rev should not be used, which I have hopefully shut down already; if not, then suppose I am right for now. Next, if the canonical url has a rev=canonical link, should it not list all of the pages for which it is the canonical? To this I say you can if you want to, but unless there is a use case that justifies it I would not, because it would just be extra useless markup. You are not required to list all of the urls. The last argument against using the rev attribute that I have found is that rev=canonical is not descriptive enough for a short url. To this I say that the point was never to point to a short url explicitly; it was to point to one implicitly. If a user is looking for a shorter url, they can check the list of urls which the canonical url marked as rev=canonical to find a shorter version. And just so that everyone is clear, one can determine that a rev=canonical url is a short version of the canonical url by simply comparing the lengths of the two urls. I think a lot of people forget that last point.
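The length comparison really is that simple. Here is a minimal sketch, assuming the rev=canonical hrefs have already been pulled out of the page; the function name shorter_alternatives and the example urls are made up for illustration.

```python
def shorter_alternatives(canonical_url, rev_canonical_urls):
    """Return the rev=canonical urls shorter than the canonical url, shortest first."""
    shorter = [u for u in rev_canonical_urls if len(u) < len(canonical_url)]
    return sorted(shorter, key=len)


if __name__ == "__main__":
    canonical = "http://example.com/2009/04/counting-the-ways"
    candidates = [
        "http://example.com/?p=42",
        "http://example.com/2009/04/counting-the-ways?utm_source=feed",
        "http://ex.am/c",
    ]
    # The tracking-parameter url is longer than the canonical, so it is dropped.
    print(shorter_alternatives(canonical, candidates))
```

Anything longer than the canonical url is simply ignored, so a publisher can list non-short duplicates under rev=canonical without confusing a tool that only wants a short url.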

rev=canonical is the result of people taking parts that already existed out in the world to solve a particular problem: linkrot and the proliferation of third party tinyurl services. Remember that nothing new was required; rev is a link attribute for both html 4 and http headers already, and canonical is a new link type which will certainly be in the html spec some day. In the end the rev=canonical usage conforms with the rel=canonical usage, and in my opinion it finishes the thought.

Furthermore, I think that rev=canonical is a use case to top them all for why rev should not be deprecated in html 5 (4th way).

What I think about rel=short*

The rel=short* debate was created by the opposition to rev=canonical, because they thought it was not descriptive enough. To be fair though, rel=shorturl came before rev=canonical I think, but practically speaking it was about the same time, and now rel=shorturl is included in the rel=short* debate. For argument's sake pretend that rel=shorturl is the outcome of the rel=short* debate.

By the definition of "canonical" (see above), a rel=shorturl link is also a rev=canonical link. Therefore, for the use case of discovering a short url for the canonical url, rel=shorturl would be redundant if @rev were alive and well.

I do think that rel=shorturl provides a marginal benefit however, which is that it provides the publisher with the ability to select a preferred subset of the set of urls which are shorter in length than the canonical url.
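One way a consumer could honor that preference is sketched below. This is only my assumption about a reasonable policy, not anything the search engines or any spec define: prefer the first publisher-chosen rel=shorturl, fall back to the shortest rev=canonical url that beats the canonical on length, and otherwise keep the canonical. The function name pick_short_url is hypothetical.

```python
def pick_short_url(canonical_url, rel_shorturl_urls, rev_canonical_urls):
    """Pick a short url, preferring the publisher's explicit rel=shorturl choice."""
    if rel_shorturl_urls:
        # The publisher named a preferred short url, so take the first one listed.
        return rel_shorturl_urls[0]
    # Otherwise fall back to the implicit approach: compare lengths.
    shorter = [u for u in rev_canonical_urls if len(u) < len(canonical_url)]
    return min(shorter, key=len) if shorter else canonical_url
```

Note that with this policy the rel=shorturl pick wins even when a rev=canonical url is shorter, which is exactly the "preferred subset" benefit described above.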

So in the end I think rev=canonical and rel=shorturl are two handy new tools in the web publisher's toolbelt that can now be used.

Someone is WRONG on the internet

© Erik Vold 2007-2010.