Track Your rel=canonical URL With Google Analytics
I have been putting some (but very little) effort into getting my pages tagged with rel=canonical, and I noticed that Google Analytics was sometimes recording pageviews for the same page under more than one page name. So I decided that a little JavaScript to grab my rel="canonical" URL and pass it to pageTracker._trackPageview() would be nice.
var gaRelCanonical = function(){
    // Returns the href of the page's rel="canonical" link, or "" if there is none.
    this.getCanonicalURL = function(doNotUseDomain){
        // default to stripping the protocol and domain name
        if (typeof doNotUseDomain == "undefined"){
            doNotUseDomain = true;
        }
        var canonicalURL = "";
        try {
            var links = document.getElementsByTagName('link');
            var link = "";
            for (var i = 0; i < links.length; i++) {
                link = links[i];
                if ((/canonical/).exec(link.getAttribute('rel'))) {
                    // fall back to "" when the link has no href attribute
                    canonicalURL = link.getAttribute('href') || "";
                    break;
                }
            }
            if (canonicalURL != "" && doNotUseDomain) {
                return stripDomainName(canonicalURL);
            }
        }
        catch(e){
            canonicalURL = "";
        }
        return canonicalURL;
    };
    // Removes "http(s)://domain/" from the front of a URL, keeping the leading "/".
    var stripDomainName = function( urlString ){
        // check if urlString is defined
        if (typeof urlString == "undefined" || !urlString){
            return "";
        }
        urlString = urlString.replace(/^https?:\/\/[^\/]*\//i,"/");
        return urlString;
    };
};
Directions
First, download the gaRelCanonical.js JavaScript file from here and put it on your own server; do not rely on my site to serve the JavaScript for your site. Then change the old GA tracking code from:
<script type="text/javascript">
try{
    var pageTracker = _gat._getTracker("UA-XXXXXXX-YY");
    pageTracker._trackPageview();
}
catch(err){}
</script>
To the following (remember to load gaRelCanonical.js first, with a script tag that points at the location of the file on your server):
<script type="text/javascript">
try{
    var pageCanonical = new gaRelCanonical();
    var pageTracker = _gat._getTracker("UA-XXXXXXX-YY");
    pageTracker._trackPageview(pageCanonical.getCanonicalURL());
}
catch(e){
    try{
        var pageTracker = _gat._getTracker("UA-XXXXXXX-YY");
        pageTracker._trackPageview();
    }
    catch(err){}
}
</script>
The getCanonicalURL(doNotUseDomain) method accepts one parameter, doNotUseDomain, which is true by default. When doNotUseDomain is true, the method tries to remove the "http://" or "https://" part of the rel="canonical" href along with the whole domain name, leaving the leading "/" that is required. If you do not have a rel=canonical link element in your HTML, getCanonicalURL() returns an empty string, which means your ga.js tracking code will act as it did before the change.
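To make that behaviour easier to see without a browser, here is a minimal sketch of the same idea that works on plain objects instead of the DOM. The function names stripDomain and canonicalFromLinks, the {rel, href} objects, and the example.com URLs are all assumptions made up for illustration; none of them are part of gaRelCanonical.js. Unlike the script's own regular expression, this sketch also turns a bare domain with no trailing slash into "/".

```javascript
// Hypothetical helper (not in gaRelCanonical.js): drops "http(s)://host"
// from an absolute URL and returns a path that starts with "/".
function stripDomain(urlString) {
    if (typeof urlString !== "string" || urlString === "") {
        return "";
    }
    var match = /^https?:\/\/[^\/?#]*(.*)$/i.exec(urlString);
    if (!match) {
        return urlString; // not an absolute http(s) URL: leave it untouched
    }
    var rest = match[1];
    return rest.charAt(0) === "/" ? rest : "/" + rest;
}

// Hypothetical stand-in for getCanonicalURL(): "links" is an array of
// plain {rel, href} objects rather than real <link> elements.
function canonicalFromLinks(links, doNotUseDomain) {
    if (typeof doNotUseDomain === "undefined") {
        doNotUseDomain = true; // same default as described above
    }
    for (var i = 0; i < links.length; i++) {
        if (/canonical/.test(links[i].rel || "")) {
            var href = links[i].href || "";
            return doNotUseDomain ? stripDomain(href) : href;
        }
    }
    return ""; // no rel=canonical link found
}

var sampleLinks = [
    { rel: "stylesheet", href: "/style.css" },
    { rel: "canonical", href: "http://example.com/blog/post" }
];

console.log(canonicalFromLinks(sampleLinks));        // "/blog/post"
console.log(canonicalFromLinks(sampleLinks, false)); // "http://example.com/blog/post"
console.log(canonicalFromLinks([]));                 // ""
```

The empty string in the last case is what lets _trackPageview() fall back to its usual behaviour on pages without a canonical link.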
About rel=canonical
Please refer to the following articles:

@erikvold