The insanity of URL shorteners and tracking

Mattias Geniar, June 18, 2012

Usage tracking on the internets is nothing new. We’ve had tracking cookies for decades, so I suppose it’s only logical that every URL we click on gets registered by a few parties as well, so they all know where we’re going and where we came from.

For instance, here’s an example from Facebook. If you click on any URL in Facebook, say someone linking to their Flickr page, you are first sent to the following URL.

http://www.facebook.com/l.php?u=http://flic.kr/p/abc123

You hardly ever see this, though. If you hover over a link someone posted on Facebook, the status bar shows the location you’ll be heading to, not the Facebook tracking URL: they use clever JavaScript events to swap in the tracking URL as you click. So you only see the real Flickr URL, perhaps unaware that every click you make on Facebook is tracked.
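In the example URL, the real destination is just the `u` query parameter. Here is a minimal Python sketch for pulling it back out, assuming the parameter is always called `u` as it is above. The function name `unwrap_tracking_url` is my own and not anything Facebook provides.

```python
from urllib.parse import urlsplit, parse_qs


def unwrap_tracking_url(tracking_url, param="u"):
    """Return the destination embedded in a tracking URL, or None if absent.

    Assumption: the destination lives in a single query parameter
    (named "u" in the Facebook example).
    """
    query = urlsplit(tracking_url).query
    values = parse_qs(query).get(param)  # parse_qs also percent-decodes the value
    return values[0] if values else None


print(unwrap_tracking_url("http://www.facebook.com/l.php?u=http://flic.kr/p/abc123"))
# prints: http://flic.kr/p/abc123
```

Something along these lines is about all a link revealer needs to do for this particular kind of tracking URL.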

Once you do click on a link, and thus follow the Tracking URL, you get transferred to your next location.

$ curl -A "Firefox/13" -i "http://www.facebook.com/l.php?u=http://flic.kr/p/abc123"
HTTP/1.1 200 OK
...
Content-Length: 14781

Anyone who knows HTTP headers can tell you that this is no ordinary redirect: you receive a full page (HTTP status 200), 14KB in size, just to send you on to the next page. There’s no 301 or 302 HTTP redirect at all, which is what a proper redirect would use.

The page you receive contains a meta refresh tag, wrapped in noscript tags, to send you on to the next page if your browser doesn’t support JavaScript.

[snip]
<noscript><meta http-equiv="refresh" content="0; URL=/l.php?u=http%3A%2F%2Fflic.kr%2Fp%2Fabc123&_fb_noscript=1" /></noscript>
[snip]
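Since curl won’t follow a meta refresh for you, you have to dig the target out of the HTML yourself. Below is a rough Python sketch using the standard library’s `html.parser`. The class and function names are my own, and it assumes the `content` attribute has the usual `delay; URL=target` shape shown in the snippet.

```python
from html.parser import HTMLParser


class MetaRefreshParser(HTMLParser):
    """Remember the target of the first <meta http-equiv="refresh"> tag seen."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag != "meta" or (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # content looks like "0; URL=/some/path": split off the delay first.
        _delay, _sep, rest = (attrs.get("content") or "").partition(";")
        key, _eq, url = rest.strip().partition("=")
        if key.strip().lower() == "url" and self.target is None:
            self.target = url.strip()


def find_meta_refresh(html):
    """Return the meta refresh target found in `html`, or None."""
    parser = MetaRefreshParser()
    parser.feed(html)
    return parser.target


snippet = (
    '<noscript><meta http-equiv="refresh" '
    'content="0; URL=/l.php?u=http%3A%2F%2Fflic.kr%2Fp%2Fabc123&_fb_noscript=1" />'
    "</noscript>"
)
print(find_meta_refresh(snippet))
```

Note that the target in this snippet points back at `l.php` itself with `_fb_noscript=1` added, so even without JavaScript you take one more trip through Facebook before moving on.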

But if you do have JavaScript, they’ll track a whole lot more. After all, the 14KB page has to account for something, right?

Then when you leave the Facebook Tracking URL, up comes the Flickr Tracker.

$ curl -A "Firefox/13" -i "http://flic.kr/p/abc123"
HTTP/1.1 302 Found
...
Location: http://www.flickr.com/photo.gne?short=abc123
Content-Length: 3266

At least they use a proper HTTP redirect. The response still has a body, but it’s only 3KB in size, there for browsers that don’t handle the 302 redirect.
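The difference between the two responses is easy to check mechanically. This is a small Python sketch, assuming you already have the status code and headers in hand (for example, parsed from the curl output). The helper name `is_http_redirect` is made up for this post.

```python
# Status codes that tell the client to go and fetch another URL.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def is_http_redirect(status, headers):
    """True when the response is an HTTP redirect that names its target."""
    names = {name.lower() for name in headers}  # header names are case-insensitive
    return status in REDIRECT_STATUSES and "location" in names


# Values copied from the two curl responses above.
facebook = (200, {"Content-Length": "14781"})
flickr = (302, {
    "Location": "http://www.flickr.com/photo.gne?short=abc123",
    "Content-Length": "3266",
})

print(is_http_redirect(*facebook))  # False
print(is_http_redirect(*flickr))    # True
```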

Now, this example involves only 2 redirects: from Facebook to the Flickr tracker, and from there on to your actual URL. With URL shorteners like Bitly that let anyone customize their ‘vanity URLs’, we’re seeing more and more URL shorteners within URL shorteners within URL shorteners within … Even Xzibit would find that overkill.

All tracking worries aside, what frustrates me most is the number of hops a browser must take to actually load the content you want. Each URL shortener has to be resolved in turn. We’ve come to a point where, if bit.ly is unavailable due to server issues, a whole lot of URLs will simply stop working. That’s troublesome: it creates an enormous Single Point of Failure (SPOF) that many people don’t think about.
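To make the hop counting and the SPOF concrete, here is a toy Python sketch. It doesn’t touch the network: the `HOPS` table, the `*.example` hostnames and the `make_fetch` helper are all invented for illustration, standing in for whatever real HTTP client you would use.

```python
from urllib.parse import urlsplit

# Hypothetical chain: two shorteners and a tracker in front of the content.
HOPS = {
    "http://short.example/a": (301, "http://shorter.example/b"),
    "http://shorter.example/b": (302, "http://tracker.example/c"),
    "http://tracker.example/c": (302, "http://content.example/article"),
    "http://content.example/article": (200, None),
}


def make_fetch(table, down=()):
    """Build a fake fetch(url) -> (status, location); hosts in `down` are unreachable."""
    def fetch(url):
        host = urlsplit(url).netloc
        if host in down:
            raise ConnectionError(f"{host} is unreachable")
        return table[url]
    return fetch


def resolve_chain(url, fetch, max_hops=10):
    """Follow redirects one hop at a time and return every URL visited."""
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status not in (301, 302, 303, 307, 308) or not location:
            return chain  # reached a non-redirect response: the real content
        chain.append(location)
    raise RuntimeError("too many redirects")


chain = resolve_chain("http://short.example/a", make_fetch(HOPS))
print(len(chain) - 1, "redirects before reaching", chain[-1])
```

Pass `down={"shorter.example"}` to `make_fetch` and the same call raises `ConnectionError`, even though the content server itself is perfectly healthy. Every extra hop is one more host that has to be up.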

I understand the need for URL shortening on services like Twitter, where input fields are limited in size. I do not understand the need for URL shortening on any other site that allows longer URLs. I’ll bet most people use URL shorteners just for the sake of vanity and seeing their name appear in the URLs. They hardly use the tracking features (how many clicks, when, from where, …) that those services provide. Now doesn’t that just seem like overkill? And doesn’t Bitly have way too much information on our browsing habits?

Perhaps it’s time to get my Shortlink Revealer Firefox plugin working again …


