Standardising the “URL”

Mattias Geniar, February 01, 2017

You’d think that the concept of “a URL” would be pretty clearly defined by now, with the internet being what it is today. Well, turns out – it isn’t.

But Daniel Stenberg, of curl fame, is trying to fix that.

This document is an attempt to describe where and how RFC 3986 (86), RFC 3987 (87) and the WHATWG URL Specification (TWUS) differ. This might be useful input when trying to interop with URLs on the modern Internet.

This document focuses on network-using URL schemes (http, https, ftp, etc) as well as ‘file’.
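To make "differ" a little more concrete, here is a minimal sketch using Python's standard `urllib.parse`, which broadly follows the RFC 3986 reading. The input URLs and the helper name `host_and_path` are made up for illustration; they are not taken from Daniel's document. The sketch feeds the same address through the parser with one, two and three slashes after the scheme.

```python
from urllib.parse import urlsplit


def host_and_path(raw):
    """Return (hostname, path) as Python's RFC-style parser sees them."""
    parts = urlsplit(raw)
    return parts.hostname, parts.path


# Hypothetical inputs: the same address with one, two and three slashes.
for raw in (
    "http:/example.com/a",
    "http://example.com/a",
    "http:///example.com/a",
):
    host, path = host_and_path(raw)
    print(f"{raw!r:26} host={host!r} path={path!r}")
```

Python only finds a hostname in the two-slash form; in the other two cases the whole thing ends up in the path. As far as I understand the WHATWG URL Specification, a browser would treat all three as the same URL, which is exactly the kind of mismatch that makes interop hard.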

URL Interop

What really strikes me as odd is the interoperability comparison for each component of the URL:

<table>
  <tr>
    <th>Component</th>
    <th>Value</th>
    <th>Known interop issues exist</th>
  </tr>
  <tr>
    <td>scheme</td>
    <td>http</td>
    <td>no</td>
  </tr>
  <tr>
    <td>divider</td>
    <td>://</td>
    <td>YES</td>
  </tr>
  <tr>
    <td>userinfo</td>
    <td>user:password</td>
    <td>YES</td>
  </tr>
  <tr>
    <td>hostname</td>
    <td>www.example.com</td>
    <td>YES</td>
  </tr>
  <tr>
    <td>port number</td>
    <td>80</td>
    <td>YES</td>
  </tr>
  <tr>
    <td>path</td>
    <td>index.html</td>
    <td>YES</td>
  </tr>
  <tr>
    <td>fragment</td>
    <td>top</td>
    <td>no</td>
  </tr>
</table>
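
For reference, the components listed above can be pulled out of a full URL with Python's standard `urllib.parse`. This is only a sketch: the assembled URL below is my own assumption, glued together from the example values, and Python is just one parser among many, so other tools may split the same string differently.

```python
from urllib.parse import urlsplit

# Assumed example URL, assembled from the component values listed above.
url = "http://user:password@www.example.com:80/index.html#top"
parts = urlsplit(url)

# Map the component names used above to what Python's parser reports.
components = {
    "scheme": parts.scheme,
    "userinfo": f"{parts.username}:{parts.password}",
    "hostname": parts.hostname,
    "port number": parts.port,
    "path": parts.path,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:12} {value!r}")
```

Note that Python counts the leading slash as part of the path and returns the port as an integer, and that the "://" divider never shows up as a value of its own. Even a simple case leaves room for parsers to disagree on the details.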

It’s amazing a “URL” even works.

I’ve said it before and I’ll say it again: the internet is held together with duct tape. I hope this proposal gets somewhere; it would make parsing URLs a whole lot easier and more reliable.

Source: URL Interop
