Good URLs

Good URLs are a great thing, and I am trying to make them come out that way here. While hunting for the right mix of code to make them nice, I ran across this excellent article on "Best URLs". Since I agree with it entirely and cannot think of anything to add, I am just going to reference it and say "ditto". Because Internet resources and articles have a way of vanishing, I include the complete article here (with permission from Gary Love, granted on 9/24/2010):

Written by Gary Love on May 9, 2004 (original article at http://www.newmediajournalism.com/bestpractices/besturls). Reposted with permission.

URLs

Purpose

URLs were developed and agreed upon in order to allow anyone, at any time, to form a link to any resource on the internet.

Scope

This particular article applies to content that is produced on a periodic basis with a defined structure.

What a URL should be

Nathan Ashby-Kuhlman, in his week-long series on article URLs, named four clear principles for what a good URL should be:

  • Permanent
  • Readable
  • Hierarchical
  • Brief and Clean

Jakob Nielsen came to a similar conclusion in his Alertbox column, adding "a short and easy to remember domain name" to the list.

Permanent

The internet is made to be a fluid atmosphere, with links flowing from one piece of content to another in ever changing patterns. However, that flow is quickly interrupted if those links are broken because of address changes. Tim Berners-Lee, the inventor of the world wide web, addresses this in "Cool URIs don't change".

Berners-Lee suggests keeping information that is likely to change out of the URL.

Berners-Lee also suggests eliminating the file name extension and subject from the URL; however, that may sacrifice readability and hierarchy for the sake of added permanence.

Minor changes or additions to a story, or a switch to paid access (for instance, paid archives), should not affect the URL. When switching to a paid archive system, there is an obvious advantage in having links from other sites, emails, and search engines remain intact. Nathan Ashby-Kuhlman praises the Arizona Daily Sun for handling a story's transition from free to paid appropriately.

Readable

In a perfect world, URLs would just be the machine-readable addresses hidden behind well-written links on web pages. However, in his article on user-centered URL design, Jesse James Garrett argues that we don't live in a perfect world and that URLs need to be both computer and human readable. Just as a computer uses the address to figure out which file in which folder on which machine a user is requesting, a user should be able to figure out what they are requesting, where it is, and what else might be available by reading the URL.

To illustrate this point, imagine reading an article and stumbling upon the following passage:

But Senator Lindsey Graham says Rumsfeld also was preparing the public for more disturbing events.

At this point, you have a decision to make. Do you follow the link or continue reading the story? You mouse over the link to see where it goes, in order to decide. Does the link lead to a transcript of Rumsfeld's testimony, a related story, or a paid contextual ad? Without context provided within the URL itself, it is impossible to make a knowledgeable decision.

To support readability, or "guessability", it is important to keep CMS-specific IDs, template names, session information, and the like out of the URL itself. Instead, use information that adds context to the link, for instance the date published, the section, and/or a short title.

Hierarchical

The importance of hierarchical URLs is connected to the importance of having "hackable URLs", which essentially allow users to modify the URL to get a broader set of information. For instance, Nathan Ashby-Kuhlman suggests a good news organization would use the URL structure example.com/section/subsection/YYYY/MM/DD/slug, where YYYY/MM/DD is the date (for instance 2004/05/07) and the slug is a short title (for instance rumsfeld-transcript). With this structure, hacking the URL should allow users to get to these pages:

  • example.com/section/YYYY/MM/DD/, a list of all articles in that section on that day
  • example.com/section/YYYY/MM/, a list of all articles in that section in that month
  • example.com/section/YYYY/, a list of the articles-by-month pages or long list of articles
  • example.com/section/, the index page for the requested section
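The trimming described above can be sketched in a few lines of Python. This is only an illustration: the function name parent_urls is made up for this example, it assumes the URL includes a scheme such as http://, and it treats the first path segment as the section.

```python
from urllib.parse import urlsplit


def parent_urls(url: str) -> list[str]:
    """Return the broader pages a reader reaches by trimming path segments."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    parents = []
    # Drop one trailing segment at a time, stopping at the section itself.
    for end in range(len(segments) - 1, 0, -1):
        path = "/" + "/".join(segments[:end]) + "/"
        parents.append(f"{parts.scheme}://{parts.netloc}{path}")
    return parents


for page in parent_urls("http://example.com/news/2004/05/07/rumsfeld-transcript"):
    print(page)
```

For a site to be hackable in this sense, each of the printed addresses would need to return a real listing page rather than an error.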

Arguments have been made that the majority of users are not sophisticated enough to "hack" the URL and that it is therefore a waste of time to organize the structure this way. However, Peter Seebach, in his Cranky User column, makes the case for making sites and URLs "expert-friendly" in addition to just being user-friendly. The argument is that users are normally smarter than sites give them credit for; however, they are also often more easily frustrated and might be inclined to navigate elsewhere.

Brief and Clean

URLs take on a life of their own once they're published. They may travel through email, in a document, into a book, or by word of mouth, and, just as with luggage or a fragile parcel, the owner should do everything possible to make sure they reach their final destination.

  • When passed on through email messages, links should be no more than 78 characters to avoid wrapping.
  • Some characters are difficult to tell apart on paper (1, l, O, and 0); where possible, they should be avoided or used only where their meaning is obvious.
  • Many characters are not supposed to be in URLs (~, spaces, etc.) and therefore are not guaranteed to work in all browsers.
  • Since links are often shown with an underline, underscores can be difficult to see. Of note, Google understands dog-pound as two words, while it reads dog_pound as one (rare) word.
  • URLs are case sensitive (although many web servers compensate for this); to avoid confusion, all URLs should be lowercase.
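Most of these rules are mechanical enough to check automatically. The following is a minimal sketch, not a complete validator: the function name url_problems and the wording of its messages are invented for this example, and the look-alike character rule is left out because dates in the URL need the digits 0 and 1.

```python
from urllib.parse import urlsplit

MAX_EMAIL_LENGTH = 78  # limit taken from the email bullet above


def url_problems(url: str) -> list[str]:
    """Return a list of rule violations; an empty list means the URL passes."""
    problems = []
    path = urlsplit(url).path
    if len(url) > MAX_EMAIL_LENGTH:
        problems.append("longer than 78 characters")
    if any(ch in url for ch in " ~"):
        problems.append("contains a space or tilde")
    if "_" in path:
        problems.append("contains an underscore")
    if url != url.lower():
        problems.append("contains uppercase letters")
    return problems


print(url_problems("http://example.com/News/dog_pound"))
```

A check like this could run whenever an editor saves a story, so that problem URLs are caught before they are published and linked to.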

Conclusion

For the vast majority of articles on news sites, Nathan Ashby-Kuhlman suggested the best url format. It is:

example.com/section/YYYY/MM/DD/slug

or

example.com/section/subsection/YYYY/MM/DD/slug

With a readable section, subsection, and slug, this format allows for a readable and hackable URL. As long as special characters are avoided, names are reasonably short, and the article is permanently accessible, these URLs should have higher success rates in search engines, allow for deeper navigation by readers, and make for happier readers overall.
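As a rough sketch of how a site might generate URLs in this format, the Python below builds one from a section, a publication date, and a headline. The names slugify and article_url are hypothetical, and the slug rule (lowercase letters and digits joined by hyphens) is just one reasonable assumption, not something the format itself dictates.

```python
import re
from datetime import date
from typing import Optional


def slugify(text: str) -> str:
    """Lowercase the text and join its letter/digit runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def article_url(section: str, published: date, title: str,
                subsection: Optional[str] = None) -> str:
    """Build example.com/section[/subsection]/YYYY/MM/DD/slug."""
    parts = [slugify(section)]
    if subsection:
        parts.append(slugify(subsection))
    parts.append(published.strftime("%Y/%m/%d"))
    parts.append(slugify(title))
    return "example.com/" + "/".join(parts)


print(article_url("News", date(2004, 5, 7), "Rumsfeld Transcript"))
```

To keep the URL permanent, the slug would be generated once at publication and stored with the article, so that a later edit to the headline does not change the address.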

Written by Gary Love on May 9, 2004 (original article at http://www.newmediajournalism.com/bestpractices/besturls)
