URIs

This document is about naming things on the web. Maybe it should be called something else.

1 naming things

‘Tis but thy name that is my enemy.
Thou art thyself, though not a Montague.
What’s Montague? it is nor hand, nor foot,
Nor arm, nor face, nor any other part
Belonging to a man. O, be some other name!
What’s in a name? that which we call a rose
By any other name would smell as sweet.
So Romeo would, were he not Romeo called,
Retain that dear perfection which he owes
Without that title. Romeo, doff thy name,
And for that name which is no part of thee
Take all myself.

Romeo & Juliet

I’m collecting some thoughts here.

A group of attributes is what each substance here is known-as, they form its sole cash-value for our actual experience. The substance is in every case revealed through THEM; if we were cut off from THEM we should never suspect its existence; and if God should keep sending them to us in an unchanged order, miraculously annihilating at a certain moment the substance that supported them, we never could detect the moment, for our experiences themselves would be unaltered. Nominalists accordingly adopt the opinion that substance is a spurious idea due to our inveterate human trick of turning names into things. Phenomena come in groups—the chalk-group, the wood-group, etc.—and each group gets its name. The name we then treat as in a way supporting the group of phenomena. The low thermometer to-day, for instance, is supposed to come from something called the ‘climate.’ Climate is really only the name for a certain group of days, but it is treated as if it lay BEHIND the day, and in general we place the name, as if it were a being, behind the facts it is the name of. But the phenomenal properties of things, nominalists say, surely do not really inhere in names, and if not in names then they do not inhere in anything.

William James, Pragmatism, Lecture 3

http://www.gutenberg.org/ebooks/5116

And finally, there’s a bit in a Rich Hickey talk where he describes the psychology of perception. As I recall, he was arguing (after Alfred North Whitehead) that we don’t perceive mutable things as a flow, so much as a series of states. This served as a background for a discussion about names and references, in which identity is something we assign to a series of values, not something in which that thing “inheres” (to use James’ word).

All this ties rather mundanely to the way that URIs allow us to name resources, and the way that HTTP tries to deal with changes to those resources.

2 use a naked domain

This rule makes the www-free form of every address the canonical one. The www-free form is also known as a “naked domain.” Is this wise? While there are of course arguments for and against [1], the only consensus is that you must be consistent. That’s what this rule does.

It’s true as of this writing that just about any “big” web site you can think of uses www. The only respectable counterexample is archive.org. Since the only thing I care about is that the addresses are archival, that’s good enough for me. As for the other arguments: willshake will never use cookies.

Using the RedirectMatch directive from Apache’s “alias” module [2], you can permanently redirect from any www address to the equivalent non-www address. I do this by creating a separate VirtualHost for the www form, so it’s important that the VirtualHost for the non-www form not list the www form as a ServerAlias, as you often see.

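# Dedicated host for the www form; all it does is redirect.
<VirtualHost *:80>
  ServerName www.willshake.net
  # Permanently (301) redirect every path to the same path on the naked domain.
  RedirectMatch permanent (.*) https://willshake.net$1
</VirtualHost>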

Obviously, this is only for production.
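
To make the effect of the rule concrete, here is a rough sketch in Python of the mapping it performs. It is not part of willshake; the function name `redirect_target` is made up for illustration, and keeping the query string in the target is an assumption of this sketch rather than something the configuration above states.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "willshake.net"       # host name taken from the config above
WWW_HOST = "www." + CANONICAL_HOST


def redirect_target(url):
    """Return the address a www URL would be redirected to, or None.

    Hypothetical helper: it mirrors the intent of the RedirectMatch rule,
    not Apache's actual matching logic.
    """
    parts = urlsplit(url)
    if parts.hostname != WWW_HOST:
        return None  # not the www form, so this rule does nothing
    # Same path on the naked domain, always over https.
    # Keeping the query string is an assumption of this sketch.
    return urlunsplit(("https", CANONICAL_HOST, parts.path, parts.query, ""))


print(redirect_target("http://www.willshake.net/plays/Ham"))
print(redirect_target("https://willshake.net/plays/Ham"))
```

The first call yields the naked-domain address, and the second yields None because the address is already in canonical form.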

3 canonicalize no trailing slash

Consider the two paths:

  • /plays/Ham
  • /plays/Ham/

They look pretty similar, right? Well, left to its own, willshake will serve exactly the same content for both. And that doesn’t bother me per se.

But with pre-rendering, a location that has children exists on disk as a directory, so its content lives in /plays/Ham/index.html, and mod_dir redirects a request for the slash-free /plays/Ham to /plays/Ham/. Supposedly this is good practice, because it canonicalizes the “right” form for directories. But willshake doesn’t work like that. There’s no difference between a “directory” and a “file” in willshake; there are only locations. It makes no sense that adding child locations should change the canonical URL of the parent (from no slash to slash), but that’s exactly how things work right now with the site pre-rendered.

Bottom line, I hate those redirects. They put me in conflict with myself, because I don’t (and won’t) write the URLs with the slash, but since the server redirects them (for paths that do in fact have children, anyway), I’m telling Google the opposite.

So I noticed this because of pre-rendering the site, but it’s not really specific to that. It’s better practice to observe a canonical form, and I’m sure that if I must do so, then the slash-free form is it.

Finally, I’d like to fix this without resorting to mod_rewrite, since I’d prefer to avoid the use of that module altogether.
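
Whatever the server-side mechanism turns out to be, the canonical form itself is easy to state. The following Python sketch is only an illustration of that form, not the fix: `canonical_path` is a hypothetical name, and treating the root path as the one exception that keeps its slash is an assumption.

```python
def canonical_path(path):
    """Return the slash-free form of a location path.

    Hypothetical helper. The root is assumed to stay "/", since it has
    no slash-free spelling.
    """
    stripped = path.rstrip("/")
    return stripped or "/"


def needs_redirect(path):
    """True when a request path is not already in canonical form."""
    return path != canonical_path(path)


for p in ["/plays/Ham", "/plays/Ham/", "/"]:
    print(p, canonical_path(p), needs_redirect(p))
```

Under this rule both /plays/Ham and /plays/Ham/ name the same location, and only the second would ever be redirected, whether or not the location has children.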

Footnotes:

[2] “Apache mod_alias”, Apache HTTP Server Documentation, Version 2.4

