How do URL shorteners work?

As a recent convert to the twitterverse, I’m now using URL shorteners. I use bit.ly, which will take something like this:

http://chronicle.com/article/50-Years-of-Stupid-Grammar/25497

and turn it into:

http://bit.ly/bTTwD

How does that work? (When you explain this, keep in mind that I have only the vaguest idea how a regular URL works, so please don’t get supertechnical.) My understanding is that a regular address takes you to a server where someone posted something. Where does the shortened address take you? To the bit.ly server, which then redirects?

I think you got it. My understanding is that the second link takes you to the bit.ly server, where it looks up bTTwD. At that location it finds a forward to your first link.
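
To make the lookup-and-forward idea concrete, here is a minimal sketch in Python. It is an illustration only, not bit.ly's actual code: the `LINKS` table and the `resolve` function are made-up names, and a real service would keep the table in a database and answer the browser with an HTTP redirect rather than return a pair of values.

```python
# Hypothetical lookup table: short code -> original address.
LINKS = {
    "bTTwD": "http://chronicle.com/article/50-Years-of-Stupid-Grammar/25497",
}

def resolve(short_url):
    """Return (status, location) roughly the way a redirecting server might."""
    # The code is whatever follows the last slash, e.g. "bTTwD".
    code = short_url.rstrip("/").rsplit("/", 1)[-1]
    target = LINKS.get(code)
    if target is None:
        return 404, None   # unknown code: nothing to forward to
    return 301, target     # redirect status: "go to this address instead"

print(resolve("http://bit.ly/bTTwD"))
```

The browser never sees the table; it just asks the short address for a page and is told to go somewhere else.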

That’s it.

And there is an advantageous byproduct of this: if you sign up for a free account with bit.ly, the bit.ly server can track how many people clicked through the bit.ly server to your link. It’s handy for marketing via Twitter and so on, or just for fulfilling your curiosity.

Yeah, I was aware of the counting feature, which I’ve checked out for things I’ve tweeted. Since the link I just “created” had 622 hits before I posted, I infer that they don’t create a new link every time someone requests one.
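
That inference (one shared short link per long address, with a running click count) could look something like the sketch below. Everything here is assumed for illustration: the `Shortener` class, its method names, and the placeholder code format are invented, and it ignores per-account links entirely.

```python
import itertools

class Shortener:
    """Toy shortener that reuses the same code for a repeated long URL."""

    def __init__(self):
        self.code_for = {}              # long URL -> short code
        self.url_for = {}               # short code -> long URL
        self.hits = {}                  # short code -> number of clicks
        self._ids = itertools.count(1)  # source of fresh code numbers

    def shorten(self, long_url):
        # If someone already shortened this address, hand back the old code.
        if long_url in self.code_for:
            return self.code_for[long_url]
        code = "c%d" % next(self._ids)  # placeholder code scheme
        self.code_for[long_url] = code
        self.url_for[code] = long_url
        self.hits[code] = 0
        return code

    def click(self, code):
        # Count the visit, then return the address to forward to.
        self.hits[code] += 1
        return self.url_for[code]
```

With this design, a second person asking for the same long address gets the existing code, along with whatever hits it has already collected.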

ETA: Thanks!

One reason the URLs can be shorter is that they use more of the ASCII character set than just the lower-case letters of a typical URL. YouTube does something like this, too.

Most web URLs are drawn from 26 letters plus 10 digits. Upper & lower case are often treated as equivalent, so you only have about 36 distinctive characters available for each place (some other chars can be used, but I’m not sure how many, so it’s a little more than 36). But if you make a distinction between lower & upper, you have 26 more chars to play with without expanding the length of the URL.

And the URL created doesn’t have to make any sense to humans; all combinations of letters/numbers are usable.

So UrL4W can be a different addr from uRl4w.
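
One way to get codes like that is to number the links and write each number in base 62, using digits plus both cases of letters. This is a sketch under that assumption; nothing in this thread says any particular shortener numbers its links this way, and `ALPHABET`, `encode`, and `decode` are illustrative names.

```python
import string

# 10 digits + 26 lower-case + 26 upper-case letters = 62 symbols.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode(n):
    """Write a non-negative integer using the 62-symbol alphabet."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))

def decode(code):
    """Turn a code back into the integer it stands for."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n

# Same letters, different case, different numbers.
print(decode("UrL4W"), decode("uRl4w"))
```

Because upper and lower case are separate symbols here, the two codes decode to different numbers and so can point at different addresses.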

52^5 is 380 million so they can stay in business for a long time.

In reality, though, I doubt that even 1 million people have used the service, so they could just as easily have used numbers from 000001 to 999999 and gotten the same result. Adding a seventh digit would last ten times as long. Etc.

Probably doesn’t look as cool, though. :cool:

Assuming BOTH letters and numbers can be used, at least, that would be a 62 character set, not just 52 (26+26+10).
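
For anyone who wants to check the arithmetic, the number of possible codes is just the alphabet size raised to the code length. A quick Python check (the `keyspace` helper is only a name for this calculation):

```python
def keyspace(symbols, length):
    """Number of distinct codes of a given length over a given alphabet size."""
    return symbols ** length

print(keyspace(36, 5))  # case-insensitive letters + digits: 60466176
print(keyspace(52, 5))  # upper and lower letters only:      380204032
print(keyspace(62, 5))  # letters of both cases + digits:    916132832
```

So going from 52 to 62 symbols more than doubles the supply of five-character codes, to a bit over 900 million.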

And it is possible that every user needs several URLs, so they might be used up a bit faster than you are calculating.

Tinyurl, one of the oldest of these, started with five letters after the tinyurl.com. Now there are seven characters (it’s not case-sensitive, so it has fewer options). Bit.ly will start doing the same over time – their URLs will grow longer. But not as long as most of the originals.

Note that the domain name (the part before the slash, usually ending in .com or .org or .edu or whatever), which roughly speaking says what computer the site is on, is always case-insensitive. For the rest of the URL, describing which file and where on that computer, it depends on the OS and webhosting software that computer is running. Windows-based machines are case insensitive, and Unix-based machines are case sensitive, but the web server software will often tweak things so it behaves as if it weren’t.

I’m still puzzled at why you would take a descriptive URL & replace it with noise.

Because this link:

is not a descriptive url and is way too long to tweet.

Because Twitter is stupid.

A nice thing Bit.ly offers is the ability to customize your shortened version. So instead of some mishmash you can have http://bit.ly/something. (E.g. when I was promoting a video related to a book trailer, I used nwvideo – “nw” being short for the book title. It’s recognizable and still shorter than the original link.)

Strange, the ones that I’ve done under my own account are only counted for me, not for others who’ve already done the URL compression.

It isn’t just Twitter, it’s also because email clients are often stupid, and will wrap an address (insert line breaks) to make it fit in 72 or 80 columns, forcing people receiving the address to either fix it or just not bother.

For example:

http://www.example.com/?foo=fram;foow245;f3242eadsfc;3435fdsfDwqe21;fdsafewr5435tafsfds;ft524

will get turned into something like:

http://www.example.com/?foo=fram;foo
w245;f3242eadsfc;3435fdsfDwqe21;fds
afewr5435tafsfds;ft524

Notice how that’s no longer a valid address. It’s easier for the recipient if it’s been shortened into something that will not be wrapped.
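
A rough imitation of what such a mail client does, as a Python sketch. Real clients differ in where and how they break lines; the fixed 36-column cut and the `hard_wrap` name are assumptions chosen just to show the effect.

```python
def hard_wrap(text, width):
    """Cut text into pieces of at most `width` characters, ignoring meaning."""
    return [text[i:i + width] for i in range(0, len(text), width)]

url = ("http://www.example.com/?foo=fram;foow245;f3242eadsfc;"
       "3435fdsfDwqe21;fdsafewr5435tafsfds;ft524")

lines = hard_wrap(url, 36)
for line in lines:
    print(line)

# Only the first piece still looks like an address, and it is incomplete.
print(lines[0] == url)
```

A short address fits inside a single line at that width, so there is nothing for the client to break.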

I get spam E-mails sometimes from Craigslist, and when I put my mouse over the link, it starts out http: //bit.ly/something. I don’t give URL shorteners much thought, so I didn’t realize that’s what it is. Good to know.

Hm, I haven’t set up an account there, that’s probably the difference. It would be unique if I did?

Tinyurl has been around for over 12 years. No telling how many URLs they’ve created. Maybe a million in that time? They claim a billion on the web site. That seems like an exaggeration.

I’ve wondered whether they check for and delete tiny URLs that aren’t valid any more.

A billion probably isn’t an exaggeration. There are countless web sites and services that automatically generate URLs at tinyurl.com and other popular URL shorteners. Lots of web sites automatically generate short URLs for every page, for example.

Why would they delete old ones? It would cost more to figure out which ones are invalid than it would to keep them. And there’s no significant benefit in doing so.

By the way, Google has now opened to the public their own URL shortener, with various features such as letting you know how many people have used your short URL, etc…
It’s at http://goo.gl