I have always wondered what the two slashes are for after the service: in some URLs and also why some services like mailto don’t have them (although frankly, that wouldn’t make sense). I googled around but the only thing I found was the answer to the question of why the file protocol usually has three slashes after it.
No answer, but I just wanted to point out that ftp:// and https:// also have the two slashes. There are others that you can use in your browser but they’re not common for the everyday web surfer.
You also said that the // is in some web addresses, but I’ve never seen one that had less than 2 slashes.
I looked at the RFCs, but it has been a long time since I took automata theory. I guess what I don’t understand is why there need to be any slashes after the colon at all. Did URLs precede the world wide web? I don’t remember ever using one before I used a web browser, but back in those days, I was just using trn and occasionally ftp as command line programs.
Can you please define scheme, authority, etc.? I know a few licks of Cisco, but not enough to really decode anything. I’m interested, of course, but the scope of your links is a bit intimidating.
I understand pieces, like in HTML I can use #index to bring me up to the index on the same page (assuming it’s designated that way, of course). I’m just not knowledgeable on the higher (lower?) level things such as the “authority” you mentioned.
Yes, in particular they were used by FTP and Gopher for a long while before HTTP was around.
Both FTP and Gopher used the :// form, so the origin definitely predates web addresses. It always seemed rather arbitrary to me; the precise reasons for the decision are probably lost to time. (Note that the RFCs which standardize URIs were written long after URIs were already in common use as an ad hoc, poorly defined standard.)
Because nothing else was using that, so there was no confusion with existing services.
I suppose if “/” or “\” means “root of a local device”, like C:\, then “//” or “\\” could logically be up one level, or something encompassing more than one device. I’ve seen “///” somewhere in the past, too, IIRC. All these forms predate Microsoft’s “Desktop” and “Network Neighborhood”.
Scheme is the part like “http” or “ftp” or “mailto”. The authority is like “www.straightdope.com”. For some protocols like ftp, you can have an optional username and password. Path is like /sdmb/newreply.php and query is like do=newreply&p=9572205.
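If it helps to see those pieces pulled apart, here is a minimal sketch using Python's standard urllib.parse module. The full URL is an assumption: the host, path and query are the ones quoted above, but the "http" scheme is my guess at how they fit together, and the ftp host, username and password are made up purely for illustration.

```python
from urllib.parse import urlsplit

# Assumed URL: host, path and query come from the post above; "http" is a guess.
url = "http://www.straightdope.com/sdmb/newreply.php?do=newreply&p=9572205"
parts = urlsplit(url)

print(parts.scheme)  # the part before the colon
print(parts.netloc)  # the authority, i.e. what follows the "//"
print(parts.path)    # everything from the next "/" up to the "?"
print(parts.query)   # everything after the "?"

# Hypothetical ftp URL showing the optional username and password
# that can sit inside the authority.
ftp = urlsplit("ftp://user:secret@ftp.example.com/pub/file.txt")
print(ftp.username, ftp.password, ftp.hostname)
```

Note that the "//" is what tells the parser an authority follows; the scheme itself ends at the colon.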
The file protocol has three slashes after it when it is referring to a local authority. You could either write file:///path/to/file.html or file://my-machine-name/path/to/file.html. I am not sure how you would have an authority other than your local machine when using the file protocol, but I guess it’s possible.
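A quick way to see why the third slash appears is to parse both forms. This is only a sketch with Python's urllib.parse, using the two example URLs from the post; the variable names are mine.

```python
from urllib.parse import urlsplit

local = urlsplit("file:///path/to/file.html")
remote = urlsplit("file://my-machine-name/path/to/file.html")

# With three slashes the authority between "//" and the next "/" is empty,
# so the third slash is simply the start of the path.
print(repr(local.netloc), local.path)

# With a machine name, the authority is filled in and the path is unchanged.
print(repr(remote.netloc), remote.path)
```

So "file:///" is really "file://" plus an empty authority plus a path that begins with "/".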
To recap the previous posts, it sounds like the “//” is there for legacy reasons. Is that about right? The RFC dates from Dec. '94.
In very much layman’s terms with lots of details ignored or glossed over:
“Scheme” refers to the type of protocol at a high level.
E.g., “ftp://myserver/myfile” would refer to asking a server named “myserver” to use the ftp protocol to send you a file named “myfile”. And “http://myserver/myfile” would refer to asking a server named “myserver” to use the http protocol to send you a file named “myfile”. Same physical server, same actual file, but the extra packaging & communication needed for the two computers to converse would be utterly different.
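To make that concrete, here is a small sketch that parses the two example URLs with Python's urllib.parse and compares them. It only inspects the text of the URLs; it does not contact any server, and the variable names are mine.

```python
from urllib.parse import urlsplit

ftp_url = urlsplit("ftp://myserver/myfile")
http_url = urlsplit("http://myserver/myfile")

# Only the scheme differs; the authority and path are identical.
print(ftp_url.scheme, http_url.scheme)
print(ftp_url.netloc == http_url.netloc)
print(ftp_url.path == http_url.path)
```

The scheme is the only thing a client has to go on when deciding which protocol to speak to "myserver".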
The internet started out with many different schemes. http & its encrypted version https pretty well won the race.
For user convenience, a modern browser will let you type “google.com” into the address bar and it’ll work. But what the browser actually requests is “http://www.google.com”.
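Roughly speaking, the browser is filling in the missing scheme for you. The sketch below is only an illustration of that idea: add_default_scheme is a hypothetical helper I made up, not how any particular browser does it, and real browsers apply far more heuristics (search detection, preferring https, and so on).

```python
from urllib.parse import urlsplit

def add_default_scheme(typed: str, default: str = "http") -> str:
    """Hypothetical helper: prepend a scheme if the user left it off."""
    if "://" in typed:
        return typed
    return f"{default}://{typed}"

# Without a scheme and "//", a generic parser sees no authority at all;
# the whole string is treated as a path.
bare = urlsplit("google.com")
print(repr(bare.scheme), repr(bare.netloc), bare.path)

# With the scheme filled in, the same text is recognized as the authority.
full = add_default_scheme("google.com")
print(full, urlsplit(full).netloc)
```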
So it’d be easy for a modern user to be clueless as to why the http:// part even exists.
“Authority”. Fancy term for the computer(s) at the other end of the wire. It might be a single specific computer, or it might, like google.com, be a gigantic complex where the authority even refers to different complexes depending on where you are on Earth.
The mechanism for determining how to make contact with the authority named in a url, much less the thousands of steps needed to actually make such contact, operates at a lower level of the whole protocol stack & isn’t part of this discussion.
“Path” is defined however the authority wants it to be. Slashes and something that resembles a hierarchical disk folder/file structure are common, but are not strictly necessary.
“Query” & “fragment” both carry additional information, but only the query is passed to the server. In the case of http, the intent is that path refers to a specific asset on the authority. If that asset is a dumb file, the query is ignored. If the asset is a program capable of dynamic behavior, then the query can be used to influence that behavior. This corresponds very roughly to setting options on a dialog box.
In the case of http, the fragment is never sent to the server; it is used at the client end, where, as somebody noted above, it can direct a browser to jump to a particular spot on a results page.
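Here is a rough sketch of how the query and fragment come apart, again with Python's urllib.parse. The URL is stitched together for illustration: the host, path and query are from earlier in the thread, while the "http" scheme and the "#index" fragment on the end are my assumptions.

```python
from urllib.parse import urlsplit, parse_qs, urldefrag

# Assumed URL, built from pieces quoted earlier in the thread.
url = "http://www.straightdope.com/sdmb/newreply.php?do=newreply&p=9572205#index"

parts = urlsplit(url)

# The query is a list of name=value options, a bit like dialog box settings.
options = parse_qs(parts.query)
print(options)

# The fragment is split off and kept by the client.
without_fragment, fragment = urldefrag(url)
print(without_fragment)
print(fragment)
```

Everything up to the "#" is what a client would use to make the request; the piece after it stays behind for the browser to act on.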
I could be mistaken here, but I do not think mailto: is part of a URL. It is a metacommand used in HTML to request the browser to invoke the local default mail client. If you type a mailto: command in your browser address box it will bring up a mail message, but I think that is more browser implementation than part of the standard for URLs.