Telnet to view source code

Is there any way that I can telnet into my web servers in order to view the source of a page that isn’t published? I vaguely remember something from my old networking classes that goes like:



telnet www.whatever.com http
GET HTTP1.0 www.whatever.com/coolpage.htm


Since it isn’t working right, I’m sure my syntax is wrong.

The type of telnet session you’re thinking of is basically just manually typing out the HTTP requests that a web browser would send; in other words, if you can get the page via that method, you can get it with a web browser.

Depends a lot on how your web server is set up. Most web servers these days don’t allow telnet connections. FTP works for most sites though.

This worked for me…the bold text is what I typed.



$ **telnet www.google.com 80**
Trying 66.102.7.147...
Connected to www.google.com.
Escape character is '^]'.
**GET / HTTP/1.1 www.google.com**

HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html
Set-Cookie: PREF=ID=888b11d924129e3b:TM=1118162608:LM=1118162608:S=-5OTXl0QatEXj942; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
Server: GWS/2.1
Transfer-Encoding: chunked
Date: Tue, 07 Jun 2005 16:43:28 GMT

7c7
<html><head>...


For some reason (and I’m sure there is a perfectly logical one) I had to hit Enter twice after the GET line.
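The second Enter is the empty line that ends the header block of an HTTP request; the server keeps waiting for more header lines until it sees it. A minimal Python sketch of that check follows. The helper name `headers_complete` is made up for illustration, and accepting bare LF line endings is an assumption about lenient servers rather than something the protocol requires.

```python
def headers_complete(buffer: bytes) -> bool:
    """Return True once the buffer contains the blank line that ends the headers."""
    # CRLF CRLF is what HTTP specifies; LF LF is accepted here as an assumption
    # about lenient servers and clients that send bare newlines.
    return b"\r\n\r\n" in buffer or b"\n\n" in buffer


typed = b"GET / HTTP/1.0\r\n"              # request line plus one Enter
print(headers_complete(typed))             # False
print(headers_complete(typed + b"\r\n"))   # True
```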

However, you can’t use this method to reach a page that can’t be reached via a normal web browser. So if you can’t get to http://www.somewhere.com/page.html by typing the address in your browser, you won’t be able to view it via telnet either. All you are doing with this telnet stuff is what your web browser does behind the scenes for you.

Are you telnetting into port 80?

How does the webserver tell the difference between a telnet connection and a browser connection to the same port?

The HTTP 1.0 syntax would be:



telnet www.whatever.com http
GET /coolpage.htm HTTP/1.0


followed by two carriage returns.

The (more up-to-date) HTTP 1.1 syntax would be:



telnet www.whatever.com http
GET /coolpage.htm HTTP/1.0
Host: www.whatever.com


However, as already noted, there is no difference between doing this and accessing whatever.com/coolpage.htm in a browser, then viewing the page source.

Arrgh … that second snippet should say 1.1, obviously…
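With that correction applied, the same two requests can be sketched in Python over a plain socket, which is roughly what the telnet session does by hand. The function names `build_request` and `fetch` are hypothetical, and the `Connection: close` header is an addition not in the snippets above: it is there on the assumption that the server will then close the connection, so the read loop knows when to stop.

```python
import socket


def build_request(host: str, path: str = "/", version: str = "1.1") -> bytes:
    """Build the raw text of a GET request, ending with the required blank line."""
    lines = [f"GET {path} HTTP/{version}"]
    if version == "1.1":
        # HTTP/1.1 requires a Host header; HTTP/1.0 does not.
        lines.append(f"Host: {host}")
        # Assumption: ask the server to close the connection after replying.
        lines.append("Connection: close")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch(host: str, path: str = "/", version: str = "1.1",
          port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response (status line, headers, body)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path, version))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Only prints the request text; call fetch(...) to actually contact a server.
    print(build_request("www.whatever.com", "/coolpage.htm", "1.0").decode("ascii"))
    print(build_request("www.whatever.com", "/coolpage.htm", "1.1").decode("ascii"))
```

As with telnet, `fetch` returns only what the server chooses to send for that URL, so it reaches nothing a browser could not.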

You can also use wget if it’s available on your system. wget basically does the telnet and saves the result to a local file; various options allow you to set the attributes following the GET request line.

[nitpick]Well, almost no difference. A browser will typically send several other attributes (besides the Host: attribute), such as a User-Agent string (to identify the browser type), Accept* (acceptable display formats and languages), Cookie, Referer (URL of referring page), etc. Some webservers require acceptable values for some of these attributes (to prevent easy hotlinking or deeplinking) with unacceptable values causing (e.g.) a redirect to the homepage or a no-hotlinks image. You can of course type these in manually if you know what they should be.[/nitpick]
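To illustrate the nitpick, here is a rough Python sketch that attaches a few browser-style headers to a request using the standard library's `urllib.request`. The helper name `browser_like_request` and all header values are placeholders, not what any particular browser or site actually uses.

```python
import urllib.request


def browser_like_request(url: str, referer: str) -> urllib.request.Request:
    """Build a request carrying some of the extra headers a browser would send."""
    headers = {
        "User-Agent": "Mozilla/5.0 (example)",  # placeholder browser identity
        "Accept": "text/html",                  # acceptable content type
        "Accept-Language": "en",                # acceptable language
        "Referer": referer,                     # URL of the referring page
    }
    return urllib.request.Request(url, headers=headers)


req = browser_like_request("http://www.whatever.com/coolpage.htm",
                           "http://www.whatever.com/")
# Show the headers that would go out after the GET line.
for name, value in req.header_items():
    print(f"{name}: {value}")
```

Passing `req` to `urllib.request.urlopen` would send the request; the sketch stops short of that so it can be run without touching the network.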

It doesn’t. I suspect that engineer_comp_geek meant that the telnet service/daemon on web servers is usually disabled (for security reasons.) But that doesn’t mean you can’t open a terminal to an open port (in this case the HTTP port - 80) and send the expected GET command manually.

The OP was asking about viewing the source code of a page, not the rendered result.

Summarizing what others have said …

For 100% static html pages source and rendered result are the same thing. Load the page with a browser & use the browser’s view source feature. No need for telnet.
For any dynamic pages (ASP, PHP, etc.), the telnet trick just retrieves the page the server generates. That doesn’t get you to the source code of the page in any sense.

In fact, some pages in some technologies are generated entirely by compiled binary code; there is no source page in the traditional web-server sense of a file copied from the server to the browser.

Further, any generated page is dynamic, with the http output stream depending on incoming http attributes, server state, cookies, etc. Unless our intrepid telnet operator knows exactly what the server will do with each of those items, they’ll be hard-pressed to stimulate the server in exactly the same way as their browser would. And even if they got it right, they’d still be looking at a rendered result, not source contents.
Finally, many modern pages, even if static html files on the server, use browser-side script to modify the DOM after it gets to the browser, or use DHTML. The result is that the output as rendered in the browser is different from the output as sent by the server. Normally the view source option in the browser gives you the pre-modification version, i.e. the DOM stream as delivered from the server before the client side starts massaging it.
When you couple dynamic server-side generation with dynamic client-side scripting, the idea that you can “get to the bottom of things” with a tool like telnet is silly. For somebody’s home-brew basic HTML 101 site sure, but not for anything commercial, or even anything built by a modern bot or design tool.