I don’t think I understand you. What about HTML is redundant and needs to be trimmed? A lot of new tricks, such as DHTML and CSS, help pages parse faster.
The true culprit behind slow-loading web pages is TABLEs. Good luck simplifying that code; it’s rather simple as-is.
Monfort, tables load and display a lot faster than most alternate methods.
HTML is already fairly well optimized. It is in text format, and ASCII text compresses well over modem or Ethernet connections. Graphic formats like .jpg are heavily compressed. The big problem is the new tricks like CSS, which add tons of bloated code to .html files.
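If you want to see for yourself how well plain ASCII markup compresses, here’s a quick check using Python’s gzip module as a stand-in (the sample text and the 50x repetition are just illustrative, not tied to any real page or modem protocol):

```python
import gzip

# A small, repetitive chunk of plain HTML -- the kind of ASCII text
# that link-level compression handles well.
html = ("<html><body>"
        + "<p>Plain text markup repeats its tags, so it compresses well.</p>" * 50
        + "</body></html>").encode("ascii")

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)
print(f"original: {len(html)} bytes, gzipped: {len(compressed)} bytes "
      f"(ratio {ratio:.2f})")
```

A .jpg, already being compressed, would show almost no gain from the same treatment.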
I don’t think HTML parsing is what makes the web slow. I also don’t think “endless language parsing” is a good way to describe it. Perhaps companies have not “seen the need” for something more because there is no need. Parsing a text file does not actually take that long.
I would say the bigger problem with the slow web is the sheer congestion of it, which companies are spending “mucho millions” on improving now. A more immediate need if you ask me.
HTML parsing is done entirely client-side, so it does not affect the Internet, as such, in any way. The only connection is the size of the files being transmitted.
Regarding size: I opened a plain hand-coded HTML file in MS Word to do some things that are just easier in a word processor. After the conversion from plain old HTML to CSS, etc., the file size jumped from about 1.5k to 22k. That is the problem: bloated files that don’t do anything the plain HTML version didn’t.
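To make that bloat concrete, here’s a toy comparison. The “word-processor style” markup below is a hypothetical imitation of that kind of export, not actual MS Word output, but it shows how the byte count balloons while the rendered page stays the same:

```python
# Hand-coded HTML: one paragraph, one bold word.
plain = '<p>Hello <b>world</b></p>'

# The same paragraph styled the way word-processor exports tend to do it:
# redundant inline styles, vendor classes, and wrapper spans.
# (Hypothetical example for illustration, not a real export.)
bloated = ('<p class="MsoNormal" style="margin:0in;margin-bottom:.0001pt;'
           "font-size:12.0pt;font-family:'Times New Roman'\">"
           "<span style=\"mso-fareast-font-family:'Times New Roman'\">Hello "
           '<b style="mso-bidi-font-weight:normal">world</b></span></p>')

print(f"plain:   {len(plain)} bytes")
print(f"bloated: {len(bloated)} bytes "
      f"({len(bloated) / len(plain):.0f}x larger)")
```

Multiply that ratio across every paragraph of a page and a 1.5k file turning into 22k is no mystery.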
I think another problem we are seeing more of is that the web is increasingly generated by server-side engines such as CGI, ASP, PHP, JSP, Cold Fusion, etc. While the files they generate (create on the fly) are HTML, the programs that create them are often not compiled. The effect is the browser (client) parsing a single file (HTML stream) produced by a server that is trying to service hundreds or thousands of similar requests simultaneously. More recent server technologies (JSP, ASP+) are partially compiled in the hope of speeding up this part of the equation.
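A rough sketch of why that partial compilation helps, using Python’s string.Template as a stand-in for a server-side page (the names and structure here are illustrative, not any actual ASP/JSP internals):

```python
from string import Template

PAGE_SOURCE = "<html><body><p>Hello, $user!</p></body></html>"

def render_interpreted(user):
    # Classic CGI style: the page source is re-parsed
    # from scratch on every single request.
    return Template(PAGE_SOURCE).substitute(user=user)

# JSP/ASP+ style: parse ("compile") the template once at startup...
COMPILED = Template(PAGE_SOURCE)

def render_compiled(user):
    # ...then each request only does the cheap substitution step.
    return COMPILED.substitute(user=user)

# Both paths produce identical HTML for the client.
assert render_interpreted("alice") == render_compiled("alice")
```

Under thousands of simultaneous requests, paying the parse cost once instead of per-request is where the savings come from.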
IMHO nothing has done more to create the “World Wide Wait” than the invention of the <img> tag. Not just because images run up file sizes, but because it makes the web appealing to those who are impressed by pretty pictures. Oh, for the days when the WWW was just geeks exchanging information, instead of millions of pictures of someone else’s cat.
But, by the same token, the bottom line is bandwidth. And while all those “cats” may be killing bandwidth overall, when I switched to DSL the WWW turned into the “World Wide Wow”. And the path my data travels is 90% the same as before. Not to hijack, but what gives with that?