Email Attachments?

Any idea why the attached file is so much bigger than the original one?

Internet mail must be “7-bit ASCII”, in which each byte of the message has one of 128 possible values (from 0 to 127). In an attachment, each byte may have any of 256 possible values (from 0 to 255). How can you send any of 256 possible values using a system where only 128 values are possible?

By using an encoding scheme. However, since 128 values are not capable of representing as much information as 256 values, you need more bytes with 128 possible values to represent the original bytes with 256 possible values.
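To make that concrete, here’s a rough Python sketch of the trick that UUENCODE and BASE64 (both mentioned below) use: repack three 8-bit bytes into four 6-bit values, each of which fits comfortably among the 128 values that 7-bit ASCII allows. Four bytes out for every three in is why the attachment grows by about a third. (The byte values here are made up.)

```python
# Repack three 8-bit bytes (24 bits) into four 6-bit values (0-63),
# each of which fits easily in the 7-bit ASCII range (0-127).
b1, b2, b3 = 0xFF, 0x00, 0xAB                     # hypothetical raw attachment bytes
bits = (b1 << 16) | (b2 << 8) | b3                # all 24 bits packed into one integer
sixes = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
print(sixes)                                      # [63, 48, 2, 43] -- four values, all under 64
```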

Of course, the receiving program must understand the encoding scheme and decode the message to obtain the original attachment. There are stand-alone programs to handle cases that your email program doesn’t understand, for example Funduc Explorer Extensions.

Encoding schemes include UUENCODE (used on Unix and some PCs), BASE64 (probably the most common today), and BinHex (Macintosh). There are also Quoted-Printable and XXENCODE, but you don’t see them used very often nowadays.
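If you’re curious what these schemes actually produce, Python’s standard library happens to ship encoders for three of them. A quick sketch, using some made-up sample bytes:

```python
import base64
import binascii
import quopri

data = b"caf\xe9 \x00\xff"                 # invented sample with bytes above 127

print(base64.b64encode(data))              # b'Y2Fm6SAA/w==' -- BASE64
print(binascii.b2a_uu(data))               # one UUENCODE line (handles up to 45 bytes)
print(quopri.encodestring(data))           # b'caf=E9 =00=FF' -- Quoted-Printable
```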


Oh, you might want to know why Internet mail is 7-bit ASCII. I can’t say for sure, but I’ll bet that it’s because it’s always been that way, it works pretty well, and changing it would be a monumental task (replacing every email client and server in the world …).

The fundamental specification of Internet email is RFC 822: Standard for the Format of ARPA Internet Text Messages, last revised in 1982. In that document, “text” is defined as “any CHAR, including bare CR & bare LF, but NOT including CRLF”, and “CHAR” is defined as “Any ASCII character (octal 0-177, decimal 0-127)”. There are other standards that extend the definitions to non-English character sets.
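In other words, a message is legal RFC 822 text only if every byte is below 128. A minimal check (the function name is my own):

```python
def is_7bit_clean(message: bytes) -> bool:
    """True if every byte is an RFC 822 CHAR (decimal 0-127)."""
    return all(byte < 128 for byte in message)

print(is_7bit_clean(b"Plain English text is fine."))   # True
print(is_7bit_clean("café".encode("latin-1")))         # False: 0xE9 is out of range
```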

In the unlikely event that someone wants to research this further, the Internet standards body is the Internet Engineering Task Force. All the mail-related standards are available from the Internet Mail Consortium. You can search Internet standards at Internet RFC/STD/FYI/BCP Archives.

Chief, they have to convert them into letters & numbers, and that uses a lot more space.

Internet content standards (IRC, Usenet, E-Mail, and (in some circumstances) even HTTP and FTP) specify 7-bit encoding. In the bad old days, networking was usually mediated by Old Crappy Modems, which used the most significant bit of each eight-bit byte as a “parity” bit: this bit would be set or reset to make an even (or odd) number of 1s in the byte. Old Crappy Modems used this feature as a very primitive method of error checking; checksums and CRCs hadn’t yet come into wide use.
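For the curious, here’s roughly how even parity worked, sketched in Python (the function name is mine): the eighth bit is set or cleared so the total count of 1 bits in the byte comes out even.

```python
def add_even_parity(byte7: int) -> int:
    """Set the top bit so the 8-bit result has an even number of 1 bits.

    Assumes byte7 is a 7-bit value (0-127), as on those old modem links.
    """
    ones = bin(byte7).count("1")
    return byte7 | ((ones % 2) << 7)       # set bit 7 only if the count is odd

for ch in "AC":                            # 'A' = 1000001 (two 1s), 'C' = 1000011 (three 1s)
    print(f"{ch}: {add_even_parity(ord(ch)):08b}")
# A: 01000001 -- already an even number of 1s, parity bit stays 0
# C: 11000011 -- odd count, so the parity bit is set
```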

A lot of software, mostly Unix Usenet and E-Mail servers, was written in the old days to the 7-bit standard. Even though all modern computers and networking systems can transfer the full eight bits of every byte, it’s “easier” to keep the old standards than to rewrite all the old server software to take advantage of the extra bit.

Original ASCII had 7 bits; only later, to allow for foreign characters, were several 8-bit codes developed.

I have no knowledge about Internet servers, but I believe some will work with 8-bit codes while some won’t. Can anyone clarify this?

Thanks for the reply… very informative!