Why does weird text keep popping up in posts?

I’m an admitted techno-idiot, but even I know this is probably a dumb question. Realizing that doesn’t give me the answer, though, and I don’t know where else to turn, so please be patient with me.

Occasionally when I’m reading a post on a message board (not the SDMB), these strange characters will appear mid-word, making it hard to read. Here’s an example: ’. (I added the period.) I use a MacBook and assume these other posters are using PCs and that therein lies the rub, but perhaps I’m off-base. In any case, it’s annoying. Is there any way to make it go away?

I think this is a character encoding problem. The browser is seeing characters it does not know how to render.

I am using Windows 7 with Firefox 4. In there if you go to:


There is a selection for character encoding.

I have the check box that allows the web site to specify the font and I have “Western (ISO-8859-1)” selected.

You will have to noodle out where this is in your browser.

Let us know if that helps.

Thanks, Whack-a-Mole. I usually use Google Chrome but have had this trouble with Firefox, too. On Chrome, I found “Encoding” under view, and it was already set to “Western (ISO-8859-1).” Could it be a problem on the other end? Not all posts have the weird text in them, and it does seem like it tends to happen in the same posters’ text. But then I’d assume everyone saw the weird characters, and nobody else complains.

It’s quicker in FF4 to go to Firefox>Web Developer>Character Encoding.

If something is composed in a word processing program that uses Unicode character encoding – MS Word is notorious for this but far from the only culprit – and then posted on the Web and viewed with a browser that is decoding the page as a single-byte encoding such as ISO-8859-1 or Windows-1252 (“ANSI”), any characters outside plain ASCII will display as the sorts of bizarre characters noted. Particularly common are “smart quotes” – the curly ones, like those around the words smart quotes here, that look like a tiny raised 66 and 99 rather than a straight 11 – as well as curly apostrophes, ellipsis marks (…, a single symbol of three conjoined periods in Unicode rather than three separate period symbols), and em and en dashes. The gibberish is typically the browser reading each multi-byte character as two or three separate single-byte characters.
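Here is a minimal sketch of that mismatch in Python. It assumes the text was stored as UTF-8 and then decoded as Windows-1252, which is what browsers generally use when a page is labeled ISO-8859-1. The function name `garble` is made up for illustration.

```python
def garble(text: str) -> str:
    """Encode text as UTF-8, then misread those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")

# Each of these is ONE character in Unicode but THREE bytes in UTF-8,
# so a single-byte decoder shows three unrelated characters instead.
print(repr(garble("\u2019")))  # curly apostrophe
print(repr(garble("\u2026")))  # ellipsis
print(repr(garble("\u2014")))  # em dash
```

The first line of output is the same three-character sequence quoted in the original question, which suggests the posts in question were written with a curly apostrophe.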

If this is one site, it’s possible it is incorrectly reporting its character encoding. Go to that site, find an incorrect character, and try changing the encoding as mentioned above. You may have to refresh between changes.
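Switching the encoding in the browser menu amounts to decoding the same bytes a different way. A rough sketch in Python, assuming two hypothetical pages (one really saved as UTF-8, one really saved as Windows-1252); the helper name `decode_as` and the sample strings are invented for the example.

```python
def decode_as(data: bytes, encoding: str) -> str:
    # errors="replace" substitutes U+FFFD for bytes that are invalid in
    # the chosen encoding instead of raising an exception.
    return data.decode(encoding, errors="replace")

utf8_page = "It\u2019s".encode("utf-8")     # page really saved as UTF-8
latin_page = "caf\u00e9".encode("cp1252")   # page really saved as Windows-1252

for label, data in (("UTF-8 page", utf8_page), ("Windows-1252 page", latin_page)):
    for encoding in ("cp1252", "utf-8"):
        print(f"{label} viewed as {encoding}: {decode_as(data, encoding)!r}")
```

Only the matching pair reads correctly in each case. A UTF-8 page viewed as Windows-1252 shows extra junk characters, while a Windows-1252 page viewed as UTF-8 tends to show the replacement character instead.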

If that doesn’t fix it, then it may actually be in their software. (TVTropes had this problem for a while.) Or it might be a problem with the browser of the poster.

And, yes, there’s the idea that they encoded it in Word, and then pasted it. That’s not a good idea. Work in Notepad if you need an external app.

The spellcheck in Notepad really sux though.

I’ve occasionally had the same problem on some sites: some pages on Comics 101, for instance.

On that page, with my default encoding – Western (ISO Latin 1) – I see this:

If I change to Unicode (UTF-8), I see

I’m having this problem Big Time in another forum. I’m using FF10 and Windows 7. MY posts are full of weird characters usually when I copy/paste from another web site, but sometimes when I compost directly in the post window (so I’m told). What do my settings need to be so my posts won’t show up on other computers full of weird characters? I currently have character encoding set to Western (ISO-8859-1).

Compost in the post window really stinks. :smiley:

I do not think there is a universal solution to this problem. Different character encoding settings will be right for different sites, and wrong for others. For sites in English, Western (ISO-8859-1) is probably the most commonly used and the best for most people most of the time. Generally speaking, when it does not work, it is the site designer’s fault.

I thought that people were trending away from ISO 8859-1 and using Unicode by default these days. I know I have Unicode set as my browser default–but that’s only for pages that do not announce what character set they are using.

Well, which version of Unicode? I am seeing 4 different options in Firefox.

Character encoding choices largely seem to be a crapshoot to me.

UTF-8 is the only sane Unicode encoding.
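One concrete reason people often give for this (my gloss, not necessarily the poster's reasoning) is that plain ASCII text is byte-for-byte identical in UTF-8, while UTF-16 pads every ASCII character with an extra zero byte. A quick Python sketch:

```python
text = "hello"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")  # little-endian, no byte order mark

print(utf8)                          # same bytes as plain ASCII
print(utf16)                         # every letter followed by a zero byte
print(utf8 == text.encode("ascii"))  # ASCII text is already valid UTF-8
```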

I think a lot of people got close to the target but missed: this problem occurs when a webpage doesn’t tell the browser what character set it’s using, and the browser has to guess (your default). What you’re seeing is the browser guessing wrong.
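A simplified sketch of that decision in Python: use the charset the server declares in its Content-Type header if there is one, otherwise fall back to the user's default. Real browsers also check meta tags and byte order marks and may sniff the content, so treat this as an illustration only. The function name `pick_encoding` and its default value are assumptions made for the example.

```python
from email.message import Message
from typing import Optional

def pick_encoding(content_type: Optional[str], user_default: str = "cp1252") -> str:
    """Return the declared charset, or the user's default if none was declared."""
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        declared = msg.get_content_charset()  # lowercased charset, or None
        if declared:
            return declared
    # Nothing declared: this is the "guess" that can go wrong.
    return user_default

print(pick_encoding("text/html; charset=UTF-8"))  # site declared it
print(pick_encoding("text/html"))                 # site said nothing
```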

The only real solution is for the software running the site to be written correctly by people who are aware of character encoding issues. **Every** website should be setting its encoding; there’s no excuse for not doing this in 2012.
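For what it's worth, here is a rough sketch of what "setting its encoding" can look like, in Python. It assumes the site sends UTF-8 and declares it both in the HTTP Content-Type header and in a meta tag. The function name `build_response` and the sample page are hypothetical, not taken from any particular forum software.

```python
from typing import Dict, Tuple

def build_response(html_text: str, encoding: str = "utf-8") -> Tuple[Dict[str, str], bytes]:
    """Encode the page and declare that same encoding in the headers."""
    body = html_text.encode(encoding)
    headers = {
        # The charset parameter tells the browser how to decode the body.
        "Content-Type": f"text/html; charset={encoding}",
        "Content-Length": str(len(body)),  # length in bytes, not characters
    }
    return headers, body

# The meta tag repeats the declaration inside the document itself.
page = (
    '<!DOCTYPE html><html><head><meta charset="utf-8">'
    "<title>Demo</title></head>"
    "<body><p>It\u2019s fine.</p></body></html>"
)

headers, body = build_response(page)
print(headers["Content-Type"])
```

The important part is that the declared encoding and the bytes actually sent agree, so the browser never has to guess.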