Emoji title test 😀 😃 😄 😁 😆 😂 ☺️ 😊 😇

How large is the Unicode code space allocated to emojis? How many can there be?

ME want SMASHIE!

Me want :wally:

Maybe we can get the Powers That Be to make this a bannable offense.

There isn’t really any fixed code space. There are code blocks that are defined, but none of them are just “emoji.” New blocks are added as more characters are added, just like any other characters.

The bulk of emoji are in the Miscellaneous Symbols and Pictographs block, which runs from U+1F300 to U+1F5FF. But do note that all 768 of these code points are already defined, and only 637 of them are classified as emoji.

Before that block, there are 199 emoji, spread among different blocks. These are the preexisting ones I mentioned. They include even the copyright, trademark, and registered trademark symbols.

After the aforementioned block, there are other blocks that primarily consist of emoji. All 80 characters in the Emoticons block (smileys and such, U+1F600 to U+1F64F) are classified as emoji. The Transport and Map Symbols block (U+1F680 to U+1F6FF) has 94 out of 117 characters considered emoji. The latest block that contains emoji is the Supplemental Symbols and Pictographs block (U+1F900 to U+1F9FF), in which 134 of the 148 defined characters are considered emoji, with 108 code points left undefined.
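For anyone who wants to check the block sizes, here is a quick Python sketch. The ranges are the ones quoted above; the `BLOCKS` table and `block_size` helper are names made up for illustration, and the figure of 148 defined characters is taken from the post rather than looked up.

```python
# Block ranges as quoted in the post (inclusive on both ends).
BLOCKS = {
    "Miscellaneous Symbols and Pictographs": (0x1F300, 0x1F5FF),
    "Emoticons": (0x1F600, 0x1F64F),
    "Transport and Map Symbols": (0x1F680, 0x1F6FF),
    "Supplemental Symbols and Pictographs": (0x1F900, 0x1F9FF),
}


def block_size(first, last):
    """Number of code points in an inclusive range."""
    return last - first + 1


for name, (first, last) in BLOCKS.items():
    print(f"{name}: {block_size(first, last)} code points")

# Assumption from the post: 148 code points in the last block are defined.
unassigned = block_size(*BLOCKS["Supplemental Symbols and Pictographs"]) - 148
print("Left undefined in Supplemental Symbols and Pictographs:", unassigned)
```

The size of a block is just last minus first plus one, so the "left undefined" number is whatever remains after subtracting the defined characters.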

If we need more than 108, another block will be added, up to the upper limit for Unicode, which is 1,114,112 characters. As of the latest Unicode 10.0, 136,755 characters are defined. So there’s an extremely large amount of space left.
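The upper limit comes from Unicode having 17 planes of 65,536 code points each. A rough Python sketch of the arithmetic, using the 136,755 figure quoted above as an assumption; the remainder is only an upper bound, since it still includes ranges that can never hold ordinary characters (surrogates, private use areas).

```python
import sys

PLANES = 17                 # planes 0 through 16
POINTS_PER_PLANE = 0x10000  # 65,536 code points per plane

total = PLANES * POINTS_PER_PLANE
defined = 136_755           # figure quoted in the post for Unicode 10.0

# Python's own ceiling is the last code point, U+10FFFF.
assert total == sys.maxunicode + 1

remaining = total - defined
print(f"{total:,} code points in total, {remaining:,} not yet defined")
```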

Do note that country flags and various skin tones are not all separate emoji, but two emoji combined together. (The country flags are made of two characters representing the country code. So, for example, the US flag uses the U country flag emoji followed by the S country flag emoji. This is so that flags can be easily altered as countries and flags come and go.)
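To make the flag scheme concrete, here is a minimal Python sketch that turns a two-letter country code into its pair of regional indicator characters. The function name `flag` is hypothetical; the only hard fact it relies on is that the indicator for A sits at U+1F1E6 and the rest follow in alphabetical order.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag(country_code):
    """Map a two-letter code such as 'US' to its two regional indicators."""
    if len(country_code) != 2 or not (
        country_code.isascii() and country_code.isalpha()
    ):
        raise ValueError("expected two ASCII letters")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(c) - ord("A"))
        for c in country_code.upper()
    )


us = flag("US")
print(us, [f"U+{ord(c):04X}" for c in us])
```

Whether the result shows up as a flag or as two boxed letters depends on the font and renderer, not on the string itself.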

ETA: Took too long typing: ninja’ed by BigT.

Per Emoji - Wikipedia, there are presently 1182 defined emoji.

There’s not a reserved block of space for emoji. Instead they’re scattered around here and there in the total code space, so there’s not a fixed upper limit. Practically speaking, emoji-ness is a yes/no property of each and every Unicode character.

The maximum capacity of Unicode itself is roughly 1.11 million characters, of which roughly 120,000 are defined today. So about 0.99 million slots are as-yet undefined. And they could all potentially be used for emoji.

Sounds silly, but we’re actually getting close to having a Unicode definition for all known written scripts, ancient and modern. So the growth of “letters” is pretty limited. The as-yet uncovered pictographs/ideographs *a la* Chinese are also becoming more limited as time goes on. Chinese, etc., of course could keep inventing new ideographs indefinitely, just as English invents new words. But they don’t seem to be going that way.

Meanwhile, there’s not much limit to the inventiveness of emoji artists. Unfortunately.

Sun emoji?

Hmmm: :wheelchair::radioactive::bomb:

If folks are going to start using these things, it’s irritating that it appears vBulletin doesn’t handle hex entities correctly.

I can keystroke ampersand pound <decimal number> semicolon and get a character, including an emoji character. E.g., ampersand pound 128640 semicolon gives “🚀” (a 1950s SF rocket ship going northeast).

But if I keystroke ampersand pound x<hexadecimal number> semicolon, I get that sequence rendered literally. E.g., ampersand pound x1f680 semicolon gives “&#x1f680;” instead of the desired rocket ship.
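For comparison, the decimal and hex forms name the same code point, and a decoder that follows the HTML rules treats them identically. A small Python sketch using the standard library's `html.unescape`; this says nothing about how vBulletin parses posts, it only shows what the two spellings are supposed to mean.

```python
import html

decimal_form = html.unescape("&#128640;")  # decimal numeric reference
hex_form = html.unescape("&#x1f680;")      # hexadecimal numeric reference

# Both should decode to U+1F680 ROCKET.
print(decimal_form == hex_form == "\U0001F680")

# The two numbers are the same value written in different bases.
print(int("1f680", 16))
```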

Then again, if this obstacle slows down the onslaught of emojis, it’s probably a desirable feature, not an undesirable bug.

Hi, I’m :hourglass:, pleased to meet you.

What does it mean for a character to be “classified as an emoji”? Is there any application anywhere in the line which treats characters classified as emoji in a different way from other characters? I know that there are some characters from various languages (Korean, I think?) which are sometimes used as emoji even though that’s not what they were designed for.

To be clear, you mean this just in terms of the way they’re encoded, not in terms of the images that are displayed, right? Because the left half of the US flag looks nothing at all like the left half of the UK flag. And of course it adds complication to have encodings which don’t represent any particular glyph or symbol on their own: What do you display if you have just a U flag character? And if you have, say, flag characters for FRU all in a row, is that an undisplayable character followed by a Russian flag, or a French flag followed by an undisplayable character?

You left out the emojis for “stop”, “worry”, and “love”. But it’s a damn good start. :slight_smile:

Who knew our flaky Sultantheme would italicize the emoji which actually makes them lean over? Italic pictures. Whodathunkit?

Hi there EggTimer,Raw. I’m LSLGuy; it’s a pleasure to meet you. :smiley:

This is a decent explanation: Regional indicator symbol - Wikipedia

In summary, there are 26 special code points that, if displayed standalone, will render like uppercase A through Z in a sans-serif font. But when two code points are placed adjacent, such as US, GB, RU, etc., and the combo of “letters” is a recognized one, the render engine is supposed to draw the appropriate national or regional flag if it knows how. If it doesn’t know that flag, or the combo is unrecognized, the letter-like glyphs are displayed.

I suspect whether flag-F flag-R flag-U will display a French flag followed by something resembling a “U” or something resembling an “F” followed by a Russian flag is system dependent.

Emojis we don’t have yet but need:

Hitler

An atomic explosion mushroom cloud

Box. 5 in B&W, one really tiny one. Two more in B&W.

Chrome.

:sunny::calendar: is the closest I can find.

+1 :stuck_out_tongue:

Sun emojis are easy. Here are 4: ������ ������ ������ ������

The problem is finding the Daze. Though maybe the Og smash emoji over one of the faces or people might do.

Good god, are we about to spend the next 500 years reliving the origins and evolution of Chinese writing? All that stuff started out as quickie sketches that looked like the nouns or actions they represented.

How did Chinese character-writing originally evolve pictographs for abstract nouns, most verbs, and nearly all adjectives and adverbs?

Yes, it is just how they are encoded.

I’m not sure why they did it this way. You also see the potential problems. It seems easier to me to have assigned 676 code points (26 × 26), one for each two-letter combination. (Granted, that gets large if you allow three-letter codes, but those codes are, I believe, unofficial.)

I’m not sure there is a standard representation for symbols that do not resolve into a proper country code. I’ve seen flags with the letter(s) on them, letters in boxes, and just the letters. And, of course, just the box that means “invalid character.”

I do believe the codes are greedy and work in the text direction of the language in question, so FRU would be [French flag][U-flag (invalid)].
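Here is a Python sketch of that greedy, left-to-right reading. It is only an illustration of the pairing idea (as far as I know it matches how Unicode's text segmentation rules group regional indicators, but I haven't checked any particular renderer), and the function names are made up.

```python
REGIONAL_INDICATOR_A = 0x1F1E6
REGIONAL_INDICATOR_Z = 0x1F1FF


def is_indicator(ch):
    """True if ch is one of the 26 regional indicator symbols."""
    return REGIONAL_INDICATOR_A <= ord(ch) <= REGIONAL_INDICATOR_Z


def to_letter(ch):
    """Convert a regional indicator back to its plain ASCII letter."""
    return chr(ord(ch) - REGIONAL_INDICATOR_A + ord("A"))


def pair_indicators(text):
    """Split text into units, pairing indicators greedily from the left."""
    units = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if is_indicator(ch) and nxt and is_indicator(nxt):
            units.append(to_letter(ch) + to_letter(nxt))  # a flag candidate
            i += 2
        elif is_indicator(ch):
            units.append(to_letter(ch))  # leftover, unpaired indicator
            i += 1
        else:
            units.append(ch)  # ordinary character, passed through
            i += 1
    return units


fru = "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in "FRU")
print(pair_indicators(fru))
```

Under this reading, F-R-U comes out as the pair FR followed by a lone U, never as F followed by RU.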

And this is an example of why you have to be careful with emoji here. I have the sun emoji that you describe, but I can’t see them. vBulletin was not made to be smiley friendly, and the codes wound up munged.

This was part of my reason for testing (in response to Darren’s question in another thread.)