Emoji and Pictographs
Q: What are emoji?
A: Emoji are “picture characters” originally associated with cellular
telephone usage in Japan, but now popular worldwide. The word emoji
comes from the Japanese 絵 (e ≅ picture) + 文字 (moji ≅ written character).
Emoji are often pictographs—images of things such as faces, weather,
vehicles and buildings, food and drink, animals and plants—or icons that
represent emotions, feelings, or activities. In cellular phone usage, many
emoji characters are presented in color (sometimes as a multicolor image),
and some are presented in animated form, usually as a repeating sequence of
two to four images—for example, a pulsing red heart.
Q: What is the plural of emoji?
A: Both “emoji” and “emojis” are considered acceptable
pluralizations of the word emoji in written English. Unicode uses “emoji” as
the plural due to the Japanese origin of this word. Other publications such as
the Associated Press Stylebook recommend “emojis” as the English plural.
Q: Where can I find out more about emoji in Unicode?
A: See Unicode Emoji,
which introduces the Emoji Subcommittee and its processes, and has links
to many emoji-related charts.
Unicode Technical
Standard #51, Unicode Emoji (UTS #51)
is the technical introduction to Unicode emoji and their implementation.
Q: Are emoji the same thing as emoticons?
A: Not exactly. Emoticons (from “emotion” plus “icon”) are
specifically intended to depict facial expression or body posture as a way of
conveying emotion or attitude in e-mail and text messages. They originated as
ASCII character combinations such as :-) to indicate a smile—and by extension,
a joke—and :-( to indicate a frown. In East Asia, a number of more elaborate
sequences have been developed, such as (")(-_-)(") showing an upset face with
hands raised. Over time, many systems began replacing such sequences with
images, and also began providing ways to input emoticon images directly, such
as a menu or palette. The emoji sets used by Japanese cell phone carriers
contain a large number of characters for emoticon images, along with many
other non-emoticon emoji.
Q: What is the difference between emoji and pictographs?
A: Pictographs are symbols, such as U+26E8 ⛨ BLACK CROSS ON SHIELD,
that are pictorial representations of objects, sometimes quite simplified.
The set of Unicode emoji intersects with, but is not the same as, the set
of pictographs in the Unicode Standard. Some characters are both emoji and
pictographs, such as U+1F32D 🌭 HOT DOG. Some characters are
emoji but not pictographs, such as U+203C ‼ DOUBLE EXCLAMATION MARK. Some
characters are not emoji but are pictographs, such as
U+26E8 ⛨ BLACK CROSS ON SHIELD.
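As a rough illustration, the three example characters can be inspected with Python's standard unicodedata module. This is only a sketch: the "emoji" and "pictograph" labels below are copied from the answer above rather than computed, because the standard library does not expose the Emoji property, and the helper name describe is hypothetical.

```python
import unicodedata

# Labels copied from the FAQ answer; they are NOT derived from any
# Unicode property lookup (unicodedata has no Emoji property).
EXAMPLES = {
    "\U0001F32D": ("emoji", "pictograph"),  # HOT DOG
    "\u203C": ("emoji",),                   # DOUBLE EXCLAMATION MARK
    "\u26E8": ("pictograph",),              # BLACK CROSS ON SHIELD
}


def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch, labels in EXAMPLES.items():
    print(describe(ch), labels)
```

The general category alone does not separate the two sets: U+203C is punctuation (Po) yet is treated as an emoji, while the other two are both symbols (So).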
Q: What are the most popular emoji characters?
A: The emojitracker.com
site tracks the real-time use of many emoji on Twitter, so you can see the most
and least used emoji characters there. The Instagram and Swiftkey reports on
Emoji Press
are also interesting.
Q: Can you point me to some examples of emoji
characters in Unicode?
A: The emoji are spread throughout many blocks of Unicode. See
Unicode Emoji Charts for
a listing of the emoji characters.
Q: Do emoji characters have to look the same
wherever they are used?
A: No, they don’t have to look the same. For example, many different
images are possible for U+1F36D LOLLIPOP, U+1F36E CUSTARD,
U+1F36F HONEY POT, and U+1F370 SHORTCAKE (sample images not shown here).

In other words, any pictorial representation of a lollipop, custard,
honey pot or shortcake respectively, whether a line drawing, gray scale, or
colored image (possibly animated) is considered an acceptable rendition for
the given emoji. However, a design that is too different from other vendors’
representations may cause interoperability problems: see
Design
Guidelines in
UTS #51.
Q: What about diversity?
A: As with the examples of emoji characters representing food items
above, the Unicode Standard does not require a particular appearance for
characters that depict people or body parts, such as U+1F474 OLDER MAN or
U+270B RAISED HAND. In fact,
UTS #51
recommends that such depictions be as neutral or generic as possible with
respect to physical appearance, for example using non-realistic colors for
skin tone, and a gender-neutral appearance for characters other than
those listed in the
Gender section
as having a gendered appearance.
However, many emoji users desire to use emoji for people and body
parts that display a variety of more realistic skin tones. To support this,
many such emoji may be followed by an emoji modifier character that can
indicate one of five skin tones, based on the
Fitzpatrick scale.
See Diversity.
Furthermore,
ZWJ sequences
may be used to specify an explicitly gendered appearance for human-form
characters that by themselves are recommended to have a gender-neutral
appearance; see the list of sequences recommended for general interchange in
emoji-zwj-sequences.txt.
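The two mechanisms above can be sketched in Python as plain string concatenation. This is a minimal illustration, not an implementation of UTS #51: the helper names are hypothetical, nothing here checks that a character is a valid emoji modifier base, and whether the result renders as a single glyph depends on the font and platform.

```python
ZWJ = "\u200D"   # ZERO WIDTH JOINER
VS16 = "\uFE0F"  # VARIATION SELECTOR-16 (requests emoji presentation)

# The five emoji modifiers occupy U+1F3FB..U+1F3FF.
SKIN_TONE_MODIFIERS = [chr(cp) for cp in range(0x1F3FB, 0x1F400)]


def with_skin_tone(base, tone_index):
    """Append one of the five emoji modifiers (index 0-4) to a base character."""
    return base + SKIN_TONE_MODIFIERS[tone_index]


def code_points(text):
    """Format a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


# U+270B RAISED HAND followed by the third of the five modifiers.
raised_hand_medium = with_skin_tone("\u270B", 2)
print(code_points(raised_hand_medium))

# A gendered ZWJ sequence: U+1F3C3 RUNNER, ZWJ, U+2640 FEMALE SIGN, VS16.
woman_running = "\U0001F3C3" + ZWJ + "\u2640" + VS16
print(code_points(woman_running))
```

In both cases the stored text is an ordinary sequence of code points; a system that does not support the sequence can still fall back to showing the individual characters.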
Of course, there are many other types of diversity in human
appearance besides different skin tones, including different hair styles and
color. It is beyond the scope of Unicode to provide an encoding-based
mechanism for representing every aspect of human appearance diversity that
emoji users might want to indicate. The best approach for communicating very
specific human images (or any type of image in which preservation of specific
appearance is very important) is the use of embedded graphics, as described under
“What is the longer term plan for emoji?”
See also “What about characters whose names include WHITE or BLACK?”
Q: How were emoji originally encoded in Unicode?
A: See the introduction in Unicode Technical
Standard #51, Unicode Emoji.
Q: Do emoji characters have single semantics?
A: No. Because emoji characters are treated as pictographs, they are
encoded in Unicode based primarily on their general appearance, not on an
intended semantic. In fact, when used as emoji, many of these characters
acquire multiple meanings based on their appearance; for example, an emoji
character for “bank” which includes the letters “BK” has taken on in Japan
the secondary meaning “bakkureru” (a slang term for evading one’s
responsibilities). The meaning of each emoji may vary depending on language,
culture, and context. For the curious,
Emojipedia is a good source of information
about the current meanings of various emoji.
Q: Does the Unicode character name define the
meaning of an emoji character?
A: The character name is a unique identifier, but may not encompass all
the possible meanings of an emoji character, and in some cases may even be
misleading. There are annotations in the Unicode Charts and in the Emoji Charts
that help to define the intended meanings and usage.
Q: How many emoji characters are in Unicode now?
A: See Which
Characters are Emoji in
UTS #51.
Q: Which characters should an emoji font or
keyboard support?
A: Any font or keyboard whose goal is to support emoji should
support the characters and sequences listed in the data files referenced by
UTS #51.
Q: Will more emoji characters be added?
A: Yes. It is anticipated that roughly 60 characters will be added
per year, until longer-term solutions come into play. Moreover, the
Consortium may decide that some other current characters should be treated
as emoji. Other features may change, such as the characters used as emoji
modifier bases. Much of this depends on how emoji are handled by vendors,
since developing customary usage is important in determining the Unicode
recommendation and guidelines for interoperability.
Q: Don’t emoji detract from the other work of the Consortium?
A: Their encoding, surprisingly, has been a boon for language support. The emoji draw on Unicode mechanisms that are used by various languages, but which had been incompletely implemented on many platforms. Because of the demand for emoji, many implementations have upgraded their Unicode support substantially. That means that implementations now have far better support for the languages that use the more complicated Unicode mechanisms. See L2/18-044.
Q: Do emoji otherwise contribute to language support?
A: The Adopt-a-Character campaign,
funding digitally disadvantaged languages and historic scripts/languages,
has also benefited from the attention paid to emoji.
For details about the campaign, see
Adopt a Character
and How to Apply for an Adopt-a-Character Grant.
Q: How should emoji be displayed?
A: While emoji symbols may be presented using color and animation
(“emoji presentation”), they can also be presented using a plain black-and-white
“text presentation”. For guidelines on which characters should be
displayed with an emoji presentation and how, see
Presentation
Style in
UTS #51.
Q: Is there any way to control the emoji
presentation?
A: Certain characters can be followed by a special character called
a variation selector to request a particular appearance: U+FE0F for
the emoji style (typically colored), and U+FE0E for the text style (black and
white). For more information, see
Presentation
Style in UTS #51.
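As a small sketch of how such a request looks in text data, the Python snippet below appends a variation selector to U+2764 HEAVY BLACK HEART. The function name request_presentation is hypothetical, the snippet does not check whether a variation sequence is actually defined for the given character, and a renderer may or may not honor the request.

```python
TEXT_STYLE = "\uFE0E"   # VARIATION SELECTOR-15: request text presentation
EMOJI_STYLE = "\uFE0F"  # VARIATION SELECTOR-16: request emoji presentation


def request_presentation(ch, style):
    """Return ch followed by the variation selector for 'text' or 'emoji' style."""
    selectors = {"text": TEXT_STYLE, "emoji": EMOJI_STYLE}
    if style not in selectors:
        raise ValueError("style must be 'text' or 'emoji'")
    return ch + selectors[style]


heart = "\u2764"  # HEAVY BLACK HEART
print(request_presentation(heart, "text"))   # typically black and white
print(request_presentation(heart, "emoji"))  # typically colored
```

Each result is two code points long: the base character is unchanged, and the selector only expresses a preference about its appearance.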
Q: What about characters whose names include WHITE
or BLACK?
A: Names of symbols such as BLACK MEDIUM SQUARE or WHITE MEDIUM
SQUARE are not meant to indicate that the corresponding character must be
presented in black or white, respectively; rather, the use of “black” and
“white” in the names is generally just to contrast filled versus outline
shapes, or a darker color fill versus a lighter color fill. Similarly, in
other symbols such as the hands U+261A BLACK LEFT POINTING INDEX and
U+261C WHITE LEFT POINTING INDEX, the words “white” and “black” also refer
to outlined versus filled, and do not indicate skin color.
Q: What about other colors in the name?
A: Other colors in names, such as BLUE HEART or ORANGE BOOK, are
the recommended appearance when the characters are rendered in color. (The
black and white images in the Unicode charts use various shading techniques
as a stand-in for color.)
Q: What is the difference between emoji and dingbats?
A: Most of the characters in the Dingbats block are derived from a
well-established set of glyphs, the ITC Zapf Dingbats series 100, which
constitutes the industry standard “Zapf Dingbat” font currently available in
most laser printers. Emoji and dingbats have some similarities (and a few
characters in the Dingbats block are treated as emoji). However, while there
is often a great deal of flexibility in the range of glyph shapes that may be
used for presentation of emoji, most characters in the Dingbats block are
expected to be presented with glyph shapes that closely align with those
shown in the Unicode Standard, when shown with a “text presentation”.
Q: Can you fix the Unicode image for X?
A: The images shown for Emoji 11.0 (e.g., in emoji-released) are samples from various providers, and don’t reflect the final appearances when implemented by vendors later in the year. Keep in mind, though, that emoji are not expected to be anatomically correct!
That said, the Unicode Consortium can pass on your feedback to the providers of these images so that they may be improved.
Q: But doesn’t the Unicode Consortium determine the design of the images?
A: No. The Unicode Consortium provides character code charts that
show a representative glyph (in a black-and-white text presentation), but is not a designer or purveyor of emoji images, nor is it the owner of any of the color images used in Unicode emoji documents and charts, nor does it negotiate licenses for their use. Inquiries for permission to use vendor images should be directed to those vendors, not to the Unicode Consortium.
See Emoji Images and Rights.
The Sample Colored Glyphs columns use a variety of different styles to illustrate some possible presentations. However, the actual presentations on phones and other devices are up to vendors, subject to the considerations in UTS #51, Unicode Emoji.
Q: I’d like my favorite emoji added to my phone.
Can the Unicode Consortium add it?
A: The Unicode Consortium does not control which emoji are
supported on your phone, or what the emoji on your phone look like. Please
see: Once the Unicode Consortium encodes an emoji character, when
will it appear on my phone?
Q: How can I get the Unicode Consortium to add
a Unicode emoji?
A: To submit a proposal for an emoji, see Submitting Emoji Character Proposals. That page also describes the process
and timeline. You should also look at the Emoji Submission FAQ.
Q: Why is the process so long and complicated?
A: Unicode is the foundation for all modern software: that’s how all
mobile phones, desktops, and other computers represent all text of every
language. You are using Unicode every time you type a key on your phone or
desktop computer, and every time you look at a web page or text in an
application.
It is thus very important that the standard be stable, and that
every character that goes into it be scrutinized carefully.
Q: Once the Unicode Consortium encodes an emoji
character, when will it appear on my phone?
A: As part of normal software release cycles, platform vendors
periodically make decisions about which Unicode characters to support in new
versions of their software. Supporting new emoji characters involves additions
to fonts, enhancements to emoji input methods (keyboards or palettes), and
often updates to libraries that determine character properties and behavior
(such as word selection or line breaking).
New emoji are typically released by vendors in the same year that Unicode finalizes them.
Q: Why can’t I find my national flag in my mobile
application or on my smart phone?
A: For concerns about the emoji and flag symbols available in any
particular application or mobile platform, please contact the manufacturer.
Their software determines what characters are available on your device.
Q: But the Unicode Standard includes other flags,
why don’t you include my flag?
A: The Unicode Standard encodes a set of
regional
indicator symbols. These can be used in pairs to represent any
territory that has a
Unicode
region subtag as defined by CLDR,
such as “DE” for Germany. The pairs are typically displayed as national flags:
there are currently 257 such combinations. For more information, see
Annex B:
Valid Emoji Flag Sequences in
UTS #51.
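The pairing can be sketched in Python by mapping each letter of a two-letter region code to the corresponding regional indicator symbol (U+1F1E6 for A through U+1F1FF for Z). The function name flag_sequence is hypothetical, and the sketch only checks the shape of the input; it does not verify that the pair is a valid region subtag or that any platform displays a flag for it.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag_sequence(region_code):
    """Map a two-letter uppercase region code to a regional indicator pair."""
    if len(region_code) != 2 or not all("A" <= ch <= "Z" for ch in region_code):
        raise ValueError("expected two uppercase ASCII letters, such as 'DE'")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A")) for ch in region_code
    )


germany = flag_sequence("DE")
print(" ".join(f"U+{ord(ch):04X}" for ch in germany))
```

For "DE" the result is the pair U+1F1E9 U+1F1EA, which supporting platforms typically display as the German flag.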
UTS #51 also specifies a mechanism in which an
emoji
tag sequence can be used to represent a
unicode_subdivision_id
defined by CLDR for regions such as England, Scotland, and Wales; see
Annex C:
Valid Emoji Tag Sequences in UTS #51.
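A minimal Python sketch of building such a tag sequence follows. It assumes the structure described in UTS #51 (U+1F3F4 WAVING BLACK FLAG as the base, one tag character per character of the subdivision id, then U+E007F CANCEL TAG) and uses "gbeng" as the id for England. The function name subdivision_flag is hypothetical, and the sketch does not check the id against CLDR data.

```python
import string

TAG_BASE = "\U0001F3F4"    # WAVING BLACK FLAG
CANCEL_TAG = "\U000E007F"  # terminates the tag sequence
TAG_OFFSET = 0xE0000       # tag character = TAG_OFFSET + ASCII code

ALLOWED = set(string.ascii_lowercase + string.digits)


def subdivision_flag(subdivision_id):
    """Build an emoji tag sequence for an id such as 'gbeng' (not validated)."""
    if not subdivision_id or not set(subdivision_id) <= ALLOWED:
        raise ValueError("expected lowercase ASCII letters and digits")
    tags = "".join(chr(TAG_OFFSET + ord(ch)) for ch in subdivision_id)
    return TAG_BASE + tags + CANCEL_TAG


england = subdivision_flag("gbeng")
print(" ".join(f"{ord(ch):04X}" for ch in england))
```

On a system without support for the sequence, the tag characters are normally invisible, so the text would typically appear as a plain black flag.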
Q: What about other flags?
A: Other flags have been proposed for addition, such as an aboriginal flag.
For such cases, there is no good external coding system to follow for validity.
Thus, they need a proposal that provides the same information as required for other
emoji proposals, including establishing likely high frequency of usage.
In general, flags are often not recognized outside of a relatively
small community, and even in those communities often do not have very
high frequency usage. So it is often hard to justify their inclusion
except in very limited cases. Proposers are advised to look
for some related, more “emoji-like” object.
Q: What is the longer term plan for emoji?
A: The Unicode Consortium encourages the use of embedded graphics
(a.k.a. “stickers”) as a longer-term solution, since they allow much more
freedom of expression. See
Longer
Term Solutions in
UTS #51.
Q: Are emoji a new language?
A: Emoji aren’t really a “language”; they don’t
have the grammar or vocabulary to substitute for written language. But in
social media, people like to use them to add color and whimsy to their messages,
and to help to make up for the lack of gestures, facial expressions, and tone
of voice.
They also add a “useful ambiguity” to messages, allowing
the writer to convey many different possible concepts at the same time. They
are probably better viewed as borrowings of foreign words than as a
language in themselves.
Q: But aren't emoji universal?
A: No, emoji are not necessarily "universal". The images
represented by emoji can have or develop very different overtones and usage
depending on a user's language and culture. People can use combinations that
refer to specific words in their language, such as a bombshell movie
in English (emoji sequence image not shown).



People also use emoji for verbs or adjectives as well as nouns; when
they do, they often follow the order used by their language. Some languages
put verbs at the end, for example; others put adjectives after nouns.
Q: If I include an emoji character in a document,
will someone accessing it 100 years from now be able to read it properly?
A: Let's consider the broader question of any character, not just
emoji. Consider the "@". That character was already commonly
included on typewriters manufactured in the United States in the first half
of the 20th century. It was used for prices on shop signs or advertisements
and for accounting: "tomatoes @ 12¢/lb" (tomatoes at 12 cents per
pound) and so forth. Back in the 1960s, in the "ancient" history
of computers, it was encoded in ASCII as the commercial at sign. Email had
not really even been invented. This particular symbol got picked up for
other uses, including marking identifiers in some programming languages.
Today, the most common use is in email addresses: chris@example.com. That
change in function could not have been anticipated, but such changes occur
all the time for various symbols—including, of course, emoji symbols.
For the Unicode Consortium, the important thing about the stability
of "@" is that in 1963 it was 0x40 COMMERCIAL AT in ASCII, and at
present in the Unicode Standard it is U+0040 COMMERCIAL AT, as it will remain.
Software still clearly identifies it as the same character today, some 50
years after its first use with computers. There is no reason to suppose that
50 years from now, U+0040 will not still be clearly interpretable in text
data stores as the same thing, even if people invent additional uses for it.
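The stability described here can be checked directly. The short Python sketch below only illustrates the point; it relies on nothing beyond the standard library.

```python
import unicodedata

at_sign = "@"

# The code point is still 0x40, as in ASCII.
print(f"U+{ord(at_sign):04X}", unicodedata.name(at_sign))

# The encoded byte is the same in ASCII and in UTF-8.
ascii_bytes = at_sign.encode("ascii")
utf8_bytes = at_sign.encode("utf-8")
print(ascii_bytes == utf8_bytes == b"\x40")
```

Text containing "@" that was stored as ASCII decades ago therefore decodes unchanged as UTF-8 today.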
Q: What keeps these characters stable?
A: When new characters are added to the Unicode Standard, they are
added in a way that does not invalidate anything in the prior
versions of the standard. This is called forward compatibility. Everyone
developing any kind of computing system, from laptops to phones to some future
quantum computing cyborg implant, has very strong incentives to ensure
that this remains the case. At this point, nearly 90% of all text data created and
interchanged on the internet is already in Unicode (using the UTF-8 format:
https://w3techs.com/technologies/details/en-utf8/all/all),
and that percentage keeps growing. Even larger volumes of data are generated
and maintained in servers and computers not directly visible on the internet.
There are vast, growing quantities of such data. It would require a complete
collapse of the information technology structure worldwide for all that stored
information to suddenly become uninterpretable. The Unicode data is actually
much more robust and stable than the particular hardware it might be stored on
in any given decade.
Q: What about my emoji question?
A: Just like the commercial at sign, emoji can have and take on
different meanings. For example, U+1F336 HOT PEPPER is a plant symbol that
represents a food item commonly called a hot pepper or a chili pepper. It’s
also frequently used as a menu symbol to indicate the degree of spiciness
in menu items, like the stars used in movie reviews. It could take on another
entirely different meaning in the future, but even if it does, it will remain
stable as the encoded character U+1F336, with that same numeric value and with
the "HOT PEPPER" name, so anybody could still look it up in the
standard, and could interchange it accurately via whatever future version of
software and hardware might be involved in exchanging textual data.
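As an illustration of looking the character up, the Python sketch below uses the standard unicodedata module to go from name to character and back. It assumes a Python version whose Unicode data includes U+1F336, which any recent release does.

```python
import unicodedata

# Look the character up by its stable name.
hot_pepper = unicodedata.lookup("HOT PEPPER")
print(f"U+{ord(hot_pepper):04X}")

# Go back from the character to its name.
print(unicodedata.name(hot_pepper))

# Interchange as UTF-8 bytes and recover the same character.
encoded = hot_pepper.encode("utf-8")
print(encoded.hex(" "), encoded.decode("utf-8") == hot_pepper)
```

Whatever meanings the emoji acquires in use, the code point, the name, and the encoded bytes stay the same.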