Unicode Technical Note #33

List of Dandas
in the Unicode Standard

Version	2
Authors	Ken Whistler, Rick McGowan
Date	2012-07-11
This Version	http://www.unicode.org/notes/tn33/tn33-2.html
Previous Version	http://www.unicode.org/notes/tn33/tn33-1.html
Latest Version	http://www.unicode.org/notes/tn33/

Summary

This document provides a list of danda characters in the Unicode Standard.

Status

This document is a Unicode Technical Note. Sole responsibility for its contents rests with the author(s). Publication does not imply any endorsement by the Unicode Consortium. This document is not subject to the Unicode Patent Policy.

For information on Unicode Technical Notes including criteria for acceptance, see http://www.unicode.org/notes/.

Introduction
The Table of Dandas

1 Introduction

Dandas are punctuation characters commonly seen in the typographic traditions of writing systems of South and Southeast Asia. While they occur in many scripts, they are primarily found in traditional materials written in scripts historically derived from the Brahmi script.

The typical appearance of a danda is simply a vertical bar. Two vertical bars may also be paired together in a corresponding punctuation mark known as a double danda. Tripled forms may also occur, but are much less common. Although forms based on a simple vertical bar are typical, in some scripts more elaborate forms have developed, and in some cases—such as Tibetan, in which the danda is termed a shad—the danda mark may accrue additional adornments.

Dandas generally delimit phrase-, sentence-, or section-level divisions in text. When both a single and a double danda occur, the double danda is used to demarcate larger units of text than the single danda. This usage is roughly comparable to the use of commas and full stops in Western typography, although dandas typically mark larger phrasal units than what might be separated by commas in Western typography. In many traditional materials, dandas and double dandas delimit what might be best termed verses or sections, and do not map easily onto concepts such as "sentence". Usage may also vary by script, by language, and by corpus.

Many South and Southeast Asian scripts in modern usage have adopted Western typographic practice in varying degrees. In such contexts dandas are often supplanted by common-use Western punctuation marks.

Many of the danda characters encoded in the Unicode Standard have the word "DANDA" in their name. However, many other punctuation marks that are historically and functionally dandas are encoded with distinct names specific to a particular script. Also, because danda characters do not all have simple vertical bar shapes, they are not always easy to find when searching the code charts.

To make it easier to identify danda characters in the Unicode Standard, this Technical Note includes a specific list of known danda characters as of Unicode 6.1. This list may be updated periodically if further danda characters are added to the Standard.
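The difficulty of finding dandas by name can be illustrated with a short sketch. It assumes Python's standard `unicodedata` module, whose data reflects whatever Unicode version the interpreter was built with. The helper name `named_dandas` and the sample list are illustrative only and are not part of this Technical Note's data.

```python
import unicodedata


def named_dandas(code_points):
    """Return the code points whose character name contains 'DANDA'."""
    return [cp for cp in code_points
            if "DANDA" in unicodedata.name(chr(cp), "")]


# A few code points taken from the table in Section 2.
sample = [0x0964, 0x0965, 0x0E5A, 0x0F0D, 0x104A, 0x17D4]

found = named_dandas(sample)
missed = [cp for cp in sample if cp not in found]

# List the dandas that a search on the word "DANDA" does not find.
for cp in missed:
    print(f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}")
```

With this sample, a name search finds only the two Devanagari characters. The Thai, Tibetan, Myanmar, and Khmer marks are dandas as well, but their names do not contain the word "DANDA".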

2 The Table of Dandas

The table below uses the usual Unicode data file format: semicolon-delimited fields, optionally followed by "#" and a comment. It lists the characters in the Unicode Standard that are dandas. The first field is a code point or code point range. The second field is the General Category of the character. The comment gives the name of the character or, for a range, the number of characters in brackets followed by the names of the first and last characters in the range.

# Dandas

# [Not derivable]

0964..0965    ; Po #   [2] DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
0E5A          ; Po #       THAI CHARACTER ANGKHANKHU
0F08          ; Po #       TIBETAN MARK SBRUL SHAD
0F0D..0F12    ; Po #   [6] TIBETAN MARK SHAD..TIBETAN MARK RGYA GRAM SHAD
104A..104B    ; Po #   [2] MYANMAR SIGN LITTLE SECTION..MYANMAR SIGN SECTION
1735..1736    ; Po #   [2] PHILIPPINE SINGLE PUNCTUATION..PHILIPPINE DOUBLE PUNCTUATION
17D4..17D5    ; Po #   [2] KHMER SIGN KHAN..KHMER SIGN BARIYOOSAN
1AA8..1AAB    ; Po #   [4] TAI THAM SIGN KAAN..TAI THAM SIGN SATKAANKUU
1B5E..1B5F    ; Po #   [2] BALINESE CARIK SIKI..BALINESE CARIK PAREREN
1C3B..1C3C    ; Po #   [2] LEPCHA PUNCTUATION TA-ROL..LEPCHA PUNCTUATION NYET THYOOM TA-ROL
1C7E..1C7F    ; Po #   [2] OL CHIKI PUNCTUATION MUCAAD..OL CHIKI PUNCTUATION DOUBLE MUCAAD
A876..A877    ; Po #   [2] PHAGS-PA MARK SHAD..PHAGS-PA MARK DOUBLE SHAD
A8CE..A8CF    ; Po #   [2] SAURASHTRA DANDA..SAURASHTRA DOUBLE DANDA
A92F          ; Po #       KAYAH LI SIGN SHYA
A9C8..A9C9    ; Po #   [2] JAVANESE PADA LINGSA..JAVANESE PADA LUNGSI
AA5D..AA5F    ; Po #   [3] CHAM PUNCTUATION DANDA..CHAM PUNCTUATION TRIPLE DANDA
AAF0          ; Po #       MEETEI MAYEK CHEIKHAN
ABEB          ; Po #       MEETEI MAYEK CHEIKHEI
10A56..10A57  ; Po #   [2] KHAROSHTHI PUNCTUATION DANDA..KHAROSHTHI PUNCTUATION DOUBLE DANDA
11047..11048  ; Po #   [2] BRAHMI DANDA..BRAHMI DOUBLE DANDA
110C0..110C1  ; Po #   [2] KAITHI DANDA..KAITHI DOUBLE DANDA
11141..11142  ; Po #   [2] CHAKMA DANDA..CHAKMA DOUBLE DANDA
111C5..111C6  ; Po #   [2] SHARADA DANDA..SHARADA DOUBLE DANDA
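
Because the table follows the usual data file layout, it can be read with a few lines of code. The following is a minimal sketch, not a reference parser: the function names `parse_line` and `expand` are hypothetical, and the embedded sample reproduces only three rows of the table.

```python
def parse_line(line):
    """Parse one table line into (first, last, category, comment), or None."""
    # Everything after the first "#" is a comment.
    data, _, comment = line.partition("#")
    data = data.strip()
    if not data:
        return None  # blank line or comment-only line
    code_field, _, category = data.partition(";")
    # A code point field is either "XXXX" or a range "XXXX..YYYY".
    first, _, last = code_field.strip().partition("..")
    start = int(first, 16)
    end = int(last, 16) if last else start
    return start, end, category.strip(), comment.strip()


def expand(text):
    """Return every code point listed in the table text, in order."""
    code_points = []
    for line in text.splitlines():
        entry = parse_line(line)
        if entry is not None:
            start, end, _category, _comment = entry
            code_points.extend(range(start, end + 1))
    return code_points


SAMPLE = """\
# Dandas

0964..0965    ; Po #   [2] DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
0E5A          ; Po #       THAI CHARACTER ANGKHANKHU
AA5D..AA5F    ; Po #   [3] CHAM PUNCTUATION DANDA..CHAM PUNCTUATION TRIPLE DANDA
"""

dandas = expand(SAMPLE)
print([f"U+{cp:04X}" for cp in dandas])
```

Expanding the ranges this way also makes it straightforward to check the bracketed counts in the comments, since a range first..last contains last - first + 1 code points.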

3 References

[Glossary]	Unicode Glossary http://www.unicode.org/glossary/ For explanations of terminology used in this and other documents.
[UCD]	Unicode Character Database http://www.unicode.org/ucd/ For detailed documentation about the Unicode Character Database, see Unicode Standard Annex #44: Unicode Character Database http://www.unicode.org/reports/tr44/
[Unicode]	The Unicode Standard For the latest version, see: http://www.unicode.org/versions/latest/

Modifications

The following summarizes modifications from the previous version of this document.

2	Updated for Unicode 6.1 additions.
1	Initial version, corresponding to Unicode 6.0.

Copyright © 2010-2012 Rick McGowan, Ken Whistler, and Unicode, Inc. All Rights Reserved. The Unicode Consortium and the authors make no expressed or implied warranty of any kind, and assume no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information or programs contained or accompanying this technical note. The Unicode Terms of Use apply.

Unicode and the Unicode logo are trademarks of Unicode, Inc., and are registered in some jurisdictions.