L2/12-074
Subject: Property Metadata
To: UTC
From: Mark Davis
Date: 2012-02-05

In http://www.unicode.org/L2/L2010/10052-metaprop.txt, Ken presented a proposal for property metadata. While there was a lot of value in that proposal, overall it is rather complicated. I think we can make some incremental progress by:

a) focusing on the features of properties that are most important to implementers
b) taking the items one by one, and proposing concrete data files for them.

First I’d like to look at the Status. Right now, we have quite a number of different definitions that fall into that category: Immutable, Normative, Informative, Provisional, Contributory, Deprecated, Stabilized, and Obsolete. However, when you look at the use in practice, there are almost no overlaps among the main definitions in that list. (Other metaproperties like Overridable I think need to remain orthogonal.) The few overlaps there are can be resolved pretty simply.

What I propose for these is that we have an enumerated metaproperty called Property_Status, contained in a text file called PropertyStatus.txt. Here is a proposal for the contents of that file, which is based on v6.1 data scraped from UAX #44, with a couple of small changes.

# PropertyName; Status
Decomposition_Mapping Immutable;
Name Immutable;
Canonical_Combining_Class Immutable;
Pattern_Syntax Immutable;
Pattern_White_Space Immutable;
Numeric_Value Normative;
Case_Folding Normative;
Simple_Case_Folding Normative;
Simple_Lowercase_Mapping Normative;
Simple_Titlecase_Mapping Normative;
Simple_Uppercase_Mapping Normative;
Name_Alias Normative;
Age Normative;
Block Normative;
Bidi_Class Normative;
Decomposition_Type Normative;
General_Category Normative;
Hangul_Syllable_Type Normative;
Joining_Group Normative;
Joining_Type Normative;
Line_Break Normative;
NFC_Quick_Check Normative;
NFD_Quick_Check Normative;
NFKC_Quick_Check Normative;
NFKD_Quick_Check Normative;
Numeric_Type Normative;
ASCII_Hex_Digit Normative;
Bidi_Control Normative;
Bidi_Mirrored Normative;
Composition_Exclusion Normative;
Default_Ignorable_Code_Point Normative;
Deprecated Normative;
Full_Composition_Exclusion Normative;
Grapheme_Base Normative;
Grapheme_Extend Normative;
IDS_Binary_Operator Normative;
IDS_Trinary_Operator Normative;
Join_Control Normative;
Logical_Order_Exception Normative;
Noncharacter_Code_Point Normative;
Radical Normative;
Soft_Dotted Normative;
Unified_Ideograph Normative;
Variation_Selector Normative;
White_Space Normative;
Bidi_Mirroring_Glyph Informative;
Lowercase_Mapping Informative;
NFKC_Casefold Informative;
Titlecase_Mapping Informative;
Uppercase_Mapping Informative;
Unicode_1_Name Informative;
kRSUnicode Informative;
Script Informative;
East_Asian_Width Informative;
Grapheme_Cluster_Break Informative;
Sentence_Break Informative;
Word_Break Informative;
Alphabetic Informative;
Case_Ignorable Informative;
Cased Informative;
Changes_When_Casefolded Informative;
Changes_When_Casemapped Informative;
Changes_When_Lowercased Informative;
Changes_When_NFKC_Casefolded Informative;
Changes_When_Titlecased Informative;
Changes_When_Uppercased Informative;
Dash Informative;
Diacritic Informative;
Extender Informative;
Hex_Digit Informative;
ID_Continue Informative;
ID_Start Informative;
Ideographic Informative;
Lowercase Informative;
Math Informative;
Quotation_Mark Informative;
STerm Informative;
Terminal_Punctuation Informative;
Uppercase Informative;
XID_Continue Informative;
XID_Start Informative;
kAccountingNumeric Provisional;
kOtherNumeric Provisional;
kPrimaryNumeric Provisional;
kCompatibilityVariant Provisional;
CJK_Radical Provisional;
Emoji_DCM Provisional;
Emoji_KDDI Provisional;
Emoji_SB Provisional;
Named_Sequences Provisional;
Named_Sequences_Prov Provisional;
Script_Extensions Provisional;
kBigFive Provisional;
kCCCII Provisional;
kCNS1986 Provisional;
kCNS1992 Provisional;
kCangjie Provisional;
kCantonese Provisional;
kCheungBauer Provisional;
kCheungBauerIndex Provisional;
kCihaiT Provisional;
kCowles Provisional;
kDaeJaweon Provisional;
kDefinition Provisional;
kEACC Provisional;
kFenn Provisional;
kFennIndex Provisional;
kFourCornerCode Provisional;
kFrequency Provisional;
kGB0 Provisional;
kGB1 Provisional;
kGB3 Provisional;
kGB5 Provisional;
kGB7 Provisional;
kGB8 Provisional;
kGSR Provisional;
kGradeLevel Provisional;
kHDZRadBreak Provisional;
kHKGlyph Provisional;
kHKSCS Provisional;
kHanYu Provisional;
kHangul Provisional;
kHanyuPinlu Provisional;
kHanyuPinyin Provisional;
kIBMJapan Provisional;
kIICore Provisional;
kIRGDaeJaweon Provisional;
kIRGDaiKanwaZiten Provisional;
kIRGHanyuDaZidian Provisional;
kIRGKangXi Provisional;
kIRG_GSource Provisional;
kIRG_HSource Provisional;
kIRG_JSource Provisional;
kIRG_KPSource Provisional;
kIRG_KSource Provisional;
kIRG_MSource Provisional;
kIRG_TSource Provisional;
kIRG_USource Provisional;
kIRG_VSource Provisional;
kJIS0213 Provisional;
kJapaneseKun Provisional;
kJapaneseOn Provisional;
kJis0 Provisional;
kJis1 Provisional;
kKPS0 Provisional;
kKPS1 Provisional;
kKSC0 Provisional;
kKSC1 Provisional;
kKangXi Provisional;
kKarlgren Provisional;
kKorean Provisional;
kLau Provisional;
kMainlandTelegraph Provisional;
kMandarin Provisional;
kMatthews Provisional;
kMeyerWempe Provisional;
kMorohashi Provisional;
kNelson Provisional;
kPhonetic Provisional;
kPseudoGB1 Provisional;
kRSAdobe_Japan1_6 Provisional;
kRSJapanese Provisional;
kRSKanWa Provisional;
kRSKangXi Provisional;
kRSKorean Provisional;
kSBGY Provisional;
kSemanticVariant Provisional;
kSimplifiedVariant Provisional;
kSpecializedSemanticVariant Provisional;
kTaiwanTelegraph Provisional;
kTang Provisional;
kTotalStrokes Provisional;
kTraditionalVariant Provisional;
kVietnamese Provisional;
kXHC1983 Provisional;
kXerox Provisional;
kZVariant Provisional;
Indic_Matra_Category Provisional;
Indic_Syllabic_Category Provisional;
Jamo_Short_Name Contributory;
Other_Alphabetic Contributory;
Other_Default_Ignorable_Code_Point Contributory;
Other_Grapheme_Extend Contributory;
Other_ID_Continue Contributory;
Other_ID_Start Contributory;
Other_Lowercase Contributory;
Other_Math Contributory;
Other_Uppercase Contributory;
FC_NFKC_Closure Deprecated;
Expands_On_NFC Deprecated;
Expands_On_NFD Deprecated;
Expands_On_NFKC Deprecated;
Expands_On_NFKD Deprecated;
Grapheme_Link Deprecated;
Hyphen Stabilized;
ISO_Comment Obsolete;
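
As a rough illustration of how an implementer might consume a file like PropertyStatus.txt, here is a minimal Python sketch. It is not part of the proposal: the names PropertyStatus and parse_property_status are hypothetical, and the line syntax is an assumption. The header comment reads "PropertyName; Status" while the listed rows read "Name Status;", so the sketch accepts either form. It treats Property_Status as a single-valued enumeration, so a property listed twice is rejected, which matches the premise that the statuses do not overlap.

```python
import re
from enum import Enum


class PropertyStatus(Enum):
    """The eight status values named in the proposal (hypothetical type name)."""
    IMMUTABLE = "Immutable"
    NORMATIVE = "Normative"
    INFORMATIVE = "Informative"
    PROVISIONAL = "Provisional"
    CONTRIBUTORY = "Contributory"
    DEPRECATED = "Deprecated"
    STABILIZED = "Stabilized"
    OBSOLETE = "Obsolete"


def parse_property_status(text):
    """Map each property name to exactly one PropertyStatus.

    Assumptions (not confirmed by the proposal): one entry per line,
    '#' starts a comment, and the two fields are separated by a
    semicolon and/or whitespace, with a trailing semicolon tolerated.
    """
    statuses = {}
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = [f for f in re.split(r"[;\s]+", line) if f]
        if len(fields) != 2:
            raise ValueError(f"line {line_number}: expected 2 fields, got {fields!r}")
        name, status = fields
        if name in statuses:
            # Enumerated metaproperty: a property may have only one status.
            raise ValueError(f"line {line_number}: {name} already has a status")
        statuses[name] = PropertyStatus(status)  # ValueError if status is unknown
    return statuses


# A few rows taken from the listing above, one per status value.
SAMPLE = """\
# PropertyName; Status
Name Immutable;
General_Category Normative;
Script Informative;
Script_Extensions Provisional;
Other_Math Contributory;
Grapheme_Link Deprecated;
Hyphen Stabilized;
ISO_Comment Obsolete;
"""

table = parse_property_status(SAMPLE)
print(table["Hyphen"].value)  # prints: Stabilized
```

Rejecting duplicates at parse time is one way to surface any remaining overlaps between statuses as data errors, rather than leaving them for each implementation to resolve.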