
CLDR Ticket #8838 (accepted data)

Opened 2 years ago

Last modified 4 months ago

Collation details for the Māori Language (mi)

Reported by: graham_oliver@…
Owned by: markus
Component: collation
Data Locale: mi
Phase: rc

Description

I have been researching this for a while now, and I have:

a) Produced an academic poster summarising the development of collation in the Māori language since it was first written down: https://www.academia.edu/8917175/Orthography-collation-go
b) Written code and test cases in Python to reproduce the sorting scheme used by the Māori Language Commission.
c) Corresponded with the IT person who implemented the sorting scheme for the Māori Language Commission.

What follows is my best effort at defining the minimal rules (with explanation), following the guidelines at http://cldr.unicode.org/index/cldr-spec/collation-guidelines

At Level 1
There are two digraphs, 'ng' and 'wh', each sorting as a separate letter:
n < ng
w < wh
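
The Level 1 rules can be illustrated with a minimal Python sketch. This is not the attached test script; the function names are hypothetical, and it assumes plain a-z order with 'ng' slotted in directly after 'n' and 'wh' directly after 'w'. Macrons and case are dropped here because they only matter at Levels 2 and 3.

```python
import unicodedata

DIGRAPHS = ("ng", "wh")

def strip_macrons(text):
    """Remove combining marks so that only base letters remain."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def primary_key(word):
    """Build a primary key in which 'ng' and 'wh' are single letters.

    Each letter becomes (first_char, is_digraph), so ('n', 1) sorts after
    every ('n', 0) but before ('o', 0), which mirrors n < ng.
    """
    base = strip_macrons(word).lower()
    key = []
    i = 0
    while i < len(base):
        if base[i:i + 2] in DIGRAPHS:
            key.append((base[i], 1))
            i += 2
        else:
            key.append((base[i], 0))
            i += 1
    return tuple(key)

words = ["ngaru", "nui", "noho", "whare", "waka", "wai"]
print(sorted(words, key=primary_key))
# ['noho', 'nui', 'ngaru', 'wai', 'waka', 'whare']
```

With a plain code point sort, 'ngaru' would come before 'noho' and 'whare' would sit between 'waka' and any word starting 'wi'; treating the digraphs as letters moves them to the end of the 'n' and 'w' groups.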

At Level 2
The macronised vowels are sorted *after* the non-macronised vowels.
My understanding is that this is how DUCET already does it, so no rule is necessary.
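
As a rough illustration of the Level 2 behaviour, the sketch below treats a macron as a secondary difference only: it is consulted when the base letters are otherwise identical. It does not call DUCET or any collation library. It assumes macronised vowels decompose under NFD to a base vowel plus U+0304 COMBINING MACRON, and the function name is hypothetical.

```python
import unicodedata

MACRON = "\u0304"  # COMBINING MACRON

def two_level_key(word):
    """Return (primary, secondary): base letters, then macron flags."""
    primary, secondary = [], []
    for ch in unicodedata.normalize("NFD", word.lower()):
        if ch == MACRON and secondary:
            secondary[-1] = 1   # mark the preceding vowel as macronised
        else:
            primary.append(ch)
            secondary.append(0)
    return (tuple(primary), tuple(secondary))

words = ["k\u0101k\u0101", "kake", "kak\u0101", "kaka"]
print(sorted(words, key=two_level_key))
# ['kaka', 'kakā', 'kākā', 'kake']
```

All three 'kaka' variants stay together ahead of 'kake', and among them the unmarked form comes first.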

At Level 3
Upper case sorts before lower case:
Ā <<< ā
Ē <<< ē
Ī <<< ī
Ō <<< ō
Ū <<< ū
NG <<< Ng <<< ng
WH <<< Wh <<< wh
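
The Level 3 ordering can be sketched in Python as a tie-breaker that only applies when two strings are otherwise identical. The function name is hypothetical, and the lower-cased string is used as a simple stand-in for Levels 1 and 2, so this shows the case rule only, not the full scheme.

```python
import unicodedata

def case_key(word):
    """Return (caseless form, tertiary weights) with upper case first."""
    nfc = unicodedata.normalize("NFC", word)
    # 0 for upper case, 1 for anything else, so upper case sorts first.
    tertiary = tuple(0 if ch.isupper() else 1 for ch in nfc)
    return (nfc.lower(), tertiary)

words = ["ng\u0101", "NG\u0100", "Ng\u0101"]
print(sorted(words, key=case_key))
# ['NGĀ', 'Ngā', 'ngā']
```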

Punctuation (basically dashes and spaces) is removed before sorting.
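
A minimal Python sketch of the punctuation step follows. The exact set of characters is an assumption: whitespace, the ASCII hyphen-minus and the Unicode dashes U+2010 to U+2015. The function name is hypothetical.

```python
import re

# Assumption: "dashes and spaces" means whitespace, '-', and U+2010..U+2015.
IGNORED = re.compile(r"[\s\-\u2010-\u2015]")

def strip_ignored(text):
    """Drop spaces and dashes so they play no part in the comparison."""
    return IGNORED.sub("", text)

words = ["te reo", "tere", "teka"]
print(sorted(words))                     # ['te reo', 'teka', 'tere']
print(sorted(words, key=strip_ignored))  # ['teka', 'tere', 'te reo']
```

In the plain sort the space pushes 'te reo' to the front; once spaces are removed it is compared as 'tereo' and falls after 'tere'.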

I have attached a stripped-down version of the code I used to test the above.

There is no English reference to point to. The best I can do is scan some pages from the normative reference dictionary (He Pātaka Kupu), although it is all in Māori.

Let me know if you need any more information.

Regards
Graham Oliver

btw - Thanks for a great project!

Attachments

maori-collation-tests-for-cldr.py (2.8 KB) - added by graham_oliver@… 2 years ago.

Change History

Changed 2 years ago by graham_oliver@…

comment:1 Changed 23 months ago by emmons

  • Status changed from new to accepted
  • Priority changed from assess to medium
  • Phase changed from dsub to rc
  • Milestone changed from UNSCH to 29
  • Owner changed from anybody to markus
  • Type changed from unknown to data

comment:2 Changed 21 months ago by emmons

  • Milestone changed from 29 to upcoming

comment:3 Changed 9 months ago by graham_oliver@…

Hi there
Is this going to be included in release 30?
Thanks
g

comment:4 follow-up: ↓ 5 Changed 9 months ago by markus

  • Milestone changed from upcoming to 31

No, sorry, CLDR 30 is done, this is not in there.

comment:5 in reply to: ↑ 4 Changed 8 months ago by graham_oliver@…

Replying to markus:

No, sorry, CLDR 30 is done, this is not in there.

ok thanks, hopefully 31 then

comment:6 Changed 4 months ago by markus

  • Milestone changed from 31 to 32