Unicode Common Locale Data Repository: Bug Tracking

CLDR Ticket #5257 (accepted data)

Opened 6 years ago

Last modified 3 years ago

document fa_AF=prs & merged fa_AF=ps collation tailoring

Reported by: markus
Owned by: shervin
Component: collation
Data Locale: fa_AF
Phase: dsub
Review:
Weeks: 0.1
Data Xpath:


common/collation/fa_AF.xml has <alias source="ps" path="//ldml/collations"/>, which means that Persian-in-Afghanistan supposedly has the same sort order as Pashto. That order is very different from the Persian tailoring.
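To make the alias easier to spot, here is a minimal Python sketch that scans a collation file for <alias> elements. Only the <alias> element itself is quoted from this ticket; the surrounding wrapper in the sample and the helper name find_aliases are assumptions made for illustration, not the actual file contents or any CLDR tooling.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for common/collation/fa_AF.xml. Only the <alias> element
# is quoted from the ticket; the wrapper elements are an assumption.
SAMPLE = """<ldml>
  <collations>
    <alias source="ps" path="//ldml/collations"/>
  </collations>
</ldml>"""


def find_aliases(xml_text):
    """Return (source, path) pairs for every <alias> element in the document."""
    root = ET.fromstring(xml_text)
    return [(el.get("source"), el.get("path")) for el in root.iter("alias")]


print(find_aliases(SAMPLE))
```

Run against the real file, a non-empty result would indicate that the locale borrows its collation from another locale instead of defining its own tailoring.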

This seems strange because Persian (fa) and Pashto (ps) are significantly different languages; see http://en.wikipedia.org/wiki/Pashto_language

The main locale data is not aliased: common/main/fa_AF.xml overrides some of fa.xml and inherits other parts, but does not have any aliases.

I suggest we have someone familiar with languages of that area review this, and consider removing the common/collation/fa_AF.xml file.


Change History

comment:1 Changed 6 years ago by ake.persson@…

This resource might be useful, http://www.evertype.com/standards/af/

comment:2 Changed 6 years ago by roozbeh

As far as I remember, Michael Everson and I tried to come up with a unified collation for the languages of Afghanistan when we were writing the UNDP report. This is especially useful because Afghan names may be written in any of the languages. Since the Pashto alphabet is a superset of the Persian alphabet and keeps the order of characters, the unification was very easy and straightforward. So, yes, I would expect Pashto and Afghan Persian collation to be the same.
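The reasoning in this comment (a larger alphabet that keeps the smaller alphabet's letter order sorts the smaller alphabet's words identically) can be illustrated with a small Python sketch. The alphabets below are toy Latin strings, not the real Persian or Pashto letter orders, and the helper names are hypothetical. The sketch also models only a single level of letter ranking, which is a simplification of real collation.

```python
def is_ordered_superset(smaller, larger):
    """True if every letter of `smaller` occurs in `larger` in the same relative order."""
    it = iter(larger)
    # `ch in it` advances the iterator, so letters must appear in order.
    return all(ch in it for ch in smaller)


def sort_with(alphabet, words):
    """Sort words by the position of each letter in `alphabet`."""
    rank = {ch: i for i, ch in enumerate(alphabet)}
    return sorted(words, key=lambda w: [rank[ch] for ch in w])


# Toy alphabets, NOT the real Persian or Pashto letter orders.
SMALL = "abcd"
LARGE = "abxcyd"  # SMALL plus two extra letters, original order kept

words = ["dab", "cab", "bad", "add"]
assert is_ordered_superset(SMALL, LARGE)
assert sort_with(SMALL, words) == sort_with(LARGE, words)
```

Under these assumptions, words written only with the smaller alphabet sort the same way under either ordering, which is why a single merged tailoring can serve both languages.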

comment:3 Changed 6 years ago by markus

The Wikipedia article says "Pashto is one of the two official languages of Afghanistan (the other being Dari Persian)", and that article says that Dari Persian has the language code prs, not fa_AF.

If this is correct, then we should remove collation/fa_AF.xml and move its contents (basically an alias to collation/ps.xml) into a new file collation/prs.xml.

In any case, wherever we have this alias relationship, please add a comment in both the aliasing and the aliased files with basically Roozbeh's information.

comment:4 Changed 6 years ago by roozbeh

For normal locale data, we shouldn't move "fa_AF" to "prs". In the ISO 639-3 model, "fa" is a macrolanguage, like Chinese and Arabic, and it has two sublanguages, "pes" for Iranian Persian and "prs" for Afghan Persian. Also, the differences between written "pes" and "prs" are minimal (and are limited to things like country and language names), which makes any Iranian Persian localization usable for Dari. The current model of having "fa_AF" and "fa" works perfectly for that. But I have no objection to aliasing the other way around: "prs" to "fa_AF" and "pes" to "fa_IR".
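The alias direction proposed here can be sketched as a simple lookup in Python. The table and the function name resolve_locale are hypothetical illustrations of the idea in this comment, not CLDR's actual alias data or API.

```python
# Proposed alias direction from this comment: the ISO 639-3 codes point at
# the existing CLDR locale IDs, not the other way around.
LOCALE_ALIASES = {
    "prs": "fa_AF",  # Afghan Persian (Dari)
    "pes": "fa_IR",  # Iranian Persian
}


def resolve_locale(locale_id):
    """Map an aliased ID to its target; leave any other ID unchanged."""
    return LOCALE_ALIASES.get(locale_id, locale_id)


for requested in ("prs", "pes", "fa_AF", "ps"):
    print(requested, "->", resolve_locale(requested))
```

With this direction, "fa_AF" stays the single locale name for both main locale data and collation, and "prs" is only an alternate way to reach it.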

I would like to be able to keep the same default locale names for collation too. So Dari wouldn't become fa_AF for normal locale data and prs for collation.

(BTW, I have provided most of the locale information for "fa", "fa_AF", and "ps" myself, the latter two from two field trips to Kabul and months of research.)

comment:5 Changed 6 years ago by markus

Roozbeh: That all sounds reasonable to me. Please document fa_AF=prs in the main/fa_AF and collation/fa_AF files, and add comments in both the collation/fa_AF and collation/ps files about the merged tailoring you created, as you described earlier. Only if it's documented will we avoid further such questions from people who know nothing about these languages (like me).

I actually don't know how we add comments in main/* files since they get generated from the Survey Tool (right?). But I believe that collation/* files are manually maintained.

comment:6 Changed 6 years ago by mark

  • Owner changed from anybody to markus
  • Status changed from new to assigned
  • Milestone changed from UNSCH to 23

We can have comments in the files, and they are maintained.

comment:7 Changed 5 years ago by markus

  • Owner changed from markus to roozbeh
  • Priority changed from assess to medium
  • Summary changed from collation: fa_AF == ps, really? to document fa_AF=prs & merged fa_AF=ps collation tailoring
  • Milestone changed from 23 to 24dsub

TODO: comment 5

Note: The tailorings were changed from a forbidden alias to an import; the documentation should still be added.

comment:8 Changed 5 years ago by mark

  • Milestone changed from 24dsub to 24dres

comment:9 Changed 5 years ago by markus

Comments in main/* need to be added either before the Survey Tool opens or after data has been imported from it. Otherwise they get clobbered.

comment:10 Changed 5 years ago by markus

  • Keywords collation added
  • Component changed from data-collation to data

Note: Comments to be added in both main/ and collation/, see the list of comments here.

comment:11 Changed 5 years ago by emmons

  • Component changed from data to data-collation
  • Milestone changed from 24rc to 25design

comment:12 Changed 4 years ago by emmons

  • Milestone changed from 25design to 25rc

Moving all 25design to 25rc. If you plan to complete this item in the 25M1 time frame, please change the milestone to 25M1.

comment:13 Changed 4 years ago by emmons

  • Milestone changed from 25rc to 26rc

comment:14 Changed 4 years ago by mark

  • Milestone changed from 26rc to 27dsub

comment:15 Changed 4 years ago by markus

  • Phase set to dsub
  • Milestone changed from 27dsub to 27

comment:16 Changed 3 years ago by roozbeh

  • Milestone changed from 27 to 28

comment:17 Changed 3 years ago by markus

  • Type changed from defect to data

comment:18 Changed 3 years ago by roozbeh

  • Owner changed from roozbeh to shervin

comment:19 Changed 3 years ago by shervin

  • Milestone changed from 28 to 29

comment:20 Changed 3 years ago by srl

  • Status changed from assigned to accepted

comment:21 Changed 3 years ago by emmons

  • Milestone changed from 29 to upcoming

Auto move of all 29 -> upcoming

