NamesList.txt as data source

Marcel Schneider charupdate at
Sat Mar 12 18:29:02 CST 2016

On Thu, 10 Mar 2016 15:14:09 -0700, Doug Ewell wrote:

> Ken Whistler wrote:
> > NamesList.txt should *not* be data mined.
> And yet it was the only Unicode data file utilized by MSKLC.
> There are many possible reasons for this approach, which we will
> probably never know.

Sadly, it is too late to ask Michael Kaplan. To offer one more answer in his place: I never doubted that NamesList.txt was the best choice for MSKLC, which parses the file for code points and character names to generate human-readable display and output, as described by Asmus Freytag on Thu, 10 Mar 2016 18:13:21 -0800. The same could have been achieved by parsing UnicodeData.txt. However, the main difference between using NamesList and UnicodeData in MSKLC, as I see it, is the cultural benefit for the end user.
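To make the comparison concrete, here is a minimal sketch of extracting code points and character names from either file. It is not MSKLC's actual code, which I have not seen. The function names and the sample lines are my own, hypothetical choices, and the sketch assumes only the documented layouts: a character entry in NamesList.txt starts at column 0 with a hexadecimal code point, a tab, and the name, while UnicodeData.txt holds semicolon-separated fields with the code point first and the name second.

```python
import re

# A NamesList character entry: 4 to 6 uppercase hex digits at column 0,
# a tab, then the character name. Annotation lines start with a tab and
# header lines with "@" or ";", so they do not match.
NAME_LINE = re.compile(r"^([0-9A-F]{4,6})\t(.+)$")


def parse_nameslist(lines):
    """Yield (code_point, name) pairs from NamesList-style lines."""
    for line in lines:
        match = NAME_LINE.match(line.rstrip("\r\n"))
        if match:
            yield int(match.group(1), 16), match.group(2)


def parse_unicodedata(lines):
    """Yield (code_point, name) pairs from UnicodeData-style lines."""
    for line in lines:
        fields = line.rstrip("\r\n").split(";")
        if len(fields) >= 2 and fields[0]:
            yield int(fields[0], 16), fields[1]


# Hypothetical sample lines in the shape of each file.
sample_nameslist = [
    "@@\t0000\tC0 Controls and Basic Latin (Basic Latin)\t007F\n",
    "0041\tLATIN CAPITAL LETTER A\n",
    "00E9\tLATIN SMALL LETTER E WITH ACUTE\n",
    "\t: 0065 0301\n",
]
sample_unicodedata = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;\n",
]

nameslist_pairs = list(parse_nameslist(sample_nameslist))
unicodedata_pairs = list(parse_unicodedata(sample_unicodedata))
```

Both parsers yield the same kind of pair, which is the sense in which either file would do for the tool. What only the Names List offers is the surrounding annotation lines, which this sketch deliberately skips.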

Consistent with this, the Names List is shipped in the root directory of MSKLC, beside a copy of the EULA, and then copied to a safe location at %User%\AppData\Local\MSKLC (where I recently updated it to some 8.0.0 version of its French translation), so that the user can view it, and even alter it, without disturbing the tool. It is a sort of pocket version of the Code Chartsʼ textual information, and thus likely to satisfy both the (human) keyboard editor and the creator (software).

Extrapolating from my own case, I believe that the more than 2 million downloads of MSKLC [1] have contributed to some extent to spreading knowledge about Unicode, and to giving people the desire to learn more. There is indeed more to learn: as Ken Whistler warned on Thu, 10 Mar 2016 13:40:47 -0800, and as the Code Charts Disclaimer clearly states, the charts «do not provide all the information needed to fully support individual scripts using the Unicode Standard.»

Nor could they. On Thu, 10 Mar 2016 18:13:21 -0800, Asmus Freytag wrote:

> The goal getting a complete and machine-readable description of
> character behavior is illusory.


[1] Kaplan, M. S. (2013, October 4). The story of MSKLC | Sorting it all Out, v2! Retrieved August 18, 2015, from
