Re: Internationalised Computer Science Exercises from Richard Wordingham via Unicode on 2018-02-05 (Unicode Mail List Archive)

From: Richard Wordingham via Unicode <unicode_at_unicode.org>
Date: Mon, 5 Feb 2018 21:37:30 +0000

On Thu, 1 Feb 2018 19:20:04 +0000
Richard Wordingham via Unicode <unicode_at_unicode.org> wrote:

> A regular trace expression of the form
>
> [:ccc=1:][:ccc=2:]…[:ccc=n:]
>
> seems to require 2^n states in your scheme. As I effectively only
> apply the regex to NFD input strings, I use fewer states. However,
> the efficiency of my scheme depends on the order of the commuting
> factors - reverse order would require the 2^n states.

I've overstated the compactness of my scheme. Firstly, I split the
state for an optionally final matched character into two states
according to whether it is to be the final character or not.
Secondly, the DFA for a Unicode character is quite large. I've
kept it simple and identify most states by the matched Unicode
character, which means I have nearly a million states, whereas I could
probably whittle it down to more like a thousand or so, at a vast
increase in complexity.

Richard.
Received on Mon Feb 05 2018 - 15:38:13 CST

This archive was generated by hypermail 2.2.0 : Mon Feb 05 2018 - 15:38:14 CST