Re: Proposal for matching negated sets (was Re: New Public Review Issue: Proposed Update UTS #18)

From: Mark Davis ([email protected])
Date: Fri Oct 05 2007 - 13:06:21 CDT

Next message: Michael Maxwell: "RE: New Public Review Issue: Proposed Update UTS #18"

Previous message: Andy Heninger: "Re: Proposal for matching negated sets (was Re: New Public Review Issue: Proposed Update UTS #18)"
In reply to: Andy Heninger: "Re: Proposal for matching negated sets (was Re: New Public Review Issue: Proposed Update UTS #18)"
Next in thread: Andy Heninger: "Re: Proposal for matching negated sets (was Re: New Public Review Issue: Proposed Update UTS #18)"
Reply: Andy Heninger: "Re: Proposal for matching negated sets (was Re: New Public Review Issue: Proposed Update UTS #18)"

Going back to my doc, and tweaking it, I think what we are saying is that

[a-z \q{ch} \q{rr}]

is equivalent to

( ch | rr | [a-z] )

in matching, so however the latter works, the former should work. I think
complement is the only really tricky problem.
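
As a rough illustration of this equivalence, here is a minimal sketch in Python. Python's `re` module has no `\q{...}` syntax, so the set is written by hand as the alternation above; the names `SET_AS_ALTERNATION` and `tokens` are made up for the example. It also shows one consequence of "however the latter works": in a backtracking engine the alternation is ordered, and a shorter alternative is retried when the longer one does not lead to an overall match.

```python
import re

# Hand-written expansion of [a-z \q{ch} \q{rr}]; an assumption for this
# sketch, since Python's re does not support \q{...} inside sets.
SET_AS_ALTERNATION = re.compile(r"(?:ch|rr|[a-z])")


def tokens(text):
    """Return the successive units matched by the alternation, left to right."""
    return SET_AS_ALTERNATION.findall(text)


# "ch" and "rr" are consumed as single units where they occur.
print(tokens("chorro"))

# Backtracking: "ch" is tried first, the trailing "h" then fails, so the
# engine falls back to [a-z] matching "c" and the overall match succeeds.
print(re.fullmatch(r"(?:ch|rr|[a-z])h", "ch") is not None)
```

Whether a set with strings should be allowed to fall back like this is exactly the open question in the quoted discussion below.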

Mark

On 10/5/07, Andy Heninger <[email protected]> wrote:
>
>
>
> On 10/4/07, Mike <[email protected]> wrote:
> >
> > > With strings in sets at all, separately from the question of how to do
> > > set negation, I'm not sure how matching should work. Which choice is
> > > selected if more than one is possible? Should backtracking try
> > > additional choices if the first one doesn't lead to an overall match?
> > > If sets don't have an implied ordering, do we need to require a POSIX
> > > style longest match, which could be slow?
> >
> > In a set, I keep track of the strings by their length, so the longest
> > match is always found. I don't think you want to backtrack and try a
> > shorter string since the longer match is supposed to be treated as a
> > unit....
> >
> > > Should the set [^xyz\q{ch}] match the 'c' in "ch" ?
> >
> > I don't think so; since the \q{ch} matches "ch", the negated set does
> > not match at the first position.
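
The behavior Mike describes could be sketched roughly as follows. This is not his implementation, only a guess at its shape: the function name `match_set`, its parameters, and the choice that a negated set consumes exactly one character are all assumptions made for the illustration.

```python
def match_set(chars, strings, text, pos, negated=False):
    """Return the end index of a set match starting at pos, or None.

    Strings are tried longest first; once one matches, shorter choices
    are never retried (no backtracking into the set).
    """
    if pos >= len(text):
        return None

    hit = None
    for s in sorted(strings, key=len, reverse=True):
        if text.startswith(s, pos):
            hit = pos + len(s)
            break
    if hit is None and text[pos] in chars:
        hit = pos + 1

    if negated:
        # Assumption: the negated set matches a single character, and only
        # where the positive set matches nothing at this position.
        return pos + 1 if hit is None else None
    return hit


# [^xyz\q{ch}] does not match the 'c' in "ch" ...
print(match_set(set("xyz"), ["ch"], "ch", 0, negated=True))
# ... but it does match the 'c' in "ca".
print(match_set(set("xyz"), ["ch"], "ca", 0, negated=True))
```
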
>
>
> The choices you have made seem reasonable to me.
>
> But what would a DFA-based (non-backtracking) implementation do? It would
> be very difficult for it not to take a shorter string from a set if that
> led to an overall longer match. Would it be OK, and still useful, if the
> UTS left this behavior unspecified?
>
> -- Andy
>
>
> > > I'm half inclined to move strings, or literal clusters, into section 3,
> > > then move the entire section 3 of UTS-18 into a separate document for
> > > interesting, but not fully worked out, ideas.
> >
> > This seems like a good idea.
> >
> > Mike
> >
>
>

-- 
Mark


This archive was generated by hypermail 2.1.5 : Fri Oct 05 2007 - 13:07:53 CDT