From: Sam Mason (sam@samason.me.uk)
Date: Sun Apr 26 2009 - 15:18:57 CDT
On Sun, Apr 26, 2009 at 10:57:49AM -0700, Mark Davis wrote:
> I'd disagree about that. It is certainly simpler to always process as code
> points, but where performance is required, you need to design your
> algorithms with the encoding of the core string representation in mind,
> typically UTF-8 or UTF-16. You can get huge speedups in that way.
Are there any pointers to literature about that? I'd be interested
to see how this sort of scheme would hang together; there would seem
to be quite a trade-off between instruction cache pressure, branch
prediction and most probably other effects I can't think of at the
moment. Correctness would seem to have suddenly got much harder to
demonstrate, so this sort of thing would only be reasonable for very
specialised libraries, which does seem to be what the OP was about.
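
To make the question concrete, here is a minimal sketch of the kind of
encoding-aware shortcut I take the quoted text to mean. It is only an
illustration, not taken from Mark's message or from any particular
library, and the function names are hypothetical. It counts code points
in a UTF-8 buffer two ways: by decoding first, and by working directly on
the bytes and skipping continuation bytes (those of the form 10xxxxxx).
The byte-level version assumes the input is already well-formed UTF-8,
which is exactly where the correctness argument gets harder.

```python
def count_code_points_decoding(data: bytes) -> int:
    """Reference approach: decode to code points first, then count them."""
    return len(data.decode("utf-8"))


def count_code_points_utf8(data: bytes) -> int:
    """Encoding-aware approach: count bytes that start a code point.

    Every code point in UTF-8 has exactly one lead byte; all remaining
    bytes are continuation bytes matching the bit pattern 10xxxxxx.
    ASSUMPTION: `data` is well-formed UTF-8. Malformed input is not
    detected here and would give a meaningless count.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


if __name__ == "__main__":
    # Mixed 1-, 2- and 4-byte sequences.
    sample = "naïve 𝄞 text".encode("utf-8")
    assert count_code_points_decoding(sample) == count_code_points_utf8(sample)
```

In Python the two versions will not show a meaningful speed difference;
the point is only the shape of the algorithm. In a compiled language the
second form needs no decoding state and no per-sequence branching, which
is presumably where a speedup would come from.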
-- Sam http://samason.me.uk/