
indexing and pattern matching suggestions



Absent an implementation of a query language in Ozone, does anyone have
pointers or suggestions for practical ways to speed up simple pattern
matching in large groups of similar objects?

In my case, I'm looking at string pattern matching, the mechanics of which
are handled just fine by regexp implementations. What I'm trying to do is
figure out -- or borrow -- an efficient way of indexing objects by string
value such that for any given simple regexp, you aren't scanning the whole
group of objects, but can immediately narrow down to a subgroup.

A halfway hack to this end would be (say) to break up your collection of
strings into subgroups by the first three (or two or four or whatever)
characters. However, this doesn't help you when matching something like
"*pattern*" as opposed to "pattern*", since a pattern that starts with a
wildcard has no fixed prefix to look up.
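
To make the hack concrete, here is a minimal sketch in Python, chosen only
for illustration. It assumes glob-style patterns where "*" matches any run
of characters and "?" matches one character. All of the names (build_index,
candidates, search, PREFIX_LEN) are made up for this example and are not
part of Ozone or any other library.

```python
import re
from collections import defaultdict

PREFIX_LEN = 3  # assumed bucket width; two or four would work the same way


def build_index(strings, prefix_len=PREFIX_LEN):
    """Group strings into buckets keyed by their first prefix_len characters."""
    buckets = defaultdict(list)
    for s in strings:
        buckets[s[:prefix_len]].append(s)
    return buckets


def literal_prefix(pattern):
    """Return the literal characters that precede the first wildcard."""
    out = []
    for ch in pattern:
        if ch in "*?":
            break
        out.append(ch)
    return "".join(out)


def glob_to_regex(pattern):
    """Translate the simple glob syntax assumed here into a compiled regexp."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def candidates(buckets, pattern, prefix_len=PREFIX_LEN):
    """Return the strings that still have to be tested against the pattern."""
    prefix = literal_prefix(pattern)
    if len(prefix) >= prefix_len:
        # Anchored pattern with a long enough prefix: one bucket suffices.
        return list(buckets.get(prefix[:prefix_len], []))
    # No usable prefix (e.g. "*pattern*"): fall back to scanning everything.
    return [s for group in buckets.values() for s in group]


def search(buckets, pattern, prefix_len=PREFIX_LEN):
    """Run the regexp only over the candidate subgroup."""
    rx = glob_to_regex(pattern)
    return [s for s in candidates(buckets, pattern, prefix_len) if rx.fullmatch(s)]
```

With this in place, search(index, "pattern*") only examines the "pat"
bucket, while search(index, "*pattern*") still touches every string, which
is exactly the limitation described above.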

So far my searching hasn't turned up any useful code...

Reason
http://www.exratio.com/