author Rasmus Villemoes <[email protected]> 2025-05-13 10:40:26 +0200
committer Tom Rini <[email protected]> 2025-05-29 08:25:18 -0600
commit 19b3e24083eb0b1b5299e689d0bc5f1a6c4ebdcd
tree 2625148ba823179a0ef72fb545cfce8e6f9c83d5 /include
parent ced883d92c0568cdb15b5b67106c29a4623b19d8
slre: drop wrong "anchored" optimization
The regex '^a|b' means "does the string start with a, or does it have a b anywhere", not "does the string start with a or b" (the latter should be spelled '^[ab]' or '^(a|b)'). It should match exactly the same strings as 'b|^a'. But the current implementation hard-codes an assumption that when the regex starts with a ^, the whole regex must match from the beginning, i.e. it only attempts at offset 0. It really should be completely symmetrical to 'b|c$' ("does it have a b anywhere or end with c?"), which is treated correctly.

Another quirk is that currently the regex 'x*$', which should match all strings (because it just means "does the string end with 0 or more x'es"), does not, because in the unanchored case we never attempt to match at ofs==len. In the anchored case, '^x*$', this works correctly and matches exactly strings (including the empty string) consisting entirely of x'es.

Fix both of these issues by dropping all use of the slre->anchored member and always test at all possible offsets. If the regex does have a ^ somewhere (including after a | branch character), that is correctly handled by the match engine by only matching when *ofs is 0.

Reviewed-by: Simon Glass <[email protected]>
Signed-off-by: Rasmus Villemoes <[email protected]>
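
The semantics described above can be illustrated with a minimal sketch. This is a hypothetical toy matcher, not the SLRE engine: the names toy_match() and match_here() are invented for this example, and it only understands literal characters, '^', a trailing '$', 'c*' and top-level '|'. It assumes, as the message describes, that every offset from 0 through len inclusive is tried and that '^' is handled inside the matcher by succeeding only at offset 0.

```c
#include <string.h>

/*
 * Hypothetical toy matcher (not the SLRE code). Tries to match the
 * single alternative re[0..relen) against s, starting at offset ofs.
 */
static int match_here(const char *re, size_t relen,
		      const char *s, size_t len, size_t ofs)
{
	if (relen == 0)
		return 1;

	/* '^' is an ordinary item that only succeeds at offset 0. */
	if (re[0] == '^')
		return ofs == 0 && match_here(re + 1, relen - 1, s, len, ofs);

	/* A trailing '$' only succeeds at the end of the string. */
	if (re[0] == '$' && relen == 1)
		return ofs == len;

	/* 'c*': zero or more occurrences of the literal c. */
	if (relen >= 2 && re[1] == '*') {
		size_t i = ofs;

		for (;;) {
			if (match_here(re + 2, relen - 2, s, len, i))
				return 1;
			if (i < len && s[i] == re[0])
				i++;
			else
				return 0;
		}
	}

	/* Plain literal character. */
	if (ofs < len && s[ofs] == re[0])
		return match_here(re + 1, relen - 1, s, len, ofs + 1);

	return 0;
}

/* Returns 1 if any '|' alternative of re matches somewhere in s. */
int toy_match(const char *re, const char *s)
{
	size_t len = strlen(s);

	for (;;) {
		const char *bar = strchr(re, '|');
		size_t relen = bar ? (size_t)(bar - re) : strlen(re);
		size_t ofs;

		/* Note "<= len": offset len must be tried too, for 'x*$'. */
		for (ofs = 0; ofs <= len; ofs++)
			if (match_here(re, relen, s, len, ofs))
				return 1;

		if (!bar)
			return 0;
		re = bar + 1;
	}
}
```

With this sketch, toy_match("^a|b", "xb") and toy_match("b|^a", "xb") both succeed, toy_match("^a|b", "xa") fails, and toy_match("x*$", "abc") succeeds only because offset 3 (ofs==len) is attempted.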
Diffstat (limited to 'include')
-rw-r--r-- include/slre.h | 1
1 files changed, 0 insertions, 1 deletions
diff --git a/include/slre.h b/include/slre.h
index 4b41a4b276f..af5b1302d9c 100644
--- a/include/slre.h
+++ b/include/slre.h
@@ -63,7 +63,6 @@ struct slre {
 	int code_size;
 	int data_size;
 	int num_caps; /* Number of bracket pairs */
-	int anchored; /* Must match from string start */
 	const char *err_str; /* Error string */
 };