<feed xmlns='http://www.w3.org/2005/Atom'>
<title>u-boot.git/lib/slre.c, branch next</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/'/>
<entry>
<title>slre: implement support for ranges in character classes</title>
<updated>2025-05-29T14:25:18+00:00</updated>
<author>
<name>Rasmus Villemoes</name>
<email>ravi@prevas.dk</email>
</author>
<published>2025-05-13T08:40:32+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/commit/?id=fe4f21185050ef3f2cc847ba04701d1e714522e3'/>
<id>fe4f21185050ef3f2cc847ba04701d1e714522e3</id>
<content type='text'>
When trying to use U-Boot's regex facility, it is a rather large
gotcha that [a-z] range syntax is not supported. It doesn't require a
lot of extra code to implement that; we just let the regular parsing
emit the start and end literal symbols as usual, and add a new
"escape" code RANGE.

At match time, this means the code will first just see an 'a' and try
to match that, and only then recognize that it's actually part of a
range and then do the 'a' &lt;= ch &lt;= 'z' test.

Of course, this means that a - in the middle of a [] pair no longer
matches a literal dash, but I highly doubt anybody relies on
that. Putting it first or last, or escaping it with \, as in most
other RE engines, continues to work.

Reviewed-by: Simon Glass &lt;sjg@chromium.org&gt;
Signed-off-by: Rasmus Villemoes &lt;ravi@prevas.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When trying to use U-Boot's regex facility, it is a rather large
gotcha that [a-z] range syntax is not supported. It doesn't require a
lot of extra code to implement that; we just let the regular parsing
emit the start and end literal symbols as usual, and add a new
"escape" code RANGE.

At match time, this means the code will first just see an 'a' and try
to match that, and only then recognize that it's actually part of a
range and then do the 'a' &lt;= ch &lt;= 'z' test.

Of course, this means that a - in the middle of a [] pair no longer
matches a literal dash, but I highly doubt anybody relies on
that. Putting it first or last, or escaping it with \, as in most
other RE engines, continues to work.

Reviewed-by: Simon Glass &lt;sjg@chromium.org&gt;
Signed-off-by: Rasmus Villemoes &lt;ravi@prevas.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>slre: fix matching of escape sequence used inside character class</title>
<updated>2025-05-29T14:25:18+00:00</updated>
<author>
<name>Rasmus Villemoes</name>
<email>ravi@prevas.dk</email>
</author>
<published>2025-05-13T08:40:30+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/commit/?id=5d3f91d6a82386e94c3178721641608563560871'/>
<id>5d3f91d6a82386e94c3178721641608563560871</id>
<content type='text'>
At the compile stage, the anyof() function clearly intends to handle escape
sequences like \d (for digits) inside square brackets, since the logic
emits a 0 byte followed by the code representing the character
class (NONSPACE, SPACE or DIGIT).

However, this is not handled in the corresponding match helper
is_any_of(); it just naively loops over all the bytes in the -&gt;data
array emitted by anyof() and compares those directly to the current
character. For example, this means that the string "\x11" (containing
the single character with value 17) is matched by the regex "[#%\d]",
because DIGIT happens to be 17.

Fix that by recognizing a zero byte as indicating something special
and act accordingly. In order not to repeat the "increment *ofs and
return 1" in all places, put those two lines after a new match: label.

Reviewed-by: Simon Glass &lt;sjg@chromium.org&gt;
Signed-off-by: Rasmus Villemoes &lt;ravi@prevas.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
At the compile stage, the anyof() function clearly intends to handle escape
sequences like \d (for digits) inside square brackets, since the logic
emits a 0 byte followed by the code representing the character
class (NONSPACE, SPACE or DIGIT).

However, this is not handled in the corresponding match helper
is_any_of(); it just naively loops over all the bytes in the -&gt;data
array emitted by anyof() and compares those directly to the current
character. For example, this means that the string "\x11" (containing
the single character with value 17) is matched by the regex "[#%\d]",
because DIGIT happens to be 17.

Fix that by recognizing a zero byte as indicating something special
and act accordingly. In order not to repeat the "increment *ofs and
return 1" in all places, put those two lines after a new match: label.

Reviewed-by: Simon Glass &lt;sjg@chromium.org&gt;
Signed-off-by: Rasmus Villemoes &lt;ravi@prevas.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>slre: refactor is_any_but()</title>
<updated>2025-05-29T14:25:18+00:00</updated>
<author>
<name>Rasmus Villemoes</name>
<email>ravi@prevas.dk</email>
</author>
<published>2025-05-13T08:40:29+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/commit/?id=457f19815c8f8b74c36d8cbef454a0b575ea7068'/>
<id>457f19815c8f8b74c36d8cbef454a0b575ea7068</id>
<content type='text'>
As preparation for fixing the handling of backslash-escapes used
inside a character class, refactor is_any_but() to be defined in terms
of is_any_of() so we don't have to repeat the same logic in two places.

Reviewed-by: Simon Glass &lt;sjg@chromium.org&gt;
Signed-off-by: Rasmus Villemoes &lt;ravi@prevas.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As preparation for fixing the handling of backslash-escapes used
inside a character class, refactor is_any_but() to be defined in terms
of is_any_of() so we don't have to repeat the same logic in two places.

Reviewed-by: Simon Glass &lt;sjg@chromium.org&gt;
Signed-off-by: Rasmus Villemoes &lt;ravi@prevas.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>slre: drop wrong "anchored" optimization</title>
<updated>2025-05-29T14:25:18+00:00</updated>
<author>
<name>Rasmus Villemoes</name>
<email>ravi@prevas.dk</email>
</author>
<published>2025-05-13T08:40:26+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/commit/?id=19b3e24083eb0b1b5299e689d0bc5f1a6c4ebdcd'/>
<id>19b3e24083eb0b1b5299e689d0bc5f1a6c4ebdcd</id>
<content type='text'>
The regex '^a|b' means "does the string start with a, or does it have
a b anywhere", not "does the string start with a or b" (the latter
should be spelled '^[ab]' or '^(a|b)'). It should match exactly the
same strings as 'b|^a'. But the current implementation hard-codes an
assumption that when the regex starts with a ^, the whole regex must
match from the beginning, i.e. it only attempts at offset 0.

It really should be completely symmetrical to 'b|c$' ("does it have a
b anywhere or end with c?"), which is treated correctly.

Another quirk is that currently the regex 'x*$', which should match
all strings (because it just means "does the string end
with 0 or more x'es"), does not, because in the unanchored case we
never attempt to match at ofs==len. In the anchored case, '^x*$', this
works correctly and matches exactly strings (including the empty
string) consisting entirely of x'es.

Fix both of these issues by dropping all use of the slre-&gt;anchored
member and always test at all possible offsets. If the regex does have
a ^ somewhere (including after a | branch character), that is
correctly handled by the match engine by only matching when *ofs is 0.

Reviewed-by: Simon Glass &lt;sjg@chromium.org&gt;
Signed-off-by: Rasmus Villemoes &lt;ravi@prevas.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The regex '^a|b' means "does the string start with a, or does it have
a b anywhere", not "does the string start with a or b" (the latter
should be spelled '^[ab]' or '^(a|b)'). It should match exactly the
same strings as 'b|^a'. But the current implementation hard-codes an
assumption that when the regex starts with a ^, the whole regex must
match from the beginning, i.e. it only attempts at offset 0.

It really should be completely symmetrical to 'b|c$' ("does it have a
b anywhere or end with c?"), which is treated correctly.

Another quirk is that currently the regex 'x*$', which should match
all strings (because it just means "does the string end
with 0 or more x'es"), does not, because in the unanchored case we
never attempt to match at ofs==len. In the anchored case, '^x*$', this
works correctly and matches exactly strings (including the empty
string) consisting entirely of x'es.

Fix both of these issues by dropping all use of the slre-&gt;anchored
member and always test at all possible offsets. If the regex does have
a ^ somewhere (including after a | branch character), that is
correctly handled by the match engine by only matching when *ofs is 0.

Reviewed-by: Simon Glass &lt;sjg@chromium.org&gt;
Signed-off-by: Rasmus Villemoes &lt;ravi@prevas.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib: Remove &lt;common.h&gt; inclusion from these files</title>
<updated>2023-12-21T13:54:37+00:00</updated>
<author>
<name>Tom Rini</name>
<email>trini@konsulko.com</email>
</author>
<published>2023-12-14T18:16:58+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/commit/?id=467382ca03758e4f3f13107e3a83669e93a7461e'/>
<id>467382ca03758e4f3f13107e3a83669e93a7461e</id>
<content type='text'>
After some header file cleanups to add missing include files, remove
common.h from all files in the lib directory. This primarily means just
dropping the line but in a few cases we need to add in other header
files now.

Reviewed-by: Simon Glass &lt;sjg@chromium.org&gt;
Signed-off-by: Tom Rini &lt;trini@konsulko.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After some header file cleanups to add missing include files, remove
common.h from all files in the lib directory. This primarily means just
dropping the line but in a few cases we need to add in other header
files now.

Reviewed-by: Simon Glass &lt;sjg@chromium.org&gt;
Signed-off-by: Tom Rini &lt;trini@konsulko.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib/slre: Fix memory leak if regex compilation fails</title>
<updated>2023-11-29T14:32:15+00:00</updated>
<author>
<name>Francois Berder</name>
<email>fberder@outlook.fr</email>
</author>
<published>2023-11-12T19:16:50+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/commit/?id=891b178c579c0084c1759cf0319e7151a7e288a5'/>
<id>891b178c579c0084c1759cf0319e7151a7e288a5</id>
<content type='text'>
Signed-off-by: Francois Berder &lt;fberder@outlook.fr&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Francois Berder &lt;fberder@outlook.fr&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>common: Drop log.h from common header</title>
<updated>2020-05-19T01:19:18+00:00</updated>
<author>
<name>Simon Glass</name>
<email>sjg@chromium.org</email>
</author>
<published>2020-05-10T17:40:05+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/commit/?id=f7ae49fc4f363a803dab3be078e93ead8e75a8e9'/>
<id>f7ae49fc4f363a803dab3be078e93ead8e75a8e9</id>
<content type='text'>
Move this header out of the common header.

Signed-off-by: Simon Glass &lt;sjg@chromium.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move this header out of the common header.

Signed-off-by: Simon Glass &lt;sjg@chromium.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib/slre: remove superfluous assignment</title>
<updated>2018-09-05T20:02:34+00:00</updated>
<author>
<name>Heinrich Schuchardt</name>
<email>xypron.glpk@gmx.de</email>
</author>
<published>2018-09-03T03:17:20+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/commit/?id=f2906e5f586d4ff238f5370b77717279bf5b0639'/>
<id>f2906e5f586d4ff238f5370b77717279bf5b0639</id>
<content type='text'>
It makes no sense to assign a value to 'res' if the next use of the
variable is an assignment.

Signed-off-by: Heinrich Schuchardt &lt;xypron.glpk@gmx.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It makes no sense to assign a value to 'res' if the next use of the
variable is an assignment.

Signed-off-by: Heinrich Schuchardt &lt;xypron.glpk@gmx.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>lib/slre: remove superfluous assignment</title>
<updated>2017-05-12T12:37:18+00:00</updated>
<author>
<name>xypron.glpk@gmx.de</name>
<email>xypron.glpk@gmx.de</email>
</author>
<published>2017-05-08T19:13:43+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/commit/?id=b8865e6bf0ed519e6a8b9634a776092a2812af9e'/>
<id>b8865e6bf0ed519e6a8b9634a776092a2812af9e</id>
<content type='text'>
The value assigned to saved_offset is never used.

The problem was indicated by clang scan-build.

Signed-off-by: Heinrich Schuchardt &lt;xypron.glpk@gmx.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The value assigned to saved_offset is never used.

The problem was indicated by clang scan-build.

Signed-off-by: Heinrich Schuchardt &lt;xypron.glpk@gmx.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Coding Style cleanup: remove trailing white space</title>
<updated>2013-10-14T20:06:53+00:00</updated>
<author>
<name>Wolfgang Denk</name>
<email>wd@denx.de</email>
</author>
<published>2013-10-07T11:07:26+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.235523.xyz/u-boot.git/commit/?id=3765b3e7bd0f8e46914d417f29cbcb0c72b1acf7'/>
<id>3765b3e7bd0f8e46914d417f29cbcb0c72b1acf7</id>
<content type='text'>
Signed-off-by: Wolfgang Denk &lt;wd@denx.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Wolfgang Denk &lt;wd@denx.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
