I understand regular expressions, but can someone please explain this:
re = %r/((?<pg>\((?:\\[()]|[^()]|\g<pg>)*\)))/
By the way, this only works with the Oniguruma engine (Ruby 1.9).
So, now that there is the capability to match balanced parens and so
forth, does this mean that the new regular expression engine can be
used to construct simple parsers (matching language constructs)?
%r/ … /
– regexp delimiter (why they didn’t just use / … /, I don’t know)
( … )
– non-capturing group (normally it would be capturing, but see the Oniguruma manual, part 10, case 3)
– seems rather useless, given that the only contained item is a
capturing group
(?<pg> … )
– capturing named group (see the Oniguruma manual, part 7)
\( … \)
– literal parentheses surrounding pattern
(?: … | … | … )*
– non-capturing group of 3 alternatives, repeated 0 or more times
\\[()]
– escaped literal parens
[^()]
– anything except parens
\g<pg>
– match the pg-named pattern here (recursive sub-exp; see the Oniguruma manual, part 9)
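For anyone who wants to try the breakdown in irb, here is a minimal sketch of the same pattern written in extended (x) mode. The constant name PAREN_RE and the helper first_balanced are made up for illustration; only the group name pg comes from the original expression.

```ruby
# Verbose form of the pattern discussed in this thread.
# PAREN_RE and first_balanced are hypothetical names, not from the thread.
PAREN_RE = %r{
  (                 # outer group, left uncaptured because a named group is present
    (?<pg>          # named group pg
      \(            # literal opening paren
      (?:
        \\[()]      # a backslash followed by a paren
        | [^()]     # any single character that is not a paren
        | \g<pg>    # recursive call into the group named pg
      )*            # zero or more of the three alternatives
      \)            # literal closing paren
    )
  )
}x

# String#[] with a Regexp returns the first match, or nil if there is none.
def first_balanced(text)
  text[PAREN_RE]
end

puts first_balanced("foo(bar(baz)qux)tail")  # prints (bar(baz)qux)
p first_balanced("no parens here")           # prints nil
```

The search is unanchored, so on unbalanced input it simply returns the first balanced group it can find further along in the string.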
Thanks, I understand nearly everything now. It really shows the power
of the Oniguruma engine for regular expressions. By the way, the trailing
comma in the link caused a Japanese site to come up. For people’s reference, the manual
for Oniguruma is at:
Ok, so there are 3 alternatives in the non-capturing group:
An open or close parenthesis
Correction: as Eric said above, an escaped (read: with a leading
backslash) parenthesis.
Any character except a paren
Yup.
A pattern that starts with an open paren
AND ends in a close paren, and contains only non-parens, escaped
parens, and balanced pairs of parens.
Am I the only one that finds this strange?
Doubtful :). You may be one of the ones to which this is new, though.
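To see the three alternatives working together, here is a small sketch that anchors the pattern so it has to cover the whole string. The \A and \z anchors, the constant BALANCED, and the method balanced? are my additions for illustration and are not part of the original expression.

```ruby
# Anchored variant: the whole string must be one balanced group.
# BALANCED and balanced? are hypothetical names; the anchors are an assumption.
BALANCED = /\A(?<pg>\((?:\\[()]|[^()]|\g<pg>)*\))\z/

def balanced?(text)
  BALANCED.match?(text)
end

p balanced?("(a(b)c)")    # true: the inner (b) is matched by the recursive call
p balanced?("(a(b)c")     # false: the outer paren is never closed
p balanced?('(a \( b)')   # true: the escaped paren needs no partner
```

The third case is the escaped-paren alternative at work: the backslash and the paren after it are consumed as a pair, so that paren is not counted as an opener.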
I find it strange that it only recognizes parenthesis escapes, and not
escaped backslashes. So you can do something like:
( \( )
and match correctly, but there’s no way to do a balanced pair of
parentheses containing just a backslash:
(\) – no
(\\) – no
(\\\) – no
(\ ) – matches, but has an extra space.
I would have replaced ‘\\[()]’ by ‘\\[()\\]’ so that ‘(\\)’ would
match.
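A quick way to compare the two character classes is to run both on an input where they tokenize differently. This is only a sketch: the names BS, ORIGINAL, VARIANT, sample and lone are made up for illustration, and the strings are built from a single-backslash constant to keep the Ruby string escaping out of the way.

```ruby
BS = "\\"  # one backslash character

# The expression from the thread, and the same expression with the
# first alternative widened to also accept an escaped backslash.
ORIGINAL = %r/((?<pg>\((?:\\[()]|[^()]|\g<pg>)*\)))/
VARIANT  = %r/((?<pg>\((?:\\[()\\]|[^()]|\g<pg>)*\)))/

sample = "(" + BS + BS + "))"   # the five characters ( \ \ ) )

puts sample[ORIGINAL]  # prints (\\))  the second backslash escapes a paren
puts sample[VARIANT]   # prints (\\)   the two backslashes pair up first

lone = "(" + BS + ")"           # the three characters ( \ )
p lone[ORIGINAL] == lone        # true
```

The last line is worth a look: because [^()] also accepts a backslash, the engine can backtrack out of the escaped-paren alternative and still close the group, so the cases listed above are best confirmed in irb rather than read off the alternatives.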