Divide and conquer

gazoduc · June 18, 2009, 2:02pm

Hi list (and Jason)!

I have a prototype parser that uses a Regexp-based "Divide and
Conquer" pattern to parse Textile:

http://github.com/gaspard/redcloth-regexp/tree/master

This parser currently handles only simple 'list', 'strong', 'em' and
'bold', but it is very easy to extend and adapt.

To give you an idea of how this thing works:

1. Take a string.
2. Try to match the first regular expression from the current context
   (if you are in :main and :main => [:p, :bold], the first regexp is
   defined by => …).
3. If the pattern matches, insert a placeholder and scan the matched
   text in the new context (:p).
4. When you cannot match (no more regexps in the context list), unfold
   by expanding the text into an S-expression tree.

Example:

"hello _em and *strong*_"

Match the regular expression associated with :em:
=> "hello @@=9347=@@"

Scan the matched content in the :em context:
=> "em and *strong*" matches :strong
=> "em and @@=9350=@@"

No match in "strong".

Expand in the :strong context ==> [:strong, "strong"]
Expand in the :em context ==> [:em, "em and ", [:strong, "strong"]]
Expand in :main ==> [:main, "hello ", [:em, "em and ", [:strong, "strong"]]]
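
For anyone who prefers code, here is a minimal sketch of the steps and the example above. It is not the code from the repository: the CONTEXTS and PATTERNS tables, the parse method and the sequential placeholder ids are all made up for illustration, and the two regexps are deliberately naive stand-ins for Textile _em_ and *strong*.

```ruby
# Hypothetical sketch of the divide and conquer idea; names and regexps
# are assumptions, not taken from redcloth-regexp.

# For each context, the child contexts to try, in order.
CONTEXTS = {
  main:   [:em, :strong],
  em:     [:strong],
  strong: []
}.freeze

# Naive patterns: capture group 1 is the text between the markers.
PATTERNS = {
  em:     /_(.+?)_/,
  strong: /\*(.+?)\*/
}.freeze

PLACEHOLDER = /@@=(\d+)=@@/

# Parse `text` in `context` and return an S-expression (nested arrays).
# `store` maps placeholder ids to subtrees and is shared by all levels.
def parse(text, context = :main, store = {})
  work = text.dup

  # Divide: replace each match by a placeholder and scan the matched
  # content in the new context.
  CONTEXTS.fetch(context).each do |child|
    work = work.gsub(PATTERNS.fetch(child)) do
      inner = Regexp.last_match(1)
      tree  = parse(inner, child, store)
      id    = store.size          # taken after recursion, so ids stay unique
      store[id] = tree
      "@@=#{id}=@@"
    end
  end

  # Conquer: nothing left to match in this context, so unfold the
  # placeholders into the S-expression.
  sexp = [context]
  rest = work
  while (m = PLACEHOLDER.match(rest))
    sexp << m.pre_match unless m.pre_match.empty?
    sexp << store.fetch(m[1].to_i)
    rest = m.post_match
  end
  sexp << rest unless rest.empty?
  sexp
end

p parse("hello _em and *strong*_")
```

With these assumed tables, the call on the last line yields the same tree as the example: [:main, "hello ", [:em, "em and ", [:strong, "strong"]]]. Input that already contains something looking like a placeholder would confuse this sketch, so a real parser would need to escape it or pick ids that cannot collide.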

Let me know what you think.

Gaspard