Hi list (and Jason)!
I have a prototype parser that uses a Regexp-based “Divide and
Conquer” pattern to parse Textile:
http://github.com/gaspard/redcloth-regexp/tree/master
This parser currently only parses simple ‘list’, ‘strong’, ‘em’ and
‘bold’ but it is very easy to extend and adapt.
To give you an idea of how this thing works:
- take a string
- try to match the first regular expression from the context (if you are in
  :main and :main => [:p, :bold], the first regexp is defined by =>
  …)
- if the pattern matches, insert a placeholder and scan the matched text
  in the new context (:p)
- when you cannot match (no more regexps in the context list), unfold by
  expanding the text to an S-expression tree
Example:
"hello em and strong"
match regular expression associated with :em
=> "hello @@=9347=@@"
scan matched content in :em context
=> "em and strong" matches :strong
=> "em and @@=9350=@@"
no match in "strong"
expand in :strong context ==> [:strong, "strong"]
expand in :em context ==> [:em, "em and ", [:strong, "strong"]]
expand in :main ==> [:main, "hello ", [:em, "em and ", [:strong, "strong"]]]
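The steps and the trace above can be sketched in Ruby roughly as follows. This is only a minimal illustration, not the code from the repository. The CONTEXTS and RULES tables, the parse and expand helpers, the sequential placeholder ids, and the Textile-style markers in the input (_..._ for em, *...* for strong) are all assumptions made for the example.

```ruby
# Hypothetical sketch of the placeholder-based "divide and conquer" idea.
# None of these names come from the redcloth-regexp repository.

# For each context, the rules that may match inside it (assumed layout).
CONTEXTS = {
  main:   [:em, :strong],
  em:     [:strong],
  strong: []
}.freeze

# One regexp per rule; capture group 1 is the text to scan in the new context.
RULES = {
  em:     /_(.+?)_/,
  strong: /\*(.+?)\*/
}.freeze

PLACEHOLDER = /\A@@=(\d+)=@@\z/

# Parse +text+ in +context+ and return an S-expression as nested arrays.
def parse(text, context = :main, store = {})
  text = text.dup
  CONTEXTS.fetch(context).each do |rule|
    # Replace each match with a placeholder; the matched content is
    # parsed recursively in the context named after the rule.
    text.gsub!(RULES.fetch(rule)) do
      node = parse(Regexp.last_match(1), rule, store)
      id = store.size
      store[id] = node
      "@@=#{id}=@@"
    end
  end
  expand(text, context, store)
end

# Unfold: split around placeholders and substitute the stored subtrees.
def expand(text, context, store)
  parts = text.split(/(@@=\d+=@@)/).reject(&:empty?)
  children = parts.map do |part|
    match = PLACEHOLDER.match(part)
    match ? store[match[1].to_i] : part
  end
  [context] + children
end

p parse("hello _em and *strong*_")
# => [:main, "hello ", [:em, "em and ", [:strong, "strong"]]]
```

Keeping the subtrees in a side table keyed by placeholder id means each regexp only ever sees flat text, which is what keeps adding a new rule down to one entry in each table.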
Let me know what you think.
Gaspard