Hi all.
I’m pleased to announce 0.0.1 (aka “early adopters only” release) of my
Uniforma library.
It’s here: http://rubyforge.org/projects/uniforma/
== What is it?
A library for parsing “simple text” formats (RD, Textile, Markdown, etc.)
and generating output in various formats (including simple text, HTML/XML
and more complex ones).
The heart of the library is two DSLs - for defining parsers and
generators.
== Why?
- While preparing documentation for “one more serious library”, I ran
  into a dilemma: write it in RD (to auto-generate everything with
  RDoc)? In Trac’s wiki format (for uploading to a Trac site)? Or in
  Textile (for uploading to a stand-alone site one day)? So I decided
  to write a conversion library/tool.
- I use RedCloth (Textile) for all my work, and while trying to patch
  it for my needs, I found it’s a mess. I just need separate, clear
  descriptions of the “how is it parsed” and “how is it generated”
  aspects.
- For my journalism, I need MS Word output (editing text in MS Word is
  no fun for me, but being able to generate it is a must). Right now I
  use a “Textile=>(RedCloth)=>HTML=>winword mytext.html” scheme, which
  has several flaws. I want to be able to easily define an MS Word
  generator (using win32ole, of course, no hand-made heroism).
== Show. Me. The. Code.
Usage:
    puts Uniforma::textile('some text "with links":http://google.com.').to_html_string
output:
some text with links.
Defining parsers:
    module Uniforma::Parsers
      class Textile < LineParser
        definition do
          …
          # how to parse some line
          …
          line /^h(\d+).\s+/ do para(:heading, :level => @_1.to_i) end
          …
          # how to parse inline formatting:
          inline /(.+?)/, :italic
        end
      end
    end
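
To make the idea of a line-rule DSL more concrete, here is a minimal, self-contained sketch of how a `line` declaration could be wired up. It is not Uniforma's actual implementation: `ToyLineParser`, `ToyTextile`, the `Rule` struct and the `[type, attributes, text]` node shape are all made up for illustration, and the block receives the match object explicitly instead of using `@_1`.

```ruby
# Hypothetical toy, NOT Uniforma's code: a class-level table of line rules.
class ToyLineParser
  Rule = Struct.new(:pattern, :handler)

  # Each subclass keeps its own list of rules.
  def self.rules
    @rules ||= []
  end

  # DSL entry point: lines matching `pattern` are handled by `block`,
  # which must return [type, attributes].
  def self.line(pattern, &block)
    rules << Rule.new(pattern, block)
  end

  # Returns an array of [type, attributes, text] triples, one per
  # non-empty input line. Unmatched lines become plain paragraphs.
  def parse(text)
    text.each_line.map(&:chomp).reject(&:empty?).map do |line|
      match = nil
      rule = self.class.rules.find { |r| match = r.pattern.match(line) }
      if rule
        type, attrs = rule.handler.call(match)
        [type, attrs, match.post_match]
      else
        [:para, {}, line]
      end
    end
  end
end

class ToyTextile < ToyLineParser
  # "h2. Title" becomes a heading of level 2
  line(/^h(\d+)\.\s+/) { |m| [:heading, { level: m[1].to_i }] }
end

nodes = ToyTextile.new.parse("h2. Title\n\nplain text\n")
```

The point of the sketch is only that the parsing rules live in one declarative table, separate from whatever later turns the nodes into output.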
Defining generators:
    module Uniforma::Generators
      class HtmlString < TextGenerator
        definition do
          …
          # what to place around some "paragraph type"
          around(:heading) {|p| i = p.level; ["<h#{i}>", "</h#{i}>\n"]}
          …
          # what to place around some "inline markup type"
          around(:italic) {["<i>", "</i>"]}
        end
      end
    end
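
For readers wondering how an `around` declaration might work underneath, the following is a rough sketch under my own assumptions, not the library's real code. `ToyGenerator`, `ToyHtml` and the `[type, attributes, text]` node triples are hypothetical names, and the attributes are passed as a plain hash rather than an object with a `level` method.

```ruby
# Hypothetical toy, NOT Uniforma's code: a class-level table of wrappers.
class ToyGenerator
  # Each subclass keeps its own map of node type => wrapper block.
  def self.wrappers
    @wrappers ||= {}
  end

  # DSL entry point: `block` returns the [before, after] strings
  # to place around a node of the given type.
  def self.around(type, &block)
    wrappers[type] = block
  end

  # nodes is an array of [type, attributes, text] triples.
  # Types without a wrapper are emitted as bare text.
  def generate(nodes)
    nodes.map do |type, attrs, text|
      wrapper = self.class.wrappers[type]
      before, after = wrapper ? wrapper.call(attrs) : ["", ""]
      "#{before}#{text}#{after}"
    end.join
  end
end

class ToyHtml < ToyGenerator
  around(:heading) { |attrs| i = attrs[:level]; ["<h#{i}>", "</h#{i}>\n"] }
  around(:para)    { ["<p>", "</p>\n"] }
end

html = ToyHtml.new.generate([
  [:heading, { level: 2 }, "Title"],
  [:para, {}, "plain text"]
])
```

A second output format would then be just another subclass with its own `around` declarations, which is the separation of “how is it generated” from “how is it parsed” that motivated the library.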
Uniforma is smart enough to allow:
- parsers for non-line-based formats (in fact, it also has one “toy”
  parser for HTML, which even works on not-very-complex HTML documents!)
- generators for non-text formats (I’m working on PDF and MS Word
  generators; they are not very hard to define with Uniforma)
== Important notes about current release
- This release shamelessly includes the htmlentities library by Paul
  Battley[1], without even mentioning it in the license files. This
  will change ASAP.
- It’s really an “early adopters” release: almost no docs, and very,
  very poor tests. But it shows the idea and is a base for further work.
- This release includes parsers for Textile, RD and HTML, and
  generators for BBcode, RD and HTML. All of them are incomplete but
  tend to work.
- I’d like to hear opinions on whether the parser/generator DSLs look
  “right” from the point of view of a) native English speakers and
  b) real Ruby ninjas. You can examine my parsers in
  lib/uniforma/parsers/ and generators in lib/uniforma/generators/
Again, the library is here: http://rubyforge.org/projects/uniforma/
Thanx.
Zverok.