Yes, I have run into this before. Mongrel will error on any invalid HTTP
URI; one common case is characters that are not properly escaped, which
is what your example has. When one of the developers of my app brought
this up before, he was told by the Mongrel developer that this was
intentional, and would not be changed.
I didn’t like this then, and I don’t like it now, for a variety of
reasons, including that my app needs to respond to URLs sent by third
parties that are not under my control. Perhaps the current Mongrel
developers (IS there even any active development on Mongrel?) have a
different opinion, and this could be changed or made configurable.
In the meantime, I have gotten around it with some mod_rewrite rules in
Apache in front of Mongrel, which take illegal URLs and escape/rewrite
them to be legal. However, due to some odd behavior (bugs?) in Apache and
mod_rewrite around escaping, and the difficulty of controlling escaping in
the Apache conf, I actually had to use an external Perl script too.
Here’s what I do:
Apache conf, applying to Mongrel URLs (which in my setup are all URLs on
a given Apache virtual host):
RewriteEngine on
RewriteMap query_escape prg:/data/web/findit/Umlaut/distribution/script/rewrite_map.pl
#RewriteLock /var/lock/subsys/apache.rewrite.lock
RewriteCond %{QUERY_STRING} ^(.*[><].*)$
RewriteRule ^(.*)$ $1?${query_escape:%1} [R,L,NE]
The rewrite_map.pl file:
#!/usr/bin/perl
$| = 1; # Turn off buffering
# Apache writes one lookup key per line to STDIN and reads one result line back
while (<STDIN>) {
    s/>/%3E/g;
    s/</%3C/g;
    s/\//%2F/g;
    s/\\/%5C/g;
    s/ /\+/g;
    print $_;
}
Looks like I’m not actually escaping bare ‘%’ chars, since I hadn’t run
into those before in the URLs I need to handle. It would be trickier to
add a regexp for that, since you need to distinguish an improper % from
a % that’s actually the start of a valid percent-escape (a % followed by
two hex digits). Maybe something like:
s/%(?![0-9A-Fa-f]{2})/%25/g;
‘/%25’ would be a valid URI path representing the % char. ‘/%’ is not.
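To make the lone-percent idea easier to try outside Apache, here is a minimal Perl sketch. The sub name escape_query is hypothetical and not part of my actual rewrite_map.pl; it applies the same substitutions as the map script, plus a negative lookahead as one way to catch a ‘%’ that is not followed by two hex digits. I have not run this variant behind Apache, so treat it as a starting point only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not in rewrite_map.pl):
# escape a raw query string the way the map script does, plus lone '%' chars.
sub escape_query {
    my ($q) = @_;
    # Escape any '%' not followed by two hex digits; valid escapes such as
    # %3C are left untouched.
    $q =~ s/%(?![0-9A-Fa-f]{2})/%25/g;
    $q =~ s/>/%3E/g;
    $q =~ s/</%3C/g;
    $q =~ s/\//%2F/g;
    $q =~ s/\\/%5C/g;
    $q =~ s/ /+/g;
    return $q;
}

print escape_query('a=100%&b=<x>'), "\n";   # a=100%25&b=%3Cx%3E
print escape_query('a=%3Cok%3E'), "\n";     # a=%3Cok%3E (already valid)
```

If that behaves the way you want, the lookahead substitution could go inside the while loop of the map script alongside the other rules.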
Hope this helps,
Jonathan
Robbie A. wrote:
but is there anyway to have mongrel ignore lone percent signs? Or
perhaps a Nginx rewrite rule that will encode extraneous percent signs?
–
Jonathan R.
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu