Hello,
when a bot visits our page, we want to generate a response that is
different from the response a human gets. In particular, we want to
limit the menu hierarchies so that the bot doesn't think there are too
many tags/keywords for the site (and thus they don't get indexed), and
we also do not display ads to bots.
We cache most of our pages using standard Rails caching.
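By "standard Rails caching" we mean page caching, i.e. something like
this (ProductsController is just an illustration):

    class ProductsController < ApplicationController
      # The first render is written to e.g. public/products/1.html;
      # later hits are served straight from disk, bypassing Rails.
      caches_page :show
    end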
The problem is that when a bot visits the site, it gets the standard
cached page with the incorrect menus. We want to work around this so
that bots do not get cached pages. (Yes, it would be even better if
bots got a bot-specific cached page, but let's keep it simple for now.)
We tried a few things already:
- using an Apache rewrite to prefix the URL with /robot, so that
Mongrel never finds the cached page on disk (roughly the rule sketched
below). The problem is that having Apache add /robot to the URI means
the PATH_INFO as seen by Rails has /robot in it, so all the links in
the generated page end up with /robot in front of them. Another
problem was that we effectively had to duplicate the whole set of
rules in routes.rb for the /robot-prefixed cases.
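For reference, the rewrite was along these lines (the real user-agent
list is longer; googlebot/slurp/msnbot are just examples):

    # Prefix bot requests with /robot so the page cache never matches.
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} (googlebot|slurp|msnbot) [NC]
    RewriteRule ^/(.*)$ /robot/$1 [PT,L]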
We have two ideas:
- When Apache detects a bot, send the request to a non-caching web
server. Does anyone know of one?
- I edited the Mongrel source in mongrel-1.1.4/lib/mongrel/rails.rb
and added this kind of thing to the process method:

      do_not_cache = KNOWN_ROBOT_AGENTS.detect { |b|
        user_agent.downcase.include?(b)
      } if user_agent

  and then used that variable in the tests around @files.can_serve(…).
This works, but we still want Mongrel to serve genuinely static files
straight from disk (the rules above can take care of this too, it just
gets more complicated to check for /stylesheets, /images, etc.).
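In context, the change looks roughly like this (the branch structure
around our added lines is paraphrased from mongrel-1.1.4's rails.rb
from memory, and KNOWN_ROBOT_AGENTS is our own abbreviated list, so
treat it as a sketch rather than an exact diff):

    # At class level in mongrel/rails.rb:
    KNOWN_ROBOT_AGENTS = %w(googlebot slurp msnbot)  # abbreviated

    # Inside RailsHandler#process; only the do_not_cache lines are new:
    user_agent   = request.params["HTTP_USER_AGENT"]
    do_not_cache = KNOWN_ROBOT_AGENTS.detect { |b|
      user_agent.downcase.include?(b)
    } if user_agent

    path_info   = request.params[Mongrel::Const::PATH_INFO]
    page_cached = path_info + ActionController::Base.page_cache_extension
    get_or_head = %w(GET HEAD).include?(
      request.params[Mongrel::Const::REQUEST_METHOD])

    if get_or_head and @files.can_serve(path_info)
      # A real file on disk (stylesheets, images, ...): serve as before.
      @files.process(request, response)
    elsif get_or_head and !do_not_cache and @files.can_serve(page_cached)
      # A page-cached action: only hand the cached copy to non-bots.
      request.params[Mongrel::Const::PATH_INFO] = page_cached
      @files.process(request, response)
    else
      # Bots (and anything uncached) fall through to a live Rails dispatch.
    end

Gating only the page_cached branch is what lets static assets keep
coming from disk while bots bypass the page cache.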
Question: is there a way to plug our own logic into the Mongrel
request-handling process? And/or can we set up a specific Mongrel
instance to never serve cached pages (are there options for this)?
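Ideally this would live in our own code rather than a patched gem.
Mongrel's general extension point seems to be a custom HttpHandler, so
we are imagining something along these lines (BotCheckHandler and its
body are purely our sketch; HttpHandler#process, request.params and
HttpServer#register are the standard Mongrel bits):

    require 'mongrel'

    # Sketch only: a handler holding the bot check, registered in
    # front of (or in place of) the stock Rails handler.
    class BotCheckHandler < Mongrel::HttpHandler
      KNOWN_ROBOT_AGENTS = %w(googlebot slurp msnbot)  # abbreviated

      def process(request, response)
        ua = request.params["HTTP_USER_AGENT"].to_s.downcase
        if KNOWN_ROBOT_AGENTS.any? { |b| ua.include?(b) }
          # ...somehow make the next handler skip the page cache...
        end
      end
    end

    # server = Mongrel::HttpServer.new("0.0.0.0", "3000")
    # server.register("/", BotCheckHandler.new)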
Any ideas are appreciated,
Mike