I was wondering what experience the group has for monitoring tools,
particularly monit and nagios? We have a PHP platform that we are
pushing live in Rails soon, and we need to get a better handle on the
health of our servers and the overall load we have to deal with. Our
site is growing and just having a terminal open for every server
running ‘sar -u 2 0’ doesn’t scale very well =).
Also, any real world experiences with fiveruns.com would be great.
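For what it's worth, one small step up from watching sar in a terminal is to capture its output and boil it down to numbers. Here is a rough Ruby sketch, assuming the usual `sar -u` layout where a header row contains a %idle column; the method name cpu_busy_samples is made up for illustration, and column layouts differ between sysstat versions, so treat it as a starting point only.

```ruby
# Turn the text printed by `sar -u` into a list of CPU busy percentages
# (100 - %idle), one per sample line. Assumes a header row containing "%idle".
def cpu_busy_samples(sar_output)
  idle_index = nil
  samples = []
  sar_output.each_line do |line|
    fields = line.split
    next if fields.empty?
    if fields.include?("%idle")
      # Header row: store the %idle position counted from the right,
      # because the timestamp can be one field (24h) or two (12h with AM/PM).
      idle_index = fields.index("%idle") - fields.length
      next
    end
    # Skip anything before the header and the trailing "Average:" summary.
    next if idle_index.nil? || fields.first.start_with?("Average")
    value = fields[idle_index]
    next unless value
    idle = Float(value, exception: false)
    next if idle.nil?
    samples << (100.0 - idle).round(2)
  end
  samples
end

# Possible use on one host (a finite sample count so the command returns):
#   busy = cpu_busy_samples(`sar -u 2 5`)
#   puts busy.max
```

Run over ssh from a cron job, something like this could feed a log or an alert instead of a wall of terminals, though a real monitoring tool does this job better.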
We use both monit and nagios. I find nagios hard to set up, every
time, but it sure does work well. On the purely rails and mongrel
side of things, I use a custom capistrano task to set up monit along
with mongrel_cluster to manage mongrels. It looks a little something
like this:
desc "Setup monit daemon monitoring"
task :setup_monit do
  # The header is written as a monit comment so the generated file still parses.
  monit_configuration = render :template => <<-EOF
# This monit configuration was generated dynamically
  EOF
  # Exclusive range: one entry per mongrel, covering every configured server.
  (0...mongrel_servers).each do |server|
    monit_configuration += <<-EOF
check process mongrel-#{mongrel_start_port + server} with pidfile #{deploy_to}/current/log/mongrel.#{mongrel_start_port + server}.pid
  group mongrel
  start program = "/usr/local/bin/ruby /usr/local/bin/mongrel_rails start -d -e production -p #{mongrel_start_port + server} -a 127.0.0.1 -l #{deploy_to}/current/log/mongrel.log -P #{deploy_to}/shared/log/mongrel.#{mongrel_start_port + server}.pid -c #{deploy_to}/current"
  stop program = "/usr/local/bin/ruby /usr/local/bin/mongrel_rails stop -P #{deploy_to}/shared/log/mongrel.#{mongrel_start_port + server}.pid"
  if totalmem > 100.0 MB for 5 cycles then restart
  if failed port #{mongrel_start_port + server} protocol http with timeout 10 seconds then restart
    EOF
  end
  # Upload the assembled configuration to the server.
  put monit_configuration, "/etc/monit.d/rails.conf"
end
You’ll need to add the appropriate --user and --group in there too,
otherwise they start back up as root. I’ve been experimenting with
that timeout – it’s pretty low… Also probably could use the
addition of Bradley Taylor’s mongrel_cluster --cleanup with single
mongrels specified…
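As a rough illustration of the --user and --group point, the start command could be built by a small helper like the one below. This is only a sketch: the helper name and the "mongrel" user and group are placeholders I made up, so substitute whatever account your app actually runs under.

```ruby
# Build the mongrel_rails start command for a monit "start program" line.
# The user and group defaults are hypothetical placeholders.
def mongrel_start_command(deploy_to, port, user = "mongrel", group = "mongrel")
  [
    "/usr/local/bin/ruby /usr/local/bin/mongrel_rails start",
    "-d -e production -p #{port} -a 127.0.0.1",
    "-l #{deploy_to}/current/log/mongrel.log",
    "-P #{deploy_to}/shared/log/mongrel.#{port}.pid",
    "-c #{deploy_to}/current",
    # Drop privileges so a monit restart does not bring the mongrel up as root.
    "--user #{user} --group #{group}"
  ].join(" ")
end

puts mongrel_start_command("/var/www/myapp", 8000)
```

The returned string can be interpolated into the start program line in place of the hand-written command.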
Let us (rails-deployment) know what you go for.
shameless plug – I also use heartbeat for simple monitoring of my
site and making sure important rails actions are still responding in a
timely manner: http://heartbeat.highgroove.com/
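To give a feel for that kind of check (this is not Heartbeat's code, just a minimal sketch of the idea), you can time a request and flag it when it fails or takes too long. The timed_check helper and the example URL are made up for illustration.

```ruby
# Time a block and report whether it succeeded within limit_seconds.
# The block should return a truthy value when the action responded correctly.
def timed_check(limit_seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  ok = begin
    result = yield
    result ? true : false
  rescue StandardError
    # Connection errors and the like count as a failed check.
    false
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  { :ok => ok, :elapsed => elapsed, :timely => ok && elapsed <= limit_seconds }
end

# Hypothetical use against a Rails action (example.com is a placeholder):
#   require "net/http"
#   require "uri"
#   status = timed_check(2.0) do
#     Net::HTTP.get_response(URI("http://example.com/login")).is_a?(Net::HTTPSuccess)
#   end
#   warn "login action is slow or down" unless status[:timely]
```

Taking the request as a block keeps the timing logic separate from the HTTP call, so the same check works for any action you care about.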
Thanks for the advice here…we have nagios installed, just not
configured at all yet.
Also, I knew someone had developed a very simple monitoring app like
Heartbeat, I just couldn’t remember the name. Thanks for reminding me
of it before I went and wrote something myself =). It seems the
Railsday SVN doesn’t work anymore, is the code still available and
open to play with or extend?
I tried fiveruns.com and was frustrated. You have to install their
binary daemon as root. There was no documentation on how to monitor
different services.
Hey Rob - I can’t answer your question, but I did come across this
other one: http://www.hyperic.com/. Nice eye candy there.
One to add to your list.
Cheers,
Jodi
General Partner
The nNovation Group inc.
www.nnovation.ca/blog
Thanks for the props, Jodi. Speaking as someone from the industry, it’s
really hard to tell what’s good and what’s lacking out there. I’ll admit
to being a Hyperican right up front, but I’ll shoot straight with you on the ones
you’ve questioned. FiveRuns is an interesting business as it is hosted
monitoring. Which means someone else is managing your monitoring
solution. Some find it nice to offload this completely, but most Admins
I talk to won’t hear of not having complete control and the monitoring
solution behind their own firewall. Nagios has been around for ages.
Huge community, generally works well. But it is HARD to set up, very
fragile, and has challenges when you hit scalability and compatibility
with anything non-SNMP based for the most part. Hyperic is a much more
modern approach that was built for flexibility and web infrastructure.
A perfect fit for Rails. Hyperic monitors everything from the networks,
hardware, software and services and does auto-discovery to keep up with
the changes. Hyperic also just started coding our summer release, and
we are incorporating Rails into our code. So far, we’re HUGE fans. Which
means we are also building out a plugin to manage the internals of Rails
specifically as well. The plugin should be simple, as our plugin
architecture is pretty clean and flexible, we just haven’t had a working
Rails app to play with until this project. If anyone wants to help out,
we’d love the contributions, even if it’s only testing!
For all those who have had problems setting up Nagios in the past,
there is a project on SourceForge called Blue (http://blue.sourceforge.net/), a Java port of Nagios that comes with
a nifty configuration tool that you can also use on both new and
existing Nagios installations. It’s not written in Rails, but it’s a
definite improvement!
Jon
On Apr 11, 3:15 am, Stacey Schneider <ruby-forum-incom…@andreas-
Big Sister (http://bigsister.graeff.com/) is a simpler and easier to
configure option. It keeps history by default, I think, and it runs on
port 1984, which I like.
We sell Fiveruns subscriptions as an add-on service to our virtual
and dedicated server packages. We love the service and our customers
have been quite happy with it. Fiveruns has a strong platform and I’d be
happy to answer any questions you have about it (although I’m in the
digest, so CC me with any questions).