3. What to Monitor?
network connectivity
database connectivity
bandwidth
computer resources
  free RAM
  CPU load
  disk space
events
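As a rough illustration of the first two items, network and database connectivity can both be probed by attempting a TCP connection to the service's port. The sketch below uses only Ruby's standard library; `port_open?` is a hypothetical helper name, and the host and port in the usage line are placeholders rather than values from the slides.

```ruby
require "socket"

# Hypothetical helper (not part of any monitoring tool):
# returns true if a TCP connection to host:port succeeds within `timeout` seconds.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Refused, unreachable, unresolvable, or timed out: treat as "down".
  false
end

# Example: probe a database port on the local machine (placeholder values).
puts port_open?("127.0.0.1", 3306, timeout: 1)
```

A check like this only shows that something is listening on the port; it does not confirm that the service behind it is healthy.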
4. What do you use?
Self-monitoring tools
  runit
  monit
  nagios
Web-based monitoring services
  Montastic (http://www.montastic.com)
  Monitor (http://mon.itor.us)
  Site24x7 (http://site24x7.com)
6. Features
Open Source
Configuration file written in Ruby
Easily write your own custom conditions in Ruby
Supports both poll-based and event-based conditions
Different poll conditions can have different intervals
Integrated notification system (write your own too!)
Easily control non-daemonizing scripts
Best suited to Ruby on Rails and Merb
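To make "different poll conditions can have different intervals" concrete, here is a minimal plain-Ruby sketch of the idea. `PollCheck` and `run_due` are hypothetical names invented for this illustration; they are not god's classes and say nothing about how god schedules its conditions internally.

```ruby
# Hypothetical stand-in for illustration only; this is not god's API.
class PollCheck
  attr_reader :name, :interval

  def initialize(name, interval, &test)
    @name = name
    @interval = interval # seconds between polls
    @test = test         # block returning true when the condition is met
    @next_run = 0
  end

  # True once the scheduled time for the next poll has been reached.
  def due?(now)
    now >= @next_run
  end

  # Run the test and schedule the next poll one interval later.
  def poll(now)
    @next_run = now + @interval
    @test.call
  end
end

# Poll every check that is due at time `now`; return the names that triggered.
def run_due(checks, now)
  checks.select { |c| c.due?(now) }.select { |c| c.poll(now) }.map(&:name)
end

checks = [
  PollCheck.new(:process_running, 30) { true },
  PollCheck.new(:memory_usage, 60) { true }
]
[0, 30, 60].each { |t| puts "t=#{t}: #{run_due(checks, t).inspect}" }
```

Each check carries its own interval, so a cheap test can run often while an expensive one runs rarely.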
7. Installation
Available as a rubygem at http://github.com/mojombo/god; latest stable release is 0.7.11
Works on Linux (kernel 2.6.15+), BSD, and Darwin systems
The following systems have been tested.
Darwin 10.4.10
RedHat Fedora Core 6
Ubuntu Dapper (no events)
Ubuntu Feisty
CentOS 4.5 (no events)
8. Config file
# run with: god -c /path/to/rails/root/config/monitor.rb

RAILS_ROOT = "/path/to/rails/root"

%w{4000}.each do |port|
  God.watch do |w|
    w.group = "mongrel"
    w.name = "mongrel-#{port}"
    w.interval = 60.seconds # default
    w.start = "mongrel_rails start -c #{RAILS_ROOT} -p #{port} -P #{RAILS_ROOT}/log/mongrel.pid -d"
    w.stop = "mongrel_rails stop -P #{RAILS_ROOT}/log/mongrel.pid"
    w.restart = "mongrel_rails restart -P #{RAILS_ROOT}/log/mongrel.pid"
    w.start_grace = 10.seconds
    w.restart_grace = 10.seconds
    w.pid_file = File.join(RAILS_ROOT, "log/mongrel.pid")

    w.behavior(:clean_pid_file)

    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = 30.seconds
        c.running = false
      end
    end

    w.restart_if do |restart|
      restart.condition(:memory_usage) do |c|
        c.above = 150.megabytes
        c.times = [3, 5] # 3 out of 5 intervals
      end

      restart.condition(:cpu_usage) do |c|
        c.above = 50.percent
        c.times = 5
      end
    end

    w.lifecycle do |on|
      on.condition(:flapping) do |c|
        c.to_state = [:start, :restart]
        c.times = 5
        c.within = 5.minute
        c.transition = :unmonitored
        c.retry_in = 10.minutes
        c.retry_times = 5
        c.retry_within = 2.hours
      end
    end
  end
end
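The `c.times = [3, 5]` setting ("3 out of 5 intervals") can be pictured as a sliding window over the most recent poll results. The following plain-Ruby sketch shows that idea only; `TimesWindow` is a hypothetical class written for this illustration, not god's implementation, and the memory samples are made-up numbers.

```ruby
# Hypothetical helper, not god's code: keeps the last `window` poll results
# and reports whether at least `needed` of them exceeded the threshold.
class TimesWindow
  def initialize(needed, window)
    @needed = needed
    @window = window
    @history = []
  end

  # Record one poll result (true = threshold exceeded). Returns true once
  # `needed` of the last `window` results were true.
  def record(hit)
    @history << hit
    @history.shift if @history.size > @window
    @history.count(true) >= @needed
  end
end

MEGABYTE = 1024 * 1024
limit = 150 * MEGABYTE
window = TimesWindow.new(3, 5)

# Made-up memory readings (in megabytes), one per poll interval.
samples = [160, 140, 155, 120, 170].map { |mb| mb * MEGABYTE }
samples.each_with_index do |bytes, i|
  puts "poll #{i + 1}: restart" if window.record(bytes > limit)
end
```

With these samples, only the fifth poll reports a restart, because that is the first point at which three of the last five readings are above 150 MB. A single spike therefore does not trigger a restart on its own.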
15. Transitions & Events
# determine the state on startup
w.transition(:init, { true => :up, false => :start }) do |on|
  on.condition(:process_running) do |c|
    c.running = true
  end
end

# determine when process has finished starting
w.transition([:start, :restart], :up) do |on|
  on.condition(:process_running) do |c|
    c.running = true
  end

  # failsafe
  on.condition(:tries) do |c|
    c.times = 5
    c.transition = :start
  end
end
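Read as a state machine, the two transition blocks above say: from `:init`, go to `:up` if the process is running and to `:start` otherwise; from `:start` or `:restart`, go to `:up` once the process is running, and fall back to `:start` after five unsuccessful tries. The plain-Ruby sketch below encodes only that decision table. `next_state` and its parameters are hypothetical names, and the function is not how god itself is implemented.

```ruby
# Hypothetical sketch of the state logic only; not god's implementation.
# state   - current state (:init, :start, :restart, :up, ...)
# running - whether the process_running check passed
# tries   - how many times the check has been attempted in this state
def next_state(state, running, tries, max_tries: 5)
  case state
  when :init
    running ? :up : :start
  when :start, :restart
    if running
      :up
    elsif tries >= max_tries
      :start # failsafe: stop waiting and go back to :start
    else
      state  # keep waiting for the process to come up
    end
  else
    state    # other states are left unchanged in this sketch
  end
end

puts next_state(:init, false, 0)
puts next_state(:restart, false, 5)
```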
16. Watching Non-Daemon Processes
God.pid_file_directory = '/path/to/pid_file_directory'

God.watch do |w|
  # watch with no pid_file attribute set
end
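When a watch has no `pid_file`, the monitor has to launch the script itself and remember its PID, which is why a PID file directory is configured. The sketch below shows that bookkeeping in plain Ruby using only the standard library. The helper names (`start_tracked`, `alive?`, `stop_tracked`) and the `sleeper` example are assumptions made for illustration and are not part of god.

```ruby
require "rbconfig"
require "tmpdir"

# Hypothetical helpers for illustration; not god's API.

# Start a foreground command in the background and record its PID in pid_dir.
def start_tracked(name, pid_dir, *command)
  pid = Process.spawn(*command)
  pid_file = File.join(pid_dir, "#{name}.pid")
  File.write(pid_file, pid.to_s)
  pid_file
end

# True if the process recorded in pid_file still exists.
def alive?(pid_file)
  Process.kill(0, File.read(pid_file).to_i) # signal 0 only checks existence
  true
rescue Errno::ESRCH, Errno::ENOENT
  false
end

# Ask the recorded process to terminate, reap it, and remove its PID file.
def stop_tracked(pid_file)
  pid = File.read(pid_file).to_i
  Process.kill("TERM", pid)
  Process.wait(pid)
  File.delete(pid_file)
end

Dir.mktmpdir do |dir|
  # A stand-in for a non-daemonizing script: a Ruby process that just sleeps.
  pid_file = start_tracked("sleeper", dir, RbConfig.ruby, "-e", "sleep 30")
  puts "running: #{alive?(pid_file)}"
  stop_tracked(pid_file)
  puts "running: #{alive?(pid_file)}"
end
```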
17. Loading Config Files
# load in particular god configs
God.load "/path/to/config.god"
$ god load path/to/config.god
18. Drawbacks
No dashboard, statistical data, or graphical UI
  reduces ease of monitoring remotely
Uses Ruby
  requires installing it and other related rubygems
High memory consumption just for monitoring, compared with other command-line monitoring tools like runit