Arthur Chang

Rails.cache override in development

My friend Calvin (IntoMobile Developer) and I were just talking about keeping your app from caching with Rails.cache in development mode.  One nice way to do it is to wrap all your Rails.cache calls in your application_controller.rb.  In this wrapping call, you can determine whether or not to execute the caching or to just let it through in development.  Another benefit is being able to change your caching behavior in the future without needing to mess around with every single line of code that you've written for caching.

In my development.rb file, I define the following:

I am using the memcache gem, rather than the more popular memcache-client gem, and running memcached on my development box as well (I like testing with it running, rather than using the built in rails way of storing it in memory).  The reason for using memcache gem is because I'm currently hosting our app on Heroku, which requires this gem to work in tandem with their cloud solution for memcached.

The CACHE constant is set up to use this memcache gem's caching mechanism, which is the equivalent of the built in Rails.cache.  Defining the constant lets me change the cache client once in my environment files, without worrying about all the places I use it.

Then there's the cache_store, which I set correctly to use Memcached gem.  This will help me store the cache correctly.

CACHE_DEVELOPMENT constant is for me to turn caching on / off for development mode.  You'll see how it works in the next piece of code below.

The last two lines that are commented out control action and class caching, which don't affect whether Rails.cache calls go through.  So even with those two settings set to false, Rails.cache will still try to cache.  To keep this from happening, we have to do some logic in our application controller wrapper, here it is:

The cache_fetch method is called every time I want to cache something from a controller.  I pass in the cache key as the first parameter to name the cached result, and the second parameter is how long the result is kept before it expires.  0 means it never expires, so that is the default value.

The conditional statement basically allows me to turn caching on or off.  If I'm in the development environment and the constant CACHE_DEVELOPMENT is set to false, I just let the block through without hitting any caching.  Otherwise it goes into the caching logic.

Check out that # CACHE.fetch(key, time_expire){yield} block.  This is how it used to work, but recently a few things changed so that I needed to do a get, and if that didn't exist I'd have to manually set it.  I'm not sure why this happened, or if it's still needed, but you can see how I was able to change this detail in one place, rather than everywhere I tried to cache.

The rest shows you how I'm caching.  This would be ugly if I had to do all of that every time I wanted to cache a result, and even worse if something changed.  This way I can control caching in the development environment for testing, and keep the code as DRY as possible.
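The wrapper described above can be sketched roughly as follows.  This is a minimal sketch, not the original code: FakeCache is a hypothetical in-memory stand-in for the memcached client so the example runs on its own, current_env is a hypothetical helper standing in for Rails.env, and the get-then-set logic follows the description in the text.

```ruby
# Hypothetical stand-in for the memcached client so the sketch runs on its own.
# Only get/set are modelled; the expiry is accepted but not enforced.
class FakeCache
  attr_reader :writes

  def initialize
    @store = {}
    @writes = 0
  end

  def get(key)
    @store[key]
  end

  def set(key, value, _expiry = 0)
    @writes += 1
    @store[key] = value
  end
end

CACHE = FakeCache.new
CACHE_DEVELOPMENT = false # flip to true to exercise caching in development

# Hypothetical environment lookup; in a Rails app this would be Rails.env.
def current_env
  ENV.fetch('APP_ENV', 'development')
end

# Runs the block directly when caching is switched off in development.
# Otherwise does a get, and on a miss runs the block and sets the result.
def cache_fetch(key, time_expire = 0)
  return yield if current_env == 'development' && !CACHE_DEVELOPMENT

  value = CACHE.get(key)
  if value.nil?
    value = yield
    CACHE.set(key, value, time_expire)
  end
  value
end

# Usage, as it might look inside a controller action (Product is hypothetical):
#   @products = cache_fetch('products/all', 3600) { Product.all }
```

Because every caller goes through cache_fetch, swapping the get/set pair back to a single fetch call, or changing the development toggle, only touches this one method.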

Here's a simple example of how I call the cache_fetch action from my controller:

Hope that helps!

 


Monitor memcached with Monit and alert with Gmail

I set up monit on a production slice running Ubuntu to monitor memcached, following the directions in Luc Castera's blog post.  For lack of a reliable SMTP server, I'm just using Gmail's SMTP server to send us the alert emails.  Gmail SMTP requires SSL, and monit version 4.8.x won't work with it.

Write a monit configuration file:

sudo touch /etc/monitrc

sudo chmod 700 /etc/monitrc

Create a directory to hold the individual monit monitoring configurations (here, one for memcached):

sudo mkdir /etc/monit.d

sudo touch /etc/monit.d/memcached.monit

The above creates an empty monit file for you to add some configuration to.  Here is the monitrc configuration I'm using (minus a few password-related details):

set daemon 120 # Poll at 2-minute intervals
set logfile /home/arthur/logs/monit.log # saves the log to a log file
set httpd port 2812 and use address MY_IP_ADDRESS # this is monit's httpd server, shows you graphically what's happening
    allow MY_IP_ADDRESS # this is what IP's are allowed to access the monit page
    allow localhost # localhost is good for testing locally
    allow username:password # Allow Basic Auth
set mailserver smtp.gmail.com port 587 username "emailname@gmail.com" password "gmailpass" using tlsv1 with timeout 30 seconds # this basically allows you to use gmail's smtp server.  tlsv1 gets things going in SSL.
set alert globalemail@email.com # set this as a default email to send alerts to if you don't specify later on
include /etc/monit.d/* # includes your memcached.monit config file, and any others you write

Before you can test it out, you have to write at least one of the included config files, or else monit will complain that it can't find any to load.  Here's what I'm using for memcached.monit:

check process memcached with pidfile "/home/arthur/logs/memcached.pid" # this is where the pid file will be
    start = "/usr/bin/memcached -u root -d -P /home/arthur/logs/memcached.pid" # this starts it with the root user, daemonized, and stores the pid in the file mentioned; the pid file is used by everything else here
    stop = "/bin/sh -c 'kill -9 `cat /home/arthur/logs/memcached.pid`; rm /home/arthur/logs/memcached.pid'" # kills it and also removes the pid file; wrapped in sh -c because monit does not run commands through a shell
    if failed host 127.0.0.1 port 11211 then restart # restart it if it dies on localhost on the default port of 11211
    if cpu usage is greater than 60 percent for 2 cycles then alert # if it uses too much cpu we'll send out an alert
    if cpu usage > 98% for 5 cycles then restart # if it uses tons of cpu, restart.  this should probably not be so high
    if 2 restarts within 3 cycles then timeout # timeout if you try too many times
    alert email@email.com # alert email

The CPU usage thresholds are arbitrary numbers I picked, so make sure you tune them to what you need.
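To get a feel for what the "if failed host 127.0.0.1 port 11211" line checks, here is a rough Ruby equivalent you can run by hand.  It is only a sketch of the idea (try to open a TCP connection, treat any failure as "down"), not how monit implements it, and port_open? is a name made up for this example.

```ruby
require 'socket'

# Returns true if a TCP connection to host:port can be opened within the
# timeout, false if it is refused, unreachable, or times out.
def port_open?(host, port, timeout_seconds = 2)
  Socket.tcp(host, port, connect_timeout: timeout_seconds) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Example: check the default memcached address used in the config above.
# puts port_open?('127.0.0.1', 11211)
```

This is handy for confirming that memcached is listening where monit expects it before blaming the monit config.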

Check your syntax with:

sudo monit -t

It should all check out OK, if not it will give you a bunch of errors.  Also if it can't find your monitrc config file, specify it with the -c flag.  If you did everything posted in the blog post the -c flag shouldn't be needed, as it'll find the config in the path.

Now big drumroll, start up monit

sudo monit

That's it, check the status with sudo monit status, and you can manually start memcached if not already with sudo monit start memcached.  Let me know if any of the info above is inaccurate or you have questions!


memcached with passenger, ree, and the memcache-client gem

Memcached is pretty easy to setup, but there are a few items that are super strange.  This is how I set it up locally to test and on a production slice:

Production: Passenger 2.2.1, REE, rails 2.3.2, memcached, memcache-client gem, systemtimer gem, slicehost slice

Development: mongrels as usual, rails 2.3.2, ruby 1.8.6, memcached, memcache-client gem, systemtimer gem, Mac OS X Leopard

Development:

  • Install macports if you haven't already
  • sudo port install memcached
  • sudo gem install memcache-client
  • sudo gem install systemtimer
  • memcached -vv # this runs it in verbose mode for testing
  • I have yet to figure out how to get development working without memcached running when you don't care for it.

Production:

  • sudo apt-get install memcached
  • sudo gem install memcache-client
  • sudo gem install systemtimer
  • memcached -d # this daemonizes it with the default IP of 127.0.0.1 and port 11211

Important Notes:

  • There is no configuration needed for apache/passenger.
  • memcache-client 1.5.0 is actually bundled with rails 2.1, but I would highly suggest upgrading to the memcache-client 1.7.2.  Read why here.  In a nutshell, it's WAY faster.
  • Check out his commit recently about system timer, this is extremely important for those running with REE.  He has not released this yet, but I've tried it on my slice and it works fine, so go ahead and edit your memcache.rb file with these changes.
  • Make sure you install all your gems with ruby enterprise.  Symlink your /usr/bin/ruby to the enterprise one, or make the enterprise ruby the first and only one to show up in your path.
  • Marshal serializes objects into memcached, and de-serializes them (is that the way to say it?) when you want to pull it out.  This way you can store more than just strings.  By passing in true as the third parameter of a fetch, you can do a raw add to memcache, but when you pull it back out, it will only come out as a string.  See more stuff later in this post about this.
  • Objects added or get'd from memcached need to be serializable!  Passenger gives you a horrible error message pointing to a line in memcache.rb where it does a Marshal.load or Marshal.dump, but tells you nothing else.  That's a good indication to check the objects you're returning in a fetch/add/get block to see if they are serializable.  If not write your own, see more further along in this post about it.
  • Passenger does smart spawning, which is great, but also freaks out memcached.  This is where we use memcache-client gem to do a reset whenever Passenger forks.  What smart spawning does, in short, is that whenever Passenger needs a new worker process it forks one from an already loaded Rails application/framework, rather than loading the entire app and framework for each worker process.  The full load only happens once.  It's really fast with REE and Passenger lined up.  REE improves this because it is copy-on-write friendly, which means the worker processes will share as much memory as possible.  If you want to know more, read the two links I mentioned in this bullet point.
  • How to re-establish connection with memcached in your rails app then specifically?  See below:

Setting up your production.rb to use memcached and to solve the smart spawning issue that Passenger has

# set cache classes to true
config.cache_classes = true
config.action_controller.consider_all_requests_local = false

# of course you want to perform caching
config.action_controller.perform_caching             = true

config.cache_store = :mem_cache_store
memcache_options = {
  :c_threshold => 10000,
  :compression => true,
  :debug => false,
  :namespace => 'a',
  :readonly => false,
  :urlencode => false
}

# require the new gem, this will load up 1.7.2 instead of using the built in 1.5.0
require 'memcache'

# make a CACHE global to use in your controllers instead of Rails.cache, this will use the new memcache-client 1.7.2
CACHE = MemCache.new memcache_options

# connect to your server that you started earlier
CACHE.servers = '127.0.0.1:11211'

# this is where you deal with passenger's forking
# the defined? check skips this when you're not running under Passenger (i.e. devmode with mongrel)
if defined?(PhusionPassenger)
  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    if forked
      # We're in smart spawning mode, so...
      # Close duplicated memcached connections - they will open themselves
      CACHE.reset
    end
  end
end

And finally, the magical part of caching in your controllers:

CACHE.fetch('cachekey', 1.hour) do
  # your code here; the value of the last expression is what gets cached
end

The above is using our CACHE global, that uses memcache-client 1.7.2.  memcache-client gem basically helps rails talk to memcached server.  No apache settings needed here at all.

The cache key should be unique enough so that the items in the block will be valid.
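One way to keep keys unique is to build them from everything the block's result depends on.  The helper below is a hypothetical sketch (cache_key_for is not part of memcache-client or Rails): it sorts the parameters so the same inputs always give the same key, and replaces whitespace, which memcached does not allow in keys.

```ruby
# Builds a key such as "products/page=2/sort=price" from a name plus the
# parameters the cached result depends on. Parameters are sorted by name so
# the order they are passed in does not change the key.
def cache_key_for(name, params = {})
  parts = params.sort_by { |key, _| key.to_s }.map { |key, value| "#{key}=#{value}" }
  ([name] + parts).join('/').gsub(/\s+/, '_')
end

# Usage with the CACHE global from production.rb (Product is hypothetical):
#   CACHE.fetch(cache_key_for('products', :page => 2), 1.hour) { Product.all }
```

Keep in mind memcached also limits keys to 250 bytes, so very long parameter lists would need to be hashed or shortened.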

The second parameter is the timed expiration.  Be careful, Rails.cache.fetch accepts expiration as a hash parameter, :expires_in => 1.hour.  This is not the case for memcache-client, you must pass it in as a regular parameter.

The block can hold any ruby code you want.  It has access to any variables etc. normally accessible at this point.  The very last thing in that block is returned.  Great!  But something very important:

The returned object (and, if it's a hash, array, or other enumerable, every object within it) MUST be serializable.  If not, you're going to get some crazy ass error messages that say nothing about serialization and will give you headaches.  If something is not serializable, marshalling will error out without telling you exactly what's happening.  You can write your own serialization for special objects to get around this, or, if you don't really care about the entire object, build a custom hash that holds just the values you need outside of the fetch block and return that hash instead.  Here's an example of the error I got from Passenger when trying to hit an action that returned an unserializable object:

[Sun May 03 23:14:57 2009] [error] [client 76.239.166.13] Premature end of script headers: amazon, referer: [ADDRESS_REMOVED_FOR_THIS_BLOG_POST]
[ pid=5227 file=ext/apache2/Hooks.cpp:546 time=2009-05-03 23:14:57.657 ]:
  Backend process 5255 did not return a valid HTTP response. It returned no data.
/opt/ruby-enterprise-1.8.6-20090201/lib/ruby/gems/1.8/gems/memcache-client-1.7.2/lib/memcache.rb:335: [BUG] non-initialized struct

If you see the above, immediately check the objects returning from the fetch block.
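A quick way to check an object before it goes into the fetch block is to try marshalling it yourself.  The sketch below uses made-up names (Listing, serializable?, cacheable_attributes) and a Proc as the unserializable part.  In plain Ruby the failure normally shows up as a TypeError rather than the crash above, so treat this as an illustration of the idea only.

```ruby
# Hypothetical object that carries something Marshal cannot dump (a Proc).
Listing = Struct.new(:title, :price, :callback)

# True if Marshal can dump the object, false if it raises a TypeError.
def serializable?(object)
  Marshal.dump(object)
  true
rescue TypeError
  false
end

# The "custom hash" workaround: keep only the plain values you need.
def cacheable_attributes(listing)
  { :title => listing.title, :price => listing.price }
end

listing = Listing.new('Ruby book', 25, proc { puts 'sold' })
puts serializable?(listing)                       # the Proc makes this fail
puts serializable?(cacheable_attributes(listing)) # plain strings and numbers are fine
```

Returning the plain hash from the fetch block means memcached only ever sees values that Marshal can round trip.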

That's all for now, hope that was helpful for those with passenger, ree, and the memcache-client gem!

Big thanks to all the help from Mike Perham, who maintains the memcache-client gem (amongst other amazing feats), Michael Simons who wrote about the Passenger issues with more emphasis with solutions when using memcache-client gem, and all the folks at memcached google group and phusion passenger google group.  And big help from joe who helped troubleshoot all the code the whole time and dealt with all my crap.
