Arthur Chang

Using GTalk and XMPP to get online friends

Inviting a friend to join you in some real time application is an easy concept, but hard to do from the perspective of that real time app.  You need near instant response from a friend, and what better way than to find a friend who's online and available through protocols used by chat applications?

For a web application that only requires one request to find all online users, and then another request to send an invitation, a full fledged chat client is overkill.  Why build out a fully working client that scales with millions of connections, just to get a buddy list of all your online contacts?  Unless you're building Meebo you really don't need this.

I was trying to solve that very problem: inviting online friends.  The Google Contacts Data API did give me all the people in a Google contact list, but offered nothing beyond sorting by last update.  Last-updated contacts weren't going to cut it, so I turned to writing an XMPP request.

Since I'm using the Ruby on Rails framework for this application, I decided to use the XMPP4R gem.  It's probably the most actively developed and up-to-date gem out there for XMPP and Ruby.

The Roster Helper that is bundled with XMPP4R has a handy way to get your initial buddy list, online and offline users alike.  The problem is, the XML response will not include presence updates.  The roster helper does do some kind of presence updates of its own, but it was too inconsistent to use.  The correct way to deal with presence as an XMPP client is to wait on a callback for presence updates.  The initial update sends you the statuses of all your online buddies.

First set up the Jabber client in order to connect to the GTalk XMPP server:

 

        require 'xmpp4r'

        # email and password are the user's GTalk credentials
        cl = Jabber::Client.new(Jabber::JID.new(email))
        cl.connect('talk.google.com', 5222)
        cl.auth(password)

 

The above creates a new client connection with Google Talk.  The connect command accepts a server, which for GTalk is 'talk.google.com', and the port 5222.  5222 is the standard client port for most XMPP servers.  Next is the tricky part, where you write a callback for presence updates:

 

        require 'xmpp4r/roster'

        # get the roster
        roster     = Jabber::Roster::Helper.new(cl)
        mainthread = Thread.current
        @list      = []

        roster.add_presence_callback { |item, oldpres, pres|
          # code I left out in here populates the name, jid, and presence stuff
          # see the xmpp protocol response docs for specifics on how to get the info
          @list << {:name => name, :jid => jid, :presence => pres.show.to_s}
        }

 

The roster is first initialized, and then we set up a presence callback.  In this callback, we just populate a list of buddies, each represented as a hash.  The name is the nickname you've given your buddy, and the jid is the Jabber ID.  In GTalk, the jid is actually just their email address.  The presence value I store is their availability status, which can be one of the following: away, do not disturb, extended away, or nil.  Nil just means available and online.
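Since the hash stores the presence as a string, a small lookup can turn it into something readable in the view.  This is only a sketch: the helper names are made up for illustration, and the 'chat' value comes from the XMPP spec rather than from anything I saw in my own buddy list.

```ruby
# Labels for the <show/> values defined by XMPP.
# An absent <show/> (nil, or "" after to_s) means plain "available".
SHOW_LABELS = {
  ''     => 'available',
  'chat' => 'free to chat',
  'away' => 'away',
  'xa'   => 'extended away',
  'dnd'  => 'do not disturb'
}.freeze

# Hypothetical helper: turn a stored :presence value into a label.
def presence_label(show)
  SHOW_LABELS.fetch(show.to_s, 'unknown')
end

# Hypothetical helper: keep only buddies who are likely to answer an invite.
def invitable(list)
  list.select { |buddy| ['', 'chat'].include?(buddy[:presence].to_s) }
end
```

Whether to invite people who are away or busy is a product decision; the filter above is just one conservative choice.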

Notice that we also save the current thread as our main thread.  This is important because we're going to block this thread until we get a few seconds' worth of initial presence updates:

        # send initial presence
        cl.send(Jabber::Presence.new.set_show(:dnd))

        # continue thread after timeout
        t = Thread.new { sleep XMPP_REQUEST_TIMEOUT; mainthread.wakeup }

        # block thread
        Thread.stop

To get GTalk to start calling the callback function, send an initial presence for the current user.  In this case I'm sending do not disturb using the :dnd symbol.  Next, I create a new thread that sleeps for a specified number of seconds and then wakes the main thread up.  What does this all mean?  Right after that new thread is created, I block the current one.  The current thread is the main thread, the one that's fetching the online buddies, so the client's browser keeps waiting until the main thread starts up again.  Notice that we just wait for some number of seconds before continuing, with NO indication of how many buddies there are or whether we have gotten all of them.  That is the shaky part.  I noticed that waiting 2 - 3 seconds gets all of my online friends, and showing some "loading" graphic on the browser page keeps people from bailing out.  Each callback has its own thread, so to keep them from spawning over and over again, we need to kill the client connection.

So the main thread blocks right where we say Thread.stop, and continues on once it is woken up.  After we wake the thread up, we immediately close the client connection and go on to rendering the page with the populated list of online friends:

        # kill client connection

        cl.close
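The fixed wait can be tried out without a GTalk account.  The sketch below swaps in a plain sleep on the calling thread, which as far as I can tell has the same effect as the Thread.stop / wakeup pair, and uses a simulated presence source.  The collect_for helper and the buddy names are made up for illustration.

```ruby
# Hypothetical helper: hand the block a "report" lambda, wait `window`
# seconds, then return whatever was reported in that time.
def collect_for(window)
  results = []
  lock = Mutex.new
  report = lambda { |item| lock.synchronize { results << item } }
  yield report
  sleep window
  lock.synchronize { results.dup }
end

# Simulated presence source: two quick updates and one that arrives too late.
buddies = collect_for(0.3) do |report|
  Thread.new do
    report.call(:name => 'alice', :presence => '')
    sleep 0.05
    report.call(:name => 'bob', :presence => 'away')
    sleep 1
    report.call(:name => 'carol', :presence => 'dnd')
  end
end
```

The late update is simply dropped, which is exactly the shaky part: anything slower than the window never makes it onto the page.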

 

In the view code, we simply run through the array of friends and display their statuses.  We then allow people to click and invite their friends to the site immediately.

A lot of the code has been stripped out or simplified from what we've actually used, but you'll get the point.  If anyone has feedback, suggestions, or comments, please post them!  I'd love to see how to better solve this problem.

Tagged  //   code   xmpp   xmpp4r  

Javascript Closures

I had a good conversation today that brought up Javascript Closures, which was mostly me having no idea what closures were, and the concept just being pretty hard to explain right off the bat.  Anyway, that all led me to go check out what closures are.

In a nutshell, Javascript Closures are most likely used in everyone's code already, and the actual technical aspect is not new, it just has a name.  But knowing about closures can be a powerful tool to avoid using gnarly global variables and needing to keep track of variables, parameters, and such that aren't necessarily needed all the time, globally.

A closure can be seen as an inner function within an outer function, where the inner retains the local variables, parameters, and what not of the outer function, even after the outer function has returned.

Another way to see it is that the closure is a stack frame that's not deallocated when the outer function returns.

Again, that might be totally confusing, but take a common occurrence as an example (the example uses jQuery helpers, got the example from here):

function SetClassOnHover(className) {
  $("td").hover(
    function () {
      $(this).addClass(className);
    },
    function () {
      $(this).removeClass(className);
    }
  );
}

What happens is, when the td gets a hover state, you add and remove the class name that was passed as a parameter to the SetClassOnHover function.  SetClassOnHover has already returned, yet when the hover observer fires, you still know the className.  That's how closures work!  Hope that makes sense; leave any comments or feedback that you have to help clarify or correct.
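The same idea shows up in Ruby, which most of the other posts here use.  This is a rough analogue rather than a translation: there is no DOM, so a plain hash stands in for a table cell, and the method name is made up.

```ruby
# Returns a pair of lambdas that remember class_name after this method returns.
def hover_handlers(class_name)
  on_enter = lambda { |cell| cell[:classes] << class_name }
  on_leave = lambda { |cell| cell[:classes].delete(class_name) }
  [on_enter, on_leave]
end

# A plain hash stands in for a table cell (hypothetical, no DOM here).
cell = { :classes => [] }
enter, leave = hover_handlers('highlight')
enter.call(cell)   # the lambda still knows class_name
leave.call(cell)   # and so does this one
```

By the time enter.call runs, hover_handlers has long since returned, yet class_name is still available to both lambdas.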

Tagged  //   code   javascript  

Monitor memcached with Monit and alert with Gmail

I set up monit on a production slice running Ubuntu to monitor memcached using the directions on Luc Castera's blog post.  And for lack of a reliable smtp server setup, I'm just using GMail's SMTP server to send us the alert emails.  GMail's SMTP server requires SSL/TLS, and monit version 4.8.x won't work with it.

Create a monit configuration file:

sudo touch /etc/monitrc

sudo chmod 700 /etc/monitrc

Create a directory to hold individual monit monitoring configurations (for memcached):

sudo mkdir /etc/monit.d

sudo touch /etc/monit.d/memcached.monit

The above creates an empty monit file for you to add some configuration to.  Here is the monitrc file configuration I'm using (minus a few password-related things):

set daemon 120 # Poll at 2-minute intervals
set logfile /home/arthur/logs/monit.log # saves the log to a log file
set httpd port 2812 and use address MY_IP_ADDRESS # this is monit's httpd server, shows you graphically what's happening
    allow MY_IP_ADDRESS # these are the IPs allowed to access the monit page
    allow localhost # localhost is good for testing locally
    allow username:password # Allow Basic Auth
set mailserver smtp.gmail.com port 587 username "emailname@gmail.com" password "gmailpass" using tlsv1 with timeout 30 seconds # this basically allows you to use gmail's smtp server.  tlsv1 gets things going in SSL.
set alert globalemail@email.com # set this as a default email to send alerts to if you don't specify later on
include /etc/monit.d/* # includes your memcached.monit config file, and any others you write

Before you can test it out, you have to write at least one of the config files we're asking monit to include, or else monit will complain that it can't find any to load.  Here's what I'm doing for memcached.monit:

check process memcached with pidfile "/home/arthur/logs/memcached.pid" # this is where the pid file will be
    start = "/usr/bin/memcached -u root -d -P /home/arthur/logs/memcached.pid" # this starts it with the root user, daemonized, and stores the pid in the file mentioned.  the pid file is used all over, so it's important.
    stop = "/bin/kill -9 `cat /home/arthur/logs/memcached.pid`; rm /home/arthur/logs/memcached.pid" # kills it and also removes the pid file.  if monit doesn't expand the backticks, wrap the whole command in /bin/sh -c '...'
    if failed host 127.0.0.1 port 11211 then restart # restart it if it dies on localhost on the default port of 11211
    if cpu usage is greater than 60 percent for 2 cycles then alert # if it uses too much cpu we'll send out an alert
    if cpu usage > 98% for 5 cycles then restart # if it uses tons of cpu, restart.  this should probably not be so high
    if 2 restarts within 3 cycles then timeout # timeout if you try too many times
    alert email@email.com # alert email

The cpu usage stuff is just some random stuff, so make sure you tune those to what you need.
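As far as I understand it, the "if failed host ... port ..." line boils down to checking whether a TCP connection can be opened.  If you want to try the same check by hand before trusting monit with it, here is a minimal Ruby sketch; the port_open? helper is made up for illustration and is not part of monit.

```ruby
require 'socket'
require 'timeout'

# Hypothetical helper: true if something accepts TCP connections on host:port.
def port_open?(host, port, seconds = 2)
  Timeout.timeout(seconds) do
    TCPSocket.new(host, port).close
    true
  end
rescue SystemCallError, SocketError, Timeout::Error
  false
end
```

With memcached running locally, port_open?('127.0.0.1', 11211) should come back true, and false once it has been stopped.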

Check your syntax with:

sudo monit -t

It should all check out OK; if not, it will give you a bunch of errors.  Also, if it can't find your monitrc config file, specify it with the -c flag.  If you did everything posted in the blog post, the -c flag shouldn't be needed, as it'll find the config in the path.

Now, big drumroll, start up monit:

sudo monit

That's it.  Check the status with sudo monit status, and if memcached isn't already running you can start it manually with sudo monit start memcached.  Let me know if any of the info above is inaccurate or you have questions!

Tagged  //   code   memcached   monit  

Position Fixed and PNG on Internet Explorer 6

Internet Explorer 6, (excuse my language) is basically the bane of the existence of the internet.  The time spent wrestling with this car-with-square-wheels easily adds up to enough time to take a 2 month vacation.  If enterprises would just give their machines a refresh with IE7 or even Firefox, the web developer world would be a much better one.  Developers would be living much longer lives, have fewer grey hairs, have more girlfriends, and fewer wrinkles.  Seriously, why are we enslaved by IE6?

I have yet to find a perfect and elegant solution that completely negates the neediness of IE6 and the "if lt IE 7" conditional comment you use to load up IE 6 specific css rules.  The closest thing I've found to date is something my friend Joe pointed me to.  Yes again, Joe is a source of a lot of good ideas.  It is a javascript library that fixes a bunch of stuff in IE 6: http://code.google.com/p/ie7-js/

The IE7.js file makes IE 6 perform more like IE7, and they also have an IE8.js that does the same for IE7 to IE8.  Problem is, it's definitely not a catch all and has a few problems, the biggest being that IE7.js works much better on ie6 than IE8.js does.  In fact I got my IE6 in much better shape than IE7 faster.

The PNG transparency fix it comes with is by far the best one I've found so far.  It doesn't mess up any of my existing styles and it works on background images (though not on tiling backgrounds), just by naming the PNGs you want transparent with a -trans.png suffix.

And amazingly enough, position fixed works like a charm in IE 6!  There are some other subtle fixes, like widths and such being less crazy, but they still need tuning.

I've still spent almost 6 hours so far getting IE 6 to look and work better.  There are lots of javascript problems that can't be solved, even using prototype and scriptaculous.

Biggest caveat is that this is not css, it's javascript, so your DOM loads real ugly at first, and then the javascript kicks in and adjusts everything.  Load this file pretty early on so that other javascript doesn't start firing off without the fixes that the IE7.js provides.  I feel like the slow experience, and the shitty looks you get are deserved for using IE6.  Alas, I will be addressing all these issues into something more elegant.  More on that later... much later.

One day, our geek children will be amazed and have endless sympathy for their father developers and ie6, while they happily develop and visit the Maldives in their hovercraft or with the teleporter.

What am I saying?  http://www.saveie6.com let the future feel the pain.

Tagged  //   code  

memcached with passenger, ree, and the memcache-client gem

Memcached is pretty easy to set up, but there are a few items that are super strange.  This is how I set it up locally to test and on a production slice:

Production: Passenger 2.2.1, REE, rails 2.3.2, memcached, memcache-client gem, systemtimer gem, slicehost slice

Development: mongrels as usual, rails 2.3.2, ruby 1.8.6, memcached, memcache-client gem, systemtimer gem, Mac OS X Leopard

Development:

  • Install macports if you haven't already
  • sudo port install memcached
  • sudo gem install memcache-client
  • sudo gem install systemtimer
  • memcached -vv # verbose mode, handy for testing
  • I have yet to figure out how to get development working without memcached running when you don't care for it.

Production:

  • sudo apt-get install memcached
  • sudo gem install memcache-client
  • sudo gem install systemtimer
  • memcached -d # this daemonizes it with the default IP of 127.0.0.1 and port 11211

Important Notes:

  • There is no configuration needed for apache/passenger.
  • memcache-client 1.5.0 is actually bundled with rails 2.1, but I would highly suggest upgrading to the memcache-client 1.7.2.  Read why here.  In a nutshell, it's WAY faster.
  • Check out the maintainer's recent commit about SystemTimer; this is extremely important for those running with REE.  It has not been released yet, but I've tried it on my slice and it works fine, so go ahead and edit your memcache.rb file with these changes.
  • Make sure you install all your gems with ruby enterprise.  Symlink your /usr/bin/ruby to the enterprise one, or make the enterprise ruby the first and only one to show up in your path.
  • Marshal serializes objects into memcached, and de-serializes them (is that the way to say it?) when you want to pull them out.  This way you can store more than just strings.  By passing in true as the third parameter of a fetch, you can do a raw add to memcache, but when you pull it back out, it will only come out as a string.  See more about this later in this post.
  • Objects added to or fetched from memcached need to be serializable!  Passenger gives you a horrible error message pointing to a line in memcache.rb where it does a Marshal.load or Marshal.dump, but tells you nothing else.  That's a good indication to check the objects you're returning in a fetch/add/get block to see if they are serializable.  If not, write your own serialization; see further along in this post for more about it.
  • Passenger does smart spawning, which is great, but also freaks out memcached.  This is where we use the memcache-client gem to do a reset whenever Passenger forks.  What smart spawning does, in short, is that whenever Passenger needs a new worker process it forks one from an already loaded Rails application/framework, rather than loading the entire app and framework for each worker process.  The app is only loaded once.  Really fast with REE and Passenger lined up.  REE improves this because it is copy-on-write friendly, which means the worker processes will share as much memory as possible.  If you want to know more, read the two links in this bullet point that I mentioned.
  • How to re-establish connection with memcached in your rails app then specifically?  See below:

Setting up your production.rb to use memcached and to solve the smart spawning issue that Passenger has:

# set cache classes to true
config.cache_classes = true
config.action_controller.consider_all_requests_local = false

# of course you want to perform caching
config.action_controller.perform_caching             = true

config.cache_store = :mem_cache_store
memcache_options = {
  :c_threshold => 10000,
  :compression => true,
  :debug => false,
  :namespace => 'a',
  :readonly => false,
  :urlencode => false

}

# require the new gem, this will load up 1.7.2 instead of using the built in 1.5.0
require 'memcache'

# make a CACHE global to use in your controllers instead of Rails.cache, this will use the new memcache-client 1.7.2
CACHE = MemCache.new memcache_options

# connect to your server that you started earlier
CACHE.servers = '127.0.0.1:11211'

# this is where you deal with passenger's forking
begin
   PhusionPassenger.on_event(:starting_worker_process) do |forked|
     if forked
       # We're in smart spawning mode, so...
       # Close duplicated memcached connections - they will open themselves
       CACHE.reset
     end
   end
# In case you're not running under Passenger (i.e. devmode with mongrel)
rescue NameError => error
end
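To see why the reset matters, here is a toy model of the idea behind it.  This is not how memcache-client is implemented; the ForkAwareClient class is made up to illustrate "drop the inherited connection and reopen lazily", with the connector block standing in for opening a socket.

```ruby
# Hypothetical client that opens its connection lazily and refuses to reuse
# one that was opened by a different process (e.g. inherited through fork).
class ForkAwareClient
  attr_reader :connects

  def initialize(&connector)
    @connector = connector
    @connects  = 0
    reset
  end

  # Forget the current connection; the next call to #connection reopens it.
  def reset
    @conn      = nil
    @owner_pid = nil
  end

  def connection
    if @conn.nil? || @owner_pid != Process.pid
      @conn      = @connector.call
      @owner_pid = Process.pid
      @connects += 1
    end
    @conn
  end
end
```

Calling reset in the starting_worker_process hook plays the same role here as CACHE.reset does above: each forked worker ends up with its own connection instead of sharing the parent's.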

And finally, the magical part of caching in your controllers:

CACHE.fetch('cachekey', 1.hour) do
  # block
end

The above is using our CACHE global, which uses memcache-client 1.7.2.  The memcache-client gem basically helps Rails talk to the memcached server.  No apache settings needed here at all.

The cache key should be unique enough so that the items in the block will be valid.

The second parameter is the timed expiration.  Be careful, Rails.cache.fetch accepts expiration as a hash parameter, :expires_in => 1.hour.  This is not the case for memcache-client, you must pass it in as a regular parameter.
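If the fetch semantics are unclear, this in-memory stand-in shows the general shape: return the cached value if it is still fresh, otherwise run the block and store the result.  TinyCache is made up for illustration and is not memcache-client; the injectable clock exists only so the expiry can be exercised without sleeping.

```ruby
# Hypothetical in-memory stand-in for a cache client; not memcache-client itself.
class TinyCache
  def initialize(clock = lambda { Time.now.to_f })
    @clock = clock
    @store = {}
  end

  # Return the cached value for key, or run the block, store its result
  # for `expiry` seconds (0 = never expire), and return that.
  def fetch(key, expiry = 0)
    entry = @store[key]
    fresh = entry && (entry[:expires_at].nil? || @clock.call < entry[:expires_at])
    return entry[:value] if fresh

    value = yield
    expires_at = expiry > 0 ? @clock.call + expiry : nil
    @store[key] = { :value => value, :expires_at => expires_at }
    value
  end
end

cache = TinyCache.new
runs  = 0
2.times { cache.fetch('cachekey', 3600) { runs += 1; 'expensive result' } }
# the block ran once; the second fetch was served from the cache
```

Note the expiry is a plain positional argument in seconds (3600 here, since 1.hour needs ActiveSupport), mirroring the memcache-client style rather than the :expires_in hash.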

The block can hold any ruby code you want.  It has access to any variables etc. normally accessible at this point.  The very last thing in that block is returned.  Great!  But something very important:

The returned object (and if it's a hash, array, or other enumerable, all objects within it) MUST be serializable.  If not, you're going to get some crazy ass error messages that say nothing about being serializable, and will give you headaches.  If something is not serializable, Marshalling will error out without telling you exactly what's happening.  You can write your own serialization for special objects to get around this, or save things in a custom hash if you don't really care about the entire object.  That custom hash would just hold all the values you need outside of the fetch block, and that hash can be returned.  Here's an example of the error I got from Passenger when trying to hit an action that returned an unserializable object:

[Sun May 03 23:14:57 2009] [error] [client 76.239.166.13] Premature end of script headers: amazon, referer: [ADDRESS_REMOVED_FOR_THIS_BLOG_POST]
[ pid=5227 file=ext/apache2/Hooks.cpp:546 time=2009-05-03 23:14:57.657 ]:
  Backend process 5255 did not return a valid HTTP response. It returned no data.
/opt/ruby-enterprise-1.8.6-20090201/lib/ruby/gems/1.8/gems/memcache-client-1.7.2/lib/memcache.rb:335: [BUG] non-initialized struct

If you see the above, immediately check the objects returning from the fetch block.
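A quick way to check an object before it goes into the cache is to try dumping it yourself.  This sketch only catches the ordinary case where Marshal raises a TypeError; the crash above came from deeper in the interpreter, so treat it as a first check rather than a guarantee.  The helper name and sample data are made up.

```ruby
# True if Ruby's Marshal can serialize the object (and everything inside it).
def marshalable?(object)
  Marshal.dump(object)
  true
rescue TypeError
  false
end

plain  = { :title => 'Some item', :price => 12.5 }                 # hypothetical data
tricky = { :title => 'Some item', :callback => lambda { :nope } }  # lambdas can't be dumped

marshalable?(plain)   # true
marshalable?(tricky)  # false

# The "custom hash" approach: copy out only the plain values you need.
safe = { :title => tricky[:title] }
```

Returning something like safe from the fetch block sidesteps the problem entirely, at the cost of not having the full object on a cache hit.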

That's all for now, hope that was helpful for those with passenger, ree, and the memcache-client gem!

Big thanks to all the help from Mike Perham, who maintains the memcache-client gem (amongst other amazing feats), Michael Simons who wrote about the Passenger issues with more emphasis with solutions when using memcache-client gem, and all the folks at memcached google group and phusion passenger google group.  And big help from joe who helped troubleshoot all the code the whole time and dealt with all my crap.

Tagged  //   code   memcached   rails  

Rails cache store class and time-based expiry support with :expires_in option

Cache Store Class

Rails 2.x has an abstract cache store class, which is great to use for caching queries in the controller, but there are a few big gotchas that you'll need to figure out.  The fine print of the docs says little, with a lot of links.  The basics of it are that you need to worry about your cache store implementation.  The docs recommend MemCacheStore.  I'm not sure what else you could use out of the box.

MemCacheStore uses memcached as cache storage, and is required in order to use :expires_in.

:expires_in

So :expires_in won't work unless you specify in your settings that you're using the MemCacheStore implementation (or something similar) because MemCacheStore supports the :expires_in option with the write commands.  Otherwise your cache will not expire over time.  It's probably a better idea to use memcached and MemCacheStore on production as it's probably the best solution currently, than to write something to the database that saves off cache times and such.

If anyone has suggestions for better solutions than MemCacheStore, please post!  This is the only solution I've found looking through the docs.


Tagged  //   code   rails  

Rails and Twitter Signin

For a while I've been looking for a nice Twitter solution for Rails.  Sure I could've built something on my own, but I have been mostly looking out of curiosity.  The usual suspects were not the greatest, and I couldn't find a lightweight and elegant solution.  Then came TwitterAuth, a plugin written by Michael Bleigh.  It's made for Rails 2.3, mostly because of the Rails engine use, which is pretty slick and a whole other discussion altogether.

The fun part of TwitterAuth is that it uses oauth but is heavily influenced by the restful_authentication that the rails community has adopted as a very standard / solid way to do user authenticated accounts.  What does that mean?  That means it uses controller extensions like "logged_in?" and "current_user", so if you already use restful_authentication, this makes total sense.

Install TwitterAuth as a gem, or as a plugin.  Remember: you need the oauth gem installed as well, which is taken care of automatically with the gem install, but must be installed separately with the plugin install method.

To quickly get into authenticating your users, go to the gem or plugin directory, and check out the app directory that comes with it.  In there you'll see a user.rb model, a sessions_controller.rb, and some view partials!  This is exactly what you'll need, if not the only things you'll need to get your users immediately working with Twitter OAuth.  No need to write these yourself; grab these from his examples, and modify as needed.  Out of the box they worked perfectly for me.

Don't forget to get your consumer key and secret from Twitter.  Remember that if you send direct messages and stuff to twitter, they will come from the user you apply for the twitter key / secret with.  Meaning, if you use your FooBar twitter account to sign up for the Twitter API key / secret, all direct messages will come from FooBar.  I have yet to figure out if we can send them from one specific person who we've authenticated in the past.  Should be easy.

To get the key and secret, you'll need to go to: http://twitter.com/apps.  This link is so buried, it took me forever to find.  That and I hadn't had coffee all day and I was on my 15th hour of working for the day.

Lastly, the OAuth callback is a bit tricky, because if you're working on localhost as a developer, it won't be able to... well, call back, unless you can give it a visible IP.  Without getting into tricks and sorcery, I just gave it a fake callback, then copied the parameters from the GET callback request and appended them to http://localhost:3000/oauth_callback.  UPDATE: In the API Changeset of April 23rd, 2009, the oauth_callback is deprecated due to security issues, so no more localhost callback.  UPDATE: cleverness of sorcery is actually attributed to Joe.
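For what it's worth, the copy-and-paste step (from before the change mentioned in the update) amounts to moving the query string from the fake callback URL onto the local one.  A tiny sketch, where the helper name and the example.com URL are made up:

```ruby
require 'uri'

# Copy the query string from the URL Twitter redirected to onto the local callback.
def local_callback(redirected_url, local = 'http://localhost:3000/oauth_callback')
  query = URI.parse(redirected_url).query
  query ? "#{local}?#{query}" : local
end

local_callback('http://example.com/fake_callback?oauth_token=abc123')
```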

Anyway, hope that was fun, go and authenticate yourself like crazy with Twitter =)

Tagged  //   code   rails   twitter  

Rails and processing inbound emails

The TMail library is now included in the latest Rails, and handles inbound emails amazingly well.  No need to go into parsing MIME/SMIME for different email clients anymore, this does it all for you.  I found a lot of resources through searches on the subject, but they were slightly outdated and a bit confusing.  The quick and dirty of it is:

  1. Create a mailer
    • ruby script/generate mailer SomeNameMail
  2. Define a receive method that accepts a TMail object.  The email parameter is the TMail object in the example below:
    • def receive(email)
      ... # handle mail here (see next step) ...
      end

  3. Retrieve to, from, subject, attachments, and more with very simple commands.  Note that email.to and email.from return arrays in case there's more than one recipient or sender, so make sure you grab all of them or just the first one:
    • @to = email.to.first
      @from = email.from.first

  4. And to get the body of the email with the 'text/html' content-type (meaning it comes with all the nice html tags) you need to do a little extra below:
    • @body = body_html(email)
      ...
      def body_html(email)
        result = nil
        if email.multipart?
          email.parts.each do |part|
            if part.multipart?
              part.parts.each do |part2|
                result = part2.unquoted_body if part2.content_type =~ /html/i
              end
            elsif !email.attachment?(part)
              result = part.unquoted_body if part.content_type =~ /html/i
            end
          end
        else
          result = email.unquoted_body if email.content_type =~ /html/i
        end
        result = email.body if result.nil?
        return result.strip
      end
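The method above only looks two levels deep.  Some mail clients nest parts deeper than that, in which case a recursive walk may be worth having.  The sketch below uses a made-up Part struct as a stand-in so it runs without TMail; with real mail you would swap in the TMail calls used above (parts, multipart?, unquoted_body).  It also returns the first html part it finds, where the original keeps the last.

```ruby
# Stand-in for a mail part; real TMail objects are richer than this.
Part = Struct.new(:content_type, :body, :parts) do
  def multipart?
    !parts.nil? && !parts.empty?
  end
end

# Depth-first search for the first part whose content type mentions html.
def find_html(part)
  if part.multipart?
    part.parts.each do |child|
      found = find_html(child)
      return found if found
    end
    nil
  elsif part.content_type =~ /html/i
    part.body
  end
end

# Hypothetical three-level message.
mail = Part.new('multipart/mixed', nil, [
  Part.new('multipart/related', nil, [
    Part.new('multipart/alternative', nil, [
      Part.new('text/plain', 'hi', nil),
      Part.new('text/html', '<p>hi</p>', nil)
    ])
  ]),
  Part.new('image/png', '...', nil)
])
html = find_html(mail)
```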

TMail RDOC: http://tmail.rubyforge.org/rdoc/index.html

Getting mail into the receive action is another story.  Read #46 in the Advanced Rails Recipes book for more information on how to do this.  The basics of it are to run a daemon that fetches mail from an inbox and feeds it as TMail to your mailer.  The mail_fetcher script mentioned in Advanced Rails Recipes does a good job of it.

Attachments are also very simple: email.attachments returns the array of attachments, which you can then save off to something like Paperclip or filecolumn.

Cheers!

Tagged  //   code   rails  

Running unit test directly through command line in Rails 2

I was testing out some mail-receiving stuff and ran into a problem just running the script using:

ruby test/unit/mail_receive_test.rb

test/unit/mail_receive_test.rb:1:in `require': no such file to load -- test_helper (LoadError)
    from test/unit/mail_receive_test.rb:1

This was because I was not specifying the test directory in the load path, so simply pass in the test directory with the -I flag, and you'll be golden:

ruby -Itest test/unit/mail_receive_test.rb

This was all run while in the rails root directory.
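What -Itest does is put the test directory at the front of Ruby's load path, which is where require looks for test_helper.  The sketch below reproduces the before and after with a throwaway directory standing in for test/; the helper and file names are made up for the demo.

```ruby
require 'tmpdir'

# Put +dir+ at the front of Ruby's load path, which is what `-I dir` does.
def prepend_load_path(dir)
  full = File.expand_path(dir)
  $LOAD_PATH.unshift(full) unless $LOAD_PATH.include?(full)
  full
end

# Demo with a throwaway directory standing in for test/ (hypothetical file name).
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'demo_test_helper.rb'), "DEMO_HELPER_LOADED = true\n")

  begin
    require 'demo_test_helper'          # fails: the directory isn't on the load path yet
  rescue LoadError => e
    puts "before: #{e.class}"
  end

  prepend_load_path(dir)
  require 'demo_test_helper'            # now it is found
  puts 'after: loaded' if defined?(DEMO_HELPER_LOADED)
end
```

The same unshift could go at the top of the test file itself, but passing -Itest on the command line keeps the test files untouched.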

Tagged  //   code   ruby  

Connection presence with Orbited

Orbited is a comet technology that lets you hook up a message broker and protocol to manage messages and any other such features you may have.  Orbited currently ships with MorbidQ and STOMP as the default broker and protocol, but if you're looking for more control, as in a chatroom where you need to know how many people are in a certain "room", a different protocol is needed.  I'm going to be looking into plugging an XMPP protocol into Orbited to replace the current one we're using, which is ActiveMQ with STOMP.  I'll update with more information on that soon.

Here are some resources to learn more about comet, orbited, and such:


Orbited main page (sometimes goes down) and google group:

http://orbited.org/

http://groups.google.com/group/orbited-users


About the Comet neologism:

http://en.wikipedia.org/wiki/Comet_(programming)


Protocol information for Orbited:

http://orbited.org/wiki/Protocols


Tagged  //   code   rails