Monitoring the AWS-S3 gem in RPM: Custom Instrumentation Part 2

Posted in Tech Topics, 6 July 2010

I recently blogged about the rpm_contrib gem, which has custom instrumentation for Camping, Paperclip, MongoDB (via MongoMapper or Mongoid), Resque and Redis, all contributed by expert RPM users. Now let’s look at how easy it can be to add custom instrumentation for a Ruby gem so many of us depend on.


One of the most popular Ruby gems that we’ve seen our customers using over the last two years is the AWS-S3 gem. Written by Marcel Molina starting in late 2006, the AWS-S3 gem is a Ruby library for the REST API of Amazon’s Simple Storage Service (S3). Amazon describes S3 as “storage for the Internet.” Marcel’s library makes accessing S3 incredibly simple, and the code is really quite elegant. I should point out that you may have to patch the gem to get access to EU buckets, since it appears that the official gem is not actively maintained.

While using the AWS-S3 gem is certainly a convenient way to access S3, using a web service to interact with your static assets can be a sore spot for application performance. By instrumenting the AWS-S3 gem, we gain great visibility into how we interact with S3 and get a clearer picture of how that service is performing within our environment.

A Step-by-step Guide to Building Custom Instrumentation for the AWS-S3 Gem

The rpm_contrib gem, like the rpm ruby agent itself, is hosted on github. To contribute to the gem, just fork the project on github, add the instrumentation for the library you need and send a pull request.  Making your contribution on a topic branch and not changing the Rakefile, version or changelog makes it really easy for us to accept your changes and get your code released to the world that much faster.

Setup

  • Fork rpm_contrib on github
  • Clone your fork to your development machine
  • On your local repository, create a topic branch specific to this instrumentation. Example: git branch aws-s3; git checkout aws-s3
  • Find an app to test with that’s using RPM.  If you don’t have one handy, you can always sign up for a free account.

What do you want to instrument today?

First, let’s look through the AWS-S3 API and decide what should be instrumented.  We’ve got Buckets and S3Objects all over the place, so we should certainly be instrumenting those.  I see that we’ve got lots of places where we can establish a connection with the S3 service, too.  I wonder how often that happens in a typical app?  Let’s make sure to instrument that too.

A few method tracers, please

Instrumenting methods with RPM in Ruby is pretty straightforward.  In many cases, all we need to do is sprinkle in an add_method_tracer call in the context of the method we want to instrument and the RPM agent does the rest.  What can be tricky is figuring out how to get that add_method_tracer call in the right scope.  Let’s take a look at the AWS-S3 instrumentation piece by piece.

Instrumenting Connections to S3

The establish_connection! method exists in a module that is mixed into several classes.  Let’s make sure we’re catching all connections, not just those that are made through one of the classes that include this module.  To pull this off we can use Module#module_eval to get our instrumentation in there.

::AWS::S3::Connection::Management::ClassMethods.module_eval do
  add_method_tracer :establish_connection!, 'AWS-S3/establish_connection!'
end
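
To see why instrumenting the module itself catches every class, here is a minimal sketch with hypothetical stand-ins (ConnectionManagement, Bucket, S3Object and CALLS are made up for illustration and are not the gem's real code). It wraps a method by hand rather than calling add_method_tracer, so it runs without the agent installed:

```ruby
# Hypothetical stand-ins; these only mimic the shape of a module that is
# mixed into several classes.
module ConnectionManagement
  def establish_connection!
    :connected
  end
end

class Bucket;   extend ConnectionManagement; end
class S3Object; extend ConnectionManagement; end

CALLS = []

# Wrap the method once, inside the module itself.
ConnectionManagement.module_eval do
  alias_method :establish_connection_without_trace!, :establish_connection!

  define_method(:establish_connection!) do |*args, &block|
    CALLS << name # self is whichever class was extended; record its name
    establish_connection_without_trace!(*args, &block)
  end
end

Bucket.establish_connection!
S3Object.establish_connection!
p CALLS
```

Both classes pick up the wrapped method because Ruby looks it up in the shared module at call time, so one change in the module covers every class that mixes it in.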

Before we go any further, let’s take a quick look at the add_method_tracer method. add_method_tracer takes a method name, a metric name and some options. Since custom views in RPM can use regular expressions to grab several metrics to display in a single graph, consider a naming convention that makes building custom dashboards easier. If you’re instrumenting a library that performs database-like functions, you should prefix your metric names with Database, so they’ll show up in the database tab of RPM. See the MongoMapper instrumentation for an example of this.
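
If you're curious what a method tracer does conceptually, here is a toy version. To be clear, this is not the RPM agent's implementation: ToyTracer, add_toy_tracer, METRICS and FakeStore are all hypothetical names, and the toy takes no options. It simply wraps a method, records how long each call took under a metric name, and then shows how a shared prefix lets one regular expression pick out a family of metrics:

```ruby
# A toy stand-in for a method tracer. NOT the RPM agent's implementation;
# every name below is hypothetical.
METRICS = Hash.new { |hash, key| hash[key] = [] }

module ToyTracer
  # Wraps an existing instance method so that each call records its
  # duration (in seconds) under metric_name.
  def add_toy_tracer(method_name, metric_name)
    original = instance_method(method_name)
    define_method(method_name) do |*args, &block|
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      begin
        original.bind(self).call(*args, &block)
      ensure
        METRICS[metric_name] << Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      end
    end
  end
end

class FakeStore
  extend ToyTracer

  def fetch(key)
    "value for #{key}"
  end

  def save(key, value)
    true
  end

  add_toy_tracer :fetch, 'FakeStore/fetch'
  add_toy_tracer :save,  'FakeStore/save'
end

store = FakeStore.new
store.fetch('a')
store.fetch('b')
store.save('c', 1)

# A shared prefix lets one regular expression select a family of metrics.
matching = METRICS.keys.grep(%r{\AFakeStore/}).sort
p matching
```

The point of the sketch is the naming: because both metrics share the FakeStore/ prefix, a single pattern is enough to gather them.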

Instrumenting Bucket class methods

::AWS::S3::Bucket.instance_eval do
  class << self
    add_method_tracer :create,  'AWS-S3/Bucket/create'
    add_method_tracer :find,    'AWS-S3/Bucket/find'
    add_method_tracer :objects, 'AWS-S3/Bucket/objects'
    add_method_tracer :delete,  'AWS-S3/Bucket/delete'
    add_method_tracer :list,    'AWS-S3/Bucket/list'
  end
end

instance_eval? I thought we were instrumenting the Bucket class? Yep! Classes in Ruby are objects. The Bucket class is itself an instance of the Class class, and the class methods we’d like to instrument live on that one object. Opening class << self inside the instance_eval block puts us in the context of that class instance, where we can add custom instrumentation for each class method.
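
Here's a small sketch of that idea using a hypothetical Crate class (not part of AWS-S3). It hand-wraps a class method instead of calling add_method_tracer, so it runs on its own:

```ruby
# Hypothetical class used only for illustration.
class Crate
  def self.find(name)
    "found #{name}"
  end
end

puts Crate.class # a class is an object whose class is Class

Crate.instance_eval do
  class << self
    # Here self is Crate's singleton class, which is where class methods live.
    alias_method :find_without_trace, :find

    def find(name)
      "traced: " + find_without_trace(name)
    end
  end
end

puts Crate.find('logs')
```

Anything defined or aliased inside class << self lands on the class object itself, which is why the wrapper above replaces Crate.find rather than an instance method.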

Instrumenting Bucket instance methods

::AWS::S3::Bucket.class_eval do
  add_method_tracer :[],        'AWS-S3/Bucket/#{self.name}/[]'
  add_method_tracer :new_object,'AWS-S3/Bucket/#{self.name}/new_objects'
  add_method_tracer :objects,   'AWS-S3/Bucket/#{self.name}/objects'
  add_method_tracer :delete,    'AWS-S3/Bucket/#{self.name}/delete'
  add_method_tracer :delete_all,'AWS-S3/Bucket/#{self.name}/delete_all'
  add_method_tracer :update,    'AWS-S3/Bucket/#{self.name}/update'
end

Here I’m using class_eval to add tracers to the instance methods of the Bucket class. class_eval evaluates the block in the context of the class, so these add_method_tracer calls behave just as they would if they were written into the original class file, and they apply to instance methods. If these eval methods still seem strange, check out Brian Morearty’s nice and simple blog post that demonstrates the differences between them. While we could just re-open the Bucket class and add our tracers directly, I find these eval methods to be more explicit.
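
A quick way to feel the difference between the two is to define a method inside each kind of block. Widget below is a hypothetical class, made up just for this comparison:

```ruby
class Widget; end # hypothetical class for illustration

Widget.class_eval do
  def label # becomes an instance method
    'instance-level'
  end
end

Widget.instance_eval do
  def label # becomes a singleton (class) method
    'class-level'
  end
end

puts Widget.new.label # instance-level
puts Widget.label     # class-level
```

Same method name, two different homes: class_eval behaves like re-opening the class body, while instance_eval treats the class as just another object.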

What’s up with those single quotes?

We know that in Ruby double quotes allow for interpolation, while single quotes do not. So why am I embedding a variable, #{self.name}, into a string with single quotes? At the time these strings are initially encountered by the interpreter, a reference to self would refer to the Bucket class. I want that variable to refer to the ‘name’ method on a Bucket instance. These strings are evaluated at call time, allowing us to use variables that are available on bucket instances. In this case, we’ll get a series of metrics for each bucket you interact with in your system.
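
Here is one way this kind of deferred evaluation can work, sketched with a hypothetical Shelf class. I'm not claiming this is exactly how the agent does it; the sketch just shows that a single-quoted template can be spliced into generated code and interpolated later, when self is an instance:

```ruby
class Shelf # hypothetical class for illustration
  attr_reader :name, :seen

  def initialize(name)
    @name = name
    @seen = []
  end

  def items
    []
  end
end

# Single quotes: Ruby leaves #{self.name} as literal text for now.
metric_template = 'Shelf/#{self.name}/items'

# Splice the template into generated method source, inside double quotes.
# The inner interpolation runs each time the generated method is called,
# at which point self is a Shelf instance.
Shelf.class_eval <<-CODE
  alias_method :items_without_trace, :items

  def items
    @seen << "#{metric_template}"
    items_without_trace
  end
CODE

shelf = Shelf.new('top')
shelf.items

puts metric_template  # Shelf/#{self.name}/items
puts shelf.seen.first # Shelf/top/items
```

Had the template been written with double quotes, Ruby would have tried to interpolate self.name immediately, in whatever context the string literal appeared.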

Instrumenting S3Objects

Since the concepts are the same, I’ll just show the code here for instrumenting the S3Object class itself, and the instances of that class:

::AWS::S3::S3Object.instance_eval do
  class << self
    add_method_tracer :about,   'AWS-S3/S3Object/about'
    add_method_tracer :copy,    'AWS-S3/S3Object/copy'
    add_method_tracer :delete,  'AWS-S3/S3Object/delete'
    add_method_tracer :rename,  'AWS-S3/S3Object/rename'
    add_method_tracer :store,   'AWS-S3/S3Object/store'
  end
end

# Instrument methods on S3Object instances
# Metric names are aggregated across all S3Objects since having a metric for
# every single S3Object instance and method pair would be fairly useless
::AWS::S3::S3Object.class_eval do
  add_method_tracer :value,     'AWS-S3/S3Objects/value'
  add_method_tracer :about,     'AWS-S3/S3Objects/about'
  add_method_tracer :metadata,  'AWS-S3/S3Objects/metadata'
  add_method_tracer :store,     'AWS-S3/S3Objects/store'
  add_method_tracer :delete,    'AWS-S3/S3Objects/delete'
  add_method_tracer :copy,      'AWS-S3/S3Objects/copy'
  add_method_tracer :rename,    'AWS-S3/S3Objects/rename'
  add_method_tracer :etag,      'AWS-S3/S3Objects/etag'
  add_method_tracer :owner,     'AWS-S3/S3Objects/owner'
end

Check out the full source of this instrumentation on the rpm_contrib repository on github.

Ship it!

  • Push your branch back up to your github repository. Example: git push origin aws-s3
  • Send a pull request on github to have your instrumentation integrated with the rpm_contrib gem
  • We release new versions of the gem frequently, to ensure everyone gets quick access to all the great user contributions

That was easy!

In many cases, writing custom instrumentation can be as easy as figuring out what you want to instrument and adding a call to add_method_tracer in the context of that method. In around 42 lines of code we now have instrumentation for the AWS-S3 gem in New Relic RPM! If there are methods I missed that you’d like to see instrumented, please add them! Want to see more gems instrumented with RPM? Write them! The rpm_contrib gem is open source and hosted on github, as is the New Relic Ruby agent. If you have any questions on writing custom instrumentation, feel free to contact me via email or on github.

Next Up

Custom instrumentation in RPM is pretty awesome. Far too awesome for just one post. Here’s what’s coming up next:

  • Writing custom dashboards in RPM that can display any metric collected from your app
  • A how-to guide on writing custom instrumentation in Java

About the author

brian@emphaticsolutions.com, Marketing at Github
