Mutate your Rack middleware's env!

Originally appeared on theScore's blog.

If you've been a Ruby developer for a while, chances are that you've written some kind of Rack middleware. Rack is a pretty big part of web development with Rails, and here at theScore we've developed a few small pieces of middleware that attach extra data to our requests so that our Rails applications can use it.

If you're not familiar with Rack or Rack middleware, I recommend this article on Rack and this Stack Overflow question on middleware.

Background and Problem

While integrating Yeller, we ran into a bug in our middleware. It surfaced because Yeller looks for Rails-specific data attached to the request's env hash when an error is raised. In this case, during a request, ActionController::Metal sets action_controller.instance on the env hash:

def dispatch(name, request) #:nodoc:
  @_request = request
  @_env = request.env
  @_env['action_controller.instance'] = self
  process(name)
  to_a
end

(source pinned to Rails 4.1.4)

In Yeller, the error reporting code extracts that information to give us a better idea of what controller and action was involved when the error occurred:

def render_exception_with_yeller(env, exception)
  # ...
  controller = env['action_controller.instance']
  # ...
  if controller
    # ...
    location = "#{controller.class.to_s}##{params[:action]}"
  else
    # code without location information available
  end
end

(source pinned to yeller_ruby 0.2.2)

When we ran Yeller's verification Rake task, our errors were being sent to Yeller's servers, but the task also told us that we were missing the Rails-specific information (the location, for example). So we scratched our heads a bit. Error reporting worked in new Rails applications, so it had to be something in our code.

I started by removing our custom middleware; we had three of them and I removed all of them. The Rake task succeeded! So I added them back one by one until the Rake task failed again. At this point, I had narrowed the code down to about 3 lines, and they looked something like this:

def call(env)
  api_version = env['HTTP_X_API_VERSION'].presence || default_api_version
  api_version = api_version.to_s.split(',').first # Handle bad API versions with commas in them
  @app.call(env.merge('the_score.api_version' => ApiVersionDecorator.new(api_version)))
end

The Solution

After showing Yeller's team the middleware, they saw the problem! env.merge returns a new hash. The 'gotcha' in this case is this:

The env hash outlives the call to @app.call: every middleware earlier in the stack still holds a reference to the hash it was given and may read from it after @app.call returns. Because of this, @app.call should be invoked with the same env hash that was passed into the method.

Middleware that rely on later middleware (or the app) to set information on the env hash (e.g. action_controller.instance) also rely on every middleware in between passing that same env hash along. If one of them passes a different hash, here's what happens:

The new env is only visible from that point forward in the middleware stack, so anything written to it downstream never reaches the middleware above
Memory is needlessly allocated because every request now carries a second (shallow) copy of its env
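
To make the first point concrete, here is a minimal sketch in plain Ruby (no Rack or Rails required). All class names are hypothetical, and the innermost lambda only stands in for Rails setting action_controller.instance; it is an illustration of the failure mode under those assumptions, not the actual Yeller or Rails code.

```ruby
# Innermost "app": stands in for Rails setting a key on env during dispatch.
# The value is a placeholder string, not a real controller.
INNER_APP = lambda do |env|
  env['action_controller.instance'] = 'controller'
  [200, {}, ['ok']]
end

# Hypothetical middleware that passes a *copy* of env downstream (the bug).
class CopyingMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env.merge('the_score.api_version' => '1'))
  end
end

# Hypothetical middleware that mutates env in place (the fix).
class MutatingMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    env['the_score.api_version'] = '1'
    @app.call(env)
  end
end

# Hypothetical outer middleware (think: an error reporter) that reads env
# after the rest of the stack has run.
class Reporter
  attr_reader :seen_controller

  def initialize(app)
    @app = app
  end

  def call(env)
    response = @app.call(env)
    @seen_controller = env['action_controller.instance']
    response
  end
end

broken = Reporter.new(CopyingMiddleware.new(INNER_APP))
broken.call({})

fixed = Reporter.new(MutatingMiddleware.new(INNER_APP))
fixed.call({})

puts broken.seen_controller.inspect # nil
puts fixed.seen_controller.inspect  # "controller"
```

With the copying middleware in the stack, the key is written to the copy and the reporter's own env never sees it, which matches the missing location information described above.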

The fix, and the proper way to do this, is to mutate the hash in place through Hash#[]=, Hash#merge!, Hash#store, and so forth. These methods modify the receiver and do not create a new hash. Here's the fixed code:

def call(env)
  api_version = env['HTTP_X_API_VERSION'].presence || default_api_version
  api_version = api_version.to_s.split(',').first # Handle bad API versions with commas in them
  @app.call(env.merge!('the_score.api_version' => ApiVersionDecorator.new(api_version)))
end
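
The difference between the copying and mutating methods can be checked in a console with Object#equal?, which tests object identity. The hash contents below are placeholders chosen for illustration.

```ruby
env = { 'REQUEST_METHOD' => 'GET' }

copy = env.merge('a' => 1)   # returns a new Hash; env is untouched
same = env.merge!('b' => 2)  # mutates env and returns it
env['c'] = 3                 # Hash#[]= mutates in place
env.store('d', 4)            # Hash#store is an alias for Hash#[]=

puts copy.equal?(env) # false
puts same.equal?(env) # true
puts env.keys.inspect # ["REQUEST_METHOD", "b", "c", "d"]
puts copy.key?('b')   # false
```

Since the mutating methods all leave you with the same object, the fixed middleware could equally assign with env['the_score.api_version'] = ... on one line and call @app.call(env) on the next, which some readers may find easier to scan.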

If you want to see the problem happening in your console with a 'real' Rack request (OK, not really, it uses Rack::MockRequest in the tests), you can clone this project that I've created, bundle install, and run rspec.

Bonus: How to Test It

We had a suite of RSpec tests around these simple pieces of middleware, so we added something like this to every test:

class MyMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  end
end

describe MyMiddleware do
  let(:app) { double(:app) }
  let(:env) { double(:env) }

  subject { MyMiddleware.new(app) }

  it "calls app with the same env hash" do
    expect(app).to receive(:call).with(env)
    subject.call(env)
  end
end
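
The spec above uses a double for env, so any other object fails the expectation. If you would rather test with a real Hash, note that (as far as I understand RSpec's argument matching) with compares arguments by equality rather than identity, so a copy with the same contents could slip through. A minimal plain-Ruby sketch of an explicit identity check follows; PassThrough and RecordingApp are hypothetical names used only for this example.

```ruby
# Hypothetical pass-through middleware under test.
class PassThrough
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  end
end

# Hypothetical fake app that remembers the exact env object it was called with.
class RecordingApp
  attr_reader :received_env

  def call(env)
    @received_env = env
    [200, {}, []]
  end
end

app = RecordingApp.new
env = { 'PATH_INFO' => '/' }

PassThrough.new(app).call(env)

# equal? checks object identity, not just matching contents.
puts app.received_env.equal?(env) # true
```

In an RSpec suite, the same identity check can be written with the equal (or be) matcher against the recorded env.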

Conclusion

Rack is fairly straightforward, but there definitely are some 'gotchas' that you can run into. This is just one example, and there are a few more that you need to consider when creating Rack middleware. In a future post, I'll explain another 'gotcha' with regard to multithreading and Rack middleware.

Another interesting point is that this is a 'bug' that has been in our code base for a little over two years without being noticed. It just goes to show that every once in a while, you should take a look at the pieces of middleware that your application depends on: they may be misbehaving like ours was!
