Overview

Chef handlers identify situations that may arise during a chef-client run and tell the chef-client how to handle them. There are three types of handlers in Chef:

  • exception – An exception handler identifies situations that have caused the chef-client run to fail. This type of handler can send email notifications about the failure or take the actions needed to prevent cascading failures.
  • report – A report handler runs when a chef-client run succeeds and reports back on details of that run.
  • start – A start handler runs events at the beginning of the chef-client run.
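
To make the ordering of the three handler types concrete, here is a minimal plain-Ruby sketch. `MiniRun` is a hypothetical class written only for illustration; it is a simplified model and not how Chef itself implements handlers.

```ruby
# Simplified model of a chef-client run. MiniRun is a hypothetical class for
# illustration only; it is not how Chef implements handlers.
class MiniRun
  attr_reader :log

  def initialize
    @handlers = { start: [], report: [], exception: [] }
    @log = []
  end

  # Register a block as a :start, :report, or :exception handler.
  def register(type, &block)
    @handlers.fetch(type) << block
  end

  # Fire start handlers, run the block, then fire report handlers on success
  # or exception handlers on failure. Returns true if the run succeeded.
  def converge
    fire(:start)
    begin
      yield
    rescue StandardError => e
      fire(:exception, e)
      return false
    end
    fire(:report)
    true
  end

  private

  # Call every handler of the given type and record what it returns.
  def fire(type, error = nil)
    @handlers[type].each { |handler| @log << handler.call(error) }
  end
end

run = MiniRun.new
run.register(:start)     { |_| 'start' }
run.register(:report)    { |_| 'report' }
run.register(:exception) { |e| "exception: #{e.message}" }

# A failing run fires the start handler and then the exception handler;
# the report handler is skipped.
run.converge { raise 'backup failed' }
puts run.log.inspect
```

In this model, a run whose block does not raise would log the start and report handlers instead.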

In this post, we will see how to write a custom exception handler so that a failed chef run does not cause serious side effects.

Background

When I was working on the rs-storage cookbook, there was a situation where we freeze the filesystem during a backup. I wanted to make sure the filesystem freeze has no adverse effects on the applications that may use the filesystem. The sequence of actions for taking a backup is: freeze the filesystem, take a backup using the RightScale API, and then unfreeze the filesystem. What if the backup operation fails? Will it leave the filesystem in a frozen state? Definitely yes, and we don’t want that to happen. The filesystem should be unfrozen regardless of whether the backup succeeded or failed.
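
In plain Ruby, this "unfreeze no matter what" guarantee is what an `ensure` clause provides. The sketch below shows the intended behavior outside of Chef; `FakeFilesystem` and `backup_with_freeze` are hypothetical names used only for illustration. A recipe cannot simply wrap its resources this way, because the resources are converged after the recipe code has been evaluated, which is why a handler is needed.

```ruby
# Hypothetical stand-in for a filesystem that can be frozen and unfrozen.
class FakeFilesystem
  def initialize
    @frozen = false
  end

  def freeze_fs
    @frozen = true
  end

  def unfreeze_fs
    @frozen = false
  end

  def fs_frozen?
    @frozen
  end
end

# Freeze, run the backup block, and always unfreeze afterwards,
# even when the block raises.
def backup_with_freeze(filesystem)
  filesystem.freeze_fs
  yield
ensure
  filesystem.unfreeze_fs
end

fs = FakeFilesystem.new
begin
  backup_with_freeze(fs) { raise 'backup failed' }
rescue RuntimeError => e
  puts "backup error: #{e.message}"
end
puts "frozen after failure? #{fs.fs_frozen?}"
```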

Solution

To avoid leaving the filesystem frozen after a backup failure, I decided to write an exception handler that unfreezes the filesystem when the backup fails.

The recipe looks like this:

Original Recipe
# Attributes used by the resources
node.set['rs-storage']['device']['nickname'] = 'data_storage'
node.set['rs-storage']['device']['mount_point'] = '/mnt/storage'
node.set['rs-storage']['backup']['lineage'] = 'testing'

# Freeze the filesystem
filesystem node['rs-storage']['device']['nickname'] do
  mount node['rs-storage']['device']['mount_point']
  action :freeze
end

# Take a backup
rightscale_backup node['rs-storage']['device']['nickname'] do
  lineage node['rs-storage']['backup']['lineage']
  action :create
end

# Unfreeze the filesystem
filesystem node['rs-storage']['device']['nickname'] do
  mount node['rs-storage']['device']['mount_point']
  action :unfreeze
end

The filesystem cookbook was modified to support the freeze and unfreeze actions. The rightscale_backup cookbook provides a resource/provider for handling backups on a RightScale-supported cloud.

The recipe is that simple. But if the backup resource encounters an error, the chef run will fail and leave the filesystem in a frozen state. So I wrote a small exception handler that runs when the chef run fails and runs the unfreeze action on the filesystem[data_storage] resource. Here is the handler:

The Error Handler
require 'chef/handler'

module Rightscale
  class BackupErrorHandler < Chef::Handler
    # This method gets called when the chef run fails and this handler is registered
    # to act as an exception handler.
    def report
      # The nickname (data_storage in this example) is obtained from the node
      nickname = run_context.node['rs-storage']['device']['nickname']
      # Find the filesystem resource in the resource collection
      filesystem_resource = run_context.resource_collection.lookup("filesystem[#{nickname}]")
      # Run the `unfreeze` action on the filesystem resource found
      filesystem_resource.run_action(:unfreeze)
    rescue Chef::Exceptions::ResourceNotFound
      # `lookup` raises instead of returning nil when the resource was never
      # declared; in that case there is nothing to unfreeze.
      Chef::Log.info('No filesystem resource found, nothing to unfreeze')
    end
  end
end

This handler simply calls the unfreeze action on the filesystem resource when the chef run fails. Place the handler in the files directory of your cookbook; we will deploy it later with a cookbook_file resource. To enable the handler, we will use the chef_handler cookbook.

Chef Handler Cookbook

The chef_handler cookbook published by Chef makes creating and configuring chef handlers easy. It creates the handlers directory and provides an LWRP for enabling and disabling specific handlers from within a recipe.

Here is the modified recipe, which uses chef_handler:

Modified Recipe with Chef Handler
# Include the chef_handler::default recipe which will setup the handlers directory for chef.
include_recipe 'chef_handler::default'

# Place the handler file we wrote in the handlers directory on the node
cookbook_file "#{node['chef_handler']['handler_path']}/rs-storage_backup.rb" do
  source 'backup_error_handler.rb'
  action :create
end

# Enable the exception handler for this recipe
chef_handler 'Rightscale::BackupErrorHandler' do
  source "#{node['chef_handler']['handler_path']}/rs-storage_backup.rb"
  action :enable
end

# Attributes used by the resources
node.set['rs-storage']['device']['nickname'] = 'data_storage'
node.set['rs-storage']['device']['mount_point'] = '/mnt/storage'
node.set['rs-storage']['backup']['lineage'] = 'testing'

# Freeze the filesystem
filesystem node['rs-storage']['device']['nickname'] do
  mount node['rs-storage']['device']['mount_point']
  action :freeze
end

# Take a backup
rightscale_backup node['rs-storage']['device']['nickname'] do
  lineage node['rs-storage']['backup']['lineage']
  action :create
end

# Unfreeze the filesystem
filesystem node['rs-storage']['device']['nickname'] do
  mount node['rs-storage']['device']['mount_point']
  action :unfreeze
end

Now if the chef run fails after the filesystem has been frozen, the handler runs and unfreezes the filesystem.

Further Reading

There is a proposed Chef RFC called on_failure that offers a cleaner way to handle such exceptions. The on_failure blocks can be given at the resource level instead of the recipe level, which is much nicer.
