Real-life Ansible – chapter one – how to run playbooks on production (more) safely

This is the first blog post in a series presenting real-life examples of Ansible in our projects. If you are interested in making Ansible production-ready, stay tuned!

Problem definition

We have a few inventories, like dev, stage and prod. For some of them, stability is not as important as for the others. On the other hand, our playbooks may contain several parameters, and we also want to pass some ansible-playbook parameters (e.g. tags), so a command to perform some action could be pretty long. It’s easy to overlook the crucial ones, especially for the production environment.

ansible-playbook -i inventories/dev app1.yml -e app_version=snapshot -t app --skip-tags maintenance -e serial=2

In such a situation there is a very high risk that we find a command in our shell history and run a playbook without changing the inventory. While running the wrong command on dev is not so painful, we want to prevent accidental playbook runs on production.

Confirmation on production

In the next few paragraphs I will show you our solution for making our playbooks production-aware.

Dedicated role

We decided to create a dedicated role named confirm, which is responsible for confirming our production actions. Why a dedicated role? Because we can include it in all playbooks that require confirmation. So to ensure confirmation is prompted, it’s enough to add a role to your playbook.

- hosts: app1
  gather_facts: no
  roles:
    - confirm
  tags:
    - always

We can skip the fact-gathering stage, as we do not depend on any system parameter. Most importantly, by using the always tag, we run this role always, regardless of the tags given in the command line.
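To make the effect of the always tag concrete, here is a minimal sketch of a complete playbook. The second play and the role name app1 are assumptions made for illustration; only the confirm play comes from our setup.

```yaml
# app1.yml (hypothetical layout, for illustration only)

# First play: the confirmation. Tagged `always`, so it runs for any -t selection.
- hosts: app1
  gather_facts: no
  roles:
    - confirm
  tags:
    - always

# Second play: the actual work. `app1` is an assumed role name.
- hosts: app1
  roles:
    - role: app1
      tags:
        - app
```

Running this with -t app selects the second play's role, and the confirm play still runs because of always. Note that Ansible lets you skip it explicitly with --skip-tags always, so the confirmation guards against mistakes, not against deliberate bypassing.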

Role tasks

I will show you the tasks’ evolution leading to the final version. Firstly, there might be just two simple tasks:

- name: wait for confirmation if required
  pause:
    prompt: "You are running playbook for {{ env }}, type environment name to confirm"
  register: confirm_response

- name: fail if wrong confirmation
  fail:
    msg: "Aborting due to incorrect input, given <<{{ confirm_response.user_input }}>>, expected <<{{ env }}>>"
  when: env != confirm_response.user_input

The snippet above shows a simple prompt asking the operator to type the environment name to confirm the action. This achieves our main goal: no more playbooks run mindlessly. At the prompt you have to retype the environment name (e.g. prod), so you cannot miss the fact that the playbook is about to run on production. If the response does not match, the playbook is aborted.

Running multiple playbooks at once

Sometimes we need to run multiple playbooks at once, e.g.:

ansible-playbook -i inventories/dev app1.yml app2.yml app3.yml

With the solution above, we would have to confirm our actions for each playbook separately. To avoid this drawback, let’s introduce a slight modification:

- name: wait for confirmation if required
  pause:
    prompt: "You are running playbook for {{ env }}, type environment name to confirm"
  register: confirm_response
  when: confirm_response_user_input is not defined

- set_fact:
    confirm_response_user_input: "{{ confirm_response.user_input }}"
  when: confirm_response_user_input is not defined

- name: fail if wrong confirmation
  fail:
    msg: "Aborting due to incorrect input, given <<{{ confirm_response_user_input }}>>, expected <<{{ env }}>>"
  when: env != confirm_response_user_input

Now we save the confirmation in a dedicated fact, so we only have to confirm once. Why a dedicated variable? Why can’t we use confirm_response? Because when the first task is skipped, register still overwrites confirm_response, this time with information about the skipped task instead of the previous user input.
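If you want to see this for yourself, the registered variable can be dumped in a throwaway task. The two tasks below are only an illustrative sketch and are not part of the confirm role; they assume they are placed right after the pause task.

```yaml
# Illustration only: inspect what `register` stored for the pause task.
- name: show the registered result of the pause task
  debug:
    var: confirm_response

# When the pause was skipped, the registered value carries a skip status
# and no user_input key; the built-in `skipped` test detects this.
- name: report that the prompt was skipped
  debug:
    msg: "pause was skipped, confirm_response has no user_input"
  when: confirm_response is skipped
```

Checking confirm_response is skipped would be another way to guard later tasks, but it would not preserve the earlier answer, which is why the role copies it into its own fact.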

Securing only prod

Requiring confirmation everywhere would be inconvenient, since it only matters in the prod environment. To ask for confirmation only on prod, we can wrap all tasks in a block that checks whether the environment needs securing.

- block:
    - name: wait for confirmation if required
      pause:
        prompt: "You are running playbook for {{ env }}, type environment name to confirm"
      register: confirm_response
      when: confirm_response_user_input is not defined

    - set_fact:
        confirm_response_user_input: "{{ confirm_response.user_input }}"
      when: confirm_response_user_input is not defined

    - name: fail if wrong confirmation
      fail:
        msg: "Aborting due to incorrect input, given <<{{ confirm_response_user_input }}>>, expected <<{{ env }}>>"
      when: env != confirm_response_user_input
  when: confirm_actions is defined and confirm_actions
  delegate_to: localhost
  run_once: true

Now only production has the variable defined as confirm_actions: true, and only production needs manual confirmation before running. We must delegate the whole block to localhost, because we need to set a fact, and we want to do it once for all hosts.
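The role relies on two variables, env and confirm_actions, coming from the inventory. A minimal sketch of how they could be defined is shown below. The file paths are assumptions: they presume each inventory is a directory with its own group_vars, which may differ from your layout.

```yaml
# inventories/prod/group_vars/all.yml (hypothetical path)
env: prod
confirm_actions: true
---
# inventories/dev/group_vars/all.yml (hypothetical path)
env: dev
# confirm_actions is deliberately left undefined, so the block is skipped on dev
```

Because the block checks confirm_actions is defined and confirm_actions, leaving the variable out on dev and stage is enough to skip the confirmation there.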

Summary

Using the solution given above, we achieved the following goals:

  • Confirmation is required before running a playbook on production.
  • Even if we run a few playbooks at once, only a single confirmation is required.
  • It’s easy to secure additional roles.
You May Also Like

How to use mocks in controller tests

Ever since I started to write tests for my Grails application, I couldn't find many articles on using mocks. Everyone is talking about tests and TDD, but if you search for it, there aren't many articles.

Today I want to share with you a test with mocks for a simple and complete scenario. I have a simple application that can fetch Twitter tweets and present them to the user. I use a REST service and I use GET to fetch tweets by id like this: http://api.twitter.com/1/statuses/show/236024636775735296.json. You can copy and paste it into your browser to see the result.

My application uses Grails 2.1 with spock-0.6 for tests. I have a TwitterReaderService that fetches tweets by id and then parses the response into my Tweet class.


class TwitterReaderService {
    Tweet readTweet(String id) throws TwitterError {
        try {
            String jsonBody = callTwitter(id)
            Tweet parsedTweet = parseBody(jsonBody)
            return parsedTweet
        } catch (Throwable t) {
            // Wrap any failure so callers only have to handle TwitterError
            throw new TwitterError(t)
        }
    }

    private String callTwitter(String id) {
        // TODO: implementation
    }

    private Tweet parseBody(String jsonBody) {
        // TODO: implementation
    }
}

class Tweet {
    String id
    String userId
    String username
    String text
    Date createdAt
}

// Inherit RuntimeException's constructors so that both new TwitterError()
// and new TwitterError(cause) are available
@groovy.transform.InheritConstructors
class TwitterError extends RuntimeException {}

TwitterController plays the main part here. Users call the show action along with the id of a tweet. This action is my subject under test. I've implemented only some basic functionality, which makes it easier to focus on while writing tests.


class TwitterController {
    def twitterReaderService

    def index() {
    }

    def show() {
        Tweet tweet = twitterReaderService.readTweet(params.id)
        if (tweet == null) {
            flash.message = 'Tweet not found'
            redirect(action: 'index')
            return
        }

        [tweet: tweet]
    }
}

Let's start writing a test from scratch. The most important thing here is that I use a mock for my TwitterReaderService. I do not construct a new TwitterReaderService(), because in this test I test only TwitterController. I am not interested in the injected service. I know how this service is supposed to work and I am not interested in its internals. So before every test I inject a twitterReaderServiceMock into the controller:


import grails.test.mixin.TestFor
import spock.lang.Specification

@TestFor(TwitterController)
class TwitterControllerSpec extends Specification {
    TwitterReaderService twitterReaderServiceMock = Mock(TwitterReaderService)

    def setup() {
        controller.twitterReaderService = twitterReaderServiceMock
    }
}

Now it's time to think about which scenarios I need to test. This line from TwitterReaderService is the most important:


Tweet readTweet(String id) throws TwitterError

You must think of this method as a black box right now. From the controller's point of view you know nothing about its internals. You're only interested in what it can return to you:

  • a TwitterError can be thrown
  • null can be returned
  • Tweet instance can be returned

This list is your test blueprint. Now answer a simple question for each element: "What do I want my controller to do in this situation?" and you have a test plan:

  • show action should redirect to index if TwitterError is thrown and inform about error
  • show action should redirect to index and inform if tweet is not found
  • show action should show found tweet

That was easy and straightforward! And now is the best part: we use twitterReaderServiceMock to mock each of these three scenarios!

Spock has good documentation on interactions with mocks. You declare which methods are called, how many times, with what parameters, and what should be returned. Remember the black box? A mock is your black box with detailed instructions, e.g.: I expect that if you receive exactly one call to readTweet with parameter '1', then you should throw a TwitterError. Say this sentence out loud and look at this:


1 * twitterReaderServiceMock.readTweet('1') >> { throw new TwitterError() }

This is a valid interaction definition on mock! It's that easy! Here is a complete test that fails for now:


import grails.test.mixin.TestFor
import spock.lang.Specification

@TestFor(TwitterController)
class TwitterControllerSpec extends Specification {
    TwitterReaderService twitterReaderServiceMock = Mock(TwitterReaderService)

    def setup() {
        controller.twitterReaderService = twitterReaderServiceMock
    }

    def "show should redirect to index if TwitterError is thrown"() {
        given:
        controller.params.id = '1'
        when:
        controller.show()
        then:
        1 * twitterReaderServiceMock.readTweet('1') >> { throw new TwitterError() }
        0 * _._
        flash.message == 'There was an error on fetching your tweet'
        response.redirectUrl == '/twitter/index'
    }
}

| Failure: show should redirect to index if TwitterError is thrown(pl.refaktor.twitter.TwitterControllerSpec)
| pl.refaktor.twitter.TwitterError
at pl.refaktor.twitter.TwitterControllerSpec.show should redirect to index if TwitterError is thrown_closure1(TwitterControllerSpec.groovy:29)

You may notice 0 * _._ notation. It says: I don't want any other mocks or any other methods called. Fail this test if something is called! It's a good practice to ensure that there are no more interactions than you want.

Ok, now I need to implement controller logic to handle TwitterError.


class TwitterController {

    def twitterReaderService

    def index() {
    }

    def show() {
        Tweet tweet

        try {
            tweet = twitterReaderService.readTweet(params.id)
        } catch (TwitterError e) {
            log.error(e)
            flash.message = 'There was an error on fetching your tweet'
            redirect(action: 'index')
            return
        }

        [tweet: tweet]
    }
}

My test passes! We have two scenarios left. The rule stays the same: TwitterReaderService returns something and we test against it. So this line is the heart of each test; only the value returned after >> changes:


1 * twitterReaderServiceMock.readTweet('1') >> { throw new TwitterError() }

Here is the complete test for all three scenarios and the controller that passes it.


import grails.test.mixin.TestFor
import spock.lang.Specification

@TestFor(TwitterController)
class TwitterControllerSpec extends Specification {

    TwitterReaderService twitterReaderServiceMock = Mock(TwitterReaderService)

    def setup() {
        controller.twitterReaderService = twitterReaderServiceMock
    }

    def "show should redirect to index if TwitterError is thrown"() {
        given:
        controller.params.id = '1'
        when:
        controller.show()
        then:
        1 * twitterReaderServiceMock.readTweet('1') >> { throw new TwitterError() }
        0 * _._
        flash.message == 'There was an error on fetching your tweet'
        response.redirectUrl == '/twitter/index'
    }

    def "show should inform about not found tweet"() {
        given:
        controller.params.id = '1'
        when:
        controller.show()
        then:
        1 * twitterReaderServiceMock.readTweet('1') >> null
        0 * _._
        flash.message == 'Tweet not found'
        response.redirectUrl == '/twitter/index'
    }

    def "show should show found tweet"() {
        given:
        controller.params.id = '1'
        when:
        controller.show()
        then:
        1 * twitterReaderServiceMock.readTweet('1') >> new Tweet()
        0 * _._
        flash.message == null
        response.status == 200
    }
}

class TwitterController {

    def twitterReaderService

    def index() {
    }

    def show() {
        Tweet tweet

        try {
            tweet = twitterReaderService.readTweet(params.id)
        } catch (TwitterError e) {
            log.error(e)
            flash.message = 'There was an error on fetching your tweet'
            redirect(action: 'index')
            return
        }

        if (tweet == null) {
            flash.message = 'Tweet not found'
            redirect(action: 'index')
            return
        }

        [tweet: tweet]
    }
}

The most important thing here is that we've tested the controller-service interaction without implementing any logic in the service! That's why the mock technique is so useful. It decouples your dependencies and lets you focus on exactly one subject under test. Happy testing!