Ansible is a widely popular IT automation tool for people who manage, configure, and deploy infrastructure and applications. Now owned by Red Hat, Ansible has long boasted ease of use; in fact, one of the main messages in its marketing is that “complexity kills productivity,” and I couldn’t agree more.

In terms of simplicity, one of my favorite things about using Ansible is that the code you write for it is perfect for modularization and reusability—it’s so much more than just a collection of shell scripts.

In the hopes that they will help you reduce complexity, I’d like to share four of my personal best practices for getting the most out of Ansible:

  1. Use roles to group related tasks
  2. Use Ansible Galaxy to find and share roles
  3. Don’t use ignore_errors
  4. Be pragmatic, not clever

1. Use roles to group related tasks

In Ansible, roles allow you to group related tasks and all their variables and dependencies into a single, self-contained, portable entity. Grouping your tasks into roles is one of the best ways to maximize the power of Ansible’s modularity and reusability, as organizing things into roles lets you reuse common configuration steps between different types of servers.

What’s in a role?

Roles provide a standardized file-and-directory structure that lets Ansible automatically load variables, tasks, handlers, and default values into your Ansible Playbooks.

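Per the Ansible roles documentation, a role must contain at least one of the following directories. Each directory that Ansible loads automatically (tasks, handlers, vars, defaults, and meta) must contain a main.yml file with the relevant content for that directory, while templates and files hold content that your tasks reference by name: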

  • Tasks: The main list of tasks to be executed by the role.
  • Templates: Templates of files whose final state will be rendered on the server using variable substitution and additional logic.
  • Files: Static files that will be deployed to the server as-is.
  • Vars: Variables that will be used to fill in abstractions in tasks and templates.
  • Defaults: Default values for all variables used in the role.
  • Handlers: Actions that are triggered/notified by tasks. Typically used for restarting services.
  • Meta: Machine-readable metadata about the role, its author, license, compatibilities, and dependencies. If a role depends on another role, it’s declared here, and Ansible will pull in the dependent role automatically.
  • README: Human-readable information about the role, how to use it, and what variables it requires.
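As a minimal sketch of the Meta entry above, a role's meta/main.yml might look something like this. The author, description, and role names are placeholders rather than anything from a real role.

```yaml
# roles/db/meta/main.yml (sketch; all values are hypothetical)
---
galaxy_info:
  author: your_name
  description: Installs and configures MySQL
  license: MIT
  min_ansible_version: "2.4"

# Roles listed here run before this role's own tasks.
dependencies:
  - role: common
```

With a dependency declared this way, a playbook that lists only the db role would still get the common role applied first.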

With roles you can strip your playbooks down to just a list of roles to run and whatever variables you need to pass into them. Consider this visual representation of how roles can change your playbook:

You could go from this:

# site.yml
---
- hosts: all
  vars:
  remote_user: user
  become: yes
  
  tasks:
  - name: Install ntp
    yum: name=ntp state=present
    tags: ntp
  
  - name: Configure ntp file
    template: src=ntp.conf.j2 dest=/etc/ntp.conf
    tags: ntp
    notify: restart ntp
  
  - name: Start the ntp service
    service: name=ntpd state=started enabled=yes
    tags: ntp
  
  - name: test to see if selinux is running
    command: getenforce
    register: sestatus
    changed_when: false

- hosts: database
  vars:
    mysql_port: 3306
    dbname: somedb
    dbuser: someuser
    dbpass: somepass
  remote_user: user
  become: yes

  tasks:
  - name: Install Mysql package
    yum: name={{ item }} state=installed
    with_items:
     - mysql-server
     - MySQL-python
     - libselinux-python
     - libsemanage-python
  
  - name: Configure SELinux to start mysql on any port
    seboolean: name=mysql_connect_any state=true persistent=yes
    when: sestatus.rc != 0
  
  - name: Create Mysql configuration file
    template: src=my.cnf.j2 dest=/etc/my.cnf
    notify:
    - restart mysql
  
  - name: Start Mysql Service
    service: name=mysqld state=started enabled=yes
  
  - name: insert iptables rule
    lineinfile: dest=/etc/sysconfig/iptables state=present regexp="{{ mysql_port }}" insertafter="^:OUTPUT " line="-A INPUT -p tcp --dport {{ mysql_port }} -j ACCEPT"
    notify: restart iptables
  
  - name: Create Application Database
    mysql_db: name={{ dbname }} state=present
  
  - name: Create Application DB User
    mysql_user: name={{ dbuser }} password={{ dbpass }} priv=*.*:ALL host='%' state=present
...

To this:

# site.yml
---
# This playbook deploys the whole application stack in this site.

- name: apply common configuration to all nodes
  hosts: all
  remote_user: user

  roles:
    - common

- name: configure and deploy the webservers and application code
  hosts: webservers
  remote_user: user

  roles:
    - web

- name: deploy MySQL and configure the databases
  hosts: dbservers
  remote_user: user

  roles:
    - db

In this example, the tasks needed to configure the webserver, database, and common nodes are each defined in separate, portable, reusable roles instead of being defined in the playbook itself.
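As a rough sketch of where the moved tasks end up, the ntp tasks from the first playbook could live in the common role as shown here. The file paths follow the standard role layout; the handler body and the mysql_port override are assumptions added for illustration, since the original playbook does not show them.

```yaml
# roles/common/tasks/main.yml
---
- name: Install ntp
  yum:
    name: ntp
    state: present
  tags: ntp

- name: Configure ntp file
  template:
    src: ntp.conf.j2        # resolved from roles/common/templates/
    dest: /etc/ntp.conf
  tags: ntp
  notify: restart ntp

- name: Start the ntp service
  service:
    name: ntpd
    state: started
    enabled: yes
  tags: ntp

# roles/common/handlers/main.yml (assumed; not shown in the original)
---
- name: restart ntp
  service:
    name: ntpd
    state: restarted

# site.yml excerpt: passing a variable into a role (value is illustrative)
---
- name: deploy MySQL and configure the databases
  hosts: dbservers
  remote_user: user

  roles:
    - role: db
      vars:
        mysql_port: 3306
```

Because the template and handler travel with the role, any playbook that lists common picks them up without extra wiring.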

2. Use Ansible Galaxy to find and share roles

Ansible Galaxy is a hub for finding, reusing, and sharing Ansible content with the rest of the Ansible community. You download, install, create, and manage roles with the ansible-galaxy command line interface. The CLI ships as part of the regular Ansible package and should already be installed wherever you have Ansible.

Essential ansible-galaxy commands

  • ansible-galaxy init <role_name>
    Use this command to generate the directory and file structure and to stub out the README and metadata files for new roles you want to create and share.

    roles
    └── my_role
        ├── README.md
        ├── defaults
        │   └── main.yml
        ├── handlers
        │   └── main.yml
        ├── meta
        │   └── main.yml
        ├── tasks
        │   └── main.yml
        ├── tests
        │   ├── inventory
        │   └── test.yml
        └── vars
            └── main.yml

    Ansible creates the files; you just have to customize them. It’s always a best practice to make sure you include the relevant documentation needed for each directory.

  • ansible-galaxy install
    Use this command to download and install roles published to the Ansible Galaxy community. Someone may have already written the role you need. You might not have to reinvent the wheel.
  • ansible-galaxy import
    Use this command to import a role into Ansible Galaxy from a GitHub repository to which you have access.
    Depending on how your teams write and share roles, this can open up some really cool possibilities. For example, roles can live in individual team or org repos but be imported and used from a central location.
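For the install case, one common pattern is to list the roles a project needs in a requirements file and install them in one step. This is a minimal sketch; the role names, version numbers, and repository URL are hypothetical.

```yaml
# requirements.yml (sketch; names, versions, and URL are placeholders)
---
# A role published on Ansible Galaxy, pinned to a release
- src: example_namespace.ntp
  version: 1.2.0

# A role kept in a team's own Git repository
- src: https://git.example.com/platform/ansible-role-db.git
  scm: git
  version: master
  name: db
```

Running ansible-galaxy install -r requirements.yml -p roles/ would then fetch both roles into the local roles directory.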

3. Don’t use ignore_errors

If Ansible encounters an error on a host when executing a task from a playbook, it stops running the playbook on that host. Mostly this is a good thing, but sometimes certain errors shouldn’t stop the playbook from running. When this happens, it can be tempting to set ignore_errors: yes on the task and move on.

However, there are better, safer ways to handle errors. The ignore_errors setting swallows all errors, even ones you may not be expecting, and you risk leaving your host in a broken or unstable state. You can achieve less-brittle error handling and better control over how your tasks succeed or fail using the following Ansible built-ins:

  • failed_when: Use this setting to specify exactly what constitutes failure, rather than just relying on the exit code. Maybe a command has really failed only when certain text is present in standard error (stderr), for example.
  • changed_when: Use this setting to tell Ansible exactly when a task should register as “changed.” This is especially important to use with modules like shell or command, since these will always run and will always report “changed” otherwise. This can also give you better control over notifying handlers, since they’re often notified by changes.
  • wait_for: Use this module to tell Ansible to wait until a specific condition is met before continuing (for example, wait for a port to become available on a service before configuring another service to talk to that port).
  • do-until loops: You can set a task to retry until it registers a specified result, up to a specified number of retries with a specified delay between attempts. By default, Ansible sets retries at 3 and delay at 5 seconds.
  • block and rescue: You can use a block section to set data or directives for a group of tasks (e.g., configuration settings), and then use a rescue section to recover from any error generated when the block section was executed. The ability to rescue errors from blocks gives you a ton of flexibility in how you handle errors; you can use them to flush_handlers, make callbacks, execute additional tasks, print debug messages, and much more.
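The built-ins above might be combined along these lines. Treat this as a sketch under stated assumptions: the check command, health URL, port, and package name are all made up for illustration.

```yaml
# error_handling.yml (sketch; commands, URLs, ports, and packages are hypothetical)
---
- name: Handle errors explicitly instead of using ignore_errors
  hosts: all
  become: yes

  tasks:
    - name: Run a check that can report problems on stderr
      command: /usr/local/bin/check_schema   # hypothetical command
      register: schema_check
      # failed_when replaces the default exit-code test, so keep rc in the condition.
      failed_when: schema_check.rc != 0 or 'ERROR' in schema_check.stderr
      # A read-only check should never register as changed.
      changed_when: false

    - name: Wait for the database port before configuring its clients
      wait_for:
        port: 3306
        timeout: 60

    - name: Poll a health endpoint until it answers
      uri:
        url: http://localhost:8080/health   # hypothetical URL
      register: health
      until: health.status == 200
      retries: 5
      delay: 10

    - name: Attempt an upgrade and recover if it fails
      block:
        - name: Upgrade the application package
          yum:
            name: myapp   # hypothetical package
            state: latest
      rescue:
        - name: Report the failure
          debug:
            msg: "Upgrade failed on {{ inventory_hostname }}; running recovery tasks"

        - name: Run any handlers that were already notified
          meta: flush_handlers
```

Each task states what failure means for that step, so an unexpected error still stops the play instead of being silently swallowed.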

4. Be pragmatic, not clever

The key to success with Ansible—the way to reduce complexity—is to keep it simple. Be pragmatic with your code, and try not to outsmart yourself by being too clever.

Start by thinking about better ways to get the data or behavior you want.

  • If you have to use a regular expression (regex) in a role, you’re probably off to a bad start. Regex is brittle and difficult to read (important when you’re designing roles to be distributed and reused). Also, it’s often a sign that you’re trying to parse or differentiate something that could be better expressed as or read from metadata or variables.
  • If you’re missing facts or metadata that would enable the logic you need, consider adding them. If you’re missing a way to feed that data into Ansible, create it. For example, here at New Relic we wrote some code that queries an internal data source and exposes server location information as Ansible facts, which then allows us to use those facts to configure servers differently based on their location.
  • Apply this thinking when choosing the tools and services in your environment, too. It’s easier to write Ansible code for tools or services that lend themselves to modular, declarative configuration. For example, “.d” config structures (conf.d, cron.d, sudoers.d, etc.) will always be easier to automate (think firewalld vs. iptables).
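To make the last two points concrete, here is a small sketch. It assumes a custom local fact named location.datacenter (for example, supplied by a file under /etc/ansible/facts.d/) and a hypothetical myapp role; neither comes from the setup described above.

```yaml
# roles/myapp/tasks/main.yml (sketch; the fact, file, and template names are hypothetical)
---
- name: Load settings for this server's location instead of parsing hostnames
  include_vars: "{{ ansible_local.location.datacenter }}.yml"

- name: Manage this role's sudo rules as a self-contained drop-in file
  template:
    src: myapp_sudoers.j2
    dest: /etc/sudoers.d/myapp
    owner: root
    group: root
    mode: "0440"
    # Refuse to install a file that visudo cannot parse.
    validate: /usr/sbin/visudo -cf %s
```

Because the drop-in file belongs entirely to one role, other roles can add their own files to the same directory without anyone editing a shared file.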

Other pragmatic tips include:

  • Rather than filling your task files and templates with if statements, abstract using variables and Ansible’s built-in looping mechanisms, such as:
    • with_items: Iterates over a list of items and performs the same action on all of them. Use this setting to avoid having to write the same code over and over (for example, when creating users in a task).
    • with_dict: Use this setting to loop through the elements of a hash/dictionary. This makes it easy to create and access complex and nested data structures.
  • Use templates rather than the lineinfile module. Much like regex, lineinfile can be brittle and lead to unexpected results. In most cases, where possible, you’re better off templating the file you’d like to modify so you can be sure the end result is how you want it. If you find yourself in a situation where you’re using lineinfile because multiple roles might need to modify the same file, consider whether you can switch to an application or service that makes use of .d file structures.
  • Use tags. Tags enable running only parts of a role, which promotes greater reuse.
  • Where possible, use Ansible’s task-specific built-in modules rather than the shell or command modules.
    • Shell commands are less likely to be idempotent.
    • Shell commands will always run and will always report “changed,” unless you’re diligent about using changed_when.
    • Many modules are designed to be operating system agnostic, which also helps you write more reusable code.
  • Use includes and imports, which let you split your Ansible code into logical chunks and reference and reuse them more easily.
    • In your roles, split your tasks into further logical groupings and put them into their own separate task files. Any tasks in main.yml will be run automatically, but other task files can be imported/included as needed.
      --- 
      # tasks/main.yml 
      - include_tasks: centOS.yml 
        when: ansible_os_family == "RedHat" 
      
      - include_tasks: debian.yml 
        when: ansible_os_family == "Debian"
    • This also lets you define handlers once, for your entire environment, and import them into various playbooks. Some handlers belong bundled with their respective roles, but handlers that do things like notify monitoring or make callbacks, that apply to your entire environment, are more powerful if you can write them once and use them everywhere.
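A few of these tips can be sketched in a single task file. The user names, shells, and paths are placeholders chosen for illustration.

```yaml
# roles/common/tasks/users.yml (sketch; every name and path is a placeholder)
---
- name: Create the same kind of account for each user in a list
  user:
    name: "{{ item }}"
    state: present
  with_items:
    - alice
    - bob
  tags: users

- name: Create accounts from a dictionary that carries per-user settings
  user:
    name: "{{ item.key }}"
    shell: "{{ item.value.shell }}"
    state: present
  with_dict:
    carol:
      shell: /bin/bash
    dave:
      shell: /sbin/nologin
  tags: users

- name: Create a directory with the file module instead of shelling out to mkdir
  file:
    path: /opt/myapp
    state: directory
    mode: "0755"
  tags: myapp
```

With the tags in place, running ansible-playbook site.yml --tags users would apply only the account tasks, and the file module reports "changed" only when it actually creates or modifies the directory.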

As always, RTFM

Hopefully, these tips will get you thinking about how to use Ansible most effectively, but there’s even more goodness in the Ansible docs.

 

Kat Dober is a senior site reliability engineer on the Metal-as-a-Service team at New Relic. She is passionate about scalability, reusability, and automation and has spent her career building tools and platforms to help tame unruly servers and services.
