Ansible : the basics - Httqm's Docs

The tips below are taken from my experience with Ansible. Don't forget the best practices from the Ansible documentation.
I've gathered best practices as well as much more information in Ansible pour les initiés.pdf.

no manual change
idempotence
the check mode
modules
variables
includes
comments

no manual change

If you change anything without letting Ansible know (for example editing a configuration file manually), it is very likely that this change will be overwritten by the next Ansible execution.

idempotence

There are 2 ways to consider idempotence. Neither method is good or bad (both have their pros and cons), but you must make your choice early in the project and stick to it.

method 1 : idempotence is paramount

you can (almost) read the PLAY RECAP only :
- nothing changes if it doesn't have to
- you know what to expect (where + when changes occur) when altering the code of your playbooks
gives you confidence that you're actually controlling machines with Ansible, not only "doing actions" on them
this comes at the cost of added complexity (especially with shell and command modules)

method 2 : we don't care much about idempotence (i.e. every playbook execution causes changes)

when used like this, Ansible is a distributed scripting language (like shell scripting on steroids which handles the SSH part for you). This is still great and extremely powerful
on the cons side, you never know whether everything is going extremely well (i.e. as described in your code) since all executions cause changes. This means you have to analyze the execution log to find out
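
As a rough illustration of what method 1 costs with command tasks, here is a minimal sketch. The script path, the marker file and the group name are hypothetical; creates and changed_when are the usual ways to tell Ansible when a command really changes something.

```yaml
---
# Sketch only: /usr/local/bin/initApp.sh and /var/lib/myApp/.initialized are hypothetical.
- hosts: myGroup1
  tasks:
    - name: initialize the application only once
      command:
        cmd: /usr/local/bin/initApp.sh
        creates: /var/lib/myApp/.initialized   # the command is not run again once this file exists

    - name: read the running kernel version
      command: uname -r
      register: kernelVersion
      changed_when: false                      # read-only command: never report a change
```

With tasks written this way, a second run should report changed=0 in the PLAY RECAP, which is the whole point of method 1.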

the check mode

The execution in check mode :

must NOT make any change to the slaves
- except if a very specific change is mandatory to let a very specific task be executed afterwards during the check mode execution. In such case, use check_mode: no.
- if the exception above causes a truckload of check_mode: no, read these rules again : you may be trying to do too much in check mode
must NOT be used to setup / check / validate commands (especially shell and command ones) or tasks. These are expected to be :
- skipped safely during the check mode
- executed normally otherwise
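
The two rules above could look like this in a play (a sketch under assumptions: myApp and migrate.sh are hypothetical, ansible_check_mode is the variable Ansible sets to true during a check mode run):

```yaml
---
- hosts: myGroup1
  tasks:
    - name: read the installed version (needed by later tasks, changes nothing)
      command: myApp --version
      register: myAppVersion
      check_mode: no          # run this task even during a check mode execution
      changed_when: false

    - name: run the migration script
      command: /usr/local/bin/migrate.sh
      when: not ansible_check_mode   # skipped safely in check mode, executed normally otherwise
```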

modules

Use modules as much as possible instead of re-coding things yourself :

this is quicker + cleaner + more readable + more maintainable
guarantees idempotence
avoids questionable hacks

variables

Don't over-use set_fact to declare numerous internal / intermediary variables :

a set_fact is a regular task, which implies :
- 1 more task in a play
- some execution time + CPU + RAM
- a log entry
it is possible to pipe data along :
- using Ansible filters and Jinja2 filters
- and combining lists and dicts
at the expense of readability + complexity
if you need to set some variables, use :
- the variable files
  - for roles : the vars and defaults directories
  - for plays : the group_vars/ directory
- the vars keyword
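
A minimal sketch of the "pipe data along" idea combined with the vars keyword, so that no set_fact task is needed. The users list and the variable names are made up for the example.

```yaml
---
- hosts: 127.0.0.1
  connection: local
  gather_facts: no
  vars:
    users:                                   # hypothetical input data
      - { name: alice, enabled: true }
      - { name: bob,   enabled: false }
      - { name: carol, enabled: true }
  tasks:
    - name: list enabled users without an intermediate set_fact
      debug:
        msg: "{{ enabledUsers | join(', ') }}"
      vars:
        # task-level variable: evaluated when used, no extra task, no extra log entry
        enabledUsers: "{{ users | selectattr('enabled') | map(attribute='name') | list }}"
```

The filter chain is denser to read than three set_fact tasks, which is exactly the readability trade-off mentioned above.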

includes

include_role + tasks_from or include_tasks :
- is similar to a goto (i.e. : ugly and cheating)
- is often the consequence of trying to loop over a group of tasks (which can't be done with a block)
- _may_ be avoided (sometimes). Check if :
  - there's a module already doing what you're about to code (have I already said to use modules ?)
  - one of the tasks / commands you use in your included code accepts lists / dicts as input : this can save an explicit loop and make the whole include unnecessary
regarding includes in general (this is not Ansible-specific), there are different approaches :
- not-my-approach : some people consider code should be placed in a dedicated file and included only if it's included at least twice. Otherwise :
  - it "costs" 1 include during execution
  - the developer has to open 1 more file
  - _may_ decrease readability
- my-approach :
  - I think it is worth grouping tasks
    - that belong together
    - for which you don't need to know the very detail
    while working on the calling file. Indeed, code stating :
```
include configureDatabase
include addDbUsers
```
    is explicit enough and doesn't require opening extra files
  - too many nested blocks of code (hence too many indents) decrease readability, which is why grouping + moving code into distinct files helps
  - the logic about including / not including code should be in the calling file :
    DON'T :
      callingFile :
        include includedFile
      includedFile :
        if condition do things
    instead DO :
      callingFile :
        if condition include includedFile
      includedFile :
        do things
    This way :
    - the code logic appears clearly without leaving callingFile
    - the include is processed only if condition is true, not every time
  - with modern IDEs and decent editors (Vi, Emacs) opening 1 extra file shouldn't be that much of a hassle
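
In Ansible terms, the "DO" variant could be sketched like this (file names and the installDatabase variable are hypothetical):

```yaml
---
# callingFile.yml : the condition lives here, next to the include
- name: configure the database
  include_tasks: configureDatabase.yml
  when: installDatabase | bool

- name: add database users
  include_tasks: addDbUsers.yml
  when: installDatabase | bool
```

Since include_tasks is dynamic, the included file is not even loaded when the condition is false. With import_tasks, the when would instead be copied onto every imported task.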

comments

why do people do this :

# I explain what the code below does
- moduleName :
    arg1: foo
    arg2: bar

instead of :

- name: "Description of what's happening below"
  moduleName :
    arg1: foo
    arg2: bar

⇒ use name:, this is what it's for.

a comment must explain things while still being generic : it must stay relevant when the context changes (things added / removed / moved). Indeed, code changes but people "forget" to update comments and delete irrelevant ones. Those obsolete comments may give WRONG information about what's going on. This is subtle, but as a general rule, remember that a comment should not explicitly mention :
- the data being processed (configuration values, path, ...)
- a machine name

Variables declared in vars have a higher precedence than those from defaults.

Many articles describe the purpose of these directories explaining that they are :

for static vs dynamic variables
scope-related (i.e. per environment, like "dev" vs "prod")
to separate those that can be overridden from others

IMHO, this is not completely wrong, but not totally right either. It makes sense to discriminate variables on criteria like those above, but the Ansible documentation does not enforce such rules.
Regarding the vars and defaults directories, it's only about variable precedence.

As a summary :

it's fine to have guidelines specifying to declare variables in one directory rather than in the other
except when leveraging variable precedence (i.e. when you need to specify variables + their default values), code clarity benefits from storing variables in a single place
it is perfectly legal for Ansible to have the same variable name with different values in both directories :
- precedence matters
- readability may suffer from this
- I'm afraid this becomes messy and opens the door to debug nightmares
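
To make the precedence point concrete, here is a sketch with hypothetical variable names and values, showing three files in one block:

```yaml
---
# roles/myRole1/defaults/main.yml
# lowest precedence: anything else overrides these values
httpPort: 80
---
# roles/myRole1/vars/main.yml
# higher precedence than defaults, and also than group_vars / host_vars
configDir: /etc/myRole1
---
# group_vars/webservers.yml
httpPort: 8080     # wins over defaults/main.yml
configDir: /tmp    # loses against vars/main.yml
```

The last line is the kind of silent surprise that turns into a debug nightmare, hence the advice to keep each variable in a single place.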

[root] ...well, this is the playbook root directory

ansible.cfg Ansible main configuration file

playbooks/

main.yml the main playbook file

myPlaybook1.yml other playbook (if needed)

myPlaybook2.yml other playbook (if needed)

hosts inventory file

group_vars/

databases.yml variables for all hosts that are members of the databases group

webservers.yml variables for all hosts that are members of the webservers group

vault.yml Ansible-vault encrypted variables file

host_vars/

mysql1.yml variables for the mysql1 host

mysql2.yml variables for the mysql2 host

apache1.yml variables for the apache1 host

apache2.yml variables for the apache2 host

roles/

myRole1

defaults/ What's the difference between the vars and defaults role directories ?

main.yml default/ variables of myRole1 (read more about variable precedence)

files/ files that will be copied to the slaves (is it supposed to mimic the destination file tree ?)

myFile1

myFile2

myFile3

handlers/ Changing a configuration file may involve restarting the corresponding daemon. However, multiple changes to the same file must restart the daemon only once. Handlers, triggered by the notify directive, implement this event-driven behavior.

main.yml

meta/ this is where role dependencies are described. More about Conditional role dependencies.

main.yml list of roles (and related parameters) to play before playing myRole1

tasks/

main.yml tasks of myRole1

templates/ Files or snippets that will be used to generate files on the slaves via the templating engine (is it supposed to mimic the destination file tree ?)

myTemplate1.j2

myTemplate2.j2

myTemplate3.j2

vars/ What's the difference between the vars and defaults role directories ?

main.yml vars/ variables of myRole1 (read more about variable precedence)

myRole1_variables1.yml other variables (if needed)

myRole1_variables2.yml other variables (if needed)

myRole2 another role, with a similar structure
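
The handlers/ mechanism described in the tree above can be sketched as follows. Service name and destination paths are hypothetical; the two files are shown in one block.

```yaml
---
# roles/myRole1/tasks/main.yml
- name: deploy the main configuration file
  template:
    src: myTemplate1.j2
    dest: /etc/myService/main.conf
  notify: restart myService

- name: deploy the extra configuration file
  template:
    src: myTemplate2.j2
    dest: /etc/myService/extra.conf
  notify: restart myService
---
# roles/myRole1/handlers/main.yml
- name: restart myService
  service:
    name: myService
    state: restarted
```

Even if both templates change, the handler runs only once, after the tasks of the play.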

#!/usr/bin/env bash
######################################### makeAnsibleRoleSkeleton.sh ################################
# Create the directory structure and some files for an Ansible role
########################################## ##########################################################

rolesDirectory='/opt/ansible/roles'
newRoleName=$1

[ -z "$newRoleName" ] && {
	echo "Usage : $0 <new role name>"
	exit 1
	}

mkdir -p "$rolesDirectory/$newRoleName/"{tasks,handlers,templates,files,vars,meta}
echo '---' | tee "$rolesDirectory/$newRoleName/"{tasks,handlers,vars,meta}/main.yml > /dev/null

enter the "new role" directory
for subDir in files handlers meta tasks templates vars; do mkdir -p "$subDir"; newFile="$subDir/main.yml"; echo '---' > "$newFile"; git add "$newFile"; done
don't forget to commit

play

block of code associating tasks to one or more hosts. Example :

- hosts: 127.0.0.1
  connection: local
  gather_facts: no
  tasks:

  - debug:
      msg: "hello, world"

  - debug:
      msg: "farewell, world"

playbook

group of one or more plays

Definitions :

As seen in the Introduction to Ansible article, Ansible can be used to perform ad-hoc tasks. But it can also execute procedures called playbooks (examples).

Playbooks :

are written in YAML.
are composed of one or more plays. A play is a list of hosts with associated roles and tasks. A task is, basically speaking, calling an Ansible module, as seen in the Introduction to Ansible article.

Important things about playbooks :

When running the playbook, which runs top to bottom, hosts with failed tasks are taken out of the rotation for the entire playbook. If things fail, simply correct the playbook file and rerun.
Modules (hence playbooks) are idempotent : if you run them again, they will make only the changes they must in order to bring the system to the desired state. Each module checks the current state of the target before taking any action; and playbooks must be designed :
- NOT as a list of actions to do
- but as the description of a desired state
For example, if you instruct Ansible to install a package, the package module first detects whether this package is already installed (this check happens when the task runs, not during the facts gathering step), then acts accordingly. This makes it very safe to rerun the same playbook multiple times : it won't change things unless it has to. The exception is the shell and command modules, where idempotence cannot be guaranteed automatically (details).
Each task has a name parameter which is included in the output from running the playbook. This is for humans only and should be as descriptive as possible.
It is usually wise to track playbooks in SCM tools such as Git.

My first playbook :

Save this as playbook.yml :

---
# this is my 1st playbook

- hosts: groupSlaves
  tasks:
  - name: test connection
    ping:

groupSlaves is the group name defined in the inventory file.

Launch it : ansible-playbook playbook.yml

It may return :

PLAY [groupSlaves] *****************************************************************

GATHERING FACTS ***************************************************************
ok: [192.168.105.80]
ok: [192.168.105.114]

TASK: [test connection] *******************************************************
ok: [192.168.105.80]
ok: [192.168.105.114]

PLAY RECAP ********************************************************************
192.168.105.114		: ok=2	changed=0	unreachable=0	failed=0
192.168.105.80		: ok=2	changed=0	unreachable=0	failed=0

A full playbook :

To know what's performed by this playbook, just read the name lines.

---
# playbook_web.yml

- hosts: myGroup1:myGroup2

  vars:
    apacheUser:        'www-data'
    apacheGroup:       'www-data'
    documentRoot:      '/var/www/test/' # final '/' expected
    websiteLocalPath:  '/root/'
    websiteConfigFile: 'test.conf'

  tasks:
  - name: install Apache
    apt: name=apache2 state=present

  - name: disable default Apache website
    shell: a2dissite default

  - name: define Apache FQDN
    shell: echo "ServerName localhost" > /etc/apache2/conf.d/fqdn

  - name: create docRoot
    file: state=directory path={{ documentRoot }} owner={{ apacheUser }} group={{ apacheGroup }}

  - name: deploy website
    copy: src={{ websiteLocalPath }}index.html dest={{ documentRoot }} owner={{ apacheUser }} group={{ apacheGroup }}

  - name: deploy website conf
    copy: src={{ websiteLocalPath }}{{ websiteConfigFile }} dest=/etc/apache2/sites-available/

  - name: enable website
    shell: a2ensite {{ websiteConfigFile }}

  - name: reload Apache
    service: name=apache2 state=reloaded enabled=yes

- hosts: 127.0.0.1
  connection: local
  tasks:
  - name: check everything is ok on webserver 'myGroup1'
    shell: wget -S -O - -Y off http://192.168.105.114/index.html

  - name: check everything is ok on webserver 'myGroup2'
    shell: wget -S -O - -Y off http://192.168.105.80/index.html
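
Some of the shell tasks above have module-based equivalents. This is only a sketch of an alternative, not the playbook I actually ran: copy with content replaces the echo redirection, and uri replaces wget for the final check.

```yaml
---
- hosts: myGroup1:myGroup2
  tasks:
    - name: define Apache FQDN
      copy:
        content: "ServerName localhost\n"
        dest: /etc/apache2/conf.d/fqdn

- hosts: 127.0.0.1
  connection: local
  tasks:
    - name: check everything is ok on webserver 'myGroup1'
      uri:
        url: http://192.168.105.114/index.html
        status_code: 200      # the task fails on any other HTTP status
```

Unlike the shell versions, these tasks report changed only when something was actually modified.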

How to run actions on the master in a playbook (source) :

Let's say you just deployed a new web server + a web application. Wouldn't it be great if you could run some checks at the end of the playbook, just to make sure everything's responding as expected ? To do so, you'd have to run some commands from the master host : use this code as the last play of your playbook :

- hosts: 127.0.0.1
  connection: local
  tasks:
  - name: make sure blah is blah.
    shell: 'myCheckCommand'

If myCheckCommand returns a Unix success :

Testing with myCheckCommand being true, execution of this specific play returns :

PLAY [127.0.0.1] **************************************************************

GATHERING FACTS ****************************************************************
ok: [127.0.0.1]

TASK: [make sure blah is blah.] ***********************************************
changed: [127.0.0.1]

PLAY RECAP *********************************************************************
127.0.0.1	: ok=2	changed=1	unreachable=0	failed=0

If myCheckCommand returns a Unix failure :

Now with false, output becomes :

PLAY [127.0.0.1] **************************************************************

GATHERING FACTS ****************************************************************
ok: [127.0.0.1]

TASK: [make sure blah is blah.] ***********************************************
failed: [127.0.0.1] => {"changed": true, "cmd": "false", "delta": "0:00:00.015746",
	"end": "2014-10-16 15:11:49.153606", "rc": 1, "start": "2014-10-16 15:11:49.137860"}

FATAL: all hosts have already failed -- aborting

PLAY RECAP *********************************************************************
	to retry, use: --limit @/root/fileNameOfMyPlaybook.retry

127.0.0.1	: ok=1	changed=0	unreachable=0	failed=1

Playbooks with roles (source, role file tree) :

Initial Ansible syntax (the roles keyword, available since early versions and still supported) :

- hosts: webservers
  roles:
    - role_X
    - role_Y

These are processed as static imports.

Updated syntax (for Ansible 2.4+) :

- hosts: webservers
  tasks:
    - import_role:
        name: role_X
    - include_role:
        name: role_Y

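Choose between import_role and include_role depending on the behavior you need : import_role is static (the role is processed when the playbook is parsed), whereas include_role is dynamic (the role is processed at runtime, when the task is reached).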
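
One practical consequence, as a sketch (instanceName is a hypothetical variable that role_Y would consume) : only the dynamic form can be looped.

```yaml
---
- hosts: webservers
  tasks:
    - name: static import, resolved when the playbook is parsed
      import_role:
        name: role_X

    - name: dynamic include, resolved at runtime, so it can be looped
      include_role:
        name: role_Y
      vars:
        instanceName: "{{ item }}"
      loop:
        - blue
        - green
```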

Old vs new syntax :

"Old" syntax (compact mode) :

- hosts: webservers
  roles:
  - { role: role_X, myVariable: "42", tags: "tag1, tag2" }

"Old" syntax (verbose mode) :

- hosts: webservers
  roles:
    - role: role_X
      vars:
        myVariable: "42"
      tags:
        - tag1
        - tag2

"New" syntax :

- hosts: webservers
  tasks:
    - import_role:
        name: role_X
      vars:
        myVariable: "42"
      tags:
        - tag1
        - tag2

The inventory file :

is the list of hosts that are controlled by Ansible (aka slaves).
lists hosts organized in groups (plus the default group all, which contains every host).
can either be static (read below) or dynamic.
defaults to /etc/ansible/hosts. An alternate inventory file can be specified on the command line with -i alternateInventoryFile
can either be in YAML or in INI format :
```
[groupSlaves]                                # group name
slave1	ansible_host=192.168.105.114         # details
slave2	ansible_host=192.168.105.80

[myGroup1]                                   # group name
slave1

[myGroup2]                                   # group name
slave2
```
- This syntax is strongly related to patterns, the way to declare targets of our actions.
- It is possible to declare groups of groups with children :
  - :children for an inventory in INI format
  - children: for an inventory in YAML format
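
For comparison, a sketch of a YAML inventory using children. Note that, unlike the INI example above, groupSlaves is declared here as a group of groups (the INI counterpart would be a [groupSlaves:children] section listing myGroup1 and myGroup2).

```yaml
---
all:
  children:
    groupSlaves:
      children:
        myGroup1:
          hosts:
            slave1:
              ansible_host: 192.168.105.114
        myGroup2:
          hosts:
            slave2:
              ansible_host: 192.168.105.80
```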

Default groups

The implicit group all includes all slaves.
There is also another group named ungrouped. The logic behind Ansible is that all slaves must belong to at least 2 groups : all and "another one". If there is no such "other one", ungrouped will be that one.

Both groups will always exist and don't need to be explicitly declared.

Ansible is installed on a master host to rule them all. There's nothing to install on slaves (except SSH keys).

Setup Ansible master on a Debian Buster (inspired by) :

as root :
apt install python3-pip
as a non-root user : setup + activate a Python virtual environment
still as a non-root user, and from within the virtual environment (if present) :
pip3 install -U ansible

Setup SSH on the master (source) :

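Create a new key : ssh-keygen -t rsa will generate the /root/.ssh/id_rsa RSA private key (2048-bit with the OpenSSH version shipped in Debian Buster; more recent OpenSSH releases default to a larger key size).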
Deploy it to the slave(s)

Configure SSH accordingly (/root/.ssh/config) :

Host slave1
	hostname	192.168.105.114
	user		
	IdentityFile	~/.ssh/id_rsa

Host slave2
	hostname	192.168.105.80
	user		
	IdentityFile	~/.ssh/id_rsa

List slave(s) into the inventory file :

192.168.105.114	# slave1
192.168.105.80	# slave2

Check communication between master and slave(s) :

ansible all -m ping -u

192.168.105.114 | success >> {
	"changed": false,
	"ping": "pong"
}

192.168.105.80 | success >> {
	"changed": false,
	"ping": "pong"
}

It works !!!
Define groups of hosts in the inventory file

CLI flags are common to several Ansible commands / tools. See this dedicated article.

Get information about slaves :

ansible all -m setup

This will output a VERY long list of inventory information (aka facts) about the target(s). To get detailed information on a specific topic, you can apply a filter :

ansible myGroup2 -m setup -a 'filter=ansible_processor*'

192.168.105.80 | success >> {
	"ansible_facts": {
		"ansible_processor": [
			"Intel(R) Core(TM)2 Duo CPU	 E8400 @ 3.00GHz"
		],
		"ansible_processor_cores": 1,
		"ansible_processor_count": 1,
		"ansible_processor_threads_per_core": 1,
		"ansible_processor_vcpus": 1
	},
	"changed": false
}

Run shell commands on slaves :

ansible all -a "hostname"

run a basic command on all slaves :

192.168.105.114 | success | rc=0 >>
ansibleSlave

This is for basic commands : a binary and its arguments, without shell features (pipes, redirections, $(...) substitutions).

ansible myGroup1 -a "echo $(hostname)"

This is executed on the master because double quotes are interpreted locally :

192.168.105.114 | success | rc=0 >>
ansibleMaster

ansible myGroup1 -a 'echo $(hostname)'

This is sent to the right slave, but the substitution is not performed : the default command module does not go through a shell, so $(hostname) is passed to echo as a literal string :

192.168.105.114 | success | rc=0 >>
$(hostname)

ansible myGroup1 -m shell -a 'echo $(hostname)'

Thanks to the shell module, this command is executed as expected :

192.168.105.114 | success | rc=0 >>
ansibleSlave

ansible all -m shell -a 'echo $(hostname) | grep -e "[a-z]"'

It's possible to run "complex" shell commands now :

192.168.105.80 | success | rc=0 >>
ansibleSlave2

192.168.105.114 | success | rc=0 >>
ansibleSlave
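
The same command vs shell difference applies inside playbooks. A minimal sketch, reusing the commands above :

```yaml
---
- hosts: all
  tasks:
    - name: no shell involved, $(hostname) stays a literal string
      command: echo $(hostname)

    - name: goes through a shell, so substitution and pipes work
      shell: 'echo $(hostname) | grep -e "[a-z]"'
```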

File transfer :

Ansible can scp files from the master to its slaves :

ansible all -m copy -a "src=/home/test.txt dest=/home/"

It's possible to rename the file during the copy by specifying a different destination name : ... "src=/home/test.txt dest=/home/otherName"

Manage packages :

Ansible can query its slaves about software using some dedicated modules :

apt for Debianoids. This module is part of the default install.
yum for Red Hatoids.

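Possible values for state : installed, latest, removed, absent, present. installed and removed are older aliases of present and absent; recent releases of the apt module no longer accept them, so prefer present / absent.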

Make sure the package openssh-server is installed :

ansible all -m apt -a "name=openssh-server state=installed"

192.168.105.114 | success >> {
	"changed": false
}

192.168.105.80 | success >> {
	"changed": false
}

If the specified package was not already installed, this will install it. The FULL command output (install procedure) will be reported by Ansible.

Make sure the package apache2 is absent :

ansible all -m apt -a "name=apache2 state=absent"

192.168.105.80 | success >> {
	"changed": false
}

192.168.105.114 | success >> {
	"changed": false
}
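
The two ad-hoc commands above translate to playbook tasks like these (a sketch using the present / absent states) :

```yaml
---
- hosts: all
  tasks:
    - name: make sure openssh-server is installed
      apt:
        name: openssh-server
        state: present

    - name: make sure apache2 is absent
      apt:
        name: apache2
        state: absent
```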

Users and groups, the user module :

Create a user account for Bob :

ansible myGroup1 -m user -a "name=bob state=present"

192.168.105.114 | success >> {
	"changed": true,
	"comment": "",
	"createhome": true,
	"group": 1001,
	"home": "/home/bob",
	"name": "bob",
	"shell": "/bin/sh",
	"state": "present",
	"system": false,
	"uid": 1001
}

And if I run the same command again, whereas Bob's account already exists :

192.168.105.114 | success >> {
	"append": false,
	"changed": false,
	"comment": "",
	"group": 1001,
	"home": "/home/bob",
	"move_home": false,
	"name": "bob",
	"shell": "/bin/sh",
	"state": "present",
	"uid": 1001
}

Delete Bob's user account :

ansible myGroup1 -m user -a "name=bob state=absent remove=yes"

192.168.105.114 | success >> {
	"changed": true,
	"force": false,
	"name": "bob",
	"remove": true,
	"state": "absent",
	"stderr": "userdel: bob mail spool (/var/mail/bob) not found\n"
}

Running the same command again (no user account named Bob anymore) :

192.168.105.114 | success >> {
	"changed": false,
	"name": "bob",
	"state": "absent"
}

remove=yes instructs Ansible to delete the homedir as well.
remove=no is equivalent to not using the "remove" option at all (defaults to no), and leaves the homedir untouched.
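
In a playbook, the same user module calls could be written as below. The two tasks are shown side by side for reference only; they are not meant to run one after the other.

```yaml
---
- hosts: myGroup1
  tasks:
    - name: make sure Bob's account exists
      user:
        name: bob
        state: present

    - name: remove Bob's account and home directory
      user:
        name: bob
        state: absent
        remove: yes
```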

List existing user accounts :

ansible myGroup1 -m shell -a 'less /etc/passwd | cut -d ":" -f 1'
ansible myGroup1 -m shell -a 'sed -r "s/^([^:]+).*/\1/" /etc/passwd'
ansible myGroup1 -m shell -a 'awk -F ":" "{print \$1}" /etc/passwd'

This gets complex because of the escaping of quotes and some special characters.

192.168.105.114 | success | rc=0 >>
root
daemon
bin

nobody
libuuid
messagebus
bob

Deploying from Git : the git module :

ansible webservers -m git -a "repo=git://foo.example.org/repo.git dest=/srv/myapp version=HEAD"

Managing services, the service module :

Start a service : ansible all -m service -a "name=ssh state=started"

192.168.105.114 | success >> {
	"changed": false,
	"name": "ssh",
	"state": "started"
}

192.168.105.80 | success >> {
	"changed": false,
	"name": "ssh",
	"state": "started"
}

Accepted states :

started : start service if not running
stopped : stop service if running
restarted : always restart
reloaded : always reload
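running : an alias of started in old Ansible releases; it is not among the states listed in the current documentation of the service module, so use started instead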
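
As a playbook task, the ad-hoc service command above could be sketched like this (enabled is added here to also start the service at boot) :

```yaml
---
- hosts: all
  tasks:
    - name: make sure ssh is started now and enabled at boot
      service:
        name: ssh
        state: started
        enabled: yes
```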
