Ansible - Simple IT Automation


The args keyword

Modules that support free form parameters (shell, command, script, ...) accept the args keyword to provide options (snippet below taken from the shell documentation) :
- name: This command will change the working directory to someDir/ and will only run when someDir/someLog.txt doesn't exist
  ansible.builtin.shell: someScript.sh >> someLog.txt
  args:
    chdir: someDir/
    creates: someLog.txt
You may / may not use args : it depends mostly on how you format / indent your code. Compare :
---
- hosts: 127.0.0.1
  connection: local
  gather_facts: no
  tasks:

  - name: "Play with the 'shell' module"
    shell: touch /tmp/test
      executable: /bin/bash
returns :
ERROR! Syntax Error while loading YAML.
  mapping values are not allowed in this context

The error appears to be in '/playbook.yml': line n	refers to the executable: line
---
- hosts: 127.0.0.1
  connection: local
  gather_facts: no
  tasks:
  - name: "Play with the 'shell' module"
    shell: touch /tmp/test
    args:
      executable: /bin/bash
works fine (except a warning suggesting to use the file module with state=touch rather than running touch within shell ; this warning can easily be silenced with warn: no)
The error (the one that args fixes) is not related to Ansible but to the YAML syntax itself. Indeed, code formatted like :
foo: bar
  baz: foobar
suggests that baz: foobar is a mapping nested under the scalar value bar, which YAML forbids. And this is exactly what we tried to do with :
    
    shell: touch /tmp/test
      executable: /bin/bash
    
And args to the rescue :
    
    shell: touch /tmp/test	this is a string variable
    args:			this is a dict
      executable: /bin/bash	this is a key/value of that dict
    
And now, everybody's happy !

Alternate solution

No args but this works fine too :
---
- hosts: 127.0.0.1
  connection: local
  gather_facts: no
  tasks:
  - name: "Play with the 'shell' module"
    shell:			nothing after the :, shell is now a dict
      cmd: touch /tmp/test	regular key/value of that dict introducing the cmd keyword
      executable: /bin/bash

Ansible environment variables

These environment variables are meant to be used like :
ANSIBLE_XXX=value ansible-playbook myPlaybook.yml

List of variables :

ANSIBLE_CALLBACK_WHITELIST
  • Used to whitelist callback plugins
  • So far, mostly used to display statistics (tasks duration) at the end of a playbook execution :
    ANSIBLE_CALLBACK_WHITELIST='profile_tasks'
ANSIBLE_DEBUG
Without surprise, this displays debug information :
  • this is extremely verbose
  • it does not display the very same set of information as -v
ANSIBLE_LOCALHOST_WARNING
Silences the warning that would otherwise be output by code like :
---
# ANSIBLE_LOCALHOST_WARNING=false ansible-playbook test.yml
- hosts: 127.0.0.1
  connection: local
  gather_facts: no
  tasks:

  - debug:
      msg: "hello world!"
ANSIBLE_STDOUT_CALLBACK
Specifies the callback used to display the Ansible execution log :
ANSIBLE_STDOUT_CALLBACK=dense
minimal stdout output
ANSIBLE_STDOUT_CALLBACK=yaml
  • less "", '', [] and {} in the output
  • can also turn literal \n sequences in the output into actual newlines, making error messages way more readable (source, details)
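These variables can be combined ; a typical invocation could look like (the playbook name is an example) :
ANSIBLE_STDOUT_CALLBACK=yaml ANSIBLE_CALLBACK_WHITELIST='profile_tasks' ansible-playbook myPlaybook.yml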

Ansible Vault / ansible-vault

Usage

Example

Setup :

  1. define the vault_password_file in ansible.cfg :
    [defaults]
    
    vault_password_file	= path/to/.vault_password
  2. set the vault password in path/to/.vault_password :
    pwgen -N 1 -ys 40 > path/to/.vault_password
    chmod 400 path/to/.vault_password
  3. create the encrypted file :
    ansible-vault create path/to/encrypted/file
  4. view the encrypted file :
    ansible-vault view path/to/encrypted/file
  5. edit the encrypted file :
    ansible-vault edit path/to/encrypted/file
    You can define which editor will be used by setting the EDITOR environment variable.
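    ansible-vault can also encrypt an already existing plaintext file, or a single value to be pasted into an otherwise plaintext vars file (the variable name below is an example) :
    ansible-vault encrypt path/to/plaintext/file
    ansible-vault encrypt_string 'superSecret' --name 'my_secret'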

How to read information from an Ansible Vault encrypted file ?

For your eyes only :

ansible-vault has decrypt and view commands :
  • decrypt : decrypts the specified file and writes it as a plaintext file
    This is not the right option when simply trying to retrieve information from an encrypted file.
  • view : displays the decrypted contents of the encrypted file but leaves it unmodified
ansible-vault view path/to/encryptedFile.yml

In a play :

  • In the play parameters :
    - hosts: myServer
      vars_files:
        - path/to/encryptedFile.yml
  • With a dedicated task :
    - name: "Load vault data"
      include_vars:
        file: path/to/encryptedFile.yml
        name: vaultData
    Variables will be available via the vaultData key.
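    For instance, assuming the encrypted file defines a db_password variable (hypothetical name), a later task can use it like :
    - debug:
        msg: "{{ vaultData.db_password }}"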

ansible.cfg

Ansible configuration file(s) :

You may save Ansible settings in different places, which are searched in the following order :
  1. ANSIBLE_CONFIG : environment variable
  2. ansible.cfg : in the current directory (so this may be versioned and shared with the code)
  3. ~/.ansible.cfg : personal user settings
  4. /etc/ansible/ansible.cfg : system-wide settings
  • Ansible uses the first it finds and ignores the others.
  • To find out which configuration file was actually used by Ansible, just run any basic command in verbose mode :
    ansible localhost -m ping -v
    Using path/to/ansible.cfg as config file
    

Sample ansible.cfg :

This is not a copy of my configuration file : with time, there may be incompatible / conflicting settings below. Consider this as a list of the options I've used most.
[defaults]
# paths
inventory		= path/to/inventoryFile
log_path		= ./ansible.log
lookup_plugins	= path/to/lookup_plugins
#remote_tmp		= /tmp/${USER}/ansible		deprecated ?, details
roles_path		= path/to/roles
vault_password_file	= /var/lib/ansible/.vault_password

# ansible internal stuff
deprecation_warnings		= True
forks				= 20		details
gathering			= smart
host_key_checking		= False
internal_poll_interval	= 0.05
interpreter_python		= auto_silent
retry_files_enabled		= False

[ssh_connection]
pipelining	= True		conflicts with privilege escalation (become + sudo), details
retries	= 5
ssh_args	= -o ControlMaster=auto -o ControlPersist=60s -o PreferredAuthentications=password,publickey

asynchronous and background tasks

Definitions : asynchronous vs background tasks :

asynchronous

By default, the SSH connection to a slave stays open until the task completes, which is fine for short tasks, but causes problems if task duration > SSH timeout

Use case :
  • start a long running task
  • check its status later (by polling regularly)

asynchronous tasks are run "detached" from the SSH connection, but still "blocking" the play.

background
Some tasks can be run concurrently to shorten the overall playbook duration. In other words : this allows not blocking on one task so that the play can continue with others.

the play continues while a background task is being executed.

async and poll :

async n
set task timeout
  • if task is not finished after n seconds, it will be :
    • reported as FAILED
    • terminated
  • when a task is killed by timeout, its register directive is not executed, so anything relying on the registered variable will fail because that variable does not exist
poll n
check task status every n seconds
  • it's ok if n > actual task duration
  • the task behavior changes when n = 0 (see below)

async + poll n (n > 0) = avoid connection timeout (source)

The play blocks on the task until it :
  • completes
  • fails
  • times out

async + poll n (n = 0) = background task (source)

  • Ansible starts the task and immediately moves on to the next one without
    • waiting for a result
    • ever checking back on this task
    This is the fire and forget mode
  • use the async_status module to re-synchronize tasks later in the play
  • it seems possible for a background task to finish after the end of the playbook :
    ---
    - hosts: 127.0.0.1
      connection: local
      gather_facts: no
      tasks:
    
      - name: Job A
        shell: for i in 1 2 3 4 5_end; do sleep 1; echo "job A\t$i\t$(date)" >> myTempFile; done
        async: 5
        poll: 0
    
      - name: Job B
        shell: for i in 1 2 3_end; do sleep 1; echo "job B\t$i\t$(date)" >> myTempFile; done
    
      - debug:
          msg: "{{lookup('file', 'myTempFile') }}"
    

    ANSIBLE_LOCALHOST_WARNING=false ansible-playbook myPlaybook.yml; sleep 3; cat myTempFile; rm myTempFile

    PLAY [127.0.0.1] ************************************************************************************
    
    TASK [Job A] ****************************************************************************************
    changed: [127.0.0.1]
    
    TASK [Job B] ****************************************************************************************
    changed: [127.0.0.1]
    
    TASK [debug] ****************************************************************************************
    ok: [127.0.0.1] => {
        "msg": "
    job A	1	Wed 27 May 2020 12:13:31 PM CEST	file contents at the end of the playbook
    job B	1	Wed 27 May 2020 12:13:31 PM CEST
    job A	2	Wed 27 May 2020 12:13:32 PM CEST
    job B	2	Wed 27 May 2020 12:13:32 PM CEST
    job A	3	Wed 27 May 2020 12:13:33 PM CEST
    job B	3_end	Wed 27 May 2020 12:13:33 PM CEST
    "}
    
    PLAY RECAP ******************************************************************************************
    127.0.0.1	: ok=3	changed=2	unreachable=0	failed=0	skipped=0	rescued=0	ignored=0
    
    job A	1	Wed 27 May 2020 12:13:31 PM CEST	file contents after the end of the playbook + pause
    job B	1	Wed 27 May 2020 12:13:31 PM CEST
    job A	2	Wed 27 May 2020 12:13:32 PM CEST
    job B	2	Wed 27 May 2020 12:13:32 PM CEST
    job A	3	Wed 27 May 2020 12:13:33 PM CEST
    job B	3_end	Wed 27 May 2020 12:13:33 PM CEST
    job A	4	Wed 27 May 2020 12:13:34 PM CEST
    job A	5_end	Wed 27 May 2020 12:13:35 PM CEST

async_status : re-synchronize a background task

Use case :
  • when the result of a background task is required to continue
  • when you want to ensure a background task is over
In such cases, it is possible to "wait" and re-synchronize tasks.
With a long lasting background task, legitimate questions would be :
  • since I don't know which of the background or foreground tasks ends first, how to make sure that, at a given point of the play, the background task is finished ?
  • how to check the background task exit status ?
  • "regular" tasks output is visible in the Ansible execution log, what do I get here ?
  • how to retrieve the background task output (stdout, stderr) ?
---
- hosts: 127.0.0.1
  connection: local
  gather_facts: no
  tasks:

  - name: Init. things
    shell: "[ -f myTempFile ] && rm myTempFile || :"

  - name: Job A
    shell: for i in 1 2 3 4 5_end; do sleep 1; echo "job A\t$i\t$(date)" >> myTempFile; done; echo 'job A OK'
    async: 20
    poll: 0
    register: jobA

  - name: Job B
    shell: for i in 1 2 3_end; do sleep 1; echo "job B\t$i\t$(date)" >> myTempFile; done; echo 'job B OK'

  - name: Wait for 'Job A' to end
    async_status:
      jid: '{{ jobA.ansible_job_id }}'
    register: jobAStatus
    until: jobAStatus.finished
    retries: 10		number of attempts
    delay: 1		seconds between attempts

  - debug:		when reaching this point, the task named "Job A" is finished
      var: jobA

  - debug:
      var: jobAStatus

  - debug:
      msg: "{{lookup('file', jobA.results_file) }}"

ANSIBLE_LOCALHOST_WARNING=false ansible-playbook myPlaybook.yml

PLAY [127.0.0.1] ************************************************************************************

TASK [Init. things] *********************************************************************************
changed: [127.0.0.1]

TASK [Job A] ****************************************************************************************
changed: [127.0.0.1]

TASK [Job B] ****************************************************************************************
changed: [127.0.0.1]

TASK [Wait for 'Job A' to end] **********************************************************************
FAILED - RETRYING: Wait for 'Job A' to end (10 retries left).
FAILED - RETRYING: Wait for 'Job A' to end (9 retries left).
changed: [127.0.0.1]

TASK [debug] ****************************************************************************************
ok: [127.0.0.1] => {
    "jobA": {			the variable I registered to use async_status
        "ansible_job_id": "648023500521.29516",
        "changed": true,
        "failed": false,	the background task exit status
        "finished": 0,		I expected this to be 1 ()
        "results_file": "/home/bob/.ansible_async/648023500521.29516",
        "started": 1
    }
}

TASK [debug] ****************************************************************************************
ok: [127.0.0.1] => {
    "jobAStatus": {		the variable I registered to use until with async_status
        "ansible_job_id": "648023500521.29516",
        "attempts": 3,
        "changed": true,
        "cmd": "for i in 1 2 3 4 5_end; do sleep 1; echo \"job A\\t$i\\t$(date)\" >> myTempFile; done; echo 'job A OK'",
        "delta": "0:00:05.016975",
        "end": "2020-05-27 13:41:46.752286",
        "failed": false,
        "finished": 1,
        "rc": 0,
        "start": "2020-05-27 13:41:41.735311",
        "stderr": "",
        "stderr_lines": [],
        "stdout": "job A OK",
        "stdout_lines": [
            "job A OK"
        ]
    }
}

TASK [debug] ****************************************************************************************
ok: [127.0.0.1] => {
    "msg": {			contents of the job's result_file
        "changed": true,
        "cmd": "for i in 1 2 3 4 5_end; do sleep 1; echo \"job A\\t$i\\t$(date)\" >> myTempFile; done; echo 'job A OK'",
        "delta": "0:00:05.016975",
        "end": "2020-05-27 13:41:46.752286",
        "invocation": {
            "module_args": {
                "_raw_params": "for i in 1 2 3 4 5_end; do sleep 1; echo \"job A\\t$i\\t$(date)\" >> myTempFile; done; echo 'job A OK'",
                "_uses_shell": true,
                "argv": null,
                "chdir": null,
                "creates": null,
                "executable": null,
                "removes": null,
                "stdin": null,
                "stdin_add_newline": true,
                "strip_empty_ends": true,
                "warn": true
            }
        },
        "rc": 0,
        "start": "2020-05-27 13:41:41.735311",
        "stderr": "",
        "stdout": "job A OK"
    }
}

PLAY RECAP ******************************************************************************************
127.0.0.1                  : ok=7    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

"New platform" cheatsheet

Since it's always the same, yet I regularly forget basic steps when starting to work on a new platform with Ansible, here's my cheatsheet :
  1. configure SSH properly
  2. try a manual connection to the SSH hosts :
    • to confirm the SSH configuration is fine
    • necessary if you have not disabled host key checking
  3. build the inventory file (see the sample inventory after this list) :
    • you can choose between INI and YAML formats
    • AFAIK, both work fine. INI is more readable IMHO.
    • they differ in the way they interpret values assigned to variables (details)
  4. test the connection to the slaves with Ansible :
    ansible --inventory=myInventoryFile all -m ping
    slave1 | SUCCESS => {
        "changed": false,
        "ping": "pong"
    }
    slave2 | SUCCESS => {
        "changed": false,
        "ping": "pong"
    }
    slave3 | SUCCESS => {
        "changed": false,
        "ping": "pong"
    }
    
  5. check you can connect to slaves and gather facts :
    ansible --inventory=myInventoryFile all -m setup
  6. launch a playbook carefully :
    ansible-playbook --inventory=myInventoryFile --limit=mysql myPlaybook.yml -u ansible -DC
  7. to be continued...
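Here is what a minimal INI inventory could look like (host names, group names and IP address are examples) :
[web]
web1.example.com
web2.example.com

[mysql]
sql.example.com ansible_host=192.168.1.20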

static vs dynamic, aka import vs include

static
  • made with import* directives (import_tasks, import_playbook, ...)
  • pre-processed during playbook parsing time
  • the tags and when directives of a task are copied to all its children
dynamic
  • made with include* directives (include_tasks, include_role, ...)
  • processed at runtime, at the point where the task is encountered
  • the tags and when directives of a task apply to that task only and are not copied to its children
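A minimal sketch showing the practical difference (file name and tag are examples) :
- import_tasks: setup.yml	static : the tasks inside setup.yml inherit the 'prepare' tag
  tags: prepare

- include_tasks: setup.yml	dynamic : only the include task itself carries the 'prepare' tag
  tags: prepare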

ansible and ansible-playbook CLI flags

These flags are common to the ansible and ansible-playbook commands.

Flag / Usage :
-a 'arguments'
--args='arguments'
Pass arguments to the module specified with -m :
-a 'arg1Name=arg1Value arg2Name=arg2Value'
-b
--become
Run operations with become
Does not imply password prompting, use -K.
Provided I understand this correctly, -K _may_ not be necessary if commands requiring "sudo" privileges are configured with NOPASSWD. But that wouldn't be very safe (and I've not been able to make this work so far...). So use -K whenever -b is there.
-C
--check
Do not make any changes on the remote system, but test resources to see what might have changed.
This cannot scan all possible resource types and is only a simulation.
-D
--diff
  • When changing any templated files : show the unified diffs of how they changed.
  • When used with --check (i.e. when simulating) : show how the files would have changed.
-e
--extra-vars
Specify additional variables :
  • in key=value format :
    • -e value=42
    • -e "value1=foo value2=bar"
    • -e "name='John Doe' age=42"
  • as YAML/JSON : -e '{"key": value, "list": ["valueA", "valueB"]}'
  • from a file : -e "@myVars.yml"
-f n
--forks=n
Launch up to n parallel processes (forks)
-i inventory
--inventory=inventory
inventory can be :
  • the path to the inventory file (defaults to /etc/ansible/hosts)
  • OR a comma-separated list of hosts. Don't trick yourself into believing that -i always means an inventory file !
-k
--ask-pass
Prompt for the connection password, if it is needed for the transport used (e.g. SSH). details
-K
--ask-become-pass
Ask for privilege escalation password (i.e. sudo password). details
-l pattern
--limit=pattern
Limit the playbook execution to the slaves matching pattern.
Let's imagine you have web1.example.com, web2.example.com and sql.example.com slaves and want to alter the web servers only : pattern could be specified as 'web*' (glob) or as an explicit list such as 'web1.example.com:web2.example.com'.
-m moduleName
--module-name=moduleName
Execute module moduleName (module index)
-t tags
--tags tags
Only run plays and tasks tagged with these tags. See also : Using tags to slice playbooks
-u remoteUser
--user remoteUser
connect to ansibleSlave as remoteUser
-v
--verbose
verbose mode, -vvv to increase verbosity, -vvvv to enable connection debugging
You may also use the ANSIBLE_DEBUG environment variable :
ANSIBLE_DEBUG=true ansible
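Putting several of these flags together, a cautious first run could look like (inventory and playbook names are examples) :
ansible-playbook -i myInventoryFile -e '@myVars.yml' -l 'web*' -u ansible -bK -D -C myPlaybook.yml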

group_vars

As stated by its name, group_vars is for variables applying to one or more groups of hosts.
Defining in group_vars variables that don't apply to groups is extremely misleading (although it _may_ work) and is discouraged as bad practice.

group_vars can either be :

a regular file
variables for all groups will be defined here.
This is fine unless this file gets really long, complex and barely readable.
a directory (example)
when there are many groups / variables / both, it becomes easier to have a structure like :
    [root]			playbook root directory
        group_vars/		(directory)
            group1.yml		variables for the members of the group1 group
            group2.yml		variables for the members of the group2 group

i.e. : group_vars is a directory with :
  • 1 file per hostgroup
  • each file has variables for the corresponding group
  • each file is named after the group it applies to
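For example, group_vars/group1.yml could contain (variable names and values are hypothetical) :
---
ntp_server: ntp.example.com
max_connections: 100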

Using tags to slice playbooks

Usage :

Let's consider myPlaybook.yml :
- hosts: all
  roles:
    - roleA
    - roleB
    - roleC


- hosts: sql
  roles:
    - roleD
    - roleE
  tags:
    - sqlOnly
It is possible to play roles roleD and roleE on members of the sql host group with :
ansible-playbook -i myInventoryFile --diff --check -t sqlOnly myPlaybook.yml
To specify several tags :
ansible-playbook -t 'tag1,tag2'
tags may also be applied to individual tasks, blocks, etc.

Special tags (source) :

always
will always run a task, unless explicitly skipped with --skip-tags always
never
will prevent a task from running unless a tag is explicitly requested (i.e. never must be associated with another tag)
tagged
will run tasks that have at least 1 tag
untagged
will run tasks that have no tag
all
will run all tasks
By default, Ansible runs as if --tags all had been specified.
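As an illustration, a task combining never with a custom tag (the tag name below is an example) runs only when that tag is explicitly requested :
- name: "Clean everything"
  file:
    path: /tmp/test
    state: absent
  tags:
    - never
    - cleanup

ansible-playbook myPlaybook.yml -t cleanup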

Tags inheritance (source) :

Tags added to :
  • a play
  • or to statically imported tasks and roles (i.e. when using an import_... directive)
add those tags to all of the contained tasks.
This is referred to as tag inheritance.
Tag inheritance is not applicable to dynamic inclusions such as include_role and include_tasks.
When tags is applied to...		it affects the object having the tag	...and its children too
a play					Yes					Yes
anything that is import_*ed		Yes					Yes
anything that is include_*ed		Yes					No

Related directives :

--skip-tags
Not only do tags allow running specific parts of a playbook, they also allow skipping parts :
  • to skip a single tag :
    ansible-playbook myPlaybook.yml [options] --skip-tags tagToSkip
  • to skip several tags :
    ansible-playbook myPlaybook.yml [options] --skip-tags 'tag1,tag2'
--start-at-task
execute subparts of a playbook (without relying on tags)
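Usage (the task name is an example) :
ansible-playbook myPlaybook.yml --start-at-task='Install packages'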

ansible-playbook : prompt for passwords with -k and -K

Everything below also applies to ad-hoc commands launched with ansible.

Setup

  • kevin is sitting at ansibleMaster, where Ansible is installed
  • kevin wants to perform some actions, with Ansible, on ansibleSlave

Running a playbook

kevin@ansibleMaster$ ansible-playbook [some work] ansibleSlave
  • connects to ansibleSlave as kevin via SSH to do [some work]
  • requires kevin to be able to "ssh kevin@ansibleSlave"
  • works the same with or without SSH keys

So :

If, on ansibleMaster, /home/kevin/.ssh/config looks like :
Host ansibleSlave
	User stuart
	IdentityFile ~/.ssh/id_rsa
  • Then :
    kevin@ansibleMaster$ ansible-playbook [some work] ansibleSlave
    will do [some work] on ansibleSlave as stuart, still via SSH, and using key authentication (so no password required).
  • Otherwise (no key authentication configured), Ansible would need to be instructed to prompt for stuart's password on ansibleSlave with -k.

When escalated privileges (i.e. sudo) are necessary

You'll have to include into the ansible / ansible-playbook command line :
for the SSH part (with or without become) :
  • make sure you can ssh myself@ansibleSlave
  • without SSH keys : -k
  • with SSH keys : nothing special
for the sudo part :
  • playbook using become : -K
  • ad-hoc command or playbook not using become : -b together with -K (see the examples below)

Commands that require escalated privileges need not specify sudo : this would be redundant with -b :

ansible all -i ansibleSlave, -bK -m command -a 'whoami'
ansibleSlave | SUCCESS | rc=0 >>
root
Without -b, with sudo :
ansible all -i ansibleSlave, -K -m command -a 'sudo whoami'
ansibleSlave | FAILED | rc=1 >>
sudo: no tty present and no askpass program specified

When "ansible -bkK " keeps failing

If you can successfully run commands manually (ssh myself@ansibleSlave + sudo command) while ansible(-playbook)? -bkK fails :
TASK [setup] *******************************************************************
fatal: [ansibleSlave]: UNREACHABLE! => {"changed": false, "msg": "Authentication failure.", "unreachable": true}
check the points below.

Make sure your local SSH configuration (~/.ssh/config) doesn't interfere

A User directive can send a different login name : grep -i user ~/.ssh/config

Make sure the remote SSH configuration (/etc/ssh/sshd_config) is still appropriate

It _may_ not be up-to-date on a given host because the playbook managing it has not been run for a long time, hence missing / colliding options

Make sure the SSH connection is opened the way you mean it to be

For instance, you may have typed :

ansible-playbook myPlaybook.yml -l 'ansibleSlave' -u $USER -bkK -D
expecting the SSH connection to be opened as $USER (i.e. ssh $USER@ansibleSlave), then sudo to run the playbook tasks.

Check it by making Ansible verbose :

ansible-playbook -vvv myPlaybook.yml -l 'ansibleSlave' -u $USER -bkK -D

<10.27.25.1> ESTABLISH SSH CONNECTION FOR USER: root
So despite my specification, the connection is still opened as root.

Turned out that myPlaybook.yml looks like :

- hosts: all
  remote_user: root		GOTCHA!!!
  roles:
  
... which explains _WHY_.

This is a BAD practice which effectively forces Ansible to "ssh root@ansibleSlave" (this should _NOT_ be possible).

I can see only ONE _very specific_ usage for remote_user: root : for tasks to run on virtual machines that have just been spawned : they have no local user accounts, no sudo / domain / LDAP configured. But I guess there may be cleaner ways than hardcoding this...

To work around this :

ansible-playbook myPlaybook.yml -l 'ansibleSlave' --extra-vars "ansible_user=$USER" -kK -D


ansible-playbook

Usage

Run an Ansible playbook. Typical usage :

ansible-playbook -i inventory --diff --check myPlaybook.yml

Flags

CLI flags are common to several Ansible commands / tools. See this dedicated article.