Ansible - The HowTo's


How to stop the execution of a play for a host and let it continue for the others?

Situation

One of the hosts targeted by a play must stop executing it at some point, while the other hosts continue the play normally.

Solution

Several methods are detailed below; for each one: the code, its output, and its pros / cons.

Method 1: a fail task with a when condition
---
- hosts: slave1, slave2
  tasks:

  - name: "task for all hosts"
    debug:
      msg: "hello from everybody"

  - fail:
    when: inventory_hostname == 'slave1'

  - name: "task for survivors only"
    debug:
      msg: "hello from the survivors"
...
TASK [task for all hosts] ******************
ok: [slave1] =>
  msg: hello from everybody
ok: [slave2] =>
  msg: hello from everybody

TASK [fail] ********************************
fatal: [slave1]: FAILED! => changed=false
  msg: Failed as requested from task
skipping: [slave2]

TASK [task for survivors only] *************
ok: [slave2] =>
  msg: hello from the survivors
  • pros:
    • simple solution
  • cons:
    • the playbook execution status is failed

Method 2: a block with a when condition
---
- hosts: slave1, slave2
  tasks:

  - name: "task for all hosts"
    debug:
      msg: "hello from everybody"

  - block:

      - name: "task for survivors only"
        debug:
          msg: "hello from the survivors"

    when: inventory_hostname != 'slave1'
...
TASK [task for all hosts] ******************
ok: [slave1] =>
  msg: hello from everybody
ok: [slave2] =>
  msg: hello from everybody

TASK [task for survivors only] *************
skipping: [slave1]
ok: [slave2] =>
  msg: hello from the survivors
  • pros:
    • excluding slave1 has no impact on the playbook execution status
Method 3: tasks moved to a separate file, included conditionally (source)
main.yml
---
- hosts: slave1, slave2
  tasks:

  - name: "task for all hosts"
    debug:
      msg: "hello from everybody"

  - block:

      - name: "include tasks for survivors only"
        include_tasks: tasksForSurvivors.yml

    when: inventory_hostname != 'slave1'
...
tasksForSurvivors.yml
---
- name: "task for survivors only"
  debug:
    msg: "hello from the survivors"
...
TASK [task for all hosts] ******************
ok: [slave1] =>
  msg: hello from everybody
ok: [slave2] =>
  msg: hello from everybody

TASK [include tasks for survivors only] *************
skipping: [slave1]
included: path/to/tasksForSurvivors.yml for slave2

TASK [task for survivors only] *************
ok: [slave2] =>
  msg: hello from the survivors
  • pros:
    • (as above)
    • more readable than one giant block + when
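Since Ansible 2.8, there is also a dedicated meta task for this, end_host, which ends the play for the current host only, without marking it as failed. It is not part of the comparison above; here is a sketch along the same lines:

```yaml
- hosts: slave1, slave2
  tasks:

  - name: "task for all hosts"
    debug:
      msg: "hello from everybody"

  # end the play for slave1 only; slave2 continues, nothing is marked as failed
  - meta: end_host
    when: inventory_hostname == 'slave1'

  - name: "task for survivors only"
    debug:
      msg: "hello from the survivors"
```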

How to spawn virtual machines with VMware and Ansible?

Table of contents

  1. Prerequisites
  2. vSphere account
  3. My first playbook
  4. Comments

Prerequisites

Ansible can interact with VMware thanks to its vmware_guest module, which has its own list of requirements.

vSphere account

Instead of using a vSphere administrator account that has full permissions on everything, it is preferable to give Ansible its own account, with limited privileges. To do so, you'll have to:
  1. Create a new role:
    1. Menu | Administration | Roles | +
    2. grant the appropriate permissions
  2. Create a local user account:
    vSphere knows 3 kinds of accounts:
    • domain accounts: defined in an external Active Directory, for instance
    • OS accounts: defined in the underlying Linux (for local services and the like)
    • web app accounts: local user accounts defined within vSphere itself; these are also called SSO accounts
    1. Menu | Administration | Users and Groups
    2. open the Users tab
    3. in the Domain select box, pick vsphere.local
    4. enter the login + password + other information
  3. Attach the role to the user account:
    1. Menu | Administration | Global Permissions | +
    2. select the user + the role you just created
    3. tick [x] Propagate to children
  4. Test the Ansible connection to vSphere by adding a dedicated task at the beginning of the playbook below.

My first playbook

To be honest:
  • this was strongly inspired by the examples found in the docs
  • this was not my _very first_ playbook; consider it my current minimal working example, and keep in mind that some things may be wrong / could be better in the code below
  • pay attention to the comments in the code and in the section below
---
# ANSIBLE_LOCALHOST_WARNING=false ansible-playbook testVmware.yml -DC
- hosts: 127.0.0.1
  connection: local
  gather_facts: no
  vars:
    vcenter:
      structure:
        hostname: 'vcenter.myCompany.tld'
        datacenter: 'MY_DATACENTER'
        cluster: 'MY_CLUSTER'
      account:
        username: 'kevin'
        password: 'P@ssw0rd'
    virtualMachines:
      - name: 'test_VM_made_with_ansible__1'
        folder: '/development'
        state: 'poweredon'
        datastore: 'DS_DEVELOPMENT'
        template: 'TEMPLATE_RHEL8.4'
        hardware:
          memoryMb: 2048
          nbCpu: 1
        networks:
          - name: 'DvS_DEVELOPMENT'
            ip: 10.27.25.23
            netmask: 255.255.255.224
            gateway: 10.27.25.30
          - name: 'DvS_TESTING'
            ip: 10.27.25.200
            netmask: 255.255.255.224
            gateway: 10.27.25.222
        disk:
        - size_gb: 20
          type: thin
        - size_gb: 1    # just trying to create a VM with more disks than the template
          type: thin

      - name: 'test_VM_made_with_ansible__2'
        folder: '/development'
        state: 'poweredon'
        datastore: 'DS_DEVELOPMENT'
        template: 'TEMPLATE_RHEL8.4'
        hardware:
          memoryMb: 2048
          nbCpu: 1
        disk:
        - size_gb: 21    # can the 1st disk be larger than the one specified in the template?
          type: thin
        - size_gb: 2
          type: thin
        networks:
          - name: 'DvS_DEVELOPMENT'
            ip: 10.27.25.24
            netmask: 255.255.255.224
            gateway: 10.27.25.30
          - name: 'DvS_TESTING'
            ip: 10.27.25.201
            netmask: 255.255.255.224
            gateway: 10.27.25.222

  tasks:

  - name: Test connection to vCenter
    vmware_vm_facts:
      hostname: "{{ vcenter.structure.hostname }}"
      username: "{{ vcenter.account.username }}"
      password: "{{ vcenter.account.password }}"
      validate_certs: no

  - name: "Delete all the VMs"    # so that I can start clean each time I execute this playbook
    vmware_guest:
      hostname: "{{ vcenter.structure.hostname }}"
      username: "{{ vcenter.account.username }}"
      password: "{{ vcenter.account.password }}"
      validate_certs: no
      name: "{{ vm.name }}"
      state: absent
      force: yes    # necessary when trying to delete a powered-on VM
    loop:
      "{{ virtualMachines }}"
    loop_control:
      loop_var: vm

  - name: "Create virtual machines"
    vmware_guest:
      hostname: "{{ vcenter.structure.hostname }}"
      username: "{{ vcenter.account.username }}"
      password: "{{ vcenter.account.password }}"
      validate_certs: no
      datacenter: "{{ vcenter.structure.datacenter }}"
      cluster: "{{ vcenter.structure.cluster }}"
      name: "{{ vm.name }}"
      folder: "{{ vm.folder }}"
      state: "{{ vm.state }}"
      datastore: "{{ vm.datastore }}"
      disk: "{{ vm.disk }}"
      hardware:
        memory_mb: "{{ vm.hardware.memoryMb }}"
        num_cpus: "{{ vm.hardware.nbCpu }}"
        scsi: paravirtual
      networks: "{{ vm.networks }}"
      template: "{{ vm.template }}"
    loop:
      "{{ virtualMachines }}"
    loop_control:
      loop_var: vm
...

Comments

  • if a newly created VM only boots in Emergency Mode:
    • you should see an error message suggesting things like running journalctl -xb to get an idea of what's going on
    • if your VM has fewer disks than the template used to build it, its /etc/fstab will refer to a missing disk, causing the Emergency Mode. Comment out the extra disk(s) in /etc/fstab and the VM will boot normally
  • about disks:
    • a VM can have more disks than the template used to build it
    • a VM disk size can be equal to or greater than the size of the corresponding template disk, but not smaller
    • about the datastore parameter:
      • despite being a per-disk value, all disks of a VM will be created on the same datastore as the first disk, whatever value is given to this parameter for the 2nd (and next) disk(s) (including a non-existing datastore name)
      • changing its value on an existing VM:
        • will NOT affect the disk(s)
        • the corresponding task will have an ok (no change) status
    • not sure whether it is possible to remove a disk from an existing VM
  • about templates:
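The /etc/fstab fix above can be sketched as follows (demo on a throwaway file; /dev/sdb1 stands for the missing disk and is purely hypothetical; in the real situation, you would edit /etc/fstab itself from the VM's emergency shell):

```shell
# build a sample fstab whose last entry refers to a disk the VM does not have
cat > fstab.demo <<'EOF'
/dev/mapper/rhel-root   /       xfs     defaults        0 0
/dev/sdb1               /data   xfs     defaults        0 0
EOF

# comment out the entry pointing at the missing disk
sed -i 's|^/dev/sdb1|# /dev/sdb1|' fstab.demo

cat fstab.demo
```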

How to view Ansible setting values (custom or default)?

  1. Custom values can be defined in configuration files to override defaults. So something like this should do the trick:
    ansibleSetting='become_method'; grep -E "^[^#]*$ansibleSetting" "$ANSIBLE_CONFIG" ./ansible.cfg ~/.ansible.cfg /etc/ansible/ansible.cfg 2>/dev/null
  2. If the command above returns nothing (which is pretty likely), this means:
    • no custom value has been defined for the considered setting
    • the default value is still in use
    How to get this default value?
  3. These default values are pretty awkward to find in the online docs, but here comes ansible-config:
    • let's consider we're looking for the currently configured value of the become_method directive:
      • which is neither defined in any configuration file
      • nor specified in a playbook / role / task / file (don't forget to check this)
      • what we'll get is the default value
    • ansible-config list | grep -i -A3 'become_method'
      DEFAULT_BECOME_METHOD:
        default: sudo
        description: Privilege escalation method to use when `become` is enabled.
        env:
        - {name: ANSIBLE_BECOME_METHOD}
        ini:
        - {key: become_method, section: privilege_escalation}
        name: Choose privilege escalation method
    • We also found that the become_method directive has a DEFAULT_BECOME_METHOD counterpart.
    • To view effective values rather than defaults, ansible-config dump lists them all, and ansible-config dump --only-changed lists only those differing from the defaults.

How to subtract a list from a list?

As said in the Ansible documentation, different kinds of filters are available to manipulate data.

See the code:

---
#	ANSIBLE_LOCALHOST_WARNING=false ansible-playbook test.yml
- hosts: 127.0.0.1
  connection: local
  gather_facts: no

  vars:
    myList: [ 'a', 'b', 'c', '4', 'd', '9' ]
    groceryList: [ 'fruit', 'vegetables', 'milk', 'FAT', 'SALT', 'SUGAR' ]
    unhealthyFood: [ 'FAT', 'SALT', 'SUGAR' ]

  tasks:

  - set_fact:
      rejectSingleItem: "{{ myList | reject('search', '4') | list }}"
      rejectNonMatching: "{{ myList | reject('match', '[^a-z]') | list }}"
      healthyGroceryList_forLoop: "[ {% for item in groceryList if item not in unhealthyFood %}'{{ item }}'{{ ', ' if not loop.last else '' }}{% endfor %} ]"
      healthyGroceryList_reject: "{{ groceryList | reject('in', unhealthyFood) | list }}"
      healthyGroceryList_difference: "{{ groceryList | difference(unhealthyFood) }}"

  - debug:
      var: "{{ item }}"
    loop:
      - rejectSingleItem
      - rejectNonMatching
      - healthyGroceryList_forLoop
      - healthyGroceryList_reject
      - healthyGroceryList_difference
...
TASK [debug] *****************************************************
ok: [127.0.0.1] => (item=rejectSingleItem) => {
    "ansible_loop_var": "item",
    "item": "rejectSingleItem",
    "rejectSingleItem": [
        "a",
        "b",
        "c",
        "d",
        "9"
    ]
}
ok: [127.0.0.1] => (item=rejectNonMatching) => {
    "ansible_loop_var": "item",
    "item": "rejectNonMatching",
    "rejectNonMatching": [
        "a",
        "b",
        "c",
        "d"
    ]
}
ok: [127.0.0.1] => (item=healthyGroceryList_forLoop) => {
    "ansible_loop_var": "item",
    "healthyGroceryList_forLoop": [
        "fruit",
        "vegetables",
        "milk"
    ],
    "item": "healthyGroceryList_forLoop"
}
ok: [127.0.0.1] => (item=healthyGroceryList_reject) => {
    "ansible_loop_var": "item",
    "healthyGroceryList_reject": [
        "fruit",
        "vegetables",
        "milk"
    ],
    "item": "healthyGroceryList_reject"
}
ok: [127.0.0.1] => (item=healthyGroceryList_difference) => {
    "ansible_loop_var": "item",
    "healthyGroceryList_difference": [
        "fruit",
        "vegetables",
        "milk"
    ],
    "item": "healthyGroceryList_difference"
}

If you get a no test named 'in' error:

  • The in test was added in Jinja2 2.10 (see the Jinja2 changelog), so you may not be up-to-date.
  • Check it:
    dpkg -l | grep jinja2
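For comparison, here is the equivalent subtraction in plain Python (not Ansible code, just to make the semantics concrete; note that the difference filter additionally de-duplicates its output, which a plain comprehension does not):

```python
grocery_list = ['fruit', 'vegetables', 'milk', 'FAT', 'SALT', 'SUGAR']
unhealthy_food = ['FAT', 'SALT', 'SUGAR']

# equivalent of: "{{ groceryList | difference(unhealthyFood) }}"
healthy = [item for item in grocery_list if item not in unhealthy_food]
print(healthy)  # ['fruit', 'vegetables', 'milk']
```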

How to send an alert with Ansible?

Several ways to send an alert are detailed below; for each one: the code, its output, and some notes.
- fail:
    msg: "ALERT!"
TASK [fail] *********************************************************************************************
fatal: [myHost]: FAILED! => {"changed": false, "msg": "ALERT!"}

PLAY RECAP **********************************************************************************************
myHost	: ok=n	changed=0	unreachable=0	failed=1	skipped=0	rescued=0	ignored=0
  • extremely visible since the execution aborts immediately
  • depending on the situation, we _may_ prefer the execution to continue uninterrupted
- fail:
    msg: "ALERT!"
  ignore_errors: true
TASK [fail] *********************************************************************************************
fatal: [myHost]: FAILED! => {"changed": false, "msg": "ALERT!"}
...ignoring
following tasks, then finally:
PLAY RECAP **********************************************************************************************
myHost	: ok=n	changed=0	unreachable=0	failed=0	skipped=0	rescued=0	ignored=1
  • does not interrupt the execution
  • depending on the number of tasks, the alert goes from slightly less visible to almost invisible, lost in the log output
- block:
    - fail:
        msg: "ALERT!"

  rescue:
    - debug:
        msg: "RESCUE"
TASK [fail] *********************************************************************************************
fatal: [myHost]: FAILED! => {"changed": false, "msg": "ALERT!"}

TASK [debug] ********************************************************************************************
ok: [myHost] => {
    "msg": "RESCUE"
}
following tasks, then finally:
PLAY RECAP **********************************************************************************************
myHost	: ok=n	changed=0	unreachable=0	failed=0	skipped=0	rescued=1	ignored=0
  • does not interrupt the execution
  • requires rescuing the error instead of ignoring it
  • appears in the rescued tasks count
  • as above, depending on the context: from less visible to almost hidden
- mail:
  
(nothing special, just like any other task)
  • requires a mail server (+ account) that is reachable by the host
  • with time, scripts sending emails may turn into spam machines
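A minimal mail task could look like this (the SMTP server name and addresses below are made up; check the mail module documentation for the full parameter list):

```yaml
- mail:
    host: smtp.myCompany.tld    # hypothetical SMTP relay reachable by the controller
    port: 25
    to: admin@myCompany.tld
    subject: "ALERT on {{ inventory_hostname }}!"
    body: "Something went wrong, please have a look."
  delegate_to: localhost
```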

How to specify target hosts?

A few words of warning:

Examples below:

Now let's target hosts:

by name

  • a list of hosts:
    ansible host1,host2,host3
  • a list of hosts matching a regular expression:
    ansible '~server0[12]\.acme\.org'
    • matches server01.acme.org and server02.acme.org
    • make sure you're not limiting the effective target with the ansible-playbook -l flag

by group

  • all hosts of a single group:
    ansible groupName
  • all hosts known to Ansible:
    ansible all
  • hosts of several groups:
    ansible group1:group2:group3
    • the colon : actually means a logical OR: this command applies to any host belonging either to group1 or to group2 or to group3
    • when it comes to complex rules with intersections and exclusions (see examples), it may not be a REAL "logical OR"
  • hosts belonging to the 2 specified groups (aka intersection of groups):
    ansible 'group1:&group2'

by name + group

  • all hosts except those matching an expression:
    • ansible 'all:!expression*'
    • ansible 'groupName:!badHost'
  • all hosts of a group except several of them:
    ansible 'groupName:!~(host1|host2)'
    • ~ is used to introduce a regular expression
    • you get the idea: exclude as many hosts as necessary
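These patterns are not limited to the ansible CLI; a play's hosts: line accepts them too, e.g. (sketch with made-up group names):

```yaml
# target hosts belonging to group1 AND group2, except badHost
- hosts: 'group1:&group2:!badHost'
  tasks:

  - debug:
      msg: "hello from {{ inventory_hostname }}"
```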

How to loop on a list of items?

Let's consider this play:
- hosts: 127.0.0.1
  connection: local
  gather_facts: no
  tasks:

  - set_fact:
      fruits: [
        'apple',
        'orange',
        'banana',
        ]

  - debug:
      var: item
    loop:
      - fruits

  - debug:
      var: item
    loop:
      - "{{ fruits }}"

  - debug:
      var: item
    loop:
      "{{ fruits }}"
which outputs:
TASK [debug] ********************************************************************
ok: [127.0.0.1] => (item=fruits) => {
    "item": "fruits"
}

TASK [debug] ********************************************************************
ok: [127.0.0.1] => (item=[u'apple', u'orange', u'banana']) => {
    "item": [
        "apple",
        "orange",
        "banana"
    ]
}

TASK [debug] ********************************************************************
ok: [127.0.0.1] => (item=apple) => {
    "item": "apple"
}
ok: [127.0.0.1] => (item=orange) => {
    "item": "orange"
}
ok: [127.0.0.1] => (item=banana) => {
    "item": "banana"
}

Explanations

- fruits
  • this passes the string fruits, not the fruits variable
  • use "{{ }}" to pass a variable
- "{{ fruits }}"
  • the leading dash - introduces a list item
  • _HERE_ it's a one-item list, and this item happens to be a list
  • anyway, this is why the loop runs only once and displays all the fruits at once
"{{ fruits }}"
this is what we expected
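Outside Ansible, the three loop forms above behave roughly like these plain-Python loops (an analogy, not what Ansible actually does internally):

```python
fruits = ['apple', 'orange', 'banana']

# loop: / - fruits           → iterates over a one-item list containing the string 'fruits'
case1 = [item for item in ['fruits']]

# loop: / - "{{ fruits }}"   → iterates over a one-item list whose single item is the list
case2 = [item for item in [fruits]]

# loop: "{{ fruits }}"       → iterates over the list itself, one turn per fruit
case3 = [item for item in fruits]

print(case1)  # ['fruits']
print(case2)  # [['apple', 'orange', 'banana']]
print(case3)  # ['apple', 'orange', 'banana']
```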

How to stop an ansible-playbook execution (i.e. something like --stop-at-task), with / without a failure exit code?

Situation

Context: development / test / debug; just need a quick-n-dirty STOP HERE! instruction
  • desired status code: don't care
  • suggested method: fail

Context: any time; some requirements are not met
  • desired status code: failure
  • suggested method: fail

Context: ending in a nothing to do situation; just want to leave nicely
  • desired status code: success
  • suggested method: end_play
There is a --start-at-task option (source: 1, 2) but no --stop-at-task so far.

Solution

This stops the execution abruptly, with a failure status, making this solution mostly suited for debugging. It has the advantage that:
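Concretely, the quick-n-dirty stop marker is nothing more than a fail task dropped where the execution should stop:

```yaml
- name: "STOP HERE!"
  fail:
    msg: "Execution stopped on purpose"
```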

Exceptions?

So far, I've found no exception to this rule: when using fail, the playbook execution stops. Period.
However, I was amazed once when, despite fail, the execution continued (or at least _seemed_ to continue). Here's what happened:
  1. RedHat.yml is a role task file having a fail task; everything should go extremely well
  2. the role RedHat.yml belongs to is applied by myPlaybook.yml
  3. I run myPlaybook.yml on a group of hosts (myGroup has MANY hosts):
    ansible-playbook myPlaybook.yml -l myGroup
  4. the execution starts normally: I can see a series of expected fatal: [hostname]: FAILED! => {"changed": false, "msg": "Execution stopped on purpose for whatever reason"}, then the output continues with the following tasks
Explanation (this is highly specific to my context / code):
  1. when myPlaybook.yml starts running the role having the RedHat.yml task file, it starts executing main.yml
  2. main.yml makes a conditional include_tasks of RedHat.yml, meaning some hosts get it, some don't. This is where the trick happens:
    • RedHat.yml is included by Red Hat servers
    • but there is also Debian.yml, with a similar set of tasks (hence task names) for Debian servers
  3. the hosts that actually included RedHat.yml fail as expected (and are removed from the list of the playbook targets)
  4. the other hosts continue normally, which is what I can see
    the tasks running at that time are not from RedHat.yml anymore but from Debian.yml
  5. reasons why I was not able to see what was happening:
    • large number of hosts in myGroup: the execution lists all of them and scrolls fast
    • cryptic hostnames: pretty difficult to spot that some hosts have disappeared after the fail
    • identical task names in RedHat.yml and in Debian.yml: when the playbook execution showed the name of the task right after the fail, I thought the execution of the file I edited continued

Alternate solution

This _can_ do the job too:
- meta: end_play
But it's fairly different from the fail method: end_play ends the play for all remaining hosts at once, and the playbook exits with a success status.

How to override /etc/ansible/ansible.cfg settings with personal values?

Create ~/.ansible.cfg and replicate + override the required sections / values:
[defaults]
host_key_checking	= False
inventory		= /home/stuart/ansible/hosts
roles_path		= /home/stuart/ansible/roles
vault_password_file	= /var/lib/ansible/.vault_password

How to use with_items?

Because I can never remember how to use with_items, here's an example:
- name: unmount volume groups
  mount:
    src: "{{ item.device }}"
    name: "{{ item.mountPoint }}"    # will be path in Ansible 2.3+
    state: unmounted
  with_items:
    - { device: '/dev/mapper/vg_data-data', mountPoint: '/var/lib/docker/devicemapper' }
    - { device: '/dev/mapper/vg_data-data', mountPoint: '/var/lib/docker' }
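For the record, here is the same task rewritten with the loop keyword (available since Ansible 2.5) and the path parameter:

```yaml
- name: unmount volume groups
  mount:
    src: "{{ item.device }}"
    path: "{{ item.mountPoint }}"    # path replaced name in Ansible 2.3+
    state: unmounted
  loop:
    - { device: '/dev/mapper/vg_data-data', mountPoint: '/var/lib/docker/devicemapper' }
    - { device: '/dev/mapper/vg_data-data', mountPoint: '/var/lib/docker' }
```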