Ansible errors - (sh*t happens)

The network starts disconnected after deploying a VM with Ansible and vmware_guest

Situation

This article refers to a situation I experienced when trying to spawn virtual machines with VMware and Ansible.

Details

I found several posts that did NOT fix this issue :

The answer is subtle (and confirmed by several sources) : customization of Linux guest operating systems requires that Perl is installed in the Linux guest operating system.

A minimal install of RHEL 8.4 does NOT install Perl.

An error message in the vm-tools logs may also have caught your attention (detail) :

/usr/bin/perl: bad interpreter or No such file or directory
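
To check a template VM before cloning it, something like the sketch below could be run inside the guest. It only tests that the interpreter path from the error message exists and is executable. The function name has_executable is mine, not part of VMware tools or Ansible.

```python
import os


def has_executable(path="/usr/bin/perl"):
    """Return True if `path` is an existing file that the current user may execute."""
    # Hypothetical helper: the default path is the one quoted in the vm-tools error
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    if has_executable():
        print("/usr/bin/perl found")
    else:
        print("/usr/bin/perl missing: install Perl before rebuilding the template")
```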

Solution

Now that we know what's missing, it should be as simple as :
yum install perl
in the VM used to build the template (+ clone it as a template _again_ ). But what if this VM : Details in the dedicated article : How to install software on RHEL with the install DVD only ?.

timedatectl command was found but not usable: Failed to create bus connection: No such file or directory + given timezone "Etc/UTC" is not available

Situation

Full error message :
TASK [common : set timezone to Etc/UTC] ******************************************************************************************************************
 [WARNING]: timedatectl command was found but not usable: Failed to create bus connection: No such file or directory . using other method.

fatal: [myHost]: FAILED! => {"changed": false, "msg": "Error message:\ngiven timezone \"Etc/UTC\" is not available"}

Details

There are 2 problems here :
  1. timedatectl command was found but not usable
  2. given timezone "Etc/UTC" is not available
    • I got this error with both Ansible 2.7.9 and 2.8.6
    • The source code shows :
      def _verify_timezone(self):
          tz = self.value['name']['planned']
          tzfile = '/usr/share/zoneinfo/%s' % tz
          if not os.path.isfile(tzfile):
              self.abort('given timezone "%s" is not available' % tz)
          return tzfile
    • which was confirmed within the container by :
      ll /usr/share/zoneinfo/Etc/UTC
In my case, both errors were caused by the fact that myHost is a Docker container.
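
The same check can be reproduced outside Ansible. This is a minimal sketch of the test shown in the source excerpt above, not the module itself: the function name timezone_file and the zoneinfo_dir parameter are my own additions, the latter so the check can be pointed at another directory.

```python
import os


def timezone_file(tz, zoneinfo_dir="/usr/share/zoneinfo"):
    """Return the zoneinfo file path for `tz`, or None when the file is missing."""
    # Same test as the module excerpt: the timezone is "available" only if its file exists
    tzfile = os.path.join(zoneinfo_dir, tz)
    return tzfile if os.path.isfile(tzfile) else None


if __name__ == "__main__":
    for tz in ("Etc/UTC", "Europe/Paris"):
        print(tz, "->", timezone_file(tz) or "not available")
```

Run inside the container, it should report "not available" whenever the Ansible task would abort with the message above.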

Solution

for timedatectl command was found but not usable (sources : 1, 2) :

  1. Stop containers :
    docker-compose stop
  2. Add to docker-compose.yml :
      s_myHost:
        hostname: myHost
        build: .
        container_name: c_myHost
        tty: true
        volumes:
         - /run/dbus/system_bus_socket:/run/dbus/system_bus_socket:ro
  3. Rebuild + restart :
    docker-compose up --build -d
  4. update your Ansible inventory

for given timezone "Etc/UTC" is not available :

Same as above, with :
  s_myHost:
    hostname: myHost
    build: .
    container_name: c_myHost
    tty: true
    volumes:
     - /run/dbus/system_bus_socket:/run/dbus/system_bus_socket:ro
     - /usr/share/zoneinfo:/usr/share/zoneinfo:ro

"msg": "Failed to connect to the host via ssh: ", "unreachable": true

Situation

Full error message :
failed: [myHost] (item=someItem) => {"item": "someItem", "msg": "Failed to connect to the host via ssh: ", "unreachable": true}

Details

This error is very common and can have multiple causes. It can usually be fixed by making sure you're doing SSH right.

ERROR! 'delegate_to' is not a valid attribute for a TaskInclude

Situation

The error message :
ERROR! 'delegate_to' is not a valid attribute for a TaskInclude
refers to this task :
  - name: Create MySQL manager user
    include_tasks: mysql-users.yml
    when: create_mysql_manager | bool
    delegate_to: "{{ groups['mysql'][0] }}"
    run_once: true
    vars:
      mysql_user_name: "{{ mysql_manager_username }}"
      mysql_user_password: "{{ mysql_manager_password }}"
      mysql_user_state: 'present'
      mysql_user_priv: '*.*:ALL,GRANT'
      mysql_user_checkadmin: false
      mysql_user_updatepw: 'on_create'
This is legacy code I have to support and (try to) update so that it runs without errors. It was built for Ansible 2.7. Not sure this was (still is) the appropriate way of doing things.

Details

ansible --version
ansible 2.8.4

  python version = 3.7.3 (default, Apr  3 2019, 05:39:12) [GCC 8.3.0]

Solution

Not a solution but a workaround (source), add to ansible.cfg :
[defaults]

invalid_task_attribute_failed=False

UNREACHABLE! => Data could not be sent to remote host "12.34.56.78". Make sure this host can be reached over ssh:

Situation

Full error message :
fatal: [myHost]: UNREACHABLE! => {
    "changed": false,
    "msg": "Data could not be sent to remote host \"12.34.56.78\". Make sure this host can be reached over ssh: ",
    "unreachable": true
}

Details

This error is usually no big deal, but it can be pretty frustrating since it can have several (simultaneous) causes.

Solution

Make sure :
  1. 12.34.56.78 is really the host you'd like to manage with Ansible : is your inventory file up-to-date ?
  2. you can ssh -i privateKey remoteUser@12.34.56.78 :
  3. you are specifying the right user in the Ansible command line :
    ansible-playbook [other options] -u username
  4. Ansible is actually using the username you specified :
    Triple v's are necessary for the verbosity level shown here.
    ansible-playbook [other options] -u stuart -vvv
    <12.34.56.78> ESTABLISH SSH CONNECTION FOR USER: stuart
    <12.34.56.78> SSH: EXEC ssh -C \
    	-o ForwardAgent=yes \
    	-o ControlMaster=auto \
    	-o ControlPersist=60s \
    	-o StrictHostKeyChecking=no \
    	-o KbdInteractiveAuthentication=no \
    	-o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey \
    	-o PasswordAuthentication=no \
    	-o 'User="stuart"' \
    	-o ConnectTimeout=20 \
    	-o ControlPath=$HOME/.ansible/cp/5aa7fea824 172.18.0.2 '/bin/sh -c '"'"'sudo -H -S -n  -u root /bin/sh -c '"'"'"
    '"'"'"'"'"'echo BECOME-SUCCESS-newtperqpfiryppbejvbytibvfvwafjg ; /usr/bin/python'"'"'"'"'"'"'"'"' && sleep 0'"'"''
    In this output :
      • ESTABLISH SSH CONNECTION FOR USER shows the username Ansible is actually trying to connect as
      • SSH: EXEC is the full ssh command with all options (line broken here for readability)
      • cp, in the ControlPath value, stands for ControlPath
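
The manual ssh test from step 2 can be scripted so that it fails fast instead of prompting. This is only a sketch under my own assumptions: build_ssh_check and can_ssh are hypothetical helper names, and the host, user and key are the placeholders used above. BatchMode=yes and ConnectTimeout are standard OpenSSH client options.

```python
import subprocess


def build_ssh_check(host, user, key_path, timeout=20):
    """Build an ssh command line that never prompts and gives up after `timeout` seconds."""
    return [
        "ssh",
        "-i", key_path,
        "-o", "BatchMode=yes",               # fail instead of asking for a password
        "-o", "ConnectTimeout=%d" % timeout,
        "%s@%s" % (user, host),
        "true",                              # harmless remote command
    ]


def can_ssh(host, user, key_path):
    """Return True if the remote `true` command exits with status 0."""
    return subprocess.run(build_ssh_check(host, user, key_path)).returncode == 0
```

Calling can_ssh("12.34.56.78", "remoteUser", "privateKey") returns False as soon as the key, the user or the network path is wrong, which narrows things down before involving Ansible at all.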

AnsibleFilterError: The ipaddr filter requires python-netaddr be installed on the ansible controller

Situation

One of my playbooks miserably fails complaining :
failed: [slave_n] (item=etc/iptables/rules.v4) => {"changed": false, "item": "etc/iptables/rules.v4", "msg": "AnsibleFilterError: The ipaddr filter requires python-netaddr be installed on the ansible controller"}

Solution

Just install the missing module :

Alternate solution

You may also (source) :
pip install netaddr
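
To confirm the module is visible afterwards, a quick check can be run with the Python interpreter that runs Ansible on the controller. A minimal sketch, where module_available is a hypothetical helper name of mine :

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the current interpreter can locate the module `name`."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when the parent package itself is missing
        return False


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("netaddr available:", module_available("netaddr"))
```

If this prints False right after a successful pip install, pip most likely installed the package for a different interpreter than the one shown on the first line.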

fatal: [server]: FAILED! => {"msg": "Incorrect sudo password"}

Situation

Details

Steps to reproduce

  1. ansible-playbook -i myInventoryFile -l *pattern* all.yml -t myTag -kK -DC
  2. which prompts :
    SSH password: _
  3. so I enter my SSH password. I think the black magic lies here (details)
  4. I'm then prompted :
    SUDO password[defaults to SSH password]: _
    and I just press Enter
  5. the playbook execution begins, until it fails as described above

Technical environment

Welcome to The Twilight Zone

Things are getting weird, be prepared ! (See also the alternate solution)

My sudo password is a random string of characters generated by pwgen.

No idea whether this is related or not, but it contains special characters that may puzzle the shell such as ;, (, }, |, ? or ~.

When developing / running playbooks, I have to enter my password again and again, but since I'm a lazy guy, it's also in the *scratch* buffer of my editor, Emacs, so that I can copy-paste it into the shell window when prompted.
In editors, you can copy text either :

  • up to its last character (i.e. end of the word / string / line)
  • or as a whole line, including the trailing carriage return. Which saves pressing Enter after pasting (lazy guy, told you !)

Solution

Copy-pasting the password without the trailing carriage return seems to fix it.

Alternate solution

As said earlier, this behavior is rather puzzling, and no formal solution has been found so far. However, it is possible to work around it by
  1. removing the need for a sudo password with the NOPASSWD directive
  2. and running the playbook without the -K flag

AnsibleError: Can't LOOKUP(dig): module dns.resolver is not installed

Situation

One of my playbooks miserably fails complaining :
failed: [slave_n] (item=someItem) => {"failed": true, "item": "someItem", "msg": "AnsibleError: Can't LOOKUP(dig): module dns.resolver is not installed"}

Details

Looks like something's missing on the Ansible "master" host.

Solution

Just install the missing module :
  1. As root :
  2. As a standard user :
    pip install --upgrade dnspython
    Collecting dnspython
    	Downloading dnspython-1.15.0-py2.py3-none-any.whl (177kB)
    		100% |................................| 184kB 4.1MB/s
    Installing collected packages: dnspython
    Successfully installed dnspython-1.15.0
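
When several Pythons live on the Ansible "master" host, a bare pip may not belong to the interpreter that runs Ansible. One way around this is to call pip as a module of a chosen interpreter. The sketch below only builds that command. pip_install_command is a hypothetical helper name, and the default interpreter is simply the one running the script.

```python
import sys


def pip_install_command(package, python=sys.executable):
    """Build a pip command bound to one specific Python interpreter."""
    return [python, "-m", "pip", "install", "--upgrade", package]


if __name__ == "__main__":
    # Print the command rather than run it, so it can be reviewed first
    print(" ".join(pip_install_command("dnspython")))
```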

Alternate solution

As root :