Manage large projects

Posted by dlgilbert on Mon, 03 Jan 2022 08:02:10 +0100

1. Selecting hosts with host patterns

1.1 Referencing inventory hosts

Host patterns are used to specify the hosts that a play or ad hoc command targets. In its simplest form, a host pattern is the name of a managed host or host group in the inventory.

In a play, the hosts directive specifies the managed hosts to run the play against. For an ad hoc command, the host pattern is passed to the ansible command as a command-line argument.

The following example inventory is used throughout this section to demonstrate host patterns.

[root@localhost ~]# cat myinventory 
web.example.com
data.example.com

[lab]
labhost1.example.com
labhost2.example.com

[test]
test1.example.com
test2.example.com

[datacenter1]
labhost1.example.com
test1.example.com

[datacenter2]
labhost2.example.com
test2.example.com

[datacenter:children]
datacenter1
datacenter2

[new]
172.16.103.129
172.16.103.130

To demonstrate how host patterns are resolved, we'll run an Ansible playbook, playbook.yml, using different host patterns to target different subsets of the managed hosts in this example inventory.

1.2 Managed hosts

The most basic host pattern is the name of a single managed host listed in the inventory. It specifies that this host is the only one in the inventory that the ansible command acts on.

When the playbook runs, the first Gathering Facts task runs on all managed hosts that match the host pattern. A failure during this task can cause the managed host to be removed from the play.

If the inventory explicitly lists an IP address instead of a host name, you can use that IP address as a host pattern. If the IP address is not listed in the inventory, you cannot use it to specify the host, even if the IP address resolves to that host name in DNS.

The following example shows a host pattern that refers to an IP address contained in the inventory.

[root@localhost ~]# vim playbook.yml
---
- hosts: 192.168.101.120


[root@localhost ~]# ansible-playbook playbook.yml 
PLAY [192.168.101.120] ***************************************************************

TASK [Gathering Facts] **************************************************************
ok: [192.168.101.120]

PLAY RECAP **************************************************************************
192.168.101.120            : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Note:
One problem with referring to managed hosts by IP address in the inventory is that it is hard to remember which IP address belongs to which host targeted by a play or ad hoc command. However, if no resolvable host name exists, you may have to specify the host by IP address in order to connect.

You can point an alias at a specific IP address in the inventory by setting the ansible_host host variable. For example, you can have a host named dummy.example in your inventory, and then direct connections using that name to the IP address 192.168.101.120 by creating a host_vars/dummy.example file containing the following host variable:

ansible_host: 192.168.101.120

1.3 Specifying hosts using groups

When a group name is used as a host pattern, Ansible acts on the hosts that are members of that group.

---
- hosts: lab

Remember, there is a special group named all that matches every managed host in the inventory.

---
- hosts: all

There is also a special group named ungrouped, which includes all managed hosts in the inventory that do not belong to any other group:

---
- hosts: ungrouped
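
The relationship between all, ungrouped, and ordinary groups can be sketched in a few lines of Python. This is only an illustrative model of the example inventory, not how Ansible stores inventory internally; the names TOP_LEVEL_HOSTS, GROUPS, all_hosts, and ungrouped_hosts are made up for this sketch. The datacenter parent group is left out because its hosts are already covered by datacenter1 and datacenter2.

```python
# Illustrative model of the example inventory (hypothetical names, not an Ansible API).
TOP_LEVEL_HOSTS = ["web.example.com", "data.example.com"]  # listed before any [group] header
GROUPS = {
    "lab": ["labhost1.example.com", "labhost2.example.com"],
    "test": ["test1.example.com", "test2.example.com"],
    "datacenter1": ["labhost1.example.com", "test1.example.com"],
    "datacenter2": ["labhost2.example.com", "test2.example.com"],
    "new": ["172.16.103.129", "172.16.103.130"],
}


def all_hosts():
    """Every managed host in the inventory: what `hosts: all` targets."""
    hosts = set(TOP_LEVEL_HOSTS)
    for members in GROUPS.values():
        hosts.update(members)
    return hosts


def ungrouped_hosts():
    """Hosts that are not a member of any named group."""
    grouped = set()
    for members in GROUPS.values():
        grouped.update(members)
    return all_hosts() - grouped


print(sorted(all_hosts()))
print(sorted(ungrouped_hosts()))
```

In this model, only the two hosts listed above the first group header end up in ungrouped.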

1.4 Matching multiple hosts using wildcards

Another way to achieve the same result as the all host pattern is the * wildcard, which matches any string. If the host pattern is just a quoted asterisk, all hosts in the inventory match.

---
- hosts: '*'

Important

Some characters used in host patterns also have meaning to the shell. This can cause problems when you run ad hoc commands with ansible from the command line. It is recommended that you enclose host patterns used on the command line in single quotation marks to protect them from unwanted shell expansion.
Similarly, if you use any special wildcard or list characters in an Ansible playbook, you must put the host pattern in single quotation marks so that it is parsed correctly.

---
- hosts: '!test1.example.com,development'

The * wildcard can also match part of the name of a managed host or group.
For example, the following wildcard host pattern matches all inventory names that end in .example.com:

---
- hosts: '*.example.com'

The following example uses a wildcard host pattern to match the names of hosts or host groups that start with 192.168.2.:

---
- hosts: '192.168.2.*'

The following example uses a wildcard host pattern to match the names of hosts or host groups that begin with datacenter:

---
- hosts: 'datacenter*'

Important

Wildcard host patterns match all inventory names, hosts and host groups alike. They do not distinguish between names that are DNS names, IP addresses, or groups, which can lead to unexpected matches.
For example, based on the example inventory, compare the result of the datacenter* host pattern from the previous example with the result of the data* host pattern, which additionally matches the host data.example.com:

---
- hosts: 'data*'
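
The difference between the two patterns can be reproduced with a small Python sketch. It assumes shell-style globbing applied to both host names and group names, which is how the behaviour is described above; the data structures and the match_wildcard helper are hypothetical and only mirror the example inventory, they are not part of Ansible.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory copy of the example inventory (not an Ansible API).
GROUPS = {
    "lab": {"labhost1.example.com", "labhost2.example.com"},
    "test": {"test1.example.com", "test2.example.com"},
    "datacenter1": {"labhost1.example.com", "test1.example.com"},
    "datacenter2": {"labhost2.example.com", "test2.example.com"},
    "new": {"172.16.103.129", "172.16.103.130"},
}
# The parent group contains the hosts of its child groups.
GROUPS["datacenter"] = GROUPS["datacenter1"] | GROUPS["datacenter2"]
UNGROUPED = {"web.example.com", "data.example.com"}
ALL_HOSTS = UNGROUPED.union(*GROUPS.values())


def match_wildcard(pattern):
    """Return hosts whose own name, or whose group's name, matches the glob."""
    matched = {host for host in ALL_HOSTS if fnmatchcase(host, pattern)}
    for group, members in GROUPS.items():
        if fnmatchcase(group, pattern):
            matched |= members
    return matched


print(sorted(match_wildcard("datacenter*")))
print(sorted(match_wildcard("data*")))
```

In this model, datacenter* matches only group names, while data* matches the same groups and also the ungrouped host data.example.com.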

1.5 Lists

Multiple entries in the inventory can be referenced using logical lists. A comma-separated list of host patterns matches all hosts that match any of those host patterns.

If you provide a comma-separated list of managed hosts, all of those managed hosts are targeted:

---
- hosts: labhost1.example.com,test2.example.com,192.168.2.2

If you provide a comma-separated list of groups, all hosts that belong to any of those groups are targeted:

---
- hosts: lab,datacenter1

You can also mix managed hosts, host groups, and wildcards, as shown below:

---
- hosts: 'lab,data*,192.168.2.2'

Note
The colon character (:) can be used instead of a comma. However, the comma is the preferred separator, especially when using IPv6 addresses as managed host names.

If an item in a list starts with an ampersand (&), hosts must also match that item in order to match the host pattern. It works like a logical AND.

For example, based on the example inventory, the following host pattern matches machines in the lab group only if they are also in the datacenter1 group:

---
- hosts: lab,&datacenter1

You can also use the host patterns &lab,datacenter1 or datacenter1,&lab to specify that machines in the datacenter1 group match only if they are also in the lab group.

You can exclude hosts that match a pattern from a list by putting an exclamation point (!) in front of the host pattern. It works like a logical NOT.

Based on the example inventory, the following example matches all hosts defined in the datacenter group except test2.example.com:

---
- hosts: datacenter,!test2.example.com

You can also use the pattern '!test2.example.com,datacenter' to get the same result.

The last example uses a host pattern that matches all hosts in the example inventory except the managed hosts in the datacenter1 group.

---
- hosts: all,!datacenter1
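
The list rules in this section behave like set operations, which the following Python sketch tries to make concrete. It is a simplified model under stated assumptions: plain items are combined first, then & items are intersected, then ! items are removed, and a list with no plain items starts from all. The GROUPS table and the lookup and resolve helpers are hypothetical; wildcards are left out to keep the sketch short.

```python
# Hypothetical model of the example inventory (not an Ansible API).
GROUPS = {
    "lab": {"labhost1.example.com", "labhost2.example.com"},
    "test": {"test1.example.com", "test2.example.com"},
    "datacenter1": {"labhost1.example.com", "test1.example.com"},
    "datacenter2": {"labhost2.example.com", "test2.example.com"},
    "new": {"172.16.103.129", "172.16.103.130"},
}
GROUPS["datacenter"] = GROUPS["datacenter1"] | GROUPS["datacenter2"]
GROUPS["all"] = {"web.example.com", "data.example.com"}.union(*GROUPS.values())


def lookup(name):
    """Hosts named by one list item: a group's members or a single host."""
    if name in GROUPS:
        return set(GROUPS[name])
    return {name} if name in GROUPS["all"] else set()


def resolve(pattern):
    """Resolve a comma-separated host pattern with optional & and ! items."""
    items = [item.strip() for item in pattern.split(",") if item.strip()]
    plain = [item for item in items if item[0] not in "&!"]
    required = [item[1:] for item in items if item[0] == "&"]
    excluded = [item[1:] for item in items if item[0] == "!"]

    hosts = set()
    for name in plain or ["all"]:   # union of the plain items
        hosts |= lookup(name)
    for name in required:           # logical AND
        hosts &= lookup(name)
    for name in excluded:           # logical NOT
        hosts -= lookup(name)
    return hosts


for example in ("lab,&datacenter1", "datacenter,!test2.example.com", "all,!datacenter1"):
    print(example, "->", sorted(resolve(example)))
```

Because the plain items are always combined first in this model, the order of the items does not change the result, which matches the equivalent patterns given in the text.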

2. Managing dynamic inventories

2.1 Dynamically generated inventories

The static inventory files we have used so far are easy to write and are convenient for managing small infrastructures. However, a static inventory file can be hard to keep up to date if you work with many machines or in an environment where machines come and go very quickly.

Most large IT environments have systems that keep track of which hosts are available and how they are organized. For example, there may be an external directory service maintained by a monitoring system such as Zabbix, or hosted on FreeIPA or Active Directory servers. Installation servers such as Cobbler, or management services such as Red Hat Satellite, may track deployed bare-metal systems. Similarly, cloud services such as Amazon Web Services EC2 or an OpenStack deployment, or virtual machine infrastructure based on VMware or Red Hat Virtualization, may be sources of information about the instances and virtual machines that come and go.

Ansible supports dynamic inventory scripts. These scripts retrieve current information from these kinds of sources each time Ansible runs, so the inventory is updated in real time. They are executable programs that collect information from an external source and output the inventory in JSON format.

Dynamic inventory scripts are used just like static inventory text files. The location of the inventory is specified either directly in the current ansible.cfg file or with the -i option. If the inventory file is executable, it is treated as a dynamic inventory program, and Ansible tries to run it to generate the inventory. If the file is not executable, it is treated as a static inventory.

The inventory location can be set in the ansible.cfg configuration file with the inventory parameter. By default, it is set to /etc/ansible/hosts.

2.2 Open source community scripts

The open source community has contributed a large number of existing dynamic inventory scripts to the Ansible project. They are not included in the ansible package. These scripts are available from the Ansible GitHub site (https://github.com/ansible/ansible/tree/devel/examples).

2.3 Writing dynamic inventory programs

If no dynamic inventory script exists for the directory system or infrastructure in use, you can write a custom inventory program. It can be written in any programming language, but it must return inventory information in JSON format when passed the appropriate options.

The ansible-inventory command is a useful tool for learning how to write Ansible inventories in JSON format.

To display the contents of an inventory file in JSON format, run the ansible-inventory --list command. You can use the -i option to specify the location of the inventory file to process, or just use the default inventory set by the current Ansible configuration.

The following example shows how to use the ansible-inventory command to process an INI-style inventory file and output it in JSON format.

[root@localhost ~]# cat inventory
workstation1.lab.example.com

[webservers]
web1.lab.example.com
web2.lab.example.com

[databases]
db1.lab.example.com
db2.lab.example.com


[root@localhost ~]# ansible-inventory -i inventory --list

If you want to write your own dynamic inventory script, see Developing dynamic inventory sources [https://docs.ansible.com/ansible/latest/dev_guide/developing_inventory.html] for more details. The following is a brief summary.

The script must start with an appropriate interpreter line (for example, #!/usr/bin/python) and must be executable so that Ansible can run it.

When passed the --list option, the script must print a JSON-encoded hash/dictionary of all the hosts and groups in the inventory.

In its simplest form, a group can be a list of managed hosts. In the following JSON-encoded output from an inventory script, webservers is a host group that contains the managed hosts web1.lab.example.com and web2.lab.example.com. The members of the databases host group are the hosts db1.lab.example.com and db2.lab.example.com.

[root@localhost ~]# ./inventoryscript --list
{
    "webservers": ["web1.lab.example.com","web2.lab.example.com"],
    "databases": ["db1.lab.example.com","db2.lab.example.com"]
}

Alternatively, the value of each group can be a JSON hash/dictionary containing a list of its managed hosts, any child groups, and any group variables that may be set. The next example shows the JSON-encoded output of a more complex dynamic inventory. The boston group has two child groups (backup and ipa), three managed hosts of its own, and one group variable (example_host: false).

{
    "webservers": [
        "web1.lab.example.com",
        "web2.lab.example.com"
    ],
    "boston": {
        "children": [
            "backup",
            "ipa"
        ],
        "vars": {
            "example_host": false
        },
        "hosts": [
            "server1.demo.example.com",
            "server2.demo.example.com",
            "server3.demo.example.com"
        ]
    },
    "backup": [
        "server4.demo.example.com"
    ],
    "ipa": [
        "server5.demo.example.com"
    ],
    "_meta": {
        "hostvars": {
            "server5.demo.example.com": {
                "ntpserver": "ntp.demo.example.com",
                "dnsserver": "dns.demo.example.com"
            }
        }
    }
}
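
To see how the two group forms fit together, the following Python sketch walks such a structure and collects every host of a group, including the hosts of its child groups. It is a minimal illustration, assuming a group value is either a plain list of hosts or a dictionary with optional hosts and children keys as described above; the group_hosts helper and the LISTING variable are hypothetical names, and the JSON is a trimmed copy of the example.

```python
import json

# Trimmed copy of the example --list output.
LISTING = json.loads("""
{
    "boston": {
        "children": ["backup", "ipa"],
        "vars": {"example_host": false},
        "hosts": [
            "server1.demo.example.com",
            "server2.demo.example.com",
            "server3.demo.example.com"
        ]
    },
    "backup": ["server4.demo.example.com"],
    "ipa": ["server5.demo.example.com"]
}
""")


def group_hosts(listing, group, seen=None):
    """Collect the hosts of a group and, recursively, of its child groups."""
    seen = set() if seen is None else seen
    if group in seen:          # guard against groups that reference each other
        return set()
    seen.add(group)

    entry = listing.get(group, [])
    if isinstance(entry, list):            # simple form: a list of hosts
        return set(entry)

    hosts = set(entry.get("hosts", []))    # dictionary form
    for child in entry.get("children", []):
        hosts |= group_hosts(listing, child, seen)
    return hosts


print(sorted(group_hosts(LISTING, "boston")))
```

In this sketch, boston resolves to its own three hosts plus the one host from each of backup and ipa.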

The script must also support the --host managed-host option. When called with this option, it must print a JSON hash/dictionary of the variables associated with that host, or an empty JSON hash/dictionary.

[root@localhost ~]# ./inventoryscript --host server5.demo.example.com
{
    "ntpserver": "ntp.demo.example.com",
    "dnsserver": "dns.demo.example.com"
}

Note
When called with the --host hostname option, the script must print a JSON hash/dictionary of the variables for the specified host. If it provides no variables, it may print an empty JSON hash/dictionary.

Alternatively, if the --list option returns a top-level element named _meta, all host variables can be returned in a single script call, which improves script performance. In that case, no --host calls are made.
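
The effect of _meta can be pictured from the consumer's side with a short Python sketch. It is not Ansible's code; it only models the rule stated above, that host variables come from _meta when the --list output contains it and from a per-host --host call otherwise. The host_variables function and the fake_host_call stand-in (which replaces actually running the script) are hypothetical.

```python
def host_variables(listing, host, call_host_option):
    """Return the variables for one host, preferring the _meta element."""
    meta = listing.get("_meta")
    if meta is not None and "hostvars" in meta:
        # Everything was delivered by --list; no extra script call is needed.
        return meta["hostvars"].get(host, {})
    # No _meta element: fall back to one --host call for this host.
    return call_host_option(host)


calls = []  # records every simulated --host call


def fake_host_call(host):
    """Stand-in for running `./inventoryscript --host <host>`."""
    calls.append(host)
    return {"ntpserver": "ntp.demo.example.com"}


with_meta = {
    "ipa": ["server5.demo.example.com"],
    "_meta": {
        "hostvars": {
            "server5.demo.example.com": {
                "ntpserver": "ntp.demo.example.com",
                "dnsserver": "dns.demo.example.com",
            }
        }
    },
}
without_meta = {"ipa": ["server5.demo.example.com"]}

print(host_variables(with_meta, "server5.demo.example.com", fake_host_call))
print(host_variables(without_meta, "server5.demo.example.com", fake_host_call))
print("simulated --host calls:", calls)
```

Only the lookup against the output without _meta triggers a simulated --host call, which is why returning _meta saves one script run per host.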

For more information, see Developing dynamic inventory sources [https://docs.ansible.com/ansible/latest/dev_guide/developing_inventory.html].

Here is an example script:

[root@localhost ~]# vim inventory.py
#!/usr/bin/env python

'''
Example custom dynamic inventory script for Ansible, in Python.
'''

import os
import sys
import argparse

try:
    import json
except ImportError:
    import simplejson as json

class ExampleInventory(object):

    def __init__(self):
        self.inventory = {}
        self.read_cli_args()

        # Called with `--list`.
        if self.args.list:
            self.inventory = self.example_inventory()
        # Called with `--host [hostname]`.
        elif self.args.host:
            # Not implemented, since we return _meta info `--list`.
            self.inventory = self.empty_inventory()
        # If no groups or vars are present, return empty inventory.
        else:
            self.inventory = self.empty_inventory()

        print(json.dumps(self.inventory))

    # Example inventory for testing.
    def example_inventory(self):
        return {
            'group': {
                'hosts': ['172.16.103.129', '172.16.103.130'],
                'vars': {
                    'ansible_ssh_user': 'root',
                    'ansible_ssh_pass': '123456',
                    'example_variable': 'value'
                }
            },
            '_meta': {
                'hostvars': {
                    '172.16.103.129': {
                        'host_specific_var': 'foo'
                    },
                    '172.16.103.130': {
                        'host_specific_var': 'bar'
                    }
                }
            }
        }

    # Empty inventory for testing.
    def empty_inventory(self):
        return {'_meta': {'hostvars': {}}}

    # Read the command line args passed to the script.
    def read_cli_args(self):
        parser = argparse.ArgumentParser()
        parser.add_argument('--list', action = 'store_true')
        parser.add_argument('--host', action = 'store')
        self.args = parser.parse_args()

# Get the inventory.
ExampleInventory()

How do I use this script?

chmod +x inventory.py
./inventory.py --list
./inventory.py --host 172.16.103.129

Use this dynamic inventory with ansible to manage hosts:

ansible all -i inventory.py -m ping
ansible 172.16.103.129 -i inventory.py -m ping

2.4 Managing multiple inventories

Ansible supports using multiple inventories in the same run. If the inventory location is a directory (whether set by the -i option, the value of the inventory parameter, or in some other way), all inventory files in that directory, static or dynamic, are combined to determine the inventory. Executable files in the directory are used to retrieve dynamic inventories, and the other files are used as static inventories.

Inventory files should not depend on other inventory files or scripts in order to resolve. For example, if a static inventory file specifies that a group is a child of another group, it also needs a placeholder entry for that group, even if all members of the group come from a dynamic inventory.

[cloud-east]

[servers]
test.demo.example.com

[servers:children]
cloud-east

This ensures that the inventory files are internally consistent no matter what order they are parsed in.
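
One way to think about this self-consistency rule is as a check that every child group a source refers to is also declared in that same source. The Python sketch below illustrates the idea on a hand-written dictionary form of the static file above; the missing_child_groups helper and the dictionary layout are assumptions made for this example and do not correspond to any Ansible command.

```python
def missing_child_groups(source):
    """Return child groups that a source references but never declares itself."""
    declared = set(source)
    missing = set()
    for entry in source.values():
        for child in entry.get("children", []):
            if child not in declared:
                missing.add(child)
    return missing


# The static file from the text, with the empty [cloud-east] placeholder.
with_placeholder = {
    "cloud-east": {},
    "servers": {"hosts": ["test.demo.example.com"], "children": ["cloud-east"]},
}

# The same file without the placeholder entry.
without_placeholder = {
    "servers": {"hosts": ["test.demo.example.com"], "children": ["cloud-east"]},
}

print(missing_child_groups(with_placeholder))
print(missing_child_groups(without_placeholder))
```

A source for which this check returns an empty set does not depend on another source being parsed first.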

Note
The order in which inventory files are parsed is not specified by the documentation. Currently, when multiple inventory files exist, they are parsed in alphabetical order. If one inventory source depends on information from another, the order in which they are loaded may determine whether the inventory works as expected or raises an error. Therefore, it is important to make sure that all files are self-consistent to avoid unexpected errors.

Ansible ignores files in an inventory directory that end with certain suffixes. This can be controlled with the inventory_ignore_extensions directive in the Ansible configuration file. For more information, refer to the official Ansible documentation.

Using dynamic inventory: https://docs.ansible.com/ansible/latest/user_guide/intro_dynamic_inventory.html
Developing dynamic inventory: https://docs.ansible.com/ansible/latest/dev_guide/developing_inventory.html

3. Configuring parallelism

3.1 Configuring parallelism in Ansible using forks

When Ansible processes a playbook, it runs each play in order. After determining the list of hosts for a play, Ansible runs through each task in order. Normally, all hosts must successfully complete a task before any host starts the next task in the play.

In theory, Ansible could connect to all hosts in the play simultaneously for each task. This works well for small lists of hosts. However, if the play targets hundreds of hosts, it can put a heavy load on the control node.

The maximum number of simultaneous connections that Ansible makes is controlled by the forks parameter in the Ansible configuration file. By default, it is set to 5, which you can verify in any of the following ways.

[root@localhost ansible]# vim ansible.cfg            # View profile
······
[defaults]

# some basic default values...

#inventory      = /etc/ansible/hosts
inventory      = /etc/ansible/cctv
#library        = /usr/share/my_modules/
#module_utils   = /usr/share/my_module_utils/
#remote_tmp     = ~/.ansible/tmp
#local_tmp      = ~/.ansible/tmp
#plugin_filters_cfg = /etc/ansible/plugin_filters.yml
#forks          = 5			#By default, 5 hosts are processed at a time
#poll_interval  = 15
······

[root@localhost ansible]# ansible-config dump|grep -i forks		#View directly using the command
DEFAULT_FORKS(default) = 5

[root@localhost ansible]# ansible-config list|grep -i forks
DEFAULT_FORKS:
  description: Maximum number of forks Ansible will use to execute tasks on target
  - {name: ANSIBLE_FORKS}
  - {key: forks, section: defaults}
  name: Number of task forks

For example, suppose the Ansible control node is configured with the default of 5 forks, and the play has 10 managed hosts. Ansible runs the first task in the play on the first five managed hosts, followed by a second round of the first task on the other five managed hosts. After the first task has run on all managed hosts, Ansible proceeds to the next task, again running it across the managed hosts in groups of five at a time. Ansible does this for each task in turn until the play ends.
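
The order of execution described in this example can be written out as a small Python simulation. It is a deliberately simplified model that treats each group of five hosts as a fixed round; the real scheduler may behave differently in detail. The task_rounds function and the host and task names are invented for the illustration.

```python
def task_rounds(hosts, tasks, forks=5):
    """List (task, hosts) rounds: each task visits all hosts, `forks` at a time."""
    rounds = []
    for task in tasks:                            # tasks run strictly in order
        for start in range(0, len(hosts), forks):
            rounds.append((task, hosts[start:start + forks]))
    return rounds


hosts = ["host%d" % n for n in range(1, 11)]      # 10 managed hosts
for task, batch in task_rounds(hosts, ["task1", "task2"]):
    print(task, batch)
```

With 10 hosts, two tasks, and 5 forks, the model produces four rounds: two for the first task and two for the second.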

The default value of forks is very conservative. If your control node manages Linux hosts, most tasks run on the managed hosts and put little load on the control node. In that case, you can usually set forks much higher, perhaps closer to 100, and see a performance improvement.

If your playbooks run a lot of code on the control node, raise the forks limit with care. If you use Ansible to manage network routers and switches, most modules run on the control node rather than on the network devices. Because this increases the load on the control node, it can support far fewer additional forks than a control node that manages only Linux hosts.

You can override the forks setting in the Ansible configuration file from the command line. Both the ansible and ansible-playbook commands offer the -f or --forks option to specify the number of forks to use.

3.2 Managing rolling updates

Normally, when Ansible runs a play, it makes sure that all managed hosts have completed each task before any host starts the next task. After all managed hosts have completed all tasks, any notified handlers run.

However, running all tasks on all hosts can lead to undesirable behavior. For example, if a play updates a cluster of load-balanced web servers, it may need to take each web server out of service while it is being updated. If all servers are updated in the same play, they could all be out of service at the same time.

One way to avoid this problem is to use the serial keyword to run the hosts through the play in batches. Each batch of hosts runs through the entire play before the next batch starts.

In the following example, Ansible runs the play on two managed hosts at a time until all managed hosts have been updated. Ansible begins by running the tasks in the play on the first two managed hosts. If either or both of those hosts notified the handler, Ansible runs the handler as needed for those two hosts. When the play is complete on these two managed hosts, Ansible repeats the process on the next two managed hosts, and continues in this way until all managed hosts have been updated.

---
- name: Rolling update
  hosts: webservers
  serial: 2
  tasks:
  - name: latest apache httpd package is installed
    yum:
      name: httpd
      state: latest
    notify: restart apache
    
  handlers:
  - name: restart apache
    service:
      name: httpd
      state: restarted

Suppose the webservers group in the previous example contains five web servers behind a load balancer. With the serial parameter set to 2, the play runs on up to two web servers at a time. Therefore, a majority of the five web servers is always available.

In contrast, without the serial keyword, the play and the resulting handler would run on all five web servers at the same time. This would probably cause a service outage, because the web service would restart on all web servers simultaneously.
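
The batching in this scenario can be checked with a few lines of Python. The sketch only models how five hosts are split when serial is 2 and how many servers stay in service during each batch; serial_batches and the web1 to web5 names are hypothetical.

```python
def serial_batches(hosts, serial):
    """Split the play's hosts into consecutive batches of at most `serial` hosts."""
    return [hosts[i:i + serial] for i in range(0, len(hosts), serial)]


webservers = ["web%d" % n for n in range(1, 6)]   # five web servers
batches = serial_batches(webservers, 2)

for batch in batches:
    in_service = len(webservers) - len(batch)
    print(batch, "->", in_service, "servers still in service")
```

In this model, at least three of the five servers remain in service while any batch is being updated.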

Important
For certain purposes, each batch of hosts counts as if it were a complete play running on a subset of hosts. This means that if an entire batch fails, the play fails, which causes the whole playbook run to fail.

In the previous scenario with serial: 2 set, if something goes wrong and the play fails for the first two hosts, the playbook aborts and the remaining three hosts are not run through the play. This is a useful feature because only a subset of the servers becomes unavailable, leaving the service degraded rather than down.

The serial keyword can also be specified as a percentage. The percentage is applied to the total number of hosts in the play to determine the rolling update batch size. Regardless of the percentage, the number of hosts per pass is always 1 or greater.
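
As a rough illustration of the percentage rule, the following Python sketch converts a serial value into a batch size. It assumes that a fractional result is rounded down and then raised to at least one host; the rounding direction is an assumption of this sketch, so check the Ansible documentation for the exact behaviour. The batch_size function is a hypothetical helper.

```python
def batch_size(serial, host_count):
    """Translate a serial value such as 2 or "30%" into a batch size."""
    if isinstance(serial, str) and serial.endswith("%"):
        percent = int(serial[:-1])
        # Assumption: the fraction is rounded down, but never below one host.
        return max(1, host_count * percent // 100)
    return int(serial)


for value in (2, "10%", "30%", "50%"):
    print(value, "->", batch_size(value, 5))
```

With five hosts, small percentages such as 10% would otherwise yield zero hosts, which is why the minimum of one host per pass matters.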
