Cloud and Infrastructure

Fault tolerance, high availability, cluster, failover, redundancy, ... : let's make things clearer

Definitions

CRM

Cluster Resource Manager: software running on one or all of the cluster nodes, working with a pool of resource agents.

Some CRMs specialize in dispatching work to cluster nodes (grid computing), whereas others are dedicated to high availability and load balancing.

In short, its role is to make sure everything is running well (health check, "monitor" part), and to take appropriate action ("start/stop/enable/disable" part) when needed.

cluster

set of computers that work together so that, in many respects, they can be viewed as a single system

failover

automatic switchover triggered by an incident

high availability

characteristic of a system which aims to ensure an agreed level of operational performance (aka SLA, usually uptime) for a higher than normal period

redundancy

duplication of critical components of a system with the intention of increasing reliability of the system

resource agent

software running on the cluster nodes to manage and configure devices (IP addresses, ...) and to start / stop services (daemons, ...).
These agents are standardized interfaces (i.e. abstraction layers) for a cluster resource. They are used to :

translate a standard set of operations into steps specific to the resource or application
and interpret their results as success or failure

Their duty is mostly to :

start / enable
stop / disable
monitor
validate configuration

Most resource agents are coded as shell scripts (resource agents implementation, resource-agents).
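
The duties listed above can be sketched as a small shell agent. This is only an illustration, assuming a hypothetical service tracked by a PID file (a background sleep stands in for the real daemon); the PIDFILE variable and the agent_* function names are made up for the example. The exit codes follow the OCF convention (0 = success, 7 = not running).

```shell
#!/bin/sh
# Sketch of a resource agent for a hypothetical service tracked by a PID file.
# Exit codes follow the OCF convention: 0 = success, 7 = not running.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

# Hypothetical location; a real agent would read it from its parameters.
PIDFILE="${PIDFILE:-/tmp/demo-resource.pid}"

agent_monitor() {
    # The resource counts as running when the PID file names a live process.
    if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

agent_start() {
    if agent_monitor; then return $OCF_SUCCESS; fi   # already running
    sleep 300 >/dev/null 2>&1 &                      # stand-in for the real daemon
    echo $! > "$PIDFILE"
    agent_monitor
}

agent_stop() {
    if ! agent_monitor; then return $OCF_SUCCESS; fi # already stopped: still a success
    kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE"
}

agent_validate() {
    # Minimal configuration check: the PID file directory must be writable.
    if [ -w "$(dirname "$PIDFILE")" ]; then return $OCF_SUCCESS; fi
    return $OCF_ERR_GENERIC
}

# Translate the standard operation name into the resource-specific steps.
agent_dispatch() {
    case "$1" in
        start)        agent_start ;;
        stop)         agent_stop ;;
        monitor)      agent_monitor ;;
        validate-all) agent_validate ;;
        *)            return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}

# When called with an operation name, run it (as a CRM would do).
if [ $# -gt 0 ]; then agent_dispatch "$1"; fi
```

The CRM only ever sees the operation name going in and the exit code coming out, which is what makes the agent an abstraction layer.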

switchover

manual switch from one system to a redundant (aka standby) system upon incident or for maintenance

Solutions

Name	Role	Notes
Corosync	cluster communication + membership	Often associated with Pacemaker, Corosync is the communication layer.
HAProxy	TCP + HTTP reverse proxy	For the HTTP reverse proxy functionality alone (ignoring factors such as performance, or other features such as caching), what HAProxy does could be done just as well by Apache, Lighttpd, Nginx, Varnish or Squid-cache. Even though it has "high availability" in its name, it is a standalone application that needs to be made redundant like any other resource.
Heartbeat (1, 2, 3)	cluster communication + membership	Must be associated with a CRM such as Pacemaker. Since up to release 2.1.4 the messaging layer (Heartbeat proper), the Local Resource Manager, "plumbing" infrastructure and STONITH (now known as Cluster Glue), the Resource Agents, and the CRM (now Pacemaker) were all part of a single package named Heartbeat, the name was often applied to the Linux-HA project as a whole. This generalization is no longer accurate, the name Heartbeat should thus be used for the messaging layer exclusively. (source)
Keepalived
Pacemaker	CRM	Often associated with Corosync.

http://www.linux-ha.org
http://clusterlabs.org/
https://geekpeek.net/linux-cluster-nodes/

https://www.quora.com/Which-is-a-better-Linux-ip-failover-tool-keepalived-or-heartbeat-pacemaker

http://www.formilux.org/archives/haproxy/1003/3259.html

	a cluster-oriented product such as heartbeat will ensure that a shared resource will be present at *at most* one place. This is very important for shared filesystems, disks, etc... It is designed to take a service down on one node and up on another one during a switchover. That way, the shared resource may never be concurrently accessed. This is a very hard task to accomplish and it does it well.
	a network-oriented product such as keepalived will ensure that a shared IP address will be present at *at least* one place. Please note that I'm not talking about a service or resource anymore, it just plays with IP addresses. It will not try to down or up any service, it will just consider a certain number of criteria to decide which node is the most suited to offer the service. But the service must already be up on both nodes. As such, it is very well suited for redundant routers, firewalls and proxies, but not at all for disk arrays nor filesystems.

	==> The difference is very visible in case of a dirty failure such as a split brain. A cluster-based product may very well end up with none of the nodes offering the service, to ensure that the shared resource is never corrupted by concurrent accesses. A network-oriented product may end up with the IP present on both nodes, resulting in the service being available on both of them. This is the reason why you don't want to serve file-systems from shared arrays with ucarp or keepalived.

LXC

Linux Containers (LXC) is an operating-system-level virtualization method for running multiple isolated Linux systems (i.e. containers) on a control host using a single Linux kernel. Containers offer an environment as close as possible to the one you'd get from a VM, but without the overhead that comes with running a separate kernel and simulating all the hardware.

The Linux kernel provides :

the cgroups functionality that allows limitation and prioritization of resources (CPU, memory, block I/O, network, etc.) without the need for starting any virtual machines
and also namespace isolation functionality that allows complete isolation of an application's view of the operating environment, including process trees, networking, user IDs and mounted file systems.

LXC combines both to provide an isolated environment for applications.
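
Both kernel features can be observed from any shell, without LXC. The sketch below is Linux-only and assumes /proc is mounted; the show_isolation function name is made up for the example.

```shell
#!/bin/sh
# Print the namespaces and cgroup membership of a process (default: this shell).
# Linux-only: relies on /proc/<pid>/ns and /proc/<pid>/cgroup.
show_isolation() {
    pid="${1:-$$}"
    echo "namespaces of PID $pid:"
    for ns in /proc/"$pid"/ns/*; do
        # Each entry is a symlink such as "pid:[4026531836]"; two processes
        # share a namespace when the bracketed identifiers are equal.
        if [ -L "$ns" ]; then
            printf '  %s\n' "$(readlink "$ns")"
        fi
    done
    echo "cgroups of PID $pid:"
    if [ -r "/proc/$pid/cgroup" ]; then
        sed 's/^/  /' "/proc/$pid/cgroup"
    fi
}
```

Running show_isolation on the host and then again inside a container should show different namespace identifiers for the namespaces the container does not share with the host.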

Early versions of Docker used LXC as the container execution driver, though LXC was made optional in v0.9 and support was dropped in Docker v1.10 (2016-02-04).

Setup

apt install lxc

And that's it !

Create a new container


lxc-create --template download --name myContainer



https://wiki.debian.org/LXC#Changes_between_.22Jessie.22_and_.22Stretch.22
echo 'USE_LXC_BRIDGE="true"' >> /etc/default/lxc-net



systemctl status lxc
● lxc.service - LXC Container Initialization and Autoboot Code
   Loaded: loaded (/lib/systemd/system/lxc.service; enabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:lxc-autostart
           man:lxc

systemctl start lxc


lxc-checkconfig
	Everything should be reported as "enabled" (in green). If not, try rebooting the system.


https://wiki.debian.org/LXC#line-1-6
lxc-create --name ubuntu -t download
Setting up the GPG keyring


https://superuser.com/questions/399938/how-to-create-additional-gpg-keyring
gpg --no-default-keyring --keyring trustedkeys.gpg --fingerprint


https://linuxcontainers.org/lxc/manpages/man1/lxc-create.1.html
lxc-create --name ubuntu -t download -B best


Setting up the GPG keyring
ERROR: Unable to fetch GPG key from keyserver.


lxc-create --name ubuntu -t download -B best --logpriority=DEBUG


https://bugs.launchpad.net/openstack-ansible/+bug/1609479
https://review.openstack.org/#/c/350684/3/defaults/main.yml
# The DNS name of the LXD server to source the base container cache from
lxc_image_cache_server: images.linuxcontainers.org

# The keyservers to use when validating GPG keys for the downloaded cache
lxc_image_cache_primary_keyserver: hkp://p80.pool.sks-keyservers.net:80
lxc_image_cache_secondary_keyserver: hkp://keyserver.ubuntu.com:80

# wget does not support the hkp:// scheme; HKP is HTTP underneath, so test with http://
wget http://keyserver.ubuntu.com:80


https://doc.ubuntu-fr.org/apt-key
https://cran.r-project.org/bin/linux/ubuntu/

gpg --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys E084DAB9


nc -vz p80.pool.sks-keyservers.net 80 11371
nc -vz keyserver.ubuntu.com 80 11371



nmap -sT www.google.com -p 80,443
nmap -sT keyserver.ubuntu.com -p 80,11371

As root :

lxc-create --template download --name myContainer -- -d debian -r stretch -a amd64

Manage containers

list containers :: lxc-ls -f
destroy a container :: lxc-destroy --name myContainer

Start / stop a container

TO DO
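
A minimal sketch of a start / inspect / stop cycle, assuming the container is named myContainer as above and that the lxc-start, lxc-info and lxc-stop tools shipped with the lxc package are used. The run helper, the container_cycle function and the DRY_RUN variable are illustrative additions, not part of LXC.

```shell
#!/bin/sh
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

container_cycle() {
    name="$1"
    run lxc-start --name "$name"    # start the container
    run lxc-info  --name "$name"    # show its state and PID
    run lxc-stop  --name "$name"    # shut it down
}
```

With DRY_RUN=1 set, container_cycle myContainer only prints the three commands, which is handy for checking them before running them as root.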

How to register a Debian host into Active Directory ?

apt-get install adcli libnss-sss libpam-sss realmd samba-common-bin sssd sssd-tools
systemctl enable sssd

realm discover myDomain.local

myDomain.local
	type: kerberos
	realm-name: MYDOMAIN.LOCAL
	domain-name: myDomain.local
	configured: no
	server-software: active-directory
	client-software: sssd
	required-package: sssd-tools
	required-package: sssd
	required-package: libnss-sss
	required-package: libpam-sss
	required-package: adcli
	required-package: samba-common-bin

realm join --user=domainAdmin myDomain.local
systemctl start sssd

check :

getent passwd user@myDomain.local

t_anderson@metacortex.com:*:919801223:919800513:Thomas ANDERSON:/home/t_anderson:/bin/bash

If you get an answer like this one, domain users are resolved correctly.

Enable creation of a home directory for domain users (with the specified skeleton) :
echo 'session required pam_mkhomedir.so skel=/etc/skel/ umask=0022' | sudo tee -a /etc/pam.d/common-session
Let domain administrators become local admins (i.e. sudoers); adapt the domain name below :
1. aptitude install libsss-sudo
2. echo '%domain\ admins@myDomain.local ALL=(ALL) ALL' | sudo tee -a /etc/sudoers.d/domain_admins
To log in as user instead of user@myDomain.local, set in /etc/sssd/sssd.conf :
```
use_fully_qualified_names = False
```

You'll also need in /etc/sssd/sssd.conf :

simple_allow_groups = list,of,AD,groups,allowed,to,login

systemctl restart sssd
update DNS records accordingly

How to mount Google Drive on Debian GNU/Linux ?

Setup :

As root :

apt-get install opam ocaml make fuse camlp4-extra build-essential pkg-config
groupadd fuse
adduser bob fuse
chown :fuse /dev/fuse; chmod 660 /dev/fuse

As Bob :

just in case, back up ~/.profile
opam init
opam update
opam install depext
opam depext google-drive-ocamlfuse
opam install google-drive-ocamlfuse
. $HOME/.opam/opam-init/init.sh
google-drive-ocamlfuse
This will open a browser window on the Google Drive page, asking for credentials + allowing google-drive-ocamlfuse to access files.

Mount :

mountPoint='/home/bob/googleDrive'; mkdir -p "$mountPoint"; google-drive-ocamlfuse "$mountPoint"

The first time, this will open a browser window to check permissions.

Check : mount | grep -q "$mountPoint" && echo 'GOOGLE DRIVE IS MOUNTED' || echo 'GOOGLE DRIVE NOT MOUNTED'

Unmount :

sudo umount "$mountPoint"
kill -1 $(pidof google-drive-ocamlfuse)

Load balancing

Galera cluster load balancing
	http://galeracluster.com/documentation-webpages/loadbalancing.html

Cluster deployment variants
	http://galeracluster.com/documentation-webpages/deploymentvariants.html

An Introduction to HAProxy and Load Balancing Concepts
	https://www.digitalocean.com/community/tutorials/an-introduction-to-haproxy-and-load-balancing-concepts

Load balancing types (source) :

Layer 4 load balancing :: Relying on the transport layer, this forwards a request to a backend server based on the destination IP+port of the request. All backend servers are expected to be able to return identical responses to any given request.
Layer 7 load balancing :: Relying on the application layer, this forwards a request to a backend server based on the contents of the request itself (e.g. the requested URL). In this mode, backend servers need not be clones of each other since they may serve different contents.
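
The difference between the two types boils down to what the routing decision can look at. A toy sketch, where the pool and backend names are hypothetical:

```shell
#!/bin/sh
# Layer 4: only the destination IP+port is known, so every backend in the
# pool must be able to answer any request arriving on that port.
route_l4() {
    case "$1" in              # $1 = destination port
        80|443) echo "web-pool" ;;
        3306)   echo "db-pool" ;;
        *)      echo "reject" ;;
    esac
}

# Layer 7: the request itself is inspected, so different URLs can be sent
# to backends serving different contents.
route_l7() {
    case "$1" in              # $1 = requested URL path
        /blog/*)   echo "blog-backend" ;;
        /static/*) echo "static-backend" ;;
        *)         echo "default-backend" ;;
    esac
}
```

For instance, route_l4 443 always answers web-pool whatever the URL, whereas route_l7 /blog/post-1 and route_l7 /static/logo.png end up on different backends.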

Algorithms (source) :

roundrobin: select servers in turns (default in HAProxy)
leastconn: select the server with the least number of connections (recommended for longer sessions)
source: select which server to use based on a hash of the source IP address. This ensures that a user will connect to the same server.
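
The three algorithms above can be sketched in a few lines of shell. This is a toy model and not HAProxy's implementation: the server names are hypothetical, each function stores its choice in the PICK variable, and the source hash uses the CRC printed by cksum as a stand-in for whatever hash the load balancer really uses.

```shell
#!/bin/sh
SERVERS="web1 web2 web3"      # hypothetical backend names
RR_NEXT=0                     # index of the next server for round robin

# roundrobin: hand out servers in turn, wrapping around at the end.
pick_roundrobin() {
    set -- $SERVERS
    shift "$RR_NEXT"
    PICK="$1"
    set -- $SERVERS
    RR_NEXT=$(( (RR_NEXT + 1) % $# ))
}

# leastconn: $1 is a list of "name:open_connections" pairs.
pick_leastconn() {
    best=""; best_count=""
    for entry in $1; do
        name=${entry%%:*}; count=${entry##*:}
        if [ -z "$best" ] || [ "$count" -lt "$best_count" ]; then
            best=$name; best_count=$count
        fi
    done
    PICK=$best
}

# source: hash the client IP so that a given client always gets the same server.
pick_source() {
    hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    set -- $SERVERS
    shift $(( hash % $# ))
    PICK="$1"
}
```

Calling pick_roundrobin four times in a row yields web1, web2, web3, then web1 again; pick_leastconn "web1:12 web2:3 web3:7" yields web2; pick_source with the same IP address always yields the same server as long as the server list does not change.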

High availability of the load balancer itself (source, details) :

The load balancer itself must not become the single point of failure (SPOF) of the infrastructure. This is why it has to be made highly available via redundancy :

a floating IP is assigned to the 1st load balancer, which performs normally
a 2nd (passive) load balancer is added to the infrastructure
health check is performed on both load balancers
should the active load balancer fail, the floating IP is transferred to the spare load balancer, which becomes active
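
The decision logic of the steps above can be sketched as follows. It is a toy model: the node names lb1 / lb2 and the LB1_UP / LB2_UP variables are hypothetical stand-ins for a real health check, and actually moving the IP address is left out.

```shell
#!/bin/sh
# is_healthy NODE: stand-in for a real health check (e.g. an HTTP probe).
# The state is read from the hypothetical variables LB1_UP / LB2_UP (1 = up).
is_healthy() {
    case "$1" in
        lb1) [ "${LB1_UP:-1}" = 1 ] ;;
        lb2) [ "${LB2_UP:-1}" = 1 ] ;;
        *)   return 1 ;;
    esac
}

# floating_ip_holder CURRENT: print the node that should hold the floating IP.
floating_ip_holder() {
    current="$1"
    if is_healthy "$current"; then
        echo "$current"                 # active node is fine: no change
        return 0
    fi
    for candidate in lb1 lb2; do
        if [ "$candidate" != "$current" ] && is_healthy "$candidate"; then
            echo "$candidate"           # failover to the healthy spare
            return 0
        fi
    done
    echo "none"                         # both down: service unavailable
    return 1
}
```

With both nodes up, floating_ip_holder lb1 prints lb1; after setting LB1_UP=0 it prints lb2, which is the failover case.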

This can be achieved with tools such as :
