Creating DE environment 1 - general overview and Docker swarm
Introduction
I have been a rather inactive blogger recently. I was busy working, learning new things, and migrating my server to a new cloud provider. I have a few ideas for blog projects, but most of them require more computational power to work properly. I could simply scale my VPS up vertically, but scaling out horizontally is more elegant and interesting. Distributed systems and parallel processing are essential in the data engineering field nowadays, so it is good to have some practical knowledge about them. I am planning a series of short posts, covering:
- General overview and setting Docker swarm
- Setting the Caddy web server and Authelia
- Setting the Dask cluster
- Setting the MongoDB, PostgresDB and DbGate
- Setting the Prefect orchestration tool
The clean infrastructure will also be crucial for my next projects and posts, and I hope it will let me deploy things much faster in the future. I would also like to point out that these posts are not tutorials. To be honest, I will rely on many tutorials and documentation from the web (of course, I will add them to the references as always); this series is something like a diary for me. The two main tools I will use are Ansible, an open-source IT automation tool, and Docker, an open-source containerization platform. My whole infrastructure will be hosted on OVH. I chose Dask as my parallel computing library because it is more lightweight than alternatives such as Spark, which really matters on my cheap instances.
The goal is to build an architecture with three nodes:
- The data-master-node - the swarm manager node for the Dask cluster. It will host containers with all the supporting tools such as databases, schedulers, and brokers, plus the Caddy web server with the authentication and authorization service Authelia, and it will act as the master node (scheduler) for Dask. Everything except the Dask services will be deployed as standalone containers.
- The data-worker-node-1 and data-worker-node-2 - plain Dask workers, responsible for the computations only.
I chose Docker swarm as the deployment tool for my Dask cluster because it is relatively easy to set up. My infrastructure will also be humble, so using something like Kubernetes would be overkill.
Let’s get to work!
Initial configuration
Here are a few basic steps I perform on every machine. The list contains:
- installing aptitude - a package manager with several useful features; it is optional, but aptitude makes working with Ansible a bit easier,
- setting up passwordless sudo,
- creating a new user with sudo privileges - avoiding extensive use of the root user is good practice,
- setting an alternate SSH port - also a good safety practice; you can avoid some automated attack attempts simply by doing this,
- adding the SSH key setup - using SSH keys instead of passwords is good practice too,
- disabling SSH access for the root user - also for safety reasons,
- installing required packages - I need only ufw to open just the necessary ports, Python 3 with pip to install docker-compose (Ansible requires the pip version for some reason), and jsondiff, a dependency of the docker_stack module,
- configuring the firewall.
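For reference, after the initial configuration the two relevant lines of /etc/ssh/sshd_config should end up as follows (port 1212 is just my arbitrary choice):

```text
Port 1212
PermitRootLogin no
```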
The first thing to do is to specify the inventory in the hosts file. As you may have noticed, the SSH port of the machines will change during the initial configuration. That is why I usually create a group for_initial_config with all machines on port 22. This group is used just once. Then I create separate groups for different use cases, with the alternative SSH port for every machine. The hosts file should look like this:
[for_initial_config]
data-master-node-1_init ansible_port=22 ansible_host=<ip>
data-worker-node-1_init ansible_port=22 ansible_host=<ip>
data-worker-node-2_init ansible_port=22 ansible_host=<ip>
[data_worker_nodes]
data-worker-node-1 ansible_port=1212 ansible_host=<ip>
data-worker-node-2 ansible_port=1212 ansible_host=<ip>
[data_master_nodes]
data-master-node-1 ansible_port=1212 ansible_host=<ip>
Now let’s create an initial_config.yaml playbook.
# initial_config.yaml
---
- hosts: for_initial_config
  become: true
  vars:
    created_username: charizard
  tasks:
    - name: Install aptitude
      apt:
        name: aptitude
        state: latest
        update_cache: true

    - name: Setup passwordless sudo
      lineinfile:
        path: /etc/sudoers
        state: present
        regexp: '^%sudo'
        line: '%sudo ALL=(ALL) NOPASSWD: ALL'
        validate: '/usr/sbin/visudo -cf %s'

    - name: Create a new regular user with sudo privileges
      user:
        name: "{{ created_username }}"
        state: present
        groups: sudo
        append: true
        create_home: true

    - name: Setup alternate SSH port
      lineinfile:
        dest: "/etc/ssh/sshd_config"
        regexp: "^Port"
        line: "Port 1212"
      notify: "Restart sshd"

    - name: Set authorized key for remote user
      ansible.posix.authorized_key:
        user: "{{ created_username }}"
        state: present
        key: "{{ lookup('file', lookup('env','HOME') + '/.ssh/id_rsa.pub') }}"

    - name: Disable ssh access for root user
      lineinfile:
        path: /etc/ssh/sshd_config
        state: present
        regexp: '^#?PermitRootLogin'
        line: 'PermitRootLogin no'

    - name: Update apt and install required system packages
      apt:
        pkg:
          - ufw
          - python3
          - python3-pip
          - python3-jsondiff
        state: latest
        update_cache: true

    - name: UFW - Allow SSH connections
      community.general.ufw:
        rule: allow
        port: '1212'
        proto: tcp

    - name: UFW - Allow all access to tcp port 80
      community.general.ufw:
        rule: allow
        port: '80'
        proto: tcp

    - name: UFW - Allow all access to tcp port 443
      community.general.ufw:
        rule: allow
        port: '443'
        proto: tcp

    - name: UFW - Allow all access to tcp port 2377
      community.general.ufw:
        rule: allow
        port: '2377'
        proto: tcp

    - name: UFW - Allow all access to tcp port 7946
      community.general.ufw:
        rule: allow
        port: '7946'
        proto: tcp

    - name: UFW - Allow all access to udp port 7946
      community.general.ufw:
        rule: allow
        port: '7946'
        proto: udp

    - name: UFW - Allow all access to udp port 4789
      community.general.ufw:
        rule: allow
        port: '4789'
        proto: udp

    - name: UFW - Enable and deny by default
      community.general.ufw:
        state: enabled
        default: deny

  handlers:
    - name: Restart sshd
      service:
        name: sshd
        state: restarted
As you can see, the tasks are pretty straightforward. All the opened ports except HTTP/HTTPS and SSH are required by Docker swarm mode. The playbook can be executed with the following commands:
ansible-playbook initial_config.yaml -l data-master-node-1_init -u root -k
ansible-playbook initial_config.yaml -l data-worker-node-1_init -u root -k
ansible-playbook initial_config.yaml -l data-worker-node-2_init -u root -k
As you can see, I run the playbook separately for every host. Ansible is usually used with passwordless sudo, but this is the initial config in which I set that feature up. The -k flag is quite important for the first run: it allows the user to log in with the SSH password. Running it against multiple hosts at once is not that easy, because there is just one password prompt. It could also be done by specifying the passwords in the inventory file with the ansible_ssh_pass variable or by using Ansible Vault. I have just 3 instances, so I run it manually, which is the safest way in my opinion.
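A side note on the two lineinfile SSH tasks: lineinfile replaces the first line matching the regexp, or appends the line when nothing matches. On Debian the shipped default is the commented-out #Port 22, so the port task appends rather than replaces. A rough shell sketch of that behavior, run against a scratch copy instead of the real config (this is a simplification, not the actual module logic):

```shell
# Simulate the sshd_config edits on a scratch file.
cfg=$(mktemp)
printf '%s\n' '#Port 22' 'PermitRootLogin yes' > "$cfg"

# regexp '^Port' does not match the commented-out '#Port 22',
# so lineinfile appends 'Port 1212' at the end of the file:
grep -q '^Port ' "$cfg" || echo 'Port 1212' >> "$cfg"

# regexp '^#?PermitRootLogin' matches both commented and active forms,
# so that line is rewritten in place:
sed -i 's/^#\{0,1\}PermitRootLogin.*/PermitRootLogin no/' "$cfg"

cat "$cfg"
```

The final file keeps the untouched `#Port 22` comment, gains `Port 1212` at the bottom, and has `PermitRootLogin no` in place of the old value.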
Installing Docker
Installing Docker with Ansible is just like following the Docker installation guide. The one thing to note is installing docker-compose with pip - a somewhat strange Ansible requirement (its docker_compose modules use the Python package). I also create a Docker directory with read, write, and execute permissions for my user, just to keep all services' Docker files there.
# docker_install.yaml
---
- name: Install Docker
  hosts: data_worker_nodes:data_master_nodes
  become: true
  tasks:
    - name: Install prerequisite packages
      ansible.builtin.apt:
        name:
          - apt-transport-https
          - ca-certificates
          - lsb-release
          - gnupg
        state: latest
        update_cache: true

    - name: Add signing key
      ansible.builtin.apt_key:
        url: "https://download.docker.com/linux/debian/gpg"
        state: present

    - name: Add repository into sources list
      ansible.builtin.apt_repository:
        repo: "deb https://download.docker.com/linux/debian bullseye stable"
        state: present
        filename: docker

    - name: Install Docker
      ansible.builtin.apt:
        name:
          - docker-ce
          - docker-ce-cli
          - containerd.io
        state: latest
        update_cache: true

    - name: Install python docker-compose SDK for Ansible
      ansible.builtin.pip:
        name: docker-compose

    - name: Make sure Docker is active
      service:
        name: docker
        state: started
        enabled: yes

    - name: Add remote user to docker group
      user:
        name: charizard
        groups: "docker"
        append: yes

    - name: Create Docker directory
      file:
        path: /docker
        state: directory
        owner: charizard
        mode: '0766'
Let’s execute this playbook with the following command:
ansible-playbook docker_install.yaml -u charizard
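One design note: state: latest upgrades Docker on every playbook run. If reproducibility matters more than freshness, the packages could be pinned instead - a hypothetical sketch (the version strings are illustrative examples, not tested values):

```yaml
# pin Docker instead of always upgrading (version strings are illustrative)
- name: Install Docker (pinned)
  ansible.builtin.apt:
    name:
      - docker-ce=5:20.10.*      # hypothetical version pin
      - docker-ce-cli=5:20.10.*  # hypothetical version pin
      - containerd.io
    state: present
```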
Creating the swarm
Docker swarm is a container orchestration tool. It helps with managing multiple containers deployed across multiple host machines. All of its features are described in the Docker documentation. Creating the swarm manually is not such a hard task, but doing it with Ansible is more convenient, especially when there are a lot of nodes. Luckily, Ansible has modules - docker_swarm and docker_node - that make the whole process fast and easy.
The first two tasks initiate the swarm and set the join_token_worker variable. The token is required to join other nodes to the swarm. These tasks will be executed on the master node machine, which will be the swarm manager.
# swarm_init.yaml
---
- name: Init a new swarm with default parameters
  docker_swarm:
    state: present
  register: init_swarm

- name: Set fact - join token worker
  set_fact:
    join_token_worker: "{{ init_swarm.swarm_facts.JoinTokens.Worker }}"
The second thing to do is to add the worker nodes. This task will run on the worker machines, but it requires two external variables from the master host:
- the token we obtained previously,
- the manager host's IP address.
# swarm_join_workers.yaml
---
- name: Add nodes
  docker_swarm:
    state: join
    join_token: "{{ hostvars[groups['data_master_nodes'][0]].join_token_worker }}"
    remote_addrs: "{{ hostvars[groups['data_master_nodes'][0]].ansible_host }}:2377"
The last thing is labeling the nodes, just to make working with docker-compose easier in the future. This task needs to be executed on the swarm manager machine.
# swarm_label_nodes.yaml
---
- name: Give a label for a master
  docker_node:
    hostname: "{{ ansible_hostname }}"
    labels:
      role: master
    labels_state: merge

- name: Give a label for a data worker 1
  docker_node:
    hostname: "{{ hostvars[groups['data_worker_nodes'][0]].ansible_hostname }}"
    labels:
      role: data-worker
    labels_state: merge

- name: Give a label for a data worker 2
  docker_node:
    hostname: "{{ hostvars[groups['data_worker_nodes'][1]].ansible_hostname }}"
    labels:
      role: data-worker
    labels_state: merge
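To show why the labels are worth the effort: in a later stack file, a service can be pinned to the labeled nodes with a placement constraint. A minimal sketch (the service and image names here are just placeholders, not part of my actual setup):

```yaml
# excerpt from a hypothetical stack file
version: "3.8"
services:
  dask-worker:
    image: daskdev/dask   # placeholder image for illustration
    deploy:
      placement:
        constraints:
          - node.labels.role == data-worker
```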
All the tasks can be executed from a single playbook:
# swarm_deploy.yaml
---
- hosts: data_master_nodes
  become: true
  tasks:
    - include_tasks: swarm_init.yaml

- hosts: data_worker_nodes
  become: true
  tasks:
    - include_tasks: swarm_join_workers.yaml

- hosts: data_master_nodes
  become: true
  tasks:
    - include_tasks: swarm_label_nodes.yaml
Just run this command:
ansible-playbook swarm_deploy.yaml -u charizard
The swarm is running now. It can be checked by running the following command on the master node:
docker node ls
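With all three nodes joined, the listing should look roughly like this (the IDs are illustrative, and the `*` marks the node you run the command from):

```text
ID                            HOSTNAME             STATUS    AVAILABILITY   MANAGER STATUS
abc123def456ghi789jkl012mno * data-master-node-1   Ready     Active         Leader
pqr345stu678vwx901yza234bcd   data-worker-node-1   Ready     Active
efg567hij890klm123nop456qrs   data-worker-node-2   Ready     Active
```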
Docker and UFW issue
Docker bypassing UFW rules has been a well-known problem for years. Every port published by Docker can be accessed from the outside, no matter whether you block it in UFW. The problem is serious: it defeats the whole purpose of UFW. Luckily, there is a not-so-elegant solution that requires modifying the UFW after.rules file. I won't do this manually, because there is already an awesome repo, chaifeng/ufw-docker, with a helpful utility script that also supports swarm mode. Let's install it with Ansible:
# ufw_docker.yaml
---
- hosts: data_master_nodes
  become: true
  tasks:
    - name: Download ufw-docker script
      ansible.builtin.get_url:
        url: https://github.com/chaifeng/ufw-docker/raw/master/ufw-docker
        dest: /usr/local/bin/ufw-docker
        mode: 'u+x'

    - name: Ufw-docker install
      ansible.builtin.command: ufw-docker install

    - name: Reload ufw
      community.general.ufw:
        state: reloaded
It only needs to be installed on the manager node.
ansible-playbook ufw_docker.yaml -u charizard
Now the easier firewall tasks can be accomplished with ufw-docker commands, and the more complicated ones with ufw route allow rules.
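For example, exposing a container's port later can look like this (the container and service names are placeholders; check ufw-docker help for the exact syntax of your version):

```text
# allow external access to port 443 of a standalone container named "caddy"
ufw-docker allow caddy 443/tcp

# swarm services have a dedicated subcommand
ufw-docker service allow caddy 443/tcp

# plain ufw route rules cover the more complicated cases
ufw route allow proto tcp from any to any port 443
```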
Conclusion
I think there are a lot of ways to build a simple Docker swarm architecture. Mine is not the most professional one, but it is easy to understand and will serve its purpose. Working with Ansible has been a pleasure. It is an impressive tool, and I think I will use it a lot in the future. Cya in the next post of the series!
References
- https://www.digitalocean.com/community/tutorials/how-to-use-ansible-to-automate-initial-server-setup-on-ubuntu-20-04
- https://towardsdatascience.com/diy-apache-spark-docker-bb4f11c10d24
- https://www.seelk.co/blog/docker-swarm-on-aws-with-ansible/
- https://www.howtogeek.com/devops/how-to-use-docker-with-a-ufw-firewall/
- https://blog.neuvector.com/article/docker-swarm-container-networking