Move etcd - In-place AWS Instance
Overview
This procedure is required for all clusters, including remote clusters, and is run on data nodes only.
The Kubernetes etcd database must be on a separate volume so that it has access to all the resources required for optimal TOS performance, stability, and minimal latency.
This procedure must be performed by an experienced Linux administrator with knowledge of network configuration.
Preliminary Preparations
- Run the following command:
  If the output contains /var/lib/rancher/k3s/server/db, etcd is already on a separate volume, and you do not need to perform this procedure.
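  The check command itself is not reproduced above; on a typical Linux node, an equivalent check (a sketch, not necessarily the documented command) is:

    # Look for the etcd path in the mounted filesystems; output containing
    # /var/lib/rancher/k3s/server/db means etcd is already on a separate volume.
    df -h | grep "/var/lib/rancher/k3s/server/db"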
- If you are going to perform this procedure over multiple maintenance periods, create a new backup each time.
- Create the backup using tos backup create:
  Example output:
    [<ADMIN> ~]$ sudo tos backup create
    [Aug 23 16:18:42] INFO Running backup
    Backup status can be monitored with "tos backup status"
- You can check the backup creation status using tos backup status, which shows the status of backups in progress. Wait until completion before continuing.
  Example output:
    [<ADMIN> ~]$ sudo tos backup status
    Found active backup "23-august-2021-16-18"
- Run the following command to display the list of backups saved on the node:
  Example output:
    [<ADMIN> ~]$ sudo tos backup list
    ["23-august-2021-16-18"]
    Started: "2021-08-23 13:18:43 +0000 UTC"
    Completed: "N/A"
    Modules: "ST, SC"
    HA mode: "false"
    TOS release: "21.2 (PGA.0.0) Final"
    TOS build: "21.2.2100-210722164631509"
    Expiration Date: "2021-09-22 13:18:43 +0000 UTC"
    Status: "Completed"
- Check that your backup file appears in the list, and that the status is "Completed".
- Run the following command to export the backup to a file:
  The command creates a single backup file.
  Example output:
    [<ADMIN> ~]$ sudo tos backup export
    [Aug 23 16:33:42] INFO Preparing target dir /opt/tufin/backups
    [Aug 23 16:33:42] INFO Compressing...
    [Aug 23 16:33:48] INFO Backup exported file: /opt/tufin/backups/backup-21-2-pga.0.0-final-20210823163342.tar.gzip
    [Aug 23 16:33:48] INFO Backup export has completed
- If your backup files are saved locally:
  - Run sudo tos backup export to save your backup file from a TOS backup directory as a single .gzip file. If there are other backups present, they will be included as well.
  - Transfer the exported .gzip file to a safe, remote location. Make sure you have the location of your backups safely documented and accessible, including the credentials needed to access them, for recovery when needed.
- After the backup is exported, we recommend verifying that the file contents can be viewed by running the following command:
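  The verification command itself is not reproduced above. Assuming the exported archive is a gzip-compressed tar file, as the .tar.gzip name in the example output suggests, a minimal sketch of such a check is:

    # List the archive contents without extracting; use the path reported by "tos backup export".
    tar -tzf /opt/tufin/backups/backup-21-2-pga.0.0-final-20210823163342.tar.gzip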
- Switch to the root user.
- Install the rsync RPM.
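  On a typical RHEL/CentOS-based data node, these two steps could look like the following sketch (the session and package-manager commands are generic Linux assumptions, not TOS-specific commands):

    sudo su -              # switch to the root user
    yum install -y rsync   # install the rsync RPM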
Mount The etcd Database to a Separate Volume
- Shut down TOS.
  - Shut down TOS, and wait for the following message:
    Deployment has been stopped successfully
  - Stop the k3s service.
  - Disable the k3s service.
  - Verify that the k3s service is stopped and disabled.
    The output should return inactive.
    The output should return disabled.
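  The k3s commands are not reproduced above. Assuming k3s runs as the standard k3s systemd service, a sketch of these sub-steps is:

    systemctl stop k3s        # stop the k3s service
    systemctl disable k3s     # disable the k3s service
    systemctl is-active k3s   # expected output: inactive
    systemctl is-enabled k3s  # expected output: disabled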
- Create a backup directory.
  - Create an etcd backup directory with a timestamp on the /opt partition.
  - Identify the path of the etcd backup directory.
  - Verify that the etcd backup directory is assigned to the ETCD_BACKUP_DIR variable.
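  A minimal sketch of these sub-steps, assuming a timestamped directory name under /opt (the exact naming convention is an assumption):

    ETCD_BACKUP_DIR="/opt/etcd-backup-$(date +%Y%m%d-%H%M%S)"   # timestamped backup directory on the /opt partition
    mkdir -p "${ETCD_BACKUP_DIR}"                               # create the directory
    echo "${ETCD_BACKUP_DIR}"                                   # identify and verify the path assigned to ETCD_BACKUP_DIR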
- Locate the etcd database.
  The purpose of this step is to identify whether the etcd database is located in the k3s directory, or whether, due to older architecture, the etcd database is located in the gravity directory.
  1. Check if there is a link to the etcd database.
     [<ADMIN> ~]# test -L /var/lib/rancher/k3s/server/db/etcd && echo "Etcd link exists."
     If the output is empty, no link exists. Proceed to step 2.
     If the output returns Etcd link exists, this indicates that the etcd database is in the gravity directory. Proceed to step 3.
  2. Check if the database is in the k3s directory.
     [<ADMIN> ~]# test -d /var/lib/rancher/k3s/server/db/etcd || echo "Etcd directory does not exist."
     If the output is empty, the etcd database is in the k3s directory. Do the following:
     - Assign the path of the k3s directory to the ETCD_ROOT_DIR variable (see the sketch after this step).
     - Proceed to back up the etcd database.
     If the output returns Etcd directory does not exist, this indicates that the etcd database could not be found. Stop the procedure and contact customer support.
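     A minimal sketch of the variable assignment, assuming the same db directory path that is used later in the restore step:

       # k3s etcd database location (assumption based on the restore step in this procedure)
       ETCD_ROOT_DIR="/var/lib/rancher/k3s/server/db"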
  3. Check if there is a link to the etcd database in the gravity directory.
     If the output is empty, this indicates that the etcd database is in the gravity directory. Do the following:
     - Assign the path of the gravity directory to the ETCD_ROOT_DIR variable:
     - Proceed to back up the etcd database.
     If the output returns Etcd directory does not exist, this indicates that the etcd database could not be found. Stop the procedure and contact customer support.
- Back up the etcd database.
  - Back up the etcd database to the backup directory you created.
    [<ADMIN> ~]# rsync -avP ${ETCD_ROOT_DIR}/ ${ETCD_BACKUP_DIR}/ && echo -e "\nOK\n" || echo -e "\nFail\n"
    The output should return OK.
  - If it exists, remove the etcd symbolic link.
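    A sketch of the symbolic link removal, using the link path checked earlier in this procedure:

      # Remove the etcd symbolic link only if it is present.
      if [ -L /var/lib/rancher/k3s/server/db/etcd ]; then
          rm /var/lib/rancher/k3s/server/db/etcd
      fi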
- Add a volume to the AWS instance.
  - In the navigation pane of the AWS console, go to the Volumes pane and click Create Volume.
  - Configure the following settings:
    - Volume type: SSD gp3
    - Size: Allocate a disk size of at least 128 GB
    - IOPS: 7500
    - Availability Zone: Same availability zone as the instance
    - Throughput: 250 MBps
    - Snapshot ID: Keep the default value
    - Encryption: If the volume is encrypted, it can only be attached to an instance type that supports Amazon EBS encryption.
  - Click Create Volume.
  - In the navigation pane, go to the Volumes pane and click Actions > Attach Volume.
  - In Instance, enter the ID of the instance or select the instance from the list of options.
  - In Device name, select an available device name from the Recommended for data volumes section of the list.
    The device name is used by Amazon EC2. The block device driver for the instance might assign a different device name when mounting the volume.
  - Connect to the instance and proceed to the next step.
- Mount the new volume.
  - Log in to the data node as the root user.
  - Verify that the new volume is recognized by the OS.
    Compare the output with the name of the volume you created in the previous step, and verify that the index number in the volume name is the latest number. In most cases this will be 1. For example, if this is the only volume, the output will be nvme1n1. If an older volume was previously created, the new volume will be nvme2n1.
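    The listing command is not reproduced above; a common way to list the block devices (a generic Linux sketch) is:

      lsblk   # the new EBS volume should appear, for example, as nvme1n1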
  - Create a variable with the block device path of the new volume,
    where <> represents the index number of the new volume.
  - Generate a UUID for the block device of the new volume.
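    A sketch of these two sub-steps; the device path and variable names are assumptions based on the nvme naming shown above:

      DEVICE="/dev/nvme<N>n1"   # replace <N> with the index number of the new volume, for example 1
      UUID=$(uuidgen)           # generate a UUID to use when formatting the partition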
  - Create a primary partition on the new volume.
  - Verify that the partition was created.
  - Format the partition as ext4.
  - Verify that the partition has been formatted with the UUID and the etcd label (the output should return the partition with the UUID and an ETCD label).
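    A sketch of the partitioning and formatting steps, assuming the DEVICE and UUID variables above and an ETCD filesystem label:

      parted -s "${DEVICE}" mklabel gpt mkpart primary ext4 0% 100%   # create a primary partition
      lsblk "${DEVICE}"                                               # verify the partition (for example, nvme1n1p1) was created
      mkfs.ext4 -U "${UUID}" -L ETCD "${DEVICE}p1"                    # format the partition as ext4 with the UUID and the ETCD label
      blkid "${DEVICE}p1"                                             # verify the UUID and LABEL="ETCD" are set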
  - Create the mount point of the etcd database.
  - Set the partition to mount upon operating system startup.
  - Load the changes to the filesystem.
  - Mount the partition that was added to /etc/fstab.
    If the output is not empty, stop the procedure. The etcd volume cannot be mounted. Review what was missed in the previous steps.
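    A sketch of the mount steps, assuming the UUID above and the mount point used in the verification step that follows:

      mkdir -p /var/lib/rancher/k3s/server/db                                              # create the mount point of the etcd database
      echo "UUID=${UUID} /var/lib/rancher/k3s/server/db ext4 defaults 0 0" >> /etc/fstab   # mount the partition at startup
      systemctl daemon-reload                                                              # load the /etc/fstab changes
      mount -a                                                                             # mount the new entry; empty output means success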
  - Verify the partition has been mounted (the output should return the block device and mount point).
    [<ADMIN> ~]# mount | grep "/var/lib/rancher/k3s/server/db"
    If the output is empty, stop the procedure. The etcd volume is not mounted. Review what was missed in the previous steps.
- Restore the etcd database.
  [<ADMIN> ~]# ETCD_ROOT_DIR="/var/lib/rancher/k3s/server/db"
  [<ADMIN> ~]# rsync -avP ${ETCD_BACKUP_DIR}/ ${ETCD_ROOT_DIR}/ && echo -e "\nOK\n" || echo -e "\nFail\n"
  The output should return OK.
- Start TOS.
  - Start the k3s service.
    Verify that there are no errors in the command output and that the service is active (running).
  - Enable the k3s service.
  - Verify that the k3s service is enabled.
    The output should return enabled.
  - Start TOS.
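  Assuming the standard k3s systemd service, a sketch of the k3s start and verification sub-steps is:

    systemctl start k3s        # start the k3s service
    systemctl status k3s       # should show active (running) with no errors
    systemctl enable k3s       # enable the k3s service at boot
    systemctl is-enabled k3s   # expected output: enabled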
- Check the cluster status.
  - On the primary data nodes, check the TOS status.
  - In the output, check that the System Status is Ok and that all the items listed under Components appear as Ok. If this is not the case, contact Tufin Support.
    Example output:
      [<ADMIN> ~]$ sudo tos status
      Tufin Orchestration Suite 2.0
      System Status: Ok
      System Mode: Multi Node
      Nodes: 1 Master, 1 Worker. Total 2 nodes. Nodes are healthy.
      Components:
      Node: Ok
      Cassandra: Ok
      Mongodb: Ok
      Mongodb_sc: Ok
      Nats: Ok
      Neo4j: Ok
      Postgres: Ok
      Postgres_sc: Ok