Move etcd - In-place AWS Instance
Overview
This procedure is required for all clusters, including remote clusters, and is run on data nodes only.
The Kubernetes etcd database must be on a separate volume so that it has access to all the resources required for optimal TOS performance, stability, and minimal latency.
This procedure must be performed by an experienced Linux administrator with knowledge of network configuration.
Preliminary Preparations
- Run the following command:
  If the output contains /var/lib/rancher/k3s/server/db, etcd is already on a separate volume, and you do not need to perform this procedure.
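  The check command itself is not reproduced above; on a typical Linux node, an equivalent check (a sketch, not necessarily the documented command) is:

    # Look for the etcd path in the mounted filesystems; output containing
    # /var/lib/rancher/k3s/server/db means etcd is already on a separate volume.
    df -h | grep "/var/lib/rancher/k3s/server/db"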
- If you are going to perform this procedure over multiple maintenance periods, create a new backup each time.
- Create the backup using tos backup create:
  Example output:
    [<ADMIN> ~]$ sudo tos backup create
    [Aug 23 16:18:42] INFO Running backup
    Backup status can be monitored with "tos backup status"
- You can check the backup creation status using tos backup status, which shows the status of backups in progress. Wait until completion before continuing.
  Example output:
    [<ADMIN> ~]$ sudo tos backup status
    Found active backup "23-august-2021-16-18"
- Run the following command to display the list of backups saved on the node:
  Example output:
    [<ADMIN> ~]$ sudo tos backup list
    ["23-august-2021-16-18"]
    Started: "2021-08-23 13:18:43 +0000 UTC"
    Completed: "N/A"
    Modules: "ST, SC"
    HA mode: "false"
    TOS release: "21.2 (PGA.0.0) Final"
    TOS build: "21.2.2100-210722164631509"
    Expiration Date: "2021-09-22 13:18:43 +0000 UTC"
    Status: "Completed"
- Check that your backup file appears in the list, and that the status is "Completed".
- Run the following command to export the backup to a file:
  The command creates a single backup file.
  Example output:
    [<ADMIN> ~]$ sudo tos backup export
    [Aug 23 16:33:42] INFO Preparing target dir /opt/tufin/backups
    [Aug 23 16:33:42] INFO Compressing...
    [Aug 23 16:33:48] INFO Backup exported file: /opt/tufin/backups/backup-21-2-pga.0.0-final-20210823163342.tar.gzip
    [Aug 23 16:33:48] INFO Backup export has completed
- If your backup files are saved locally:
  - Run sudo tos backup export to save your backup file from a TOS backup directory as a single .gzip file. If there are other backups present, they will be included as well.
  - Transfer the exported .gzip file to a safe, remote location. Make sure you have the location of your backups safely documented and accessible, including the credentials needed to access them, for recovery when needed.
- After the backup is exported, we recommend verifying that the file contents can be viewed by running the following command:
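  The verification command itself is not reproduced above. Assuming the exported archive is a gzip-compressed tar file, as the .tar.gzip name in the example output suggests, a minimal sketch of such a check is:

    # List the archive contents without extracting; use the path reported by "tos backup export".
    tar -tzf /opt/tufin/backups/backup-21-2-pga.0.0-final-20210823163342.tar.gzip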
- Switch to the root user.
- Install the rsync RPM.
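  On a typical RHEL/CentOS-based data node, these two steps could look like the following sketch (the session and package-manager commands are generic Linux assumptions, not TOS-specific commands):

    sudo su -              # switch to the root user
    yum install -y rsync   # install the rsync RPM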
Mount The etcd Database to a Separate Volume
- Shut down TOS.
  - Shut down TOS, and wait for the following message:
    Deployment has been stopped successfully
  - Stop the k3s service.
  - Disable the k3s service.
  - Verify that the k3s service is stopped and disabled.
    The output should return inactive.
    The output should return disabled.
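  The k3s commands are not reproduced above. Assuming k3s runs as the standard k3s systemd service, a sketch of these sub-steps is:

    systemctl stop k3s        # stop the k3s service
    systemctl disable k3s     # disable the k3s service
    systemctl is-active k3s   # expected output: inactive
    systemctl is-enabled k3s  # expected output: disabled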
- Create a backup directory.
  - Create an etcd backup directory with a timestamp on the /opt partition.
  - Identify the path of the etcd backup directory.
  - Verify that the etcd backup directory is assigned to the ETCD_BACKUP_DIR variable.
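  A minimal sketch of these sub-steps, assuming a timestamped directory name under /opt (the exact naming convention is an assumption):

    ETCD_BACKUP_DIR="/opt/etcd-backup-$(date +%Y%m%d-%H%M%S)"   # timestamped backup directory on the /opt partition
    mkdir -p "${ETCD_BACKUP_DIR}"                               # create the directory
    echo "${ETCD_BACKUP_DIR}"                                   # identify and verify the path assigned to ETCD_BACKUP_DIR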
- Locate the etcd database.
  The purpose of this step is to identify whether the etcd database is located in the k3s directory, or whether, due to older architecture, the etcd database is located in the gravity directory.
  1. Check if there is a link to the etcd database.
     [<ADMIN> ~]# test -L /var/lib/rancher/k3s/server/db/etcd && echo "Etcd link exists."
     If the output is empty, no link exists. Proceed to step 2.
     If the output returns Etcd link exists, this indicates that the etcd database is in the gravity directory. Proceed to step 3.
  2. Check if the database is in the k3s directory.
     [<ADMIN> ~]# test -d /var/lib/rancher/k3s/server/db/etcd || echo "Etcd directory does not exist."
     If the output is empty, the etcd database is in the k3s directory. Do the following:
     - Assign the path of the k3s directory to the ETCD_ROOT_DIR variable (see the sketch after this step).
     - Proceed to back up the etcd database.
     If the output returns Etcd directory does not exist, this indicates that the etcd database could not be found. Stop the procedure and contact customer support.
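     A minimal sketch of the variable assignment, assuming the same db directory path that is used later in the restore step:

       # k3s etcd database location (assumption based on the restore step in this procedure)
       ETCD_ROOT_DIR="/var/lib/rancher/k3s/server/db"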
  3. Check if there is a link to the etcd database in the gravity directory.
     If the output is empty, this indicates that the etcd database is in the gravity directory. Do the following:
     - Assign the path of the gravity directory to the ETCD_ROOT_DIR variable:
     - Proceed to back up the etcd database.
     If the output returns Etcd directory does not exist, this indicates that the etcd database could not be found. Stop the procedure and contact customer support.
- Back up the etcd database.
  - Back up the etcd database to the backup directory you created.
    [<ADMIN> ~]# rsync -avP ${ETCD_ROOT_DIR}/ ${ETCD_BACKUP_DIR}/ && echo -e "\nOK\n" || echo -e "\nFail\n"
    The output should return OK.
  - If it exists, remove the etcd symbolic link.
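    A sketch of the symbolic link removal, using the link path checked earlier in this procedure:

      # Remove the etcd symbolic link only if it is present.
      if [ -L /var/lib/rancher/k3s/server/db/etcd ]; then
          rm /var/lib/rancher/k3s/server/db/etcd
      fi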
- Add a volume to the AWS instance.
  - In the navigation pane of the AWS console, go to the Volumes pane and click Create Volume.
  - Configure the following settings:
    - Volume type: SSD gp3
    - Size: Allocate a disk size of at least 128 GB
    - IOPS: 7500
    - Availability Zone: Same availability zone as the instance
    - Throughput: 250 MBps
    - Snapshot ID: Keep the default value
    - Encryption: If the volume is encrypted, it can only be attached to an instance type that supports Amazon EBS encryption.
  - Click Create Volume.
  - In the navigation pane, go to the Volumes pane and click Actions > Attach Volume.
  - In Instance, enter the ID of the instance or select the instance from the list of options.
  - In Device name, select an available device name from the Recommended for data volumes section of the list.
    The device name is used by Amazon EC2. The block device driver for the instance might assign a different device name when mounting the volume.
  - Connect to the instance and proceed to the next step.
- Mount the new volume.
  - Log in to the data node as the root user.
  - Verify that the new volume is recognized by the OS.
    Compare the output with the name of the volume you created in the previous step, and verify that the index number in the volume name is the latest number. In most cases this will be 1. For example, if this is the only volume, the output will be nvme1n1. If an older volume was previously created, the new volume will be nvme2n1.
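    The listing command is not reproduced above; a common way to list the block devices (a generic Linux sketch) is:

      lsblk   # the new EBS volume should appear, for example, as nvme1n1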
  - Create a variable with the block device path of the new volume,
    where <> represents the index number of the new volume.
  - Generate a UUID for the block device of the new volume.
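    A sketch of these two sub-steps; the device path and variable names are assumptions based on the nvme naming shown above:

      DEVICE="/dev/nvme<N>n1"   # replace <N> with the index number of the new volume, for example 1
      UUID=$(uuidgen)           # generate a UUID to use when formatting the partition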
  - Create a primary partition on the new volume.
  - Verify that the partition was created.
  - Format the partition as ext4.
  - Verify that the partition has been formatted with the UUID and the etcd label (the output should return the partition with the UUID and an ETCD label).
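    A sketch of the partitioning and formatting steps, assuming the DEVICE and UUID variables above and an ETCD filesystem label:

      parted -s "${DEVICE}" mklabel gpt mkpart primary ext4 0% 100%   # create a primary partition
      lsblk "${DEVICE}"                                               # verify the partition (for example, nvme1n1p1) was created
      mkfs.ext4 -U "${UUID}" -L ETCD "${DEVICE}p1"                    # format the partition as ext4 with the UUID and the ETCD label
      blkid "${DEVICE}p1"                                             # verify the UUID and LABEL="ETCD" are set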
  - Create the mount point of the etcd database.
  - Set the partition to mount upon operating system startup.
  - Load the changes to the filesystem.
  - Mount the partition that was added to /etc/fstab.
    If the output is not empty, stop the procedure. The etcd volume cannot be mounted. Review what was missed in the previous steps.
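    A sketch of the mount steps, assuming the UUID above and the mount point used in the verification step that follows:

      mkdir -p /var/lib/rancher/k3s/server/db                                              # create the mount point of the etcd database
      echo "UUID=${UUID} /var/lib/rancher/k3s/server/db ext4 defaults 0 0" >> /etc/fstab   # mount the partition at startup
      systemctl daemon-reload                                                              # load the /etc/fstab changes
      mount -a                                                                             # mount the new entry; empty output means success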
  - Verify the partition has been mounted (the output should return the block device and mount point).
    [<ADMIN> ~]# mount | grep "/var/lib/rancher/k3s/server/db"
    If the output is empty, stop the procedure. The etcd volume is not mounted. Review what was missed in the previous steps.
- Restore the etcd database.
  [<ADMIN> ~]# ETCD_ROOT_DIR="/var/lib/rancher/k3s/server/db"
  [<ADMIN> ~]# rsync -avP ${ETCD_BACKUP_DIR}/ ${ETCD_ROOT_DIR}/ && echo -e "\nOK\n" || echo -e "\nFail\n"
  The output should return OK.
- Start TOS.
  - Start the k3s service.
    Verify that there are no errors in the command output and that the service is active (running).
  - Enable the k3s service.
  - Verify that the k3s service is enabled.
    The output should return enabled.
  - Start TOS.
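  Assuming the standard k3s systemd service, a sketch of the k3s start and verification sub-steps is:

    systemctl start k3s        # start the k3s service
    systemctl status k3s       # should show active (running) with no errors
    systemctl enable k3s       # enable the k3s service at boot
    systemctl is-enabled k3s   # expected output: enabled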
- Check the cluster status.
  - On the primary data nodes, check the TOS status.
  - In the output, check that the System Status is Ok and that all the items listed under Components appear as Ok. If this is not the case, contact Tufin Support.
    Example output:
      [<ADMIN> ~]$ sudo tos status
      Tufin Orchestration Suite 2.0
      System Status: Ok
      System Mode: Multi Node
      Nodes: 1 Master, 1 Worker. Total 2 nodes. Nodes are healthy.
      Components:
      Node: Ok
      Cassandra: Ok
      Mongodb: Ok
      Mongodb_sc: Ok
      Nats: Ok
      Neo4j: Ok
      Postgres: Ok
      Postgres_sc: Ok