Move etcd - HA Non-Cloud VM

Overview

The etcd database should be on a separate disk to improve the stability of TOS Aurora and reduce latency. Moving the etcd database to a separate disk ensures that the Kubernetes database has access to all the resources it needs for optimal TOS performance.

This procedure is only required for data nodes running TufinOS 4, Rocky Linux 8, or RHEL 8.

This procedure must be performed by an experienced Linux administrator with knowledge of network configuration.

Only perform this procedure if you are mounting the etcd database on an in-place high availability deployment. If you are setting up a new high availability deployment, first follow the instructions in High Availability, and then return to this procedure. With new high availability deployments, TOS Aurora should only be installed on one data node.

Preliminary Preparations

  1. Run the following command:

    lsblk | grep "/var/lib/rancher/k3s/server/db"

    If the output contains /var/lib/rancher/k3s/server/db, etcd is already on a separate disk, and you do not need to perform this procedure.

    If no output is returned, continue the procedure.
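
    Optionally, you can cross-check with findmnt (a standard util-linux tool; this alternative is a suggestion, not part of the documented procedure), which reports whether the path is a dedicated mount point:

    [<ADMIN> ~]$ findmnt /var/lib/rancher/k3s/server/db

    If the path is already a separate mount, findmnt prints its entry; if it prints nothing, continue the procedure.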

  2. Create a backup. If you are going to perform this procedure over multiple maintenance periods, create a new backup each time.

    1. Create the backup using tos backup create:

      [<ADMIN> ~]$ sudo tos backup create

      Example output:

      [<ADMIN> ~]$ sudo tos backup create
      [Aug 23 16:18:42]  INFO Running backup
      Backup status can be monitored with "tos backup status"
    2. You can check the backup creation status using tos backup status, which shows the status of backups in progress. Wait until completion before continuing.

      [<ADMIN> ~]$ sudo tos backup status

      Example output:

      [<ADMIN> ~]$ sudo tos backup status
       Found active backup "23-august-2021-16-18"
    3. Run the following command to display the list of backups saved on the node:

      [<ADMIN> ~]$ sudo tos backup list

      Example output:

      [<ADMIN> ~]$ sudo tos backup list
       ["23-august-2021-16-18"]
         Started: "2021-08-23 13:18:43 +0000 UTC"
         Completed: "N/A"
         Modules: "ST, SC"
         HA mode: "false"
         TOS release: "21.2 (PGA.0.0) Final"
         TOS build: "21.2.2100-210722164631509"
         Expiration Date: "2021-09-22 13:18:43 +0000 UTC"
         Status: "Completed"
    4. Check that your backup file appears in the list, and that the status is "Completed".

    5. Run the following command to export the backup to a file:

      [<ADMIN> ~]$ sudo tos backup export

      The command creates a single backup file:

      [<ADMIN> ~]$ sudo tos backup export
       [Aug 23 16:33:42]  INFO Preparing target dir /opt/tufin/backups
       [Aug 23 16:33:42]  INFO Compressing...
       [Aug 23 16:33:48]  INFO Backup exported file: /opt/tufin/backups/backup-21-2-pga.0.0-final-20210823163342.tar.gzip 
       [Aug 23 16:33:48]  INFO Backup export has completed
    6. If your backup files are saved locally:

      1. Run sudo tos backup export to save the backup files from the TOS backup directory as a single .gzip file. If other backups are present, they will be included as well.

      2. Transfer the exported .gzip file to a safe, remote location.

        Make sure the location of your backups is documented and accessible, including any credentials needed to access them, so they are available for recovery when needed.

      After the backup is exported, we recommend verifying that the file contents can be viewed by running the following command:

      [Target location]$ tar tzvf <filename>
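
      For example, a transfer to a remote location could look like the following; the host backup-store.example.com, the user, and the destination directory are placeholders for your own environment, and the file name is taken from the example output above:

      [<ADMIN> ~]$ scp /opt/tufin/backups/backup-21-2-pga.0.0-final-20210823163342.tar.gzip backupuser@backup-store.example.com:/backups/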
  3. Switch to the root user.

    [<ADMIN> ~]$ sudo su -
  4. Non-TufinOS VMs only. Install the rsync RPM.

    [<ADMIN> ~]# dnf install rsync
  5. Find the name of the last disk added to the VM.

    [<ADMIN> ~]# lsblk -ndl -o NAME

    The output returns the list of disks on the VM. The last letter of the disk name indicates the order in which it was added, for example: sda, sdb, sdc.

  6. Save the name of the last disk in a separate location. You will need it later for verification purposes.
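
    For example, one simple way to record it is to write it to a file; the file path here is only an illustration, and tail -1 assumes the last line of the lsblk output is the last disk added, matching the note in the previous step:

    [<ADMIN> ~]# lsblk -ndl -o NAME | tail -1 > /root/last-disk-before-etcd-move.txt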

Mount The etcd Database to a Separate Disk

Shut Down TOS

  1. Shut down TOS.

    [<ADMIN> ~]# tos stop
  2. Wait for the following message:

    Deployment has been stopped successfully
  3. Stop the k3s service.

    [<ADMIN> ~]# systemctl stop k3s.service
  4. Disable the k3s service.

    [<ADMIN> ~]# systemctl disable k3s.service
  5. Verify that the k3s service is stopped and disabled.

    [<ADMIN> ~]# systemctl is-active k3s.service

    Output should return inactive.

    [<ADMIN> ~]# systemctl is-enabled k3s.service

    Output should return disabled.
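
    If you prefer a single command that shows both properties together, systemctl status reports the loaded (enabled/disabled) state and the active state in one output. This is a convenience only, not a required step:

    [<ADMIN> ~]# systemctl status k3s.service --no-pager

    The Loaded line should show disabled and the Active line should show inactive (dead).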

Mount The etcd Database

Repeat these steps for each data node.
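
The outline below is an illustrative sketch only, not the exact Tufin procedure; it assumes the new disk is /dev/sdb (substitute the disk name you recorded in the Preliminary Preparations), an ext4 filesystem, and a temporary mount point of /mnt/etcd-new. Adjust each of these to your environment.

Create a filesystem on the new disk and mount it temporarily:

    [<ADMIN> ~]# mkfs.ext4 /dev/sdb
    [<ADMIN> ~]# mkdir -p /mnt/etcd-new
    [<ADMIN> ~]# mount /dev/sdb /mnt/etcd-new

Copy the existing etcd data to the new disk with rsync, preserving permissions and ownership:

    [<ADMIN> ~]# rsync -a /var/lib/rancher/k3s/server/db/ /mnt/etcd-new/

Unmount the temporary mount point, mount the disk over the etcd path, and add a matching entry to /etc/fstab (for example by UUID, obtained with blkid /dev/sdb) so the mount persists across reboots:

    [<ADMIN> ~]# umount /mnt/etcd-new
    [<ADMIN> ~]# mount /dev/sdb /var/lib/rancher/k3s/server/db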

Start TOS

  1. Start the k3s service.

    [<ADMIN> ~]# systemctl start k3s.service

    Verify that there are no errors in the command output and that the service is active (running).

  2. Enable the k3s service.

    [<ADMIN> ~]# systemctl enable k3s.service
  3. Verify that the k3s service is enabled.

    [<ADMIN> ~]# systemctl is-enabled k3s.service

    The output should return enabled.

  4. Primary data node only. Start TOS.

    [<ADMIN> ~]# tos start
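
  Before continuing, you can confirm that the k3s service is running, as required in step 1, by reusing the check from the shutdown phase; the output should now return active:

    [<ADMIN> ~]# systemctl is-active k3s.service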

Check the Cluster Status

  1. On the primary data node, check the TOS status.

    [<ADMIN> ~]$ sudo tos status
  2. In the output, check that the System Status is Ok and that all the items listed under Components appear as Ok. If this is not the case, contact Tufin Support.

    Example output:

    [<ADMIN> ~]$ sudo tos status
     Tufin Orchestration Suite 2.0
    
     System Status: Ok
     System Mode:   Multi Node
    
     Nodes:
       1 Master, 1 Worker. Total 2 nodes. Nodes are healthy.
    
     Components:
       Node:            Ok
       Cassandra:       Ok
       Mongodb:         Ok
       Mongodb_sc:      Ok
       Nats:            Ok
       Neo4j:           Ok
       Postgres:        Ok
       Postgres_sc:     Ok
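
  Optionally, you can re-run the check from the Preliminary Preparations to confirm that the etcd database is now on its own disk. This time the output should contain /var/lib/rancher/k3s/server/db:

    [<ADMIN> ~]$ lsblk | grep "/var/lib/rancher/k3s/server/db"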