Deploy High Availability

This procedure explains how to deploy high availability in a TOS environment. For more information on the high availability architecture, see High Availability Architecture.

Prerequisites

  • Prepare two additional machines, one for each data node to be added to the cluster, making three in total, including the primary data node.

    High availability is supported for Tufin Appliances, VMware VMs, GCP and Open Server deployments.

  • All the nodes in the cluster need to be connected to the same L2 network and share the same subnet.

  • Each machine requires:

    • The same resources as allocated to your primary data node.

    • Allocation of a dedicated static IP address.

    • A supported operating system:

      • TufinOS 4.50

      • Red Hat Enterprise Linux 8.10

      • Rocky Linux 8.10

    • TOS installed.

No dedicated network interface is needed to enable high availability in TOS. Communication between the nodes is handled internally within the cluster.
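
The same-subnet prerequisite above can be sanity-checked before deployment. Below is a minimal sketch; the IP addresses and CIDR are illustrative examples, not values from this guide:

```python
import ipaddress

def nodes_share_subnet(node_ips, cidr):
    """Return True if every node IP belongs to the given subnet."""
    net = ipaddress.ip_network(cidr)
    return all(ipaddress.ip_address(ip) in net for ip in node_ips)

# Example: a primary data node plus the two additional data nodes.
nodes = ["10.20.30.11", "10.20.30.12", "10.20.30.13"]
print(nodes_share_subnet(nodes, "10.20.30.0/24"))  # True
```

If any node reports False, move it onto the cluster subnet (and the same L2 network) before proceeding.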

Add Data Nodes

After adding the first new data node, proceed immediately to add the second. Running with only two data nodes (the primary data node and one additional data node) does not allow high availability to be enabled and will make the cluster unstable.
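
One common way to see why a two-node data cluster is fragile is the majority-quorum model used by many replicated systems. The sketch below is an illustration of that general model, not a statement about TOS internals:

```python
# Illustration only: majority quorum vs. cluster size (a general
# distributed-systems model, assumed here, not documented TOS behavior).
def quorum(n):
    """Smallest majority of n nodes."""
    return n // 2 + 1

def tolerable_failures(n):
    """Number of nodes that can fail while a majority survives."""
    return n - quorum(n)

for n in (1, 2, 3):
    print(f"{n} nodes: quorum={quorum(n)}, tolerable failures={tolerable_failures(n)}")
# With 2 nodes the quorum is 2, so losing either node halts the cluster;
# with 3 nodes the quorum is still 2, so one node can fail safely.
```

This is why the procedure moves from one data node directly to three.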

  1. Log in to the primary data node.

  2. On the primary data node:

    [<ADMIN> ~]$ sudo tos cluster node add --role=data

    On completion, a command string is displayed, which you must run on the new node within 30 minutes. If the allotted time expires, repeat this step to generate a new command string.

  3. Copy the command string to the clipboard.

  4. Log in to the new node.

  5. On the new node, paste the command string you copied and run it. If the 30-minute window has expired, repeat the procedure from step 2.
  6. Verify that the node was added by running sudo tos cluster node list on the primary data node.
  7. On the primary data node, check the TOS status.

    [<ADMIN> ~]$ sudo tos status
  8. In the output, check that System Status is Ok and that all listed components appear as ok. If this is not the case, contact Tufin Support.

    Example output:

    [<ADMIN> ~]$ sudo tos status         
    [Mar 28 13:42:09]  INFO Checking cluster health status           
    TOS Aurora
    Tos Version: 24.2 (PRC1.1.0)
    
    System Status: "Ok"
                
    Cluster Status:
       Status: "Ok"
       Mode: "Preparing High Availability"
    
    Nodes
      Nodes:
      - ["datanode1"]
        Type: "Primary"
        Status: "Ok"
        Disk usage:
        - ["/opt"]
          Status: "Ok"
          Usage: 32%
      - ["datanode2"]
        Type: "Data Node"
        Status: "Ok"
        Disk usage:
        - ["/opt"]
          Status: "Ok"
          Usage: 16%
     
    registry
      Expiration ETA: 819 days
      Status: "Ok"
    
    Infra
    Databases:
    - ["cassandra"]
      Status: "Ok"
    - ["kafka"]
      Status: "Ok"
    - ["mongodb"]
      Status: "Ok"
    - ["mongodb_sc"]
      Status: "Ok"
    - ["ongDb"]
      Status: "Ok"
    - ["postgres"]
      Status: "Ok"
    - ["postgres_sc"]
      Status: "Ok"
    
    Application
    Application Services Status OK
    Running services 54/54
    
      Backup Storage:
      Location: "Local s3:http://minio.default.svc:9000/velerok8s/restic/default"
      Status: "Ok"
      Latest Backup: 2024-03-23 05:00:34 +0000 UTC
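
When automating health checks, output like the example above can also be scanned programmatically. Below is a minimal sketch that assumes the plain-text layout shown in this guide (it is not an official TOS interface); it flags any Status value other than "Ok":

```python
import re

def failing_statuses(status_text):
    """Return the value of every `Status:` line that is not "Ok"."""
    statuses = re.findall(r'Status:\s*"([^"]+)"', status_text)
    return [s for s in statuses if s != "Ok"]

# Sample fragment in the same layout as the example output above.
sample = '''
System Status: "Ok"

Cluster Status:
   Status: "Ok"
'''
print(failing_statuses(sample))  # [] -> everything healthy
```

An empty list means every reported status is "Ok"; any other result warrants a closer look at the full `sudo tos status` output.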

Enable High Availability

  1. On the primary data node:

    [<ADMIN> ~]$ sudo tos cluster ha enable

    Replication of data will commence. The time to completion will vary depending on the size of your database.

    On completion, TOS will be in high availability mode.

  2. Verify that HA is active by running sudo tos status.

  3. We recommend defining a notification to alert you to any change in the health of your cluster; see TOS Monitoring.

  4. TOS is now running in high availability mode.