Cluster management#
A NethServer 8 cluster is composed of one leader node and multiple worker nodes.
An NS8 cluster composed only of the leader node is a fully functional system. Worker nodes can be added or removed at any time. NS8 clusters support a maximum of 4 nodes.
All nodes are managed by the Web user interface running on the leader node.
Add a node#
You can add (join) a worker node to an existing cluster. The process consists of three steps:
install the new node
obtain the join code from the leader node
enter the join code into the worker node
First, prepare a machine with the same Linux distribution used for the leader node, then follow the installation instructions up to the login to the Web user interface.
After logging in on the worker node, click the Join cluster button.
Ensure the node Fully Qualified Domain Name (FQDN) is correct and respects the DNS requirements.
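For instance, assuming the new node is named worker1.example.com (a hypothetical name), a quick sanity check from its root shell could be:
hostname -f
getent hosts worker1.example.com
The first command should print the expected FQDN, the second should resolve it to the node IP address.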
On the leader node, access the Nodes page, click on Add node to cluster, and copy the join code from the dialog box.
Return to the worker node, paste the code inside the Join code field, and click the Join cluster button.
If the leader node does not have a valid TLS certificate, remember to disable the TLS certificate validation option before clicking the join button.
When the node registration is complete, you can return to the leader user interface and install applications on the new worker node.
Remove a node#
Worker nodes can be removed from the cluster. Before removing a given worker node, ensure no account provider replica is running on it. In the Domains and users page, for each domain follow the N providers link to see the node where a provider replica is installed, then remove it.
Warning
If the node is not reachable, or is not responding, the provider replica removal must be completed manually after the node removal.
Access the Nodes page, go to the three-dots menu of the node and click on Remove from cluster to open a confirmation window. Applications installed on the node are listed: review that list carefully because node removal is not recoverable.
If the node removal window is confirmed by pushing the I understand, remove node button, the node and its applications are disconnected, their authorizations are revoked, and they can no longer access the cluster.
When a node is removed from the cluster, the applications running on it are not affected and are left in a running state. Shut down and switch off the node to finalize the node removal.
Promote a node to leader#
Adding and removing nodes might raise the need to change the cluster leader node.
A good leader node must be reachable by every worker node.
If DNS is used to find the leader IP address, every worker node must properly resolve the leader host name, and that name must be the same for every worker node.
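For instance, assuming the leader host name is leader.example.com (a hypothetical name), a quick check from a root shell on each worker node could be:
getent hosts leader.example.com
The command should return the same, reachable leader IP address on every worker node.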
Depending on the current leader node state there are two possible procedures to promote a node to leader role:
Reachable leader node
Unreachable leader node
In any case, after leader promotion it is necessary to perform these additional tasks:
The cluster backup password must be set again. See also Cluster backup.
See also the note in Audit trail about node promotion.
Reachable leader node#
If the current leader node is working properly, access the Nodes page, go to the three-dots menu of the node to promote, and click on Promote to leader.
Confirm or enter the leader host name in the VPN public address field. An IP address is accepted, too.
Confirm or enter the VPN public UDP port number. Every worker node will connect to the leader on that UDP port number.
When the confirmation string is typed, the I understand, promote the node button becomes active and it is possible to complete the node promotion.
The Check node connectivity checkbox verifies the connection of every node with the selected one. The check might fail due to the settings of other devices in the network, such as port forwarding. In this case, if you are sure the entered configuration is correct, it is possible to disable the check: do it at your own risk!
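After the promotion, the VPN status can also be double-checked from a root shell on any node. Assuming the cluster VPN interface is named wg0 (verify the actual interface name on your installation), a recent handshake timestamp for the leader peer indicates the node reaches the new leader on the configured UDP port:
wg show wg0 latest-handshakes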
Unreachable leader node#
If the current leader node is not reachable, it is necessary to run a command on every remaining worker node. Be prepared in advance for this situation by enabling SSH, console, or Cockpit terminal root access to the nodes.
For example, to promote the node with ID 3, VPN endpoint node3.example.com, UDP port 55820, run the following command on every worker node:
switch-leader --node 3 --endpoint node3.example.com:55820
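If SSH root access to the workers is available, the same command can be launched from a single shell; the worker host names below are only an example:
for h in node2.example.com node4.example.com; do ssh root@$h switch-leader --node 3 --endpoint node3.example.com:55820; done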
Administrators#
Cluster administrators can fully manage the cluster. It’s recommended to create a personal user for each cluster administrator. All actions executed by a cluster administrator are collected inside a security Audit trail.
To add a new cluster administrator, go to the Settings page and select the Cluster administrators card.
Then click the Create admin button and fill in the required fields.
An administrator can't delete their own user. To delete an administrator, you must log in as another existing cluster administrator.
Administrators can change their own password from the Account card inside the Settings page.
Two-factor authentication (2FA)#
Two-factor authentication (2FA) can be used to add an extra layer of security required to access the cluster management user interface.
The administrator can enable 2FA from the Account card inside the Settings page by clicking the Enable 2FA button.
The user will have to:
download and install the preferred 2FA application on the smartphone
scan the QR code with the 2FA application
generate a new code and copy it inside the verification field, then click Verify code
Smartphone applications#
There are several commercial and open source 2FA applications:
FreeOTP: available for both Android and iOS
Authenticator: available on iOS only
2FAS: available for both Android and iOS
Reset the cluster administrator password#
If you are locked out of the web user interface and you can still access a system command-line shell as root (e.g. by the system recovery console or SSH), run the following command to disable 2FA and reset the password:
api-cli run alter-user --data '{"user":"admin","set":{"password":"Nethesis,1234","2fa":false}}'
Replace the admin and Nethesis,1234 default credentials as needed.
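The same action can be used for other cluster administrator accounts as well. For instance, assuming an administrator named alice (a hypothetical user) lost her second factor, the following would reset her credentials:
api-cli run alter-user --data '{"user":"alice","set":{"password":"ChangeMe,1234","2fa":false}}'
The temporary password should then be changed from the Account card after the next login.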
Audit trail#
Inside the audit trail page, cluster administrators can inspect all actions executed by any other administrator. Each event of the audit trail contains at least:
date and time of the action
user name of the cluster administrator
name of the action
Audit trail events can be filtered by user, date, action type, and custom text match.
Note
Audit trail information is stored on the leader node disk. In case of a new leader promotion, the audit trail information on the old leader is no longer accessible.