X-Road® is an open-source software and ecosystem solution that provides unified and secure data exchange between organisations.
The X-Road Data Exchange Layer is a standardised, cohesive, collaborative, interoperable and secure data exchange layer that gives service providers a completely new kind of opportunity to make themselves visible in services directed at citizens, businesses and civil servants. Creating entities that combine many different services and data sources is easy and cost-efficient.
- Improves the quality of existing services and products
- Enables new types of service innovations
- Savings in infrastructure, archiving and other costs
- Standardised data security and privacy protection
- Easy implementation, data access via interfaces – after connecting, all included services are available
For an in-depth look at the X-Road architecture, take a look at the architecture documents section of the documentation.
The X-Road software packages to be installed vary between different use cases.
If you are joining an existing X-Road ecosystem, you should familiarise yourself with the ecosystem-specific documentation before moving to the Security Server installation guides. The X-Road ecosystem-specific documentation is provided and maintained by the ecosystem's X-Road Operator.
To learn more about installing the software, please visit the appropriate guide for your operating system of choice. See the X-Road website for more information about X-Road.
Instead, if you're setting up a new X-Road ecosystem, it's strongly recommended to study the additional resources and the architecture documents on this page. For a new X-Road ecosystem, it's required to set up the Central Server and a management Security Server.
The free courses provide online training for developers, users, operators, consultants, service providers and for anyone willing to learn more about X-Road.
The X-Road Security Server packages are officially available for Ubuntu 20.04, Ubuntu 22.04, RHEL7 and RHEL8. Additionally, we provide Docker support.
To learn about setting up a Security Server cluster with an external load balancer, please take a look at the relevant documentation.
The X-Road Central Server packages are currently only available for Ubuntu 20.04 and Ubuntu 22.04. You can find the installation manual in the documentation.
To learn about setting up a Central Server cluster, please also take a look at the relevant documentation.
Setting up a fully functional X-Road environment requires a Certificate Authority (CA) with an OCSP service and a time-stamping service. A test CA can be used for testing and development purposes.
Version: 1.17 Doc. ID: IG-XLB
| Date | Version | Description | Author |
|------|---------|-------------|--------|
| 22.3.2017 | 1.0 | Initial version | Jarkko Hyöty, Olli Lindgren |
| 27.4.2017 | 1.1 | Added slave node user group instructions | Tatu Repo |
| 15.6.2017 | 1.2 | Added health check interface maintenance mode | Tatu Repo |
| 21.6.2017 | 1.3 | | Olli Lindgren |
| 02.03.2018 | 1.4 | Added uniform terms and conditions reference | Tatu Repo |
| 15.11.2018 | 1.5 | Updates for Ubuntu 18.04 support | Jarkko Hyöty |
| 20.12.2018 | 1.6 | Update upgrade instructions | Jarkko Hyöty |
| 11.09.2019 | 1.7 | Remove Ubuntu 14.04 support | Jarkko Hyöty |
| 08.10.2020 | 1.8 | Added notes about API keys and caching | Janne Mattila |
| 19.10.2020 | 1.9 | Remove xroad-jetty and nginx mentions and add xroad-proxy-ui-api | Caro Hautamäki |
| 19.10.2020 | 1.10 | Added information about management REST API permissions | Petteri Kivimäki |
| 23.12.2020 | 1.11 | Updates for Ubuntu 20.04 support | Jarkko Hyöty |
| 02.07.2021 | 1.12 | Updates for state sync | Jarkko Hyöty |
| 25.08.2021 | 1.13 | Update X-Road references from version 6 to 7 | Caro Hautamäki |
| 17.09.2021 | 1.14 | Add note about the proxy health check now also checking global conf validity | Caro Hautamäki |
| 17.06.2022 | 1.15 | Replace the word "replica" with "secondary" | Petteri Kivimäki |
| 26.09.2022 | 1.16 | Remove Ubuntu 18.04 support | Andres Rosenthal |
| 01.03.2023 | 1.17 | Updates for user groups in secondary nodes | Petteri Kivimäki |
This document is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.
The intended audience of this installation guide is X-Road security server administrators responsible for installing and configuring X-Road security servers to use external load balancing. The document is intended for readers with a good knowledge of Linux server management, computer networks, database administration, clustered environments and the X-Road functioning principles.
This document describes the external load balancing support features implemented by X-Road and the steps necessary to configure security servers to run as a cluster where each node has an identical configuration, including their keys and certificates. X-Road security server configuration changes are handled by a single primary server and one or more secondary servers.
The primary goal of the load balancing support is, as the name suggests, load balancing, not fault tolerance. A clustered environment increases fault tolerance but some X-Road messages can still be lost if a security server node fails.
The load balancing support is implemented with a few assumptions about the environment that users should be aware of. Carefully consider these assumptions before deciding if the supported features are suitable for your needs.
- Adding or removing nodes to or from the cluster is infrequent. New nodes need to be added manually and this takes some time.
- Changes to the configuration files are relatively infrequent and some downtime in the ability to propagate the changes can be tolerated.
- The cluster uses a primary-secondary model and the configuration primary is not replicated.
  - Changes to the `serverconf` database, authorization and signing keys are applied via the configuration primary, which is a member of the cluster. The replication is one-way from primary to secondaries and the secondaries should treat the configuration as read-only.
  - The cluster nodes can continue operation if the primary fails but the configuration cannot be changed until:
    - the primary comes back online, or
    - some other node is manually promoted to be the primary.
- If a node fails, the messages being processed by that node are lost.
  - It is the responsibility of the load balancer component to detect the failure and route further messages to other nodes. Because there can be some delay before the failure is noticed, some messages might be lost during that delay.
  - Recovering lost messages is not supported.
- Configuration updates are asynchronous and the cluster state is eventually consistent.
  - If the primary node fails or communication is interrupted during a configuration update, each secondary should have a valid configuration, but the cluster state can be inconsistent (some members might have the old configuration while some might have received all the changes).
When external security servers communicate with the cluster, they see only the public IP address of the cluster, which is registered to the global configuration as the security server address. From the caller's point of view, this case is analogous to making a request to a single security server.
When a security server makes a request to an external server (security server, OCSP, TSA or a central server), the external server sees only the public IP address. Note that depending on the configuration, the public IP address might be different from the one used in the previous scenario. It should also be noted that the security servers will independently make requests to OCSP and TSA services as well as to the central server to fetch the global configuration as needed.
#### 2.3.1.1 `serverconf` database replication

| Data | Replication | Replication method |
|------|-------------|---------------------|
| `serverconf` database | required | PostgreSQL streaming replication (Hot standby) |
The `serverconf` database replication is done using streaming replication with hot standby. Note that PostgreSQL replication is all-or-nothing: it is not possible to exclude databases from the replication. This is why the replicated `serverconf` and non-replicated `messagelog` databases need to be separated into different instances.
#### 2.3.1.2 Key configuration and software token replication from `/etc/xroad/signer/*`

| Data | Replication | Replication method |
|------|-------------|---------------------|
| keyconf and the software token | replicated | rsync+ssh (scheduled) |
The secondary nodes use the `keyconf.xml` in read-only mode: no changes made from the admin UI are persisted to disk. Secondaries reload the configuration from disk periodically and apply the changes to their running in-memory configuration.
#### 2.3.1.3 Other server configuration parameters from `/etc/xroad/*`

| Data | Replication | Replication method |
|------|-------------|---------------------|
| other server configuration parameters | replicated | rsync+ssh (scheduled) |
The following configurations are excluded from replication:

- `db.properties` (node-specific)
- `postgresql/*` (node-specific keys and certs)
- `globalconf/` (syncing globalconf could conflict with `confclient`)
- `conf.d/node.ini` (specifies node type: primary or secondary)
#### 2.3.2.1 `messagelog` database

The `messagelog` database is not replicated. Each node has its own separate messagelog database. However, in order to support PostgreSQL streaming replication (hot standby mode) for the `serverconf` data, the `serverconf` and `messagelog` databases must be separated. This requires modifications to the installation (a separate PostgreSQL instance is needed for the messagelog database) and has some implications for the security server resource requirements, as a separate instance uses some memory.
#### 2.3.2.2 OCSP responses from `/var/cache/xroad/`

The OCSP responses are currently not replicated. Replicating them could make the cluster more fault tolerant, but the replication mechanism must not itself become a single point of failure. A distributed cache could be used for the responses.
This chapter details the complete installation on a high level, with links to other chapters that go into the details.
In order to properly set up the data replication, the secondary nodes must be able to connect to:

- the primary server using SSH (tcp port 22), and
- the primary `serverconf` database (e.g. tcp port 5433).
Install the X-Road security server packages using the normal installation procedure or use an existing standalone node.
Stop the xroad services.
Change `/etc/xroad/db.properties` to point to the separate database instance:

- `serverconf.hibernate.connection.url`: change the URL port number from `5432` to `5433` (or the port you specified)
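For example, assuming the new instance listens on port 5433 and the database name stays `serverconf`, the URL would look roughly like this (only the port changes; the rest of the file stays as installed):

```ini
serverconf.hibernate.connection.url = jdbc:postgresql://127.0.0.1:5433/serverconf
```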
Additionally, the `rssh` shell can be used to restrict secondary access further, but note that it is not available on RHEL.
Configure the node type as `master` in `/etc/xroad/conf.d/node.ini`:
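A minimal sketch of the file contents, assuming the standard `[node]` section:

```ini
[node]
type=master
```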
Change the owner and group of the file to `xroad:xroad` if it is not already.
Disable support for client-side pooled connections (HTTP connection persistence) in `/etc/xroad/conf.d/local.ini`. Because the load balancing works at the TCP level, disabling persistent HTTP connections is recommended so that the load balancer can distribute the traffic evenly.
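A sketch of the corresponding setting; the parameter name below is an assumption based on the standard proxy configuration and should be verified against your X-Road version:

```ini
[proxy]
server-support-clients-pooled-connections=false
```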
Start the X-Road services.
Install security server packages using the normal installation procedure. Alternatively, you can install only the packages required for secondary nodes. The `xroad-proxy-ui-api` package can be omitted, but the admin graphical user interface (which is provided by this package) can be handy for diagnostics. It should be noted that changing a secondary's configuration via the admin GUI is not possible (except entering the token PIN).
Stop the `xroad` services.
Change `/etc/xroad/db.properties` to point to the separate database instance and change the password to match the one defined in the primary database (the password is part of the data that is replicated to the secondaries):

- `serverconf.hibernate.connection.url`: change the URL port number from `5432` to `5433` (or the port you specified)
- `serverconf.hibernate.connection.password`: change to match the primary db's password (in plaintext)
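A sketch of the affected lines, assuming port 5433 and a hypothetical password placeholder:

```ini
serverconf.hibernate.connection.url = jdbc:postgresql://127.0.0.1:5433/serverconf
serverconf.hibernate.connection.password = <primary-database-password>
```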
Set up SSH between the primary and the secondary (the secondary must be able to access `/etc/xroad` via SSH):
- Create an SSH key pair for the `xroad` user and copy the public key to the authorized keys of the primary node (`/home/xroad-slave/.ssh/authorized_keys`).
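A sketch of the key setup, assuming the `xroad-slave` user described above exists on the primary:

```bash
# On the secondary: create an SSH key pair for the xroad user
sudo -i -u xroad ssh-keygen -t rsa -b 4096
# Then append the generated public key (id_rsa.pub) to
# /home/xroad-slave/.ssh/authorized_keys on the primary node.
```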
Make the initial synchronization between the primary and the secondary.
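A sketch of the initial sync, run on the secondary node; the exclude list should match the node-specific files listed in section 2.3.1.3:

```bash
# Pull /etc/xroad from the primary as the xroad-slave SSH user,
# excluding node-specific configuration
sudo -i -u xroad rsync -e ssh -avz --delete \
    --exclude db.properties --exclude postgresql --exclude conf.d/node.ini \
    xroad-slave@<primary>:/etc/xroad/ /etc/xroad/
```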
Where `<primary>` is the primary server's DNS name or IP address.
Configure the node type as `slave` in `/etc/xroad/conf.d/node.ini`:
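A minimal sketch, mirroring the primary's file but with the secondary node type:

```ini
[node]
type=slave
```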
Change the owner and group of the file to `xroad:xroad` if it is not already.
Start the X-Road services.
If you wish to use the secondary security server's admin user interface, you need to implement additional user group restrictions. As noted in the installation steps above, changes to the secondary node security server configuration must not be made through its admin user interface, as any such changes would be overwritten by the replication. To disable UI editing privileges for all users, remove the following user groups from the secondary security server:
- `xroad-registration-officer`
- `xroad-service-administrator`
- `xroad-system-administrator`
Note: `xroad-security-officer` should remain, otherwise you will not be able to enter token PIN codes.
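A sketch of removing the groups on a secondary node (verify the group management commands against your distribution):

```bash
sudo groupdel xroad-registration-officer
sudo groupdel xroad-service-administrator
sudo groupdel xroad-system-administrator
```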
After removing these groups, the super user created during the security server installation is a member of two UI privilege groups: `xroad-securityserver-observer` and `xroad-security-officer`. These groups allow read-only access to the admin user interface and provide a safe way to use the UI for checking the configuration status of the secondary security server. In addition, the groups allow the user to enter the token PIN code. Since admin UI users are UNIX users that are members of specific privilege groups, more users can be added to the groups as necessary. Security server installation scripts detect the node type of existing installations and modify user group creation accordingly. However, version upgrades do not overwrite or modify this configuration.
Also, the secondary security server's management REST API can be used to read the secondary's configuration. However, modifying the secondary's configuration using the management REST API is blocked. API keys are replicated from the primary to the secondaries, and the keys that are associated with the `xroad-securityserver-observer` role have read-only access to the secondary. In addition, the keys that are associated with the `xroad-securityserver-observer` and `xroad-security-officer` roles are able to enter token PIN codes. The keys that are not associated with the `xroad-securityserver-observer` role don't have any access to the secondary. See the next item for more details.
Note about API keys and caching: if API keys have been created for the primary node, those keys are replicated to the secondaries, like everything else in the `serverconf` database. The keys that are associated with the `xroad-securityserver-observer` role have read-only access to the secondary. In contrast, the keys that are not associated with the `xroad-securityserver-observer` role don't have any access to the secondary, and API calls will fail. To avoid this, the secondary REST API should only be accessed using keys associated with the `xroad-securityserver-observer` role, and only for operations that read configuration, not update it.
Furthermore, API keys are accessed through a cache that assumes that all updates to keys (e.g. revoking keys or changing permissions) are done using the same node. If API keys are changed on the primary, the changes are not reflected in the secondary caches until the next time the `xroad-proxy-ui-api` process is restarted. To address this issue, you should restart the secondary nodes' `xroad-proxy-ui-api` processes after API keys are modified (and the database has been replicated to the secondaries) to ensure correct operation.
Improvements to API key handling in clustered setups will be included in later releases.
The load balancing support includes a health check service that can be used to ping the security server using HTTP to see if it is healthy and likely to be able to send and receive messages. The service is disabled by default but can be enabled via configuration options.
| Parameter | Default value | Description |
|-----------|---------------|-------------|
| `health-check-interface` | `0.0.0.0` (all network interfaces) | The network interface this service listens to. This should be an address the load balancer component can use to check the server status. |
| `health-check-port` | `0` (disabled) | The TCP port the service listens to for HTTP requests. The default value `0` disables the service. |
Below is a configuration that can be added to `/etc/xroad/conf.d/local.ini` on the primary that would enable the health check service on all the nodes once the configuration has been replicated. Changes to the settings require restarting the `xroad-proxy` service to take effect. This example enables listening on all available network interfaces (`0.0.0.0`) on port 5588.
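A minimal sketch, assuming the parameters listed above belong under the `[proxy]` section:

```ini
[proxy]
health-check-interface=0.0.0.0
health-check-port=5588
```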
The service can be accessed using plain HTTP. It will return `HTTP 200 OK` if the proxy should be able to process messages and `HTTP 500 Internal Server Error` otherwise. A short message about the failure reason, if available, is added to the body of the response. The service runs as a part of the `xroad-proxy` service.
In addition to implicitly verifying that the `xroad-proxy` service is running, the health checks verify that:

- The server authentication key is accessible and the OCSP response for the certificate is `good`. This requires a running `xroad-signer` service in good condition.
- The `serverconf` database is accessible.
- The global configuration is valid and not expired.
Each of these status checks has a separate timeout of 5 seconds. If the status check fails to produce a response in this time, it will be considered a health check failure and will cause an `HTTP 500` response.
In addition, each status check result will be cached for a short while to avoid excess resource usage. A successful status check result will be cached for 2 seconds before a new verification is triggered. This is to make sure the OK results are as fresh as possible while avoiding per-request verification. In contrast, verification failures will be cached for 30 seconds before a new verification is triggered. This should allow the security server to get up and running after a failure or possible reboot before the status is queried again.
The security server's health check interface can also be manually switched to maintenance mode in order to inform the load balancing solution that the security server will be undergoing maintenance and should be removed from active use.
When in maintenance mode, the health check interface will only respond with `HTTP 503 Service unavailable` and the message `Health check interface is in maintenance mode`, and no actual health check diagnostics will be run. Maintenance mode is disabled by default and will automatically reset to its default when the proxy service is restarted.
Maintenance mode can be enabled or disabled by sending an HTTP `GET` request from the target security server to its proxy admin port `5566`. The intended new state can be defined using the `targetState` HTTP parameter:
- Enable maintenance mode: `http://localhost:5566/maintenance?targetState=true`
- Disable maintenance mode: `http://localhost:5566/maintenance?targetState=false`
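For example, using `curl` on the security server itself:

```bash
# Enable maintenance mode on this node
curl -i 'http://localhost:5566/maintenance?targetState=true'
```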
The proxy admin port will respond with `200 OK` and a message detailing the actualized maintenance mode state change, e.g. `Maintenance mode set: false => true`. In case the maintenance mode state could not be changed, the returned message will detail the reason.
There is a known, but rare and not naturally occurring, issue where the health check will report an OK condition for a limited time even though sending some messages might not be possible. This happens when an admin user logs out of the keys.
The health check will detect if the tokens (the key containers) have not been signed into after `xroad-signer` startup. It will, however, not detect immediately when the tokens are manually logged out of. The keys are cached by the `xroad-proxy` process for a short while. As long as the authentication key is still cached, the health check will return OK, even though the necessary signing context values for sending a message might no longer be cached. This means messages might fail to be sent even if the health check returns OK. As the authentication key expires from the cache (after a maximum of 5 minutes), the health check will start returning failures. This is a feature of caching and not a bug per se. In addition, logging out of a security server's keys should not occur by accident, so it should not be a surprise that the node cannot send messages after losing access to its keys.
Before testing with an actual load balancer, you can test the health check service with `curl`, for example.
Below is an example response from the Health check service when everything is up and running and messages should go through this node:
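A sketch of the healthy case, assuming the health check was enabled on port 5588 as in the earlier example (exact headers may vary):

```bash
$ curl -i http://localhost:5588
HTTP/1.1 200 OK
Content-Length: 0
```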
And a health check service response on the same node when the `xroad-signer` service is not running:
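A sketch of the failure case; the body text is illustrative only, the actual message depends on which check failed:

```bash
$ curl -i http://localhost:5588
HTTP/1.1 500 Internal Server Error

Fetching health check response timed out for: Authentication key OCSP status
```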
This section describes how to create and set up certificate authentication between the secondary and primary database instances.
Generate the Certificate Authority key and a self-signed certificate for the root-of-trust:
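A sketch using `openssl`; the subject below is only an example:

```bash
openssl req -new -x509 -days 7300 -nodes -sha256 \
    -keyout ca.key -out ca.crt -subj '/O=cluster/CN=CA'
```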
The subject name does not really matter here. Remember to keep the `ca.key` file in a safe place.
Alternatively, an existing internal CA can be used for managing the certificates. A sub-CA should be created as the database cluster root-of-trust and used for issuing the secondary and primary certificates.
Generate keys and certificates signed by the CA for each postgresql instance, including the primary. Do not use the CA certificate and key as the database certificate and key.
Generate a key and the Certificate Signing Request for it:
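A sketch using `openssl`; replace `<nodename>` as explained in the note below:

```bash
openssl req -new -nodes \
    -keyout server.key -out server.csr -subj '/O=cluster/CN=<nodename>'
```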
Note: The `<nodename>` (the subject common name) will be used for identifying the cluster nodes. For secondary nodes, it needs to match the replication user name that is added to the primary database and the username that the secondary node database uses to connect to the primary. For example, in a system with one primary and two secondaries, the names of the nodes could be `primary`, `replica1` and `replica2`. Other parts of the subject name do not matter and can be named as is convenient.
Sign the CSR with the CA, creating a certificate:
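For example:

```bash
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
    -days 7300 -out server.crt
```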
Repeat the above steps for each node.
Copy the certificates and keys to the nodes:
First, prepare a directory for them:
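For example:

```bash
sudo mkdir -p /etc/xroad/postgresql
```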
Then, copy the certificates (`ca.crt`, and the instance's `server.crt` and `server.key`) to `/etc/xroad/postgresql` on each cluster instance.
Finally, set the owner and access rights for the key and certificates:
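A sketch, assuming the PostgreSQL server runs as the `postgres` user:

```bash
sudo chown postgres:postgres /etc/xroad/postgresql/*
sudo chmod 400 /etc/xroad/postgresql/*
```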
#### `serverconf` database

On RHEL, create a new systemctl service unit for the new database. As root, execute the following command:
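A sketch of such a unit, assuming a RHEL-style packaging where `/lib/systemd/system/postgresql.service` exists; the new unit runs a second instance on port 5433 with its own data directory:

```bash
cat <<EOF | sudo tee /etc/systemd/system/postgresql-serverconf.service
.include /lib/systemd/system/postgresql.service
[Service]
Environment=PGPORT=5433
Environment=PGDATA=/var/lib/pgsql/serverconf
EOF
```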
Create the database and configure SELinux:
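Sketches only; the exact commands depend on the distribution and PostgreSQL packaging:

```bash
# RHEL: initialise the data directory for the new instance; the exact initdb
# invocation depends on the PostgreSQL packaging, e.g. via postgresql-setup.
# Then allow PostgreSQL to listen on the non-default port 5433 under SELinux:
sudo semanage port -a -t postgresql_port_t -p tcp 5433

# Ubuntu: create a new PostgreSQL cluster named "serverconf" listening on port 5433
sudo pg_createcluster -p 5433 10 serverconf
```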
In the above command, `10` is the postgresql major version. Use `pg_lsclusters` to find out what version(s) are available.
Edit `postgresql.conf` and set the following options (see the sketch after the note below):
(On RHEL, PostgreSQL config files are located in the `PGDATA` directory, `/var/lib/pgsql/serverconf`. Ubuntu keeps the config in `/etc/postgresql/<version>/<cluster name>`, e.g. `/etc/postgresql/10/serverconf`.)
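A sketch of plausible settings for the primary's `serverconf` instance, assuming streaming replication over SSL with the certificates created earlier; verify the option names against your PostgreSQL version's documentation:

```ini
listen_addresses = '*'            # or the address the secondaries connect to
port = 5433
ssl = on
ssl_ca_file   = '/etc/xroad/postgresql/ca.crt'
ssl_cert_file = '/etc/xroad/postgresql/server.crt'
ssl_key_file  = '/etc/xroad/postgresql/server.key'
wal_level = replica               # "hot_standby" on PostgreSQL versions before 9.6
max_wal_senders = 3
hot_standby = on
```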