https://access.redhat.com/articles/3004101#testing-manual-move-of-saphana-resource-to-another-node
SAP HANA system replication in pacemaker cluster
2. SAP HANA System Replication
The following example shows how to set up system replication between 2 nodes running SAP HANA.
Configuration used in the example:
SID: RH2
Instance Number: 02
node1 FQDN: node1.example.com
node2 FQDN: node2.example.com
node1 HANA site name: DC1
node2 HANA site name: DC2
SAP HANA 'SYSTEM' user password: <HANA_SYSTEM_PASSWORD>
SAP HANA administrative user: rh2adm
Ensure that both systems can resolve the FQDN of both systems without issues. To ensure that FQDNs can be resolved even without DNS, you can place them into `/etc/hosts` as in the example below.
# /etc/hosts
192.168.0.11 node1.example.com node1
192.168.0.12 node2.example.com node2
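The entries above can be checked with a small script before continuing. The following is a minimal sketch; the function name `hosts_file_has_name` is hypothetical and is not part of any SAP or Red Hat tooling.

```shell
# Hypothetical helper (illustration only):
# succeeds if the hosts file in $1 lists the name $2 as a hostname or alias.
hosts_file_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                                    # drop comments
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example: report any cluster node name missing from /etc/hosts.
# for h in node1.example.com node2.example.com; do
#     hosts_file_has_name /etc/hosts "$h" || echo "missing: $h"
# done
```

On a live system, `getent hosts node1.example.com` additionally exercises the resolver order configured on the node, so it also covers names served by DNS.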
For system replication to work, the SAP HANA `log_mode` variable must be set to `normal`. This can be verified as the HANA system user using the command below on both nodes.
[rh2adm]# hdbsql -u system -p <HANA_SYSTEM_PASSWORD> -i 02 "select value from SYS.M_INIFILE_CONTENTS where key='log_mode'"
VALUE
"normal"
1 row selected
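When this check is scripted, the result rows can be tested instead of being read by eye. The sketch below assumes the quoted-value output format shown above; `log_mode_is_normal` is a hypothetical name. It is deliberately strict: it fails if no quoted value is returned or if any returned value differs from "normal".

```shell
# Hypothetical helper: reads hdbsql output on stdin and succeeds only if
# at least one quoted value is present and every quoted value is "normal".
log_mode_is_normal() {
    values=$(grep -o '"[^"]*"' || true)
    [ -n "$values" ] && ! printf '%s\n' "$values" | grep -qvx '"normal"'
}

# Example (run as the SAP HANA administrative user on each node):
# hdbsql -u system -p <HANA_SYSTEM_PASSWORD> -i 02 \
#     "select value from SYS.M_INIFILE_CONTENTS where key='log_mode'" \
#     | log_mode_is_normal && echo "log_mode OK"
```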
Note that the assignment of primary and secondary node described below applies only during setup. The roles (primary/secondary) may change during cluster operation based on cluster configuration.
Many of the configuration steps are performed as the SAP HANA administrative user, whose name was selected during installation. The examples use `rh2adm` because the SID is `RH2`. To become the SAP HANA administrative user you can use the command below.
[root]# sudo -i -u rh2adm
[rh2adm]#
2.1. Configure HANA primary node
SAP HANA system replication will only work after an initial backup has been performed. The following command will create an initial backup in the `/tmp/foo` directory. Please note that the size of the backup depends on the database size and that the backup may take some time to complete. The directory in which the backup will be placed must be writable by the SAP HANA administrative user.
a) On single container systems the following command can be used for the backup:
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "BACKUP DATA USING FILE ('/tmp/foo')"
0 rows affected (overall time xx.xxx sec; server time xx.xxx sec)
b) On multiple container systems (MDC), `SYSTEMDB` and all tenant databases need to be backed up:
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> -d SYSTEMDB "BACKUP DATA USING FILE ('/tmp/foo')"
0 rows affected (overall time xx.xxx sec; server time xx.xxx sec)
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> -d SYSTEMDB "BACKUP DATA FOR RH2 USING FILE ('/tmp/foo-RH2')"
0 rows affected (overall time xx.xxx sec; server time xx.xxx sec)
After the initial backup, initialize the replication using the command below.
[rh2adm]# hdbnsutil -sr_enable --name=DC1
checking for active nameserver ...
nameserver is active, proceeding ...
successfully enabled system as system replication source site
done.
Verify that initialization is showing current node as 'primary' and that SAP HANA is running on it.
[rh2adm]# hdbnsutil -sr_state
checking for active or inactive nameserver ...
System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~
mode: primary
site id: 1
site name: DC1
Host Mappings:
2.2. Configure HANA secondary node
The secondary node needs to be registered to the primary node, which is now running. SAP HANA on the secondary node must be shut down before using the command below.
[rh2adm]# HDB stop
(SAP HANA 2.0 only) Copy the SAP HANA system PKI `SSFS_RH2.KEY` and `SSFS_RH2.DAT` files from the primary node to the secondary node.
[rh2adm]# scp root@node1:/usr/sap/RH2/SYS/global/security/rsecssfs/key/SSFS_RH2.KEY /usr/sap/RH2/SYS/global/security/rsecssfs/key/SSFS_RH2.KEY
[rh2adm]# scp root@node1:/usr/sap/RH2/SYS/global/security/rsecssfs/data/SSFS_RH2.DAT /usr/sap/RH2/SYS/global/security/rsecssfs/data/SSFS_RH2.DAT
To register the secondary node, use the command below.
[rh2adm]# hdbnsutil -sr_register --remoteHost=node1 --remoteInstance=02 --replicationMode=syncmem --name=DC2
adding site ...
checking for inactive nameserver ...
nameserver node2:30201 not responding.
collecting information ...
updating local ini files ...
done.
Start SAP HANA on the secondary node.
[rh2adm]# HDB start
Verify that the secondary node is running and that 'mode' is `syncmem`. Output should look similar to the output below.
[rh2adm]# hdbnsutil -sr_state
checking for active or inactive nameserver ...
System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~
mode: syncmem
site id: 2
site name: DC2
active primary site: 1
Host Mappings:
~~~~~~~~~~~~~~
node2 -> [DC1] node1
node2 -> [DC2] node2
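Both verification steps (mode `primary` on the first node, mode `syncmem` on the second) come down to reading the `mode:` line of the `hdbnsutil -sr_state` output. A minimal sketch of that extraction follows; `sr_mode` is a hypothetical helper name, and the parsing assumes the output layout shown in this guide.

```shell
# Hypothetical helper: print the value of the first "mode:" line found in
# `hdbnsutil -sr_state` output supplied on stdin.
sr_mode() {
    sed -n 's/^[[:space:]]*mode:[[:space:]]*\([^[:space:]]*\).*$/\1/p' | head -n 1
}

# Example (as the SAP HANA administrative user on the secondary node):
# [ "$(hdbnsutil -sr_state | sr_mode)" = "syncmem" ] || echo "unexpected replication mode"
```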
2.3. Testing SAP HANA System Replication
To manually test the SAP HANA System Replication setup you can follow the procedure described in the following SAP documents:
- SAP HANA 1.0: chapter "8. Testing" - How to Perform System Replication for SAP HANA 1.0 guide
- SAP HANA 2.0: chapter "9. Testing" - How to Perform System Replication for SAP HANA 2.0 guide
2.4. Checking SAP HANA System Replication state
To check the current state of SAP HANA System Replication you can execute the following command as the SAP HANA administrative user on the current primary SAP HANA node.
On a single container system:
[rh2adm]# python /usr/sap/RH2/HDB02/exe/python_support/systemReplicationStatus.py
| Host | Port | Service Name | Volume ID | Site ID | Site Name | Secondary | Secondary | Secondary | Secondary | Secondary | Replication | Replication | Replication |
| | | | | | | Host | Port | Site ID | Site Name | Active Status | Mode | Status | Status Details |
| ----- | ----- | ------------ | --------- | ------- | --------- | --------- | --------- | --------- | --------- | ------------- | ----------- | ----------- | -------------- |
| node1 | 30201 | nameserver | 1 | 1 | DC1 | node2 | 30201 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
| node1 | 30207 | xsengine | 2 | 1 | DC1 | node2 | 30207 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
| node1 | 30203 | indexserver | 3 | 1 | DC1 | node2 | 30203 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
status system replication site "2": ACTIVE
overall system replication status: ACTIVE
Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mode: PRIMARY
site id: 1
site name: DC1
On a multiple container (MDC) system:
[rh2adm]# python /usr/sap/RH2/HDB02/exe/python_support/systemReplicationStatus.py
| Database | Host | Port | Service Name | Volume ID | Site ID | Site Name | Secondary | Secondary | Secondary | Secondary | Secondary | Replication | Replication | Replication |
| | | | | | | | Host | Port | Site ID | Site Name | Active Status | Mode | Status | Status Details |
| -------- | ----- | ----- | ------------ | --------- | ------- | --------- | ----------| --------- | --------- | --------- | ------------- | ----------- | ----------- | -------------- |
| SYSTEMDB | node1 | 30201 | nameserver | 1 | 1 | DC1 | node2 | 30201 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
| RH2 | node1 | 30207 | xsengine | 2 | 1 | DC1 | node2 | 30207 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
| RH2 | node1 | 30203 | indexserver | 3 | 1 | DC1 | node2 | 30203 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
status system replication site "2": ACTIVE
overall system replication status: ACTIVE
Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mode: PRIMARY
site id: 1
site name: DC1
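For scripted checks, the summary line `overall system replication status:` is usually the most convenient part of this output to test. The sketch below assumes the output format shown above and works for both the single container and the MDC variant; `sr_overall_active` is a hypothetical name.

```shell
# Hypothetical helper: reads systemReplicationStatus.py output on stdin and
# succeeds only if the overall replication status line reports ACTIVE.
sr_overall_active() {
    status=$(sed -n 's/^overall system replication status:[[:space:]]*//p' \
        | head -n 1 | tr -d '[:space:]')
    [ "$status" = "ACTIVE" ]
}

# Example (as the SAP HANA administrative user on the current primary node):
# python /usr/sap/RH2/HDB02/exe/python_support/systemReplicationStatus.py \
#     | sr_overall_active || echo "replication is not fully active"
```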
3. Configuring monitoring account in SAP HANA for cluster resource agents (SAP HANA 1.0 SPS12 and earlier)
Starting with SAP HANA 2.0 SPS0, the monitoring account is not needed.
A technical user with `CATALOG READ` and `MONITOR ADMIN` privileges must exist in SAP HANA for the resource agents to be able to run queries on the system replication status. The example below shows how to create such a user, assign it the correct privileges and disable password expiration for it.
monitoring user username: rhelhasync
monitoring user password: <MONITORING_USER_PASSWORD>
3.1. Creating monitoring user
When SAP HANA System Replication is active, only the primary system is able to access the database. Accessing the secondary system will fail.
On the primary system run the following commands to create the monitoring user.
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "create user rhelhasync password \"<MONITORING_USER_PASSWORD>\""
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "grant CATALOG READ to rhelhasync"
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "grant MONITOR ADMIN to rhelhasync"
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "ALTER USER rhelhasync DISABLE PASSWORD LIFETIME"
3.2. Store monitoring user credentials on all nodes
The SAP HANA userkey allows the "root" user on OS level to access SAP HANA via monitoring user without asking for password. This is needed by resource agents so they can run queries on HANA System Replication status.
[root]# /usr/sap/RH2/HDB02/exe/hdbuserstore SET SAPHANARH2SR localhost:30215 rhelhasync "<MONITORING_USER_PASSWORD>"
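The port in the command above follows from the instance number: instance `02` gives `30215`. As a sketch, assuming the `3<instance>15` pattern of this example applies to your system (multitenant installations can expose different SQL ports per database, so verify the port before relying on it), the value can be derived as follows. The function name `hana_sql_port` is hypothetical.

```shell
# Hypothetical helper: build the SQL port for a two-digit instance number,
# assuming the 3<instance>15 pattern seen in this guide (02 -> 30215).
hana_sql_port() {
    case "$1" in
        [0-9][0-9]) printf '3%s15\n' "$1" ;;
        *) echo "instance number must be exactly two digits" >&2; return 1 ;;
    esac
}

# Example:
# /usr/sap/RH2/HDB02/exe/hdbuserstore SET SAPHANARH2SR "localhost:$(hana_sql_port 02)" rhelhasync "<MONITORING_USER_PASSWORD>"
```

The instance number is handled as a string rather than a number so that values with a leading zero, such as `08`, are not misread as octal.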
To verify that the userkey has been created correctly in root's userstore, you can run the `hdbuserstore list` command on each node and check that the monitoring account is present in the output as shown below:
[root]# /usr/sap/RH2/HDB02/exe/hdbuserstore list
DATA FILE : /root/.hdb/node1/SSFS_HDB.DAT
KEY FILE : /root/.hdb/node1/SSFS_HDB.KEY
KEY SAPHANARH2SR
ENV : localhost:30215
USER: rhelhasync
Please also verify that it is possible to run hdbsql commands as root using the SAPHANARH2SR userkey.
[root]# /usr/sap/RH2/HDB02/exe/hdbsql -U SAPHANARH2SR -i 02 "select distinct REPLICATION_STATUS from SYS.M_SERVICE_REPLICATION"
REPLICATION_STATUS
"ACTIVE"
1 row selected
If you get an error message about issues with the password, or if you are prompted for a password, please verify with the `hdbsql` command or HANA Studio that the password for the user created with the `hdbsql` commands above is not configured 'to be changed on first login' and that the password has not expired. You can use the command below.
(Note: be sure to use the name of monitoring user in capital letters)
[root]# /usr/sap/RH2/HDB02/exe/hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "select * from sys.users where USER_NAME='RHELHASYNC'"
USER_NAME,USER_ID,USER_MODE,EXTERNAL_IDENTITY,CREATOR,CREATE_TIME,VALID_FROM,VALID_UNTIL,LAST_SUCCESSFUL_CONNECT,LAST_INVALID_CONNECT_ATTEMPT,INVALID_CONNECT_ATTEMPTS,ADMIN_GIVEN_PASSWORD,LAST_PASSWORD_CHANGE_TIME,PASSWORD_CHANGE_NEEDED,IS_PASSWORD_LIFETIME_CHECK_ENABLED,USER_DEACTIVATED,DEACTIVATION_TIME,IS_PASSWORD_ENABLED,IS_KERBEROS_ENABLED,IS_SAML_ENABLED,IS_X509_ENABLED,IS_SAP_LOGON_TICKET_ENABLED,IS_SAP_ASSERTION_TICKET_ENABLED,IS_RESTRICTED,IS_CLIENT_CONNECT_ENABLED,HAS_REMOTE_USERS,PASSWORD_CHANGE_TIME
"RHELHASYNC",156529,"LOCAL",?,"SYSTEM","2017-05-12 15:10:49.971000000","2017-05-12 15:10:49.971000000",?,"2017-05-12 15:21:12.117000000",?,0,"TRUE","2017-05-12 15:10:49.971000000","FALSE","FALSE","FALSE",?,"TRUE","FALSE","FALSE","FALSE","FALSE","FALSE","FALSE","TRUE","FALSE",?
1 row selected
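The full row is hard to read, so it can help to select only the columns relevant to the password checks. The sketch below builds such a statement and upper-cases the user name, as the note above requires. The column names are taken from the output header shown above; the function name `user_check_sql` is hypothetical.

```shell
# Hypothetical helper: print a narrowed-down query for the password-related
# columns of sys.users, with the user name converted to capital letters.
user_check_sql() {
    name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf "select PASSWORD_CHANGE_NEEDED, IS_PASSWORD_LIFETIME_CHECK_ENABLED from sys.users where USER_NAME='%s'\n" "$name"
}

# Example:
# /usr/sap/RH2/HDB02/exe/hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "$(user_check_sql rhelhasync)"
```

For the monitoring user created in this guide, both columns would be expected to show "FALSE", matching the full row above.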
4. Configuring SAP HANA in a pacemaker cluster
Please refer to Reference Document for the High Availability Add-On for Red Hat Enterprise Linux 7 documentation to first set up a pacemaker cluster. Note that the cluster must conform to article Support Policies for RHEL High Availability Clusters - General Requirements for Fencing/STONITH.
This guide will assume that following things are working properly:
- Pacemaker cluster is configured according to documentation and has proper and working fencing
- SAP HANA startup on boot is disabled on all cluster nodes as the start and stop will be managed by the cluster
- SAP HANA system replication and takeover using tools from SAP are working properly between cluster nodes
- SAP HANA contains monitoring account that can be used by the cluster from both cluster nodes
- Both nodes are subscribed to 'High-availability' and 'RHEL for SAP HANA' (RHEL 6,RHEL 7) channels
4.1. Configure general cluster properties
When testing SAP HANA you may wish to limit the number of failovers by setting up stickiness and migration threshold using commands below. These settings are optional and so are not required for proper setup of SAP HANA in pacemaker. Commands should be executed only on one node but they will take effect in the whole cluster.
[root]# pcs resource defaults resource-stickiness=1000
[root]# pcs resource defaults migration-threshold=5000
To remove above options after testing you can use commands below.
[root]# pcs resource defaults resource-stickiness=
[root]# pcs resource defaults migration-threshold=
In previous versions of this guide you might find the recommendation to set `no-quorum-policy` to `ignore`, which is currently NOT supported. In the default configuration there is no need to change the `no-quorum-policy` property of the cluster. If you would like to achieve the behaviour provided by this option, see the article "Can I configure pacemaker to continue to manage resources after a loss of quorum in RHEL 6 or 7?".
4.2. Create cloned SAPHanaTopology resource
The `SAPHanaTopology` resource gathers the status and configuration of SAP HANA System Replication on each node. `SAPHanaTopology` requires the following attributes to be configured.
Attribute Name | Description |
---|---|
SID | SAP System Identifier (SID) of SAP HANA installation. Must be same for all nodes. |
InstanceNumber | 2-digit SAP Instance identifier. |
Below is an example command to create the `SAPHanaTopology` cloned resource.
[root]# pcs resource create SAPHanaTopology_RH2_02 SAPHanaTopology SID=RH2 InstanceNumber=02 --clone clone-max=2 clone-node-max=1 interleave=true
Resulting resource should look like the following.
[root]# pcs resource show SAPHanaTopology_RH2_02-clone
Clone: SAPHanaTopology_RH2_02-clone
Meta Attrs: clone-max=2 clone-node-max=1 interleave=true
Resource: SAPHanaTopology_RH2_02 (class=ocf provider=heartbeat type=SAPHanaTopology)
Attributes: SID=RH2 InstanceNumber=02
Operations: start interval=0s timeout=180 (SAPHanaTopology_RH2_02-start-interval-0s)
stop interval=0s timeout=60 (SAPHanaTopology_RH2_02-stop-interval-0s)
monitor interval=60 timeout=60 (SAPHanaTopology_RH2_02-monitor-interval-60)
Once the resource is started you will see the collected information stored in the form of node attributes that can be viewed with the command `crm_mon -A1`. Below is an example of what the attributes can look like when only `SAPHanaTopology` is started.
[root]# crm_mon -A1
...
Node Attributes:
* Node node1:
+ hana_rh2_remoteHost : node2
+ hana_rh2_roles : 1:P:master1::worker:
+ hana_rh2_site : DC1
+ hana_rh2_srmode : syncmem
+ hana_rh2_vhost : node1
* Node node2:
+ hana_rh2_remoteHost : node1
+ hana_rh2_roles : 1:S:master1::worker:
+ hana_rh2_site : DC2
+ hana_rh2_srmode : syncmem
+ hana_rh2_vhost : node2
...
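In the output above, the second colon-separated field of `hana_rh2_roles` distinguishes the two sites: `P` on the primary node and `S` on the secondary node. A minimal sketch for decoding that field follows; the function name `sr_role_from_roles` is hypothetical, and the meaning of the remaining fields is not interpreted here.

```shell
# Hypothetical helper: map the second field of a hana_<sid>_roles value
# to a readable system replication role.
sr_role_from_roles() {
    case "$(printf '%s\n' "$1" | cut -d: -f2)" in
        P) echo primary ;;
        S) echo secondary ;;
        *) echo unknown ;;
    esac
}

# Example:
# sr_role_from_roles "1:P:master1::worker:"
```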
4.3. Create Master/Slave SAPHana resource
The `SAPHana` resource is responsible for starting, stopping and relocating the SAP HANA database. This resource must be run as a Master/Slave cluster resource. The resource has the following attributes.
Attribute Name | Required? | Default value | Description |
---|---|---|---|
SID | yes | none | SAP System Identifier (SID) of SAP HANA installation. Must be same for all nodes. |
InstanceNumber | yes | none | 2-digit SAP Instance identifier. |
PREFER_SITE_TAKEOVER | no | yes | Should cluster prefer to switchover to slave instance instead of restarting master locally? ("no": Do prefer restart locally; "yes": Do prefer takeover to remote site) |
AUTOMATED_REGISTER | no | false | Should the former SAP HANA primary be registered as secondary after takeover and DUPLICATE_PRIMARY_TIMEOUT? ("false": no, manual intervention will be needed; "true": yes, the former primary will be registered by resource agent as secondary) |
DUPLICATE_PRIMARY_TIMEOUT | no | 7200 | Time difference (in seconds) needed between primary time stamps, if a dual-primary situation occurs. If the time difference is less than the time gap, then the cluster holds one or both instances in a "WAITING" status. This is to give an admin a chance to react on a failover. A failed former primary will be registered after the time difference is passed. After this registration to the new primary all data will be overwritten by the system replication. |
Below is an example command to create the `SAPHana` Master/Slave resource.
[root]# pcs resource create SAPHana_RH2_02 SAPHana SID=RH2 InstanceNumber=02 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false --master meta notify=true clone-max=2 clone-node-max=1 interleave=true
When running `pcs-0.9.158-6.el7` or newer, use the command below to avoid a deprecation warning. More information about the change is given in the article "What are differences between `master` and `--master` option in `pcs resource create` command?".
[root]# pcs resource create SAPHana_RH2_02 SAPHana SID=RH2 InstanceNumber=02 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false master notify=true clone-max=2 clone-node-max=1 interleave=true
Resulting resource should look like the following.
[root]# pcs resource show SAPHana_RH2_02-master
Master: SAPHana_RH2_02-master
Meta Attrs: notify=true clone-max=2 clone-node-max=1 interleave=true
Resource: SAPHana_RH2_02 (class=ocf provider=heartbeat type=SAPHana)
Attributes: SID=RH2 InstanceNumber=02 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false
Operations: start interval=0s timeout=180 (SAPHana_RH2_02-start-interval-0s)
stop interval=0s timeout=240 (SAPHana_RH2_02-stop-interval-0s)
monitor interval=120 timeout=60 (SAPHana_RH2_02-monitor-interval-120)
monitor interval=121 role=Slave timeout=60 (SAPHana_RH2_02-monitor-interval-121)
monitor interval=119 role=Master timeout=60 (SAPHana_RH2_02-monitor-interval-119)
promote interval=0s timeout=320 (SAPHana_RH2_02-promote-interval-0s)
demote interval=0s timeout=320 (SAPHana_RH2_02-demote-interval-0s)
Once the resource is started it will add additional node attributes describing the current state of SAP HANA databases on nodes as seen below.
[root]# crm_mon -A1
...
Node Attributes:
* Node node1:
+ hana_rh2_clone_state : PROMOTED
+ hana_rh2_op_mode : delta_datashipping
+ hana_rh2_remoteHost : node2
+ hana_rh2_roles : 4:P:master1:master:worker:master
+ hana_rh2_site : DC1
+ hana_rh2_sync_state : PRIM
+ hana_rh2_srmode : syncmem
+ hana_rh2_vhost : node1
+ lpa_rh2_lpt : 1495204085
+ master-hana : 150
* Node node2:
+ hana_rh2_clone_state : DEMOTED
+ hana_rh2_remoteHost : node1
+ hana_rh2_roles : 4:S:master1:master:worker:master
+ hana_rh2_site : DC2
+ hana_rh2_srmode : syncmem
+ hana_rh2_sync_state : SOK
+ hana_rh2_vhost : node2
+ lpa_rh2_lpt : 30
+ master-hana : 100
...
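To find out from a script which node currently runs the promoted instance, the `clone_state` attribute can be read from this output. The following sketch assumes the `crm_mon -A1` layout shown above (a `* Node <name>:` header followed by `+ attribute : value` lines); `promoted_nodes` is a hypothetical name.

```shell
# Hypothetical helper: reads `crm_mon -A1` output on stdin and prints the
# name of every node whose hana_<sid>_clone_state attribute is PROMOTED.
promoted_nodes() {
    awk '
        /^[[:space:]]*\* Node / { node = $3; sub(/:$/, "", node) }   # remember current node
        /hana_[a-z0-9]+_clone_state/ && $NF == "PROMOTED" { print node }
    '
}

# Example:
# crm_mon -A1 | promoted_nodes
```

In a healthy two-node setup this would be expected to print exactly one node name.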
4.4. Create Virtual IP address resource
The cluster will contain a virtual IP address in order to reach the Master instance of SAP HANA. Below is an example command to create an `IPaddr2` resource with IP `192.168.0.15`.
[root]# pcs resource create vip_RH2_02 IPaddr2 ip="192.168.0.15"
The resulting resource should look like the one below.
[root]# pcs resource show vip_RH2_02
Resource: vip_RH2_02 (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=192.168.0.15
Operations: start interval=0s timeout=20s (vip_RH2_02-start-interval-0s)
stop interval=0s timeout=20s (vip_RH2_02-stop-interval-0s)
monitor interval=10s timeout=20s (vip_RH2_02-monitor-interval-10s)
4.5. Create constraints
For correct operation we need to ensure that the `SAPHanaTopology` resources are started before the `SAPHana` resources, and also that the virtual IP address is present on the node where the Master resource of `SAPHana` is running. To achieve this, the following 2 constraints need to be created.
4.5.1. constraint - start `SAPHanaTopology` before `SAPHana`
The example command below will create the constraint that mandates the start order of these resources. There are 2 things worth mentioning here:
- The `symmetrical=false` attribute defines that we care only about the start of resources and that they don't need to be stopped in reverse order.
- Both resources (`SAPHana` and `SAPHanaTopology`) have the attribute `interleave=true`, which allows parallel start of these resources on nodes. This means that, despite the ordering, we do not wait for all nodes to start `SAPHanaTopology`; the `SAPHana` resource can start on any node as soon as `SAPHanaTopology` is running there.
Command for creating the constraint:
[root]# pcs constraint order SAPHanaTopology_RH2_02-clone then SAPHana_RH2_02-master symmetrical=false
The resulting constraint should look like the one in the example below.
[root]# pcs constraint
...
Ordering Constraints:
start SAPHanaTopology_RH2_02-clone then start SAPHana_RH2_02-master (kind:Mandatory) (non-symmetrical)
...
4.5.2. constraint - colocate the `IPaddr2` resource with Master of `SAPHana` resource
Below is an example command that will colocate the `IPaddr2` resource with the `SAPHana` resource that was promoted as Master.
[root]# pcs constraint colocation add vip_RH2_02 with master SAPHana_RH2_02-master 2000
Note that the constraint is using a score of 2000 instead of the default INFINITY. This allows the `IPaddr2` resource to stay active in case there is no Master promoted in the `SAPHana` resource, so it is still possible to use this address with tools like SAP Management Console or SAP LVM that can use this address to query the status information about the SAP Instance.
The resulting constraint should look like one in the example below.
[root]# pcs constraint
...
Colocation Constraints:
vip_RH2_02 with SAPHana_RH2_02-master (score:2000) (rsc-role:Started) (with-rsc-role:Master)
...
4.6. Testing the manual move of SAPHana resource to another node (SAP Hana takeover by cluster)
To test the move of the `SAPHana` resource from one node to another, use the command below. Note that the option `--master` should NOT be used when running the command below because of the way the `SAPHana` resource works internally.
[root]# pcs resource move SAPHana_RH2_02-master
IMPORTANT: After each `pcs resource move` command invocation the cluster creates location constraints to achieve the move of the resource. These constraints must be removed in order to allow automatic failover in the future. To remove them you can use the command `pcs resource clear SAPHana_RH2_02-master`.
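Because the clean-up step is easy to forget, the move and the clear can be wrapped together. The following is only a sketch: the function name `move_then_clear`, the `PCS` override variable and the fixed settle time are assumptions made for illustration. A fixed pause is crude; checking `crm_mon -A1` until the roles have switched is a more reliable way to decide when to clear the constraint.

```shell
# PCS can be overridden (for example PCS=echo) to preview the commands
# without touching the cluster. This variable is an assumption of the sketch.
PCS="${PCS:-pcs}"

# Hypothetical helper: move the resource, wait, then remove the location
# constraint that the move created.
move_then_clear() {
    rsc="$1"
    settle="$2"   # seconds to wait for the takeover; choose a value that suits your system
    if [ -z "$rsc" ] || [ -z "$settle" ]; then
        echo "usage: move_then_clear <resource> <seconds>" >&2
        return 2
    fi
    "$PCS" resource move "$rsc" || return 1
    sleep "$settle"
    "$PCS" resource clear "$rsc"
}

# Example (as root on one cluster node):
# move_then_clear SAPHana_RH2_02-master 300
```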