Change data capture gains huge popularity as a data integration mechanism as a non-intrusive way of obtaining technical events from database. This 2 part series shows how to set up such environment.
HVR is a tool that replicates transactions between databases that HVR calls ‘locations’. Each change it captures is applied to the target locations. It can also replicate between directories (file locations) or replicate between databases and directories.
The HUB DATABASE is a small database from which HVR controls replication of the other databases. It is created especially for the HVR installation. It contains HVR catalog tables that hold all specifications of replication such as the names of the replicated databases, the replication direction and the list of tables to be replicated.
A LOCATION is a database that HVR will replicate to or from. A location can also be a directory (a ‘file location’) from which HVR replicates files or a Salesforce endpoint.
A CHANNEL is an object in HVR that groups together the locations that should be connected together and the tables that should be replicated. It also contains actions that control how replication should be done. Channels are defined in the hub database. As well as replicating changes, channels can also be used to refresh data. Refresh means all data is read from a source and loaded into another database, without replication.
WHERE TO INSTALL
The HVR distribution must be installed on the hub machine. If a database involved in replication is not located on the hub machine HVR can connect to it in two ways; either using the network database’s own protocol (e.g. SQL*Net or Ingres/Net) or using an HVR remote connection (recommended).
For an HVR remote location, the full HVR distribution should also be installed on the remote machine, although a few of its files are used. If the DBMS’s own protocol is used then no HVR files need to be installed on the remote machine.
If the HVR replication configuration is altered, all the reconfiguration steps are performed on the hub machine; installation files on the remote machine are unchanged.
It is recommended that HVR is installed under its own login account (e.g. hvr) on the hub machine and each machine containing the replicated database or directory. However, HVR can also use the account that owns the replicated tables (e.g. the DBA’s account) or it can use the DBMS owner account (e.g. oracle or ingres). Any login shell is sufficient.
The HVR distribution files are installed under environment variable $HVR_HOME. HVR also creates files underneath $HVR_CONFIG. Both these environment variables must be configured. An extra environment variable $HVR_TMP is also recognized, but if it is not configured it will default to $HVR_CONFIG/tmp.
INSTALLING HVR ON LINUX
Create user which HVR will run as.Example. User hvr identified with password ‘hvruser’. Run this commands:
useradd -p hvruser hvr
The following steps should be performed as the user which HVR will run as, not as user root. So first create user hvr identified with password ‘hvruser’ and then:
Choose directories for read only and read write parts of HVR. In our case:
$ HVR_HOME=/home/hvr/hvr_home $ HVR_CONFIG=/home/hvr/hvr_config $ HVR_TMP=/tmp
Place these variables to .bashrc file:
export HVR_HOME=/home/hvr/hvr_home export HVR_CONFIG=/home/hvr/hvr_config export HVR_TMP=/tmp
- Modify the $PATH and also put this into .bashrc file
- Read and uncompress the distribution file:
umask 0 mkdir $HVR_HOME $HVR_CONFIG cd $HVR_HOME gzip -dc /tmp/hvr-5.5.0-linux_glibc2.5-x64-64bit.tar.gz | tar xf -
- If this is a hub machine then install the HVR license file (hvr.lic) into the $HVR_HOME/lib folder.
cp /tmp/hvr.lic $HVR_HOME/lib
- Now define environment variable $HVR_HUB_CLASS as a default DBMS type for the hub database. For us is oracle. Place this in .bashrc file:
- HVR listen port must be configured.
- Port 4343 is recommended
- Login as root and configure systemd to invoke HVR.
- We need to create 2 files (hvr.socket and hvr@.service) in /etc/systemd/system folder.
- hvr.socket should contain the following:
[Unit] Description=HVR service socket
[Socket] ListenStream=4343 Accept=true TriggerLimitIntervalSec=1s # may not be supported on early systemd versions TriggerLimitBurst=10000 # may not be supported on early systemd versions MaxConnectionsPerSource=100 # may not be supported on early systemd versions MaxConnections=500 KeepAlive=true [Install] WantedBy=sockets.target
- hvr@.service should contain the following:
[Unit] Description=HVR service [Service] Environment="HVR_HOME=/home/hvr/hvr_home" Environment="HVR_CONFIG=/home/hvr/hvr_config" Environment="HVR_TMP=/tmp" User=root ExecStart=/home/hvr/hvr_home/bin/hvr -r StandardInput=socket KillMode=process
- To enable and start the service execute the following commands:
systemctl enable hvr.socket systemctl start hvr.socket
- To verify whether the service is active:
systemctl status hvr.socket hvr.socket - HVR service socket Loaded: loaded (/etc/systemd/system/hvr.socket; enabled; vendor preset: enabled) Active: active (listening) since Mon 2018-10-08 17:54:44 CEST; 5s ago Listen: [::]:4343 (Stream) Accepted: 0; Connected: 0
Preparing Oracle database for use with HVR
This section describes the requirements, access privileges, and other features of HVR when using Oracle for replication. It also helps you to get started with HVR for replicating data between Oracle databases.
CREATE HUB DATABASE
HVR allows you to create hub database (schema) in Oracle. The hub database is a small database which HVR uses to control its replication activities. This database stores HVR catalog tables that hold all specifications of replication such as the names of the replicated databases, the list of replicated tables, and the replication direction. Create the hub database (hvrhub) with password (hvr):
create user hvrhub identified by hvr default tablespace users temporary tablespace temp quota unlimited on users;
GRANTS FOR HUB SCHEMA
The hub database (schema) can be located inside an Oracle instance on the hub machine, or it can be located on another machine and connected using a TNS connection. The following grants are required for hub database user (hvrhub):
grant create session to hvrhub; grant create table to hvrhub; grant create trigger to hvrhub; grant create procedure to hvrhub; grant execute on dbms_alert to hvrhub;
HVR allows you to capture changes from Oracle database. By default, HVR performs log-based capture.
Enable Log-Based Capture
To enable log-base capture, configure the following:
The user name that HVR uses must be added to Oracle’s group.
On Linux, this can be done by adding the user name used by HVR (hvr) to the line in /etc/group that begins with dba.
The Oracle instance must have archiving enabled, otherwise (a) HVR will lose changes if it falls behind or it is suspended for a time, and (b) HVR will not capture changes made with Oracle insert statements with ‘append hints’.
Archiving can be enabled by running the following statement as sysdba against a mounted but unopened database:
alter database archivelog; select log_mode from v$database #checking the current state of archiving
Grants for Log-Based Capture
HVR does log-based capture if action Capture is defined.
- The databaseUser (user that is used to replicate tables – Ex. hr schema) that HVR uses must be granted create session privilege.
grant create session to hvruser;
- To improve the performance of HVR Initialize for channels with large number of tables (more than 150), HVR creates a temporary table (hvr_sys_table) within a schema. For HVR to automatically create this temporary table.
The User should be granted create table privilege.
grant create table to hvruser;
- To replicate tables which are owned by other schemas (using action TableProperties/Schema) the User must be also granted select any table privilege:
grant select any table to hvruser;
- The User must be granted select privilege for the following data dictionary objects:
grant select on V_$ARCHIVE_DEST to hvruser; grant select on V_$ARCHIVED_LOG to hvruser; grant select on V_$DATABASE_INCARNATION to hvruser; grant select on V_$LOG to hvruser; grant select on V_$LOGFILE to hvruser; grant select on V_$PARAMETER to hvruser; grant select on V_$NLS_PARAMETERS to hvruser; grant select on SYS.ALL_CONS_COLUMNS to hvruser; grant select on SYS.ALL_CONSTRAINTS to hvruser; grant select on SYS.ALL_IND_COLUMNS to hvruser; grant select on SYS.ALL_INDEXES to hvruser; grant select on SYS.ALL_LOBS to hvruser; grant select on SYS.ALL_LOG_GROUPS to hvruser; grant select on SYS.ALL_OBJECTS to hvruser; grant select on SYS.ALL_TAB_COLS to hvruser; grant select on SYS.ALL_TABLES to hvruser; grant select on SYS.ALL_TRIGGERS to hvruser; grant select on SYS.ALL_ENCRYPTED_COLUMNS to hvruser; grant select on SYS.DBA_TABLESPACES to hvruser; grant select on SYS.OBJ$ to hvruser; grant select on sys.user$ to hvruser; grant select on sys.enc$ to hvruser; grant execute on dbms_logmnr to hvruser; grant select any transaction to hvruser; grant execute_catalog_role to hvruser; grant logmining to hvruser; grant select any dictionary to hvruser; grant alter table to hvruser;
- To use action DbSequence the User must be granted following privileges:
grant select any sequence to hvruser; grant select on sys.seq$ to hvruser;
Background: HVR also needs Oracle’s “supplemental logging” feature enabled on replicate tables that it must replicate. Otherwise when an update is done Oracle will only log the columns which are changed. But HVR also needs other data (e.g. the key columns) so that it can generate a full update statement on the target database. Oracle supplemental logging can be set at database level and on specific tables.
The very first time that HVR Initialize runs it will check if the database allows any supplemental logging at all. If it is not thenHVR Initializewill attempt statement alter database add supplemental log data.
- To execute alter database add supplemental log data the User must have sysdba privilege
- Also we need :
grant alter any table to hvruser;
Log Read Method – Direct (Default)
By default, HVR captures changes using using the DIRECT log read method (Capture /LogReadMethod=DIRECT). In this method, HVR reads transaction log records directly from the DBMS log file using file I/O. This method is very fast in capturing changes from Oracle database. The DIRECT log read method requires that HVR agent is installed on the source database machine. Also we need to grant another privilege:
grant select any transaction to hvruser;
Native Access to Redo Files
HVR’s capture job needs permission to read Oracle’s redo and archive files at the Operating System level. On Linux the default group of user oracle can be seen in the 4th field of its line in /etc/passwd. The HVR user needs to be made a member of that group by adding its name to file /etc/group (e.g. line oinstall:x:101:oracle,hvr). Note that adding HVR’s user to group dba will also give HVR sysdba privilege.
HVR supports capturing from Oracle’s pluggable database (PDB). To enable capturing from PDB, the connection should point to the database service of the PDB. You need to have the remotelistener running on the database server (either on primary or standby server if using dataguard). The connection method is always TNS to the PDB.
Location connection details required for pluggable databases:
TNSis the connection to the database server, port, and database service (for example, dns:1521/HVR1210).
User is the username to connect HVR to container database (where the PDB is located). The user should be an Oracle common user. The privileges required for User is same as the log-based capture grants.
Password is the password of the User to connect HVR to container database.
Please visit the second part of the blog series: