
Friday, December 15, 2023

High Availability Architecture for Goldengate Services - Part 2

 

About High Availability Architecture: High Availability (HA) architecture for Oracle GoldenGate services is designed to ensure that the data replication and integration processes continue to function without interruption, even in the event of system failures or disruptions. This is achieved through a combination of redundant components, failover mechanisms, and data synchronization strategies.

AGCTL

Agent Control (AGCTL) is a command line tool used to manage bundled agents within Oracle Grid Infrastructure for keeping applications available. It helps integrate an application as a resource under agent control within this infrastructure.

After setup, AGCTL is used to manage resources like starting or stopping the resource, relocating it, checking its status, and changing agent configurations.

Requirements for Running AGCTL

AGCTL, a tool in Oracle Grid Infrastructure, can be used by the owner of the infrastructure or a designated application administrator. This administrator must be part of the same primary OS group as the owner.

Administrators can run any AGCTL command and have two options for creating application virtual IPs. They can use the 'appvipcfg' utility in the Oracle Grid Infrastructure bin directory, or the AGCTL utility. Both methods require root access.

If an application VIP is pre-created by an administrator with root privileges, the application administrator only needs to provide the '--vip_name' when using AGCTL, without needing root privileges.

Oracle Grid Infrastructure bundled agents and the AGCTL management interface can support multiple administrators for different applications. Security, execution privileges, and ownership for an application are defined using UNIX-like ACL definitions through the AGCTL management interface.

AGCTL Syntax and Use    

Usage: agctl <verb> goldengate [<options>]    

verbs: add|check|config|disable|enable|modify|relocate|remove|start|status|stop    

For detailed help on each verb and its options use:    

agctl <verb> --help

AGCTL is the designated command for managing application resources, specifically those of type XAG. It's important to note that the Oracle Grid Infrastructure utility CRSCTL should not be used to manage these resources. However, CRSCTL can be used to check the status of resources, which is fully supported.
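The split between the two tools can be sketched as a pair of small wrappers: read-only status checks go through CRSCTL, and anything that changes state goes through AGCTL. This is a minimal sketch, not an official script. The instance name gg_1 is hypothetical, and the resource name xag.<instance_name>.goldengate is an assumption about the usual XAG naming that should be confirmed with `crsctl status resource -t`. Commands are only printed unless DRY_RUN=0 is set.

```shell
# Sketch: status checks may go through crsctl, state changes go through agctl.
# Assumptions: the instance name "gg_1" is hypothetical, and the resource name
# xag.<instance_name>.goldengate follows the usual XAG naming; confirm it on
# your cluster with `crsctl status resource -t`.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = execute it

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Read-only check: using crsctl here is supported.
gg_crs_status() {
  run crsctl status resource "xag.$1.goldengate"
}

# State change: always use agctl, never crsctl start/stop.
gg_stop() {
  run agctl stop goldengate "$1"
}

gg_crs_status gg_1
gg_stop gg_1
```

Keeping the dry-run default makes it easy to review the exact commands before they touch a live cluster.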

Version Support Matrix

The following combination of Clusterware/GoldenGate releases is supported as per the Oracle doc.

Grid Infrastructure (GI): 11.2.0.4+/12.1+/12.2+/18+/19+

GoldenGate: 11.1+/11.2+/12.1+/12.2+/12.3+/18+/19+


XAG Agent Functions:

  • Handle GoldenGate application's failover.

  • Start the GoldenGate instance and related dependencies.

  • Monitor GoldenGate instance extract, replicat & manager processes.

  • Stop the GoldenGate instance and related dependencies.

  • Move the GoldenGate instance and related dependencies.

  • Clean up the GoldenGate instance and related dependencies after a failure that can't be recovered.

GoldenGate Instance States:

ONLINE – The GoldenGate instance is online

OFFLINE – The GoldenGate instance is offline

INTERMEDIATE – The GoldenGate manager is online, but some or all extract and replicat processes are offline or timed out while starting

UNKNOWN – Oracle Clusterware is unable to manage the resource, and manual Oracle Clusterware intervention is required to stop it and fix the root cause. Once corrected, agctl start/stop commands should be used.

The GoldenGate instance resource transitions to ONLINE or OFFLINE based on the operations performed and the state of the manager. It transitions to INTERMEDIATE if any of the specified extract or replicat processes is detected not to be running.
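The state rules above can be written down as a small decision function. This is only an illustration of the rules as described here, not the agent's actual implementation. The helper name gg_resource_state and its two arguments are hypothetical, and the UNKNOWN state is not modelled because it depends on Clusterware being unable to manage the resource.

```shell
# Sketch of the state rules described above (not the XAG agent's own code).
# Arguments of this hypothetical helper:
#   $1 = 1 if the GoldenGate manager is running, 0 otherwise
#   $2 = number of monitored extract/replicat processes that are not running
gg_resource_state() {
  manager_up="$1"
  monitored_down="$2"
  if [ "$manager_up" != "1" ]; then
    echo "OFFLINE"          # manager is down, so the instance is offline
  elif [ "$monitored_down" -gt 0 ]; then
    echo "INTERMEDIATE"     # manager is up, but a monitored process is not
  else
    echo "ONLINE"           # manager and all monitored processes are up
  fi
}

gg_resource_state 1 0   # manager up, all monitored processes running
gg_resource_state 1 2   # manager up, two monitored processes down
gg_resource_state 0 0   # manager down
```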

AGCTL Command Options

Complete AGCTL usage for the goldengate resource is displayed by agctl -h. The following are common AGCTL operations for the GoldenGate application.

AGCTL command to register and configure a GoldenGate resource for a GoldenGate instance:


agctl [add | modify] goldengate <instance_name>
      --gg_home <GoldenGate installation directory>
      --serverpool <serverpool name> | --nodes <node1,node2,...>
      --instance_type <source|target|dual>
      --oracle_home <path>
      --db_services <associated database services>
      --databases <associated database resources>
      --environment_vars <name1=value1,name2=value2,...>
      --monitor_extracts <ext1,ext2,ext3,...>
      --monitor_replicats <rep1,rep2,rep3,...>
      --network <network_number>
      --ip <new VIP address>
      --vip_name <VIP resource name>
      --filesystems <acfs1,acfs2,...>
      --attribute <name1=value1,name2=value2,...>


Where the options for AGCTL add and modify commands for GoldenGate are:


instance_name: The name of the GoldenGate instance. Required.

gg_home: The GoldenGate installation directory.

serverpool: The name of the server pool in which this GoldenGate instance should be started.

nodes: A list of nodes where the GoldenGate instance can run.

instance_type: Indicates whether this is a source, target, or dual instance. For bi-directional replication, both source and target definitions are required. Required.

oracle_home: The ORACLE_HOME location. Required.

databases: The Oracle Database resources, if the GoldenGate instance has to be colocated with a database instance and the extract/replicat processes use an ORACLE_SID based connection.

db_services: The database services (TNS), if the extract/replicat processes use service-based connection strings.

monitor_extracts: An optional list of extracts to monitor. If any of these extracts is not running, the GoldenGate instance resource state transitions to INTERMEDIATE.

monitor_replicats: An optional list of replicats to monitor. If any of these replicats is not running, the GoldenGate instance resource state transitions to INTERMEDIATE.

environment_vars: An optional list of environment variables to be passed when the GoldenGate instance is started, stopped, or monitored. This is useful for setting up the environment when GoldenGate operates on non-Oracle datastores.

network: The network number, required if a new VIP resource is to be created.

ip: The VIP address if a new VIP resource is to be created. Required if not using a pre-created VIP.

user: The name of the GoldenGate OS user. Required if not using a pre-created VIP.

group: The name of the OS group to which the GoldenGate OS user belongs.

attribute: Sets or overrides default values for standard Clusterware attributes of the GoldenGate resource (e.g. CHECK_INTERVAL, AUTO_START).
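As a rough illustration of how these options combine, the sketch below assembles an `agctl add goldengate` command line and prints it for review instead of running it. The helper name gg_add_cmd and every value passed to it (instance name, paths, node names, extract names) are hypothetical placeholders. The VIP options are left out here for brevity, and the checks only cover the options marked Required above plus the three instance types from the syntax.

```shell
# Sketch: build an `agctl add goldengate` command and print it for review.
# gg_add_cmd is a hypothetical helper; all values used below are placeholders.
# Arguments: instance gg_home oracle_home instance_type nodes extracts replicats
gg_add_cmd() {
  instance="$1"; gg_home="$2"; oracle_home="$3"; itype="$4"; nodes="$5"
  extracts="$6"; replicats="$7"

  # instance_name, oracle_home and instance_type are marked Required above.
  if [ -z "$instance" ] || [ -z "$oracle_home" ] || [ -z "$itype" ]; then
    echo "error: instance_name, oracle_home and instance_type are required" >&2
    return 1
  fi
  case "$itype" in
    source|target|dual) ;;
    *) echo "error: instance_type must be source, target or dual" >&2
       return 1 ;;
  esac

  cmd="agctl add goldengate $instance --gg_home $gg_home --oracle_home $oracle_home"
  cmd="$cmd --instance_type $itype --nodes $nodes"
  # The monitoring lists are optional; append them only when provided.
  if [ -n "$extracts" ]; then
    cmd="$cmd --monitor_extracts $extracts"
  fi
  if [ -n "$replicats" ]; then
    cmd="$cmd --monitor_replicats $replicats"
  fi
  echo "$cmd"
}

gg_add_cmd gg_1 /u01/app/ogg /u01/app/oracle/product/19.0.0/dbhome_1 \
  source node1,node2 ext1,ext2 ""
```

Printing the command first gives the administrator a chance to check paths and process names before registering the resource.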


Note: When it comes to creating the application VIP, there are two scenarios to consider:

  1. Pre-creation by Grid Admin (root): The Grid Administrator, who has root privileges, can pre-create the application VIP using the 'appvipcfg' command. Then, when running the 'agctl add' command, the GoldenGate Administrator only needs to specify the '--vip_name' parameter.

  2. Creation with agctl (run as root): If you choose to create the application VIP with 'agctl', you must run the command as the root user. In this case, do not use '--vip_name'. Instead, use the '--network', '--ip', and '--user' flags to set up the VIP.
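The two scenarios can be compared side by side with a short sketch that prints the commands for each one. All values are hypothetical (VIP name gg_vip_1, address 10.1.1.50, network 1, OS user oracle, instance gg_1, and the paths), and the `appvipcfg create` options should be checked against the help output of your Grid Infrastructure release before use.

```shell
# Sketch of the two VIP scenarios; commands are printed, not executed.
# Hypothetical values: VIP name, IP address, network number, OS user, paths.
VIP_NAME="gg_vip_1"
VIP_IP="10.1.1.50"
NETWORK=1
GG_USER="oracle"
COMMON="--gg_home /u01/app/ogg --oracle_home /u01/app/oracle/product/19.0.0/dbhome_1 --instance_type source --nodes node1,node2"

# Scenario 1: root pre-creates the VIP, then the GoldenGate administrator
# only refers to it by name.
vip_precreated() {
  echo "# as root:"
  echo "appvipcfg create -network=$NETWORK -ip=$VIP_IP -vipname=$VIP_NAME -user=$GG_USER"
  echo "# as GoldenGate administrator:"
  echo "agctl add goldengate gg_1 $COMMON --vip_name $VIP_NAME"
}

# Scenario 2: agctl creates the VIP itself, so the whole command runs as root
# and --vip_name is not used.
vip_created_by_agctl() {
  echo "# as root:"
  echo "agctl add goldengate gg_1 $COMMON --network $NETWORK --ip $VIP_IP --user $GG_USER"
}

vip_precreated
vip_created_by_agctl
```

Scenario 1 keeps root involvement to a single one-off step, which may suit sites where the GoldenGate administrator has no root access.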

The following are examples of common agctl commands for GoldenGate.

Command to show the current configuration of the GoldenGate instance:

agctl config goldengate <instance_name>

Command to enable the GoldenGate instance:

agctl enable goldengate <instance_name>

Note: When the resource is first registered, it is enabled by default.

Command to disable the GoldenGate instance:

agctl disable goldengate <instance_name>

Command to relocate the GoldenGate instance from one node to another:

agctl relocate goldengate <instance_name> [--serverpool <serverpool_name> | --node <node_name>]

Command to delete the GoldenGate instance:

agctl remove goldengate <instance_name> [--force]

Command to check the GoldenGate instance status:

agctl status goldengate <instance_name> [--node <node_name>]

Command to stop the GoldenGate instance:

agctl stop goldengate <instance_name>
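These commands are often used together. As one possible pattern, a planned relocation before node maintenance could check the status, relocate the instance, and check the status again. The sketch below assumes a hypothetical instance gg_1 and target node node2, and only prints the commands unless DRY_RUN=0 is set.

```shell
# Sketch: planned relocation of a GoldenGate instance to another node.
# The instance name and node name used below are hypothetical.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

gg_planned_relocate() {
  instance="$1"
  target_node="$2"
  # Stop at the first failing step so a failed relocate is not masked.
  run agctl status goldengate "$instance" &&
  run agctl relocate goldengate "$instance" --node "$target_node" &&
  run agctl status goldengate "$instance"
}

gg_planned_relocate gg_1 node2
```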

Thursday, November 30, 2023

High Availability Architecture for Oracle Goldengate Services - Part 1


About High Availability Architecture: High Availability (HA) architecture for Oracle GoldenGate services is designed to ensure that the data replication and integration processes continue to function without interruption, even in the event of system failures or disruptions. This is achieved through a combination of redundant components, failover mechanisms, and data synchronization strategies.

Challenges without High Availability Architecture:

Without implementing a High Availability (HA) architecture, we may face several challenges as listed below:

  1. Downtime: In the absence of High Availability (HA), system failures could result in substantial downtime. Such disruptions can interfere with business operations, potentially leading to a decrease in revenue and a loss of trust from customers.

  2. Data Loss: In the absence of HA, there is a higher risk of data loss or corruption during system failures. This can have serious implications, especially for businesses dealing with sensitive or critical data.

  3. Scalability Issues: Without HA, it can be challenging to scale up the system to handle increased load or demand. This can lead to performance issues and poor user experience.

  4. Business Continuity Risks: Without HA, business continuity can be severely impacted during system failures or disruptions. This can affect the organization's ability to deliver services and meet its business objectives.

  5. Human Errors: When services are managed manually, the potential for human errors increases, which can subsequently impact the availability of the service.

The benefits of implementing a High Availability Architecture:

Implementing High Availability (HA) architecture for Oracle GoldenGate services is critical for several reasons:

  1. Minimize Downtime: HA architecture ensures that the data integration and replication processes continue to function without interruption, even in the event of system failures or disruptions. This minimizes downtime, which can be costly for businesses.

  2. Data Integrity: Oracle GoldenGate is often used to replicate data between different systems. Any disruption in this process can result in data loss or corruption. HA architecture helps to prevent this by ensuring that the data replication process continues even if one of the systems fails.

  3. Business Continuity: Any interruption in service can have a significant impact on business operations. By implementing HA architecture, organizations can ensure that their critical business processes continue to function without interruption.

  4. Scalability: Data grows as the organization grows. HA architecture allows organizations to scale GoldenGate services to meet increasing business demands without compromising availability or performance.


High Availability Architecture Implementation:

Oracle Grid Infrastructure ensures that important applications are always available. It uses Oracle Grid Infrastructure Bundled Agents (XAG) to manage application resources. Oracle Clusterware, a part of this infrastructure, provides a network resource for application Virtual IPs (APPVIPs) to maintain network connectivity.

GoldenGate services need shared storage for certain files, so that those files stay available to whichever node runs the services after a failover. The recommended file system for this is DBFS.

The bundled agents make integration easier by removing the need for extra infrastructure agents. Oracle Grid Infrastructure also allows applications to integrate smoothly by managing each application as a resource.

Applications that connect to an Oracle Database in the same cluster can set dependencies on the database, ensuring the database starts before the application. Depending on the application's use of the database, different dependencies can be set for flexibility.

This setup makes management easier and more flexible, as the database always starts before the application.
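In AGCTL terms, the dependency is expressed with the database options of `agctl add goldengate`: a database resource when the processes connect by ORACLE_SID, or a database service when they connect through a service. The sketch below picks the matching option. The helper name gg_dependency_opts, the database name orcl, and the service name ggsvc are hypothetical, and the ora.<db>.db and ora.<db>.<service>.svc resource names are an assumption about the usual Clusterware naming that should be confirmed with `crsctl status resource -t`.

```shell
# Sketch: choose the dependency option that matches how extract/replicat
# processes connect to the database.
# Hypothetical names: database "orcl", service "ggsvc". The resource names
# below assume the usual Clusterware naming; confirm the real names with
# `crsctl status resource -t`.
gg_dependency_opts() {
  case "$1" in
    sid)     echo "--databases ora.orcl.db" ;;            # ORACLE_SID based
    service) echo "--db_services ora.orcl.ggsvc.svc" ;;   # service based
    *)       echo "error: connection type must be sid or service" >&2
             return 1 ;;
  esac
}

gg_dependency_opts sid
gg_dependency_opts service
```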


In diagram-1, GoldenGate services run as regular resources and must be managed locally with the GGSCI command.

In diagram-2, GoldenGate services run as cluster resources and are managed using the AGCTL command. If there is an issue with node-1, the services will automatically be relocated to node-2, providing high availability for the services.