Saturday 20 February 2016

GoldenGate – An introduction

In the last few years, GoldenGate has become the preferred choice for DBAs to handle the replication requirements of their data centers. Besides being easy to configure, GoldenGate offers immense flexibility in the configuration strategies it supports. This series of articles will discuss GoldenGate technology, covering concepts, configuration options, troubleshooting and so forth.

The Scenario

Imagine you work for a multinational bank. Its headquarters are in London, UK, but you are based at a branch in Mumbai, India. The bank uses a specific account for its financial application, which is used globally at all branches. Your manager has asked that the transactions for that account in the Mumbai branch database be kept in sync, daily, with the centralized database in the UK. The volume of transactions is massive, and even the slightest delay can greatly impact the business. A bank has neither just one database nor just one branch, so the same kind of setup will probably be required at multiple destinations. The setup also needs to be monitored continuously, preferably through some sort of GUI-based tool for ease of management. Additionally, several other, non-critical applications are used at all the branches. These applications are based on non-Oracle databases such as MySQL, but the transactions made in these databases also need to be loaded into an Oracle database located at the headquarters. The replication technology must therefore support both Oracle and non-Oracle databases so that they can talk to each other.
Given this situation, the most important question becomes: which software can meet all these requirements?
If you guessed GoldenGate, you were spot-on. In this series of articles we shall explore this technology in more detail with every instalment.

GoldenGate – What and Why

GoldenGate is lightweight, log-based replication software. Interestingly, although it is popularly known as Oracle GoldenGate, it wasn't initially an Oracle product. GoldenGate Software Inc. came into existence in 1995 and was founded by Eric Fish and Todd Davidson. Yes, the name is inspired by San Francisco's Golden Gate Bridge. In 2009, GoldenGate was acquired by Oracle Corp., and its current release, 12.1.2, matches the latest release of the Oracle database, 12c. Its founders set out to make data replication easy and seamless, not just between Oracle databases but also between Oracle and non-Oracle databases. That is why GoldenGate offers such strong support for heterogeneous environments. For a database, maintaining transaction-level integrity is of utmost importance. GoldenGate preserves this integrity and supports zero data loss for fault tolerance.
Let's have a look at a few of the most significant benefits that GoldenGate offers:
  1. With GoldenGate, data is sent in "near real time", which keeps replication latency low.
  2. To maintain performance and consistency, only committed data is sent. Uncommitted data is captured by GoldenGate, but is discarded if the transaction is rolled back.
  3. GoldenGate supports different versions of the Oracle database and many non-Oracle databases, and is available on a wide range of operating systems and hardware platforms.
  4. Though the performance of the replication is really good, the impact on the underlying databases is minimal when GoldenGate is in action.
  5. The architecture and configuration of GoldenGate are simple, making it an effective technology that is also easy to learn and implement.
  6. GoldenGate has built-in mechanisms for data recovery and gap resolution after different kinds of failures, e.g. a site or network failure.
  7. GoldenGate stores the committed data coming from the source database in files with its own proprietary format, so it functions independently of the database in use.
  8. GoldenGate uses the standard network between the source and the target database and doesn't rely on Oracle's network services. Data transfer by GoldenGate processes is therefore independent of the network services of the source and target databases.
  9. For fault tolerance, GoldenGate has its own mechanism to keep track of how much work (transactions) is completed and how much is pending. This mechanism is independent of any database and, with the help of automatic gap resolution, ensures that no data is lost.
A long list of features, isn't it?
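Point 2 above, sending only committed data, can be illustrated with a small Python sketch. This is a toy model and not GoldenGate's implementation: the record layout `(txn_id, action, payload)` and the function name `capture_committed` are assumptions made up for illustration.

```python
from collections import defaultdict


def capture_committed(log_records):
    """Return the operations of committed transactions, in commit order.

    Each record is a hypothetical tuple (txn_id, action, payload), where
    action is "op", "commit" or "rollback".
    """
    pending = defaultdict(list)   # uncommitted operations, keyed by transaction
    trail = []                    # what would be passed on for replication
    for txn_id, action, payload in log_records:
        if action == "op":
            pending[txn_id].append(payload)         # buffer until the outcome is known
        elif action == "commit":
            trail.extend(pending.pop(txn_id, []))   # release the whole transaction
        elif action == "rollback":
            pending.pop(txn_id, None)               # discard the buffered work
    return trail


log = [
    (1, "op", "INSERT a"),
    (2, "op", "INSERT b"),
    (2, "rollback", None),
    (1, "op", "UPDATE a"),
    (1, "commit", None),
]
print(capture_committed(log))   # ['INSERT a', 'UPDATE a']
```

Transaction 2 is captured but never reaches the output because it rolled back, while transaction 1 is released as a unit once its commit is seen.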
So what benefits does GoldenGate's technology stack offer a business? With its data replication technology, the following are a few of them.
  1. GoldenGate provides high availability. Using it, one can have a standby database that is constantly kept in sync with the primary environment. In case of a crash, an immediate failover would keep the impact on the business minimal.
  2. GoldenGate offers bi-directional replication, with which a complete Active-Active configuration can be created, reducing the impact on the business from minimal to almost negligible.
  3. GoldenGate is a great tool for achieving zero downtime during upgrades and migrations.
  4. Using GoldenGate's replication, a separate system (the replication target) can be used for reporting, thus relieving the source database of that burden.
  5. GoldenGate also offers adapters that extend its functionality. For example, with the GoldenGate application adapter for Java, the captured data can be sent to a non-RDBMS target, e.g. Java Message Service (JMS).
In the forthcoming parts, we shall take a more detailed look at these benefits and see examples of how to implement them.

Configuration options with GoldenGate

One of the biggest benefits of the GoldenGate technology is that it offers a lot of flexibility and can be used in varied ways to cater to different requirements. All the available configurations have merit in their own right, so the choice of configuration depends on the business requirement and may vary from one customer to another. GoldenGate supports the following system configurations:
  1. One-to-One (Unidirectional)
  2. One-to-Many (Broadcast)
  3. Many-to-One (Consolidation)
  4. Bi-Directional (Active-Active)
  5. Multimaster (Peer-to-Peer)
  6. Cascading Data Marts
The following pictorial representation covers all these topologies.
http://gavinsoorma.com/wp-content/uploads/2010/02/ggate1.jpg
Image source: Oracle documentation
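One way to get a feel for these topologies is to describe each as a set of directed replication links and classify it by shape. The Python sketch below is only an illustration under that assumption; the function `classify_topology`, the site names, and the rule that two mutually linked sites are "Bi-Directional" while three or more are "Multimaster" are simplifications of mine, not GoldenGate terminology rules.

```python
def classify_topology(links):
    """Name the topology formed by directed (source, target) replication links."""
    links = set(links)
    if not links:
        return "Other"
    sources = {s for s, _ in links}
    targets = {t for _, t in links}
    # True when every link has a matching link in the opposite direction.
    two_way = all((t, s) in links for s, t in links)
    if two_way and len(sources) == 2:
        return "Bi-Directional (Active-Active)"
    if two_way:
        return "Multimaster (Peer-to-Peer)"
    if len(links) == 1:
        return "One-to-One (Unidirectional)"
    if len(sources) == 1:
        return "One-to-Many (Broadcast)"
    if len(targets) == 1:
        return "Many-to-One (Consolidation)"
    if sources & targets:
        return "Cascading"   # some site both receives and forwards data
    return "Other"


print(classify_topology({("mumbai", "london")}))
print(classify_topology({("mumbai", "london"), ("delhi", "london")}))
print(classify_topology({("mumbai", "london"), ("london", "mumbai")}))
```

With the scenario from the start of this article, Mumbai feeding London is a unidirectional setup, and adding further branches that feed London turns it into a consolidation.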
Now that we have had a glance at GoldenGate's features and benefits, let's take a closer look at the underlying technology.

GoldenGate – Technology Overview

GoldenGate is based on a transaction log shipping and log-apply mechanism, and its architecture is made up of components that perform these two tasks. Note that I have deliberately said "log" rather than "Redo Log". GoldenGate is not meant to work only with the Oracle database, where transaction logs (the term used by other RDBMSs) are called Redo Logs. So if you use GoldenGate with an Oracle database, the terminology is Redo Logs, but for other databases the terminology will be different.
The following key terms are essential for anyone who wants to use GoldenGate. In this part, I am going to give a very brief overview to introduce the essential components. In the subsequent articles, I shall go deeper into the processes.

Extract (Capture) Process

The Extract process runs on the source side. As the name suggests, it captures the committed transactions from the source database and writes them to trail files in GoldenGate's proprietary format. The Extract process reads the database's transaction logs as its source for the committed transactions.

Pump Extract

Pump Extract, commonly known as the Pump process, is optional, but configuring it is highly recommended. Note that there is no relationship between Pump Extract and the Data Pump utility that ships with the Oracle database software. To avoid confusion, it's better to use the term Pump Extract instead of Data Pump Extract.
Pump Extract resides on the source side, just like the Capture process. Its main use is to safeguard the replication against network or site crashes. Data manipulation, if required, can be done on the source side using Pump Extract. Pump Extract also allows trail data to be sent to more than one target.

Replicat (Apply) Process

Replicat, also known as the Apply process, is configured on the target side. It reads the data captured by the Extract process, which arrives on the target side in the form of trail files, and applies it to the target database. To maintain data integrity and ensure that no data is lost, Replicat maintains a checkpointing mechanism.
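The idea behind checkpointing can be sketched in a few lines of Python. This is a toy model only: the class `ToyReplicat`, its in-memory `checkpoint` counter and the `fail_at` crash switch are hypothetical, and a real Replicat persists its checkpoints so that they survive a restart.

```python
class ToyReplicat:
    """Applies trail records and remembers the position it has reached."""

    def __init__(self):
        self.checkpoint = 0   # index of the next trail record to apply
        self.target = []      # stands in for the target database

    def apply(self, trail, fail_at=None):
        """Apply records from the checkpoint onward.

        `fail_at` simulates a crash just before the record at that index.
        """
        for pos in range(self.checkpoint, len(trail)):
            if pos == fail_at:
                raise RuntimeError(f"simulated crash before record {pos}")
            self.target.append(trail[pos])
            self.checkpoint = pos + 1   # advance only after a successful apply


trail = ["txn-1", "txn-2", "txn-3", "txn-4"]
replicat = ToyReplicat()
try:
    replicat.apply(trail, fail_at=2)
except RuntimeError:
    pass                    # the process "restarts" here
replicat.apply(trail)       # resumes from the checkpoint, not from the start
print(replicat.checkpoint, replicat.target)   # 4 ['txn-1', 'txn-2', 'txn-3', 'txn-4']
```

Because the position only moves forward after a record has been applied, the restart neither skips `txn-3` nor applies `txn-1` and `txn-2` a second time.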

Manager Process

This process is available on both the source and the target sides. The Manager process controls the overall environment on each side, for example starting or restarting the Extract and Replicat processes, managing space for the trail files and producing reports. There is one Manager process on the source side and one on the target side.

Collector Process

Collector processes run on the target side. A Collector receives the committed data sent by either the source Extract or the Pump Extract and accumulates it in the target-side trail (also known as the remote trail). The resulting trail file is consumed by the Replicat.

Trail Files

If you know what Redo Logs are to the Oracle database, you can think of trail files as playing the same role for GoldenGate (for illustration purposes). Trail files collect the data that the Extract process captures from the source database. Similar trail files are created on the target side, where the Replicat consumes them to apply the captured changes. To preserve transaction integrity, trail files are organized in a canonical format, in the commit order of the captured transactions. Furthermore, GoldenGate uses its own checkpointing mechanism to keep track of the data being written to a trail file.
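The commit-order property can be shown with a short Python sketch. Everything here is an illustrative assumption: the `Txn` record, the sequence numbers and the `write_trail` function are invented, and a real trail file is a binary file in GoldenGate's own format rather than a Python list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    name: str
    start_seq: int    # when the transaction began
    commit_seq: int   # when the transaction committed


def write_trail(txns, records_per_file=2):
    """Order transactions by commit and split them into trail 'files'."""
    ordered = sorted(txns, key=lambda t: t.commit_seq)
    names = [t.name for t in ordered]
    return [names[i:i + records_per_file]
            for i in range(0, len(names), records_per_file)]


txns = [
    Txn("T1", start_seq=1, commit_seq=30),   # started first, committed last
    Txn("T2", start_seq=2, commit_seq=10),
    Txn("T3", start_seq=3, commit_seq=20),
]
local_trail = write_trail(txns)
remote_trail = [list(f) for f in local_trail]   # the copy the target side would hold
print(local_trail)   # [['T2', 'T3'], ['T1']]
```

T1 started first but appears last because it committed last, so a reader that applies the trail front to back sees the transactions in the order the source database made them durable.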

Parameter Files

GoldenGate configuration is all about the parameters that are set for Extracts and Replicats. These parameters are kept in process parameter files, which are stored in a dedicated folder within the GoldenGate home. They are plain text files and can be edited with any text editor, e.g. Notepad or GEdit.
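Since parameter files are plain text, their shape is easy to sketch. The Python helper below builds the text of a minimal Extract parameter file. The helper `render_extract_params`, the group name, the login, the trail path and the table names are all hypothetical, and the parameter keywords should be checked against the reference guide for your release before use.

```python
def render_extract_params(group, login, trail, tables):
    """Build the text of a minimal Extract parameter file (illustrative only)."""
    lines = [
        f"EXTRACT {group}",                                         # name of the Extract group
        f"USERID {login['user']}, PASSWORD {login['password']}",    # database login
        f"EXTTRAIL {trail}",                                        # local trail to write to
    ]
    lines += [f"TABLE {t};" for t in tables]                        # tables to capture
    return "\n".join(lines) + "\n"


text = render_extract_params(
    group="ext1",
    login={"user": "ggadmin", "password": "change_me"},
    trail="./dirdat/lt",
    tables=["bank.accounts", "bank.transactions"],
)
print(text)
```

The printed text is what you would otherwise type into the file by hand. Later parts of this series cover the parameters themselves.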
As the saying goes, a picture is worth a thousand words, so let's put it all together in the diagram shown below.
[Figure 1-2. Image source: Oracle Documentation]
We haven't yet discussed what an "Initial Load" is. We shall dive into it in a subsequent part of the series, after understanding the configuration options of the processes, i.e. Extracts and Replicats.

GoldenGate Product Family

GoldenGate in itself is for data replication between two databases. It is complemented by a few further products that make using GoldenGate much easier and more efficient. The following are all part of the GoldenGate family.

GoldenGate Director

Director is a GUI-based product for the complete administration of the GoldenGate environment and its configuration. It offers a nice graphical interface for administering the core processes, e.g. Extract and Replicat, and also provides monitoring of the entire environment. It's part of the GoldenGate Management Pack.

GoldenGate Monitor

Monitor is another product in the GoldenGate family and is also part of the Management Pack. It runs in a web browser and displays the entire configuration of the GoldenGate environment, including details of the core processes, their statistics and a lot more. Unlike Director, it cannot make changes to the configuration, but it can be used to create notifications via alerts.

GoldenGate Veridata

Veridata offers data verification services for the data that is sent from the source database and applied to the target database. Veridata checks for and reports any data discrepancies without putting additional load on either the source or the target database.
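What "checking for discrepancies" means can be sketched with a toy row comparison in Python. This is not how Veridata is implemented: the functions `row_digest` and `compare_tables`, and the representation of a table as a `{primary_key: row}` dictionary, are assumptions chosen to keep the example small.

```python
import hashlib


def row_digest(row):
    """Hash a row's column values into a fingerprint."""
    joined = "\x1f".join(str(v) for v in row)   # unit separator between columns
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def compare_tables(source, target):
    """Compare two {primary_key: row_tuple} mappings and report discrepancies."""
    report = {"missing_in_target": [], "extra_in_target": [], "different": []}
    for key in sorted(source.keys() | target.keys()):
        if key not in target:
            report["missing_in_target"].append(key)
        elif key not in source:
            report["extra_in_target"].append(key)
        elif row_digest(source[key]) != row_digest(target[key]):
            report["different"].append(key)
    return report


source = {1: ("alice", 100), 2: ("bob", 250), 3: ("carol", 75)}
target = {1: ("alice", 100), 2: ("bob", 200), 4: ("dave", 10)}
print(compare_tables(source, target))
# {'missing_in_target': [3], 'extra_in_target': [4], 'different': [2]}
```

Comparing fingerprints instead of full rows keeps the amount of data exchanged between the two sides small, which is one plausible reason a verification tool would work this way.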
Besides the above-mentioned products, GoldenGate configurations can make use of Cloud Control 12c and the Oracle GoldenGate Adapters.

Wrapping-up

In this first part of the GoldenGate series, we introduced GoldenGate as replication software, along with the features and benefits it offers. We had a glance at the underlying technology and at the complementary products. In the next part, we shall walk through the installation of GoldenGate 12c on the Linux operating system. Stay tuned!
