What is Data Replication?

Abhishek Ghosh

By Abhishek Ghosh September 25, 2021 6:25 pm Updated on September 25, 2021


Replication, in the literal sense of the word, is simply the production of multiple copies of the same data, but the term usually also implies that the copies are regularly reconciled with one another. In data processing, replication is generally used to make data accessible in multiple places. It serves two purposes: data backup, and shorter response times, especially for read access.

The simplest form of data replication is storing a copy of a file; a more advanced form is the copy and paste function of modern operating systems. Duplicating optical data carriers in a pressing plant or with a disc burner is also replication in the literal sense.

Write accesses are generally more complex because the data is kept in multiple places. In the frequently encountered master/slave replication, a distinction is made between the “original” of the data (the primary data) and the dependent copies. With equivalent copies (as in version management), replication must use merge strategies to reconcile the data sets. This reconciliation is colloquially called synchronization, although it differs from synchronization in the strict sense.
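To make the idea of a merge strategy more concrete, here is a minimal sketch in Python. It assumes a "newest entry wins" rule, which is only one of many possible merge strategies and is not prescribed by anything above. The `Versioned` type, its `timestamp` field, and the `merge` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # assumed logical clock; a larger number means a newer entry


def merge(a: dict, b: dict) -> dict:
    """Merge two equivalent copies, keeping the newer entry for each key."""
    merged = dict(a)
    for key, entry in b.items():
        current = merged.get(key)
        if current is None or entry.timestamp > current.timestamp:
            merged[key] = entry
    return merged


# Two copies that were edited independently.
copy_a = {"title": Versioned("Draft", 1), "body": Versioned("Hello", 5)}
copy_b = {"title": Versioned("Final", 3), "tags": Versioned("db", 2)}

merged = merge(copy_a, copy_b)
```

After the merge, both copies can be replaced by `merged`. A rule like this silently discards the older of two conflicting edits, which is why real systems often prefer strategies that detect conflicts and ask the user or the application to resolve them.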

Sometimes it is important to know how up to date the replicas are. Depending on the type of replication, a certain amount of time passes between the processing or creation of the primary data and its replication. This interval is sometimes called the timeline but is usually referred to as latency.
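As a small worked example, the latency of a replica can be expressed as the difference between two timestamps. The function names, the timestamps, and the staleness threshold below are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def replication_latency(primary_changed_at: datetime,
                        replica_applied_at: datetime) -> timedelta:
    """Time that passed between a change on the primary and its arrival on the replica."""
    return replica_applied_at - primary_changed_at


def is_fresh_enough(latency: timedelta, max_staleness: timedelta) -> bool:
    """True if the replica lags behind the primary by no more than the allowed amount."""
    return latency <= max_staleness


changed = datetime(2021, 9, 25, 18, 25, 0)   # change committed on the primary
applied = datetime(2021, 9, 25, 18, 25, 12)  # same change applied on the replica

latency = replication_latency(changed, applied)  # 12 seconds
```

An application that tolerates 30 seconds of staleness could read from this replica, while one that tolerates only 5 seconds would have to go to the primary.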

Synchronous replication

Replication is synchronous when a change operation on a data object can only be completed once it has also been performed on the replicas. Implementing this requires a commit protocol, that is, a protocol that ensures the atomicity (indivisibility) of transactions. Synchronous replication strategies:

ROWA (Read One, Write All) procedure
Voting procedure, e.g. weighted voting
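The two strategies above can be related to each other with a short sketch. It assumes the usual quorum conditions of weighted voting: the read quorum and the write quorum together must exceed the total number of votes, and the write quorum must exceed half of the total. Under that assumption, ROWA is the special case with one vote per replica, a read quorum of 1, and a write quorum equal to the number of replicas. The function names and vote assignments are hypothetical.

```python
from __future__ import annotations


def is_valid_quorum(votes: list[int], read_quorum: int, write_quorum: int) -> bool:
    """Check the two overlap conditions assumed for weighted voting."""
    total = sum(votes)
    reads_see_latest_write = read_quorum + write_quorum > total
    writes_cannot_diverge = 2 * write_quorum > total
    return reads_see_latest_write and writes_cannot_diverge


def has_quorum(votes: list[int], reachable: set[int], quorum: int) -> bool:
    """True if the reachable replicas (given by index) hold enough votes."""
    return sum(votes[i] for i in reachable) >= quorum


# ROWA: three replicas, one vote each, read one, write all.
rowa_votes = [1, 1, 1]
rowa_ok = is_valid_quorum(rowa_votes, read_quorum=1, write_quorum=3)

# Weighted voting: replica 0 carries two votes, the others one vote each.
weighted_votes = [2, 1, 1]
weighted_ok = is_valid_quorum(weighted_votes, read_quorum=2, write_quorum=3)
```

With ROWA, a single unreachable replica blocks every write, because `has_quorum(rowa_votes, {0, 1}, 3)` is false. With the weighted assignment, a write still succeeds as long as replica 0 and one other replica are reachable.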

Examples of synchronous replication include:

Warm Standby Replication
Hot standby replication of Microsoft SQL Server databases
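The role of the commit protocol in synchronous replication can be illustrated with a simplified two-phase commit. This is a sketch under stated assumptions, not how any particular database implements it: the `SyncReplica` class, its `healthy` flag, and the `replicated_write` function are hypothetical, and failures during the commit phase are ignored.

```python
class SyncReplica:
    """A replica that first stages a change and applies it only on commit."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy  # assumed stand-in for "reachable and able to write"
        self.data = {}
        self.staged = None

    def prepare(self, key, value):
        if not self.healthy:
            return False
        self.staged = (key, value)
        return True

    def commit(self):
        key, value = self.staged
        self.data[key] = value
        self.staged = None

    def abort(self):
        self.staged = None


def replicated_write(replicas, key, value):
    """Complete the write only if every replica was able to prepare it."""
    if all(r.prepare(key, value) for r in replicas):  # phase 1: prepare
        for r in replicas:                            # phase 2: commit everywhere
            r.commit()
        return True
    for r in replicas:                                # phase 2: abort everywhere
        r.abort()
    return False


a, b = SyncReplica("a"), SyncReplica("b")
ok = replicated_write([a, b], "x", 1)

c = SyncReplica("c", healthy=False)
failed = replicated_write([a, b, c], "y", 2)
```

The first write completes on both replicas. The second one is rejected as a whole because replica `c` cannot prepare it, so no replica ends up with the key `"y"`.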

Asynchronous replication

If there is latency between the processing of the primary data and its replication, the replication is called asynchronous. The copies are identical (in sync) only at the moment of replication. A simple variant of asynchronous replication is “file transfer replication”, the transfer of files via FTP or SSH.

This means that the data in the replicas is only a snapshot of the primary data at a given point in time. At the database level, the transaction logs can be transported from one server to another at short intervals and read into the target database. Assuming an intact network, the latency then corresponds to the interval at which the transaction logs are written. Asynchronous replication strategies:

Merge replication
Primary Copy
Snapshot replication
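The transport of transaction logs described above can be sketched with an in-memory model. It is a simplification under assumptions: the `Primary` and `Replica` classes are hypothetical, the "log" is a plain list of key/value writes, and the network transfer is reduced to a method call.

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.log = []  # append-only transaction log of (key, value) writes

    def write(self, key, value):
        # The write completes immediately; replicas are not consulted.
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries already read in

    def ship_from(self, primary):
        """Read in all log entries not yet applied; return how many there were."""
        pending = primary.log[self.applied:]
        for key, value in pending:
            self.data[key] = value
        self.applied += len(pending)
        return len(pending)


primary, replica = Primary(), Replica()
primary.write("a", 1)
primary.write("b", 2)

stale_before = dict(replica.data)     # replica has not seen anything yet
shipped = replica.ship_from(primary)  # periodic shipping run
primary.write("a", 3)                 # replica is behind again until the next run
```

Between two shipping runs the replica holds a snapshot of an earlier state of the primary. Running `ship_from` more often shortens the latency at the cost of more transfers.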

Pros and cons of replication

Advantages of replicas in distributed database systems:

Increased availability of data
Acceleration of read accesses
Better options for load balancing and query optimization

Disadvantages:

High update effort
Increased disk space requirements
Redundant data sets spread across networked nodes
