What is Data Virtualization?

Abhishek Ghosh

By Abhishek Ghosh November 15, 2023 6:21 pm Updated on November 15, 2023

What is Data Virtualization?

The term data virtualization refers to a set of approaches in data management that form a subset of data integration. They make it possible to query and manipulate data from source systems without the querying system needing to know technical details such as the structure of the data source or its physical storage location.

Data virtualization can be seen as an alternative to the data warehouse approach with its ETL processes, in which the data is extracted from the source systems, transformed and finally loaded into the analytical system. With data virtualization, on the other hand, the data remains in its original systems. The virtualization component accesses this data directly and makes it available for further manipulation or for consumption by other applications.
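The difference between the two approaches can be illustrated with a small sketch. It is not a real virtualization product, only a minimal illustration using Python's built-in sqlite3 module. The `orders` table, the function names and the cents conversion are all assumptions made for this example.

```python
import sqlite3

# Hypothetical source system: an operational database holding orders.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 25.5)])


def etl_load(src):
    """ETL style: extract rows, transform them, load a physical copy."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount_cents INTEGER)"
    )
    rows = src.execute("SELECT id, amount FROM orders").fetchall()
    warehouse.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(order_id, round(amount * 100)) for order_id, amount in rows],
    )
    return warehouse


def warehouse_total_cents(warehouse):
    """Read the total from the copied data in the warehouse."""
    (total,) = warehouse.execute(
        "SELECT SUM(amount_cents) FROM orders"
    ).fetchone()
    return total


def virtual_total_cents(src):
    """Virtualization style: no copy, the source is queried at request time."""
    rows = src.execute("SELECT amount FROM orders").fetchall()
    return sum(round(amount * 100) for (amount,) in rows)


warehouse = etl_load(source)

# A new order arrives in the source system after the ETL run.
source.execute("INSERT INTO orders VALUES (3, 4.5)")

stale_total = warehouse_total_cents(warehouse)  # copy made before the insert
live_total = virtual_total_cents(source)        # reads the source directly
```

In this sketch the warehouse copy does not see the third order until the next ETL run, while the virtual query does. The price is that every virtual query puts load on the source system.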

In order to eliminate the heterogeneity of the data (differences in data sources, format and semantics), various abstraction and transformation techniques are used. Potential benefits of this approach include the reduction of erroneous data and, if the virtualization component is designed appropriately, lower utilization of the systems involved. Furthermore, it is possible to write data back to the source systems.
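As a rough illustration of how differences in format and semantics can be abstracted away, the following sketch maps two records onto one common schema. The two source systems, their field names, the date formats and the currency handling are assumptions made for this example and are not taken from any specific product.

```python
from datetime import date, datetime

# Hypothetical records from two source systems that describe the same kind of
# entity with different field names, date formats and units.
crm_row = {"CustomerName": "Ada", "Signup": "15.11.2023", "RevenueEUR": "1200.50"}
erp_row = {"name": "Grace", "signup_date": "2023-11-15", "revenue_cents": 99000}


def from_crm(row):
    """Map a CRM-style record onto the common logical schema."""
    return {
        "name": row["CustomerName"],
        "signup": datetime.strptime(row["Signup"], "%d.%m.%Y").date(),
        "revenue_cents": round(float(row["RevenueEUR"]) * 100),
    }


def from_erp(row):
    """Map an ERP-style record onto the same common logical schema."""
    return {
        "name": row["name"],
        "signup": date.fromisoformat(row["signup_date"]),
        "revenue_cents": int(row["revenue_cents"]),
    }


# Consumers only ever see the unified shape, never the source formats.
unified = [from_crm(crm_row), from_erp(erp_row)]
```

Both records end up with the same keys, a real date object and revenue in the same unit, so a consuming application does not need to know which system a record came from.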

Typical areas of application of the concept and corresponding software are in business intelligence, service-oriented architecture, cloud computing, enterprise search and master data management.

Data Virtualization and Data Warehousing

Many enterprise system landscapes consist of disparate data sources, including multiple data warehouses, data marts, and/or data lakes. Data virtualization can build a bridge over these source systems without the need for additional physical data storage. The existing data infrastructure can continue to perform its core functions, while the data virtualization layer only uses the data from those sources. This aspect can help increase data availability and usage.

Data virtualization can also be considered as an alternative to ETL processes and data warehousing. The concept aims to deliver insights from multiple data sources quickly, without the need for extensive ETL processes and additional data storage. However, data virtualization can also be extended and customized to meet data warehousing requirements. This requires an understanding of data storage and historization requirements, along with planning and design, in order to select appropriate data virtualization, integration, and storage strategies and to perform infrastructure and performance optimizations (e.g., streaming, in-memory, hybrid storage).

Image credit: DataWerks Gmbh

Features of Data Virtualization

Data virtualization solutions offer some or all of the following features:

Abstraction – abstracting the technical aspects of the stored data such as location, storage structure, API, query language, and storage technology
Virtualized data access – accessing different data sources and making the data available at a common logical access point
Transformation – transformation, data quality improvement, reformatting, and aggregation of source data
Data federation – combining result sets from multiple source systems
Data delivery – publishing result sets as views and/or data services that can be accessed by client applications or users
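The features listed above can be sketched together in a few lines. This is a minimal illustration and not how any particular product works. The two sources (an SQLite database and a CSV export), the function names and the `customer_revenue` view are assumptions made for this example.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a relational database with customer master data.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

# Hypothetical source 2: a CSV export from an order system.
ORDERS_CSV = "customer_id,amount\n1,10.00\n1,5.50\n2,20.00\n"


def fetch_customers():
    """Abstraction: callers do not see that this source speaks SQL."""
    return dict(crm.execute("SELECT id, name FROM customers"))


def fetch_orders():
    """Abstraction: callers do not see that this source is a CSV file."""
    reader = csv.DictReader(io.StringIO(ORDERS_CSV))
    return [(int(row["customer_id"]), float(row["amount"])) for row in reader]


def customer_revenue():
    """Federation and delivery: combine both result sets into one view."""
    names = fetch_customers()
    totals = {}
    for customer_id, amount in fetch_orders():
        # Transformation: aggregate order amounts per customer.
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return [
        {"customer": names[customer_id], "revenue": round(total, 2)}
        for customer_id, total in sorted(totals.items())
    ]


# The common logical access point: one call, no knowledge of the sources.
view = customer_revenue()
```

A client application would only call `customer_revenue()`. Where the data lives and which query language each source needs stays hidden behind the two fetch functions.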

In addition, data virtualization software may include development, operation, and/or management capabilities. When used correctly, the following benefits can be achieved with the concept of data virtualization:

Reduction of erroneous data
Reduction of system load by keeping the data in the source system
Increased access speeds
Reduction of time required for development and support
Increased governance and reduced risk through policy application
Reduced storage footprint

Disadvantages of Data Virtualization

The response times of operational systems could be affected, especially if they cannot handle unexpected queries.
Data virtualization does not impose a uniform data model on the heterogeneous sources. This means that the user must interpret the data unless virtualization is combined with data federation and a business understanding of the data.
Data virtualization requires a defined governance approach to avoid budgeting issues across shared services.
Data virtualization is not suitable for historizing data. A data warehouse is better suited for this.
Change management is associated with increased effort because all changes to the virtual data model must be accepted by all consuming applications and users.
