In earlier articles on this website, we covered some of the must-know basics of Big Data. Those related to the current topic include Theoretical Foundations of Big Data, Data Lake, Data Refining, Difference Between Data Lake and Data Warehouse, and ETL (Extract, Transform, Load), to mention a few. We often hear discussion around the data mart. What is a data mart? In simple terms, a data mart is a specialized version of a data warehouse: a small data warehouse focused on a specific business area or topic within an organization.
What is a Data Mart?
According to Sinnexus (2016), a data mart is a departmental database specialized in storing the data of a specific business area. It is characterized by a data structure optimized for analyzing information in detail from every perspective that affects that department's processes.
The data in this context can be grouped, explored, and propagated in multiple ways, so that different groups of users can exploit it in whatever way best suits their needs. The data mart is a query-oriented system in which high-volume batch data loads occur at a low and known frequency. It is queried through OLAP (Online Analytical Processing) tools that offer a multidimensional view of the information. EIS (Executive Information Systems) and DSS (Decision Support Systems) can be built on top of these databases.
Data virtualization software can be used to create virtual data marts. Like a normal data mart, a virtual data mart pulls data from disparate sources and combines it with other data to meet the needs of specific business users.
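As a rough illustration of the virtual data mart idea, the sketch below uses a SQL view in an in-memory SQLite database. It is only an analogy: real data virtualization software federates separate systems, whereas here both "sources" live in one database. The table names (crm_customers, billing_invoices), the view name (finance_mart), and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory database standing in for two disparate source systems (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE crm_customers (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE billing_invoices (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);

INSERT INTO crm_customers VALUES (1, 'Acme', 'North'), (2, 'Globex', 'South');
INSERT INTO billing_invoices VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);

-- The "virtual" mart: a view that stores no data of its own and is
-- computed from the source tables each time it is queried.
CREATE VIEW finance_mart AS
SELECT c.region, c.name, SUM(i.amount) AS total_billed
FROM crm_customers AS c
JOIN billing_invoices AS i ON i.customer_id = c.customer_id
GROUP BY c.region, c.name;
""")

# Business users query the view as if it were an ordinary table.
rows = conn.execute(
    "SELECT region, name, total_billed FROM finance_mart ORDER BY name"
).fetchall()
print(rows)
```

Because the view is evaluated at query time, a change in either source table shows up in finance_mart without any reload step, which is the main contrast with a physically loaded data mart.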
The usual reasons for creating a data mart are:
- To give easy access to frequently needed data
- To create a collective view for a group of users
- To improve end-user response time
- To ease the workflow
- To keep costs lower than those of a full data warehouse
- To hold only business-essential data
- To keep the data less cluttered.
Difference Between Data Mart and Data Warehouse
- A data warehouse holds multiple subject areas, whereas a data mart often holds only one subject area, such as Finance.
- A data warehouse holds highly detailed information, whereas a data mart usually holds summarised data.
- A data warehouse works to integrate all data sources, whereas a data mart concentrates on integrating information from a pre-defined subject area.
- A data warehouse does not always use a dimensional model, whereas a data mart is usually built on a dimensional model using a star schema.
- The scope of a data warehouse is corporate, whereas the scope of a data mart is a line of business (LoB).
- A data warehouse typically ranges from 100 GB to several terabytes or more, whereas a data mart usually stays under 100 GB.
- Implementing a data warehouse takes months to years, whereas implementing a data mart usually takes months.
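The first two differences in the list above (one subject area, summarised data) can be sketched with a small example. This is a minimal illustration using SQLite from Python, not a recommended loading process; the warehouse_transactions table, the finance_mart_monthly table, and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical warehouse table: detailed rows covering several subject areas.
CREATE TABLE warehouse_transactions (
    txn_id INTEGER PRIMARY KEY,
    subject_area TEXT,
    txn_month TEXT,
    amount REAL
);
INSERT INTO warehouse_transactions VALUES
    (1, 'finance', '2024-01', 100.0),
    (2, 'finance', '2024-01', 50.0),
    (3, 'finance', '2024-02', 70.0),
    (4, 'hr',      '2024-01', 999.0);

-- Hypothetical finance mart: a single subject area, summarised per month.
CREATE TABLE finance_mart_monthly AS
SELECT txn_month, COUNT(*) AS txn_count, SUM(amount) AS total_amount
FROM warehouse_transactions
WHERE subject_area = 'finance'
GROUP BY txn_month;
""")

mart_rows = conn.execute(
    "SELECT txn_month, txn_count, total_amount "
    "FROM finance_mart_monthly ORDER BY txn_month"
).fetchall()
print(mart_rows)
```

The mart ends up with fewer rows than the warehouse and contains nothing from the 'hr' subject area, which reflects the narrower scope and smaller size described in the list.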
Types of Data Mart
Three types of data mart are common: dependent, independent, and hybrid. This categorization is based on the data source that feeds the data mart. Dependent data marts fetch data from a central, already running data warehouse. Independent data marts are standalone systems built by drawing data directly from external data sources, operational systems, or both. Hybrid data marts can combine data from an existing data warehouse with data from operational systems.
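The source-based categorization can be written down as a small decision function. This is only a sketch: the names MartType and classify_mart are hypothetical, and it assumes a hybrid mart is one fed by both a data warehouse and direct (operational or external) sources.

```python
from enum import Enum


class MartType(Enum):
    DEPENDENT = "dependent"
    INDEPENDENT = "independent"
    HYBRID = "hybrid"


def classify_mart(uses_warehouse: bool, uses_direct_sources: bool) -> MartType:
    """Classify a data mart by the kind of source that feeds it.

    uses_warehouse: the mart reads from a central data warehouse.
    uses_direct_sources: the mart reads directly from operational
    systems or external data sources.
    """
    if uses_warehouse and uses_direct_sources:
        return MartType.HYBRID  # assumption: hybrid means both kinds of source
    if uses_warehouse:
        return MartType.DEPENDENT
    if uses_direct_sources:
        return MartType.INDEPENDENT
    raise ValueError("a data mart needs at least one data source")


print(classify_mart(uses_warehouse=True, uses_direct_sources=False).value)
```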
Design schemas of Data Mart
Two design schemas are common. One is the star schema, a popular design that enables a relational database to emulate the analytical functionality of a multidimensional database. The other is the snowflake schema, a variant in which the dimension tables are further normalized into related tables.
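To make the star schema more concrete, here is a minimal sketch in SQLite: one central fact table joined to two dimension tables, queried along two dimensions at once. The table names (fact_sales, dim_date, dim_product), the columns, and the sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes used to slice the facts.
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);

-- Fact table at the centre of the star: measures plus keys to each dimension.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    units INTEGER,
    revenue REAL
);

INSERT INTO dim_date VALUES (1, 2024, 'Q1'), (2, 2024, 'Q2');
INSERT INTO dim_product VALUES
    (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Manual', 'Books');
INSERT INTO fact_sales VALUES
    (1, 1, 10, 100.0), (1, 2, 5, 75.0), (2, 1, 4, 40.0), (2, 3, 2, 30.0);
""")

# A multidimensional-style query: revenue by product category and quarter.
by_category_quarter = conn.execute("""
SELECT p.category, d.quarter, SUM(f.revenue) AS revenue
FROM fact_sales AS f
JOIN dim_date AS d ON d.date_key = f.date_key
JOIN dim_product AS p ON p.product_key = f.product_key
GROUP BY p.category, d.quarter
ORDER BY p.category, d.quarter
""").fetchall()
print(by_category_quarter)
```

In a snowflake variant of the same sketch, category would move out of dim_product into its own table referenced by a key, at the cost of one more join per query.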