The snowflake scheme is a continuation of the star scheme used in OLAP and data warehousing. With the star schema, the dimension tables are denormalized, which results in better processing speed at the expense of data integrity and storage space. In contrast, the snowflake scheme refines the individual dimension tables by classifying or normalizing them. This branching of the data model creates the shape of a snowflake, where the name of this design pattern comes from.
Due to this finer structuring, the data is less redundant than in a star schema, but additional join operations may be required for queries. A snowflake scheme thus leads to smaller and better structured amounts of data, but they have more complex relationships and may therefore lead to longer loading or query times.
The snowflake scheme is a continuation of the star scheme. With this, the fact table remains analogous to the star schema. However, in contrast to the star schema, the dimension tables are different, because they no longer contain all dimension members, but only data about the dimension hierarchies. The dimensions are further refined by classifying or normalizing them. In any case, the dimension tables are extended by the attributes so that each dimension can be displayed in its table. This means that in the common snowflake scheme, the data is stored in the dimension tables in the 3rd normal form (3NF). Normalization creates a separate table for each hierarchy level of a dimension and thus leads to smaller and better-structured amounts of data. This branching of the data model creates the shape of a snowflake, which gives this schema its name.
Advantages and Disadvantages of the Snowflake Schema
The following are the advantages and disadvantages of the snowflake scheme compared to the simpler star scheme:
- Reduced disk space consumption: Dimension tables do not contain redundant data due to normalization.
n:m relationships between aggregation levels can be resolved using relation tables
- Optimal support of aggregation formation
- Browsing functionality: Frequent queries over very large dimension tables save time and provide a speed advantage.
- Speed disadvantage: Due to additional federations in the dimension tables
- More complex structuring: Due to the finer structuring, the data is less redundant than in a star schema, but the relationships are more complex. As a result, multi-level dimension tables must be re-joined using join queries and may result in longer query times.
- Larger number of tables: Due to the more complex structuring, a larger number of tables is required.
- Reorganization problem: Changes in the semantic model lead to extensive reorganization of the tables and consequently to a higher maintenance effort