A data warehouse scheme is a visualization of a domain-specific information database that is designed for preparing reports and business analysis to support decision-making in an organization.
A document-oriented database is a database designed for storing, retrieving, and managing document-oriented, or semi-structured, data. Schemes are a logical description of data warehouse tables. A scheme is formed from fact tables and dimension tables.
There are three schemes for data warehouses: star, snowflake, and galaxy (fact constellation).
In the star scheme, the fact table is central, and every dimension table is associated with it. Information about each dimension is therefore located in a separate table, which makes the dimensions easier to browse and makes the diagram itself logically transparent and understandable to the user. This type is used in Data Warehouse MySQL, Data Warehouse SAS and Data Warehouse SAP.
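As a rough illustration of the star layout, the sketch below builds one fact table and two dimension tables in an in-memory SQLite database. The table names, column names, and sample rows (`fact_sales`, `dim_product`, `dim_store`) are assumptions made up for this example and do not come from any particular product.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
star = sqlite3.connect(":memory:")
star.executescript("""
    -- One table per dimension
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
    -- Central fact table: a key to every dimension plus a numeric measure
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        amount     REAL
    );
    INSERT INTO dim_product VALUES (1, 'Apple'), (2, 'Milk');
    INSERT INTO dim_store   VALUES (1, 'Berlin'), (2, 'Paris');
    INSERT INTO fact_sales  VALUES (1, 1, 10.0), (2, 1, 4.0), (1, 2, 6.0);
""")

def sales_by_city(connection):
    """Join the fact table to the store dimension and sum the measure per city."""
    rows = connection.execute("""
        SELECT s.city, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_id = f.store_id
        GROUP BY s.city
    """).fetchall()
    return dict(rows)

star_totals = sales_by_city(star)
print(star_totals)
```

Every query of this kind needs only one join per dimension, which is what makes the star layout easy to read.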
However, placing all the information about a dimension in one table is not always justified. For example, if the goods sold are grouped (there is a hierarchy), each product row has to show in one way or another which group the product belongs to, which leads to group names being repeated many times. This not only increases redundancy but also increases the likelihood of contradictions (if, for example, the same product is mistakenly assigned to different groups).
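The redundancy problem can be made concrete with a small sketch. The rows below are invented sample data for a single flat product dimension; `conflicting_products` is a hypothetical helper that reports products assigned to more than one group.

```python
from collections import Counter, defaultdict

# Hypothetical rows of one flat product dimension: (product, group)
flat_dimension = [
    ("Apple", "Fruit"),
    ("Pear", "Fruit"),
    ("Milk", "Dairy"),
    ("Apple", "Vegetables"),  # the same product mistakenly put in another group
]

def conflicting_products(rows):
    """Return products that appear with more than one group name."""
    groups = defaultdict(set)
    for product, group in rows:
        groups[product].add(group)
    return {p: sorted(g) for p, g in groups.items() if len(g) > 1}

# How often each group name is stored: every repetition is redundant text
group_name_counts = Counter(group for _, group in flat_dimension)
conflicts = conflicting_products(flat_dimension)

print(group_name_counts["Fruit"])  # 2
print(conflicts)                   # {'Apple': ['Fruit', 'Vegetables']}
```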
To work more efficiently with hierarchical dimensions, a modification of the "star" scheme was developed, called the snowflake. The main feature of the snowflake scheme is that information about one dimension can be stored in several related tables. That is, if at least one of the dimension tables has one or more other dimension tables associated with it, the scheme is a snowflake.
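A minimal sketch of this idea, assuming a product dimension split into two related tables: `dim_product` holds the products and points to `dim_group`, which stores each group name exactly once. All names and rows are hypothetical.

```python
import sqlite3

snow = sqlite3.connect(":memory:")
snow.executescript("""
    -- Outer dimension table: each group name is stored once
    CREATE TABLE dim_group (group_id INTEGER PRIMARY KEY, group_name TEXT UNIQUE);
    -- Inner dimension table refers to the group by key instead of repeating its name
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT,
        group_id     INTEGER REFERENCES dim_group(group_id)
    );
    INSERT INTO dim_group   VALUES (1, 'Fruit'), (2, 'Dairy');
    INSERT INTO dim_product VALUES (1, 'Apple', 1), (2, 'Pear', 1), (3, 'Milk', 2);
""")

def group_of(connection, product_name):
    """Look up a product's group by following the dimension-to-dimension link."""
    row = connection.execute("""
        SELECT g.group_name
        FROM dim_product AS p
        JOIN dim_group AS g ON g.group_id = p.group_id
        WHERE p.product_name = ?
    """, (product_name,)).fetchone()
    return row[0] if row else None

stored_group_names = snow.execute("SELECT COUNT(*) FROM dim_group").fetchone()[0]
print(group_of(snow, "Pear"), stored_group_names)
```

Because a product row holds a single `group_id`, a product cannot end up in two groups at once, at the cost of one extra join per hierarchy level.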
The fact constellation scheme has several fact tables: in a galaxy (or constellation) scheme, two or more related fact tables are surrounded by their corresponding dimension tables.
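The sketch below shows one possible constellation, assuming two fact tables (`fact_sales` and `fact_shipments`, both hypothetical) that are related through a shared `dim_product` dimension. Each fact table is aggregated on its own before the results are combined, so rows from one fact table do not multiply rows from the other.

```python
import sqlite3

galaxy = sqlite3.connect(":memory:")
galaxy.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    -- Two fact tables that share the same dimension
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        units_sold INTEGER
    );
    CREATE TABLE fact_shipments (
        product_id    INTEGER REFERENCES dim_product(product_id),
        units_shipped INTEGER
    );
    INSERT INTO dim_product    VALUES (1, 'Apple'), (2, 'Milk');
    INSERT INTO fact_sales     VALUES (1, 5), (1, 3), (2, 2);
    INSERT INTO fact_shipments VALUES (1, 10), (2, 4), (2, 1);
""")

def sold_vs_shipped(connection):
    """Aggregate each fact table separately, then join both through the dimension."""
    return connection.execute("""
        SELECT p.product_name, s.sold, h.shipped
        FROM dim_product AS p
        JOIN (SELECT product_id, SUM(units_sold) AS sold
              FROM fact_sales GROUP BY product_id) AS s
          ON s.product_id = p.product_id
        JOIN (SELECT product_id, SUM(units_shipped) AS shipped
              FROM fact_shipments GROUP BY product_id) AS h
          ON h.product_id = p.product_id
        ORDER BY p.product_name
    """).fetchall()

comparison = sold_vs_shipped(galaxy)
print(comparison)
```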
The main functional difference between the snowflake scheme and the star scheme is the ability to work with hierarchical levels that determine the level of detail of the data. In the product-group example above, the snowflake scheme makes it possible to work with data at the level of maximum detail, for example, for each product separately, or to use a generalized representation of groups of goods with the corresponding aggregation of facts.
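To illustrate the two levels of detail, the following sketch sums the same facts once per product and once per group. The tables, the sample rows, and the `totals` helper are assumptions for this example only.

```python
import sqlite3

dw = sqlite3.connect(":memory:")
dw.executescript("""
    CREATE TABLE dim_group (group_id INTEGER PRIMARY KEY, group_name TEXT);
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT,
        group_id     INTEGER REFERENCES dim_group(group_id)
    );
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        amount     REAL
    );
    INSERT INTO dim_group   VALUES (1, 'Fruit'), (2, 'Dairy');
    INSERT INTO dim_product VALUES (1, 'Apple', 1), (2, 'Pear', 1), (3, 'Milk', 2);
    INSERT INTO fact_sales  VALUES (1, 10.0), (2, 5.0), (1, 2.5), (3, 4.0);
""")

# Fixed mapping from hierarchy level to column, so no user text reaches the SQL
LEVEL_COLUMNS = {"product": "p.product_name", "group": "g.group_name"}

def totals(connection, level):
    """Sum sales at the requested hierarchy level ('product' or 'group')."""
    column = LEVEL_COLUMNS[level]
    rows = connection.execute(f"""
        SELECT {column}, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_group   AS g ON g.group_id = p.group_id
        GROUP BY {column}
    """).fetchall()
    return dict(rows)

by_product = totals(dw, "product")  # maximum detail
by_group = totals(dw, "group")      # generalized view with aggregated facts
print(by_product, by_group)
```

Both views read the same fact rows; only the dimension table used for grouping changes.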
The choice of scheme for building a data warehouse (DWH) depends on the mechanisms used for collecting and processing data. Each scheme has its advantages and disadvantages, which can manifest themselves to a greater or lesser extent depending on how the data warehouse as a whole operates.
The advantages of the star scheme include:
The disadvantages of the star scheme are:
The advantages of the snowflake scheme are as follows:
Disadvantages of the snowflake scheme: