Normalization of the dimension tables ("snowflaking") can impair the performance of a data warehouse.
Friday, November 2, 2007
Where would you use a snowflake schema?
The snowflake schema is a more complex variant of the star schema in which the dimension tables are normalized into several related tables. It is called a snowflake schema because the diagram of the schema resembles a snowflake.
Snowflake schemas are generally used when a dimension table becomes very big, or when a star schema cannot represent the complexity of the data structure. For example, if a PRODUCT dimension table contains millions of rows, snowflaking may improve performance by moving some of its data out to another table (a BRANDS table, for instance).
Normalization of the dimension tables ("snowflaking") can impair the performance of a data warehouse.
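To make the trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (product_star, product_snow, brands, brand_country) and the sample rows are assumptions made up for illustration; only PRODUCT and BRANDS come from the text above. It shows the same dimension once in star form and once in snowflake form, and that the snowflake form needs an extra join to return the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star form: brand attributes are repeated on every product row.
    CREATE TABLE product_star (
        product_key   INTEGER PRIMARY KEY,
        product_name  TEXT,
        brand_name    TEXT,
        brand_country TEXT
    );
    -- Snowflake form: brand attributes live in their own table.
    CREATE TABLE brands (
        brand_key     INTEGER PRIMARY KEY,
        brand_name    TEXT,
        brand_country TEXT
    );
    CREATE TABLE product_snow (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        brand_key    INTEGER REFERENCES brands(brand_key)
    );
""")

# Hypothetical sample data, identical in both forms.
conn.executemany(
    "INSERT INTO product_star VALUES (?, ?, ?, ?)",
    [(1, "Cola 330ml", "Acme", "US"),
     (2, "Cola 1l", "Acme", "US"),
     (3, "Tea 500ml", "Birch", "UK")],
)
conn.executemany(
    "INSERT INTO brands VALUES (?, ?, ?)",
    [(10, "Acme", "US"), (20, "Birch", "UK")],
)
conn.executemany(
    "INSERT INTO product_snow VALUES (?, ?, ?)",
    [(1, "Cola 330ml", 10), (2, "Cola 1l", 10), (3, "Tea 500ml", 20)],
)

# Star: one table, no join.
star = conn.execute(
    "SELECT product_name, brand_name FROM product_star ORDER BY product_key"
).fetchall()

# Snowflake: the same question costs one more join.
snow = conn.execute("""
    SELECT p.product_name, b.brand_name
    FROM product_snow AS p
    JOIN brands AS b ON b.brand_key = p.brand_key
    ORDER BY p.product_key
""").fetchall()

print(star == snow)
```

The snowflake form stores each brand once, which keeps the product table narrower, but every query that needs a brand attribute pays for the additional join.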
What is a factless fact table?
According to Kimball, fact tables contain the numeric, additive fields that are best thought of as the measurements of the business, measured at the intersection of all of the dimension values.
He also states that there may be fact tables which provide a lot of useful information but do not have any measures in them. They may consist of no columns other than the dimension keys.
Example: Student attendance tracking system.
List of tables that your solution would have:
- Date dimension
- Student dimension
- Teacher dimension
- Course dimension
- Student Attendance fact table (keys only)
The grain of the fact table would be each class/lecture that a student attends.
In such a situation there is no obvious fact to record. Even so, a fact table consisting only of keys is a perfectly good fact table. It will answer a lot of questions, such as:
- Which classes were the most heavily attended?
- Which classes were the most consistently attended?
- Which teachers taught the most students?
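As a rough illustration of how a keys-only fact table answers such questions, the following sketch uses Python's built-in sqlite3 module. The table name attendance_fact, the column names, and the sample rows are all assumptions for the example; the dimension tables are left out to keep it short. Since there is no measure column, every question becomes a count of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE attendance_fact (
        date_key    INTEGER,
        student_key INTEGER,
        teacher_key INTEGER,
        course_key  INTEGER,
        -- One row per student per class attended: keys only, no measures.
        PRIMARY KEY (date_key, student_key, teacher_key, course_key)
    )
""")

# Hypothetical attendance events.
conn.executemany(
    "INSERT INTO attendance_fact VALUES (?, ?, ?, ?)",
    [(20071101, 1, 100, 500),
     (20071101, 2, 100, 500),
     (20071101, 3, 100, 500),
     (20071102, 1, 100, 500),
     (20071101, 1, 200, 600),
     (20071102, 2, 200, 600)],
)

# Which classes were the most heavily attended? Count the rows per course.
by_course = conn.execute("""
    SELECT course_key, COUNT(*) AS attendances
    FROM attendance_fact
    GROUP BY course_key
    ORDER BY attendances DESC
""").fetchall()

# Which teachers taught the most students? Count distinct students per teacher.
by_teacher = conn.execute("""
    SELECT teacher_key, COUNT(DISTINCT student_key) AS students
    FROM attendance_fact
    GROUP BY teacher_key
    ORDER BY students DESC
""").fetchall()

print(by_course)
print(by_teacher)
```

In a full warehouse these queries would join to the course and teacher dimensions to show names instead of keys.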
A second kind of factless fact table is called a coverage table.
Coverage tables are frequently needed when a primary fact table in a dimensional data warehouse is sparse. Consider a simple sales fact table that records the sales of products in stores on particular days under each promotion condition. Because it only has rows for products that actually sold, this sales fact table cannot answer questions like:
- Which products were on promotion that didn't sell?
Answering this question requires a two-step process:
- Consult the coverage table for the list of products on promotion on that day in that store.
- Consult the sales table for the list of products that did sell.
The desired answer is the set difference between these two lists of products.
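The two steps and the set difference can be sketched with Python's built-in sqlite3 module. The table names (promotion_coverage, sales_fact), the columns, and the sample rows are assumptions chosen for illustration, not a prescribed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Coverage table: factless, one row per product on promotion
    -- in a given store on a given day.
    CREATE TABLE promotion_coverage (
        date_key      INTEGER,
        store_key     INTEGER,
        product_key   INTEGER,
        promotion_key INTEGER
    );
    -- Sales fact table: rows exist only for products that sold.
    CREATE TABLE sales_fact (
        date_key      INTEGER,
        store_key     INTEGER,
        product_key   INTEGER,
        promotion_key INTEGER,
        units_sold    INTEGER
    );
""")

# Hypothetical data: products 1, 2 and 3 are on promotion 7.
conn.executemany(
    "INSERT INTO promotion_coverage VALUES (?, ?, ?, ?)",
    [(20071102, 1, 1, 7), (20071102, 1, 2, 7), (20071102, 1, 3, 7)],
)
# Products 1 and 3 sold under the promotion; product 4 sold without one.
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
    [(20071102, 1, 1, 7, 12), (20071102, 1, 3, 7, 5), (20071102, 1, 4, 0, 9)],
)

day, store = 20071102, 1

# Step 1: products on promotion on that day in that store.
on_promotion = {
    row[0] for row in conn.execute(
        "SELECT product_key FROM promotion_coverage "
        "WHERE date_key = ? AND store_key = ?", (day, store))
}

# Step 2: products that did sell on that day in that store.
sold = {
    row[0] for row in conn.execute(
        "SELECT product_key FROM sales_fact "
        "WHERE date_key = ? AND store_key = ?", (day, store))
}

# The answer is the set difference between the two lists.
promoted_but_unsold = sorted(on_promotion - sold)
print(promoted_but_unsold)
```

The same difference could probably be computed inside the database with a single EXCEPT query; the two-query form is used here only because it mirrors the two steps described above.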