Skip to main content

Star shema and snow flake schema



Star Schema:

       Centrally located table is called as Fact Table

       All surrounding tables are dimensional tables

       Fact table should be a numeric value

       But every numeric value need not be a fact Ex: Phone Number

       Numeric's which are of type KPI are known as facts

       A fact presents in lowest level of granularity

       Star schema is a denormalized structure

       Redundancy occurs in star schema

       All dimension tables maintain direct relationship with the fact table

       In star schema dimension tables should not have any parent table

       Efficient in Query Processing

 

 

Star Schema:

         99.99% uses star schema

         A star schema represents a simple subject( Datamart )

         A fact table contains a composite key where each candidate key is a foreign key to the dimension table

         A fact table contains facts

         Facts are business measures which are used to evaluate the performance of an enterprise

 

 

Customer Dim                               Time Dim

CustId(PK)                               TimeId(PK)

CustName                               Year

Age                                   Quarter

                     Fact                     Month

                     SaleId(PK)           Week

                     CustId(FK)          Day

                     TimeId(FK)

Store Dim           StoreId(FK)         Product Dim

StoreId(PK)         ProdId(FK)          ProdId(PK)

StoreName         Qty         Facts    Prod.Name

Country              Revenue                    Category

City                                   Subcategory

 

Snow Flake Schema

         It is used 1%

         Dimension table maintains one or more parent tables

         It is a normalized structure/schema

         Efficient in transaction processing

         Inefficient in query processing(Data retrieval is slow)

         Normalization is a process of breaking a table into no of tables to eliminate the redundancy

 

Products                    Sales Fact                  Customers

ProdId                SaleId                 CustId

ProdName          CustId                 Cname

Price                   ProdId                CityId

SubcatId                    SaleDate

                     Quantity

                     Amount

      

SubCategory(ShrunkDim)      Time Dim                  City

SubcatId                    TimeId                CityId

SubcatName             Year                    CityName

CatId                   Qtr               StateId

                     Month                      

Categories          Week                  State

CatId                   Day              StateId

CatName                                       StateName

 

 

2 approaches to design datamarts

1)Top Down Approach –Dependant Datamarts– W.H.Inman

2)Bottomup Approach-Independent Datamarts-Ralph Kimball

Bottomup approach is widely followed

 

 

Comments

Popular posts from this blog

Data modeling

Data Modeling: Model: Model is business representation of information in one or more data sources OLTP                                    DWH   Model          Model                 Model E-R Model    (modify wih ETL)       Dimensional model (Design based on              (Design based on facts and Measures) Entities&Relationships)   Dimensional Modeling: Is a design methodology for designing a DWH It consists of following 3phases to design the Database 1)Conceptual Modeling: •          Understanding the requirements •     ...

FM workflow diagram

FM workflow process diagram: A windows based tool used to design metadata Models                        import Datasource------ à FM design and create the project Prepare the metadata project(Presentation layer)--Prepare  the project business view--Create & manage package---setup  security— Publish package---Content store(Metadata)---Reporting interface   Metadata source-Cognos Application •        Cognos impromptu •        Cognos 10.1 model •        Cognos Architect •        Decision stream •        Data Manager •        IBM Data Source   Third party metadata source: •        ERWIN •     ...

Book on IBM Cognos 10 written by me-

Here is a book written by me on IBM Business Intelligence Tool COGNOS which is published at the following link- http://www.amazon.com/dp/B00KKSVGPA/ref=rdr_kindle_ext_tmb Anyone who is having prior knowledge in database like oracle can easily learn to develop the business reports by following the screenshot assisted examples. This book will be a good resource for the students who wish to self learn IBM Cognos software.