
DATA WAREHOUSING INTRODUCTION



DATA WAREHOUSING    

A data warehouse (DWH) is a relational database used to store large volumes of data for business analysis, not for business transaction processing

       A DWH is designed to support the decision-making process. Hence it is also known as a Decision Support System (DSS)

       It analyzes the business transactions in order to support decision making

       A DWH is a container of data

       Data warehousing is the process of developing a data warehouse

       A DWH is a read-only database because it is designed for reading data for analysis, not for transactional processing

       A DWH is a historical database because it stores historical business information

       Father of data warehousing: W.H. Inmon. In 1987, he designed a data warehouse

 Problems in existing systems:

       Data is scattered around multiple operational systems

       Inconsistency of the data across multiple transactional systems

       Volatile data

       Increasing complexity of reporting needs

 Business Analyst: Collects the requirements of a business

Data Architect: Designs the DWH

ETL Developers

Report Authors

GUI ETL Tools Ex: Informatica, DataStage

Reporting Tools Ex: Cognos, BI XI

A data warehouse is a subject-oriented, integrated, non-volatile, time-variant database in support of management decisions (W.H. Inmon)

Subject-oriented:

Tables in a DWH have to be organized around business subjects

Ex: Products, Customers, Sales

Integrated:

     When we bring data from multiple sources, we need to make sure that the data is converted to a uniform format

Ex:

USA

CustName   Gender
XXXX       M
XXXX       F

INDIA

CustName   Gender
XXXX       Male
XXXX       Female

GERMANY

CustName   Gender
XXXX       0
XXXX       1
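
As a rough illustration of the integration step, the sketch below maps each source system's gender codes to one uniform format. The function name, the target values "Male" and "Female", and in particular the meaning of Germany's 0 and 1 are assumptions made for the example; the real mapping must come from the source system's documentation.

```python
# Hypothetical mapping from each source system's gender codes to one
# warehouse-wide format. The meaning of Germany's 0 and 1 is an assumption.
GENDER_MAP = {
    "USA": {"M": "Male", "F": "Female"},
    "INDIA": {"Male": "Male", "Female": "Female"},
    "GERMANY": {"0": "Male", "1": "Female"},  # assumed: 0 = male, 1 = female
}


def normalize_gender(source, raw_value):
    """Return the uniform gender value, or "Unknown" for unmapped codes."""
    return GENDER_MAP.get(source, {}).get(str(raw_value).strip(), "Unknown")


# Sample rows shaped like the tables above: (source, CustName, Gender).
rows = [
    ("USA", "XXXX", "M"),
    ("INDIA", "XXXX", "Female"),
    ("GERMANY", "XXXX", 1),
]

for source, cust_name, gender in rows:
    print(source, cust_name, normalize_gender(source, gender))
```

Returning "Unknown" for an unmapped code, instead of raising an error, is one possible design choice; a stricter load might reject such rows.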


Non-volatile:

Data in a DWH is read-only
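
One way to picture the read-only behaviour is the small sketch below, which uses Python's built-in sqlite3 module as a stand-in for a warehouse database. The table and column names are hypothetical, and SQLite's query_only pragma is only one possible mechanism; real warehouses usually enforce this through user permissions and controlled load processes.

```python
import sqlite3

# Autocommit mode keeps the example free of implicit open transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE sales (sale_year INTEGER, amount REAL)")
conn.execute("INSERT INTO sales VALUES (2023, 100.0)")  # the load step

# After loading, switch the connection to query-only mode.
conn.execute("PRAGMA query_only = ON")

try:
    conn.execute("UPDATE sales SET amount = 0")
    write_rejected = False
except sqlite3.Error:
    # The update is refused because the connection is query-only.
    write_rejected = True

# Reading for analysis still works.
rows = conn.execute("SELECT sale_year, amount FROM sales").fetchall()
print(write_rejected, rows)
```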

Time-variant:

Data is stored with a time dimension so that the business can be compared across time

 Time

Year

Quarter

Month

Week

Day
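
The time levels above can be derived from a calendar date. The following is a minimal sketch using Python's standard datetime module; the function name is hypothetical, and the use of ISO week numbers is an assumption, since the week convention depends on the business calendar.

```python
from datetime import date


def time_attributes(d):
    """Break a calendar date into the levels of the time hierarchy."""
    return {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,  # months 1-3 -> 1, 4-6 -> 2, ...
        "month": d.month,
        "week": d.isocalendar()[1],  # ISO week number (an assumed convention)
        "day": d.day,
    }


print(time_attributes(date(2024, 5, 15)))
```

With attributes like these stored for every row, results can be grouped by year, quarter, or month and compared across periods.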

Data mart:

It is a subset of a DWH that addresses a single business process

Ex: Sales, Marketing, Finance

       It is a high-performance query structure (HPQS)

       Fast retrieval of data
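
The idea of a data mart as a subset of the DWH can be sketched as a simple filter. The rows, field names, and function name below are made up for illustration; an actual data mart would normally be built as its own set of tables by an ETL tool.

```python
# Hypothetical warehouse rows; the field names are made up for this sketch.
warehouse_rows = [
    {"business_process": "Sales", "region": "USA", "amount": 120.0},
    {"business_process": "Marketing", "region": "INDIA", "amount": 45.0},
    {"business_process": "Sales", "region": "GERMANY", "amount": 80.0},
    {"business_process": "Finance", "region": "USA", "amount": 300.0},
]


def build_data_mart(rows, business_process):
    """Return only the rows that belong to one business process."""
    return [row for row in rows if row["business_process"] == business_process]


sales_mart = build_data_mart(warehouse_rows, "Sales")
total_sales = sum(row["amount"] for row in sales_mart)
print(len(sales_mart), total_sales)
```

Because the mart holds only the rows for one business process, queries against it scan less data, which is one reason retrieval is fast.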

 
