Skip to main content

Data Transformation



Data      Transformation:

Transforming the data from one format to client required format/business standard format

-A staging is a temporary memory or buffer where the following data transformation activities takes place

        Data merging

        Data Cleansing

        Data scrubbing

        Data aggregation

 

 

Data Merging: It’s a process of integrating the data from multiple

data sources into a single operational data set

         Sources can be homogenous/heterogenous

Two types of merging:

a)Horizontal Merging: Process of joining the records horizontally

when the 2 sources are having different data definitions

Ex: Source1---Emp Table

Empno-Ename-Sal-Deptno

Source2----Dept Table

Deptno-Dname-Loc

b)Vertical Merging: Process of joining the records vertically when the 2

sources are having similar data structures

 

Data cleansing: is a process of converting inconsistencies into

consistencies and removing unwanted data

Data scrubbing: is a process of delivering new data

definitions according to the user requirements

Data aggregation: is a process of calculating the

summaries from the detailed data

         Data scrubbing is applied for each record

         Data aggregation is applied for a group of records

Data Loading: Dumping/inserting of the processed data into the target system(DWH)

2 types of data loads:

a)Initial/Full load: is a process of loading the data into an empty

target table

b)Incremental/Delta load: is a process of loading only new

records/changed records, which takes place after initial load

         First we load dimensions

         Next we load facts

 

Comments

Popular posts from this blog

Data modeling

Data Modeling: Model: Model is business representation of information in one or more data sources OLTP                                    DWH   Model          Model                 Model E-R Model    (modify wih ETL)       Dimensional model (Design based on              (Design based on facts and Measures) Entities&Relationships)   Dimensional Modeling: Is a design methodology for designing a DWH It consists of following 3phases to design the Database 1)Conceptual Modeling: •          Understanding the requirements •     ...

FM workflow diagram

FM workflow process diagram: A windows based tool used to design metadata Models                        import Datasource------ à FM design and create the project Prepare the metadata project(Presentation layer)--Prepare  the project business view--Create & manage package---setup  security— Publish package---Content store(Metadata)---Reporting interface   Metadata source-Cognos Application •        Cognos impromptu •        Cognos 10.1 model •        Cognos Architect •        Decision stream •        Data Manager •        IBM Data Source   Third party metadata source: •        ERWIN •     ...

Book on IBM Cognos 10 written by me-

Here is a book written by me on IBM Business Intelligence Tool COGNOS which is published at the following link- http://www.amazon.com/dp/B00KKSVGPA/ref=rdr_kindle_ext_tmb Anyone who is having prior knowledge in database like oracle can easily learn to develop the business reports by following the screenshot assisted examples. This book will be a good resource for the students who wish to self learn IBM Cognos software.