Data Warehousing

Concepts

  • What is Data Warehouse?
    • Data Warehouse is like a relational database designed for analytical needs
    • It functions on the basis of OLAP (Online Analytical Processing)
    • It is a central location where consolidated data from multiple locations (databases) are stored
    • Data Warehousing is the act of organizing & storing data in a way so as to make its retrieval efficient and insightful
    • It’s also called as the process of transforming data into formation
    • Data Sources
      • Database
      • Flat Files
    • Staging Area
      • Staging Database
    • Data Warehouse
      • Meta Data
      • Aggregate Data
      • Raw Data
    • Data Marts
  • OLAP
    • OLAP is a flexible way for you to make complicated analysis of multidimensional data
    • DW is modeled on the concept of OLAP. DBs are modeled on the concept of OLTP (Online Transaction Processing).
    • Opens up new views of looking at data
    • Supports filtering/sorting of data
    • Data can be refined
  • Types of OLAP Cubes
    • MOLAP – Multidimensional OLAP
    • ROLAP – Relational OLAP
    • HOLAP – Hybrid OLAP
  • OLAP Operations
    • Roll-up
      • Roll-up performs aggregation on a data cube by either
        • Climbing up a concept hierarchy for a dimension
        • Dimension reduction
    • Drill-down
      • Drill-down is the revers operation of roll-up. It is performed by either
        • Stepping down a concept hierarchy for a dimension
        • Introducing a new dimension
    • ¬†Slice
      • The Slice operation provides a new sub-cube from one particular dimension in a given cube
    • Dice
      • The Dice operation provides a new sub-cube from two or more dimensions in a given cube
    • Pivot
      • The Pivot operation is also known as rotation operation. It transposes the axes in order to provide an alternative presentation of data.
  • Dimensions, Facts & Measures
    • Dimensions
      • The tables that describe the dimensions involved are called Dimension tables
      • Dividing a Data Warehouse project into dimensions provides structured information for analysis & reporting
    • Facts & Measures
      • A fact is a measure that can be summed, averaged or manipulated
      • A Fact tables contains 2 kinds of data – a dimension key and a measure
      • Every dimension table is linked to a Fact Table
  • Schema
    • A schema give the logical description of the entire data base.
    • It gives details about the constrains places on the tables, key values present & how the key values are linked between the different tables.
    • A database use relational model, while a data warehouse uses Star, Snowflake and Fact Constellation schema.

Videos

Reference