DATA WAREHOUSE FREQUENTLY ASKED QUESTIONS

Data Warehouse Qestions
  • Differences between OLTP and OLAP?
  • IT applications
  • a) OLTP
  • Purpose: Day-Day business storage
  • Process: Front-End, Back-End process
  • Points:
  • a) Data: Less data
  • b) Model: Noramlized model (more tables,smaller tables)
  • c) Joins: More joins (So retrieval it is slower)
  • d) Access Frequency: High [Milliseconds to seconds]
  • e) Volatality: Volatile [Read, Ins, update, Delete]
  • b) OLAP
  • Purpose: For analysis and mining
  • Process: Layered architecture
  • Points:
  • a) Data: More data
  • b) Model: DeNoramlized model (Less tables, complex tables)
  • c) Joins: Less joins (So retrieval it is faster)
  • d) Access Frequency: Low [Day/week/month...]
  • e) Volatality: NonVolatile [Read]
  • What is Datawarehouse [Single line] and warehouses available in the market?
  • Huge storage area which is suitable for decision making.
  • Ex: Teradata,Oracle,Sqlserver,Sybase,Netezza, GreenPlum etc...
  • Difference between OLAP and BI?
  • BI is a process which uses OLAP approach to gather, convert and presents the data.
  • Difference between Analysis and Mining?
  • Analysis talks about 'current understanding of data' and 'past analysis
  • of data',where as mining talks about 'future prediction of data'.
  • Note: Datamining is a knowledge analysis and discovery method.
  • How many types of decision making available in IT?
  • a) Strategic-- Past analysis [OLAP data]
  • b)Tactical--Current and recent [OLTP]
  • Difference between OLTP and ODS?
  • OLTP holds realtime data where as ODS holds near real time data[daily 4 times from OLTP]
  • ODS helps in two ways
  • a) Input to DWH
  • b) Incase enterprise database [OLTP] fails, it acts like enterprise db and continue
  • operations.
  • Note: All top companies use ODS in the ETL flow.
  • OLTP similar terminologies?
  • a)Operational Systems
  • b)Enterprises databases
  • c)Business Processing Systems
  • d) Transactional systems
  • Can you describe an End-End typical ETL Flow?
  • OLTP---->ODS--->Stage Area[Files]-------->DWH[Stage]---->DWH[WorkArea]--->DWH[Target]
  • How many life cycles does a DWH project need?
  • Three life cycles a)ETL b) Semnatic c) Reporting
  • Famous DWH life cycles: Agile, iterative incremental [Block approach], V model
  • What is granularity and which grain your project has?
  • Granularity talks about the level of detailed data maintaining in the project.
  • Dwh has multiple tables with multiple grains.
  • Always the lowest grain has lot of burden but improves analytical performance.
  • Ex: Day/ week/month etc..grains
  • Day grain is recommended for better analysis
  • Choose the below are required to implement a DWH Project?
  • a) Life Cycle
  • b) Approach
  • c) Data Model
  • d) Type of OLAP
  • e) Characterstics/principles
  • f) Historical data
  • Correct; a, b, c, e
  • Explain DWH characterstics?
  • InmonCharacterstics:
  • a) Subject oriented: Store data based on operation but not application.
  • Ex: Savings coount subject area
  • Application2 : a)Withdraw b)deposit
  • Subject Area: withdraw+deposit single area
  • b) Integrated: Integrate from diverse applications by eliminating inconsistencies
  • and following standards.
  • c) Nonvolatile: No change of data. [Don't modify the data]
  • Helps to maintain history.
  • d) Timevarient: Data store base on timeframes [granularity load / grain]
  • We can perform the below operations easily.
  • a) Current understanding
  • b)Past analysis
  • c)Future predictions
  • What is datamart and how many types available?
  • DataMart--Data sub store for specific business/ operation / functionality
  • May or not not be a subset of DWH.
  • Three types of datamarts
  • a)Dependent-- DWH created first and then datamart [subset of DWH]
  • Ex: ICICI DWH, Savings account dependent datamart
  • b) Independent-- Directly created from source systems [we don't use DWH here]
  • Ex: ICICI employee payroll System [No history, not much analysis required]
  • c) Logical Datamrt--It is replica of another data mart [Holds structure of datamart]
  • How do we create datamarts practically?
  • a) By using Complex Views [Materialized views] --Dependent DM
  • b) By using a complex table --Depedent DM
  • c) By using a seperate physical storage area --Independent DM
  • How many types of DWH approaches available, which approach your company following?
  • Two approaches
  • a) KIMBall approach--Datamarts-> DWH [Bottom up approach]
  • b) Inmon approach-- DWH -->Datamarts [Top down approach]
  • My company using KImbal approach for DWH implementation because of dynamic
  • decisions and adhod requests.
  • How many types of OLAP available, which is effective?
  • OLAP--Online Analytical Processing
  • It create aggregates(sum,avg, max,minstddev,covarience etc...) for decision making.
  • ROLAP: Relational OLAP--Data and aggrgates in the relational area [OLTP area]
  • Latency time less [fresh data anlysis]
  • Slower analysis
  • Minimal set up
  • Ex: Small, Medium
  • MLOAP: Multidimensional OLAP-- Data and aggregates in mutidimensionalarea[ Cube]
  • Adv: More and detailed analysis this is helpful
  • Latency time is high [ fresh data analysis is not possible]
  • Complex set up
  • Ex: Corporates
  • HOLAP: Data in relational area and aggreagates in multidimensional area [cube]
  • Ex: Latency time avg
  • Avg set up
  • DOLAP: Desktop OLAP: Here the data and aggregates on PC based desk top applications
  • Ex: Excel, LOtus, FoxPro,VISICALC etc...
  • Explain Time hierarchy?
  • Hour-->Day-->Week-->Fort Night-->Month-->Quarter-->half Year(Semister)-->Year
  • You have two databases with same size, then dow do you identify which is OLTP or OLAP?
  • Normalized model is OLTP, Denormalized model is OLAP.