You have two databases with same size, then dow do you identify which is OLTP or OLAP?
Normalized model is OLTP, Denormalized model is OLAP
Need of Data warehouse?
1)To Analysis of data and history maintenance.
2)Companies require Strategic information to face the competition in market.
3)The Operation system are not designed for strategic information.
4)To maintain history of data for whole Organization and to have a single place where the entire data stored.
What is data warehousing and Explain Approaches?
Many companies follow either characteristic defined by W.H.Inmon or Sean kelly.
Subjected Oriented, Integrated, Non Volatile,Time Variant
b)Sean Kelly definition
Seperate, Available, Integrated,TimeStamped,Suject Oriented, Non Volatile, Accessible
There are two Approches
1.Top Down by Inmon
2.Bottom Up by Ralph kimbal
Inmon approach:Enterprise datawarehouse structured first and next Datamart created.(TopDown).
Ralph kimbal:Datamart designed first, later Datamarts to Datawarehouse designed.(BottomUp).
What are the responsibilities of a data warehouse consultant/professional?
The basic responsibility of a data warehouse consultant is to ‘publish the right data’.
Some of the other responsibilities of a data warehouse consultant are:
1. Understand the end users by their business area, job responsibilities, and computer
2. Find out the decisions the end users want to make with the help of the data warehouse.
3. Identify the ‘best’ users who will make effective decisions using the data warehouse
4. Find the potential new users and make them aware of the data warehouse.
5. Determining the grain of the data.
6. Make the end user screens and applications much simpler and more template driven.
What are fundamental stages of Data Warehousing?
a)Offline Operational Databases - Data warehouses in this initial stage are developed by simply copying the database of an operational system to an off-line server where the processing load of reporting does not impact on the operational system's performance.
b)Offline Data Warehouse - Data warehouses in this stage of evolution are updated on a regular time cycle (usually daily, weekly or monthly) from the operational systems and the data is stored in an integrated reporting-oriented data structure.
c)Real Time Data Warehouse - Data warehouses at this stage are updated on a transaction or event basis, every time an operational system performs a transaction (e.g an order or a delivery or a booking etc.)
d)Integrated Data Warehouse - Data warehouses at this stage are used to generate activity or transactions that are passed back into the operational systems for use in the daily activity of the organization.
What is Datamart Explain Types?
It is a specific Subject area or Functionality or Task.
It is Designed to facilitate end user Analysis.
Wrong Answer-- It is a subset of warehouse
Types of Datamarts
Dependent, Independent, Logical.
Dependent: Created from Datawarehouse as a seperate physical store.
Independent :Created directly from operational systems to a seperate physical store.
Logical: Exists as a subset of existing Datawarehouse.
TERADATA Frequently Asked Questions
Q: How many types of architectures available in Teradata?
Ans: Two types a) MPP(Cloud, Linear) b) SMP
Q: On what basis Architectures identitifed and who will identify?
Architectures identified by the Teradata Corporation Advanced Technical Team.
They consider No of current and increasing a) Users b) Traffic c) Workload etc...
Q: What is the full form of MPP and SMP?
Ans: MPP--Massively Parallel Processing
SMP: Symmetric Multi Processing
Q: What is the role of MPP Bynet and what are the other names we have?
MPP bynet provides communication between Nodes.
This can also be called as Hardware Bynet / Boardless Bynet
Q: How many nodes a Bynet support?
Ans: 4096 nodes in TD14
Q: What are the important features of MPP Bynet?
Ans: a) Fault Tolerance b) LOad Balancing c) High Scalability
Q: What is the role of SMP Bynet and what are the other names we have?
Ans: Communication between PE and AMP.
This can also be called as Software Bynet or Boardless bynet or Vitual Bynet
Q: What is VPROC?
Ans: VPROc contain set of instructions and perform group of operations.
Q: What is TPA?
Ans: Trusted Parallel Application. Since TD running under PDE it is called as TPA.
Q: What is Shared Nothing Architecture?
AMP to disk communication is one one one and other AMPs can't use the disks.
This is called as Shared Nothing Process.
Q: What are the important components a Node has, Brief them?
a) PE--> Parsing Engine-->Request taking and response delivering
b)Bynet-->Communication between PE and AMP
c)AMP-->Access Module Processor--> a) Perform SQL operations b) Storage of Table and row headers
d) VSS-->Virtual Storage System-->a) Avoids communication between AMP and Disk b) Multi temperature environment (Hot,Cool,Warm)
e) Vdisk-->Vitual Disk-->Storage Area
Q: Who generates execution plan?
Ans: Optimizer in PE
Q: How many sessions does PE support?
Ans: PE support at a time 120 sessions [ one user 120 request / 120 uers 120 requets]
Q: What is session?
Ans: Logical connection between user and server.
Created at server level after credentials validation.
Q: What kind plan does optimizer generate?
Least cost plan/ shortest route path plan
Q: What exactly AMP does?
A) DBMS operations [Joins,Sort, Aggregate etc...]
B) Holds Table header and Row header etc...
Q:What is uniform distribution and Skew distribution.?
More or less equal no of rows distribution is Uniform distribution.
Less uniform distribution is skewd distribution[ One AMP holds more records,
other AMP holds less records].
Q: What is Skewed AMP and Table?
Ans: AMP which has skewness is Skewed AMP and table which has skewness is Skewed table.
Q: What exatcly VSS does?
a) Avoids direct communication between AMP and disk
b) Maintains multi temperature environment [Hot, Cool and Warm]
Q: Can you tell me row storage navigation?
Q: What is VDisk?
AMP associated memory area is called as Vdisk.
Q: What is Cylinder?
Ans: It is physical disk inside Vdisk
Q: What is Rank?
Ans: The number of cylinders in an array is called as rank
Q:Who performs sql validations?
Q: Who performs Join, sort etc... operations?
Q: How many types of bynets available?
Ans: Two types
a) SMP bynet--Boardless bynet/ virtual bynet/ software bynet
b) MPP bynet--BoardOriented / Hardware bynet
Q: What is RASUI?
Ans: Reliable,Accessable,Servicable,Usable and Installable.
Q:How many Nodes and AMPs your project has?
Ans: Refer to class notes
Q: What is Session and How multiple sessions improve performance?
Ans: Session is the logical connection between user and server.
When a user logs on, session is established and closed at the time of logoff.
Multiple sessions improve parallelism of requests or data processing.