You have two databases with same size, then dow do you identify which is OLTP or OLAP?
Normalized model is OLTP, Denormalized model is OLAP
Need of Data warehouse?
1)To Analysis of data and history maintenance.
2)Companies require Strategic information to face the competition in market.
3)The Operation system are not designed for strategic information.
4)To maintain history of data for whole Organization and to have a single place where the entire data stored.
What is data warehousing and Explain Approaches?
Many companies follow either characteristic defined by W.H.Inmon or Sean kelly.
Subjected Oriented, Integrated, Non Volatile,Time Variant
b)Sean Kelly definition
Seperate, Available, Integrated,TimeStamped,Suject Oriented, Non Volatile, Accessible
There are two Approches
1.Top Down by Inmon
2.Bottom Up by Ralph kimbal
Inmon approach:Enterprise datawarehouse structured first and next Datamart created.(TopDown).
Ralph kimbal:Datamart designed first, later Datamarts to Datawarehouse designed.(BottomUp).
What are the responsibilities of a data warehouse consultant/professional?
The basic responsibility of a data warehouse consultant is to ‘publish the right data’.
Some of the other responsibilities of a data warehouse consultant are:
1. Understand the end users by their business area, job responsibilities, and computer
2. Find out the decisions the end users want to make with the help of the data warehouse.
3. Identify the ‘best’ users who will make effective decisions using the data warehouse
4. Find the potential new users and make them aware of the data warehouse.
5. Determining the grain of the data.
6. Make the end user screens and applications much simpler and more template driven.
What are fundamental stages of Data Warehousing?
a)Offline Operational Databases - Data warehouses in this initial stage are developed by simply copying the database of an operational system to an off-line server where the processing load of reporting does not impact on the operational system's performance.
b)Offline Data Warehouse - Data warehouses in this stage of evolution are updated on a regular time cycle (usually daily, weekly or monthly) from the operational systems and the data is stored in an integrated reporting-oriented data structure.
c)Real Time Data Warehouse - Data warehouses at this stage are updated on a transaction or event basis, every time an operational system performs a transaction (e.g an order or a delivery or a booking etc.)
d)Integrated Data Warehouse - Data warehouses at this stage are used to generate activity or transactions that are passed back into the operational systems for use in the daily activity of the organization.
What is Datamart Explain Types?
It is a specific Subject area or Functionality or Task.
It is Designed to facilitate end user Analysis.
Wrong Answer-- It is a subset of warehouse
Types of Datamarts
Dependent, Independent, Logical.
Dependent: Created from Datawarehouse as a seperate physical store.
Independent :Created directly from operational systems to a seperate physical store.
Logical: Exists as a subset of existing Datawarehouse.
SSIS Interview Questions ( Simple, Medium and Complex)
Q: What is SSIS?
Ans: It is Integration of three services[ETL, Windows Admin,Sqlserver Admin]
Q: What is the previous component of SSIS?
Ans: DTS[Data Tranmsformation Services]
Q: SSIS version differences.
Q: How many engines does SSIS use and explain the operations?
Ans: Two engines SSIS use a) SSIS Runtime Engine--Starts when pkg starts
2) Data Pipeline engine--Starts when DataFlow Task Starts
Q: What is inmemory process?
Ans: DataFlow opertation is Inmemory process. It retrives source data into
SSIS internal memory (inmemory) and perform operations (options) and loads
from the memory to Target.
Q: What is Solution, Project and Package?
Solution--Is a container for one or more projects
Project--Collection of packages and related files
Package--Collection of tasks and containers.
Q: Whare the components a package has and explain in single line?
It has five components.
a) Control flow-- controls various tasks and containers with precedence constraints
b)Data Flow--contains sources, targets and transformations
c) Event Handler--Perfoms action based on event
d) Parameters--Similar to variables but take user interaction, helpful in catalog deployment
e) Package Explorer--Single view of all components inside package
Q: Difference between control flow and data flow?
Ans: COntrol flow is the brain of apckage and controls various tasks and containers.
It has ETL,WindowsAdmin and SqlServer Admin tasks.
Data flow is a component inside control flow and it is heart of ETL process.
It transfors data from sources to destinations by doing operations.
Q: Difference between Variable and Parameter?
a)Take value and support user interaction at runtime
b)Scope is across package
a) Take value and does not support user interaction
b) Scope at Package/ Container/ Task Level
c) Variables of two types 1. System Defined 2. User Defined variables
Q: What is the difference between System defined and User defined variables?
System defined variables crated by the system at the time of package creation
Store under System namespace.
Generally they hold Audit information
Ex: Execution Time,Package Name, User Name etc
User defined variables crated by the user.
Store under User namespace.
Generally they hold user information
Three scopes a)Package b)Container c) Task
Ex: Numeric value 10
String Value MSBIClass etc...
Q: What are the extensions of Solution, Project,Package, Paramter and connection Strings?
Connection Manager: .conmgr
Q: When do we Miscellaneous folder in SSIS?
Ans: Generally Business Docs, LLDs, and other credential docs are maintained here.
Q: Difference between COntainer and Group?
Group is logical object which hold tasks and containers.
No properties associated to group.
Groups are ungrouped.
Container is physical object which hold tasks and containers.
Properties associated to container.
Containers support only deletion.
Q: How do yopu identify package and task functionality easily?
Ans: By using naming conventions and annotations[detailed descriptions]
Q: How many ways we can execute package ?
Ans: 8 ways [Materail]
Q: How many colors a package uses?
Ans: 5 colors
White--Ready to execute
Q: How many types of files SSIS support
Ans: 4 types a) Flatfile b)Excel c) XML d) Raw file
Note: Refer to material for explanation
Q: What is CSV file and how do you open it?
Ans: Comma Seperated Value file, we open in two ways
1. Flatfile 2. Excel
Q: How do you use csv file in SSIS?
Ans: Flatfile source
Q: Which extension of flatfile data you get mostly?
a) txt--text file
c) out--output file
d)csv--comma seperated value file
Q: What is delimiter and explain some dlimiters?
Ans: Data seperator is delimiter.
Ex: , | : $ & @ LF CRLF etc...
Q: What is CRLF?
Ans: Carriage return[First char of a line] line feed/ forward [New line]
New line first character--Enter key
Q: How many types of flatfiles avaulable?
Ans: Two types a) Fixed length b) Delimiter
Q: What is TextQualifier, how is it helpful?
Ans: Qulifies the text with some indicators.
Ex: 'vinay'--single quote is text qualifier
Q: How many settings we need to know before processing flatfile?
Ans: 7 settings [ refer to material]
Q: What is codepage, which code page is useful in realtime to process all types of data?
Ans: Encrypted charset is codepage,in realtime Unicode or UCS [2/4/8] suitable.
Q: Data loading how many ways in real time?
a) Truncate and load
b) Incremental load [it is of three types 1.Direct incremental 2. SCD load 3. CDC Load]
Q: What is Delta Load?
Ans: Differentiated records load is delta load.
Q: What is SCD?
Ans: Slowly changing dimension for implementung actions for new and modified records.
Q: What is CDC?
And: Change Data Capture for taking sources changes and doing some actions.
Source changes are Inserts,updates and deletes.
Q: Can you give alternate terminologies for source, target, import and export?
Ans: Source--> Data producer/provider/feed/input
Target-->Desintation/ Data consumer/ Data output
Q: Can yopu explain me Dataflow task features briefly?
Ans: Moves data from source to destination.
Uses internal buffer while processing [in memory process]
Each data flow task initiates one Execution Tree internally.
Q: Why DataFlow task runs faster than all other tasks in SSSI?
Ans: Seperate process [in memory process] and seperate engine.
SSRS Interview Questions ( Simple, Medium and Complex)
What is Report?
Data display component with visuals is called report