Teradata Interview Questions
Q.How do you Generate sequence at the time of Display?
Answer: By Using CSUM
Q.How do you Generate Sequence in Teradata?
Answer: By Using Identity Column 1-for storing purpose using identity. 2-for display purpose using csum.
Q.How do you load Multiple files to a table by using fast load scripts?
Answer: Loading statement in the script and Replace the file one by one in the script till last file and submit every time so that data appended in Amp Level.For the last file specify End Loading statement in the script and Run.so that it runs from Amp to table. know more at Teradata online training
Q.What is the difference between FastLoad and MultiLoad?
Answer: FastLoad uses multiple sessions to quickly load large amount of data on empty table. MultiLoad is used for high-volume maintenance on tables and views. It works with non-empty tables also. Maximum 5 tables can be used in MultiLoad.
Q.Which is faster?
Answer: FastLoad. Q.Difference between Inner join and outer join?
Answer: An inner join gets data from both tables where the specified data exists in both tables. An outer join gets data from the source table at all times, and returns data from the outer joined table ONLY if it matches the criteria.
Q.What is multi Insert?
Answer: Inserting data records into the table using multiple insert statements. Putting a semi colon in front of the key word INSERT in the next statement rather than terminating the first statement with a semi colon achieves it. Insert into Sales “select * from customer” ; Insert into Loan “select * from customer”;
Q.Is multi insert ANSI standard?
Answer: No.
Q.How do you create a table with an existing structure of another table with data and with no data?
Answer: Create table Customerdummy as Customer with data / with no data;
Q.What is the opening step in Basic Teradata Query script?
Answer: Logon tdipid/username, password. Q.You are calling a Bteq script, which drops a table and creates a table. It will throw an error if the table does not exist.
Q.How can you do it without throwing the error?
Answer: You can it by setting error level to zero before dropping and resetting the error level to 8 after dropping. You can do it like this ERRORLEVEL (3807) SEVERITY 0; DROP TABLE EMPLOYEE; ERRORLEVEL (3807) SEVERITY 8;
Q.Can you FastExport a field, which is primary key by putting equality on that key?
Answer: No.
Q.Did you write stored procedures in Teradata?
Answer: No, because they become a single amp operation and my company didn’t encourage that.
Q.What is the use of having index’s on table?
Answer: For faster record search.
Q.Did you use Query man or SQL assistance?
Answer: SQL assistant 6.1
Q.I am updating a table in Bteq. It has to update a large number of rows, so it’s really slow. What do you suggest?
Answer: In Teradata it is not recommended to update more than 1 million rows due to journal space problems, if it is less than that and it’s slow in the Bteq, you might want to add collect statistics statement before the update statement. know more at Teradata training
Q.Is it necessary to add? QUIT statement after a Bteq query when I am calling it in a Unix environment?
Answer: Not necessary but it is good to add a QUIT statement after a query.
Q.There is a column with date in it. If I want to get just month how It can be done? Can I use sub string?
Answer: Sub string is used with char fields. So it cannot be used. To extract month from a date column, ex select extract (month from ). Same thing for year or day. Or hour or minutes if it’s a time stamp (select extract (minute from column name).
Q.What’s the syntax of sub string?
Answer: SUBSTRING (string_expression, n1 )
Q.Did you use CASE WHEN statement. Can you tell us a little about it?
Answer: Yes. When a case has to be selected depending upon the value of the expression.
Q.While creating table my DBA has FALLBACK or NO FALLBACK in his DDL. What is that?
Answer: FALLBACK requests that a second copy of each row inserted into a table be stored on another AMP in the same cluster. This is done when AMP goes down or disk fails.
Q.My table got locked during MLOAD due to a failed job. What do I do to perform other operations on it?
Answer: Using RELEASE MLOAD. It removes access locks from the target tables in Teradata. It must be entered from BTEQ and not from MultiLoad.
To proceed, you can do RELEASE MLOAD
Q.How to find duplicates in a table?
Answer: Group by those fields and select id, count(*) from table group by id having count (*) > 1
Q.How to you verify a complicated SQL?
Answer: I use explain statement to check if the query is doing what I wanted it to do.
Q.How many tables can you join in V2R5
Answer: Up to 64 tables.
Q.Did u ever use UPPER Function?
Answer: UPPER Function is used to convert all characters in a column to the same characters in upper case.
Q.What does a LOWER Function do?
Answer: LOWER function is used to convert all characters in a column to the lower case characters.
Q.How do you see a DDL for an existing table?
Answer: By using show table command.
Q.Which is more efficient GROUP BY or DISTINCT to find duplicates?
Answer: With more duplicates GROUP BY is more efficient, if only a few duplicates exist DISTINCT is more efficient.
Q.Syntax for CASE WHEN statement?
Answer: CASE value_expression_1 WHEN value_expression_n THEN scalar_expression_n END;
Q.What’s the difference between TIMESTAMP (0) and TIMESTAMP (6)?
Answer: TIMESTAMP (0) is CHAR (19) and TIMESTAMP (6) is CHAR (26)Everything is same except that TIMESTAMP (6) has microseconds too.
Q.How do you determine the number of sessions?
Answer: Teradata performance and workloadClient platform type, performance and workloadChannel performance for channel attached systemsNetwork topology and performance for network attached systems.Volume of data to be processed by the application.
Q.What is node? How many nodes and AMPs used in your previous project?
Answer: Node is a database running in a server. We used 318 nodes and each node has 2 to 4 AMPS.
Q.What is a clique
Answer: Clique is a group of disk arrays physically cabled to a group of nodes.
Q.Interviewer explained about their project (Environment, nature of work)
Answer: Listen to them carefully so that at the end of the interview you can ask questions about the project when you are given a chance to ask questions.
Q.Tell us something about yourself?
Answer: Describe about your project experience, technical skill sets, hard working, good team player, self-learner and self-motivated.
Q.What is the best project you ever worked with and why it is best project?
Answer: All the projects I worked so far are best projects. I treat every project is equal and work hard for the success of the project.
Q.What makes a project successful and how you have contributed to the success of the project?
Answer: Good team members, technical knowledge of team members, hard work, sharing knowledge among the team, individual’s contribution to the project. Explain them that you posses all the skills you mentioned above.
Q.Have you worked under stress and how did you handle it?
Answer: Yes. Many times to deliver the project on schedule, we were under lot of pressure. During those times we work extra hours and help each other in the team to deliver the project on schedule. Team effort is key factor for the success of the project.
Q.What is the difference between FastLoad and MultiLoad?
Answer: FastLoad uses multiple sessions to quickly load large amount of data on empty table.MultiLoad is used for high-volume maintenance on tables and views. It works with non-empty tables also. Maximum 5 tables can be used in MultiLoad.
Q.Have you used procedures?
Answer: No. I have not used procedures. But I have expertise knowledge writing procedures. My company have not encouraged me to write procedures because it becomes single AMP operation, as such uses lot of resources and expensive in terms of resource and time.
Q.What is the purpose of indexes?
Answer: An index is a mechanism that can be used by the SQL query optimizer to make table access more performant. Indexes enhance data access by providing a more-or-less direct path to stored data and avoiding the necessity to perform full table scans to locate the small number of rows you typically want to retrieve or update.
Q.What is primary index and secondary index?
Answer: Primary index is the mechanism for assigning a data row to an AMP and a location on the AMP’s disks. Indexes also used to access rows from a table without having to search the entire table.Secondary indexes enhance set selection by specifying access paths less frequently used than the primary index path. Secondary indexes are also used to facilitate aggregate operations. If a secondary index covers a query, then the Optimizer determines that it would be less costly to accesses its rows directly rather than using it to access the base table rows it points to. Sometimes multiple secondary indexes with low individual selectivity can be overlapped and bit mapped to provide enhanced
Q.What are the things to considered while creating secondary index?
Answer: Creating a secondary index causes Teradata to build a sub-table to contain its index rows, thus adding another set of rows that requires updating each time a table row is inserted, deleted, or updated. Secondary index sub-tables are also duplicated whenever a table is defined with FALLBACK, so the maintenance overhead is effectively doubled.
Q.What is collect statistics?
Answer: Collects demographic data for one or more columns of a table, hash index, or join index, computes a statistical profile of the collected data, and stores the synopsis in the data dictionary. The Optimizer uses the synopsis data when it generates its table access and join plans. know more at Teradata online training from india
Q.Can we collect statistics on multiple columns?
Answer: Yes we can collect statistics on multiple columns.
Q.Can we collect statistics on table level?
Answer: Yes we can collect statistics on table level. The syntax is COLLECT STAT ON TAB_A;
Q.What is inner join and outer join?
Answer: An inner join gets data from both tables where the specified data exists in both tables.An outer join gets data from the source table at all times, and returns data from the outer joined table ONLY if it matches the criteria.
Q.When Tpump is used instead of MultiLoad?
Answer: TPump provides an alternative to MultiLoad for the low volume batch maintenance of large databases under control of a Teradata system. Instead of updating Teradata databases overnight, or in batches throughout the day, TPump updates information in real time, acquiring every bit of data from the client system with low processor utilization. It does this through a continuous feed of data into the data warehouse, rather than the traditional batch updates. Continuous updates results in more accurate, timely data. And, unlike most load utilities, TPump uses row hash locks rather than table level locks. This allows you to run queries while TPump is running. This also means that TPump can be stopped instantaneously. As a result, businesses can make better decisions that are based on the most current data.
Q.What is spool space and when running a job if it reaches the maximum spool space how you solve the problem?
Answer: Spool space is used to hold intermediate rows during processing, and to hold the rows in the answer set of a transaction. Spool space reaches maximum when the query is not properly optimized. Use appropriate conditions in WHERE clause of the query to limit the answer set.
Q.What is your level of expertise in using MS office suite?
Answer: Expert level. Using it for last 8 years for documentation.
Q.Have you used Net meeting?
Answer: Yes. Used net meeting for team meeting when members of the team geographically in different locations.
Q.Do you have any question?
Answer: What is the team size going to be? What is the current status of the project? What is the project schedule?
Q.What is your available date?Answer: Immediate. Or your available date for the project.
Q.How much experience you have in MVS?
Answer: Intermediate. In my previous two projects used MVS to submit JCL jobs.
Q.Have you created JCL script from scratch?
Answer: Yes. I have created JCL scripts from scratch while creating jobs in the development environment.
Q.Have you modified any JCL script and used?Answer: Yes I have modified JCL scripts. In my previous projects many applications were re-engineered so the existing JCL scripts were modified according to the company coding standards.Q.Rate yourself on using Teradata tools like BTEQ, Query man, FastLoad, MultiLoad and Tpump!Answer: Intermediate to expert level. Extensively using for last 4 years. Also I am certified in Teradata.Q.Which is your favorite area in the project?Answer: I enjoy every working on every part of the project. Volunteer my time for my peers so that I can also learn and contribute more towards the project success.Q.What is data mart?Answer: A data mart is a special purpose subset of enterprise data used by a particular department, function or application. Data marts may have both summary and details data, however, usually the data has been pre aggregated or transformed in some way to better handle the particular type of requests of a specific user community. Data marts are categorized as independent, logical and dependant data marts.Q.Difference between star and snowflake schemas?Answer: Star schema is De-normalized and snowflake schema is normalized. Q.Why are OLTP database designs not generally a good idea for a Data Warehouse?Answer: OLTP designs are for real time data and they are not normalized and pre-aggregated. They are not good for decision support systems.Q.What type of Indexing mechanism do we need to use for a typical data warehouse?Answer: Primary Index mechanism is the ideal type of index for data warehouse.Q.What is VLDB?Answer: Very Large databases. Please find more information on it.Q.What is real time data warehousing?Answer: Real-time data warehousing is a combination of two things: 1) real-time activity and 2) data warehousing. Real-time activity is activity that is happening right now. The activity could be anything such as the sale of widgets. Once the activity is complete, there is data about it. Data warehousing captures business activity data. Real-time data warehousing captures business activity data as it occurs. As soon as the business activity is complete and there is data about it, the completed activity data flows into the data warehouse and becomes available instantly. In other words, real-time data warehousing is a framework for deriving information from data as the data becomes available. know more at Teradata online courseQ.What is ODS?Answer: An operational data store (ODS) is primarily a "dump" of relevant information from a very small number of systems (often just one) usually with little or no transformation. The benefits are an ad hoc query database, which does not affect the operation of systems required to run the business. ODS’s usually deal with data "raw" and "current" and can answer a limited set of queries as a result.Q.What is real time and near real time data warehousing?Answer: The difference between real time and near real time can be summed up in one word: latency. Latency is the time lag that is between an activity completion and the completed activity data being available in the data warehouse. In real time, the latency is negligible whereas in near real time the latency is a tangible time frame such as two hours.
Comments
Post a Comment