Duplicates can be removed by using the Sort stage with the option Allow Duplicates = False.
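The idea behind sort-based duplicate removal can be sketched outside Datastage. The Python below is only an illustration of the technique, not Datastage code: the function name `sort_and_dedupe` and the sample rows are hypothetical. It sorts on a key and keeps the first row seen for each key value.

```python
from itertools import groupby
from operator import itemgetter


def sort_and_dedupe(rows, key):
    """Sort rows by `key` and keep only the first row for each key value."""
    # sorted() is stable, so rows with equal keys keep their input order.
    ordered = sorted(rows, key=itemgetter(key))
    # After sorting, duplicates are adjacent, so one pass is enough.
    return [next(group) for _, group in groupby(ordered, key=itemgetter(key))]


# Hypothetical sample data: two rows share id 2.
rows = [
    {"id": 2, "name": "b"},
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b-dup"},
]
deduped = sort_and_dedupe(rows, "id")
```

Because the rows are sorted first, duplicates sit next to each other and can be dropped in a single pass.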
12) What steps should be taken to improve Datastage jobs?
To improve the performance of Datastage jobs, we first have to establish baselines. Second, we should not rely on a single flow for performance testing. Third, we should work in increments. Next, we should evaluate data skew, and then isolate and solve the problems one at a time. After that, we should distribute the file systems to remove any bottlenecks. We should also leave the RDBMS out of the initial testing phase. Last but not least, we should understand and assess the available tuning knobs.
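As a rough illustration of what evaluating data skew means, the sketch below counts how many rows land in each partition and compares the busiest partition with the average. It is plain Python, not a Datastage utility; the function name `partition_skew`, the modulus partitioning rule and the sample keys are all assumptions made for the example.

```python
from collections import Counter


def partition_skew(keys, partitions):
    """Return (rows per partition, ratio of the largest partition to the mean)."""
    # Assumed partitioning rule for illustration: integer key modulo partition count.
    counts = Counter(key % partitions for key in keys)
    per_partition = [counts.get(p, 0) for p in range(partitions)]
    mean = sum(per_partition) / partitions
    ratio = max(per_partition) / mean if mean else 0.0
    return per_partition, ratio


# Hypothetical keys: the value 1 dominates, so one partition gets most of the rows.
per_partition, ratio = partition_skew([1, 1, 1, 1, 2, 3], partitions=3)
```

A ratio close to 1 means the rows are spread evenly. A much larger ratio means one partition is doing most of the work.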
13) Differentiate between Join, Merge and Lookup stage?
The three stages differ in how they use memory, in their input requirements and in how they treat records. Join and Merge need less memory than the Lookup stage, because Lookup holds its reference data in memory while Join and Merge work on sorted inputs.
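The memory difference can be seen in a small sketch. This is generic Python, not Datastage code, and the function names and sample rows are hypothetical. The lookup-style function loads the whole reference set into a dictionary. The join-style function assumes both inputs are already sorted on the key, with unique keys on each side, and holds only the current row from each input.

```python
def lookup_join(stream, reference, key):
    """Lookup-style: the entire reference set is held in memory."""
    table = {row[key]: row for row in reference}
    for row in stream:
        match = table.get(row[key])
        if match is not None:
            yield {**row, **match}


def sorted_merge_join(left, right, key):
    """Join-style: inputs are pre-sorted on `key`; only current rows are held.

    Assumes keys are unique within each input (a simplification).
    """
    left_it, right_it = iter(left), iter(right)
    l, r = next(left_it, None), next(right_it, None)
    while l is not None and r is not None:
        if l[key] < r[key]:
            l = next(left_it, None)
        elif l[key] > r[key]:
            r = next(right_it, None)
        else:
            yield {**l, **r}
            l, r = next(left_it, None), next(right_it, None)


# Hypothetical inputs, both sorted by "id".
orders = [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}, {"id": 4, "amt": 40}]
customers = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]

via_lookup = list(lookup_join(orders, customers, "id"))
via_join = list(sorted_merge_join(orders, customers, "id"))
```

Both approaches return the same matched rows here. The difference is the cost: the lookup needs memory proportional to the reference data, while the sorted join needs the inputs to be sorted first.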
14) Explain Quality stage?
Quality stage is also known as Integrity stage. It assists in integrating different types of data from various sources.
15) Define Job control?
Job control is best performed using Job Control Language (JCL). It is used to execute multiple jobs simultaneously, without using any kind of loop.
16) Differentiate between Symmetric Multiprocessing and Massively Parallel Processing?
In Symmetric Multiprocessing, the hardware resources are shared among the processors. The processors run under a single operating system and communicate through shared memory. In Massively Parallel Processing, each processor accesses its own hardware resources exclusively. This type of processing is also known as Shared Nothing, since nothing is shared. It is faster than Symmetric Multiprocessing.
17) What are the steps required to kill a job in Datastage?
To kill a job in Datastage, we have to kill the respective process ID.
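Killing a process by its ID can be sketched in Python as below. This is a generic operating-system example, not a Datastage command. The helper name `kill_process` is hypothetical, and the child process started here is only a stand-in for whatever process a job runs under. Finding the right process ID for a real job is outside the scope of this sketch.

```python
import os
import signal
import subprocess
import sys


def kill_process(pid):
    """Ask the operating system to terminate the process with this ID."""
    os.kill(pid, signal.SIGTERM)


# Stand-in for a long-running job: a child process that sleeps for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
kill_process(child.pid)
child.wait(timeout=10)  # reap the child so its exit status is recorded
```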
18) Differentiate between Validated and Compiled in Datastage?
In Datastage, validating a job means executing it. While validating, the Datastage engine verifies whether all the required properties are provided. By contrast, while compiling a job, the Datastage engine verifies whether all the given properties are valid.
19) How to manage date conversion in Datastage?
We can use the date conversion functions for this purpose, i.e. Oconv(Iconv(Fieldname,"Existing Date Format"),"Another Date Format").
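The nested call works in two steps: Iconv converts the value from its existing format into an internal date, and Oconv formats that internal date in the new format. The same two-step pattern can be sketched in Python. This is an analogy, not Datastage BASIC, and the function name `convert_date` and the format strings are assumptions chosen for the example.

```python
from datetime import datetime


def convert_date(value, existing_format, target_format):
    """Parse with the existing format, then re-emit in the target format."""
    internal = datetime.strptime(value, existing_format)  # plays the role of Iconv
    return internal.strftime(target_format)               # plays the role of Oconv


# Hypothetical example: US-style date to ISO-style date.
converted = convert_date("12/31/2023", "%m/%d/%Y", "%Y-%m-%d")
```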
20) Why do we use exception activity in Datastage?
If an unknown error occurs while the job sequencer is executing, all the stages after the exception activity are executed.