Sqoop Interview Questions and Answers For Freshers Part-4

31. In a Sqoop import command you asked for 8 parallel MapReduce tasks, but Sqoop runs only 4. What can be the reason?
The MapReduce cluster is configured to run at most 4 tasks in parallel, so the number of parallel tasks requested in the Sqoop command must be less than or equal to what the cluster allows.

32. What is the importance of the --split-by clause in running parallel import tasks in Sqoop?
The --split-by clause names the column whose values are used to divide the data into groups of records. These groups of records are then read in parallel by the MapReduce tasks.
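The idea of dividing a column's value range among tasks can be sketched in plain shell. This is a simplified illustration, not Sqoop's actual splitter: it assumes an integer split column, an even division of the range between its minimum and maximum, and a hypothetical helper named split_ranges.

```shell
#!/bin/sh
# split_ranges MIN MAX NUM_TASKS
# Prints one "lo hi" pair per task. Each task would read the rows
# where lo <= split_column < hi.
split_ranges() {
    min=$1
    max=$2
    tasks=$3
    span=$(( max - min ))
    i=0
    while [ "$i" -lt "$tasks" ]; do
        lo=$(( min + span * i / tasks ))
        if [ "$i" -eq $(( tasks - 1 )) ]; then
            hi=$(( max + 1 ))    # widen the last range so the maximum value is included
        else
            hi=$(( min + span * (i + 1) / tasks ))
        fi
        echo "$lo $hi"
        i=$(( i + 1 ))
    done
}

# A column with values from 0 to 1000, read by 4 tasks
split_ranges 0 1000 4
```

If the values in the split column are unevenly distributed, ranges of equal width will hold very different numbers of rows, which is why the choice of column matters for balanced parallel reads.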

33. What does this Sqoop command achieve?
$ sqoop import --connect <connect-str> --table foo --target-dir /dest
It imports the table foo from the database into the HDFS directory /dest.

34. What happens when a table is imported into an HDFS directory that already exists, using the --append parameter?
With the --append argument, Sqoop imports the data into a temporary directory and then renames the files into the normal target directory in a manner that does not conflict with existing filenames in that directory.
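The stage-then-rename behaviour can be imitated on a local filesystem. The sketch below is only a simulation under stated assumptions: it is not Sqoop's code, the helper append_parts is hypothetical, files are assumed to be named part-m-NNNNN, and existing files are assumed to be numbered contiguously from 00000.

```shell
#!/bin/sh
# append_parts STAGING_DIR TARGET_DIR
# Moves every part file from the staging directory into the target directory,
# renumbering each one so it never overwrites a file that is already there.
append_parts() {
    staging=$1
    target=$2
    # Assumption: existing files are numbered contiguously from 00000,
    # so the count of existing part files is the next free index.
    next=$(( $(find "$target" -name 'part-m-*' | wc -l) ))
    for f in "$staging"/part-m-*; do
        [ -e "$f" ] || continue    # nothing was staged
        mv "$f" "$target/$(printf 'part-m-%05d' "$next")"
        next=$(( next + 1 ))
    done
}

# Demo: the target already holds two files from an earlier import.
target=$(mktemp -d)
staging=$(mktemp -d)
touch "$target/part-m-00000" "$target/part-m-00001"
touch "$staging/part-m-00000" "$staging/part-m-00001"

append_parts "$staging" "$target"
ls "$target"
```

Writing to a staging directory first means a failed import leaves the existing target directory untouched; only the final rename step changes it.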

35. How can you control the mapping between SQL data types and Java types?
By using the --map-column-java property we can configure the mapping between individual columns and Java types.

Below is an example

$ sqoop import … --map-column-java id=String,value=Integer

36. How do you import only the updated rows from a table into HDFS using Sqoop, assuming the source has a last-update timestamp for each row?
By using the lastmodified incremental mode (--incremental lastmodified). Rows where the check column holds a timestamp more recent than the timestamp specified with --last-value are imported.
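A minimal sketch of such a command follows. The connection string and table name are borrowed from the other examples on this page, and the column name last_update_date and the timestamp are hypothetical. The script only assembles and prints the command so it can be reviewed before it is run against a real cluster.

```shell
#!/bin/sh
# Hypothetical values, for illustration only
CHECK_COLUMN="last_update_date"
LAST_VALUE="2012-11-09 00:00:00"

# Assemble the command as text; nothing is executed here.
CMD="sqoop import --connect jdbc:mysql://host/dbname --table EMPLOYEES"
CMD="$CMD --incremental lastmodified"
CMD="$CMD --check-column $CHECK_COLUMN"
CMD="$CMD --last-value \"$LAST_VALUE\""

echo "$CMD"
```

After each run, the next import would pass the timestamp of the previous run as the new --last-value so that only rows changed since then are picked up.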

37. What are the two file formats supported by Sqoop for import?
Delimited text and SequenceFiles.

38. Give a Sqoop command to import the columns employee_id, first_name and last_name from the MySQL table EMPLOYEES
$ sqoop import --connect jdbc:mysql://host/dbname --table EMPLOYEES \
--columns "employee_id,first_name,last_name"

39. Give a Sqoop command to run only 8 MapReduce tasks in parallel
$ sqoop import --connect jdbc:mysql://host/dbname --table table_name \
-m 8

40. What does the following command do?
$ sqoop import --connect jdbc:mysql://host/dbname --table EMPLOYEES \
--where "start_date > '2012-11-09'"
It imports only the rows of EMPLOYEES whose start_date is later than 2012-11-09, that is, the employees who joined after 9-Nov-2012.
