Thursday, August 13, 2020

"Target-Dir" vs "Warehouse-Dir" in Sqoop

This article explains the difference between the “--target-dir” and “--warehouse-dir” options in a Sqoop import.

Consider the two commands below.

Code-1: Usage of “Target-Directory”

sqoop import \
  --connect jdbc:mysql://localhost/empinfo \
  --username root \
  --password cloudera \
  --table emp \
  --target-dir /user/hive/warehouse/empinfo

Code-2: Usage of “Warehouse-Directory”

sqoop import \
  --connect jdbc:mysql://localhost/empinfo \
  --username root \
  --password cloudera \
  --table emp \
  --warehouse-dir /user/hive/warehouse/empinfo

Both commands import the “emp” table, and both create the “empinfo” folder under /user/hive/warehouse.

The difference is where the data lands. With “target-dir”, the emp data (part files) is stored directly in “empinfo”.

The path of the data will be /user/hive/warehouse/empinfo/part-m-00000.

With “warehouse-dir”, Sqoop creates a folder named “emp” (after the table) under “empinfo” and places the data in it.

The path of the data will be /user/hive/warehouse/empinfo/emp/part-m-00000.
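The rule above can be sketched as a small shell helper. This is only an illustration of the path logic described in this article, not part of Sqoop; the function name import_data_dir is hypothetical.

```shell
# Hypothetical helper (not a Sqoop command): prints the HDFS directory
# that would hold the part files, following the rule described above.
#   $1 = option used (--target-dir or --warehouse-dir)
#   $2 = directory passed to that option
#   $3 = table name
import_data_dir() {
  option="$1"
  base="${2%/}"   # drop a trailing slash, if any
  table="$3"
  case "$option" in
    --target-dir)    printf '%s\n' "$base" ;;            # data goes directly into the directory
    --warehouse-dir) printf '%s/%s\n' "$base" "$table" ;; # data goes into a sub-folder named after the table
    *) echo "unknown option: $option" >&2; return 1 ;;
  esac
}

import_data_dir --target-dir /user/hive/warehouse/empinfo emp
import_data_dir --warehouse-dir /user/hive/warehouse/empinfo emp
```

On a real cluster, listing the directory after the import (for example with hdfs dfs -ls) should show whether the part files sit in “empinfo” itself or in “empinfo/emp”.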

 

Note the below points:

  • Target-dir works only when you import a single table, so it cannot be used with “sqoop import-all-tables”.
  • Warehouse-dir creates the parent directory, and each imported table is stored under it in a folder named after the table.
  • If you import table by table with target-dir, you need to provide a distinct target directory for each import, because by default Sqoop will not import into a directory that already exists.
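As a rough sketch of the last point, the dry run below prints one import command per table, each with its own target directory. The function name print_import_commands and the second table name “dept” are assumptions made for illustration; the connection details are the ones used earlier in this article. Nothing is executed against Sqoop here, the commands are only printed.

```shell
# Dry run: print one "sqoop import" command per table, each with a
# distinct --target-dir. Table names other than "emp" are hypothetical.
#   $1 = base HDFS directory, remaining arguments = table names
print_import_commands() {
  base="${1%/}"   # drop a trailing slash, if any
  shift
  for table in "$@"; do
    echo "sqoop import --connect jdbc:mysql://localhost/empinfo --username root --password cloudera --table $table --target-dir $base/$table"
  done
}

print_import_commands /user/hive/warehouse/empinfo emp dept
```

Using the table name as the last path component gives the same layout that warehouse-dir produces on its own, which is why warehouse-dir is usually the simpler choice when several tables go under one parent folder.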

Hope you like this article.

