This is the continuation of the "Sqoop Complete Tutorial" series. If you want to read Part-1, please click here. The following topics are covered in this and the upcoming parts.
Importing data from MySQL to HDFS
- Importing a table from MySQL to the default location in HDFS
- Importing a table from MySQL to a user-specified location in HDFS
- Importing all the tables from MySQL to the default location in HDFS
- Importing all the tables from MySQL to a user-specified location in HDFS
- Importing all the tables but excluding a few tables from MySQL to HDFS
- Importing part of a table's data from MySQL to HDFS using a WHERE condition
- Importing a few columns of a table from MySQL to HDFS
- Importing and compressing a table's data to HDFS
- Importing data by specifying the column delimiter
- Importing data by splitting it based on a specific column
- Importing data from MySQL to HDFS based on a SQL query
- Incrementally importing data from MySQL into HDFS
- Importing and appending data
- Importing data from MySQL and storing it in sequence file format in HDFS
- Reading the contents of the source database without importing it
- Getting the list of databases
Importing data from MySQL to Hive
- Importing a table from MySQL to Hive's default database
- Importing a table from MySQL to a user-defined Hive database
- Importing all the tables from MySQL to Hive
- Importing all the tables but excluding a few from MySQL to Hive
- Importing a table's structure from MySQL to Hive without the data
- Importing a table from MySQL to Hive with a specific delimiter
- Sqoop - Creating a job
- Sqoop - Displaying the list of the jobs
- Sqoop - Displaying the properties of a job
- Sqoop - Executing the job
- Sqoop - Deleting the job
- Importing a table from MySQL to HBase
- Importing a table with selected columns from MySQL to HBase
OK, let us begin the exercises and go through the above-mentioned points one by one.
Importing data from MySQL to HDFS
1) Importing a table from MySQL to the default HDFS location.
sqoop import \
--connect jdbc:mysql://localhost/retail_db \
--table categories \
--username root \
--password cloudera \
-m 1
Explanation: "Sqoop Import" is the keyword to be used
while import. Sqoop is a JDBC utility to move data between Hadoop
Ecosystems and relational databases. Username and password must be provided to
connect to the RDBMS.
Assigning
mappers is the user's choice. Users can assign a number of mappers based on the
size of the data that is being imported. In case the password not to be
disclosed in the connection string, the user can simply pass -P argument
without providing any password. During the run-time, it will prompt you to
enter the password. Refer to below -
Importing a table from MySQL to the default HDFS location, hiding the password:
sqoop import \
--connect jdbc:mysql://localhost/retail_db \
--table categories \
--username root \
-P \
-m 1
The above command will create a directory named "categories" under the default HDFS location "/user/cloudera", and the table data will be stored there as a file (part-m-00000).
2) Importing a table from MySQL to a user-specified target location in HDFS.
sqoop import \
--connect jdbc:mysql://localhost/retail_db \
--table categories \
--username root \
--password cloudera \
--target-dir 'retail_db' \
-m 1
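Since 'retail_db' is a relative path, the data lands in a directory named retail_db under the user's HDFS home directory. If that target directory already exists from an earlier run, the import fails; one way to handle this is Sqoop's --delete-target-dir option, which removes the existing directory before importing. A minimal variant of the above command, shown here only as an optional sketch:
sqoop import \
--connect jdbc:mysql://localhost/retail_db \
--table categories \
--username root \
--password cloudera \
--target-dir 'retail_db' \
--delete-target-dir \
-m 1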
3) Importing all the tables from MySQL to the default HDFS location.
sqoop import-all-tables \
--connect jdbc:mysql://localhost/retail_db \
--username root \
--password cloudera
Note: All the tables will be placed under the user's HDFS home directory, one directory per table, with the data stored as part files.
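As a quick check (again assuming the HDFS home directory is /user/cloudera, as on the Cloudera QuickStart VM), listing the home directory should show one sub-directory per imported table:
hdfs dfs -ls /user/cloudera        # e.g. categories, customers, departments, ...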
4) Importing all the tables from MySQL to a user-specified target location in HDFS.
sqoop import-all-tables \
--connect jdbc:mysql://localhost/retail_db \
--username root \
--password cloudera \
--warehouse-dir 'retaildb'
Note: All the tables will be placed in the specified directory. The --target-dir option does not work with import-all-tables; --warehouse-dir must be used instead. No mapper count is assigned here because a single mapper is insufficient to import the retail_db database, whose volume is larger than the HDFS block size; Sqoop therefore falls back to its default of four mappers per table.
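With --warehouse-dir, Sqoop creates the given parent directory (here 'retaildb', resolved relative to the HDFS home directory) and places each table in its own sub-directory underneath it. A quick way to confirm the layout:
hdfs dfs -ls retaildb                # one sub-directory per table
hdfs dfs -ls retaildb/categories     # part files for the categories table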
Please click here for the next part.