Tuesday, September 29, 2020

HDFS Basic Commands

This article explores some basic Hadoop commands that help in day-to-day work.

Hadoop file system shell commands are modeled on their Unix/Linux counterparts, so anyone who is comfortable with the Unix shell will find them easy to pick up. These commands work with HDFS and with the other file systems that Hadoop supports.

1) List the contents of a directory

ls
lists the files in the current directory of the local file system

hadoop fs -ls
lists the contents of the current user's HDFS home directory (/user/cloudera/ in the Cloudera VM)

hadoop fs -ls /
lists the files and sub-directories of the HDFS root directory.

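hdfs dfs -ls
also lists the current user's HDFS home directory. To list the root directory, use hdfs dfs -ls /

Note: hadoop fs works with any file system that Hadoop supports (HDFS, the local file system, and others), while hdfs dfs is specific to HDFS. The older hadoop dfs form is deprecated in favor of hdfs dfs.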

hadoop fs -ls /user/cloudera
/user/cloudera is the default HDFS home directory in the Cloudera VM, where the user's files are copied.

hadoop fs -ls -R / 
recursively displays entries in all subdirectories of a path
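
The listing commands above print files and directories together. As a minimal sketch, assuming the usual listing layout (permissions in the first column, path in the last), the output can be filtered down to directories only. The function name only_dirs is hypothetical and not part of Hadoop, and paths containing spaces are not handled.

```shell
#!/bin/sh
# Print only the directory paths from `hadoop fs -ls` style output read on stdin.
# Assumption: permissions are the first field and the path is the last field.
# only_dirs is a hypothetical helper name, not a Hadoop command.
only_dirs() {
  awk '$1 ~ /^d/ { print $NF }'
}

# Example call, executed only when the hadoop client is installed.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -ls / | only_dirs
fi
```

The "Found N items" header line is skipped automatically because its first field does not start with d.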

2) Create or delete a directory

hadoop fs -mkdir /path/directory_name
mkdir is the command to create a folder/directory in a given path. 

Example:
hadoop fs -mkdir testdir1
hadoop fs -mkdir /user/cloudera/testdir2

hadoop fs -rm -r /path/directory_name
-rm -r deletes a folder/directory together with everything in it. Plain -rm deletes a specific file. The older -rmr form is deprecated in favor of -rm -r.

Example:
hadoop fs -rm /user/cloudera/testdir2/file1.txt
hadoop fs -rm -r /user/cloudera/testdir2

Note: If the NameNode is in safe mode, HDFS is read-only and you will not be able to create or delete any directories.

To check the status of safe mode
hdfs dfsadmin -safemode get

To turn safe mode ON
hdfs dfsadmin -safemode enter

To turn safe mode OFF (leave safe mode)
hdfs dfsadmin -safemode leave
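
The safe mode check can be combined with directory creation in a script. The sketch below uses hdfs dfsadmin, the current form of the dfsadmin command, and assumes its output contains the text "Safe mode is ON" when safe mode is active. The helper names safemode_is_on and safe_mkdir are hypothetical.

```shell
#!/bin/sh
# Succeed (exit 0) when the text on stdin reports that safe mode is ON.
# Assumption: `hdfs dfsadmin -safemode get` prints a line containing
# "Safe mode is ON" or "Safe mode is OFF".
safemode_is_on() {
  grep -q 'Safe mode is ON'
}

# Create an HDFS directory only when the NameNode is not in safe mode.
# safe_mkdir is a hypothetical helper name, not a Hadoop command.
safe_mkdir() {
  if hdfs dfsadmin -safemode get | safemode_is_on; then
    echo "NameNode is in safe mode; not creating $1" >&2
    return 1
  fi
  hadoop fs -mkdir -p "$1"
}

# Usage (needs a running cluster):
#   safe_mkdir /user/cloudera/testdir1
```

The -p option creates missing parent directories and does not fail when the directory already exists.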


3) Copy a file from the local system to HDFS

hadoop fs -put <sourcefilepath> <destinationfilepath>

Examples:

hadoop fs -put Desktop/Documents/emp.txt /user/cloudera/empdir

hadoop fs -copyFromLocal Desktop/Documents/emp.txt /user/cloudera/emp.txt
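
By default -put fails when the destination file already exists. As a minimal sketch, a script can check for the destination first with hadoop fs -test -e and report it clearly. The helper name put_if_absent is hypothetical, and the sketch assumes the destination is given as a full file path rather than a directory.

```shell
#!/bin/sh
# Upload a local file to HDFS unless the destination path already exists.
# $1 = local source file, $2 = full HDFS destination path.
# put_if_absent is a hypothetical helper name, not a Hadoop command.
put_if_absent() {
  # `hadoop fs -test -e` exits with 0 when the path exists.
  if hadoop fs -test -e "$2"; then
    echo "$2 already exists in HDFS; not overwriting" >&2
    return 1
  fi
  hadoop fs -put "$1" "$2"
}

# Usage (needs a running cluster):
#   put_if_absent Desktop/Documents/emp.txt /user/cloudera/emp.txt
```

To overwrite an existing file on purpose, hadoop fs -put -f can be used instead.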


4) Read the file

hadoop fs -cat /user/cloudera/emp.txt

The above command prints the whole file to the console. Avoid it for large files, since streaming the entire file generates heavy I/O. It is best suited to files with small data.
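
For a larger file, one common approach is to look at only the first few lines by piping -cat into head. This is a sketch, and the helper name preview_hdfs_file is hypothetical. When head stops reading, the Hadoop client may print a write error message on stderr, which can usually be ignored in this situation.

```shell
#!/bin/sh
# Show only the first lines of an HDFS file instead of the whole file.
# $1 = HDFS path, $2 = number of lines to show (defaults to 10).
# preview_hdfs_file is a hypothetical helper name, not a Hadoop command.
preview_hdfs_file() {
  hadoop fs -cat "$1" | head -n "${2:-10}"
}

# Usage (needs a running cluster):
#   preview_hdfs_file /user/cloudera/emp.txt 5
```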

5) Copy a file from HDFS to the local system

hadoop fs -get /user/cloudera/emp.txt Desktop/Documents/emp1.txt
hadoop fs -copyToLocal /user/cloudera/emp.txt Desktop/Documents/emp2.txt

This is the reverse of -put and -copyFromLocal.


6) Move a file from one HDFS location to another HDFS location

hadoop fs -mv emp.txt testDir

hadoop fs -mv testDir testDir2

hadoop fs -mv testDir2/testDir /user/cloudera

hadoop fs -mv testDir/emp.txt /user/cloudera
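
One thing to watch with -mv: when the destination is not an existing directory, the source is renamed to that path instead of being moved into it. A minimal sketch of a guard, using hadoop fs -test -d to confirm the destination directory exists first, is shown here. The helper name mv_into_dir is hypothetical.

```shell
#!/bin/sh
# Move an HDFS path into a directory, refusing when the target directory
# does not exist (otherwise -mv would rename the source to that path).
# $1 = HDFS source path, $2 = HDFS destination directory.
# mv_into_dir is a hypothetical helper name, not a Hadoop command.
mv_into_dir() {
  # `hadoop fs -test -d` exits with 0 when the path is a directory.
  if ! hadoop fs -test -d "$2"; then
    echo "$2 is not an existing HDFS directory" >&2
    return 1
  fi
  hadoop fs -mv "$1" "$2"
}

# Usage (needs a running cluster):
#   mv_into_dir emp.txt testDir
```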

7) Admin Commands

hdfs-site.xml is the configuration file that holds the HDFS settings.

To view the config settings, open the file browser and go to
Computer --> File System --> etc --> hadoop --> conf --> hdfs-site.xml

To change default configuration values such as dfs.replication or dfs.blocksize in hdfs-site.xml, open the file with sudo, since editing it requires root privileges

sudo vi /etc/hadoop/conf/hdfs-site.xml
Note: "vi" is the text editor used here to edit the file.

Press "i" to switch to insert (edit) mode.

Modify the values as per your requirement.

To save and exit, press Esc and type :wq!
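
After editing the file, the values that the Hadoop client sees can be read back with hdfs getconf -confKey. The sketch below prints a few keys in "key = value" form. The helper name print_hdfs_settings is hypothetical, and note that running services generally need a restart before they pick up a changed value.

```shell
#!/bin/sh
# Print each requested HDFS configuration key with its current value,
# as resolved from the client-side configuration files.
# print_hdfs_settings is a hypothetical helper name, not a Hadoop command.
print_hdfs_settings() {
  for key in "$@"; do
    printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key")"
  done
}

# Example call, executed only when the hdfs client is installed.
if command -v hdfs >/dev/null 2>&1; then
  print_hdfs_settings dfs.replication dfs.blocksize
fi
```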

hadoop fs -tail [-f] <file>

The hadoop fs -tail command prints the last 1 KB of a file to the console (stdout). With the -f option it keeps printing appended data as the file grows, as in Unix.

