Monday, September 28, 2020

Difference between CopyFromLocal, Put, CopyToLocal and Get

This article compares a few HDFS commands that behave almost identically but differ in subtle ways.


CopyFromLocal and Put: Both commands copy files from the local file system to HDFS. The Hadoop documentation describes "copyFromLocal" as similar to "put", except that its source is restricted to a local file reference. "Put" is slightly more general: besides local files and directories, it can read from standard input when the source is given as "-".

hadoop fs -put <Local system directory path or network path> <HDFS file path>

hadoop fs -copyFromLocal <Local system directory path>  <HDFS file path>

"Put" accepts several source paths (files or directories) in a single command and can also take its input from standard input, while copyFromLocal is documented as limited to a local file reference.

Both copyFromLocal and "put" accept the -f option to overwrite a destination file that already exists. Without -f, the command returns an error if the destination file already exists.
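
The overwrite option can be illustrated with a small shell helper. This is only a minimal sketch: the function name hdfs_upload, the DRY_RUN variable, and the example paths are hypothetical and not part of Hadoop, and it assumes the -f option of "hadoop fs -put" as documented for Hadoop 2.x and later. With DRY_RUN=1 (the default here) the command is only printed, so the sketch can be tried without a cluster.

```shell
#!/bin/sh
# Hypothetical helper: upload a local file to HDFS, optionally overwriting.
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"

hdfs_upload() {
    # $1 = local source, $2 = HDFS destination, $3 = "force" to overwrite
    src="$1"
    dest="$2"
    if [ "${3:-}" = "force" ]; then
        # -f overwrites the destination if it already exists
        set -- hadoop fs -put -f "$src" "$dest"
    else
        # without -f, hadoop reports an error if the destination exists
        set -- hadoop fs -put "$src" "$dest"
    fi
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example paths are placeholders
hdfs_upload /tmp/sales.csv /user/demo/sales.csv
hdfs_upload /tmp/sales.csv /user/demo/sales.csv force
```

Replacing -put with -copyFromLocal in the helper should give the same result for a local source file.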

In short, anything you do with copyFromLocal, you can do with "put", but not vice-versa.

CopyToLocal and Get: These two commands are the reverse of "CopyFromLocal" and "Put": they copy files from HDFS to the local file system.
With copyToLocal, the destination is restricted to a local file reference. "Get" is not documented with that restriction.

Anything you do with copyToLocal, you can do with "get", but not vice-versa.

hadoop fs -get <HDFS file path> <Local system directory path>

hadoop fs -copyToLocal <HDFS file path> <Local system directory path>
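
Since both download commands take the same arguments, they can be wrapped in one helper. Again, this is just a sketch under assumptions: hdfs_download, DRY_RUN, and the paths are hypothetical names chosen for illustration, and with DRY_RUN=1 (the default here) the command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: download from HDFS with either "get" or "copyToLocal".
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"

hdfs_download() {
    # $1 = subcommand (get or copyToLocal), $2 = HDFS source, $3 = local destination
    case "$1" in
        get|copyToLocal) ;;
        *) echo "unsupported subcommand: $1" >&2; return 2 ;;
    esac
    set -- hadoop fs "-$1" "$2" "$3"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example paths are placeholders
hdfs_download get /user/demo/sales.csv /tmp/sales.csv
hdfs_download copyToLocal /user/demo/sales.csv /tmp/sales_copy.csv
```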

