This article gives an overview of the data types available in both Apache Hive and Impala.
TINYINT - 1 byte
Range: -128 to 127
SMALLINT - 2 bytes
Range: -32,768 to 32,767
INT - 4 bytes
Range: -2,147,483,648 to 2,147,483,647
BIGINT - 8 bytes
Range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
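The four ranges above all follow from the byte width, assuming the usual two's-complement signed representation. A minimal Python sketch (the helper name signed_range is only illustrative) that recomputes them:

```python
# Byte widths of the integer types listed above.
INTEGER_WIDTHS = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def signed_range(num_bytes: int) -> tuple[int, int]:
    """Return (min, max) for a two's-complement signed integer of this width."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, width in INTEGER_WIDTHS.items():
    low, high = signed_range(width)
    print(f"{name}: {low:,} to {high:,}")
```

A value outside the range of a type needs the next wider type; for example, 40,000 does not fit in a SMALLINT but fits in an INT.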
FLOAT - 4 bytes
Single-precision floating-point number
DOUBLE - 8 bytes
Double-precision floating-point number
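To get a feel for the difference between 4-byte and 8-byte floating point, here is a small Python sketch. It assumes FLOAT and DOUBLE behave like IEEE 754 single and double precision, and it uses Python's struct module rather than Hive or Impala themselves; the helper names are made up for illustration.

```python
import struct


def as_float32(value: float) -> float:
    """Round-trip a value through a 4-byte IEEE 754 single-precision encoding."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def as_float64(value: float) -> float:
    """Round-trip a value through an 8-byte IEEE 754 double-precision encoding."""
    return struct.unpack("<d", struct.pack("<d", value))[0]


x = 0.1
print(as_float32(x) == x)  # False: precision is lost in 4 bytes
print(as_float64(x) == x)  # True: Python floats are already 8-byte doubles
```

Neither width can represent 0.1 exactly, which is one reason a fixed-point type is usually preferred for values such as money.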
DECIMAL
Hive 0.13.0 introduced user-definable precision and scale, declared as DECIMAL(precision, scale).
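Precision is the total number of digits and scale is the number of digits after the decimal point. The Python sketch below models that idea with the standard decimal module. The function to_decimal is hypothetical, and the rounding mode and the use of None for values that do not fit are assumptions made for illustration; check the Hive or Impala documentation for the exact behaviour of your version.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_decimal(value: str, precision: int, scale: int):
    """Round value to `scale` fractional digits.

    Returns None when the rounded value needs more than `precision`
    digits in total (an assumed stand-in for an overflow result).
    """
    quantum = Decimal(1).scaleb(-scale)  # scale=2 gives Decimal('0.01')
    rounded = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    total_digits = len(rounded.as_tuple().digits)
    return rounded if total_digits <= precision else None


print(to_decimal("123.456", 5, 2))  # 123.46
print(to_decimal("1234.5", 5, 2))   # None: 1234.50 needs 6 digits
```

With precision 5 and scale 2, at most three digits remain for the integer part.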
STRING
In Impala, the hard limit on the size of a STRING, and on the total size of a row, is 2 GB.
When writing to Parquet files, the limit on a STRING is 1 GB.
TIMESTAMP
Timestamps were introduced in Hive 0.8.0. They support the traditional UNIX timestamp with optional nanosecond precision.
The supported timestamp format is yyyy-mm-dd hh:mm:ss[.f…].
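As an illustration of that format, the Python sketch below splits a timestamp string into its date-time part and an optional fraction, read as nanoseconds. The function parse_timestamp is hypothetical, and the cap of nine fractional digits is an assumption derived from nanosecond precision. Python's own %f directive stops at microseconds, so the fraction is handled by hand.

```python
from datetime import datetime


def parse_timestamp(text: str):
    """Return (datetime, nanoseconds) for 'yyyy-mm-dd hh:mm:ss[.f...]'."""
    base, _, fraction = text.partition(".")
    moment = datetime.strptime(base, "%Y-%m-%d %H:%M:%S")
    if fraction and (not fraction.isdigit() or len(fraction) > 9):
        raise ValueError(f"bad fractional seconds: {fraction!r}")
    # Pad on the right so that '.123' means 123,000,000 ns.
    nanos = int(fraction.ljust(9, "0")) if fraction else 0
    return moment, nanos


print(parse_timestamp("2014-03-01 12:30:45.123"))
print(parse_timestamp("2014-03-01 12:30:45"))
```

The fraction is optional, so both calls succeed; the second one reports zero nanoseconds.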
Complex types:
Complex types (also referred to as nested types) in Hive let you represent multiple data values within a single row/column position. Impala 2.3 and higher supports the complex types ARRAY, MAP, and STRUCT.
Arrays: ARRAY<data_type>
An ordered collection of elements that all have the same type
Maps: MAP<primitive_type, data_type>
A collection of key-value pairs
Structs: STRUCT<col_name : data_type [COMMENT col_comment], …>
A collection of named fields that may have different types
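For readers more used to general-purpose languages, the three complex types map loosely onto a list, a dictionary, and a record. The Python sketch below is only an analogy: the row layout, column names, and values are invented for illustration and do not come from any real table.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """Stands in for STRUCT<street : STRING, zip_code : INT>."""
    street: str
    zip_code: int


@dataclass
class CustomerRow:
    """One hypothetical row with three complex-typed columns."""
    phones: list[str]        # ARRAY<STRING>: same-typed elements, by position
    visits: dict[str, int]   # MAP<STRING, INT>: values looked up by key
    address: Address         # STRUCT: named fields of different types


row = CustomerRow(
    phones=["555-0100", "555-0101"],
    visits={"2024-01": 3, "2024-02": 5},
    address=Address(street="1 Main St", zip_code=12345),
)

print(row.phones[1])       # array element by index
print(row.visits["2024-01"])  # map value by key
print(row.address.street)  # struct field by name
```

The access patterns are the point here: arrays are read by position, maps by key, and structs by field name.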