Showing posts with label Regular Expressions Search in Hive (RLIKE). Show all posts
Showing posts with label Regular Expressions Search in Hive (RLIKE). Show all posts

Friday, August 14, 2020

Regular Expressions - RLIKE in Hive

Either SQL Developers or Data Analysts often use arithmetic operators in their queries such as =, >, <, !=

Apart from this, LIKE, EXISTS, IN, NOT IN, NOT EXISTS, etc. will also be used very frequently. 

These are all will help in fetching only the required data.

There is a relational operator (RLIKE) which might not be used the way the above referred to used. But it is helpful in searching the string with similar text the way LIKE operator does in SQL. Both MySQL and Hive provides RLIKE operator that can be used for searching Advanced Regular Expressions.

REGEXP_LIKE in Oracle is equivalent to RLIKE.

Let us see what RLIKE does in Hive.

The below is my “test” table data.

Id

somecol

101

123xyz

102

1234

103

3456

104

abcde

105

Vxyz

106

Bbc


select * from
test where somecol RLIKE '[0-9]+';

This will return all the numbers from the column. If the column value has “123xyz” then it will also be returned. In case if we want to eliminate text contained values and return only numbers then use the following statement.

select * from test where somecol RLIKE '[0-9]+$';

Or you can use-

select * from test where somecol RLIKE '^[0-9]+$';

If you want to retrieve the rows that has no numbers:

SELECT * FROM test WHERE somecol RLIKE '([a-z]|[A-Z])+' 

If you want to retrieve the rows that has numbers with decimal points:

SELECT * FROM test WHERE somecol RLIKE '[0-9]+[.][0-9]+'


Please leave some comments if you need more clarification. 


Big Data & SQL

Hi Everybody, Please do visit my new blog that has much more information about Big Data and SQL. The site covers big data and almost all the...