Often, some requests for reports seem simple and straightforward. But it is challenging to find a better solution to be applied. Cloudera's Impala provides several features, and in order to accomplish the tasks in the best possible way, an analyst needs to be aware of them.
Assume that, after filtering out specific values, there are two separate result sets that need to be combined into a single output with distinct values. How would you do that?
Let's see with an example.
Below are two sample datasets:
Now the requirement is -
i) Get the products that cost over 500 AED from "DXB_Products" table
ii) Get the products from SHJ_Products that cost less than 500 AED.
iii) Now combine both the resultsets and filter-out unique products.
Common table expression is the best way to get the desired output. Here's the query -
WITH CTE1 AS (
SELECT * FROM DXB_Products WHERE Price > 500
),
),
CTE2 AS (
SELECT * FROM SHJ_Products WHERE Price > 500
)
SELECT * FROM SHJ_Products WHERE Price > 500
)
SELECT * FROM CTE1 UNION SELECT * FROM CTE2;
Hope you find this article useful.