Wednesday, August 12, 2020

Concatenate rows (group concatenation) in MySQL, Hive, SQL Server and Oracle

The CONCAT() function joins the input strings or different column values and returns as a single string. However, if we need to concatenate rows of strings (rows from the same column) into a single string with a desired separator, this will not work.

There comes a function that fulfills this requirement and this function known with different names in various RDBMS and Big Data Technologies like Hive.

==========

MySQL:

==========

GROUP_CONCAT() is a function which merges the data from multiple rows into one field. It is a GROUP BY function which returns a string. Comma (,) will be used to separate the values in the string.

Example:

SELECT StudentName,GROUP_CONCAT(Subjects)

FROM tbStudentInfo

GROUP BY StudentName;

==========

Hive:

==========

Hive doesn’t have the same functionality like in MySQL however there are two functions collect_set() and CONCAT_WS() to be used to get the desired output.

Separator has to be specified explicitly.

Example:

SELECT StudentName,CONCAT_WS(‘,’, collect_set(Subjects)) as Group_Concat

FROM tbStudentInfo

GROUP BY StudentName;

==========

SQL Server:

==========

STRING_AGG() is the function which is introduced in SQL Server 2017 which is equivalent to MySQL’s GROUP_CONCAT function. There is a workaround to achieve group concatenation using STUFFF() along with FOR XML and PATH() functions if you are using older versions.

SQL Server 2017: Example:

SELECT  StudentName,

              STRING_AGG(Subjects,';') Subjects

FROM tbStudentInfo

GROUP BY

    StudentName;

Prior to SQL 2017: Example:

SELECT  DISTINCT StudentName,

              STUFF((SELECT ','+ a.Subjects FROM tbStudentsInfo a

WHERE a.StudentName = b.StudentName

FOR XML PATH('')),1,1,'') AS Subjects

              FROM tbStudentsInfo b

==========

Oracle 11g:

==========

LISTAGG orders data within each group specified in the ORDER BY clause for a specified measure, and then concatenates the values of the measure column.

SELECT

    StudentName,

    LISTAGG(Subjects, ', ') WITHIN GROUP (ORDER BY Subjects) "Subjeccts"

FROM tbStudentInfo

GROUP BY StudentName



No comments:

Post a Comment

Big Data & SQL

Hi Everybody, Please do visit my new blog that has much more information about Big Data and SQL. The site covers big data and almost all the...