Data is the backbone of the business, and it can be in any sort of format. It can be lying in your relational databases or it can be a tweet in the twitter or a post in the Facebook or a message in the Instagram. Worldwide data is expected to hit 175 zettabytes by 2025 and over 80% of this data will be in an unstructured format due to social media and mobile applications.
Unstructured data usually refers to information that doesn't reside in a traditional row-column database. For example, data stored in XML and JSON documents, CSV files, and Excel files are all unstructured. RDBMSs like Oracle SQL*Plus, Microsoft SQL Server, MySQL are adopting the changes and introducing the functionalities that can handle unstructured data.
My recent articles focused on such newly introduced functions that either helps in analyzing the data, importing or converting the unstructured data into a readable and meaningful format.
In this article, we will see how a
Json formatted file be converted into usual row-column table format in SQL
Server 2016 using OpenJson
function. Json (JavaScript Object Notation) is open standard file format that consists
of attribute-value pairs and array data types. Look at the below example.
Filename: “test.json”
[{"Name":"Zafar
Iqbal","surname": "Farooq", "Age":30,
"Address":["142", "Street 29c",
"Rashidiya", "Dubai"]},
{"Name":"Tahir","surname":
"Farooq", "Age":24, "Address":["12C",
"Happy Street", "Al Quoz", "Dubai"]}]
DECLARE @Json VARCHAR(MAX)
SELECT @Json = BulkColumn
FROM OPENROWSET (BULK 'D:\MyDocuments\blog
posts\JSON-SQLServer\test.json', SINGLE_CLOB) AS j
SELECT * FROM OPENJSON (@Json)
WITH( Name
VARCHAR(20) '$.Name',
LastName VARCHAR(20) '$.surname',
Age INT '$.Age',
Building VARCHAR(10) '$.Address[0]',
Street VARCHAR(20) '$.Address[1]',
Area VARCHAR(20) '$.Address[2]',
City VARCHAR(20) '$.Address[3]')
Explanation:
- In the above code, we are importing
the “test.json” contents into a variable named @Json.
- OPENROWSET function is invoked to
read the data from a file. Since it is a table-valued function, it returns a
table with a single column and the column name must be “BulkColumn”.
- SINGLE_CLOB reads a file as
VARCHAR(MAX).
- “As” in the above code is to provide a correlation name for the
bulk rowset in the from clause. The alias name can be anything.
- Once we populated the json data
into a variable, we can proceed further to convert the data into rows and columns
by parsing it using OpenJSON function. OpenJSON function accepts JSON as a
parameter and it returns the key:value pairs of the first-level elements in the
JSON.
- Address field in the Json is an
array data type consists of building number, street, area and city which we
split during the query and the output is –