Monday, July 20, 2020

Analytical & Window Functions


Please refer to my previous post in which schema and data for EMP and DEPT tables available.

In this article we are about to discuss about SQL Server Analytical and Window functions. As stated in Microsoft docs, analytic functions calculate an aggregate value based on a group of rows. Unlike aggregate functions, however, analytic functions can return multiple rows for each group. Use analytic functions to compute moving averages, running totals, percentages or top-N results within a group.

These functions are common in most of the RDBMS applications and are widely used by the data and business analysts.

NTILE
It divides/distributes an ordered data set (or partition) into a specified number of groups which we call it buckets and assigns an appropriate (bucket) number to each row. The bucket number will represent each row to which bucket it belongs to.

In other words, it is used to divide rows into equal sets and assign a number to each row.

SELECT Ename, sal, NTILE(2) OVER (ORDER BY sal DESC) Bucket FROM Emp;


SELECT Ename, sal, NTILE(5) OVER (ORDER BY sal DESC) Bucket FROM Emp;


The following query retrieves the records from the first bucket.

SELECT * FROM (
SELECT          Ename,
sal, NTILE(4) OVER (ORDER BY sal DESC) Bucket
FROM Emp
) EmpAlias
WHERE Bucket=1;
ROW NUMBER:
This function represents each row with a unique and sequential value based on the column used in OVER clause. Here, we are having 10 rows in our Emp table and will use ROW_NUMBER on these records.

This can also be used to assign a serial /row number to the rows within the provided dataset.

SELECT DeptNo, sal, ROW_NUMBER() OVER (ORDER BY sal) AS row_num FROM emp;

SELECT 
     DeptNo, 
     sal, 
     ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY sal DESC) AS row_num 
FROM emp;

RANK:
This function is used to assign a rank to the rows based on the column values in OVER clause.

The row with equal values assigned the same rank with next rank value skipped.   

SELECT DeptNo, sal, RANK() OVER(ORDER BY sal DESC) AS rnk FROM emp;


SELECT   DeptNo, 
                 sal, 
                 RANK() OVER(PARTITION BY DeptNo ORDER BY sal DESC) AS rnk 
FROM emp;
DENSE_RANK: The DENSE_RANK analytics function used to assign a rank to each row. The rows with equal values receive the same rank and this rank assigned in the sequential order so that no rank values are skipped.

SELECT 
          DeptNo, 
          sal, 
          DENSE_RANK() OVER(PARTITION BY DeptNo ORDER BY sal DESC) AS dns_rnk 
FROM emp;
Let us use all the above functions in one query to see the difference in the results.

SELECT DeptNo AS dept, sal AS sal,
ROW_NUMBER() OVER (PARTITION BY DeptNo ORDER BY sal DESC) AS RowNumber,
RANK() OVER (PARTITION BY DeptNo ORDER BY sal DESC) AS iRank,
DENSE_RANK() OVER(PARTITION BY DeptNo ORDER BY sal DESC) AS DenseRank
FROM emp;


CUME_DIST:
This function stands for cumulative distribution. It computes the relative position of a column value in a group. Here, we can calculate the cumulative distribution of salaries among all departments. For a row, the cumulative distribution of salary is calculated as:

SELECT DeptNo, sal, CUME_DIST() OVER (ORDER BY sal) AS cum_dist FROM emp;

SELECT DeptNo, sal, CUME_DIST() OVER (ORDER BY sal) AS cum_dist FROM emp
WHERE DEPTNO in(20,30);


CUME_DIST(salary) = Number of rows with the value lower than or equals to salary / total number of rows in the dataset.

In the above example, due to ORDER BY clause, 1st row from salary will be counted as 1 and it will be divided by the total number of rows. That is 1/14 = 0.1

For the second row, it is 2/14 = 0.14;

For the 4th row and the next immediate row too has the same value, it will be calculated as 5/14 = 0.35 and assign it for the both rows.

Look at the outcome to understand.

PERCENT_RANK:
It is very similar to the CUME_DIST function. It ranks the row as a percentage. In other words, it calculates the relative rank of a row within a group of rows.

The range of values returned by PERCENT_RANK is between 0 to 1 and first row in the dataset is always zero. This means the return value is of the double type.

Let’s rank the salary by department wise as percentage:

Percent_Rank = (rank decreased by 1)/(remaining rows in the group)

SELECT DeptNo, sal,
RANK() OVER (PARTITION BY deptNo ORDER BY sal DESC) AS iRank,
CUME_DIST() OVER (PARTITION BY deptno ORDER BY sal) AS cum_dist,
PERCENT_RANK() OVER (PARTITION BY deptNo ORDER BY sal) AS perc_rnk 
FROM EMP;
Go

If you observe its behavior when it calculates the relative rank for the rows with same values, it assigns same percentage rank value (0.75) to both of them. The behavior is similar to rank function.

Range Between & Rows Between:
These functions are called window functions which fetches records right before and after the current record to perform the aggregation. It is similar to lead and lag functions however a window function defines a frame or window of rows with a given length around the current row, and performs a calculation across the set of data in the window. Mostly these functions will be used to get the cumulative sum/average, running or moving sum or averages.

Examples:
SELECT DISTINCT    DeptNo, --sal,
            SUM(sal) OVER(ORDER BY DeptNo RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS 'RangeUnbound'
FROM emp
GO


SELECT DISTINCT    DeptNo, --sal,
            SUM(sal) OVER(ORDER BY DeptNo ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS 'RowsUnbound'
FROM emp
GO


SELECT DISTINCT    EMPNO, sal,
            SUM(sal) OVER(ORDER BY EmpNo ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS 'RowsUnbound'
FROM emp
GO


SELECT DISTINCT    EMPNO, sal,
            SUM(sal) OVER(ORDER BY EmpNo RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS 'RangeUnbound'
FROM emp
GO


SELECT DISTINCT    Job, sal, DeptNo,
            SUM(sal) OVER(ORDER BY DeptNo RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS 'RangeUnbound'
FROM emp
GO


lag AND Lead FUNCTIONS:
The LAG function gets the information from a past column, while LEAD brings information from an ensuing line. The two functions are fundamentally the same as one another and you can simply supplant one by the other by changing the sort request.

SELECT ename, deptno, sal,
LEAD(sal, 1) OVER(PARTITION BY deptno ORDER BY sal) AS lead1,
LEAD(sal, 2) OVER(PARTITION BY deptno ORDER BY sal) AS lead2,
LAG(sal,1) OVER(PARTITION BY deptno ORDER BY sal) AS lag1,
LAG(sal,2) OVER(PARTITION BY deptno ORDER BY sal) AS lag2
FROM emp ORDER BY deptno,sal;


Hope this article helped you in understanding Analytical and Window functions.


Oracle SQL Emp and Dept Tables For SQL Server

--Table definition - Dept
CREATE TABLE Dept( 
DeptNo INT
DName VARCHAR(14), 
Loc VARCHAR(13), 
CONSTRAINT pk_Dept PRIMARY KEY (DeptNo) ); 

-- Tabe Definition - Emp: 
CREATE TABLE Emp( 
EmpNo INT
EName VARCHAR(10), 
Job VARCHAR(9), 
Mgr INT
HireDate date
Sal INT
Comm INT
DeptNo INT
CONSTRAINT pk_Emp PRIMARY KEY (EmpNo), 
CONSTRAINT fk_DeptNo FOREIGN KEY (DeptNo) REFERENCES Dept (DeptNo) ); 

--Dept Table data: 
INSERT INTO Dept VALUES(10, 'ACCOUNTING', 'NEW YORK'); 
INSERT INTO Dept VALUES(20, 'RESEARCH', 'DALLAS'); 
INSERT INTO Dept VALUES(30, 'SALES', 'CHICAGO'); 
INSERT INTO Dept VALUES(40, 'OPERATIONS', 'BOSTON'); 

--Emp Table data: 
INSERT INTO Emp VALUES( 7839, 'KING', 'PRESIDENT', NULL, CONVERT(DATETIME,'17-11-1981',103), 5000, NULL, 10 ); 
INSERT INTO Emp VALUES( 7698, 'BLAKE', 'MANAGER', 7839, CONVERT(DATETIME,'1-5-1981',103), 2850, NULL, 30 ); 
INSERT INTO Emp VALUES( 7782, 'CLARK', 'MANAGER', 7839, CONVERT(DATETIME,'9-6-1981',103), 2450, NULL, 10 ); 
INSERT INTO Emp VALUES( 7566, 'JONES', 'MANAGER', 7839, CONVERT(DATETIME,'2-4-1981',103), 2975, NULL, 20 ); 
INSERT INTO Emp VALUES( 7788, 'SCOTT', 'ANALYST', 7566, CONVERT(DATETIME,'13-JUL-87',103) - 85, 3000, NULL, 20 ); 
INSERT INTO Emp VALUES( 7902, 'FORD', 'ANALYST', 7566, CONVERT(DATETIME,'3-12-1981',103), 3000, NULL, 20 ); 
INSERT INTO Emp VALUES( 7369, 'SMITH', 'CLERK', 7902, CONVERT(DATETIME,'17-12-1980',103), 800, NULL, 20 ); 
INSERT INTO Emp VALUES( 7499, 'ALLEN', 'SALESMAN', 7698, CONVERT(DATETIME,'20-2-1981',103), 1600, 300, 30 ); 
INSERT INTO Emp VALUES( 7521, 'WARD', 'SALESMAN', 7698, CONVERT(DATETIME,'22-2-1981',103), 1250, 500, 30 ); 
INSERT INTO Emp VALUES( 7654, 'MARTIN', 'SALESMAN', 7698, CONVERT(DATETIME,'28-9-1981',103), 1250, 1400, 30 ); 
INSERT INTO Emp VALUES( 7844, 'TURNER', 'SALESMAN', 7698, CONVERT(DATETIME,'8-9-1981',103), 1500, 0, 30 ); 
INSERT INTO Emp VALUES( 7876, 'ADAMS', 'CLERK', 7788, CONVERT(DATETIME,'13-JUL-87', 103), 1100, NULL, 20 ); 
INSERT INTO Emp VALUES( 7900, 'JAMES', 'CLERK', 7698, CONVERT(DATETIME,'3-12-1981',103), 950, NULL, 30 ); 
INSERT INTO Emp VALUES( 7934, 'MILLER', 'CLERK', 7782, CONVERT(DATETIME,'23-1-1982',103), 1300, NULL, 10 );
Go

SELECT * FROM Emp
SELECT * FROM Dept

Sunday, July 19, 2020

Database Restore Information with Restore Start and Completed Date Time


Following points to be accomplished -
  • Get all the databases restoration information
  • From all reporting, staging or testing environments
  • Information about when the database restoration started and completed
  • If the restoration is a full restore or differential
  • When the backup file was ready
  • Above said information is to be received through Email in a tabular format.
Well, it seems to be easy, and of course it's easy once you know it.

Below are the implementation steps.
  • Write a SQL code to retrieve the complete information about backups. This includes  when the backup file was ready, the restoration type, restoration start date time and end date time. Data restoration completed date time is unavailable in backup tables hence so get the information from the logs.
  • Create a Power-Shell script to automate this process and fetch the data restoration information from across various reporting/staging/testing servers.
  • Store this information in a table to maintain a history. This will help in analyzing if the restoration time is increasing/decreasing over a period of time. 
  • Fetch this information from the table and convert it into HTML tabular format. Send the information through Email using "db Mail".

Certainly, there will be some other work-around, or smarter methods version to achieve this goal. But since this solution solves my purpose and is working perfectly, I didn't dig deeper to find a shorter or better implementation. 

Here is my table which will hold the data restoration information.


Upon execution you will get the complete data restore information from the reporting server in which you had executed the code. However the requirement is to fetch the same information from across various servers. Hence write a Power-shell script as mentioned below.


clear;
$Servers = Get-Content "D:\Monitoring_RestoreInfo\Computers.txt"; 
   foreach($computer in $Servers) 
    { invoke-sqlcmd -inputfile "D:\Monitoring_RestoreInfo\Restore Information SQL Server.sql" -serverinstance $computer -database "msdb" -ErrorAction 'Stop' -Verbose -QueryTimeout 1800 # the parameter -database can be omitted based on what your sql script does.
    }

Note: Computers.txt has all the names of the reporting/staging/testing servers.

Create a windows scheduled task in the server in where you have an access to other servers. The screenshots are given below.




Arguments provided are as follows:
  • Program/Script: C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe
  • Add arguments (Optional): D:\Monitoring_RestoreInfo\DBRestoreInfo.ps1
  • Start In : D:\Monitoring_RestoreInfo\

Now let's go back to SQL Server database in which you have created the table. Write a script to convert the data into HTML format.


Create an SQL Agent Job to automate this process and run it every morning after execution of the Power-shell script.

Thank you for visiting my blog. Please do let me know if you need more clarification in this implementation.

Tuesday, March 16, 2010

SQL Server - Data Export to XML

One of my friend requested me to explain how many ways are there to export the data from SQL Server 2005 into an XML file. Well, this artcle will let you know how to export the data from SQL Server 2005 to XML using different modes.

SELECT * FROM Customers (NOLOCK)

/* Result is
CustomerID CustomerName ActiveStatus
----------- ------------------------------ ------------
100 John 0
200 Kate 1
300 Julia 1
400 Maddy 0

(4 row(s) affected)
*/

SELECT * FROM Customers (NOLOCK)
FOR XML PATH, ROOT('root')

(Please note that I have placed "`" symbol after "<" and ">" in the results to avoid execution in the web) 

/* Result is
<`root`>
<`row`>
<`CustomerID>100<`/CustomerID>
<`CustomerName>John<`/CustomerName>
<`ActiveStatus>0<`/ActiveStatus>
<`/row><`row`>
<`CustomerID>200<`/CustomerID>
<`CustomerName>Kate<`/CustomerName>
<`ActiveStatus>1<`/ActiveStatus>
<`/row><`row>
<`CustomerID>300<`/CustomerID>
<`CustomerName>Julia<`/CustomerName>
<`ActiveStatus>1<`/ActiveStatus>
<`/row><`row>
<`CustomerID>400<`/CustomerID>
<`CustomerName>Maddy<`/CustomerName>
<`ActiveStatus>0<`/ActiveStatus>
<`/row>
<`/root>*/

/**************************************/
/** EXAMPLES FOR AUTO MODE **/
/**************************************/
In order to generate simple hierarchies we can use AUTO mode. Since the result will be in form of nested elements, it doesn't provide much control over the shape of the XML whereas EXPLICIT and PATH modes provide better control and shape of the XML.

SELECT * FROM Customers FOR XML AUTO, TYPE

/* Result is
<`Customers CustomerID="100" CustomerName="John" ActiveStatus="0" /`>
<`Customers CustomerID="200" CustomerName="Kate" ActiveStatus="1" /`>
<`Customers CustomerID="300" CustomerName="Julia" ActiveStatus="1" /`>
<`Customers CustomerID="400" CustomerName="Maddy" ActiveStatus="0" /`>
*/

Using Variables
DECLARE @cust XML;
SET @cust = (SELECT * FROM Customers FOR XML AUTO, TYPE)
SELECT @cust

/* Result is
<`Customers CustomerID="100" CustomerName="John" ActiveStatus="0" /`>
<`Customers CustomerID="200" CustomerName="Kate" ActiveStatus="1" /`>
<`Customers CustomerID="300" CustomerName="Julia" ActiveStatus="1" /`>
<`Customers CustomerID="400" CustomerName="Maddy" ActiveStatus="0" /`>
*/

XML Data is into another table
CREATE TABLE Test1(i int, x XML)

INSERT INTO Test1 SELECT 1, (SELECT * FROM Customers FOR XML AUTO, TYPE)

SELECT * FROM Test1

/* Result is
<'Customers CustomerID="100" CustomerName="John" ActiveStatus="0" /`>
<'Customers CustomerID="200" CustomerName="Kate" ActiveStatus="1" /`>
<'Customers CustomerID="300" CustomerName="Julia" ActiveStatus="1" /`>
<'Customers CustomerID="400" CustomerName="Maddy" ActiveStatus="0" /`>
*/
/************************************/
/** EXAMPLE FOR RAW MODE **/
/************************************/
Each row of the result set from the query will be converted into element. The column from the result set will be mapped to the attribute of the row element.

SELECT *
FROM Customers
FOR XML RAW, ELEMENTS

/* Result is
<`row>
<`CustomerID>100<`/CustomerID>
<`CustomerName>John<`/CustomerName>
<`ActiveStatus>0<`/ActiveStatus>
<`/row><`row>
<`CustomerID>200<`/CustomerID>
<`CustomerName>Kate<`/CustomerName>
<`ActiveStatus>1<`/ActiveStatus>
<`/row><`row>
<`CustomerID>300<`/CustomerID>
<`CustomerName>Julia<`/CustomerName>
<`ActiveStatus>1<`/ActiveStatus>
<`/row><`row>
<`CustomerID>400<`/CustomerID>
<`CustomerName>Maddy<`/CustomerName>
<`ActiveStatus>0<`/ActiveStatus>
<`/row>
*/

You can rename the element by using optional argument in RAW Mode.
SELECT *
FROM Customers
FOR XML RAW ('CustomerDetails'), ELEMENTS

/* Result is
<`CustomerDetails>
<`CustomerID>100<`/CustomerID>
<`CustomerName>John<`/CustomerName>
<`ActiveStatus>0<`/ActiveStatus>
<`/CustomerDetails><`CustomerDetails>
<`CustomerID>200<`/CustomerID>
<`CustomerName>Kate<`/CustomerName>
<`ActiveStatus>1<`/ActiveStatus>
<`/CustomerDetails><`CustomerDetails>
<`CustomerID>300<`/CustomerID>
<`CustomerName>Julia<`/CustomerName>
<`ActiveStatus>1<`/ActiveStatus>
<`/CustomerDetails><`CustomerDetails>
<`CustomerID>400<`/CustomerID>
<`CustomerName>Maddy<`/CustomerName>
<`ActiveStatus>0<`/ActiveStatus>
<`/CustomerDetails>
*/

/*****************************************/
/** EXAMPLE FOR EXPLICIT MODE **/
/*****************************************/
This mode is more recommended one when compare with RAW and AUTO modes. It is because of the control over the shape of the XML.

For more details refer to :
http://msdn.microsoft.com/en-us/library/ms189068.aspx

CREATE VIEW DataExport AS
SELECT
1 AS Tag,
NULL AS Parent,
'CustomerID' AS [data!1!identifier],
NULL AS [record!2!CustomerID!element] ,
NULL AS [record!2!CustomerName!element],
NULL AS [record!2!ActiveStatus!element]

UNION ALL


SELECT
2 AS Tag,
1 AS Parent,
'CustomerID' AS [data!1!identifier],
CustomerID AS [record!2!CustomerID!element] ,
CustomerName AS [record!2!CustomerName!element],
ActiveStatus AS [record!2!ActiveStatus!element]
FROM Customers SELECT * FROM DataExport
FOR XML EXPLICIT

/* Result is
<`data identifier="CustomerID">
<`record>
<`CustomerID>100<`/CustomerID>
<`CustomerName>John<`/CustomerName>
<`ActiveStatus>0<`/ActiveStatus>
<`/record>
<`record>
<`CustomerID>200<`/CustomerID>
<`CustomerName>Kate<`/CustomerName>
<`ActiveStatus>1<`/ActiveStatus>
<`/record>
<`record>
<`CustomerID>300<`/CustomerID>
<`CustomerName>Julia<`/CustomerName>
<`ActiveStatus>1<`/ActiveStatus>
<`/record>
<`record>
<`CustomerID>400<`/CustomerID>
<`CustomerName>Maddy<`/CustomerName>
<`ActiveStatus>0<`/ActiveStatus>
<`/record>
<`/data>
*/

SQL Server - vs - Oracle Functions

Many developers often ask about equivalent functions existing in various RDBMSs. I found the following link which is very useful to find technical comparison between functions of Oracle vs SQL Server.

Sunday, March 7, 2010

SQL Server - INSERT through Stored Procedure

There is one article in this blog in which we had discussed about how to use INSERT statement in various aspects. Here is one more example to insert the data into the table through a stored procedure.

/**************************/
/* Existing Table Data */
/**************************/
SELECT * FROM Customers

/**************************/
/* Procedure Creation */
/**************************/
CREATE PROCEDURE sampleProc (
@CustomerID INT,
@CustomerName VARCHAR(50),
@ActiveStatus INT)

AS


SELECT @CustomerID, @CustomerName, @ActiveStatus

**************************/
/* Procedure Ends Here*/
/**************************/


/**************************/
/* Insert Data */
/**************************/
INSERT INTO Customers
EXEC sampleProc 500, 'Harry', 1

Tuesday, February 16, 2010

SQL Server - List-out Procedures, Functions, Triggers

In order to know which procedures, functions, and triggers exist in the database, the following queries can help you to find this out.:

The below set of statements can be used to retrieve stored procedures and the number of parameters in it:


Use [Test1]
Go

SELECT SPECIFIC_NAME, COUNT(PARAMETER_NAME)
FROM INFORMATION_SCHEMA.PARAMETERS
GROUP BY SPECIFIC_NAME

The below statements are an alternative to the statements above and can be used for the same purpose:

Use [Test1]
Go

SELECT SPECIFIC_NAME, COUNT(PARAMETER_NAME),
CASE WHEN xtype = 'P' THEN 'Stored Procedure'
WHEN xtype IN ('FN', 'IF', 'TF') THEN 'Function'
END AS ProcType
FROM INFORMATION_SCHEMA.PARAMETERS a
INNER JOIN sysobjects b ON b.name = a.SPECIFIC_NAME
GROUP BY SPECIFIC_NAME, xtype

To retrieve data about a specific procedure, either the following statement:

SELECT *
FROM INFORMATION_SCHEMA.PARAMETERS
WHERE SPECIFIC_NAME=3D'MentionProcName'

Or the below statement can be used to list out the procedure details:

EXEC sp_stored_procedures

To know the parameters set for the desired procedures, you can use the below SELECT statement:

SELECT a.name, a.type_desc, b.name, parameter_id FROM sys.procedures a
INNER JOIN sys.parameters b ON a.object_ID = b.object_ID
ORDER BY a.name, parameter_id

Note that Parameter_ID in the above query refers to a serial order of the parameters that belong to the procedures.

The following statements can help you to retrieve the stored procedures or functions along with the number of lines in the code:

SELECT so.name AS ProcedureName,
CONVERT(INT, (LEN(sc.text) - LEN(REPLACE(sc.text, char(10), ''))))-1 as CodeLines,
CASE WHEN so.xtype = 'P' THEN 'Stored Procedure'
WHEN so.xtype IN ('FN', 'IF', 'TF') THEN 'Function'
END AS ObjectType
FROM sysobjects so
INNER JOIN syscomments sc ON sc.id = so.id
WHERE so.xtype IN ('P', 'FN', 'IF', 'TF') AND so.category = 0


To the above statement, you can include the following clause to avoid system procedures / functions.
AND so.name NOT IN
('fn_diagramobjects',
'sp_alterdiagram',
'sp_creatediagram',
'sp_dropdiagram',
'sp_helpdiagramdefinition',
'sp_helpdiagrams',
'sp_renamediagram',
'sp_upgraddiagrams',
'sysdiagrams')


And the following query can help you to list the triggers that are created in the database:

SELECT OBJECT_NAME(ID), OBJECT_NAME(deltrig),
OBJECT_NAME(instrig), OBJECT_NAME(updtrig)
FROM sysobjects
WHERE xtype ='U' AND (deltrig > 0 OR instrig > 0 OR updtrig > 0)

SQL Server - Insert Data From Linked Server Table

In a situation where Servers are linked, it is possible to retrieve data from a table in one server to a table on the other server.
Following is the method to carry this out:

SELECT @@SERVERNAME

This will return the server’s name to which you are currently connected to.

SELECT * FROM [bngstagedb]. testDB.dbo.temp2 WITH (NoLock)

In the above statement ‘WITH’ clause is significant. If not used then you will receive the error 'Remote table-valued function calls are not allowed.'.
Hence, it is required for you to either remove NoLock clause in the statement or add 'WITH' clause to the NoLock.

Below statement will allow you to make a copy of the required table in the current server from the linked server:

SELECT * INTO temp2
FROM [bngstagedb].testDB.dbo.temp2 WITH (NoLock)

And the below statement will allow you to insert values in an existing table from the linked server:

INSERT INTO temp2
SELECT * FROM [bngstagedb].testDB.dbo.temp2 WITH (NoLock)

Wednesday, February 3, 2010

SQL Server - Self Joins

To join tables, at-least two tables are required. It does not necessarily have to be two different tables :). It is possible to join a table with itself to retrieve the required data.

When a table is joined with itself then such type of join is called Self Join. This is useful when you want to retrieve the data in the form of organizational structure. The same table is used twice using aliases in self join.

The methodology involves joining the records in a table with other records in the same table.

Following is an example of a Self Join.

CREATE TABLE #sample1 (EmpID INT, EmpName VARCHAR (30), ManageriD INT)

INSERT INTO #sample1 VALUES (1001, 'Reman D Joseph', Null)
INSERT INTO #sample1 VALUES (1002, 'Samuel Peter', 1001)
INSERT INTO #sample1 VALUES (1003, 'Arihanth', 1005)
INSERT INTO #sample1 VALUES (1004, 'Ayesha Khan', 1001)
INSERT INTO #sample1 VALUES (1005, 'Kirlosker Joseph', 1002)
INSERT INTO #sample1 VALUES (1006, 'Firdouse', 1004)
INSERT INTO #sample1 VALUES (1007, 'Peter DeSouja', 1002)

SELECT * FROM #Sample1

Observe the following by looking into the values:

Samuel Peter's Employee ID is 1002 and his manager's employee ID is 1001. The ID 1001 is Reman D Joseph's employee ID hence Samuel Peter's manager is Reman D Joseph, right?

Let's now check how this information can be more conveniently retrieved using SQL Joins:

SELECT DISTINCT s1.EmpName, s2.EmpName FROM #sample1 s1
JOIN #sample1 s2 ON s2.EmpID = s1.ManageriD

The above query will return all the employees with a manager along with their names.

What if we want to retrieve a list of all the employees regardless of having a manager or not?

SELECT DISTINCT s1.EmpID, s1.EmpName, s2.EmpName FROM #sample1 s1
LEFT JOIN #sample1 s2 ON s2.EmpID = s1.ManageriD
ORDER BY s1.EmpID


If you want to combine the employee’s name with his manager's name, then use the following query:

SELECT DISTINCT s1.EmpName + '''s Manager is ' + s2.EmpName
FROM #sample1 s1
JOIN #sample1 s2 ON s2.EmpID = s1.ManageriD


To generate a list of employees who are not managers use the following query (An example for Sub-query and Joins)

SELECT EmpID, EmpName, ManagerID FROM #Sample1
WHERE EmpID NOT IN(
SELECT DISTINCT s1.ManageriD FROM #sample1 s1
JOIN #sample1 s2 ON s2.EmpID = s1.ManageriD)

To generate a list of Only managers, use the query below:

SELECT EmpID, EmpName, ManagerID FROM #Sample1
WHERE EmpID IN(
SELECT DISTINCT s1.ManageriD FROM #sample1 s1
JOIN #sample1 s2 ON s2.EmpID = s1.ManageriD)

DROP TABLE #sample1

Monday, February 1, 2010

SQL Server - Date Time Functions

Here are a few examples of functions that can help you manipulate DATE in different manners. These will comprise most of your daily job routines. The below set of Date/Time functions are used consistently by SQL Users as they are most oftenly required to retrieve specific data from the database.
---------------------------------------------------------------------------------
Please note that this article is created on 3rd of Febraury 2010 and getDate() function will return current date.
---------------------------------------------------------------------------------
The Following statement will return the date, month and year.
SELECT DATEADD(dd,0, DATEDIFF(dd,0,GETDATE()))
Result : 2010-02-03 00:00:00.000

SELECT LEFT(DATEADD(Day,0, DATEDIFF(Day, 0, GETDATE())), 12)
Result : Feb 3 2010

The following statement will return only date.
SELECT DATEPART(Day, GETDATE())

Result : 3

The Following statement will return the day-number for the week.
SELECT DATEPART(dw, GETDATE())
Result : 4

The Following statement will return day name of the week.
SELECT DATENAME(dw, GETDATE())

Result : Wednesday

The Following statement will return month of the year.
SELECT DATENAME(mm, GETDATE())

Result : February

The Following statements will return the year. You can use 'yy' instead of 'year'
SELECT DATENAME(year, GETDATE())
SELECT DATEPART(yy, GETDATE())

Result : 2010

The Following statements will return the month-number for the year. You can use 'mm' instead of 'month'.
SELECT DATEPART(month, GETDATE())
Result : 2

Difference between two dates (number of days between two dates). You can use 'd' or 'dd' instead of 'Day'
SELECT DATEDIFF(Day, '2010-01-01', GETDATE())
Result : 33

Difference between two dates (number of months between two dates)
SELECT DATEDIFF(Month, '2010-01-01', GETDATE())

Result : 1

Difference between two dates (number of minutes between two dates)
SELECT DATEDIFF(Minute, '2010-01-01', GETDATE())
Result : 48531
SELECT CONVERT(DECIMAL(5,2), DATEDIFF(Second, '2010-02-01 10:55:00.000', '2010-02-01 19:58:10.498')/60.00)

Result : 543.17

Difference between two dates (number of seconds between two dates)
SELECT DATEDIFF(Second, '2010-01-01', GETDATE())

Result : 2911914

Difference between two dates (number of years between two dates)
SELECT DATEDIFF(Year, '2010-01-01', GETDATE())

Result : 0

SELECT Day2-Day1 FROM (
SELECT
Day1=CONVERT(DATETIME, '2010-02-01 05:47:53.497'),
Day2=CONVERT(DATETIME, '2010-02-01 06:47:10.420')) Date

Result : 1900-01-01 00:59:16.923

SELECT
CONVERT(VARCHAR(10),
CASE WHEN(
DATEDIFF(hh, CONVERT(DATETIME, '2010-02-01 01:00:53.497'),
CONVERT(DATETIME, '2010-02-02 23:59:59.497')) >= 24)
THEN
DATEDIFF(hh, CONVERT(DATETIME, '2010-02-01 01:00:53.497'),
CONVERT(DATETIME, '2010-02-02 23:59:59.497'))-24
ELSE
DATEDIFF(hh, CONVERT(DATETIME, '2010-02-01 01:00:53.497'),
CONVERT(DATETIME, '2010-02-02 23:59:59.497'))
END)
+
SUBSTRING(CONVERT(VARCHAR(30),
DATEADD(mi, DATEDIFF(Minute, CONVERT(DATETIME, '2010-02-01 01:00:53.497'),
CONVERT(DATETIME, '2010-02-02 23:59:59.497')),
DATEADD(s, DATEDIFF(Second, CONVERT(DATETIME, '2010-02-01 01:00:53.497'),
CONVERT(DATETIME, '2010-02-02 23:59:59.497')), 0)), 109), 15, 6) as 'HH:MM:SS'

Result : 22:58:06

-----------------------------------------------------------------
DATE Format using CONVERT Function
-----------------------------------------------------------------

SELECT CONVERT(VARCHAR(20), GETDATE(), 100)
Result : 'Feb 1 2010 12:11PM'
SELECT CONVERT(VARCHAR(20), GETDATE(), 101)
Result : '02/01/2010'
SELECT CONVERT(VARCHAR(20), GETDATE(), 102)
Result : '2010.02.01'
SELECT CONVERT(VARCHAR(20), GETDATE(), 103)
Result : '01/02/2010'
SELECT CONVERT(VARCHAR(20), GETDATE(), 104)
Result : '01.02.2010'
SELECT CONVERT(VARCHAR(20), GETDATE(), 105)
Result : '01-02-2010'
SELECT CONVERT(VARCHAR(20), GETDATE(), 106)
Result : '01 Feb 2010'
SELECT CONVERT(VARCHAR(20), GETDATE(), 107)
Result : 'Feb 01, 2010'
SELECT CONVERT(VARCHAR(20), GETDATE(), 108)
Result : '12:15:48'
SELECT CONVERT(VARCHAR(40), GETDATE(), 109)
Result : 'Feb 1 2010 12:16:32:570PM'
SELECT CONVERT(VARCHAR(40), GETDATE(), 110)
Result : '02-01-2010'
SELECT CONVERT(VARCHAR(40), GETDATE(), 111)
Result : '2010/02/01'
SELECT CONVERT(VARCHAR(40), GETDATE(), 112)
Result : '20100201'
SELECT CONVERT(VARCHAR(40), GETDATE(), 113)
Result : '01 Feb 2010 12:17:26:710'
SELECT CONVERT(VARCHAR(40), GETDATE(), 114)
Result : '12:17:40:040'
SELECT CONVERT(VARCHAR(40), GETDATE(), 120)
Result : '2010-02-01 12:18:24'
SELECT CONVERT(VARCHAR(40), GETDATE(), 121)
Result : '2010-02-01 12:18:37.007'
SELECT CONVERT(VARCHAR(40), GETDATE(), 126)
Result : '2010-02-01T12:19:07.160'
SELECT CONVERT(VARCHAR(40), GETDATE(), 127)
Result : '2010-02-01T12:19:34.537'
SELECT CONVERT(VARCHAR(40), GETDATE(), 130)
Result : '17 ??? 1431 12:20:37:580PM'
SELECT CONVERT(VARCHAR(40), GETDATE(), 131)
Result : '17/02/1431 12:21:04:597PM'

Big Data & SQL

Hi Everybody, Please do visit my new blog that has much more information about Big Data and SQL. The site covers big data and almost all the...