Sunday, January 31, 2010

SQL Server - CHARINDEX - Oracle INSTR

There is a CharIndex’ function in SQL Server which is similar to the Oracle Instr’ function.

In Oracle, the syntax of the INSTR function is :

instr( string1, string2 [, start_position [, nth_appearance ] ] )

String1 in the above syntax represents the string that you need to search in, whereas String2 is the substring that needs to be search for within the String1.

start_position and nth_appearance are optionals. Through start_position you can specify from which point in string1 the search should start and nth_appearance is to specify the number of occurance. By default start_position and nth_appearance is set to 1.

Now, let's check with one example.

SELECT INSTR('Rhythm of the band', 'th') FROM Dual;
SELECT INSTR('Rhythm of the band', 'th', 1, 1) FROM Dual;

The above select statements would return 4 as your output since 'th' was found to be positioned at the 4th character. But 'th' has been spelled twice in "Rhythm of the band" and if you require a second occurance from the string, then you need to change the statement like this_

SELECT INSTR('Rhythm of the band', 'th', 1, 2) FROM Dual;

The above statement however would return 11 as your output since 'th' was found at 11th position in string1.

Now, let's work on SQL Server for the same.

CHARINDEX is the alternative in SQL Server for INSTR function but it is not exactly its equivalent.

CHARINDEX ( expression1 , expression2 [ , start_location ] )

The above syntax is the same as with INSTR function except for nth_appearance. We donot have an option to set the nth_appearance.

SELECT CHARINDEX('th','Rhythm of the band')

The above statement would return 4 as your output since 'th' was found at the 4th character position.

SELECT CHARINDEX('th','Rhythm of the band', 5)
SELECT CHARINDEX('th','Rhythm of the band', 6)
SELECT CHARINDEX('th','Rhythm of the band', 7)
SELECT CHARINDEX('th','Rhythm of the band', 8)

The above statements would return 11 as your output since 'th' was found at the 11th position after the first 5 characters.

If you require an exact behaviour of INSTR function in SQL Server, then follow the link to create a user defined function.

http://www.dbforums.com/microsoft-sql-server/1101462-equivalent-oracles-instr-4-parameters-sql-server.html

SQL Server - Sample Database Link

Here is the link for Microsoft SQL Server Community Projects & Samples. They have provided the best examples and databases in the following link.

Codeplex SQLServerSamples

Friday, January 29, 2010

SQL Server - List Out Number of Rows in Database

To List-Out The Number Of Rows In Each Table From The Current Database.
To Find-Out Number Of Rows In Each Table In Database.

/****************/
--METHOD-1
/****************/
Using A Procedure_
DECLARE
@TotalRows INT,
@FirstID INT,
@table VARCHAR(255),
@cmd VARCHAR(500)

CREATE TABLE #tableData (TableName VARCHAR(255), TotalRows INT)
CREATE TABLE #temp1 (col1 INT IDENTITY, TableName VARCHAR(255))

INSERT INTO #temp1
SELECT table_name FROM Information_Schema.Tables
WHERE table_type = 'base table'

SET @TotalRows = @@ROWCOUNT
SET @FirstID = 1

WHILE @FirstID <= @TotalRows BEGIN


SELECT @table = TableName FROM #temp1 WHERE col1 = @FirstIDSET @cmd = 'SELECT ''' + @table + ''', count(*) FROM ' + @tableINSERT INTO #tableData EXEC (@cmd)SET @FirstID = @FirstID + 1END

SELECT * FROM #tableData ORDER BY TotalRows DESC



DROP TABLE #tableData
DROP TABLE #temp1



CREATE TABLE #tableInfo
(
TableName VARCHAR(255),
TotalRows INT
)

INSERT #tableInfo
EXEC sp_msFOREachTable
'SELECT PARSENAME(''?'', 1),
COUNT(*) FROM ?'

SELECT * FROM #tableInfo ORDER BY TotalRows DESC

DROP TABLE #tableInfo

/****************/
--METHOD-4
/****************/
CREATE TABLE #tableInfo
(
TableName VARCHAR(255),
TotalRows INT
)

EXEC sp_msFOREachTable
'INSERT INTO #tableInfo
SELECT ''?'', COUNT(*) FROM ?'

SELECT * FROM #tableInfo ORDER BY TotalRows DESC

DROP TABLE #tableInfo


/****************/
--METHOD-2
/****************/
Using A Cursor_
DECLARE @table VARCHAR(255),
@cmd VARCHAR(500)

CREATE TABLE #tableData (TableName VARCHAR(255), TotalRows INT)

DECLARE cursor_tableData CURSOR FOR
SELECT table_name FROM Information_Schema.Tables
WHERE table_type = 'base table'

OPEN cursor_tableData

FETCH NEXT FROM cursor_tableData INTO @table
WHILE @@fetch_status = 0
BEGIN
SET @cmd = 'SELECT ''' + @table + ''', COUNT(*) FROM ' + @table
INSERT INTO #tableData EXEC (@cmd)
FETCH NEXT FROM cursor_tableData INTO @table
END
CLOSE cursor_tableData
DEALLOCATE cursor_tableData

SELECT * FROM #tableData ORDER BY TotalRows DESC
DROP TABLE #tableData



The following methods are using un-documented stored procedure 'sp_msFOREachTable'. There is a cursor declared in that which loops through each table in the current database and executes a script that you define.

/****************/
--METHOD-3
/****************/

SQL Server - Monthly Report - Group by Month

The following example guide you to retrieve a monthly report. As discussed earlier, If the date column is SMALLDATETIME or DATETIME, we can split it into year, month, day, hour, minute and seconds separately using CONVERT function. Here is an example to generate a monthly report by GROUPING DATE column.

CREATE TABLE Test1(
AccountID INT,
AccountNumber VARCHAR(20),
PaidAmount DECIMAL,
PaidDate SMALLDATETIME
)

INSERT INTO Test1 VALUES(221001, 'A1020101', 2000.00, '2010-01-20 13:10:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020102', 3300.00, '2010-01-20 13:33:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020103', 1100.00, '2010-01-20 15:01:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020104', 5400.00, '2010-01-20 15:49:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020105', 5200.00, '2010-01-20 16:30:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020106', 3400.00, '2010-01-20 16:59:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020107', 1200.00, '2010-01-20 17:10:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020108', 4200.00, '2010-01-20 18:22:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020109', 2100.00, '2010-01-20 18:43:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020110', 3400.00, '2010-01-20 19:12:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020111', 5500.00, '2010-01-20 22:19:00:000')

SELECT * FROM Test1

SELECT
DATENAME(mm, PaidDate) Month,
YEAR(PaidDate) Year,
COUNT(*) iCount,
SUM(PaidAmount) Amount
FROM Test1
GROUP BY
DATENAME(mm, PaidDate),
Year(PaidDate),
Month(PaidDate)
ORDER BY Month(PaidDate)

SQL Server - Daily Report - Group by Date

Like hourly report, the following example guide you to retrieve a daily report. As discussed earlier, If the date column is SMALLDATETIME or DATETIME, we can split it into year, month, day, hour, minute and seconds separately using CONVERT function. Here is an example to generate a daily report by GROUPING DATE column.
CREATE TABLE Test1(
AccountID INT,
AccountNumber VARCHAR(20),
PaidAmount DECIMAL,
PaidDate SMALLDATETIME
)

INSERT INTO Test1 VALUES(221001, 'A1020101', 2000.00, '2010-01-20 13:10:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020102', 3300.00, '2010-01-20 13:33:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020103', 1100.00, '2010-01-20 15:01:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020104', 5400.00, '2010-01-20 15:49:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020105', 5200.00, '2010-01-20 16:30:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020106', 3400.00, '2010-01-20 16:59:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020107', 1200.00, '2010-01-20 17:10:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020108', 4200.00, '2010-01-20 18:22:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020109', 2100.00, '2010-01-20 18:43:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020110', 3400.00, '2010-01-20 19:12:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020111', 5500.00, '2010-01-20 22:19:00:000')

SELECT * FROM Test1

SELECT
CONVERT(VARCHAR(10), PaidDate, 101) Day,
COUNT(*) TransactionsCount,
SUM(PaidAmount) Amount
FROM Test1
GROUP BY
CONVERT(VARCHAR(10), PaidDate, 101)

SQL Server - Hourly Report - Group by Hour

Sometimes it requires to know how many inserts are happend every hour in a transactions table OR how many payments made every hour OR to retrieve the hourly based report, whatever term it is, the requirement is to generate a report based on hours from a datetime column. If the date column is SMALLDATETIME or DATETIME, we can split it into year, month, day, hour, minute and seconds separately using CONVERT function. Follow the examples.
CREATE TABLE Test1(
AccountID INT,
AccountNumber VARCHAR(20),
PaidAmount DECIMAL,
PaidDate SMALLDATETIME
)

INSERT INTO Test1 VALUES(221001, 'A1020101', 2000.00, '2010-01-20 13:10:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020102', 3300.00, '2010-01-20 13:33:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020103', 1100.00, '2010-01-20 15:01:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020104', 5400.00, '2010-01-20 15:49:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020105', 5200.00, '2010-01-20 16:30:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020106', 3400.00, '2010-01-20 16:59:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020107', 1200.00, '2010-01-20 17:10:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020108', 4200.00, '2010-01-20 18:22:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020109', 2100.00, '2010-01-20 18:43:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020110', 3400.00, '2010-01-20 19:12:00:000')
INSERT INTO Test1 VALUES(221002, 'A1020111', 5500.00, '2010-01-20 22:19:00:000')

SELECT * FROM Test1

SELECT
CONVERT(VARCHAR(10), PaidDate, 101) Day,
CONVERT(VARCHAR(2), PaidDate, 108) Hour,
SUM(PaidAmount) Amount
FROM Test1
GROUP BY
CONVERT(VARCHAR(10), PaidDate, 101),
CONVERT(VARCHAR(2), PaidDate, 108)

Thursday, January 28, 2010

SQL Server - INSERT Statements

INSERT can be used in different ways. The following examples will guide you the method of row constructions using INSERT.
CREATE TABLE Test1(
Column1 INT,
Column2 VARCHAR(20))

INSERT INTO Test1 VALUES(1001, 'Andrews')


INTO is an optional keyword and you can omit it from the statement.
INSERT Test1 VALUES(1002, 'Mathews')

SELECT statement can also be used in row construction using INSERT
INSERT INTO Test1
SELECT 1003, 'Saarika'

INSERT INTO Test1
SELECT 1004, 'Haarika'


INSERT statement can be used using 'UNION' and 'UNION ALL'
INSERT INTO Test1
SELECT 1005, 'Saritha'
UNION
SELECT 1006, 'Haritha'
UNION
SELECT 1007, 'Vanitha'
UNION
SELECT 1008, 'Kavitha'

SELECT * FROM Test1

SQL Server - Foreign Key Relationships

To establish a connection between two or more tables, foreign key constraint will be used. One table's primary key column can be referred to another table's column as foreign key. It is to enforce a constraint to match the foreign key column data-type and values with parent key column data-type and values. The foreign key columns are used in joins while retrieving the data as the relationship between the tables indicates that the tables have been optimized to be combined.

There is no specific limitation to have number of foreign key constraints in a table, however, it is recommended to have not more than 253 foreign key constraints in a table. We can create a foreign key by defining REFERENCES or FOREIGN KEY constraint while creating or modifying a table.

Look into the following examples.

-------------
METHOD-1
-------------
CREATE TABLE studentGroups(
GroupID BIGINT PRIMARY KEY,
Groups VARCHAR(20))

CREATE TABLE student_details(
stdID BIGINT PRIMARY KEY,
student_Name VARCHAR(40) NOT NULL,
GroupID BIGINT REFERENCES studentGroups(GroupID))


-------------
METHOD-2
---------------
CREATE TABLE studentGroups(
GroupID BIGINT PRIMARY KEY,
Groups VARCHAR(20))

CREATE TABLE student_details(
stdID BIGINT PRIMARY KEY,
student_Name VARCHAR(40) NOT NULL,
GroupID BIGINT,
CONSTRAINT fk_group FOREIGN KEY(GroupID) REFERENCES studentGroups(GroupID))


------------------------------------------------------------
METHOD-3 -- ADDING FOREIGN KEY CONSTRAINT ON EXISTING TABLE
------------------------------------------------------------
CREATE TABLE studentGroups(
GroupID BIGINT PRIMARY KEY,
Groups VARCHAR(20))

CREATE TABLE student_details(
stdID BIGINT PRIMARY KEY,
student_Name VARCHAR(40) NOT NULL,
GroupID BIGINT)

ALTER TABLE student_details
WITH CHECK ADD CONSTRAINT fk_group FOREIGN KEY(GroupID)
REFERENCES studentGroups(GroupID)

SQL Server - GETDATE(), CURRENT_TIMESTAMP in DEFAULT statements

Most of the transactional tables generally do require to have a column with DEFAULT value as current date time to identify when the transaction had took place. To accomplish this, In CREATE TABLE statement you can use GETDATE() function or CURRENT_TIMESTAMP (SQL-92 niladic function) as DEFAULT. However there is no difference in output.

CREATE TABLE Employee(
empID BIGINT PRIMARY KEY,
EName VARCHAR(40) NOT NULL,
Application_Date SMALLDATETIME DEFAULT CURRENT_TIMESTAMP,
Reporting_Date SMALLDATETIME DEFAULT GETDATE(),
Relieved_Date SMALLDATETIME DEFAULT '12/31/2010', -- mmddyyyy
Address VARCHAR(20) NULL)

INSERT INTO Employee(empID, Ename, Address) Values (1001, 'Catherin', 'NewYork')

SELECT * FROM Employee

SQL Server - CASE Function- Examples

I heard many people asking about alternative for Oracle DECODE in SQL Server. There is no such function in SQL Server but through CASE function we can construct the same expression.

CASE function evaluates a list of conditions and returns one of multiple possible result expressions. It has two formats:


(1) Simple Case Function
(2) Searched Case Function

The simple CASE function compares an expression to a set of simple expressions to determine the result. The searched CASE function evaluates a set of Boolean expressions to determine the result. Both formats support an optional ELSE argument.

Refer to the following examples:

------------------------------------------------
METHOD-1 - SIMPLE CASE FUNCTION
------------------------------------------------

CREATE TABLE Customers(
CustomerID INT,
CustomerName VARCHAR(30),
ActiveStatus BIT
)

INSERT INTO Customers VALUES(100, 'John', 0)
INSERT INTO Customers VALUES(200, 'Kate', 1)
INSERT INTO Customers VALUES(300, 'Julia', 1)
INSERT INTO Customers VALUES(400, 'Maddy', 0)

SELECT CustomerID,
CustomerName,
ActiveStatus = CASE ActiveStatus
WHEN 1 THEN 'Active'
ELSE 'Inactive'
END
FROM Customers

----------------------------------------------------
METHOD-2 - SEARCHED CASE FUNCTION
Case Function With Expressions
----------------------------------------------------
CREATE TABLE StudentMarks(
StudentID INT,
StudentName VARCHAR(30),
Subject1 INT,
Subject2 INT,
Subject3 INT,
)

INSERT INTO StudentMarks VALUES(10, 'John', 90, 86, 79)
INSERT INTO StudentMarks VALUES(20, 'Kate', 69, 58, 43)
INSERT INTO StudentMarks VALUES(30, 'Julia', 77, 76, 87)
INSERT INTO StudentMarks VALUES(40, 'Maddy', 92, 91, 90)

SELECT * FROM StudentMarks

SELECT StudentID,
StudentName,
Subject1,
Subject2,
Subject3,
SUM(Subject1+Subject2+Subject3) TotalMarks,
Grade = CASE
WHEN SUM(Subject1+Subject2+Subject3)>250 THEN 'A'
WHEN SUM(Subject1+Subject2+Subject3) BETWEEN 200 AND 250 THEN 'B'
ELSE 'C'
END
FROM StudentMarks
GROUP BY StudentID, StudentName, Subject1, Subject2, Subject3

--------------------------------------------
Complicated Script (For an example)
--------------------------------------------
SELECT
COUNT(CASE WHEN (
ActiveStatus = 1
AND
DATEDIFF(Day, CONVERT(DATETIME, DateColumn1, Getdate()) >= 1
AND
DATEDIFF(Day, CONVERT(DATETIME, DateColumn1, Getdate()) <= 3 )
THEN 1 ELSE 0 END)
AS Column1,
COUNT(CASE WHEN (
ActiveStatus = 2
AND
DATEDIFF(Day, CONVERT(DATETIME, DateColumn1, Getdate()) >= 4
AND
DATEDIFF(Day, CONVERT(DATETIME, DateColumn1, Getdate()) <= 6 ) THEN 1 ELSE 0 END) AS Column2,

COUNT(CASE WHEN (
ActiveStatus = 3
AND
DATEDIFF(Day, CONVERT(DATETIME, DateColumn1, Getdate()) >= 7
AND
DATEDIFF(Day, CONVERT(DATETIME, DateColumn1, Getdate()) <= 9 ) THEN 1 ELSE 0 END) AS Column3,

FROM Table1
INNER JOIN Tabl2 On Table1.Column9 = Table2.Column9
INNER JOIN Tabl3 On Table3.Column5 = Table2.Column6



SQL Server - Working with Duplicate Records

The following examples deals with how to retrieve or find duplicate records and how to remove them in various scenarios. Someone has asked me that he had imported a file twice by mistake and which caused duplicates in the data and he wanted to revert back it. In such scenarios, the following code will help you out even if there is no unique identity in the data.

To retrieve the duplicate records, use the following code
SELECT Column2, Column3, COUNT(*) FROM Test1
GROUP BY Column2, Column3
ORDER BY COUNT(*) DESC


Or you can use the following

SELECT Column2, Column3, COUNT(*) RepeatedCount FROM Test1
GROUP BY Column2, Column3
ORDER BY RepeatedCount DESC


How to remove such duplicates?

DELETE FROM Test1
WHERE column1 NOT IN
(SELECT MAX(Column1) FROM Test1
GROUP BY Column2, Column3)


However, the above code will not help you if you have multiple duplicate records. Moreover if there is no duplicate for any record, such records will also be deleted. Refer to the following scenario.


CREATE TABLE Test1(
column2 INT,
column3 VARCHAR(10))

INSERT INTO Test1 VALUES(10, 'Greens')
INSERT INTO Test1 VALUES(20, 'Springs')
INSERT INTO Test1 VALUES(30, 'Meadows')
INSERT INTO Test1 VALUES(10, 'Greens')
INSERT INTO Test1 VALUES(20, 'Springs')

SELECT * FROM Test1

SELECT Column1= IDENTITY(INT, 1,1),
Column2,
Column3
INTO Test2
FROM Test1
GROUP BY Column2, Column3

SELECT * FROM Test2


if you require a identity value to all your data then use the following; Drop the table if it is already exists.


DROP TABLE Test2;

SELECT Column1= IDENTITY(INT, 1,1),
Column2,
Column3
INTO Test2
FROM Test1

SELECT * FROM Test2


If you require a identity value to the unique records then


DELETE FROM Test2 WHERE Column1 IN
(
SELECT MAX(Column1) FROM Test2
WHERE Column2 IN
(
SELECT Column2 FROM Test2
GROUP BY Column2, Column3
HAVING COUNT(*) > 1
)
GROUP BY Column2, Column3
)

SELECT * FROM Test2


In the above scenarios we have an identity column which is mandatory while removing such duplicates from the system. If your table do not have such column then you need to dump all the unique records into another table in which you should maintain one identity column. Refer to the following scenario.


CREATE TABLE Test1(
column1 INT IDENTITY,
column2 INT,
column3 VARCHAR(10)
)

INSERT INTO Test1 VALUES(10, 'Greens')
INSERT INTO Test1 VALUES(20, 'Springs')
INSERT INTO Test1 VALUES(10, 'Greens')
INSERT INTO Test1 VALUES(20, 'Springs')

SELECT * FROM Test1

Introduction to Databases - DBMS History

When I was asked to write something on ORDBMS, I thought it would be better to explain RDBMS first. While I am writing about RDBMS, I am thinking again of discussing about DBMS before I move to RDBMS. Hence, this article has become a bit lengthy but covers the core concepts of FMS, DBMS, RDBMS, OODBMS, and ORDBMS.

As you know databases are being used more than ever before to store and to access information. Due to the ease of maintenance and outstanding performance of databases, the growth of database technologies has been increasing rapidly. Moreover, DBMS had thrived over the World Wide Web. Different web-applications are retrieving the stored data and the answers are displayed in a formatted form using web languages like HTML. This article talks about the evolution of databases from the beginning and discusses their relative strengths and weaknesses.

File management system

Before the Database systems were introduced, data in software systems was stored in flat files. This type of system is called File Management System (FMS). This typical file-processing system is supported by a conventional operating system. The system stores permanent records in various files, and it needs different application programs to extract records from, and add records to, the appropriate files.



In a file-based system, different programs in the same application may be correlated or interact with different data files. There is no particular system enforcement and no relation between these data files.

Keeping organizational information in a file-processing system has a number of major disadvantages.

Data Redundancy and Inconsistency Assume that different users or programmers have written the application programs or created the files over a period of time. Probably the files have the same or different formats or the programs might be written in different languages. The same information may be duplicated in several files hence redundancy is introduced in the data. And if the data is redundant we can surely say the data is inconsistent as well. Different programmers access the same data simultaneously, and any modifications to the data might not be reflected in the data seen by the other.


Atomicity Problems:
Suppose, we were in the process of changing or inserting some data into the data files or somewhere else, and just then the computer system fails. It could be a power failure or some other known reasons for failure. What would one expect then? We should have the data in the same state as it was before the system failure occurred, right? This is called atomicity and it is difficult to ensure the same in a conventional file-processing system.


Security Problems
If we give authority on the entire database to all the users, then there may be a chance of security violation. Access to each user should be restricted to only certain data. But, in FMS, applying such security constraints is difficult.

Data Isolation
To retrieve particular data, what we really need is an application. The data may be in different files and files may be in different formats. Writing new application programs every time is a hectic issue and difficult.

Integrity Problems
We need to maintain particular rules in any application while storing the data. These are generally known as Integrity Constraints. In the FMS model, the data stored in data files won’t satisfy certain types of rules.

Unanticipated Queries
Through this conventional file-processing system we can’t retrieve data we need in an efficient manner. More responsive data-retrieval systems are required for general use.

Concurrent Access Anomalies
In large multi-user systems, multiple users may access the same file or record simultaneously. In FMS it is difficult to handle, and there is no way to find a particular record. Every search starts from the beginning of the file and examines each and every record in the file.

May be, handling the above issues are possible in a file-based system. The real issue was that, though all these are common issues of concern to any data-intensive application, each application had to handle all these problems on its own. In fact, all or some of the above problems can be handled by additional programming in each application. But writing an application for each task is time-consuming. The application programmer is burdened with not only implementing the application business rules but also with these other common issues.

DATABASE MANAGEMENT SYSTEM :
DBMS evolved when programmers were fed-up with FMS. The DBMS is a central system, which provides a common interface between the data and the various front-end programs in the application. It also provides a central location for the whole data in the application to reside.


As we have already discussed that DBMS is having a centralized nature, the above picture illustrates the same. Applications are different, but the common data storage is the same. It overcame the limitations or disadvantages of FMS. What are they, let us discuss here.

Reduced Data Redundancy
Due to its centralized nature, the whole data will be stored in one database. So there is no chance of duplication or repetitive values. It is not possible for the data to be consistent all the way. There may be a chance of redundancy due to different reasons. May be, business or technical issues causes the data to be redundant to a certain extent. But we should be careful in controlling such redundancy to a large extent.

Data Consistency
If the data is not duplicated then we can surely say the data is consistent. Data consistency is possible only when data redundancy is reduced.

Data Integration
As we have discussed earlier, there should be some constraints in storing and accessing the data from the data file. It is possible in DBMS to give such type of integrity constraints, as all the related data is stored in one single database.

Data Sharing & Security
Data will be stored in a centralized way. So, the related data can be shared by single or multiple users. Security restrictions can also be applied.

Better Maintenance
As the data is stored in a centralized manner, we can control the data better and maintenance is also lesser and easier.

Well, the above said are to be considered as benefits when compared with the file management system. Because of these, DBMS became very popular and attracted large users in hardly anytime. Database Systems can be categorized according to the data structures and operators they present to the user.

(1) E-R Model
(2) Hierarchical Data Model
(3) Network Data Model
(4) Semi Structured Data Model
(5) Relational Data Model
(6) Object Oriented Data Model
(7) Object Relational Data Model

Among all data models mentioned here, relational model followed by object-oriented and object-relational models enjoyed most popularity. Let’s discuss these ones by one now.


RELATIONAL DATABASE MANAGEMENT SYSTEM

Dr. Edgar F. Codd, a research scholar invented the relational database concept in 1970 at IBM, which we refer to as RDBMS. This can be considered an extension of hierarchical and network models. He proposed certain rules to be qualified as a relational model, which are known as Codd’s rules.

In RDBMS, we do not have any parent-child relationships concept. All the data is in the form of simple columns and rows in a table. Each table is an individual and independent entity and we need not use any physical pointers or physical links to connect the entities like what we used to have in-network and hierarchical models. All data is maintained in the form of tables consisting of rows and columns. Data in two tables are related through common columns. Operators are provided for operating on rows in tables. Because of this, querying becomes very easy. This was one of the main reasons for the relational model to become more popular with programmers.

The difference between RDBMS and DBMS is, DBMS doesn’t support client-server architecture whereas RDBMS supports. Relationships between tables or files are maintained programmatically in a DBMS whereas the relationship between two tables is specified at the time of table creation. DBMS may satisfy less than 7 of Codd's rules whereas, to be a relational model, more than 7 or 8 rules must be satisfied.

A sample project schema along with their relations are given below :




Benefits :
(1) In RDBMS system, the tables are simple, data is easier to understand and communicate with others.
(2) RDBMS are flexible, users do not have to use predefined keys to input information.
(3) Automatic optimization of searching is possible in RDBMS
(4) Structure Query language (SQL) is easier to learn and implement to satisfy the Codd’s rules.
(5) It supports large and very large databases.
(6) RDBMS are more productive because SQL is easier to learn. This allows users to spend more time inputting instead of learning.

Limitations
(1) Not much efficient and effective integrated support.
(2) Do not have enough storage area to handle data such as images, digital, and audio/video.
(3) Relational tables are flat and do not provide good support for nested structures, such as sets and arrays. And also certain kinds of relationships, such as sub-typing between database objects are hard to represent in this model.
(4) RDBMS technology did not take advantage of Object-oriented programming concepts, which is very popular because of its approach.
(5) All the data must be in the form of tables where relationships between entities are defined by values.

Well, to know the implementation of the relational model, please wait for some time to post the next articles, Schema Designs and Normalization will let you know how to create a schema and what we need to know before its creation.

OBJECT ORIENTED DATABASE MANAGEMENT SYSTEM (OODBMS) :

As you are aware, web and Internet usage is rapidly increasing nowadays. To meet the challenge of the web and to overcome the limitations of RDBMS, OODBMS was developed. OODBMS stands for Object-oriented database management system, which we can define as; it is a combination of Object-Oriented Programming and Relational Database Management System. Actually, RDBMS did not take advantage of Object-Oriented approach whereas it has gained widespread acceptance in the IT industry.

Providing consistent, data-independent and security is the main objective of the OODBMS, and also to control and extensible data management services to support the object-oriented model. In the relational model, we can’t handle big and complex data whereas in OODBMSs we can.



OODBMSs are well integrated and designed to work with complex C#, C++, and Java object models. OODBMS is actually meant for object storage and object sharing purpose and it is the solution for persistent data handling. It allows the storage of complex data structures that cannot be easily stored using traditional database technology.

Inheritances, data encapsulation, object identity, and polymorphism are the main characteristics of object-oriented programming. By defining new objects, we can develop solutions to complex problems in inheritance. Objects are related and shared within a network and have an independent identity. The object identity (OID) works behind the scenes to ensure the uniqueness of the tuples, which is invisible to the users. Moreover, no limitations are required on the values. If we take the same thing in RDBMS, then we have to worry about uniquely identifying tuples by their values and making sure that no two tuples have the same primary key values.

On the other hand, polymorphism and dynamic binding are useful to create objects to provide solutions to the complex ones and to avoid coding for every object. These objects may be transient or persistent. By persistent object, we mean the permanent object stored inside the database to survive the execution of data process and in order to eventually reuse it in another process. OODB deals with these objects in a uniform manner.

To create, update, delete, or to retrieve the persistent data, data definition language and data manipulation languages are important in OODBMS. These languages are also useful to define a database, including creating, altering, and dropping tables and to ensure the integrity constraints in tables.

When speaking of the differences between the relational data model and object-oriented data model, one should know that relationships between entities are defined by values in a relational model whereas in OODBMS relationships are represented explicitly. This is to improve the data access performance and to support navigational and associative access to data. In a relational model to access the data, we need a query language whereas in OODBMS the interaction with the database is done by transparently accessing objects. On one hand, this may be regarded as a drawback of OODBMS because the language is completely dependent, uses specific API and it is typical to access data.

OODBMS is undoubtedly a beneficiary system for those who require a low cost with the best performance in development, as it extends the language features with transparently persistent data, associative queries, data recovery, concurrency control, and other capabilities.

OODBMS has not yet received worldwide approval as it was considered to be in the developmental stages. The main disadvantages or limitations of OODBMS are, that there are no common data models, no current standard, and no strong theoretical framework and experiment activity.

OBJECT RELATIONAL DATABASE MANAGEMENT SYSTEM (ORDBMS) :
Object-Relational Database Management System (ORDBMS) is an extended development of the existing relational database system. To overcome certain limitations and to increase the performance level, and to achieve the benefits of the relational model and object model, ORDBMS technology evolved by combining the relational databases and object-oriented concepts.

We can incorporate our custom data types, functions, operators, and methods with the database and we can store images, geographical information, and multimedia objects such as audio and video in ORDBMS. And it allows us to raise the level of abstraction at which we want to view the problem domain.

In RDBMS, it is difficult to map the objects to the database whereas in OODBMS object-oriented features are supported extensively. However, in ORDBMS limited support of object-oriented features, mostly to new data types. Moreover, it supports abstract data types (ADTs) and complex relationships.

ORDBMSs have the same query centric approach to data management as we have in RDBMS. Through declarative SQL statements, we can handle the data access and there is no procedural or object-at-a-time, navigational interface. ORDBMS allows us to continue using our existing systems, without having to make major changes. Another advantage is that it allows users and programmers to start using object-oriented systems in parallel. In ORDBMS technology the concept of extensibility is a principal innovation.

ORDBMSs have extended functionality of DBMS and information system which can be deployed over different machines as it has a central server program whereas OODBMS has a typical distributed data architecture. This is the difference in architectures of ORDBMS and OODBMS. Another difference between ORDBMS and OODBMS is, an object-oriented database’s design is based solely on Object-Oriented Analysis and Design (OOAD) principles whereas ORDBMS is an extended development of the traditional relational database with object-oriented concepts structures such as abstract datatype, nested tables, and varying arrays.

In simple words, we can say that ORDBMSs synthesize the features of RDBMSs with the best ideas of OODBMSs.


Reference: Web, Books and Journals

Big Data & SQL

Hi Everybody, Please do visit my new blog that has much more information about Big Data and SQL. The site covers big data and almost all the...