Saturday, November 29, 2008

Weird thing with inner queries in SQL SERVER 2005

Few days ago, I found a very strange thing with SQL SERVER 2005 and I strongly feel that THIS IS A BUG in SQL SERVER which actually can create lots of data discrepancy.

To prove my point I have created a sample SQL Script. I want you to run the script in SQL SERVER.

Details of the Script

· Create new database.

CREATE DATABASE DB_TEST

· Use this Database.

USE DB_TEST

· Create a table named tblCategory

CREATE TABLE tblCategory
(
CategoryID INT PRIMARY KEY,
CategoryNAME VARCHAR(50)
)

· Create another table named tblProduct

CREATE TABLE tblProduct
(
ProductID INT PRIMARY KEY,
CategoryID INT FOREIGN KEY REFERENCES tblCategory(CategoryID),
ISCOMPLETED BIT
)

· Insert 5 rows in tblCategory

INSERT INTO tblCategory VALUES (1,'Category1')
INSERT INTO tblCategory VALUES (2,'Category2')
INSERT INTO tblCategory VALUES (3,'Category3')
INSERT INTO tblCategory VALUES (4,'Category4')
INSERT INTO tblCategory VALUES (5,'Category5')

· Insert 10 rows in tblProduct

INSERT INTO tblProduct VALUES (1,1,0)
INSERT INTO tblProduct VALUES (2,1,0)
INSERT INTO tblProduct VALUES (3,2,0)
INSERT INTO tblProduct VALUES (4,2,0)
INSERT INTO tblProduct VALUES (5,3,0)
INSERT INTO tblProduct VALUES (6,3,0)
INSERT INTO tblProduct VALUES (7,4,0)
INSERT INTO tblProduct VALUES (8,4,0)
INSERT INTO tblProduct VALUES (9,4,0)
INSERT INTO tblProduct VALUES (10,5,0)

· Select statement to confirm the data is inserted or not.

SELECT * FROM tblCategory
SELECT * FROM tblProduct

· Here is a select query which is not correct. The query tells select ProductID from tblCategory table where CategoryID = 1. But the problem here is tblCategory is not having a column named ProductID. So this query throws an error and that's a correct behaviour.

SELECT ProductID FROM tblCategory WHERE CategoryID = 1

· Here is the magic. I have used the above select query which is not correct as a inner query with a update statement. What to do you think? What will happen? Update query should throw an error. Logically it should.I was thinking the same But just execute this update query and you will be shocked.

UPDATE tblProduct SET IsCompleted = 1 WHERE ProductID IN (SELECT ProductID FROM tblCategory WHERE CategoryID = 1)

· Oops, 10 rows affected. Surpised. All the data in IsCompleted field is set to 1 but my inner query (SELECT ProductID FROM tblCategory WHERE CategoryID = 1) is wrong.

· This is just a sample query but When I executed such a similiar statement, 3364 rows in my table was updated.

SELECT * FROM tblCategory
SELECT * FROM tblProduct

I think this is a bug but actaully it's not. Please read comments for more information on this..


6 comments:

Anonymous said...

You are right, Virendra.
This bug is with SQL Server 2005. The same works fine in SQL Server 2000.

In Microsoft's words:
In some Transact-SQL statements, the subquery can be evaluated as if it were an independent query. Conceptually, the subquery results are substituted into the outer query (although this is not necessarily how Microsoft SQL Server 2005 actually processes Transact-SQL statements with subqueries).

So, it is quite possible that Microsoft has modified the way SQL server processes subqueries in a query in SQL Server 2005.

Anonymous said...

A correction, Virendra.

The problem is with SQL Server 2000 as well. But, I would say that this is not a bug, but a feature for subqueries. Using it in incorrect manner may cause problems, like it did in your case.

I further explored it and found it that it cannot be categorized as bug as Microsoft has provided it intentionally.
One can use the tables of outer queries and their fields in the subquery as well. Since your field was one of the field of your outer query, subquery returned the result.

Hope the explanation will make sense.

Unknown said...

From Vinod Kumar (Working in Microsfot India)

"This is NOT a bug and this is because of name resolution inside. This is by design and has been in the best practices to add the fully-qualified name in the subquery. Just change the name from ApplicationID to tblENQUIRY.ApplicationID inside the sub-query and you will see the desired results.



To take it to the next level, just create another table without any applicatioID column and make the subquery as Select ApplicationID from "yournewtable" and you will still see the same results."

Paresh said...

Hi virendra,
yes this is not bug. some people think it as bug because in sub-query they have used the outer query's column in sub-query's WHERE clause(Correlated subqueries). the inner query first try to find it in current table(inner query's table) and if it is not found, it try to find it in outer query, the concept we use in query with JOIN where we can write column name which are not ambiguous without alias.
so the inner query in:
UPDATE tblAPPLICANTS SET ISCOMPLETED = 1 WHERE APPLICANTID IN (SELECT APPLICANTID FROM tblENQUIRY WHERE ENQUIRYID = 1)

it will return as many time the current field value of the outer query as the number of records in tblENQUIRY. like we write
select 123 from tblENQUIRY. it will display 123 as many time as the number of rows in tblENQUIRY.

Virendra said...

Kevin Cox from Microsoft has commented

"This is not a bug. Read the BOL section on correlated sub queries. Any reference to the outer table makes it a correlated sub query and not a standalone sub query. So you are referencing the outer table in the sub query. If you fully qualified the column, as in “SELECT tblApplicants.ApplicantID…” in your sub query then you could see it easier.



The best practice is to not have any fields in your select list in the sub query, such as "SELECT 1 FROM tblENQUIRY WHERE ENQUIRYID=1"."

Anonymous said...

Well this is a concept in SQL called as Correlated Sqbqueries

back to top