A dirty read happens when a query reads data that another transaction has changed but not yet committed, and it can quietly corrupt your reports and applications. Dirty reads are the kind of problem that pulls in both database administrators and developers to untangle. This article explains what dirty reads are and what you can do to increase data integrity in your SQL Server queries.
Managing a Microsoft SQL Server is a full-time task, and your ability to handle nuances of the server software determines the security and integrity of your data. The chance of dirty reads increases as more users access the data and stored procedures query tables. Dirty reads can be an administrative nightmare if you don’t know how to avoid them.
What are Dirty Reads?
To understand dirty reads, you first need to understand how a database works at a basic level. A database is responsible for storing data, and database software has two main functions: writing data to tables for storage, and reading stored data back, either to return it to a user or to process and modify it before writing it back in edited form.
When a database gets busy with hundreds of user connections, you run the risk of actions executing concurrently on the same record set. More importantly, a write transaction and a read could hit the same row at the same time. For instance, a stored procedure could be updating a user's order at the same moment you query that order for its latest activity.
What happens if the order is read while the database is still writing the order changes? In this example, your query could read the order as "active" before the transaction that cancels it finishes committing. The result is that you ship product for an order that was canceled. This is just one instance of the severe issues dirty reads can cause.
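As a sketch of that race, assume a hypothetical Orders table with an order_id and a status column; the session split and timing are illustrative:

```sql
-- Session 1: cancels an order inside a transaction that has not yet committed.
BEGIN TRAN;
UPDATE Orders
SET status = 'canceled'
WHERE order_id = 42;
-- ... long-running work, or an eventual ROLLBACK ...

-- Session 2: under READ UNCOMMITTED, this query can see the uncommitted
-- 'canceled' value -- or read 'active' moments before the cancellation
-- commits. Either way, it acts on in-flight data.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT order_id, status
FROM Orders
WHERE order_id = 42;
```

If Session 2 ran under READ COMMITTED instead, it would wait for Session 1 to finish rather than read the in-flight row.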
It’s hard to imagine that queries running within milliseconds can happen at the same time, but they do happen. Dirty reads are not only a nuisance but can be disastrous when data integrity is critical for business functions.
Dirty Read Example and Prevention
The best way to understand how a dirty read works is to look at two pieces of SQL side by side. Below are two example code snippets:
BEGIN TRAN
INSERT INTO Customer (name, address, active)
VALUES ('Joe Smith', '111 Happy Ln', 1)
COMMIT TRAN
SELECT * FROM Customer WHERE active = 1
In the first snippet, a customer is inserted into the database with the "active" column set to "1," the bit value T-SQL uses to represent true.
The second snippet queries all active customers. When you run these statements thousands of times a day against the same table, you run the risk of a dirty read. This is a small example, but dirty reads can lead to critical data errors.
If a dirty read happens in this example, the SELECT statement could read the new customer row before the insert transaction commits. If that transaction is then rolled back, a report built on the query would include a customer that was never actually saved. The reverse timing problem exists too: a query that runs just before the commit leaves the newly active customer out. Multiply this across dozens of records, and the report is badly wrong either way.
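The rollback case can be sketched as two interleaved sessions; the session split and timing here are illustrative:

```sql
-- Session 1: inserts a customer but ultimately rolls back.
BEGIN TRAN;
INSERT INTO Customer (name, address, active)
VALUES ('Joe Smith', '111 Happy Ln', 1);
-- ... a later validation step fails ...
ROLLBACK TRAN;

-- Session 2: if it runs between the INSERT and the ROLLBACK under
-- READ UNCOMMITTED, it sees a customer that never really existed.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM Customer WHERE active = 1;
```

The report produced by Session 2 cannot be reconciled against the table afterward, because the row it counted is gone.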
You can't stop every dirty read across an entire system, but you can add a command to the top of your queries so they never read uncommitted changes. The following line of code can be added to queries that read data:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
Note that some database administrators restrict which isolation levels developers may use. Stricter isolation levels reduce concurrency, because readers can be blocked by in-flight writes, so they can have a negative impact on database speeds, especially on servers with heavy workloads. It is worth knowing that READ COMMITTED is SQL Server's default; dirty reads typically appear when queries opt into READ UNCOMMITTED or use the NOLOCK table hint in pursuit of speed.
Dirty reads are more common in busy databases. Rather than tightening the isolation level everywhere, add the above statement to the frequently used queries where accuracy matters most, which limits the performance cost.
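As a sketch of where the statement belongs, here it is at the top of a report procedure, before the reads it should protect; the procedure name and query are hypothetical:

```sql
CREATE OR ALTER PROCEDURE dbo.GetActiveCustomers
AS
BEGIN
    -- Ensure this report never reads uncommitted rows, even if the
    -- connection was previously switched to READ UNCOMMITTED.
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

    SELECT name, address
    FROM Customer
    WHERE active = 1;
END;
```

Setting the level inside the procedure keeps the guarantee local to this report, so other workloads on the server keep whatever isolation level they already use.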
You can't control every query in every application, and any session still using READ UNCOMMITTED or NOLOCK remains exposed. But for the queries where you apply it, this step stops SQL Server from returning uncommitted data and creating issues with your business reports and processes.