Tuesday, December 06, 2011

Data Access Performance Comparison in .NET

When writing data access code you are probably using some sort of ORM or data mapper to make your life easier while retrieving and persisting data. We all know that this does not come for free, ORMs allow you to move faster, but there is always a performance hit. I recently started reading about Dapper .NET, an open-source, lightweight, single-file object mapper that was developed for the stackoverflow.com site to execute high performance queries. After seeing its amazing performance results on their project page, I decided to write my owns performance tests to compare it to other ORMs (Entity Framework, NHibernate), and also to plain ADO.NET code.

For these tests, I am using the Chinook Database for SQL Server. I defined a simple repository interface and implemented it using different data access approaches. The interface implemented is:

public interface IRepository
 Artist GetArtistById(int id);
 IEnumerable<Song> GetSongsByArtist(string name);

The first method executes a simple query in a single table by the primary key. The second method executes a query in multiple tables using joins.

There are multiple implementations of this repository interface including plain ADO.NET code with DataReaders, Entity Framework, NHibernate and Dapper .NET. For the Entity Framework, there are four implementations: Linq To Entities, compiled Linq To Entities, native SQL query and stored procedures. For NHibernate, there are implementations with HQL query, native SQL query, and stored procedure.

All the code is available at https://github.com/lerocha/DotNetDataAccessPerformance.

Single Table Query by Primary Key

The first set of tests is to execute a simple query in a single table to get one record by the primary key. The SQL query is:

SELECT ArtistId, Name FROM Artist WHERE Artist.ArtistId=@id

The following table contains the results per multiple runs:

1 run10 runs100 runs500 runs1000 runs

The 1-run test is executed twice, the first execution result is not shown since I am only considering hot start. The following chart shows the results of 1,000 runs. As you can see, using Entity Framework with Linq To Entities (non-compiled) has the worst performance result (2,552.5 ms). It gets a little bit better if you use EF compiled Linq query (2,099.8), then improvements starts when using EF with native query (859.5 ms). EF using stored procedure and all NHibernate approaches are way better, all around 500 ms. And finally, the performance of Dapper is very close to the plain ADO.NET code, providing an level of abstraction with minimal performance impact.

Multiple Table Query with Joins

The second test executes joins in multiple tables and filters by a non-key column. The SQL query is:

 SELECT Album.Title as AlbumName, 
 Track.Name as SongName, Artist.Name as ArtistName
 FROM Artist
 INNER JOIN Album ON Album.ArtistId = Artist.ArtistId
 INNER JOIN Track ON Track.AlbumId = Album.AlbumId
 WHERE Artist.Name=@name

The following table contains the results per multiple runs:

1 run10 runs100 runs500 runs1000 runs

The following chart shows the results of 1,000 runs. As you can see, the slowest implementation is NHibernate with strong type query, e.g. using NHibernate.IQuery.List(), it took 6,841 ms. EF with LinqToEntities (non-compiled) is a little bit faster (5,398 ms). Using NHibernate without strong type query, e.g. using NHibernate.IQuery.List() is significant faster (3,647 ms) than its strong type equivalent (6,841 ms). Here, we can see the performance improvement when using EF compiled Linq queries (1,071 ms) in comparison with EF non compiled Linq To Entities (5,398 ms). The EF compiled Linq query implementation showed performance equivalent to EF with stored procedures, and NHibernate with native query and stored procedure. And finally, the performance of Dapper (679-685 ms) is again very close to the plain ADO.NET code (644-658 ms), with minimal performance impact.

The performance improvements using Dapper in comparison with Entity Framework and NHibernate is very significant. For database reads that need a high performance execution, you might consider using Dapper to get the best performance with some good abstraction from plain ADO.NET code.

The source code of these tests is available at github and I will continue to work on it to incorporate any feedback and also add additional tests for adding, deleting and updating records. Stay tunned!