Tuesday, December 06, 2011

Data Access Performance Comparison in .NET

When writing data access code you are probably using some sort of ORM or data mapper to make your life easier while retrieving and persisting data. We all know that this does not come for free, ORMs allow you to move faster, but there is always a performance hit. I recently started reading about Dapper .NET, an open-source, lightweight, single-file object mapper that was developed for the stackoverflow.com site to execute high performance queries. After seeing its amazing performance results on their project page, I decided to write my owns performance tests to compare it to other ORMs (Entity Framework, NHibernate), and also to plain ADO.NET code.

For these tests, I am using the Chinook Database for SQL Server. I defined a simple repository interface and implemented it using different data access approaches. The interface implemented is:

public interface IRepository
{
 Artist GetArtistById(int id);
 IEnumerable<Song> GetSongsByArtist(string name);
}

The first method executes a simple query in a single table by the primary key. The second method executes a query in multiple tables using joins.

There are multiple implementations of this repository interface including plain ADO.NET code with DataReaders, Entity Framework, NHibernate and Dapper .NET. For the Entity Framework, there are four implementations: Linq To Entities, compiled Linq To Entities, native SQL query and stored procedures. For NHibernate, there are implementations with HQL query, native SQL query, and stored procedure.

All the code is available at https://github.com/lerocha/DotNetDataAccessPerformance.

Single Table Query by Primary Key

The first set of tests is to execute a simple query in a single table to get one record by the primary key. The SQL query is:

SELECT ArtistId, Name FROM Artist WHERE Artist.ArtistId=@id

The following table contains the results per multiple runs:

1 run10 runs100 runs500 runs1000 runs
DataReaderNativeQuery0.02.024.9134.1264.4
DataReaderStoredProcedure0.02.226.3133.8266.4
DapperNativeQuery0.12.528.0141.5285.6
DapperStoredProcedure0.02.428.0137.5276.1
EntityFrameworkCompiledLinqQuery2.222.5207.21041.02099.8
EntityFrameworkLinqToEntities5.525.2267.71263.62522.5
EntityFrameworkNativeQuery0.78.285.2426.0859.5
EntityFrameworkStoredProcedure1.45.258.3275.8552.8
NHibernateHqlQuery0.84.448.1253.6509.5
NHibernateHqlQueryStrongType0.14.147.2251.4508.7
NHibernateNativeQuery1.14.446.1256.9490.1
NHibernateStoredProcedure0.05.257.2294.0589.8

The 1-run test is executed twice, the first execution result is not shown since I am only considering hot start. The following chart shows the results of 1,000 runs. As you can see, using Entity Framework with Linq To Entities (non-compiled) has the worst performance result (2,552.5 ms). It gets a little bit better if you use EF compiled Linq query (2,099.8), then improvements starts when using EF with native query (859.5 ms). EF using stored procedure and all NHibernate approaches are way better, all around 500 ms. And finally, the performance of Dapper is very close to the plain ADO.NET code, providing an level of abstraction with minimal performance impact.


Multiple Table Query with Joins

The second test executes joins in multiple tables and filters by a non-key column. The SQL query is:

 SELECT Album.Title as AlbumName, 
 Track.Name as SongName, Artist.Name as ArtistName
 FROM Artist
 INNER JOIN Album ON Album.ArtistId = Artist.ArtistId
 INNER JOIN Track ON Track.AlbumId = Album.AlbumId
 WHERE Artist.Name=@name

The following table contains the results per multiple runs:

1 run10 runs100 runs500 runs1000 runs
DataReaderNativeQuery0.05.963.0321.8644.2
DataReaderStoredProcedure0.06.265.0332.3658.6
DapperNativeQuery0.46.468.0346.1685.5
DapperStoredProcedure0.16.767.2342.3679.7
EntityFrameworkCompiledLinqQuery1.09.9103.3525.61071.0
EntityFrameworkLinqToEntities5.658.2539.52699.95398.3
EntityFrameworkNativeQuery1.113.1135.0658.81328.3
EntityFrameworkStoredProcedure1.19.799.0489.1983.7
NHibernateHqlQuery3.435.6359.51806.23647.1
NHibernateHqlQueryStrongType6.868.3681.53430.06841.4
NHibernateNativeQuery1.110.6103.2517.7998.7
NHibernateStoredProcedure0.810.1102.0516.21033.8

The following chart shows the results of 1,000 runs. As you can see, the slowest implementation is NHibernate with strong type query, e.g. using NHibernate.IQuery.List(), it took 6,841 ms. EF with LinqToEntities (non-compiled) is a little bit faster (5,398 ms). Using NHibernate without strong type query, e.g. using NHibernate.IQuery.List() is significant faster (3,647 ms) than its strong type equivalent (6,841 ms). Here, we can see the performance improvement when using EF compiled Linq queries (1,071 ms) in comparison with EF non compiled Linq To Entities (5,398 ms). The EF compiled Linq query implementation showed performance equivalent to EF with stored procedures, and NHibernate with native query and stored procedure. And finally, the performance of Dapper (679-685 ms) is again very close to the plain ADO.NET code (644-658 ms), with minimal performance impact.


The performance improvements using Dapper in comparison with Entity Framework and NHibernate is very significant. For database reads that need a high performance execution, you might consider using Dapper to get the best performance with some good abstraction from plain ADO.NET code.

The source code of these tests is available at github and I will continue to work on it to incorporate any feedback and also add additional tests for adding, deleting and updating records. Stay tunned!

6 comments:

ahristov said...

Hello,

Thank you for sharing these statistics.

I believe we can do some optimization for the nHibernate side.

I put some changes along with commentaries what do I mean here:

https://github.com/atanashristov/DotNetDataAccessPerformance/blob/master/DotNetDataAccessPerformance/Repositories/NHibernateHqlQueryStrongTypeRepository.cs

I'll be glad if you consider to test with these versions I'm suggesting.

Thank you!

Atanas

ahristov said...

Thank you for sharing these project and the numbers!

I believe I have some optimization for a nHibernate selection, I put some code together with commentaries at:

https://github.com/atanashristov/DotNetDataAccessPerformance/blob/master/DotNetDataAccessPerformance/Repositories/NHibernateHqlQueryStrongTypeRepository.cs

I'll be glad if you consider to take a look and give a try if these versions will perform better than the original version.

Thank you!

Atanas

Anonymous said...

Where is Entity Framework Code First?

mattjcowan said...

Nice work. I would also venture to say that LLBLGen would be in the ballpark of the EF compiled query / NHibernate native queries benchmarks (based on my findings http://tinyurl.com/6wgxcyp).

BBI said...

I still found some flaws in the NHibernate HQL implementation:

- Queries are not parameterized, but have parameter values concatenated into the HQL/SQL string (while Dapper, for example, has correct use of parameter placeholders). This will cause some more database load (no DB query plan cache usage) and risk of SQL injection in "real life". To your information: NHibernate uses an Oracle-style colon : as parameter prefix, instead of the native MS SQL Server @ prefix. The Criteria query added by ahristov (GetSongsByArtist2) will also fix this.

- No projection (select list) in join query, will load all columns from the related tables. GetSongsByArtist3 from ahristov also fixes this, but unfortunately still has a concatenated HQL/SQL parameter.

Anonymous said...

I must thank you for the efforts you've put
in penning this site. I'm hoping to view the same high-grade blog posts from you later on as
well. In fact, your creative writing abilities has encouraged me to get my very own website now ;)

Feel free to visit my blog post; hack clash of clans