Concurrency and isolation

Created 24th March, 2008 14:38 (UTC), last edited 25th March, 2008 02:58 (UTC)

Or "Can your database do this?"

Here is one of the tests for the Postgres support within FOST.3™ (I've changed the code a little and omitted a couple of parts of incidental tests for clarity).

{
    FSLib::Transaction transaction( dbc );
    transaction.execute( L"INSERT INTO test VALUES (1, 'Hello')" );
    transaction.commit();
}
{
    FSLib::Transaction transaction( dbc );
    transaction.execute( L"INSERT INTO test VALUES (2, 'Goodbye')" );

    FSLib::DBConnection cnx( read_dsn.value() + L" dbname=FSL_Test" );
    FSL_CHECK_EQ( FSLib::field_cast< long >( cnx.recordset( L"SELECT COUNT(id) FROM test" ), 0 ), 1 );
    transaction.commit();
}
FSL_CHECK_EQ( FSLib::field_cast< long >( dbc.recordset( L"SELECT COUNT(id) FROM test" ), 0 ), 2 );

The code is pretty simple and you should be able to knock out your own version to test your database pretty easily.

The code in the first set of braces just adds a row to a database table so that there is something in there. The second block is the more interesting.

Firstly we open a transaction and use it to insert another row to the same table, but we don't commit it. We then open a second database connection and do a table row count. Because the second insert statement is still wrapped up in an uncommitted transaction the second database connection can't see it. It gets a count of 1 for the rows in the table. After we commit the transaction then the same SQL statement will return two rows (it doesn't matter which connection we then run the count in, they'd both see it).

If you try this on many other database engines (for example SQL Express) the first row count will hang. The reason is that the simplest implementation that provides isolation between connections and uncommitted transactions is to lock the parts of the database that are being updated — it trades a little concurrency (normally associated with speed) for a great deal of safety. Database engines that don't provide proper isolation are a continuous source of hard to track down bugs.

The most unsophisticated database engines (at least out of those than provide isolation at all) perform table locking. Any statement that changes data in the database locks the entire table that is being changed. More sophisticated engines lock the pages being changed — that is they lock the rows being updated and a few nearby rows. The most sophisticated locking databases do row level locking — they only lock the actual rows that are being updated.

Although row locking is great — it's certainly a whole lot better than table locking or page locking — it still isn't nearly as good as what Postgres does. Postgres doesn't need to use locking at all because it uses versioning. It remembers as many versions of data is is needed to be able to fully isolate everything that is going on without putting a stop to any of it — it has no effect on the concurrency of any other statements.

With this and a few other tricks that Postgres has up its sleeve I'm expecting that I should see some pretty good performance gains on the new Postgres support in FOST.3 — only time will tell for sure though, but it's looking good.


Categories: