How to track deadlocks in Firebird

Alexey Kovyazin, 08-May-2019, (c) IBSurgeon

Update conflicts (often called «deadlocks») occur in Firebird applications which perform intensive concurrent updates of data.

As you can see from the article «Transactions in Firebird», there are 3 types of errors which contain the keyword «deadlock»:

deadlock
-update conflicts with concurrent update
-concurrent transaction number is NNN

lock time-out on wait transaction
-deadlock
-update conflicts with concurrent update
-concurrent transaction number is NNN

lock conflict on no wait transaction
-deadlock
-update conflicts with concurrent update
-concurrent transaction number is NNN

All 3 types of update conflicts have in common the fact, that update conflict usually is the result of 2 modification operations which try to modify the same record.
The possible options are the following: UPDATE, DELETE SELECT WITH LOCK, MERGE, UPDATE OR INSERT.

It is relatively easy to track the «victim» operation, which receives the error message – just put it into try… catch and log the parameters used in the query which breaks on the error.

However, with this approach, it is difficult to track the «winner» - i.e., the concurrent transaction where the update was done successfully, without error.

To help developers investigate the update conflicts, Firebird puts into error messages the reference to the concurrent transaction – i.e., the transaction where the concurrent update is not yet committed. Together with Trace API, it gives us the ability to track both conflicting operations.

Let's consider the practical steps on how to do it.

First, we need to configure tracing. For this, create the text file (in our case it is C:\Temp\mytrace.conf) with the following contents:

database			
{
	enabled = true 				
	include_filter="(UPDATE%)|(DELETE%)|(MERGE%)|(SELECT%FROM%WITH LOCK)|(UPDATE OR INSERT%)" 
	log_statement_finish = true
	log_errors = true
	include_gds_codes="deadlock"
	log_initfini = false
	time_threshold = 0
	max_sql_length = 65000

}

The configuration file above enables tracing for all databases, so if you have more than 1 active database, better specify the name of the database in the configuration file, i.e.:
Database=mydatabase.fdb
{
  enabled = true
…
}
Let's review the most important parameters in the configuration file:

Include filter parameter

include_filter="(UPDATE%)|(DELETE%)|(MERGE%)|(SELECT%FROM%WITH LOCK)|(UPDATE OR INSERT%)" 
It means that we want to track only UPDATE and DELETE statements and ignore selects. It is necessary to filter out as many as possible statements, because in the production system under high load the amount of the records in the text log can be too high to analyze it.

Usually, we know what table is participating in update conflict, the good idea will be to put it into the filter, to reduce the number of updates to analyze:

include_filter="(UPDATE T1%)|(DELETE FROM T1%)|(MERGE INTO T1%)|(SELECT% FROM T1%WITH lock%)|(UPDATE OR INSERT INTO T1%)"
What if updates and deletes are done by stored procedures? Well, it makes the whole process more difficult: add to the filter all stored procedures which can be involved in concurrent updates, and also enable logging of procedures:
include_filter="(UPDATE T1%)|(DELETE FROM T1%)|(MERGE INTO T1%)|(SELECT% FROM T1%with lock%)|(UPDATE OR INSERT INTO T1%)|(%SP_UPDATE_T1%)|(%SP_DEL_T1%)" 
log_procedures=true

Error logging parameters

Then, we need to specify the error logging and limit it to the deadlocks.
log_errors = true
include_gds_codes="deadlock"
Please note: parameter include_gds_codes appeared only in Firebird 3.0.2, so if you use 2.5 or 3.0.0 or 3.0.1, it is not possible to filter only deadlock errors, so all errors will be printed to the log.

Demonstration

To demonstrate the overall process, let's prepare the test database – for simplicity, there will be 1 table with 1 primary column.
C:\HQbird\Firebird30>isql 
Use CONNECT or CREATE DATABASE to specify a database
SQL> create database "c:\temp\testdeadlock.fdb" user "SYSDBA" password "masterkey";
SQL> create table t1(i1 integer not null primary key);
SQL> insert into t1(i1) values(1);
SQL> commit; exit;
After that, we need to start trace session:
fbtracemgr.exe -se localhost:service_mgr -start -conf "C:\temp\mytrace.conf" -user SYSDBA -pass masterkey
In this case, fbtracemgr prints log to the screen, but in the production, we need to redirect it to the file, of course.

After the successful start the fbtracemgr prints something like this:

Trace session ID 4 started
Then, let's start 2 isql sessions and try to simulate the update conflict. In the table below we have 2 columns for 2 isql sessions. Each row represents a moment in time when the commands were performed.
Time isql session 1 isql session 2
T1
isql -user SYSDBA -pass masterkey localhost:c:\temp\testdeadlock.fdb
Database: localhost:c:\temp\testdeadlock.fdb, User: SYSDBA
SQL> set transaction wait;
Commit current transaction (y/n)?y
Committing.
SQL> select current_transaction from rdb$database;

  CURRENT_TRANSACTION
=====================
                   15

SQL> select * from t1;

          I1
============
           2

SQL>

 
T2  
isql -user SYSDBA -pass masterkey localhost:c:\temp\testdeadlock.fdb
Database: localhost:c:\temp\testdeadlock.fdb, User: SYSDBA
SQL> set transaction wait;
Commit current transaction (y/n)?y
Committing.
SQL> select current_transaction from rdb$database;

  CURRENT_TRANSACTION
=====================
                   20
SQL> select * from t1;

          I1
============
           2

SQL>

So, at the starting point, we can see that 2 transactions are started, #15 and #20. Then, let's try to perform the concurrent update of the same record:
     
T3
SQL> UPDATE T1 SET i1=5 where i1=2;
 
T4  
SQL> UPDATE T1 SET i1=100 where i1=2;
T5
SQL> commit;
SQL>
Statement failed, SQLSTATE = 40001
deadlock
-update conflicts with concurrent update
-concurrent transaction number is 15
SQL>

We started UPDATE in session 1 but did not commit it.
In session 2, UPDATE «hanged» - it waited for the end of the concurrent transaction, and when we have committed UPDATE in session 1, session 2 received the error:

Statement failed, SQLSTATE = 40001
deadlock
-update conflicts with concurrent update
-concurrent transaction number is 15
SQL>
So, we have a typical situation in case of update conflict with wait transaction.

Let's review the trace log to find out the evidence of the update conflict.

In this case, the trace log is very short, but in the case of the production system, it can be much longer, but the approach to find deadlocks is the same.
fbtracemgr.exe -se localhost:service_mgr -start -conf "C:\temp\fbtrace.conf" -user SYSDBA -pass masterkey
Trace session ID 4 started

2019-05-07T11:28:43.9850 (2692:00000000018C1940) EXECUTE_STATEMENT_FINISH
        C:\TEMP\TESTDEADLOCK.FDB (ATT_12, SYSDBA:NONE, NONE, TCPv6:::1/52938)
        C:\HQbird\Firebird30\isql.exe:8828
                (TRA_15, CONCURRENCY | WAIT | READ_WRITE)

Statement 52:
-------------------------------------------------------------------------------
UPDATE T1 SET i1=5 where i1=2
0 records fetched
      0 ms, 2 read(s), 21 fetch(es), 3 mark(s)


2019-05-07T11:29:45.0190 (2692:00000000018C0040) FAILED EXECUTE_STATEMENT_FINISH
        C:\TEMP\TESTDEADLOCK.FDB (ATT_13, SYSDBA:NONE, NONE, TCPv6:::1/52968)
        C:\HQbird\Firebird30\isql.exe:7032
                (TRA_20, CONCURRENCY | WAIT | READ_WRITE)

Statement 55:
-------------------------------------------------------------------------------
UPDATE T1 SET i1=100 where i1=2
0 records fetched
  40242 ms, 19 fetch(es), 2 mark(s)


2019-05-07T11:29:45.0190 (2692:00000000018C0040) ERROR AT JStatement::execute
        C:\TEMP\TESTDEADLOCK.FDB (ATT_13, SYSDBA:NONE, NONE, TCPv6:::1/52968)
        C:\HQbird\Firebird30\isql.exe:7032
335544336 : deadlock
335544451 : update conflicts with concurrent update
335544878 : concurrent transaction number is 15
As you can see, we have records in the log about both updates – successful and failed, and an error message about deadlock.

Let's consider how to analyze the log:
In the error message, let's locate the identifier of the transaction:

 2019-05-07T11:29:45.0190 (2692:00000000018C0040) ERROR AT JStatement::execute
        C:\TEMP\TESTDEADLOCK.FDB (ATT_13, SYSDBA:NONE, NONE, TCPv6:::1/52968)
        C:\HQbird\Firebird30\isql.exe:7032
335544336 : deadlock
335544451 : update conflicts with concurrent update
335544878 : concurrent transaction number is 15
Near the record about the error (usually right above it), you will see the failed statement:
2019-05-07T11:29:45.0190 (2692:00000000018C0040) FAILED EXECUTE_STATEMENT_FINISH
        C:\TEMP\TESTDEADLOCK.FDB (ATT_13, SYSDBA:NONE, NONE, TCPv6:::1/52968)
        C:\HQbird\Firebird30\isql.exe:7032
                (TRA_20, CONCURRENCY | WAIT | READ_WRITE)

Statement 55:
-------------------------------------------------------------------------------
UPDATE T1 SET i1=100 where i1=2
0 records fetched
  40242 ms, 19 fetch(es), 2 mark(s)
Notice that execution time is huge (~40 seconds)– it indicates the time when the operation awaited the result of the concurrent transaction.

And, finally, search for the transaction number with the successful UPDATE operation: for this, locate for TRA_NNN, where NNN is the transaction number. In our case, it will be TRA_15:

2019-05-07T11:28:43.9850 (2692:00000000018C1940) EXECUTE_STATEMENT_FINISH
        C:\TEMP\TESTDEADLOCK.FDB (ATT_12, SYSDBA:NONE, NONE, TCPv6:::1/52938)
        C:\HQbird\Firebird30\isql.exe:8828
                (TRA_15, CONCURRENCY | WAIT | READ_WRITE)

Statement 52:
-------------------------------------------------------------------------------
UPDATE T1 SET i1=5 where i1=2
0 records fetched
      0 ms, 2 read(s), 21 fetch(es), 3 mark(s)

So, this is it – we found both parties of the update conflict – the successful and the failed.

How to avoid deadlocks?

Now you need to identify the best solution for your application to avoid deadlocks - it can be a change of the transaction parameters, or it is possible to add extra error processing and repeat the operation or perform additional checks on the business level of the application.

 

More details we will consider during the «All About Transactions» workshop at the Firebird Conference 2019 in Berlin, October 17-19.