Deadlock between processes

Hi all,

I had a strange issue yesterday, it seems two processes were trying to update the same Task status:

File "/mnt/app/package/ext/shotgun_python_api/3.1.1/shotgun_api3/shotgun.py", line 1377, in 
 update
    record = self._call_rpc("update", params)
File "/mnt/app/package/ext/shotgun_python_api/3.1.1/shotgun_api3/shotgun.py", line 3221, in _call_rpc
    self._response_errors(response)
File "/mnt/app/package/ext/shotgun_python_api/3.1.1/shotgun_api3/shotgun.py", line 3526, in _response_errors
    raise Fault(sg_response.get("message", "Unknown Error"))
Fault: API update() CRUD ERROR #5: Update failed for [Task.sg_status_list]: PG::TRDeadlockDetected: ERROR:  deadlock detected
DETAIL:  Process 28746 waits for ShareLock on transaction 3667197707; blocked by process 32234.
Process 32234 waits for ShareLock on transaction 3667197715; blocked by process 28746.
HINT:  See server log for query details.
CONTEXT:  while updating tuple (21164,67) in relation "display_name_caches"
: ["UPDATE display_name_caches SET \"sg_status_list\" = $1, \"updated_at\" = $2 WHERE \"id\" = $3 /* request_id=5f4cf9c708e64fffe851402ad8f1ce20 */", ["rtk", "2019-10-30 15:09:16.038102", 571473]]

The log is pretty clear, but I would like to find what was going on and don’t know where to look at. My SGEH is running under a Linux distribution, of course I could not find any trace of the two processes.
And moreover, the issue did not come back so I cannot even reproduce it.

Best Regards,
Francois

2 Likes

Hi François,

This can happen on rare occasions when two database transactions are trying to update the same things in a sort of loop. Something like the following:

  • Transaction A acquires a lock on Shot X to update it
  • Transaction B acquires a lock on Task 1 to update it
  • Transaction A wants to acquire a lock on Task 1 to update it but can’t because of transaction B so it waits.
  • Transaction B wants to acquire a lock on Shot X to update it but can’t because of transaction A so it waits.

At this point neither transaction can finish because they are both waiting on the other transaction. This is a deadlock and is detected by the database.

The process ids mentioned in the error message are database server processes. If you want us to dig further, DM me your site name and I can take a look but do know that these can be somewhat rare and hard to diagnose after the fact. We have noticed that construction of larger batch API calls can exacerbate this because the transaction encapsulates the entire batch which can contain many updates on many things. They’re also, as you’ve already figured out, notoriously hard to reproduce because they depend on very exact timing at the database.

Cheers,
Patrick

3 Likes

Hi Patrick,

as you say it is pretty hard to track down :frowning: I don’t think it is necessary for you to dive into this for now. I will pay particular attention to this and let you know if it becomes a recurrent problem.
Best Regards,
Francois

3 Likes