## What is the difference between a lock convoy and lock/thread contention?

From Wikipedia on lock convoy:

A lock convoy occurs when multiple threads of equal priority contend repeatedly for the same lock. Unlike deadlock and livelock situations, the threads in a lock convoy do progress; however, each time a thread attempts to acquire the lock and fails, it relinquishes the remainder of its scheduling quantum and forces a context switch. The overhead of repeated context switches and underutilization of scheduling quanta degrade overall performance.
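To make the pattern concrete, here is a minimal C++ sketch of the situation as I understand it (my own illustration, not from the article): several equal-priority threads repeatedly take the same hot lock around a tiny critical section. Whether an actual convoy forms depends on the OS scheduler and the mutex implementation.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::mutex hotLock;      // the single lock every thread keeps re-acquiring
long sharedCounter = 0;  // tiny critical section: one increment

void worker() {
    for (int i = 0; i < 1000000; ++i) {
        // A thread that finds hotLock held typically blocks and is taken
        // off the CPU, giving up the rest of its time slice; with many
        // equal-priority threads this wake/block churn is the convoy.
        std::lock_guard<std::mutex> guard(hotLock);
        ++sharedCounter;
    }
}

int main() {
    std::vector<std::thread> pool;
    for (int t = 0; t < 8; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
}
```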

From Wikipedia on lock/thread contention:

Lock contention: this occurs whenever one process or thread attempts to acquire a lock held by another process or thread. The more fine-grained the available locks, the less likely one process/thread will request a lock held by the other. (For example, locking a row rather than the entire table, or locking a cell rather than the entire row.)
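The row-versus-table example can be sketched in C++ like this (hypothetical names; with a per-row mutex instead of one table-wide mutex, threads touching different rows never contend):

```cpp
#include <array>
#include <mutex>

struct Row {
    std::mutex m;   // fine-grained: one lock per row
    int value = 0;
};

struct CoarseTable {
    std::mutex tableLock;              // coarse-grained: one lock for all
    std::array<int, 1024> cells{};
    void update(int i, int v) {
        std::lock_guard<std::mutex> g(tableLock);  // all writers serialize
        cells[i] = v;
    }
};

struct FineTable {
    std::array<Row, 1024> rows;
    void update(int i, int v) {
        std::lock_guard<std::mutex> g(rows[i].m);  // only same-row writers
        rows[i].value = v;                         // contend with each other
    }
};
```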

Could somebody please elaborate a bit further on both of these? To me they seem essentially the same, and if they are not, then surely lock contention is what causes a lock convoy. Is that the case, or are they separate and independent concepts? Also, I don't understand the sentence "it relinquishes the remainder of its scheduling quantum and forces a context switch".

## Thread – contention vs race

The terms `contention` and `race` are often used interchangeably when talking about a thread's state at a critical section. Are they the same?
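Here is my current understanding in a sketch, which may be wrong: the first counter has a data race (unsynchronized concurrent writes, which is undefined behavior), while the second is race-free but the threads still contend for the mutex.

```cpp
#include <mutex>
#include <thread>

int racyCounter = 0;  // data race: two threads write without synchronization
void racy() { for (int i = 0; i < 100000; ++i) ++racyCounter; }

int safeCounter = 0;
std::mutex counterLock;
void contended() {    // correct, but the threads queue up on the lock
    for (int i = 0; i < 100000; ++i) {
        std::lock_guard<std::mutex> g(counterLock);
        ++safeCounter;
    }
}

int main() {
    std::thread a(racy), b(racy);            // final value is unpredictable
    std::thread c(contended), d(contended);  // final value is exactly 200000
    a.join(); b.join(); c.join(); d.join();
}
```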

## Design choice to avoid lock contention

I have a network application `F`. It receives requests from one (or many) client network functions. `F` can handle requests from multiple clients using an `epoll` loop. `F` maintains a state machine for each client (user), and it also maintains some context for each user. Whenever a message is received from a client network function (for one user), `F` processes the message and, if required, fetches the context specific to this user and updates it.

Currently, the contexts of different users are maintained in a C++ STL map.

```cpp
map<clientId, context> userMap;
```

where `clientId` is an integer and `context` is a `struct` containing user-specific data. Whenever I need to access `userMap`, I take a lock first, access the data, and then unlock.

For example, suppose a client sends a request message `X`. In the server `F`, `epoll_wait()` reports the event (i.e. the incoming message). The message is then read from the socket and processed further: the server's `handleX()` method is invoked (if the user's state and context are consistent and allow `handleX` to be executed for this user). Each such `handleX` function needs the current user context for some computation: it locks `userMap`, gets or sets the data, and then unlocks.
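Condensed, the current design looks roughly like this (a simplified sketch, not my real code; the `seq` field stands in for whatever the handler actually updates):

```cpp
#include <map>
#include <mutex>

struct Context { int lastSeq = 0; };  // placeholder for user-specific data

std::map<int, Context> userMap;       // clientId -> context
std::mutex userMapLock;               // one lock guards the whole map

void handleX(int clientId, int seq) {
    std::lock_guard<std::mutex> g(userMapLock);  // serializes every handler
    Context& ctx = userMap[clientId];            // fetch (or create) context
    ctx.lastSeq = seq;                           // update under the lock
}
```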

In the single-threaded version of `F`, a single thread waits for events and processes them one by one.

I tried using a thread pool to check the multicore scalability of `F`. In this version, a single thread reads messages from the socket and puts them in a queue; a pool of threads waits on the queue, and each worker picks up messages as they are pushed. But the throughput is no better than that of the single-threaded version of `F`.
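The dispatch part of the threaded version looks roughly like this (condensed sketch; in the real code the reader is driven by `epoll_wait`):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

struct Message { int clientId; /* payload omitted */ };

std::queue<Message> msgQueue;
std::mutex queueLock;
std::condition_variable queueCv;
bool done = false;  // guarded by queueLock

void readerLoop() {
    for (int i = 0; i < 1000; ++i) {  // stands in for the epoll loop
        {
            std::lock_guard<std::mutex> g(queueLock);
            msgQueue.push({i % 10});
        }
        queueCv.notify_one();
    }
    {
        std::lock_guard<std::mutex> g(queueLock);
        done = true;                  // no more messages
    }
    queueCv.notify_all();
}

void workerLoop() {
    for (;;) {
        std::unique_lock<std::mutex> g(queueLock);
        queueCv.wait(g, [] { return !msgQueue.empty() || done; });
        if (msgQueue.empty()) return;  // done and drained: exit
        Message m = msgQueue.front();
        msgQueue.pop();
        g.unlock();
        // process m: in F this calls handleX(), which takes the userMap
        // lock, so the workers still serialize on that single lock
        (void)m;
    }
}

int main() {
    std::thread workers[4];
    for (auto& w : workers) w = std::thread(workerLoop);
    readerLoop();
    for (auto& w : workers) w.join();
}
```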

I think locking is inherently serializing `F`'s code. I would like to know: is there another model for storing and retrieving user contexts that would minimize lock contention?
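To illustrate the kind of alternative I mean, here is a hypothetical lock-striping sketch (sharding `userMap` by a hash of `clientId`, so threads working on different clients rarely take the same lock). Is something like this the right direction, or is there a better model?

```cpp
#include <map>
#include <mutex>

struct Context { int lastSeq = 0; };

constexpr unsigned kShards = 16;     // tunable shard count

struct Shard {
    std::mutex m;                    // each shard has its own lock
    std::map<int, Context> users;
};
Shard shards[kShards];

Shard& shardFor(int clientId) {
    return shards[static_cast<unsigned>(clientId) % kShards];
}

void handleX(int clientId, int seq) {
    Shard& s = shardFor(clientId);
    std::lock_guard<std::mutex> g(s.m);  // contention limited to one shard
    s.users[clientId].lastSeq = seq;
}
```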

Thanks!