What is the Python GIL?

What is an Interpreter?

Although Python is called an interpreted language, it first compiles source code into bytecode and then executes that bytecode in a virtual machine; the whole process is handled by the interpreter. JavaScript and PHP work in a similar way.

However, there are multiple interpreters. The reference implementation is CPython, written in C. There are also other implementations such as Jython (written in Java), IronPython (targeting .NET), and PyPy (written in RPython, a restricted subset of Python, not the R language).

What is the GIL?

The GIL, short for Global Interpreter Lock, is a mutual exclusion lock.

What is a Mutual Exclusion Lock?

In programming, mutual exclusion locks are introduced to protect the integrity of shared data. A shared object is paired with a “mutual exclusion lock” marker, and this marker ensures that at any given time only one thread can access the object.

When multiple threads almost simultaneously modify a shared data item, synchronization control is needed.

Thread synchronization guarantees safe access to competing resources. The simplest synchronization mechanism is introducing a mutual exclusion lock.

A mutual exclusion lock introduces a state to resources: locked/unlocked.

When a thread modifies a shared data item, it first locks it. At this point, the resource is in a “locked” state, and other threads cannot modify it until the thread releases the resource, changing the resource’s state to “unlocked”. This ensures that only one thread performs write operations at a time, guaranteeing data correctness in multithreaded scenarios.

Why Does the GIL Exist?

The GIL was created because CPython’s memory management is not thread-safe.

According to the official Python documentation, the global interpreter lock, or GIL, is a mutex that protects access to Python objects and prevents multiple threads from executing Python bytecode at the same time. It prevents race conditions and ensures thread safety. In short, this mutex is necessary mainly because CPython’s memory management is not thread-safe.

This naturally leads to two further questions.

Why Is CPython’s Memory Management Not Thread-Safe?

Python was first released in 1991. At that time, CPUs were single-core, and multithreading was mainly used to let one thread handle I/O while another handled CPU computation. Python’s reference interpreter is written in C, which is why it is called CPython. Back then, many programming languages did not offer automatic memory management. To implement automatic garbage collection, Python keeps a reference count for each object; when an object’s reference count drops to zero, the object can be reclaimed and its memory released. For example:

>>> import sys
>>> a = []
>>> b = a
>>> sys.getrefcount(a)
3

Here, the list that a refers to has a reference count of 3 (a short follow-up after this list shows the count dropping when a reference goes away):

  • one from the variable a,
  • one from the variable b,
  • one from the temporary argument passed to getrefcount.
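
For instance, deleting the name b removes one reference. The snippet below is a minimal continuation of the session above; the exact number assumes CPython, since getrefcount also counts the temporary argument reference:

>>> del b
>>> sys.getrefcount(a)
2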

If another thread references a, the reference count increases by 1. If a thread that was using a finishes, the reference count decreases by 1. When multiple threads modify the same object’s reference count at the same time without synchronization, race conditions can occur.
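
The same kind of race can be shown with an ordinary counter, because an unsynchronized counter += 1 is really a read-modify-write sequence of several bytecode instructions. The sketch below (counter and add are names chosen only for illustration) is not from the original article; whether you actually observe lost updates depends on the CPython version and its thread switch interval.

import threading

counter = 0

def add(n):
    global counter
    for _ in range(n):
        # Read, add, store: another thread can be scheduled between these steps
        counter += 1

threads = [threading.Thread(target=add, args=(1000000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Without a lock the final value may be less than 2000000,
# depending on the interpreter version and switch interval
print(counter)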

How to Solve the Memory Management Safety Issue?

To avoid race conditions, the simplest and most effective method is to add a mutual exclusion lock. However, if every object gets its own lock, new problems appear: multiple locks can lead to deadlocks, and frequent acquisition and release hurts performance.

Therefore, the simplest and most effective solution is a single interpreter-level lock: a thread must hold the interpreter lock before executing any bytecode. One global lock avoids deadlocks and, on the single-core CPUs of that era, had almost no performance cost, so the design has survived to this day. In short, the GIL exists mainly because CPython’s memory management is not thread-safe.

Mutual Exclusion Lock Code Example

The threading module defines the Lock class, which can easily handle locking:

import threading

# Create a lock
mutex = threading.Lock()

# Acquire the lock
mutex.acquire()

# Release the lock
mutex.release()
  • If the lock is not already held, acquire() returns immediately without blocking.
  • If another thread already holds the lock, acquire() blocks until the lock is released (a with-statement variant is sketched after this list).
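
Lock objects also support the context-manager protocol, so the acquire()/release() pair can be replaced by a with block that releases the lock even if an exception is raised. A minimal sketch (the names mutex, shared_total and add_one are only illustrative):

import threading

mutex = threading.Lock()
shared_total = 0

def add_one():
    global shared_total
    # The with block acquires the lock on entry and releases it on exit,
    # even if an exception is raised inside the block
    with mutex:
        shared_total += 1
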
Mutual Exclusion Lock Outside the for Loop
import threading
import time

# Define a global variable
g_num = 0


def test1(num):
    global g_num
    # Acquire the lock, if it's not locked before, this will succeed
    # If the lock is already locked by another thread, this will block until it is released
    mutex.acquire()
    for i in range(num):
        g_num += 1
    mutex.release()   # Release the lock
    print("-----in test1 g_num=%d----" % g_num)


def test2(num):
    global g_num
    mutex.acquire()   # Acquire the lock
    for i in range(num):
        g_num += 1
    mutex.release()   # Release the lock
    print("-----in test2 g_num=%d=----" % g_num)


# Create a mutual exclusion lock, which is initially unlocked
mutex = threading.Lock()


def main():
    t1 = threading.Thread(target=test1, args=(1000000,))
    t2 = threading.Thread(target=test2, args=(1000000,))

    t1.start()
    t2.start()

    # Wait for the two threads to complete
    t1.join()
    t2.join()

    print("-----in main Thread g_num = %d---" % g_num)

if __name__ == "__main__":
    main()

#-----in test1 g_num=1000000----
#-----in test2 g_num=2000000----
#-----in main Thread g_num = 2000000---
Mutual Exclusion Lock Inside the for Loop
import threading
import time

# Define a global variable
g_num = 0

def test1(num):
    global g_num
    for i in range(num):
        mutex.acquire()  # Acquire the lock
        g_num += 1
        mutex.release()  # Release the lock

    print("---test1---g_num=%d"%g_num)

def test2(num):
    global g_num
    for i in range(num):
        mutex.acquire()  # Acquire the lock
        g_num += 1
        mutex.release()  # Release the lock

    print("---test2---g_num=%d"%g_num)

# Create a mutual exclusion lock
# It is initially in an unlocked state
mutex = threading.Lock()

# Create two threads to increment g_num by 1,000,000 times each
p1 = threading.Thread(target=test1, args=(1000000,))
p1.start()

p2 = threading.Thread(target=test2, args=(1000000,))
p2.start()

# Wait for the calculations to complete
while len(threading.enumerate()) != 1:
    time.sleep(1)

print("The final result after two threads operate on the same global variable is:%s" % g_num)
# ---test1---g_num=1909909
# ---test2---g_num=2000000
# The final result after two threads operate on the same global variable is:2000000
Locking and Unlocking Process
  • When a thread calls the lock’s acquire() method, the lock enters a “locked” state.
  • Only one thread can hold the lock at a time. If another thread tries to acquire the lock while it is locked, that thread blocks until the thread holding the lock calls release().
  • The thread scheduler then picks one of the blocked threads, which acquires the lock and returns to the running state.
Benefits of Locking
  • Ensures that a critical section of code can only be executed by one thread from start to finish.
Drawbacks of Locking
  • It prevents concurrent execution: the locked section of code effectively runs single-threaded, which can significantly reduce efficiency.
  • When there are multiple locks, different threads may each hold one lock while waiting for the other’s, which can lead to deadlock.

Deadlock Code Example

import threading
import time


# Create a mutual exclusion lock
lock = threading.Lock()


# Get a value by index; the lock ensures that only one thread accesses the list at a time
def get_value(index):

    # Acquire the lock
    lock.acquire()
    print(threading.current_thread())
    my_list = [3,6,8,1]
    # Check for an out-of-bounds index
    if index >= len(my_list):
        print("Index out of bounds:", index)
        # Note: the function returns here WITHOUT releasing the lock,
        # so every later thread blocks forever inside acquire()
        return
    value = my_list[index]
    print(f'Value is:{value}')
    time.sleep(0.2)
    # Release the lock
    lock.release()


if __name__ == '__main__':
    # Simulate a large number of threads executing get_value operations
    for i in range(30):
        sub_thread = threading.Thread(target=get_value, args=(i,))
        sub_thread.start()

Because the lock is never released when the index is out of bounds, every later thread blocks forever in acquire(): the program cannot stop normally and keeps waiting.

<Thread(Thread-1, started 30364)>
Value is:3
<Thread(Thread-2, started 27120)>
Value is:6
<Thread(Thread-3, started 29632)>
Value is:8
<Thread(Thread-4, started 29988)>
Value is:1
<Thread(Thread-5, started 20984)>
Index out of bounds: 4

Avoiding Deadlock Code Example

# Release the lock at appropriate places
import threading
import time

# Create a mutual exclusion lock
lock = threading.Lock()


# Get a value by index; the lock ensures that only one thread accesses the list at a time
def get_value(index):

    # Acquire the lock
    lock.acquire()
    print(threading.current_thread())
    my_list = [3,6,8,1]
    if index >= len(my_list):
        print("Index out of bounds:", index)
        # When the index is out of bounds, release the lock to allow other threads to access
        lock.release()
        return
    value = my_list[index]
    print(value)
    time.sleep(0.2)
    # Release the lock
    lock.release()


if __name__ == '__main__':
    # Simulate a large number of threads executing get_value operations
    for i in range(10):
        sub_thread = threading.Thread(target=get_value, args=(i,))
        sub_thread.start()
<Thread(Thread-1, started 30336)>
3
<Thread(Thread-2, started 5920)>
6
<Thread(Thread-3, started 28308)>
8
<Thread(Thread-4, started 27324)>
1
<Thread(Thread-5, started 26840)>
Index out of bounds: 4
<Thread(Thread-6, started 30104)>
Index out of bounds: 5
<Thread(Thread-7, started 28900)>
Index out of bounds: 6
<Thread(Thread-8, started 2676)>
Index out of bounds: 7
<Thread(Thread-9, started 28912)>
Index out of bounds: 8
<Thread(Thread-10, started 30068)>
Index out of bounds: 9

Process finished with exit code 0
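
Releasing the lock manually on every exit path is easy to get wrong. A more robust pattern, not used in the original example, is a with block (or try/finally), which releases the lock on every exit path, including early returns and exceptions. A minimal sketch of get_value rewritten this way:

import threading

lock = threading.Lock()

def get_value(index):
    # The with block releases the lock on every exit path,
    # including the early return and any exception
    with lock:
        print(threading.current_thread())
        my_list = [3, 6, 8, 1]
        if index >= len(my_list):
            print("Index out of bounds:", index)
            return
        value = my_list[index]
        print(f'Value is:{value}')

if __name__ == '__main__':
    for i in range(10):
        threading.Thread(target=get_value, args=(i,)).start()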

Finally, What Are the Consequences of the GIL?

Positive

It keeps CPython’s internal state, such as reference counts, thread-safe.

Negative

Single-threaded CPU Consumption

About 16% of total CPU (11th-generation Intel i5).

# A single thread spinning in a dead loop
def dead_loop():
    while True:
        pass

dead_loop()

Dual-threaded CPU Consumption

Still about 16%, not 32%.

import threading

def dead_loop():
    while True:
        pass

# Start a new thread running the dead loop
t = threading.Thread(target=dead_loop)
t.start()

# The main thread also enters a dead loop
dead_loop()
t.join()

Conclusion:

  • With two threads, CPython uses about the same amount of CPU as with a single thread.
  • At any moment, CPython executes only the one thread that currently holds the GIL.

If you go further and try ten or N threads, Python’s CPU utilization remains unchanged.
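
A quick way to check this claim yourself is to start N such threads and watch a system monitor; total usage stays pinned at roughly one core. A minimal sketch, assuming the same dead_loop as above and an arbitrary N:

import threading

def dead_loop():
    while True:
        pass

# Start N CPU-bound threads; with the GIL they still saturate only about one core in total
N = 10
threads = [threading.Thread(target=dead_loop) for _ in range(N)]
for t in threads:
    t.start()

# The loops never finish, so stop the program manually (for example with Ctrl+C)
# after checking CPU usage in a system monitor
for t in threads:
    t.join()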

However, the same dead loop rewritten in C, C++, or Java can fully use all cores. Why can’t Python do the same? Precisely because of the GIL.

Although Python threads are real operating-system threads, the interpreter enforces the GIL (Global Interpreter Lock): a thread must hold the GIL before it can execute bytecode. In old CPython versions the interpreter released the GIL every 100 bytecode instructions; since Python 3.2 it is released based on a time interval, 5 milliseconds by default (see sys.getswitchinterval()), so that other threads get a chance to run. Because all bytecode execution is serialized behind this one global lock, Python threads can only take turns: even 100 threads on a 100-core CPU will use only about one core.
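
The switch interval can be inspected and tuned from Python. A minimal sketch; changing the value rarely alters the overall picture and is shown only to make the mechanism visible:

import sys

# How long a thread may keep the GIL before the interpreter asks it to yield, in seconds
print(sys.getswitchinterval())   # 0.005 by default

# The interval can be tuned, although doing so rarely changes the overall behavior
sys.setswitchinterval(0.01)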

In Python, you can use multithreading, but you cannot expect it to make effective use of multiple cores. If you really need threads to use multiple cores, you have to go through C extensions, which gives up much of the simplicity and ease of use that Python is known for.

However, there’s no need to worry too much. Although Python cannot use multithreading to achieve multi-core tasks, you can achieve multi-core tasks through multiprocessing. Multiple Python processes have their own independent GIL locks and do not interfere with each other.
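
For CPU-bound work, the standard-library multiprocessing module sidesteps the GIL by using separate processes, each with its own interpreter and its own GIL. A minimal sketch; the cpu_heavy function and its inputs are made up for illustration:

from multiprocessing import Pool

def cpu_heavy(n):
    # A stand-in for real CPU-bound work
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == '__main__':
    # Each worker is a separate process with its own GIL,
    # so the work can run on multiple cores in parallel
    with Pool(processes=4) as pool:
        results = pool.map(cpu_heavy, [10_000_000] * 4)
    print(results)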

Other Notes

  • Strictly speaking, “Python’s GIL” is a loose expression, but it is not wrong.
  • The GIL belongs to the CPython interpreter, not to the Python language itself.
  • CPython compiles and interprets Python code.
  • CPython is currently the most popular and mainstream interpreter.

Finally, if you are determined enough, you can always build your own interpreter without a GIL.