🧠 Python DeepCuts — 💡 Threading vs Multiprocessing Deep Dive
Posted on: June 3, 2026
Description:
Python offers multiple ways to run tasks concurrently, but the two most common approaches are threading and multiprocessing. At first glance they seem similar: both let several tasks make progress at the same time. Internally, however, they differ fundamentally in:
- memory usage
- process isolation
- communication
- CPU utilisation
This DeepCut explores those differences and when each approach should be used.
🧩 Threads Share Memory
Threads run inside the same process.
shared_data = []
def worker():
    shared_data.append("thread")
All threads:
- share memory
- access the same variables
- use the same resources
This makes communication easy but introduces synchronisation challenges.
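The synchronisation challenge shows up as soon as several threads perform a read-modify-write on the same variable. The following is a minimal sketch, not taken from the original post: the shared `counter` and the helper names `increment` and `run_threads` are hypothetical, and a `threading.Lock` is one common way to protect each update.

```python
import threading

counter = 0                      # shared state, visible to every thread
counter_lock = threading.Lock()  # guards every update to counter


def increment(times):
    """Add 1 to the shared counter `times` times, holding the lock per update."""
    global counter
    for _ in range(times):
        with counter_lock:
            counter += 1


def run_threads(num_threads, times):
    """Start num_threads workers, wait for them, and return the final counter."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=increment, args=(times,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run_threads(4, 10_000))
```

With the lock held around each update, the final count always equals `num_threads * times`. Without it, `counter += 1` is a separate load, add, and store, so updates from different threads may be lost depending on timing and interpreter version.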
🧠 Processes Have Separate Memory
Processes are completely isolated.
data = []
def worker():
    data.append("process")
Each process gets:
- its own memory space
- its own Python interpreter
- its own resources
Changes made inside a process are not visible to other processes unless explicitly shared.
🔄 The GIL Changes Everything
In CPython's default build, the Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time.
For CPU-heavy tasks:
def cpu_task():
    total = 0
    for i in range(10_000_000):
        total += i
Multiple threads still compete for the same GIL.
As a result:
- CPU-bound threading rarely improves performance
- threads take turns executing Python code
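One way to observe this is to time the same CPU-bound work run sequentially and then in threads. This is a rough sketch under stated assumptions: `cpu_task` takes a size parameter `n` here (the post's version does not), and `timed`, `run_sequential`, and `run_threaded` are hypothetical helpers introduced only for the comparison.

```python
import threading
import time


def cpu_task(n=2_000_000):
    """Pure-Python loop that keeps the interpreter busy (no I/O, no sleeping)."""
    total = 0
    for i in range(n):
        total += i
    return total


def timed(fn):
    """Return how many seconds fn() took to run."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def run_sequential(count):
    """Run cpu_task count times, one after another, in the calling thread."""
    for _ in range(count):
        cpu_task()


def run_threaded(count):
    """Run cpu_task count times, each call in its own thread."""
    threads = [threading.Thread(target=cpu_task) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    print(f"sequential: {timed(lambda: run_sequential(2)):.2f}s")
    print(f"threaded:   {timed(lambda: run_threaded(2)):.2f}s")
```

On a GIL-enabled build the two timings are typically close to each other. The exact numbers depend on the machine and Python version, so none are quoted here.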
🧬 Multiprocessing Bypasses the GIL
Each process runs its own Python interpreter.
processes = [
    multiprocessing.Process(target=cpu_task)
    for _ in range(2)
]
Because every process has:
- its own interpreter
- its own GIL
Python can execute work across multiple CPU cores simultaneously.
This is why multiprocessing is preferred for computation-heavy workloads.
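A short sketch of spreading CPU-bound work over worker processes, using `multiprocessing.Pool` from the standard library. The helper `run_in_processes` and the size parameter on `cpu_task` are assumptions made for illustration and do not come from the post.

```python
import multiprocessing


def cpu_task(n):
    """Sum the integers below n in a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def run_in_processes(sizes, workers=2):
    """Run cpu_task once per size, distributing the calls over worker processes."""
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(cpu_task, sizes)


if __name__ == "__main__":
    # The guard matters: with the "spawn" or "forkserver" start methods,
    # children re-import this module and must not start pools themselves.
    print(run_in_processes([1_000_000, 2_000_000]))
```

`pool.map` returns the results in the same order as the inputs, so the caller does not need to track which worker handled which item.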
🔍 Threads vs Processes at the OS Level
Each process is a separate entry in the operating system's process table.
import os
print(os.getpid())
Each process receives a unique Process ID (PID).
Threads belong to an existing process and share that process’s resources.
This difference affects:
- memory consumption
- scheduling
- isolation
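The relationship between a process and its threads can be inspected directly. In this sketch, `collect_ids` is a hypothetical helper, and `threading.get_native_id()` requires Python 3.8 or later on a platform that supports it. Every thread reports the same PID, while the OS-level thread IDs differ.

```python
import os
import threading


def collect_ids(num_threads=3):
    """Return (pid, native_thread_id) pairs recorded by concurrently running threads."""
    results = []
    results_lock = threading.Lock()
    # The barrier keeps every thread alive until all have recorded their ids,
    # so the OS cannot recycle a finished thread's id for a later thread.
    barrier = threading.Barrier(num_threads)

    def record():
        ids = (os.getpid(), threading.get_native_id())
        with results_lock:
            results.append(ids)
        barrier.wait()

    threads = [threading.Thread(target=record) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for pid, tid in collect_ids():
        print(f"PID {pid}  native thread id {tid}")
```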
⚠️ Inter-Process Communication (IPC)
Because processes do not share memory, communication must be explicit.
queue = multiprocessing.Queue()
Common IPC mechanisms include:
- Queue
- Pipe
- Shared Memory
- Manager Objects
These allow processes to exchange data safely.
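As a complement to `Queue`, here is a minimal sketch of the `Pipe` mechanism from the list above. The function names `square_worker` and `square_in_child` are hypothetical; only `multiprocessing.Pipe`, `Process`, and the connection methods `send`, `recv`, and `close` come from the standard library.

```python
import multiprocessing


def square_worker(conn, number):
    """Send number squared through the connection, then close this end."""
    conn.send(number * number)
    conn.close()


def square_in_child(number):
    """Compute number squared in a child process and receive it over a Pipe."""
    parent_conn, child_conn = multiprocessing.Pipe()
    p = multiprocessing.Process(target=square_worker, args=(child_conn, number))
    p.start()
    result = parent_conn.recv()  # blocks until the child sends its result
    p.join()
    return result


if __name__ == "__main__":
    print(square_in_child(7))
```

A `Pipe` connects exactly two endpoints, which suits a simple parent-child exchange. A `Queue` is usually the easier choice when several producers or consumers are involved.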
🧠 When to Use Which?
Use Threading
- API calls
- Database operations
- File I/O
- Network requests
- Waiting-heavy workloads
Use Multiprocessing
- Data processing
- Image processing
- Scientific computing
- Machine learning workloads
- CPU-intensive calculations
The workload type usually determines the right choice.
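To illustrate the waiting-heavy case, the sketch below uses `time.sleep` as a stand-in for a network call. That substitution is an assumption, as are the hypothetical helpers `fake_request` and `fetch_all`. Because a sleeping thread releases the GIL, the waits overlap instead of adding up.

```python
import threading
import time


def fake_request(delay, results, index):
    """Stand-in for an I/O call: wait for `delay` seconds, then store a response."""
    time.sleep(delay)
    results[index] = f"response {index}"


def fetch_all(delays):
    """Run one fake request per delay, each in its own thread, and collect results."""
    results = [None] * len(delays)
    threads = [
        threading.Thread(target=fake_request, args=(delay, results, i))
        for i, delay in enumerate(delays)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    start = time.perf_counter()
    fetch_all([0.3, 0.3, 0.3, 0.3])
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

Four 0.3-second waits run one after another would take about 1.2 seconds. Run in threads, the total should stay close to the longest single wait.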
✅ Key Points
- Threads share memory and resources
- Processes have isolated memory spaces
- The GIL limits CPU-bound threading
- Multiprocessing enables true parallel execution
- Processes communicate through IPC mechanisms
- Threading is best for I/O-bound tasks
- Multiprocessing is best for CPU-bound tasks
Understanding this distinction is essential when designing scalable Python applications.
Code Snippet:
import multiprocessing
import os
import threading

# Thread memory sharing: the worker mutates the same list the main thread sees
shared_data = []

def thread_worker():
    shared_data.append("thread")

# Process isolation: the child appends to its own copy of this list
data = []

def process_worker():
    data.append("process")
    print("Inside process:", data)
    print("PID:", os.getpid())

# CPU-bound task
def cpu_task():
    total = 0
    for i in range(10_000_000):
        total += i
    return total

# IPC example
def queue_worker(queue):
    queue.put("hello")

# The guard keeps child processes from re-running this block when the
# "spawn" or "forkserver" start method re-imports the module
if __name__ == "__main__":
    t = threading.Thread(target=thread_worker)
    t.start()
    t.join()
    print(shared_data)

    p = multiprocessing.Process(target=process_worker)
    p.start()
    p.join()
    print("Main process:", data)

    queue = multiprocessing.Queue()
    p = multiprocessing.Process(
        target=queue_worker,
        args=(queue,),
    )
    p.start()
    print(queue.get())
    p.join()
    print("Main PID:", os.getpid())