Kitsuya Azuma
Is the multiprocessing serialization bottleneck slowing down your data-intensive applications? Python now ships a built-in answer. The experimental free-threading mode in Python 3.13+, which disables the Global Interpreter Lock (GIL), lets threads run in parallel while sharing memory, eliminating inter-process serialization overhead entirely. In this session, I will demonstrate how this new capability can dramatically outperform multiprocessing and even achieve throughput comparable to frameworks like Ray for single-node ML workloads. Through a live refactoring demo, we will transform a slow data pipeline into a high-performance, free-threaded application, exposing common pitfalls to avoid along the way. This isn't just theory; it's a battle-tested playbook to supercharge your code.
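To make that bottleneck concrete before we go further, here is a toy measurement (my own illustration, not material from the talk): multiprocessing must pickle every argument and result it ships between processes, so for large objects the serialization round-trip alone can dwarf the actual computation.

```python
# A toy illustration (not from the talk) of the bottleneck in question:
# multiprocessing pickles every argument and result, so shipping a large
# object to a worker can cost more than the work itself.
import pickle
import time

payload = [float(i) for i in range(5_000_000)]  # stand-in for a big dataset

start = time.perf_counter()
blob = pickle.dumps(payload)   # what multiprocessing does on every submit
pickle.loads(blob)             # ...and again on the way back
print(f"round-trip serialization: {time.perf_counter() - start:.2f}s "
      f"for {len(blob) / 1e6:.0f} MB")
```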
At PyCon JP 2024, we saw a fantastic introduction to the experimental free-threading mode. A year has passed—but how many of us have dared to use it to solve a real-world problem? Whether you're already fighting the multiprocessing serialization bottleneck or simply frustrated that your data-heavy tasks aren't as fast as they should be, this talk is for you.
The initial release of free-threading in Python 3.13 came with a known trade-off in single-threaded performance, but the landscape is evolving. We'll dive into the crucial improvements in Python 3.14 that mitigate this issue, making free-threading a more powerful and viable choice than ever for parallel workloads.
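If you want to experiment alongside the talk, here is a minimal sketch (my own, assuming a CPython 3.13+ free-threaded "t" build such as python3.13t) for verifying that the GIL is actually disabled, since a free-threaded binary can still re-enable it at runtime:

```python
# Minimal sketch: confirm you are really running without the GIL.
# Assumes CPython 3.13+; sys._is_gil_enabled() is absent on older versions.
import sys
import sysconfig

# True if the interpreter was compiled as a free-threaded build.
print("Free-threaded build:", bool(sysconfig.get_config_var("Py_GIL_DISABLED")))

# Even on a free-threaded build, the GIL can be re-enabled at runtime
# (PYTHON_GIL=1 or -X gil=1), so check the live status too.
if hasattr(sys, "_is_gil_enabled"):
    print("GIL currently enabled:", sys._is_gil_enabled())
```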
This session delivers a battle-tested playbook, moving from core concepts to a real-world application. Drawing from my experience developing an open-source framework to accelerate a data-intensive research field, I’ll guide you through the following:
The "Why" and "How" of Free-Threading: First, understand the fundamentals of Python's new parallel execution model. We'll cover why multiprocessing struggles with data-intensive tasks and how free-threading provides an elegant, built-in solution.
A Real-World Case Study with Hard Data: Next, we'll dive into a Federated Learning project where free-threading crushed a critical performance bottleneck. You'll see benchmarks showing dramatic throughput gains over both multiprocessing and an established framework like Ray.
Live Refactoring & Best Practices: Finally, watch a slow data pipeline get transformed into a high-performance application in a live demo. You will leave with a practical playbook and the confidence to apply these techniques, avoiding common pitfalls like race conditions.
Join me to see how this powerful new capability can integrate with demanding ML workloads, unlocking a new level of performance and truly supercharging your code.
Profile
Kitsuya Azuma is a Master's student at the Institute of Science Tokyo, applying his passion for writing robust Python code to his Federated Learning research. Through multiple internships at leading tech companies, he has gained broad software development experience, particularly in DevOps. He is set to launch his career as a Platform Engineer next year.