From Bottlenecks to Balance: Dynamic Skew Join Fixes in Spark
Learn how Spark's Adaptive Query Execution (AQE) solves data skew problems in joins, improving performance without memory overprovisioning
Who am I?
Hi, I’m Amar, currently working as a Solutions Consultant at Sahaj Software in Pune, with a focus on data engineering. I enjoy building and fixing things, particularly when it involves working on large-scale data processing jobs and optimizing Spark workloads.
Over the years, I’ve worked on various software projects and I continue to explore new technologies. I’m also a past maintainer at CRI-O, DuckDuckGo, and Google Summer of Code.
4TB RAM, Yet an OOM Error? Debugging a Spark Memory Mystery
Our Spark job failed despite a 4TB RAM cluster. Scaling up wasn’t the fix; tuning executors and heap size was. Learn how we solved JVM memory inefficiencies and optimized Spark performance.