From Bottlenecks to Balance: Dynamic Skew Join Fixes in Spark
Learn how Spark's Adaptive Query Execution (AQE) solves data skew problems in joins, improving performance without overprovisioning memory.
Who am I?
Hi, I’m Amar, a Data Engineer at Citi in Pune, where I build and fix things, particularly large-scale data processing jobs and Spark workloads that need optimizing. Before this, I spent four years at Sahaj Software as a Solutions Consultant, solving complex data problems and delivering reliable engineering solutions.
Over the years, I’ve worked on a variety of software projects, and I keep exploring new technologies. I’m also a past maintainer at CRI-O, DuckDuckGo, and Google Summer of Code.
4TB RAM, Yet an OOM Error? Debugging a Spark Memory Mystery
Our Spark job failed despite a 4TB RAM cluster. Scaling up wasn’t the fix—tuning executors and heap size was. Learn how we solved JVM memory inefficiencies and optimized Spark performance.