The Solution
Designing a Cost-Efficient Spark Platform on AWS
BigData Boutique partnered with EverC to evaluate alternatives and design a data platform that balanced cost efficiency, flexibility, and usability. Rather than recommending a one-size-fits-all replacement, the engagement focused on understanding how EverC actually used Spark in production.
Evaluating EMR Deployment Models
Together, the teams assessed several AWS-native options:
- EMR Serverless, for simplified operations and bursty workloads
- EMR on EC2, for traditional YARN-based execution
- EMR on EKS, for deeper integration with Kubernetes-based infrastructure
Given EverC's existing investment in Kubernetes and its desire to reuse compute resources efficiently, EMR on EKS emerged as the best fit.
Why EMR on EKS
Running EMR on EKS allowed EverC to:
- Reuse existing Kubernetes clusters instead of provisioning dedicated Spark infrastructure
- Take advantage of fine-grained scheduling and scaling
- Improve utilization of Spot Instances and reduce idle compute
- Eliminate the platform premium associated with managed Spark services
Crucially, this approach gave EverC more direct control over how and when compute resources were consumed, without forcing a radical change in how teams worked.
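To make the points above concrete, here is a minimal sketch of what a Spark job submission to EMR on EKS might look like. It is illustrative only and does not reflect EverC's actual configuration: the function name, release label, and all IDs and paths are placeholders. It assumes Spark 3.3 or later (for the per-role node selector settings) and EKS managed node groups, which label nodes with `eks.amazonaws.com/capacityType`. The sketch keeps the driver on On-Demand capacity, places executors on Spot capacity, and enables dynamic allocation so idle executors are released.

```python
def build_job_run_request(virtual_cluster_id, execution_role_arn, entry_point,
                          release_label="emr-6.15.0-latest"):
    """Build keyword arguments for the EMR on EKS StartJobRun API.

    All argument values are supplied by the caller; the default release
    label is a placeholder and should match what your account supports.
    """
    capacity_label = "eks.amazonaws.com/capacityType"  # set by EKS managed node groups
    spark_defaults = {
        # Scale executors up and down with the workload to reduce idle compute.
        "spark.dynamicAllocation.enabled": "true",
        "spark.dynamicAllocation.shuffleTracking.enabled": "true",
        # Keep the driver on On-Demand nodes so a Spot interruption
        # does not terminate the whole job.
        f"spark.kubernetes.driver.node.selector.{capacity_label}": "ON_DEMAND",
        # Run executors on Spot nodes in the shared Kubernetes cluster.
        f"spark.kubernetes.executor.node.selector.{capacity_label}": "SPOT",
    }
    return {
        "name": "example-spark-job",  # hypothetical job name
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {
            "sparkSubmitJobDriver": {"entryPoint": entry_point}
        },
        "configurationOverrides": {
            "applicationConfiguration": [
                {"classification": "spark-defaults", "properties": spark_defaults}
            ]
        },
    }


if __name__ == "__main__":
    request = build_job_run_request(
        virtual_cluster_id="vc-example",                              # placeholder
        execution_role_arn="arn:aws:iam::123456789012:role/example",  # placeholder
        entry_point="s3://example-bucket/jobs/job.py",                # placeholder
    )
    print(sorted(request))
```

With AWS credentials configured, a request built this way could be passed to `boto3.client("emr-containers").start_job_run(**request)`. Building the request as a plain dictionary first keeps the scheduling choices reviewable and testable without touching AWS.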
Preserving the Analyst Experience
A key requirement was ensuring that analysts could continue working productively. By incorporating EMR Studio, EverC retained a notebook-based interface similar to the one analysts were accustomed to on the previous infrastructure. This minimized disruption while enabling the underlying platform shift.