Case Study

How PhonePe/Walmart Scaled by 2000% and Saved $5 Million with Acceldata

“Acceldata supports our hyper-growth and helps us manage one of the world's largest instant payment systems. PhonePe's biggest-ever data infrastructure would never have been possible without Acceldata.”
Burzin Engineer, Founder & Chief Reliability Officer @ PhonePe

About PhonePe

PhonePe is a Walmart subsidiary that provides more than 350 million consumers across India with the ability to send and receive money, make payments at more than 10 million physical and online retail stores, use ATMs, and invest in mutual funds and other securities.

  • 350 million users
  • $400 million cash transactions per month
  • 1500+ nodes, 28+ HDP clusters, 20+ PB
  • Open-source Hadoop, HBase, Hive, Spark, Kafka, and Ranger

Problem:

  • Scaling and performance issues on open-source Hadoop environments.
  • Rapid customer acquisition came with the need to rapidly increase the size of their Hadoop infrastructure while adding Hive LLAP, Spark 3.x and Druid.
  • PhonePe needed more advanced tools to support their business growth and critical data initiatives.

Solution:

Acceldata Pulse began to provide real-time insights into Hbase, Hive, Kafka, and Spark within 24 hours.

Once Acceldata's Pulse solution was implemented, the PhonePe team began to immediately identify problems with their HBase region servers and tables that were under pressure. Pulse also was able to help distinguish between HBase cluster issues caused by hardware or poorly designed tables and anomalies resulting from seasonal and campaign-related surges.

Results:

  • Scale data infrastructure rapidly from 70 to more than 1500 Hadoop nodes; more than 2000% growth.
  • Deliver 99.97% availability across its Hadoop infrastructure.
  • Reduce data warehouse costs by 65% by eliminating the need for commercial licenses.
  • Upgrade, migrate, and refactor systems and workloads with no performance degradation.

Multi-Layer Data Observability

Enterprises are frequently challenged with managing and optimizing complex, large-scale data environments. Multilayer data observability correlates information across infrastructure, platform, processing and data layers to identify and alert on trouble spots, bottlenecks, and inefficiencies. Analytics and recommendations simplify remediation and administration. In addition, Acceldata provides an extensible library of auto-actions to make systems self-healing and self-tuning. The right data observability tools can significantly improve the reliability, performance, scale, and cost of enterprise data environments.

The Acceldata Solution

Acceldata Pulse delivers improved reliability, performance, and efficiency of data processing at scale.

  • Predict & Prevent Incidents
  • Scale Performance
  • Reduce Infrastructure Costs
Download PDF Version