Apache Spark is a gift to the big data community, adding many new features with every release. However, it is difficult to manage petabyte-scale Hadoop clusters with hundreds of edge nodes and multiple Spark releases while still demonstrating operational efficiency and standardization. To address these challenges, PayPal has developed and deployed a REST-based Spark platform, Spark Compute as a Service (SCaaS), which provides improved application development, execution, logging, security, workload management and tuning.
This session will walk through the top challenges faced by PayPal administrators, developers and operations teams and describe how PayPal’s SCaaS platform overcomes them by leveraging open source tools and technologies such as Livy, Jupyter, SparkMagic, Zeppelin, SQL tools, Kafka and Elastic. You’ll also hear about the improvements PayPal has added, which enable it to run more than 10,000 Spark applications in production effectively.
3. PayPal Scale
Business
• $354B Total Payment, up 28% YoY
• $102B Mobile Payment, up 55% YoY
• 6.1B Total Transactions, up 24% YoY
• 2.0B Mobile Transactions, up 43% YoY
• One of the world’s largest internet payment companies
• 203M+ active accounts in 200 markets around the world
• PayPal platform includes Braintree, Venmo, Paydiant, PP Credit and Xoom
4. PayPal Scale
Core Data Platform
• 70+ PB Data
• 40,000+ YARN Jobs Per Day
• 15+ Hadoop Clusters
• 5+ Compute
8. Challenges: Administrators
• Need extensive support and maintenance for CLI
• Need to deploy entire stack of software
• Need to sync configurations across systems
• Need extensive testing of jobs before any upgrade
[Diagram labels: Job, Batch Edge Nodes, Interactive Edge Nodes]
11. Challenges: Operations/Security
• Different job execution methods and coding standards
• No uniform logging, monitoring and alerting
• Limited audit and control
• No statement-level history or metrics
[Diagram labels: Job, Batch Edge Nodes, Interactive Edge Nodes]
15. Building SCaaS: Adding HA and Enhancing Livy
[Diagram labels: Batch, Interactive, Job, Livy Grid, Job Server]
PayPal Livy Version
✓ Multi-Node High Availability
✓ Kerberos Authentication
Changes
✓ SQL Interpreter
✓ Session Manager
Enhancements
✓ Session GC Improvements
✓ Plug-in Logger
✓ YARN Poll Re-architecture
✓ Multiple Spark Versions Support
✓ White/Black List User Authentication
✓ Docker
✓ HBase Support
✓ Flink/Beam Support
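The slide lists white/black list user authentication as a Livy enhancement but does not describe its semantics. A minimal sketch of one plausible policy follows, assuming that a blacklist entry always denies access and that a non-empty whitelist is exclusive; the function name `is_user_allowed` and both assumptions are hypothetical and not taken from PayPal's implementation.

```python
def is_user_allowed(user, whitelist=None, blacklist=None):
    """Return True if `user` may submit jobs under the assumed policy."""
    # Assumption: a blacklist entry always wins over a whitelist entry.
    if blacklist and user in blacklist:
        return False
    # Assumption: when a whitelist is configured, only listed users pass.
    if whitelist:
        return user in whitelist
    # Assumption: with no whitelist configured, access is open by default.
    return True
```

A gateway in front of the job server could call such a check before forwarding a request, for example `is_user_allowed("alice", whitelist={"alice"})`.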
16. Building SCaaS: Adding Livy API and Utilities
[Diagram labels: Batch, Interactive, Job, Livy Grid, Job Server, Livy API, NAS]
Batch Utilities
✓ startSparkBatch
✓ stopSparkBatch
✓ listSparkBatch
✓ startSparklingWater
✓ stopSparklingWater
✓ startSparkSql
✓ stopSparkSql
✓ startSparkSession
✓ execSparkFile
✓ execSparkCode
✓ stopSparkSession
✓ listSparkSession
✓ livy-spark.jar
Interactive Utilities
✓ livy-spark
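The slides do not show how these utilities are implemented. As a rough illustration, the sketch below shows how utilities with similar names could map onto the public Apache Livy REST endpoints (`/batches`, `/sessions`, `/sessions/{id}/statements`). The host name, the Python function names and the choice to only build requests rather than send them are assumptions made for this example; PayPal's modified Livy may expose a different interface.

```python
import json
from urllib.request import Request

# Hypothetical host; 8998 is the default Livy server port.
LIVY_URL = "http://livy.example.com:8998"


def _request(method, path, payload=None):
    """Build (but do not send) a JSON request for the Livy REST API."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return Request(
        LIVY_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def start_spark_batch(file, class_name=None, args=None):
    """Submit a batch job: POST /batches."""
    payload = {"file": file}
    if class_name:
        payload["className"] = class_name
    if args:
        payload["args"] = list(args)
    return _request("POST", "/batches", payload)


def stop_spark_batch(batch_id):
    """Kill a batch job: DELETE /batches/{batchId}."""
    return _request("DELETE", f"/batches/{batch_id}")


def list_spark_batch():
    """List batch jobs: GET /batches."""
    return _request("GET", "/batches")


def start_spark_session(kind="spark"):
    """Open an interactive session: POST /sessions."""
    return _request("POST", "/sessions", {"kind": kind})


def exec_spark_code(session_id, code):
    """Run a code snippet: POST /sessions/{sessionId}/statements."""
    return _request("POST", f"/sessions/{session_id}/statements", {"code": code})


def stop_spark_session(session_id):
    """Close an interactive session: DELETE /sessions/{sessionId}."""
    return _request("DELETE", f"/sessions/{session_id}")
```

Each function returns a `urllib.request.Request` that can be passed to `urllib.request.urlopen`. A Kerberos-secured deployment would additionally need SPNEGO authentication, which is not covered here.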
28. SCaaS Benefits
Administrators
✓ Less maintenance on CLI
✓ Deploy software stack only on Job Server
✓ Configurations in one place
✓ Easy platform/software upgrade
Developers
✓ REST-friendly and Docker-friendly
✓ Low-latency/sub-second execution
✓ Sharing cache across jobs
✓ Modularity and easy restartability
Analysts/Scientists
✓ User-friendly interactive applications
✓ Multi-tenancy and private workspace
✓ Direct Spark SQL execution
✓ Kerberos support
Operations/Security
✓ Standardized coding and unified execution
✓ Uniform logging, monitoring and alerting
✓ Fine-grained audit
✓ Complete statement-level history and metrics