An overview of how we use Amazon Web Services at Mendeley. I go into more detail on generating PDF previews for the 13 TB of files on our site, and on scaling Solr search on EC2 to handle variable load.
5. Mendeley helps researchers work smarter
1) Install Mendeley Desktop
–Automatic data extraction
2) Manage your research papers
6. Mendeley helps researchers work smarter
1) Install Mendeley Desktop
–External database integration
2) Manage your research papers
7. Mendeley helps researchers work smarter
1) Install Mendeley Desktop
–Automatic bibliography generation
2) Manage your research papers
8. Mendeley helps researchers work smarter
1) Install Mendeley Desktop
–Tagging and annotation
2) Manage your research papers
9. Mendeley helps researchers work smarter
1) Install Mendeley Desktop
2) Manage your research papers
3) Mendeley aggregates research data in the cloud
10. By doing this, Mendeley makes science more collaborative and transparent
18. Mendeley in numbers
• 1 million users
• 130 million research articles
–40 million unique
• 14 million unique files uploaded
–13 TB in total
19. System Overview
[Diagram. Recoverable labels: Amazon Web Services, S3, EC2, EMR, Web Server (×3), MySQL (×3), Syncing, Browsing, Docs, Usage Logs, Data Services, Map Reduce, HBase, HDFS]
20. File Storage
• Sync to and from clients
–Backed onto S3
• How to render 13 TB of PDFs?
22. Adapt to take advantage
• Improve delivery
–CloudFront
–Faster worldwide
• Re-working for cost saving
–SQS
–Spot instances
–Render when it’s cheapest!
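The "render when it's cheapest" idea can be sketched as a worker that leaves jobs on the queue while the spot price is above a chosen ceiling. This is only an illustration, not the pipeline described in the talk: `RenderJob`, `drain_when_cheap`, the price series and the ceiling are all hypothetical, and an in-memory deque stands in for SQS.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class RenderJob:
    """Hypothetical unit of work: one PDF that needs a preview rendered."""
    document_id: str


def drain_when_cheap(queue, spot_prices, max_price, jobs_per_tick):
    """Render queued jobs only in ticks whose spot price is at or below max_price.

    `queue` is a deque standing in for an SQS queue (an assumption).
    `spot_prices` is one made-up price per time tick.
    Returns (tick_index, document_id) pairs in the order they were rendered.
    """
    rendered = []
    for tick, price in enumerate(spot_prices):
        if price > max_price:
            continue  # too expensive: jobs simply wait on the queue
        for _ in range(min(jobs_per_tick, len(queue))):
            job = queue.popleft()
            rendered.append((tick, job.document_id))
    return rendered


if __name__ == "__main__":
    jobs = deque(RenderJob(f"doc-{i}") for i in range(5))
    done = drain_when_cheap(jobs, [0.12, 0.04, 0.09, 0.03],
                            max_price=0.05, jobs_per_tick=2)
    print(done)       # jobs rendered during the two cheap ticks
    print(len(jobs))  # jobs still waiting for the next cheap tick
```

Because the queue holds the backlog, nothing is lost when no instance is running. Work is simply deferred until capacity is cheap again.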
23. Article Search
• 40 million papers
• Gives 40GB index in Solr
• Variable load
–Twofold variance in traffic over a week
• Moved to EC2
–Elastic Load Balancer
–Auto-scale instances
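A twofold swing in traffic is the kind of load auto-scaling is meant to absorb. The sizing arithmetic can be sketched as below. The function name, the per-instance capacity and the bounds are assumptions for illustration; the talk does not give the actual scaling policy.

```python
import math


def desired_instances(queries_per_second, capacity_per_instance,
                      minimum=1, maximum=10):
    """Return how many search instances to run for the given load.

    All numbers are hypothetical. The result is clamped to the range
    [minimum, maximum], much as an auto-scaling group has size limits.
    """
    needed = math.ceil(queries_per_second / capacity_per_instance)
    return max(minimum, min(maximum, needed))


if __name__ == "__main__":
    quiet, busy = 100, 200  # made-up load with a twofold variance
    print(desired_instances(quiet, capacity_per_instance=50))
    print(desired_instances(busy, capacity_per_instance=50))
```

With a fixed fleet you would pay for the busy-period size all week. Scaling with load means paying for the quiet-period size most of the time.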
24. Solr Instance Layout
• Master
–Single instance Master
–Matched to indexing load
–Backed onto EBS
• Slaves
–HTTP sync to master
–Pre-built AMI images
–EC2 auto scaling
[Diagram. Recoverable labels: Solr master, Solr slave (×3), Elastic Load Balancer]
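The HTTP sync between slaves and master can be sketched with the `indexversion` and `fetchindex` commands of Solr's replication handler. The sketch only builds the request URLs and applies a simplified "fetch if the versions differ" rule. The host names and the `/replication` path are assumptions that depend on the Solr version and core layout, and nothing here is taken from Mendeley's configuration.

```python
from urllib.parse import urlencode


def replication_url(base_url, command, **params):
    """Build a replication-handler URL, e.g. .../replication?command=indexversion.

    `base_url` and the '/replication' path are assumptions for illustration.
    """
    query = urlencode({"command": command, **params})
    return f"{base_url.rstrip('/')}/replication?{query}"


def slave_should_fetch(master_version, slave_version):
    """Simplified rule: pull a new index whenever the versions differ."""
    return master_version != slave_version


if __name__ == "__main__":
    master = "http://solr-master.example:8983/solr"  # hypothetical host
    slave = "http://solr-slave.example:8983/solr"    # hypothetical host
    print(replication_url(master, "indexversion"))
    if slave_should_fetch(master_version=42, slave_version=41):
        print(replication_url(slave, "fetchindex"))
```

Since each slave pulls the index itself, an instance launched from a pre-built AMI can catch up with the master without any per-instance setup.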
25. Desktop Client
• Client Downloads
–From S3
–Adding CloudFront
• Crash Reports
–Stack traces into S3
–Analytic reports on top
–More focused bug fixing
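One way analytic reports on crash data can focus bug fixing is to group stack traces by a signature and rank the groups by frequency. The sketch below uses the top frame as a crude signature. The trace format and function names are hypothetical, and a plain list stands in for the reports stored in S3.

```python
from collections import Counter


def top_frame(stack_trace):
    """Return the first non-empty line of a trace as a crude crash signature."""
    for line in stack_trace.splitlines():
        stripped = line.strip()
        if stripped:
            return stripped
    return "<empty>"


def rank_crashes(stack_traces):
    """Count traces per signature, most frequent first."""
    return Counter(top_frame(trace) for trace in stack_traces).most_common()


if __name__ == "__main__":
    traces = [  # made-up examples, not real crash data
        "PdfView::paint\nmain",
        "Sync::upload\nmain",
        "PdfView::paint\nQApplication::exec",
    ]
    for signature, count in rank_crashes(traces):
        print(count, signature)
```

The most frequent signatures show which crashes affect the most users, so those can be fixed first.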
26. The future
• Aim to buy no more hardware
• More Java on Elastic Beanstalk
• SQS - replace queues
• EMR - log analysis
• SimpleDB & S3 for data stores
27. Problems Faced
• Accounting usage
–Mix of users on account
–Start early with this!
–IAM helps
• Orchestration
–CloudFormation
–Elastic Beanstalk
–Finding we need more
28. Summary
• Not all or nothing
• Focus on your problem, not “Undifferentiated heavy lifting” - Werner Vogels
• Learn the building blocks provided
• Modular system design helps
29. Mendeley Binary Battle
• $10,001 prize + $1,000 AWS vouchers
• Collaboration with PLoS
• Prizes for the best use of the API
• Judging panel includes
–Werner Vogels
–Tim O'Reilly
30. We’re hiring
http://mendeley.com/careers/
or chat to me after
• Lead Mobile Developer, iOS
• Web Developer, PHP/MySQL
• Software Engineer, Java