behind the scenes	


                 Spotify	

             Ricardo Santos	

                @ricardovice
“spotifiera”, anyone?
Main goals	

•  A big catalogue, tons of music	

•  Available everywhere	

•  Great user experience	

•  More convenient than piracy	

•  Fast	

•  Reliable, high availability	

•  Scalable to many, many users
Business idea	

•  Free ad-funded version	

•  Paid subscription where users get:	

  •  No advertisements	

  •  Mobile access	

  •  Offline playback	

  •  API access
“music itself is going to become
like running water or electricity”	

          David Bowie, 2002
Accessibility	

•  People should always be able to access
  music	

  •  Whenever they want	

  •  Wherever they are
The catalogue	

•  All content is delivered by labels	

•  Currently over 10 million tracks	

•  Growing every day, around 10k per day	

•  96-320 kbps audio streams, most are Ogg
  Vorbis q5, 160kbps
that all sounds cool,
but let’s talk engineering!
“It’s Easy, Really.” 	

    Blaine Cook, 2007
Handling Growth	

•  Scaling is not an exact science	

•  There is no such thing as a magic formula	

•  Usage patterns differ	

•  There is always a limit to what you can
   handle	

•  Fail gracefully	

•  Continuous evolution process
Usage patterns	

Typically, some services are more demanding
than others. This can be due to:

•  Higher popularity	

•  Higher complexity	

•  Both combined
Decoupling	

•  Divide and conquer!	

•  Resources assigned individually	

•  Using the right tools to address each
   problem	

•  Organization and delegation	

•  Problems are isolated	

•  Easier to handle growth
Decoupling	

Spotify’s internal services include:	

•  Access Point	

•  User	

•  Playlist	

•  Search	

•  Browse	

Can you guess which one is the most complex?
Playlist!
Playlist!	

Though it may sound simple, it is by far the
 most demanding:

•  For each user there are several playlists	

•  Push notifications	

•  Offline writing	

•  Conflict resolution without user interaction
Metadata services	

Search and Browse allow users to find music	

•  Both handle read requests	

•  But their usage and responses differ	

•  Each should have its own optimized data
   source, called an index

•  These are hard to maintain, easier to
   regenerate
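
A minimal sketch of why regenerating can be easier than maintaining: both indices below are rebuilt from scratch from the same catalogue data, so there is no incremental update logic to get wrong. The track fields (`id`, `title`, `artist`) and the function names are illustrative assumptions, not Spotify's actual schema.

```python
from collections import defaultdict

def build_search_index(tracks):
    """Inverted index: lowercase word -> set of track ids (for Search)."""
    index = defaultdict(set)
    for t in tracks:
        for word in f"{t['title']} {t['artist']}".lower().split():
            index[word].add(t["id"])
    return index

def build_browse_index(tracks):
    """Grouping index: artist -> list of track ids (for Browse)."""
    index = defaultdict(list)
    for t in tracks:
        index[t["artist"]].append(t["id"])
    return index

tracks = [
    {"id": 1, "title": "Heroes", "artist": "David Bowie"},
    {"id": 2, "title": "Starman", "artist": "David Bowie"},
]
search = build_search_index(tracks)   # answers word queries
browse = build_browse_index(tracks)   # answers "everything by this artist"
```

The same source data feeds two differently shaped structures, which is the point of the slide: one read pattern per index.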
Speed thrills
Latency matters	

•  High latency is a problem, not only in First
  Person Shooters	

•  Increased latency of Google searches by
  100 – 400ms decreased usage by 0.2 – 0.6%
  (Jake Brutlag, 2009)	

•  Slow performance is one of the major
  reasons users abandon services	

•  Users don't come back
Focus on low latency	

•  Our SLA is maintained by monitoring
  latency on the client side	

•  On average, the human notion of
  “instantly” is 200ms	

•  The current median latency to begin to
  play a track in Spotify is 265ms	

•  Due to disk lookup, at times it's actually
  faster to start playing a track from network
  than from disk
Playing a track	

•  Check local cache	

•  Request first piece from Spotify servers	

•  Meanwhile, search P2P for remainder	

•  Switch between servers and P2P as needed

•  Towards the end of a track, start pre-
  fetching the next one via P2P rather than
  our servers
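
The steps above can be sketched as a per-piece source choice. This is only an illustration: splitting a track into numbered pieces, and the names `plan_track`, `cached` and `on_peers`, are assumptions made for the example rather than details from the slides.

```python
def plan_track(num_pieces, cached, on_peers):
    """Return a list of (piece_index, source) in playback order.

    cached   -- set of piece indices already in the local cache
    on_peers -- set of piece indices that connected peers can supply
    """
    plan = []
    for i in range(num_pieces):
        if i in cached:
            source = "cache"      # check local cache first
        elif i == 0:
            source = "server"     # first piece from the servers, for low latency
        elif i in on_peers:
            source = "p2p"        # remainder from peers when they have it
        else:
            source = "server"     # otherwise fall back to the servers
        plan.append((i, source))
    return plan

plan = plan_track(4, cached=set(), on_peers={1, 2})
```

Here piece 0 comes from the servers even if peers hold it, mirroring the "request first piece from Spotify servers" step.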
When to start playing?	

•  Trade-off between stutter and latency

•  Look at last 15 min of transfer rates	

•  Model as Markov chain and simulate	

•  Coupled with some heuristics
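
A rough sketch of the idea, under stated assumptions: past transfer rates are treated as states of a Markov chain, and the download is simulated many times to estimate how often playback would stall if it started now. The one-second time step, the function names and the parameters are all hypothetical, and the heuristics mentioned above are not modelled.

```python
import random
from collections import defaultdict

def build_chain(rates):
    """Transition table: observed rate -> list of rates that followed it."""
    chain = defaultdict(list)
    for cur, nxt in zip(rates, rates[1:]):
        chain[cur].append(nxt)
    return chain

def stutter_probability(rates, buffered, remaining, bitrate, runs=200, seed=0):
    """Estimate the chance that playback stalls if it starts now.

    rates     -- recent per-second transfer rates in bytes/s (already bucketed)
    buffered  -- bytes already downloaded
    remaining -- bytes still to download
    bitrate   -- bytes consumed per second of playback (must be > 0)
    """
    rng = random.Random(seed)
    chain = build_chain(rates)
    stalls = 0
    for _ in range(runs):
        state, buf, left = rates[-1], buffered, remaining
        while left > 0:
            got = min(state, left)      # one simulated second of download
            buf += got - bitrate        # ...and one second of playback
            left -= got
            if buf < 0:                 # buffer ran dry: this run stutters
                stalls += 1
                break
            # step the chain; a state with no recorded successor repeats itself
            state = rng.choice(chain.get(state) or [state])
    return stalls / runs
```

A client could then start playback once the estimate drops below some chosen threshold. At 160 kbps the playback rate is 20,000 bytes per second, which is the value used for `bitrate` in the checks for this sketch.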
Production storage	

•  Production storage is a cache with fast
  drives and lots of RAM

•  Serves the most popular content	

•  A cache miss will generate a request to
  master storage, slightly higher latency	

•  Production storage is available in several
  data centers to ensure closeness to the
  user (latency wise)
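
The cache-miss path can be illustrated with a small read-through cache. The slides do not say which eviction policy production storage uses; least-recently-used is assumed here purely for the example, and `master` stands in for a hypothetical master-storage lookup.

```python
from collections import OrderedDict

class ProductionCache:
    """Read-through cache in front of a slower master store (LRU is an assumption)."""

    def __init__(self, master, capacity):
        self.master = master        # callable: track_id -> bytes
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, track_id):
        if track_id in self.items:
            self.items.move_to_end(track_id)   # mark as recently used
            self.hits += 1
            return self.items[track_id]
        self.misses += 1
        data = self.master(track_id)           # slower path: ask master storage
        self.items[track_id] = data
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)     # evict the least recently used
        return data
```

Popular tracks stay resident because every hit moves them to the recently-used end, while rarely played tracks fall out and are fetched again on demand.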
Master storage	

•  Works as a DHT, with some redundancy	

•  Contains all available tracks but has slower
  drives and access	

•  Tracks are kept in several formats, adding
  up to around 290TB
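
One common way to build a DHT with redundancy is a consistent-hashing ring in which each key is stored on several consecutive nodes. The slides do not describe how master storage places tracks, so the sketch below is a generic illustration; the node names, the `Ring` class and the replica count are assumptions.

```python
import bisect
import hashlib

def _h(key):
    """Map a string to a point on the ring."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")

class Ring:
    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self.points = sorted((_h(n), n) for n in nodes)
        self.keys = [p for p, _ in self.points]

    def nodes_for(self, track_id):
        """Return the nodes that should hold this track, primary first."""
        start = bisect.bisect_right(self.keys, _h(track_id)) % len(self.points)
        count = min(self.replicas, len(self.points))
        return [self.points[(start + i) % len(self.points)][1] for i in range(count)]

ring = Ring(["store-1", "store-2", "store-3", "store-4"], replicas=2)
owners = ring.nodes_for("track:123")
```

If the primary node is unavailable, a reader can try the next node in the returned list.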
P2P helps	

•  Easier to scale	

•  Fewer servers

•  Less bandwidth	

•  Better uptime	

•  Lower costs

•  Fun!
P2P overview	

•  Not a piracy network, all tracks are added
  by Spotify	

•  Used on all desktop clients (no mobile)	

•  Each client is connected to at most 60 others

•  All nodes are equals (no super nodes)	

•  A track is downloaded from several peers
P2P custom protocol	

•  Ask for most urgent pieces first	

•  If a peer is slow, re-request from new
  peers	

•  When buffers run low, download from
  central servers	

•  If loading from servers, estimate at what
  point P2P will catch up	

•  If buffers are very low, stop uploading
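
Some of these rules can be written as a small decision function. The thresholds (`low`, `very_low`, in seconds of buffered audio) and all names are hypothetical; the slides give no concrete values, and re-requesting from new peers is left out.

```python
def plan_requests(missing, playhead, buffer_seconds, low=10.0, very_low=3.0):
    """Decide what to request next, from where, and whether to keep uploading.

    missing        -- set of piece indices not yet downloaded
    playhead       -- index of the piece currently being played
    buffer_seconds -- seconds of audio buffered ahead of the playhead
    """
    # Most urgent first: the pieces closest to the playhead.
    urgent_first = sorted(p for p in missing if p >= playhead)
    # Low buffer: go to the central servers instead of peers.
    source = "server" if buffer_seconds < low else "p2p"
    # Very low buffer: keep all bandwidth for ourselves.
    allow_upload = buffer_seconds >= very_low
    return urgent_first, source, allow_upload

decision = plan_requests({7, 3, 5, 1}, playhead=3, buffer_seconds=5.0)
```

Piece 1 is dropped from the request list because playback has already moved past it.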
P2P finding peers	

•  Partial central tracker (BitTorrent-style)	

•  Broadcast query in small neighborhood
  (Gnutella-style)	

•  Using two mechanisms results in higher
  availability

•  Limited broadcast for local (LAN) peer
  discovery (cherry on top...)
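
A sketch of combining the two lookup mechanisms: ask the tracker, then flood a query a limited number of hops through the overlay, and merge the answers. The data structures (`tracker`, `overlay`, `holdings`) and the hop limit are assumptions for illustration, and LAN discovery is not modelled.

```python
def find_peers(me, track_id, tracker, overlay, holdings, max_hops=2):
    """Return the set of peers believed to have the track.

    tracker  -- dict: track_id -> peers the tracker remembers (may be partial)
    overlay  -- dict: peer -> set of peers it is connected to
    holdings -- dict: peer -> set of track ids that peer has cached
    """
    found = set(tracker.get(track_id, ()))          # tracker-style lookup
    seen = {me}
    frontier = {me}
    for _ in range(max_hops):                       # limited broadcast
        frontier = {n for p in frontier for n in overlay.get(p, ())} - seen
        seen |= frontier
        found |= {p for p in frontier if track_id in holdings.get(p, ())}
    found.discard(me)
    return found

overlay = {"me": {"a"}, "a": {"b"}, "b": {"c"}}
holdings = {"b": {"t"}, "c": {"t"}}
peers = find_peers("me", "t", {"t": {"z"}}, overlay, holdings)
```

Peer `c` also has the track but is three hops away, so only the tracker's answer `z` and the nearby peer `b` are returned.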
P2P security	

•  The P2P network needs to be a safe and
  trusted one	

•  All exchanged files have to come originally
  from Spotify	

•  All peers should be trusted Spotify clients
Security through obscurity

•  Our client needs to be able to read
  metadata and play music	

•  At the same time we have to prevent
  reverse engineers from doing the same

•  Therefore, we can't openly discuss the
  details
but…	

•  Closed environment	

•  Integrity of downloaded files is checked	

•  Data transfers are encrypted	

•  Usernames are not exposed in P2P
  network, all peers assigned pseudonym	

•  Software obfuscation makes life difficult for
  reverse engineers
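
The slides deliberately leave out the details, so the following only illustrates the general idea behind an integrity check: a piece received from a peer is accepted only if its hash matches a value obtained from a trusted source. SHA-256 and the function name are assumptions, not Spotify's actual scheme.

```python
import hashlib
import hmac

def verify_piece(data, expected_hex):
    """Return True if the downloaded bytes match the trusted hash."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_hex)

trusted = hashlib.sha256(b"audio piece").hexdigest()  # would come from the servers
ok = verify_piece(b"audio piece", trusted)
tampered = verify_piece(b"audio piece!", trusted)
```

A piece that fails the check would be discarded and requested again from another source.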
Software obfuscation
So, what's the
           outcome?	

•  At over 10 million users the responses are	

  •  55.4% from client cache	

  •  35.8% from the P2P network	

  •  8.8% from the servers
Oh, and
we have
cake as
well! :D

spotify.com/jobs
jobs@spotify.com
I'd like to know more...	

•  Get in touch with us	

•  Check out Gunnar Kreitz's slides and
  academic papers on the subject:

http://www.csc.kth.se/~gkreitz/spotify-p2p10/
Thanks!	

http://commons.wikimedia.org/wiki/File:Surprised_young_cat.JPG	


http://commons.wikimedia.org/wiki/File:Chicken_February_2009-1.jpg	


http://xkcd.com/257/

Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 

Spotify: behind the scenes

  • 1. behind the scenes Spotify Ricardo Santos @ricardovice
  • 3. Main goals •  A big catalogue, tons of music •  Available everywhere •  Great user experience •  More convenient than piracy •  Fast •  Reliable, high availability •  Scalable to many, many users
  • 4. Business idea •  Free ad-funded version •  Paid subscription where users get: •  No advertisements •  Mobile access •  Offline playback •  API access
  • 5. “music itself is going to become like running water or electricity” David Bowie, 2002
  • 6. Accessibility •  People should always be able to access music •  Whenever they want •  Wherever they are
  • 10. The catalogue •  All content is delivered by labels •  Currently over 10 million tracks •  Growing every day, around 10k per day •  96-320 kbps audio streams, most are Ogg Vorbis q5, 160kbps
  • 11. that all sounds cool, but let’s talk engineering!
  • 12. “It’s Easy, Really.” Blaine Cook, 2007
  • 13. Handling Growth •  Scaling is not an exact science •  There is no such thing as a magic formula •  Usage patterns differ •  There is always a limit to what you can handle •  Fail gracefully •  Continuous evolution process
  • 14. Usage patterns Typically, some services are more demanding than others; this can be due to: •  Higher popularity •  Higher complexity •  Both combined
  • 15. Decoupling •  Divide and conquer! •  Resources assigned individually •  Using the right tools to address each problem •  Organization and delegation •  Problems are isolated •  Easier to handle growth
  • 16. Decoupling Spotify’s internal services include: •  Access Point •  User •  Playlist •  Search •  Browse Can you guess which one is the most complex?
  • 18. Playlist! Though it may sound simple, by far the most demanding: •  For each user there are several playlists •  Push notifications •  Offline writing •  Conflict resolution without user interaction
  • 19. Metadata services Search and Browse allow users to find music •  Both handle read requests •  But their usage and responses differ •  Each needs a data source optimized for it, called an index •  Indices are hard to maintain, easier to regenerate
  • 22. Latency matters •  High latency is a problem, not only in First Person Shooters •  Increasing the latency of Google searches by 100–400 ms decreased usage by 0.2–0.6% (Jake Brutlag, 2009) •  Slow performance is one of the major reasons users abandon services •  Users don't come back
  • 23. Focus on low latency •  Our SLA is maintained by monitoring latency on the client side •  On average, the human notion of “instantly” is 200 ms •  The current median latency to begin to play a track in Spotify is 265 ms •  Because of disk lookup times, it is sometimes faster to start playing a track from the network than from disk
  • 24. Playing a track •  Check local cache •  Request first piece from Spotify servers •  Meanwhile, search P2P for remainder •  Switch between servers and P2P as needed •  Towards the end of a track, start pre-fetching the next one via P2P rather than our servers
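The order of sources on this slide can be sketched as a small planning function. This is only an illustration: the function name `plan_fetch` and the inputs `cached` and `on_p2p` (sets of piece indices) are assumptions, not part of the Spotify client.

```python
def plan_fetch(n_pieces, cached, on_p2p):
    """Return (piece_index, source) pairs in playback order.

    cached and on_p2p are hypothetical sets of piece indices that are
    available in the local cache and from peers, respectively.
    """
    plan = []
    for i in range(n_pieces):
        if i in cached:
            source = "cache"    # check the local cache first
        elif i == 0:
            source = "server"   # first piece is requested from the servers
        elif i in on_p2p:
            source = "p2p"      # remainder comes from peers when available
        else:
            source = "server"   # switch back to the servers as needed
        plan.append((i, source))
    return plan
```

With nothing cached and peers holding pieces 0 to 2 of a four-piece track, the plan starts on the servers, moves to P2P, and falls back to the servers for the last piece.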
  • 25. When to start playing? •  Trade-off between stutter and latency •  Look at last 15 min of transfer rates •  Model as Markov chain and simulate •  Coupled with some heuristics
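A minimal sketch of the idea on this slide, assuming one transfer-rate sample per second that has already been bucketed into a few discrete values. The function names, the units (kB and kB/s) and the first-order chain built from raw follower lists are illustrative assumptions; the slide does not describe the actual model or the heuristics.

```python
import random
from collections import defaultdict


def build_chain(rates):
    """Map each observed rate to the list of rates that followed it."""
    chain = defaultdict(list)
    for current, following in zip(rates, rates[1:]):
        chain[current].append(following)
    return chain


def stutter_probability(rates, buffered, remaining, bitrate, runs=1000, seed=0):
    """Estimate the chance of a buffer underrun if playback started now.

    rates:     recent transfer rates in kB/s, one sample per second
    buffered:  kB already downloaded
    remaining: kB still to download
    bitrate:   playback consumption in kB/s (must be > 0)
    """
    rng = random.Random(seed)
    chain = build_chain(rates)
    stutters = 0
    for _ in range(runs):
        rate, buf, left = rates[-1], buffered, remaining
        while left > 0:
            got = min(rate, left)
            left -= got
            buf += got - bitrate          # one second of download and playback
            if buf < 0:
                stutters += 1             # playback ran dry in this simulation
                break
            followers = chain.get(rate)
            rate = rng.choice(followers) if followers else rate
    return stutters / runs
```

A client could start playback once the estimate drops below a chosen threshold. For scale, a 160 kbps stream consumes 20 kB/s.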
  • 27. Production storage •  Production storage is a cache with fast drives and lots of RAM •  Serves the most popular content •  A cache miss will generate a request to master storage, with slightly higher latency •  Production storage is available in several data centers to ensure closeness to the user (latency-wise)
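The cache-in-front-of-master arrangement can be sketched as a read-through cache. The class name, the dict-like `master` argument and the LRU eviction policy are assumptions for illustration; the slide does not say how production storage chooses what to keep.

```python
from collections import OrderedDict


class ProductionCache:
    """Read-through cache in front of a slower master store (sketch)."""

    def __init__(self, master, capacity):
        self.master = master          # hypothetical dict-like master storage
        self.capacity = capacity      # maximum number of cached tracks
        self._lru = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, track_id):
        if track_id in self._lru:
            self._lru.move_to_end(track_id)   # mark as most recently used
            self.hits += 1
            return self._lru[track_id]
        self.misses += 1
        data = self.master[track_id]          # cache miss: ask master storage
        self._lru[track_id] = data
        if len(self._lru) > self.capacity:
            self._lru.popitem(last=False)     # evict the least recently used
        return data
```

Popular tracks stay in the cache because every hit moves them to the most recently used end, while rarely played tracks are evicted and fetched again from master storage on demand.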
  • 28. Master storage •  Works as a DHT, with some redundancy •  Contains all available tracks but has slower drives and access •  Tracks are kept in several formats, adding up to around 290TB
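One common way to build a DHT with redundancy is a consistent-hashing ring in which each key is stored on several consecutive nodes. The sketch below shows that general technique only; the slides do not say which DHT design master storage uses, and the class name, node names and replica count are assumptions.

```python
import hashlib
from bisect import bisect_right


def _position(key):
    """Place a string on the ring using the first 8 bytes of its SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class Ring:
    """Consistent-hashing ring that assigns each key to several nodes."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self._points = sorted((_position(n), n) for n in nodes)

    def nodes_for(self, track_id):
        """Return the nodes responsible for track_id, primary first."""
        positions = [p for p, _ in self._points]
        start = bisect_right(positions, _position(track_id)) % len(self._points)
        count = min(self.replicas, len(self._points))
        return [self._points[(start + i) % len(self._points)][1]
                for i in range(count)]
```

Calling `Ring(["n1", "n2", "n3", "n4"]).nodes_for("track:123")` returns two distinct nodes, so the track survives the loss of either one.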
  • 30. P2P helps •  Easier to scale •  Fewer servers •  Less bandwidth •  Better uptime •  Lower costs •  Fun!
  • 31. P2P overview •  Not a piracy network, all tracks are added by Spotify •  Used on all desktop clients (no mobile) •  Each client connected to at most 60 others •  All nodes are equals (no super nodes) •  A track is downloaded from several peers
  • 32. P2P custom protocol •  Ask for most urgent pieces first •  If a peer is slow, re-request from new peers •  When buffers run low, download from central servers •  If loading from servers, estimate at what point P2P will catch up •  If buffers are very low, stop uploading
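The scheduling rules on this slide can be sketched as a single decision function. The names and the two thresholds (`low_s` and `very_low_s`, in seconds of buffered audio) are assumptions chosen for illustration; the slide gives no concrete values, and re-requesting from new peers is left out.

```python
from typing import NamedTuple, Optional


class Decision(NamedTuple):
    piece: int          # index of the piece to request next
    source: str         # "p2p" or "server"
    allow_upload: bool  # whether this client keeps uploading to peers


def next_request(missing, playhead, buffered_s,
                 low_s=3.0, very_low_s=1.0) -> Optional[Decision]:
    """Choose the most urgent missing piece and where to fetch it from."""
    if not missing:
        return None
    # Most urgent first: the earliest missing piece at or after the playhead.
    ahead = [p for p in missing if p >= playhead]
    piece = min(ahead) if ahead else min(missing)
    # When buffers run low, download from the central servers instead.
    source = "server" if buffered_s < low_s else "p2p"
    # If buffers are very low, stop uploading.
    return Decision(piece, source, buffered_s >= very_low_s)
```

For example, `next_request({5, 2, 9}, playhead=4, buffered_s=10)` picks piece 5 from P2P, while the same call with `buffered_s=0.5` picks piece 5 from the servers and pauses uploads.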
  • 33. P2P finding peers •  Partial central tracker (BitTorrent-style) •  Broadcast query in small neighborhood (Gnutella-style) •  Two mechanisms result in higher availability •  Limited broadcast for local (LAN) peer discovery (cherry on top...)
  • 34. P2P security •  The P2P network needs to be a safe and trusted one •  All exchanged files have to come originally from Spotify •  All peers should be trusted Spotify clients
  • 35. Security through obscurity •  Our client needs to be able to read metadata and play music •  At the same time we have to prevent reverse engineers from doing the same •  Therefore, we can't openly discuss the details
  • 36. but… •  Closed environment •  Integrity of downloaded files is checked •  Data transfers are encrypted •  Usernames are not exposed in P2P network, all peers assigned pseudonym •  Software obfuscation, makes life difficult for reverse engineers
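An integrity check on downloaded data is typically a hash comparison against a digest obtained from a trusted source. The sketch below shows that generic technique; the use of SHA-256, the function name and the idea that the expected digest comes from Spotify's servers are all assumptions, since the slides deliberately leave the real mechanism undisclosed.

```python
import hashlib
import hmac


def verify_piece(data: bytes, expected_hex: str) -> bool:
    """Return True if data matches the trusted digest (SHA-256 is assumed)."""
    actual_hex = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(actual_hex, expected_hex)
```

A piece received from a peer that fails this check would be discarded and requested again from another peer or from the servers.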
  • 38. So, what's the outcome? •  At over 10 million users the responses are •  55.4% from client cache •  35.8% from the P2P network •  8.8% from the servers
  • 40. Oh, and we have cake as well! :D spotify.com/jobs jobs@spotify.com
  • 41. I'd like to know more... •  Get in touch with us •  Check out Gunnar Kreitz's slides and academic papers on the subject: http://www.csc.kth.se/~gkreitz/spotify-p2p10/