Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
How to build a SQL-based data warehouse for a trillion rows in Python
By Ville Tuulos
http://tuulos.github.io/pydata-2014/...
Upcoming SlideShare
Loading in …5
×

How to build a SQL-based data warehouse for a trillion rows in Python by Ville Tuulos PyData SV 2014

860 views

Published on

In this talk, we show how and why AdRoll built a custom, high-performance data warehouse in Python which can handle hundreds of billions of data points with sub-minute latency on a small cluster of servers. This feat is made possible by a non-trivial combination of compressed data structures, meta-programming, and just-in-time compilation using Numba, a compiler for numerical Python. To enable smooth interoperability with existing tools, the system provides a standard SQL-interface using Multicorn and Foreign Data Wrappers in PostgreSQL.

Published in: Technology
  • Login to see the comments

  • Be the first to like this

How to build a SQL-based data warehouse for a trillion rows in Python by Ville Tuulos PyData SV 2014

  1. 1. How to build a SQL-based data warehouse for a trillion rows in Python By Ville Tuulos http://tuulos.github.io/pydata-2014/#/

×