How to build a SQL-based data warehouse for a trillion rows in Python by Ville Tuulos PyData SV 2014


In this talk, we show how and why AdRoll built a custom, high-performance data warehouse in Python that can handle hundreds of billions of data points with sub-minute latency on a small cluster of servers. This feat is made possible by a non-trivial combination of compressed data structures, meta-programming, and just-in-time compilation using Numba, a compiler for numerical Python. To interoperate smoothly with existing tools, the system provides a standard SQL interface through Multicorn and Foreign Data Wrappers in PostgreSQL.
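
The abstract does not say which compressed data structures the system uses. As a rough illustration of the general idea only, the sketch below applies dictionary encoding, a common columnar compression technique chosen here as an assumption, and answers a simple SQL-style count directly on the encoded column. The helper names (`dictionary_encode`, `count_equal`) and the sample column are hypothetical and are not taken from the talk.

```python
from array import array


def dictionary_encode(values):
    """Replace each distinct value with a small integer code."""
    codes = {}            # value -> integer code
    encoded = array("I")  # compact unsigned-int column
    for v in values:
        # A new value gets the next unused code; a seen value reuses its code.
        encoded.append(codes.setdefault(v, len(codes)))
    return codes, encoded


def count_equal(codes, encoded, value):
    """Roughly SELECT COUNT(*) ... WHERE col = value, on the encoded column."""
    code = codes.get(value)
    if code is None:
        return 0  # value never appears in the column
    return sum(1 for c in encoded if c == code)


# Hypothetical column of campaign names, not data from the talk.
column = ["spring", "summer", "spring", "winter", "spring"]
codes, encoded = dictionary_encode(column)
print(list(encoded))
print(count_equal(codes, encoded, "spring"))
```

Comparing small integers instead of strings keeps the scan tight, and a tight numeric loop is the kind of code a just-in-time compiler such as Numba is designed to speed up. This sketch stays in plain Python and makes no claim about how the actual system is built.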


http://tuulos.github.io/pydata-2014/#/
