20210928_pgunconf_hll_count
Kohei KaiGai
About the HyperLogLog feature of PG-Strom. Slides from the PostgreSQL Unconference (online) held on 2021.09.28.
1.
Building a "roughly right" COUNT(distinct …) with HyperLogLog
KaiGai Kohei <kkaigai@heterodb.com>, Chief Architect and CEO, HeteroDB,Inc
2.
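About me / About HeteroDB
PostgreSQL Unconference online, 2021.09.28

Company overview
  Name: HeteroDB,Inc (ヘテロDB株式会社)
  Founded: July 4, 2017
  Location: Osaki Bright Core 4F, 5-5-15 Kita-Shinagawa, Shinagawa-ku
  Business: sales of high-speed database products; technical consulting in the GPU and database domain
  Mission: apply heterogeneous computing technology to the database domain and provide a data analytics platform that is easy for anyone to use, inexpensive, and fast.

Representative profile
  KaiGai Kohei (海外 浩平)
  Has worked on PostgreSQL and Linux kernel development in the OSS developer community for more than 15 years, contributing upstream mainly in the areas of security and FDW.
  Certified as a "genius programmer" by the IPA MITOH software program (2006).
  Received the Inception Award at GPU Technology Conference Japan 2017.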
3.
Isn't COUNT(distinct KEY) tough?
▌SELECT COUNT(*) FROM my_table
  Counts the rows of my_table.
▌SELECT COUNT(KEY) FROM my_table
  Counts the rows of my_table whose KEY column is non-NULL.
▌SELECT COUNT(distinct KEY) FROM my_table
  Counts the unique values of the KEY column in my_table.
  ➔ This requires deduplication.
4.
Isn't COUNT(distinct KEY) tough?
▌Strategies for deduplication
  Strategy 1: Sort the input by key value beforehand, and increment an internal counter whenever the key value changes in the sorted input stream.
  Strategy 2: Keep every previously fetched key in the internal hash table of the Aggregate node to detect duplicates, then output the number of elements in the hash table at the end.

Diagram: the input stream KEY (='aaa'), KEY (='aaa'), KEY (='bbb'), KEY (='ccc'), KEY (='ccc'), KEY (='ccc'), KEY (='eee') feeds Aggregate COUNT(distinct KEY).
  Strategy 1 adds +1 four times, once for each change of key value.
  Strategy 2 ends with the internal hash table holding 'aaa', 'bbb', 'ccc', 'eee'.

Callouts on the slide:
  Sorting the input beforehand is expensive, and parallel execution does not help.
  Memory consumption is unpredictable.
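The two strategies can be sketched in a few lines of Python. This is only an illustration of the idea, not PostgreSQL's executor code; the function names `count_distinct_sorted` and `count_distinct_hashed` are made up for this example.

```python
def count_distinct_sorted(keys):
    """Strategy 1: sort the input, then count every change of key value."""
    count = 0
    previous = object()  # sentinel that never equals a real key
    for key in sorted(keys):  # the sort is the expensive part
        if key != previous:
            count += 1
            previous = key
    return count


def count_distinct_hashed(keys):
    """Strategy 2: remember every key seen so far in a hash table."""
    seen = set()  # grows with the number of distinct keys
    for key in keys:
        seen.add(key)
    return len(seen)


keys = ["aaa", "aaa", "bbb", "ccc", "ccc", "ccc", "eee"]
print(count_distinct_sorted(keys), count_distinct_hashed(keys))
```

Both functions need either a full sort or memory proportional to the number of distinct keys, which is the cost the slide complains about.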
5.
In practice, COUNT(distinct KEY) really is tough.

    nvme=# explain select count(distinct lo_custkey) from lineorder;
                                      QUERY PLAN
    ------------------------------------------------------------------------------
     Aggregate  (cost=18896094.80..18896094.81 rows=1 width=8)
       ->  Seq Scan on lineorder  (cost=0.00..17396057.84 rows=600014784 width=6)
    (2 rows)

    nvme=# select count(distinct lo_custkey) from lineorder;
      count
    ---------
     2000000
    (1 row)

    Time: 409851.751 ms (06:49.852)

Callouts on the slide:
  600 million rows, 87GB
  No parallel scan
  Slow...
6.
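Sometimes "roughly right" is good enough
Example: count the paying users of the past week

    SELECT COUNT(distinct user_id)
      FROM access_log
     WHERE ts > now() - '1 week'::interval
       AND payment > 0;

Callouts on the slide:
  Even if the number is slightly off, the graph looks much the same!?
  If anything, a slow display is what gets irritating!!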
7.
HyperLogLog: a cardinality estimation algorithm
▌Also adopted by various Big-Data processing systems
  Amazon Redshift
  Google BigQuery
  Microsoft CitusDB
  Pivotal Greenplum
8.
A rough explanation of HyperLogLog (1/2)
SELECT count(distinct KEY) FROM tbl

Each KEY goes through the hash function:
    hash: 11111100...00111001
    hash: 10010010...00111010
    hash: 00111010...01111111
    hash: 11010110...11110100
    hash: 01101111...01000001
    hash: 10110100...10001011

Taking the last hash, 10110100 ... 11011011 10111000 10001011:
  Register selector: the trailing 8 bits (10001011b = 139)
  Count the number of contiguous zero bits: 3, stored in regs[139]

HLL Sketch (array of 2^N registers):
    regs[255], regs[254], regs[253], ..., regs[139] = 3, ..., regs[2], regs[1], regs[0]
9.
A rough explanation of HyperLogLog (2/2)
SELECT count(distinct KEY) FROM tbl

The same keys and hash values as in (1/2), after every register has been filled in. For the hash 10110100 ... 11011011 10111000 10001011, the register selector is 10001011b = 139 and the number of contiguous zero bits is counted.

HLL Sketch (array of 2^N registers):
    regs[255] = 5
    regs[254] = 2
    regs[253] = 3
    ...
    regs[139] = 3
    ...
    regs[2]   = 4
    regs[1]   = 5
    regs[0]   = 3

Take the harmonic mean of these:

$$\frac{n}{\sum_{i=1}^{n} regs[i]^{-1}}$$
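The register update and the final estimate can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not PG-Strom's implementation: SHA-1 truncated to 64 bits stands in for the hash function, each register stores the number of contiguous zero bits plus one, and the estimate uses the standard HyperLogLog estimator (a scaled harmonic mean of 2^regs[j] with a small-range correction), which is more detailed than the simplified formula on the slide. The names `hash64`, `hll_sketch`, and `hll_estimate` are made up for this example.

```python
import hashlib
import math


def hash64(key):
    """Assumed hash function: the first 64 bits of SHA-1 over str(key)."""
    digest = hashlib.sha1(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hll_sketch(keys, index_bits=8):
    """Build an array of 2**index_bits registers from the keys."""
    m = 1 << index_bits
    regs = [0] * m
    for key in keys:
        h = hash64(key)
        idx = h & (m - 1)        # trailing bits select the register
        rest = h >> index_bits   # remaining bits are scanned for zeros
        if rest == 0:
            rank = 64 - index_bits + 1
        else:
            # position of the lowest 1 bit = trailing zero bits + 1
            rank = (rest & -rest).bit_length()
        if rank > regs[idx]:
            regs[idx] = rank     # each register keeps the maximum seen
    return regs


def hll_estimate(regs):
    """Standard HyperLogLog estimate; the alpha formula assumes m >= 128."""
    m = len(regs)
    alpha = 0.7213 / (1.0 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if raw <= 2.5 * m and zeros > 0:
        return m * math.log(m / zeros)  # linear counting for small sets
    return raw


sketch = hll_sketch(range(100000))
print(round(hll_estimate(sketch)))
```

Because a register only ever keeps a maximum, feeding the same key again never changes the sketch, which is what makes the structure count distinct values rather than rows.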
10.
HyperLogLog in PG-Strom (1/3)

    =# select hll_count(lo_custkey) from lineorder ;
     hll_count
    -----------
       2005437
    (1 row)

    Time: 9660.810 ms (00:09.661)

    =# explain verbose select hll_count(lo_custkey) from lineorder ;
                                        QUERY PLAN
    ----------------------------------------------------------------------------------
     Aggregate  (cost=4992387.95..4992387.96 rows=1 width=8)
       Output: hll_merge((pgstrom.hll_sketch_new(pgstrom.hll_hash(lo_custkey))))
       ->  Gather  (cost=4992387.72..4992387.93 rows=2 width=32)
             Output: (pgstrom.hll_sketch_new(pgstrom.hll_hash(lo_custkey)))
             Workers Planned: 2
             ->  Parallel Custom Scan (GpuPreAgg) on public.lineorder  (cost=4991387.72..4991387.73 rows=1 width=32)
                   Output: (pgstrom.hll_sketch_new(pgstrom.hll_hash(lo_custkey)))
                   GPU Output: (pgstrom.hll_sketch_new(pgstrom.hll_hash(lo_custkey)))
                   GPU Setup: pgstrom.hll_hash(lo_custkey)
                   Reduction: NoGroup
                   Outer Scan: public.lineorder  (cost=2833.33..4913260.79 rows=250006160 width=6)
                   GPU Preference: GPU0 (Tesla V100-PCIE-16GB)
                   GPUDirect SQL: enabled
                   Kernel Source: /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_374786.6.gpu
                   Kernel Binary: /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_374786.7.ptx
    (15 rows)

✓ The error is about 0.3% compared with the true value (2,000,000).
✓ Execution was more than 40 times faster (600 million rows, 87GB).
11.
HyperLogLog in PG-Strom (2/3)

    =# explain verbose select hll_count(lo_custkey) from lineorder ;

The same 15-row EXPLAIN VERBOSE output as in (1/3), with three annotations on the plan:

  GPU Setup: pgstrom.hll_hash(lo_custkey)
    Before the Reduction step, the hash value for HLL is computed (200 million rows).

  GPU Output: (pgstrom.hll_sketch_new(pgstrom.hll_hash(lo_custkey)))
    Only a single HLL Sketch, built from the 200 million hash values, is returned (512 bytes).

  Aggregate, Output: hll_merge((pgstrom.hll_sketch_new(pgstrom.hll_hash(lo_custkey))))
    The HLL Sketches coming up from each worker are merged, and the cardinality is estimated based on the harmonic mean.
12.
HyperLogLog in PG-Strom (3/3)

Data flow shown in the diagram:

  GPU world (massive data, massively parallel processing), three parallel streams:
    KEY KEY KEY ... -> hll_hash() -> bigint -> hll_sketch_new() -> HLL Sketch (bytea)

  CPU world:
    HLL Sketch, HLL Sketch, HLL Sketch (bytea) -> hll_merge() -> Result (bigint)
13.
Applying HyperLogLog to time-series data (1/3)

The same data flow as in "HyperLogLog in PG-Strom (3/3)":

  GPU world (massive data, massively parallel processing), three parallel streams:
    KEY KEY KEY ... -> hll_hash() -> bigint -> hll_sketch_new() -> HLL Sketch (bytea)

  CPU world:
    HLL Sketch, HLL Sketch, HLL Sketch (bytea) -> hll_merge() -> Result (bigint)

Callout on the HLL Sketches: why not save these, and later merge only the ones we need?
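Saving sketches and merging them later works because each register holds a maximum, so the element-wise maximum of two sketches equals the sketch of the combined input. The following Python sketch illustrates that property under assumptions of its own: SHA-1 truncated to 64 bits as the hash, 512 registers, and the made-up helper names `build_sketch` and `merge_sketches`. It does not reproduce PG-Strom's `hll_merge()` or its bytea layout.

```python
import hashlib

REGISTER_BITS = 9  # 2**9 = 512 registers; an assumption for this example


def build_sketch(keys):
    """Build an HLL-style register array (assumed SHA-1 based hash)."""
    m = 1 << REGISTER_BITS
    regs = [0] * m
    for key in keys:
        digest = hashlib.sha1(str(key).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        rest = h >> REGISTER_BITS
        # trailing zero bits + 1, or the maximum rank if no bit is set
        rank = (rest & -rest).bit_length() if rest else 64 - REGISTER_BITS + 1
        idx = h & (m - 1)
        regs[idx] = max(regs[idx], rank)
    return regs


def merge_sketches(sketches):
    """Merge sketches by taking the element-wise maximum of the registers."""
    merged = None
    for regs in sketches:
        if merged is None:
            merged = list(regs)
        else:
            merged = [max(a, b) for a, b in zip(merged, regs)]
    return merged


# One stored sketch per (hypothetical) year, with overlapping key ranges.
per_year = {
    1992: build_sketch(range(0, 1000)),
    1993: build_sketch(range(500, 2000)),
    1994: build_sketch(range(1500, 3000)),
}

# Merge only the years that are needed, without rescanning the raw keys.
up_to_1993 = merge_sketches(s for year, s in per_year.items() if year <= 1993)
print(up_to_1993 == build_sketch(range(0, 2000)))
```

Keys that appear in more than one year are not double counted, since a repeated key can never raise a register above the value it already produced.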
14.
Applying HyperLogLog to time-series data (2/3)

    --- Artificially create a state where older dates have lower cardinality
    nvme=# delete from lineorder where lo_custkey % 10 = 8 and lo_orderdate < 19980101;
           delete from lineorder where lo_custkey % 10 = 7 and lo_orderdate < 19970101;
           delete from lineorder where lo_custkey % 10 = 6 and lo_orderdate < 19960101;
           delete from lineorder where lo_custkey % 10 = 5 and lo_orderdate < 19950101;
           delete from lineorder where lo_custkey % 10 = 4 and lo_orderdate < 19940101;
           delete from lineorder where lo_custkey % 10 = 3 and lo_orderdate < 19930101;
    DELETE 54657643
        :
    DELETE 9119874

    --- Extract one HLL Sketch per "year" of lo_orderdate
    nvme=# select lo_orderdate / 10000 as year, hll_sketch(lo_custkey) as sketch
             into pg_temp.annual
             from lineorder
            group by 1;
    SELECT 7

    --- The raw data is hard to read, so display it as a histogram
    nvme=# select year, hll_sketch_histogram(sketch)
             from pg_temp.annual
            order by year;
     year |                 hll_sketch_histogram
    ------+-------------------------------------------------------
     1992 | {0,0,0,0,0,0,0,0,0,22,73,132,118,82,39,26,12,2,4,2}
     1993 | {0,0,0,0,0,0,0,0,0,9,59,118,125,96,50,30,15,2,6,2}
     1994 | {0,0,0,0,0,0,0,0,0,4,33,111,133,113,53,36,17,4,6,2}
     1995 | {0,0,0,0,0,0,0,0,0,2,21,99,131,121,62,42,18,5,7,3,1}
     1996 | {0,0,0,0,0,0,0,0,0,1,17,84,119,131,73,50,20,5,7,4,1}
     1997 | {0,0,0,0,0,0,0,0,0,0,14,71,118,128,82,53,23,10,7,4,2}
     1998 | {0,0,0,0,0,0,0,0,0,0,13,64,114,126,86,61,23,11,8,4,2}
    (7 rows)
15.
Applying HyperLogLog to time-series data (3/3)

    --- Merge the per-year sketches before year max_y and estimate the cardinality
    nvme=# select max_y, (select hll_merge(sketch)
                            from pg_temp.annual
                           where year < max_y)
             from generate_series(1993,1999) max_y;
     max_y | hll_merge
    -------+-----------
      1993 |    854093    (error: 6.78%)
      1994 |   1052429    (error: 5.24%)
      1995 |   1299916    (error: 8.33%)
      1996 |   1514915    (error: 8.21%)
      1997 |   1700274    (error: 6.26%)
      1998 |   1889527    (error: 4.97%)
      1999 |   2005437    (error: 0.32%)
    (7 rows)

    --- Checking the answers (exact aggregation with COUNT(distinct …))
    nvme=# select max_y, (select count(distinct lo_custkey)
                            from lineorder
                           where lo_orderdate < max_y)
             from generate_series(19930101,19990101,10000) max_y;
      max_y   |  count
    ----------+---------
     19930101 |  799862
     19940101 |  999957
     19950101 | 1199955
     19960101 | 1399962
     19970101 | 1599962
     19980101 | 1799962
     19990101 | 1998978
    (7 rows)
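The error figures can be checked by hand as (estimate - exact) / exact. The short Python sketch below recomputes them from the two result tables, pairing max_y = 1993 with 19930101 and so on; the variable and function names are made up for this example.

```python
# hll_merge() estimates, keyed by max_y (values taken from the first result)
estimated = {
    1993: 854093, 1994: 1052429, 1995: 1299916, 1996: 1514915,
    1997: 1700274, 1998: 1889527, 1999: 2005437,
}

# exact COUNT(distinct lo_custkey) results for the matching cut-off dates
exact = {
    1993: 799862, 1994: 999957, 1995: 1199955, 1996: 1399962,
    1997: 1599962, 1998: 1799962, 1999: 1998978,
}


def relative_error_percent(estimate, true_value):
    """Signed relative error of the estimate, in percent."""
    return (estimate - true_value) / true_value * 100.0


errors = {
    year: relative_error_percent(estimated[year], exact[year])
    for year in estimated
}

for year, err in errors.items():
    print(f"{year}: {err:.2f}%")
```

All seven estimates are above the exact count, so every signed error is positive.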
16.
Conclusion
Example: count the paying users of the past week

    SELECT HLL_COUNT(user_id)
      FROM access_log
     WHERE ts > now() - '1 week'::interval
       AND payment > 0;

Callouts on the slide:
  Even if the number is slightly off, the graph looks much the same!?
  Wow, this is really fast! Absolutely the best.