GPU version of PostGIS and GiST-Index
~A new horizon of geospatial data analytics~
HeteroDB,Inc
Chief Architect & CEO
KaiGai Kohei <kaigai@heterodb.com>
About Us
About Myself
• KaiGai Kohei (海外浩平)
• Chief Architect & CEO of HeteroDB
• Contributor to PostgreSQL (2006-)
• Primary developer of PG-Strom (2012-)
• Interested in: big data, GPU, NVMe/PMEM, ...
About HeteroDB
• Established: 4th-Jul-2017
• Location: Shinagawa, Tokyo, Japan
• Businesses:
✓ Development of high-performance data-processing software on top of heterogeneous architecture.
✓ Technology consulting service in the GPU & DB area.
PGconf.online 2021 - GPU version of PostGIS and GiST-Index
PG-Strom
What is PG-Strom?
PG-Strom: an extension of PostgreSQL that draws out the maximum capability of GPU and NVMe for processing terabyte-scale data
• Adds custom-scan/join/aggregate paths with GPU execution
• Generates GPU binary code from SQL, with just-in-time compilation
• Designed for IoT/M2M-grade log-data processing
• Closely connects NVMe SSD and GPU using P2P data transfer [I/O acceleration]
• Geospatial analytics with a GPU version of PostGIS and GiST-index support
[Figure: applications off-load processing to the GPU through PG-Strom, for IoT/Big-Data and for ML/Analytics]
PG-Strom’s Features
➢ JIT of GPU code from SQL & transparent SQL acceleration
➢ SSD-to-GPU Direct SQL
➢ Columnar Store (Arrow_Fdw)
➢ Asymmetric Partition-wise JOIN/GROUP BY
➢ GPU Memory Store
➢ PostGIS support
➢ GiST-Index on GPU
➢ BRIN-Index support
➢ NVME-over-Fabric support
➢ Data-frame exchange for Python scripts
GPU’s overview
Over 10 years of history in HPC, followed by massive adoption in machine learning
[Figure: NVIDIA Tesla V100 and its application areas: supercomputers (Tokyo Institute of Technology; TSUBAME3.0), computer graphics, machine learning, simulation]
Thousands of cores and TB/s-class memory bandwidth on a single chip, for highly compute-intensive workloads.
Target: Fast data search of mobile devices
▌Mobile devices
• Location (longitude, latitude) is updated very frequently.
• The log data typically contains (device_id, timestamp, location (point), other attributes).
▌Area definitions
• Relatively few items, and almost static data.
• Polygons often have very complicated shapes, so “collision detection” against them is heavy.
▌Purpose
• Area marketing, logistics analytics, advertisement delivery, emergency alerts, etc.
[Figure: mobile devices report their latest location (point) via GPS; the devices within the target areas (polygons) are extracted]
GPU-version PostGIS
About PostGIS (1/2)
• An extension that adds the geometry type, functions and operators.
• First released in 2005, then under continuous development for over 15 years.
• More than 400 functions / operators
• R-tree on the GiST-index framework
• Parallel query execution
© GAIA RESOURCES © OSGeo © KTGIS.net
About PostGIS (2/2)
[Figure: examples of geometry functions: St_Distance(a,b), St_Crosses(a,b), St_Contains(a,b)]
PostGIS’s optimization (1/2)
▌Bounding Box
• The smallest axis-aligned rectangle that contains a geometry, such as a polygon (which is often very complicated).
• Stored in (x1,y1) - (x2,y2) form as FP32 values (= 16 bytes)
• PostGIS attaches a bounding box when it stores a geometry value
• This lets PostGIS skip the heavy geometric operations when the bounding boxes alone already decide the result
[Figure: the bounding box of Tokyo is obviously disjoint from those of Germany and France]
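The bounding-box shortcut can be illustrated with a minimal Python sketch. This is not PostGIS code: the helper names `bounding_box` and `boxes_disjoint` and the coarse rectangular outlines are hypothetical, chosen only to show why a cheap rectangle comparison can rule out the expensive geometry test.

```python
from typing import List, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounding_box(points: List[Point]) -> BBox:
    """Smallest axis-aligned rectangle that contains every vertex."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_disjoint(a: BBox, b: BBox) -> bool:
    """True when the two rectangles cannot share any point."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]


# Hypothetical, heavily simplified outlines (lon, lat); not real boundaries.
tokyo = [(139.0, 35.5), (139.9, 35.5), (139.9, 35.9), (139.0, 35.9)]
france = [(-4.8, 42.3), (8.2, 42.3), (8.2, 51.1), (-4.8, 51.1)]
germany = [(5.9, 47.3), (15.0, 47.3), (15.0, 55.1), (5.9, 55.1)]

tokyo_box = bounding_box(tokyo)
france_box = bounding_box(france)
germany_box = bounding_box(germany)

# Four comparisons are enough to skip the exact test for Tokyo vs France.
print(boxes_disjoint(tokyo_box, france_box))
# France and Germany have overlapping boxes, so the exact test is still needed.
print(boxes_disjoint(france_box, germany_box))
```

Only pairs whose boxes are not disjoint need to go on to the exact, vertex-by-vertex geometric check.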
PostGIS’s optimization (2/2)
▌R-Tree on GiST-Index
• GiST (Generalized Search Tree) - an index framework in PostgreSQL
• PostGIS implements an R-tree on it for the geometry type
• Supports bounding-box checks such as contains ('~' operator), contained-by ('@' operator) and overlaps ('&&' operator)
• Works efficiently for checking the relationship between a very large number of points and many polygons.
# Data published by Geospatial Information Authority of Japan
$ shp2pgsql N03-20_200101.shp | psql gistest
gistest=# \d+
List of relations
Schema | Name | Type | Owner | Size |
--------+-------------+----------+--------+------------+
public | geo_japan | table | kaigai | 243 MB |
gistest=# \di+
List of relations
Schema | Name | Type | Owner | Table | Size |
--------+--------------------+-------+--------+-----------+---------+
public | geo_japan_pkey | index | kaigai | geo_japan | 2616 kB |
public | geo_japan_geom_idx | index | kaigai | geo_japan | 14 MB |
Why is a GPU suited to SQL acceleration?
▌GPU characteristics
• SIMT (Single-Instruction Multiple-Thread) architecture
• Thousands of processor cores and 1.0 TB/s-class memory bandwidth
➔ Designed for “the same operation on a large number of values”
▌SQL characteristics
• “The same operation on a large number of values”, such as the evaluation of a WHERE clause, JOIN or GROUP BY.
[Figure: CPU parallel versus GPU parallel execution. SQL runs WHERE, JOIN and GROUP BY over a very large number of rows; the thousands of processor units on a GPU evaluate thousands of rows in parallel.]
Performing PostGIS functions on GPU (1/2)
postgres=# \d _gistest
Table "public._gistest"
Column | Type | Collation | Nullable | Default
--------+----------+-----------+----------+--------------------------------------
id | integer | | not null | nextval('_gistest_id_seq'::regclass)
a | geometry | | |
b | geometry | | |
postgres=# explain verbose select * from _gistest where st_contains(a,b);
QUERY PLAN
------------------------------------------------------------------------------------
Custom Scan (GpuScan) on public._gistest (cost=4251.50..4251.50 rows=1 width=196)
Output: id, a, b
GPU Filter: st_contains(_gistest.a, _gistest.b)
GPU Preference: None
Kernel Source: /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_21028.4.gpu
Kernel Binary: /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_21028.5.ptx
(6 rows)
Performing PostGIS functions on GPU (2/2)
$ less /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_21028.4.gpu
:
#include "cuda_postgis.h"
#include "cuda_gpuscan.h"
DEVICE_FUNCTION(cl_bool)
gpuscan_quals_eval(kern_context *kcxt,
                   kern_data_store *kds,
                   ItemPointerData *t_self,
                   HeapTupleHeaderData *htup)
{
    void *addr __attribute__((unused));
    pg_geometry_t KVAR_2;
    pg_geometry_t KVAR_3;

    assert(htup != NULL);
    EXTRACT_HEAP_TUPLE_BEGIN(addr, kds, htup);
    EXTRACT_HEAP_TUPLE_NEXT(addr);
    pg_datum_ref(kcxt,KVAR_2,addr); // pg_geometry_t
    EXTRACT_HEAP_TUPLE_NEXT(addr);
    pg_datum_ref(kcxt,KVAR_3,addr); // pg_geometry_t
    EXTRACT_HEAP_TUPLE_END();

    return EVAL(pgfn_st_contains(kcxt, KVAR_2, KVAR_3));
}
:
Notes on the generated code:
• The GPU code is generated automatically to evaluate the WHERE clause.
• It loads the geometry values from column a and column b.
• It calls the GPU version of st_contains() in each thread.
Basic performance (1/2)
--- GpuScan (GPU-rev PostGIS)
=# SELECT count(*) FROM ft
    WHERE st_contains('polygon ((10 10,90 10,90 12,12 12,12 88,90 88,90 90,10 90,10 10))',
                      st_makepoint(x,y));
 count
--------
 236610
(1 row)
Time: 44.680 ms
--- Vanilla PostGIS
=# SET pg_strom.enabled = off;
SET
=# SELECT count(*) FROM tt
    WHERE st_contains('polygon ((10 10,90 10,90 12,12 12,12 88,90 88,90 90,10 90,10 10))',
                      st_makepoint(x,y));
 count
--------
 236610
(1 row)
Time: 622.314 ms
Counts the points that fall inside the specified area, out of 5 million points.
Basic performance (2/2)
[Figure: the test polygon drawn inside the (0,0)-(100,100) area, with vertices (10,10), (90,10), (90,12), (12,12), (12,88), (90,88), (90,90), (10,90)]
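To make the per-point work concrete, here is a minimal Python sketch of an even-odd ray-casting test against the polygon used in the benchmark query. It is not the PostGIS or PG-Strom implementation of st_contains(), and it does not handle points lying exactly on the boundary. The area check at the end assumes the 5 million points are uniformly distributed over (0,0)-(100,100), which the slides suggest but do not state.

```python
from typing import List, Tuple

Point = Tuple[float, float]

# The ring from the benchmark query (closing vertex omitted).
RING: List[Point] = [(10, 10), (90, 10), (90, 12), (12, 12),
                     (12, 88), (90, 88), (90, 90), (10, 90)]


def point_in_ring(px: float, py: float, ring: List[Point]) -> bool:
    """Even-odd ray casting: count the edges crossed by a ray going in +x."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line y = py?
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def ring_area(ring: List[Point]) -> float:
    """Shoelace formula for the area enclosed by the ring."""
    n = len(ring)
    twice = sum(ring[i][0] * ring[(i + 1) % n][1] -
                ring[(i + 1) % n][0] * ring[i][1] for i in range(n))
    return abs(twice) / 2.0


print(point_in_ring(50, 11, RING))   # inside the bottom bar of the C shape
print(point_in_ring(50, 50, RING))   # in the hollow middle, so outside

# Assumption: uniform points over a 100 x 100 square.
expected = 5_000_000 * ring_area(RING) / (100 * 100)
print(expected)
```

Under that uniformity assumption the polygon covers 472 of 10,000 area units, so about 236,000 of 5 million points are expected inside, which is in line with the reported count of 236610.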
Current status of the supported functions
GPU-version of PostGIS
• geometry st_makepoint(float8, float8[, float8[, float8]])
• float8 st_distance(geometry,geometry)
• bool st_dwithin(geometry,geometry,float8)
• bool st_contains(geometry,geometry)
• bool st_crosses(geometry,geometry)
• text st_relate(geometry,geometry)
• ...and more in future versions
Extract Points within Polygon
The number of polygon x point combinations is too large even for a GPU
[Figure: mobile devices report their latest location (point) via GPS and are matched against area definitions (polygons). 100K-10M devices x 100-100K polygons = 10M-1T combinations?]
GiST-Index on GPU
How GiST-Index (R-tree) works
✓ R1 is a rectangle [(Xmin,Ymin) - (Xmax,Ymax)] that contains all of (R3, R4, R5), stored with pointers to them
✓ R4 is a rectangle that contains all of (R11, R12), stored with pointers to them
✓ R12 is a rectangle that contains the target geometry, stored with its ItemPointer
✓ The search evaluates the entries of each tree node sequentially, and descends to the next depth only where an entry matches.
✓ Not as fast as a B-tree, because the entries are evaluated sequentially at each depth of the R-tree.
[Figure: an R-tree searched with a search key (Lon,Lat). At each depth the entries are marked 〇 (match) or × (no match), and only the matching branches are followed down to the leaves.]
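The descent described above can be sketched in Python. This is a toy model, not the PostgreSQL GiST code: the `Node` class, the `search` function and all rectangle coordinates are hypothetical, and only the node names (R1, R3, R4, R5, R11, R12) follow the slide.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Node:
    name: str
    box: BBox
    children: List["Node"] = field(default_factory=list)  # empty for a leaf entry


def contains(box: BBox, x: float, y: float) -> bool:
    return box[0] <= x <= box[2] and box[1] <= y <= box[3]


def search(node: Node, x: float, y: float, visited: List[str]) -> List[str]:
    """Return the names of the leaf entries whose box contains (x, y)."""
    visited.append(node.name)
    if not contains(node.box, x, y):
        return []                 # prune the whole subtree
    if not node.children:
        return [node.name]        # leaf entry: candidate for the exact check
    hits: List[str] = []
    for child in node.children:   # entries are evaluated one by one
        hits.extend(search(child, x, y, visited))
    return hits


# Hypothetical rectangles; only the node names follow the slide.
r4 = Node("R4", (0, 30, 50, 70), [Node("R11", (0, 30, 25, 70)),
                                  Node("R12", (25, 30, 50, 70))])
r1 = Node("R1", (0, 0, 50, 100), [Node("R3", (0, 0, 50, 30)), r4,
                                  Node("R5", (0, 70, 50, 100))])
root = Node("root", (0, 0, 100, 100), [r1, Node("R2", (50, 0, 100, 100))])

trace: List[str] = []
print(search(root, 40, 50, trace))   # leaf entries matching the key
print(trace)                         # order in which entries were evaluated
```

The trace shows the behaviour noted in the last bullet: every entry of a visited node is compared in turn, while subtrees under a non-matching entry (here R2, R3 and R5) are never entered.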
GPU version of GiST-Index
▌Overview
• PG-Strom can use a GiST-index to join a table of area definitions (polygons; small) with a table of location data (points; large).
• Both the area-definition table and its index are loaded onto the GPU by GpuJoin.
• GpuJoin looks at the GiST-index first for rough pruning, using the bounding boxes.
• It then evaluates the “collision detection” against the polygon value in the table.
• These operations run in parallel on the thousands of cores of a GPU, so we expected the search performance to be better, but...
[Figure: collision detection of polygons x points as a part of GpuJoin. Thousands of threads search the GiST-index (R-tree) on the area definitions (polygons) in parallel, for a table with location data (points).]
Simple Test: Random points and St_Contains (2020-Sep)
10M randomly generated geolocational points, within the rectangle (123.0, 20.0) - (154.2, 46.2)
Area definition data by the Geospatial Information Authority of Japan
SELECT n03_001,n03_004,count(*)
  FROM geo_japan j, geopoint p
 WHERE st_contains(j.geom, st_makepoint(x,y))
   AND j.n03_001 like 'Tokyo'
 GROUP BY n03_001,n03_004;
n03_001 | n03_004 | count
---------+-------------+-------
Tokyo | Akiruno | 105
Tokyo | Miyake | 76
Tokyo | Mitaka | 17
Tokyo | Setagaya-ku | 67
Tokyo | Chuo-ku | 12
Tokyo | Nakano-ku | 18
Tokyo | Hachijo | 105
: : :
Tokyo | Toshima-ku | 14
Tokyo | Adachi-ku | 55
Tokyo | Aogashima | 7
Tokyo | Ome | 117
(63 rows)
CPU-rev: 30.539s
GPU-rev: 33.841s (Slow!)
Background) GPU’s Thread Scheduling
• A GPU groups its processor cores into Streaming Multiprocessors (SMs)
✓ Cores, registers and L1 cache (shared memory) are managed per SM
• A thread group of up to 1024 threads runs on one SM, sharing its 64 cores
✓ Like a very large-scale version of hyper-threading on a CPU
• Threads are scheduled per warp (32 threads); this is called the SIMT (Single-Instruction Multiple-Thread) architecture
✓ Uniform workloads, like matrix operations, fully utilize the processor cores
✓ If a particular thread consumes many cycles, the other threads in its warp must wait.
[Figure: GPU block diagram (Tesla V100; 80 SMs). Each streaming multiprocessor has 64 CUDA cores and runs thread groups (1-1024 threads), which are scheduled in warps (32 threads).]
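The effect of one slow thread on a warp can be shown with a deliberately simplified cost model in Python. The cycle counts are hypothetical, and real hardware scheduling is more nuanced; the sketch only assumes that a warp occupies all 32 lanes until its slowest thread has finished.

```python
from typing import List

WARP_SIZE = 32


def warp_cycles(thread_cycles: List[int]) -> int:
    """A warp is done only when its slowest thread is done."""
    return max(thread_cycles)


def utilization(thread_cycles: List[int]) -> float:
    """Useful cycles divided by the cycles the warp occupies on all lanes."""
    busy = sum(thread_cycles)
    occupied = warp_cycles(thread_cycles) * len(thread_cycles)
    return busy / occupied


uniform = [100] * WARP_SIZE                 # e.g. matrix-like work
skewed = [10] * (WARP_SIZE - 1) + [1000]    # one thread runs a long task

print(utilization(uniform))   # every lane is busy for the whole warp
print(utilization(skewed))    # 31 lanes mostly wait for the slow one
```

In this model the skewed warp does useful work in only about 4% of its lane-cycles, which is the situation an index search with uneven depth per thread tends to create.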
Background) Internal design of GpuJoin
SELECT * FROM A, B
 WHERE A.id = B.id;
GpuHashJoin / GpuNestLoop
• 512 threads fetch 512 rows from Table-A at once.
• Each thread evaluates its row against Table-B (〇: matched, ×: not matched).
• N = __syncthreads_count(...) counts the matched rows in the thread group.
• thread-0 allocates the result buffer for N items.
• The threads write out the JOIN results (if any).
• Fetch the next frame, and repeat the steps above.
GpuHashJoin
• Hash calculation
• Search of the hash table
• Evaluation of the JOIN condition
GpuNestLoop
• Evaluation of the JOIN condition
➔ Little difference in processing cycles between the threads
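The frame-by-frame flow above can be mimicked on the CPU with a short Python sketch. It is not PG-Strom code: `gpu_join_sim` is a hypothetical name, `sum(found)` stands in for `__syncthreads_count(...)`, and the use of an exclusive prefix sum to give each matching thread its output slot is an assumption, since the slide does not say how slots are assigned.

```python
from itertools import accumulate
from typing import Dict, List, Optional, Tuple

BLOCK_SIZE = 512  # rows fetched per frame, one per thread (from the slide)


def gpu_join_sim(table_a: List[int],
                 table_b: Dict[int, str]) -> List[Tuple[int, str]]:
    results: List[Optional[Tuple[int, str]]] = []
    for start in range(0, len(table_a), BLOCK_SIZE):
        frame = table_a[start:start + BLOCK_SIZE]
        # Each "thread" probes Table-B with its own row.
        found = [a_id in table_b for a_id in frame]
        # Stand-in for N = __syncthreads_count(found).
        n = sum(found)
        # "thread-0" reserves N slots in the result buffer.
        base = len(results)
        results.extend([None] * n)
        # Exclusive prefix sum: slot index of each matching thread
        # (assumed; the slide does not describe this step).
        ranks = [0] + list(accumulate(int(f) for f in found))[:-1]
        for tid, a_id in enumerate(frame):
            if found[tid]:
                results[base + ranks[tid]] = (a_id, table_b[a_id])
    return [r for r in results if r is not None]


table_a = list(range(1000))
table_b = {i: f"b{i}" for i in range(0, 1000, 3)}
joined = gpu_join_sim(table_a, table_b)
print(len(joined), joined[:3])
```

Because every thread does roughly the same amount of work per row in a hash join or nested loop, the single barrier per frame costs little here; the following slides show why that no longer holds for an index search.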
Issues of index-search in GPU
Large variance in per-thread processing time leads to synchronization waits!
[Figure: each thread takes a row from Table-A and references Index-A, then all threads meet at N = __syncthreads_count(...)]
• Some threads find no matched items just by looking at the root node of the R-tree.
• Some threads walk down to a leaf node of the R-tree, but find no matched item.
• Some threads walk down to a leaf node of the R-tree, and evaluate the JOIN conditions.
➔ The other GPU cores must stay idle until the longest operation completes.
Naive implementation of GPU GiST-Index Search (2020-Sep)
Low utilization ratio of the GPU cores, due to inter-core synchronization
• Every thread loads a row from the input buffer, and extracts the key for the GiST-Index.
• Search the GiST-Index by the key. Hit on the GiST-Index? If no, found=false.
• On a hit, evaluate the JOIN-condition. Is the JOIN-condition true? If yes, found=true; if no, found=false.
• nitems = __syncthreads_count(found);
• Write out the JOIN-results (if found), then repeat.
➔ A thread that found a matched entry can block the other 511 threads in the same thread-group until it completes the evaluation of the JOIN-condition. Very low efficiency of GPU core usage.
A new, better-optimized implementation (2020-Nov)
Minimizing the synchronization points, to raise the utilization ratio
• Every thread loads a row from the input buffer, and extracts the key for the GiST-Index.
• Search the GiST-Index by the key. Hit on the GiST-Index? If no, load the next row.
• On a hit, allocate space in the temporary buffer and save the pointers that hit the GiST-Index.
• Once the consumption of the temporary buffer exceeds 512 items: __syncthreads()
• Evaluate the JOIN-condition. Is the JOIN-condition true? If yes, found=true; if no, found=false.
• nitems = __syncthreads_count(found)
• Write out the JOIN results, then repeat.
➔ As long as the temporary buffer has space, threads continue to fetch rows and search the GiST-Index. Then all the threads evaluate the JOIN-conditions at once.
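The difference between the two designs can be made tangible with a toy cost model in Python. This is not a measurement and not PG-Strom code: `SEARCH_COST` and `JOIN_COST` are hypothetical cycle counts, the hit pattern is synthetic, and the model ignores differences in index-search depth between threads. It only captures one idea: in the naive design any hit makes the whole thread group wait for a JOIN-condition evaluation in that round, while the buffered design pays that cost once per full batch of hits.

```python
from typing import List

THREADS = 512       # thread-group size used on the slides
SEARCH_COST = 1     # hypothetical cycles for one index search
JOIN_COST = 50      # hypothetical cycles for one JOIN-condition evaluation


def naive_cycles(hits: List[bool]) -> int:
    """One barrier per round: a round lasts as long as its slowest thread."""
    total = 0
    for start in range(0, len(hits), THREADS):
        frame = hits[start:start + THREADS]
        total += SEARCH_COST + (JOIN_COST if any(frame) else 0)
    return total


def buffered_cycles(hits: List[bool]) -> int:
    """Index hits are queued; JOIN-conditions run only on full batches."""
    total = 0
    pending = 0
    for start in range(0, len(hits), THREADS):
        frame = hits[start:start + THREADS]
        total += SEARCH_COST
        pending += sum(frame)
        while pending >= THREADS:   # buffer full: all threads evaluate at once
            total += JOIN_COST
            pending -= THREADS
    if pending:                     # flush the last partial batch
        total += JOIN_COST
    return total


# Synthetic input: 1M rows, one index hit every 2000 rows.
hits = [i % 2000 == 0 for i in range(1_000_000)]
print(naive_cycles(hits), buffered_cycles(hits))
```

With sparse hits, almost every JOIN-condition evaluation in the naive model keeps 511 threads idle, whereas the buffered model spends nearly all of its cycles on index searches that every thread performs.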
Simple Test: Random points and St_Contains (2020-Nov; the latest)
10M randomly generated geolocational points, within the rectangle (123.0, 20.0) - (154.2, 46.2)
Area definition data by the Geospatial Information Authority of Japan
SELECT n03_001,n03_004,count(*)
  FROM geo_japan j, geopoint p
 WHERE st_contains(j.geom, st_makepoint(x,y))
   AND j.n03_001 like 'Tokyo'
 GROUP BY n03_001,n03_004;
n03_001 | n03_004 | count
---------+-------------+-------
Tokyo | Akiruno | 105
Tokyo | Miyake | 76
Tokyo | Mitaka | 17
Tokyo | Setagaya-ku | 67
Tokyo | Chuo-ku | 12
Tokyo | Nakano-ku | 18
Tokyo | Hachijo | 105
: : :
Tokyo | Toshima-ku | 14
Tokyo | Adachi-ku | 55
Tokyo | Aogashima | 7
Tokyo | Ome | 117
(63 rows)
CPU-rev: 30.539s
GPU-rev: 0.316s
About 100 times faster!! (30.539s / 0.316s ≈ 97x)
EXPLAIN of the simple test (1/2) - CPU version
postgres=# EXPLAIN (analyze, costs off)
           SELECT n03_001,n03_004,count(*)
             FROM geo_japan j, geopoint p
            WHERE st_contains(j.geom, st_makepoint(x,y))
              AND j.n03_001 like 'Tokyo'
            GROUP BY n03_001,n03_004;
QUERY PLAN
--------------------------------------------------------------------------------------------
 Finalize GroupAggregate (actual time=30709.855..30710.080 rows=63 loops=1)
   Group Key: j.n03_001, j.n03_004
   ->  Gather Merge (actual time=30709.838..30732.270 rows=244 loops=1)
         Workers Planned: 4
         Workers Launched: 3
         ->  Partial GroupAggregate (actual time=30687.466..30687.572 rows=61 loops=4)
               Group Key: j.n03_001, j.n03_004
               ->  Sort (actual time=30687.452..30687.475 rows=638 loops=4)
                     Sort Key: j.n03_001, j.n03_004
                     Sort Method: quicksort Memory: 73kB
                     ->  Nested Loop (actual time=71.496..30686.278 rows=638 loops=4)
                           ->  Parallel Seq Scan on geopoint p (actual time=0.012..207.553 rows=2500000 loops=4)
                           ->  Index Scan using geo_japan_geom_idx on geo_japan j (actual time=0.012..0.012 rows=0 loops=10000000)
                                 Index Cond: (geom ~ st_makepoint(p.x, p.y))
                                 Filter: (((n03_001)::text ~~ 'Tokyo'::text) AND st_contains(geom, st_makepoint(p.x, p.y)))
                                 Rows Removed by Filter: 0
 Planning Time: 0.156 ms
 Execution Time: 30732.422 ms
(21 rows)
EXPLAIN of the simple test (2/2) - GPU version
postgres=# EXPLAIN (analyze, costs off)
           SELECT n03_001,n03_004,count(*)
             FROM geo_japan j, geopoint p
            WHERE st_contains(j.geom, st_makepoint(x,y))
              AND j.n03_001 like 'Tokyo'
            GROUP BY n03_001,n03_004;
QUERY PLAN
--------------------------------------------------------------------------------------------
 GroupAggregate (actual time=329.118..329.139 rows=63 loops=1)
   Group Key: j.n03_001, j.n03_004
   ->  Sort (actual time=329.107..329.110 rows=63 loops=1)
         Sort Key: j.n03_001, j.n03_004
         Sort Method: quicksort Memory: 29kB
         ->  Custom Scan (GpuPreAgg) (actual time=328.902..328.911 rows=63 loops=1)
               Reduction: Local
               Combined GpuJoin: enabled
               ->  Custom Scan (GpuJoin) on fgeopoint p (never executed)
                     Outer Scan: fgeopoint p (never executed)
                     Depth 1: GpuGiSTJoin(plan nrows: 10000000...60840000, actual nrows: 10000000...2553)
                              HeapSize: 7841.91KB (estimated: 3113.70KB), IndexSize: 13.28MB
                              IndexFilter: (j.geom ~ st_makepoint(p.x, p.y)) on geo_japan_geom_idx
                              Rows Fetched by Index: 4952
                              JoinQuals: st_contains(j.geom, st_makepoint(p.x, p.y))
                     ->  Seq Scan on geo_japan j (actual time=0.164..17.723 rows=6173 loops=1)
                           Filter: ((n03_001)::text ~~ 'Tokyo'::text)
                           Rows Removed by Filter: 112726
 Planning Time: 0.344 ms
 Execution Time: 340.415 ms
(20 rows)
Portion executed on GPU: the Custom Scan (GpuPreAgg) and Custom Scan (GpuJoin) nodes.
Conclusion (1/2)
▌GPU-version PostGIS
• Enhancement for parallel execution of PostGIS functions
• Also supports the GiST-Index (R-tree) on the geometry type
• About 100 times faster results for “points in area” type workloads.
▌Expected use scenarios
• Area marketing analytics
• Real-time advertisement delivery
• Push event notifications, etc.
➔ GPU+PostGIS lets you run “compute-intensive” workloads on your workstation or cloud instance, just as you usually do.
▌Resources
• GitHub: https://github.com/heterodb/pg-strom
• Document: http://heterodb.github.io/pg-strom/ja/
• Contact: Tw: @kkaigai / ✉ kaigai@heterodb.com
Conclusion (2/2)
PG-Strom project welcomes your participation. Please contact us.
20210301_PGconf_Online_GPU_PostGIS_GiST_Index

More Related Content

What's hot

pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda enKohei KaiGai
 
20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdwKohei KaiGai
 
SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)Kohei KaiGai
 
20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_EN20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_ENKohei KaiGai
 
20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storageKohei KaiGai
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_PlaceKohei KaiGai
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~Kohei KaiGai
 
20171206 PGconf.ASIA LT gstore_fdw
20171206 PGconf.ASIA LT gstore_fdw20171206 PGconf.ASIA LT gstore_fdw
20171206 PGconf.ASIA LT gstore_fdwKohei KaiGai
 
20150318-SFPUG-Meetup-PGStrom
20150318-SFPUG-Meetup-PGStrom20150318-SFPUG-Meetup-PGStrom
20150318-SFPUG-Meetup-PGStromKohei KaiGai
 
PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrKohei KaiGai
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)Kohei KaiGai
 
20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGaiKohei KaiGai
 
Let's turn your PostgreSQL into columnar store with cstore_fdw
Let's turn your PostgreSQL into columnar store with cstore_fdwLet's turn your PostgreSQL into columnar store with cstore_fdw
Let's turn your PostgreSQL into columnar store with cstore_fdwJan Holčapek
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsKohei KaiGai
 
PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)Kohei KaiGai
 
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...Equnix Business Solutions
 
Parallel K means clustering using CUDA
Parallel K means clustering using CUDAParallel K means clustering using CUDA
Parallel K means clustering using CUDAprithan
 
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo Hirose
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo HirosePGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo Hirose
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo HiroseEqunix Business Solutions
 

What's hot (20)

pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda en
 
20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw
 
SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)
 
20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_EN20180920_DBTS_PGStrom_EN
20180920_DBTS_PGStrom_EN
 
20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place
 
PG-Strom
PG-StromPG-Strom
PG-Strom
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
 
20171206 PGconf.ASIA LT gstore_fdw
20171206 PGconf.ASIA LT gstore_fdw20171206 PGconf.ASIA LT gstore_fdw
20171206 PGconf.ASIA LT gstore_fdw
 
20150318-SFPUG-Meetup-PGStrom
20150318-SFPUG-Meetup-PGStrom20150318-SFPUG-Meetup-PGStrom
20150318-SFPUG-Meetup-PGStrom
 
PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated Asyncr
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)
 
20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai
 
Let's turn your PostgreSQL into columnar store with cstore_fdw
Let's turn your PostgreSQL into columnar store with cstore_fdwLet's turn your PostgreSQL into columnar store with cstore_fdw
Let's turn your PostgreSQL into columnar store with cstore_fdw
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
 
PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)
 
PostgreSQL with OpenCL
PostgreSQL with OpenCLPostgreSQL with OpenCL
PostgreSQL with OpenCL
 
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
PGConf.ASIA 2019 Bali - AppOS: PostgreSQL Extension for Scalable File I/O - K...
 
Parallel K means clustering using CUDA
Parallel K means clustering using CUDAParallel K means clustering using CUDA
Parallel K means clustering using CUDA
 
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo Hirose
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo HirosePGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo Hirose
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo Hirose
 

Similar to 20210301_PGconf_Online_GPU_PostGIS_GiST_Index

Introduction To PostGIS
Introduction To PostGISIntroduction To PostGIS
Introduction To PostGISmleslie
 
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...Equnix Business Solutions
 
Postgres Vision 2018: PostGIS and Spatial Extensions
Postgres Vision 2018: PostGIS and Spatial ExtensionsPostgres Vision 2018: PostGIS and Spatial Extensions
Postgres Vision 2018: PostGIS and Spatial ExtensionsEDB
 
Pivotal Greenplum 次世代マルチクラウド・データ分析プラットフォーム
Pivotal Greenplum 次世代マルチクラウド・データ分析プラットフォームPivotal Greenplum 次世代マルチクラウド・データ分析プラットフォーム
Pivotal Greenplum 次世代マルチクラウド・データ分析プラットフォームMasayuki Matsushita
 
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018VMware Tanzu
 
GeoMesa on Apache Spark SQL with Anthony Fox
GeoMesa on Apache Spark SQL with Anthony FoxGeoMesa on Apache Spark SQL with Anthony Fox
GeoMesa on Apache Spark SQL with Anthony FoxDatabricks
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Open source based software ‘gxt’ mangosystem
Open source based software ‘gxt’ mangosystemOpen source based software ‘gxt’ mangosystem
Open source based software ‘gxt’ mangosystemHaNJiN Lee
 
クラウドDWHとしても進化を続けるPivotal Greenplumご紹介
クラウドDWHとしても進化を続けるPivotal Greenplumご紹介クラウドDWHとしても進化を続けるPivotal Greenplumご紹介
クラウドDWHとしても進化を続けるPivotal Greenplumご紹介Masayuki Matsushita
 
FOSS4G 2010 PostGIS Raster: an Open Source alternative to Oracle GeoRaster
FOSS4G 2010 PostGIS Raster: an Open Source alternative to Oracle GeoRasterFOSS4G 2010 PostGIS Raster: an Open Source alternative to Oracle GeoRaster
FOSS4G 2010 PostGIS Raster: an Open Source alternative to Oracle GeoRasterJorge Arevalo
 
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...SF Big Analytics 20191112: How to performance-tune Spark applications in larg...
SF Big Analytics 20191112: How to performance-tune Spark applications in larg...Chester Chen
 
Location based services for Nokia X and Nokia Asha using Geo2tag
Location based services for Nokia X and Nokia Asha using Geo2tagLocation based services for Nokia X and Nokia Asha using Geo2tag
Location based services for Nokia X and Nokia Asha using Geo2tagMicrosoft Mobile Developer
 
PostgreSQL 13 New Features
PostgreSQL 13 New FeaturesPostgreSQL 13 New Features
PostgreSQL 13 New FeaturesJosé Lin
 
Datomic R-trees
Datomic R-treesDatomic R-trees
Datomic R-treesjsofra
 
Datomic rtree-pres
Datomic rtree-presDatomic rtree-pres
Datomic rtree-presjsofra
 
Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)
Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)
Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)Sudhir Mallem
 
State of the Art Web Mapping with Open Source
State of the Art Web Mapping with Open SourceState of the Art Web Mapping with Open Source
State of the Art Web Mapping with Open SourceOSCON Byrum
 
2017 PLSC Track: Using a Standard Version of ArcMap with External VRS Recieve...
2017 PLSC Track: Using a Standard Version of ArcMap with External VRS Recieve...2017 PLSC Track: Using a Standard Version of ArcMap with External VRS Recieve...
2017 PLSC Track: Using a Standard Version of ArcMap with External VRS Recieve...GIS in the Rockies
 

20210301_PGconf_Online_GPU_PostGIS_GiST_Index

  • 1. GPU version of PostGIS and GiST-Index ~A new horizon of geospatial data analytics~ HeteroDB,Inc Chief Architect & CEO KaiGai Kohei <kaigai@heterodb.com>
  • 2. About us
      About myself
      - KaiGai Kohei (海外浩平)
      - Chief Architect & CEO of HeteroDB
      - Contributor to PostgreSQL (2006-)
      - Primary developer of PG-Strom (2012-)
      - Interested in: big data, GPU, NVME/PMEM, ...
      About HeteroDB
      - Established: 4th-Jul-2017
      - Location: Shinagawa, Tokyo, Japan
      - Businesses:
        ✓ Development of high-performance data-processing software on top of heterogeneous architecture.
        ✓ Technology consulting service in the GPU & DB area.
      PGconf.online 2021 - GPU version of PostGIS and GiST-Index
  • 3. What is PG-Strom?
      PG-Strom: an extension of PostgreSQL that pulls out the maximum capability of GPU and NVME for processing terabyte-scale data.
      - Adds custom-scan/join/aggregate paths with GPU execution
      - SQL to GPU binary code generation & just-in-time compilation
      - Designed for IoT/M2M grade log-data processing
      - Closely connects NVME-SSD and GPU using P2P data transfer [I/O acceleration]
      - Geospatial analytics with the GPU version of PostGIS and GiST-index support
      PG-Strom's features
      ➢ JIT of GPU code with SQL & transparent SQL acceleration
      ➢ SSD-to-GPU Direct SQL
      ➢ Columnar store (Arrow_Fdw)
      ➢ Asymmetric partition-wise JOIN/GROUP BY
      ➢ GPU Memory Store
      ➢ PostGIS support
      ➢ GiST-Index on GPU
      ➢ BRIN-Index support
      ➢ NVME-over-Fabric support
      ➢ Data-frame exchange for Python scripts
      [Diagram labels: App; GPU off-loading; for IoT/Big-Data; for ML/Analytics; NEW]
  • 4. GPU overview
      Over 10 years of history in HPC, then massive popularization in machine learning.
      Thousands of cores and TB/s-grade memory bandwidth on a chip, for highly compute-intensive workloads.
      [Figure labels: NVIDIA Tesla V100; Super Computer (TITEC; TSUBAME3.0); Computer Graphics; Machine-Learning; Simulation]
  • 5. Target: Fast data search of mobile devices
      ▌Mobile devices
      - Location (longitude, latitude) is updated very frequently.
      - The log data often contains (device_id, timestamp, location (point), other attributes).
      ▌Area definitions
      - A relatively small number of items, and almost static data.
      - Polygons often have very complicated shapes, which makes "collision detection" heavy.
      ▌Purpose
      - Area marketing, logistics analytics, advertisement delivery, emergency alerts, etc.
      Goal: extract the mobile devices within the target areas.
      [Figure labels: Mobile device (GPS); Latest Location (Point); Area definition (Polygon)]
  • 6. GPU-version PostGIS
  • 7. About PostGIS (1/2)
      - Extension that adds the geometry type, functions and operators.
      - First released in 2005, followed by continuous development over 15 years.
      - More than 400 functions / operators
      - R-tree on the GiST-index framework
      - Parallel query execution
      [Image credits: © GAIA RESOURCES; © OSGeo; © KTGIS.net]
  • 8. About PostGIS (2/2)
      [Figure panels: St_Distance(a,b); St_Crosses(a,b); St_Contains(a,b)]
  • 9. PostGIS's optimization (1/2)
      ▌Bounding box
      - The smallest rectangle that contains a polygon (which is often very complicated).
      - Stored in (x1,y1) - (x2,y2) form as FP32 values (= 16 bytes).
      - PostGIS assigns a bounding box when it stores geometry values.
      - Checking it before the geolocational operations allows the heavy operations to be skipped when the geometries are obviously disjoint.
      [Figure: Tokyo, Germany and France are obviously disjoint]
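The bounding-box shortcut described on this slide can be illustrated with a minimal Python sketch. It is not PostGIS code: the function names and the rough rectangular coordinates for Tokyo, France and Germany are hypothetical, chosen only to show that two cheap comparisons per axis can rule out a pair before any expensive geometry test runs.

```python
from typing import List, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def bounding_box(polygon: List[Point]) -> BBox:
    """Smallest axis-aligned rectangle that contains every vertex."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def bbox_disjoint(a: BBox, b: BBox) -> bool:
    """True when the rectangles cannot overlap, so the polygons cannot either."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]


# Hypothetical, very rough (lon, lat) outlines; not real borders.
tokyo = [(139.5, 35.5), (139.9, 35.5), (139.9, 35.8), (139.5, 35.8)]
france = [(-4.8, 42.3), (8.2, 42.3), (8.2, 51.1), (-4.8, 51.1)]
germany = [(5.9, 47.3), (15.0, 47.3), (15.0, 55.1), (5.9, 55.1)]

tokyo_vs_france = bbox_disjoint(bounding_box(tokyo), bounding_box(france))
france_vs_germany = bbox_disjoint(bounding_box(france), bounding_box(germany))
```

With these made-up rectangles, Tokyo and France are rejected by the box test alone, while France and Germany have overlapping boxes and would still need the exact (heavy) geometry operation.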
  • 10. PostGIS's optimization (2/2)
      ▌R-tree on GiST-Index
      - GiST (Generalized Search Tree): an index framework in PostgreSQL
      - PostGIS implements an R-tree for the geometry type
      - Checks for contains ('@' operator) and overlaps ('&&' operator)
      - Works efficiently to check relationships between a very large number of points and many polygons.

          # Data published by Geospatial Information Authority of Japan
          $ shp2pgsql N03-20_200101.shp | psql gistest
          gistest=# \d+
                           List of relations
           Schema |   Name    | Type  | Owner  |  Size  |
          --------+-----------+-------+--------+--------+
           public | geo_japan | table | kaigai | 243 MB |

          gistest=# \di+
                                    List of relations
           Schema |        Name        | Type  | Owner  |   Table   |  Size   |
          --------+--------------------+-------+--------+-----------+---------+
           public | geo_japan_pkey     | index | kaigai | geo_japan | 2616 kB |
           public | geo_japan_geom_idx | index | kaigai | geo_japan | 14 MB   |
  • 11. Why is the GPU capable of SQL acceleration?
      ▌GPU's characteristics
      - SIMT (Single-Instruction Multiple-Threads) architecture
      - Thousands of processor cores and 1.0TB/s-grade memory bandwidth
      ➔ Designed for "same operations on a large number of values"
      ▌SQL characteristics
      - "Same operations on a large number of values", like evaluation of a WHERE-clause, JOIN or GROUP BY.
      [Figure: SQL runs WHERE, JOIN, GROUP BY on a very large number of rows; thousands of processor units on the GPU evaluate thousands of rows in parallel (CPU parallel vs. GPU parallel)]
  • 12. Performing PostGIS functions on GPU (1/2)

          postgres=# \d _gistest
                                     Table "public._gistest"
           Column |   Type   | Collation | Nullable |               Default
          --------+----------+-----------+----------+--------------------------------------
           id     | integer  |           | not null | nextval('_gistest_id_seq'::regclass)
           a      | geometry |           |          |
           b      | geometry |           |          |

          postgres=# explain verbose select * from _gistest where st_contains(a,b);
                                               QUERY PLAN
          ------------------------------------------------------------------------------------
           Custom Scan (GpuScan) on public._gistest  (cost=4251.50..4251.50 rows=1 width=196)
             Output: id, a, b
             GPU Filter: st_contains(_gistest.a, _gistest.b)
             GPU Preference: None
             Kernel Source: /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_21028.4.gpu
             Kernel Binary: /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_21028.5.ptx
          (6 rows)
  • 13. Performing PostGIS functions on GPU (2/2)
      GPU code automatically generated for evaluation of the WHERE-clause:

          $ less /var/lib/pgdata/pgsql_tmp/pgsql_tmp_strom_21028.4.gpu
            :
          #include "cuda_postgis.h"
          #include "cuda_gpuscan.h"

          DEVICE_FUNCTION(cl_bool)
          gpuscan_quals_eval(kern_context *kcxt,
                             kern_data_store *kds,
                             ItemPointerData *t_self,
                             HeapTupleHeaderData *htup)
          {
            void *addr __attribute__((unused));
            pg_geometry_t KVAR_2;
            pg_geometry_t KVAR_3;

            assert(htup != NULL);
            EXTRACT_HEAP_TUPLE_BEGIN(addr, kds, htup);
            EXTRACT_HEAP_TUPLE_NEXT(addr);
            pg_datum_ref(kcxt,KVAR_2,addr); // pg_geometry_t
            EXTRACT_HEAP_TUPLE_NEXT(addr);
            pg_datum_ref(kcxt,KVAR_3,addr); // pg_geometry_t
            EXTRACT_HEAP_TUPLE_END();

            return EVAL(pgfn_st_contains(kcxt, KVAR_2, KVAR_3));
          }
            :

      - The EXTRACT_HEAP_TUPLE_* / pg_datum_ref() part loads the geometry values from column-A and column-B.
      - The return statement calls the GPU revision of st_contains() for each thread.
  • 14. Basic performance (1/2)
      Count the number of points in the specified area, out of 5 million points.

          --- GpuScan (GPU-rev PostGIS)
          =# SELECT count(*) FROM ft
              WHERE st_contains('polygon ((10 10,90 10,90 12,12 12,12 88,90 88,90 90,10 90,10 10))',
                                st_makepoint(x,y));
           count
          --------
           236610
          (1 row)

          Time: 44.680 ms

          --- Vanilla PostGIS
          =# SET pg_strom.enabled = off;
          SET
          =# SELECT count(*) FROM tt
              WHERE st_contains('polygon ((10 10,90 10,90 12,12 12,12 88,90 88,90 90,10 90,10 10))',
                                st_makepoint(x,y));
           count
          --------
           236610
          (1 row)

          Time: 622.314 ms
  • 15. Basic performance (2/2)
      [Figure: the test polygon drawn in a square with corners (0,0) and (100,100); labeled vertices (10,10), (90,10), (90,12), (12,12), (12,88), (90,88), (90,90), (10,90)]
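As a sanity check on the benchmark numbers, the area of the test polygon can be computed with the shoelace formula. The sketch below assumes (this is not stated on the slides) that the 5 million points are uniformly distributed over the (0,0) - (100,100) square shown in the figure; under that assumption the expected hit count can be compared against the reported 236610, and the speed-up follows from the two reported timings.

```python
def shoelace_area(vertices):
    """Area of a simple polygon given its vertices in order (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Vertices of the polygon used in the benchmark query.
polygon = [(10, 10), (90, 10), (90, 12), (12, 12),
           (12, 88), (90, 88), (90, 90), (10, 90)]

area = shoelace_area(polygon)

# Assumption: points are uniform over the 100 x 100 square.
total_points = 5_000_000
expected_hits = total_points * area / (100 * 100)

reported_hits = 236610            # count(*) from the slide
relative_gap = abs(expected_hits - reported_hits) / reported_hits

speedup = 622.314 / 44.680        # vanilla PostGIS time / GpuScan time
```

The polygon is a thin C-shaped band with an area of 472 square units, so roughly 4.7% of uniformly scattered points should fall inside it, which is within a fraction of a percent of the reported count. The two timings correspond to a speed-up of about 14x for this query.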
  • 16. GPU version of PostGIS: current status of the supported functions
      - geometry st_makepoint(float8, float8[, float8[, float8]])
      - float8 st_distance(geometry,geometry)
      - bool st_dwithin(geometry,geometry,float8)
      - bool st_contains(geometry,geometry)
      - bool st_crosses(geometry,geometry)
      - text st_relate(geometry,geometry)
      - ...and more in future versions
  • 17. Extract points within polygons
      The number of Polygons x Points combinations is too large, even for a GPU.
      - Mobile devices (GPS), latest location (point): 100K-10M devices
      - Area definitions (polygon): 100 - 100K polygons
      - 10M-1T combinations?
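The combination counts on this slide are simple products, and a rough feel for "too large even for GPU" can be had by extrapolating from the earlier GpuScan benchmark (5 million st_contains checks in 44.680 ms). The extrapolation below is only a back-of-the-envelope assumption: that benchmark used a single simple polygon, so real area definitions would likely be slower per check.

```python
# Ranges quoted on the slide.
devices_low, devices_high = 100_000, 10_000_000
polygons_low, polygons_high = 100, 100_000

combos_low = devices_low * polygons_low      # smallest case
combos_high = devices_high * polygons_high   # largest case

# Assumption: brute-force throughput equals the earlier benchmark,
# i.e. 5,000,000 point-in-polygon checks per 44.680 ms.
checks_per_second = 5_000_000 / 0.044680
brute_force_seconds = combos_high / checks_per_second
brute_force_hours = brute_force_seconds / 3600
```

Even at that optimistic rate, the 1T-combination case would take on the order of hours of pure GPU time, which is why an index that prunes most pairs is needed.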
  • 18. GiST-Index on GPU
  • 19. How GiST-Index (R-tree) works
      ✓ R1 is a rectangle [(Xmin,Ymin) - (Xmax,Ymax)] that contains all of (R3, R4, R5) and their pointers
      ✓ R4 is a rectangle that contains all of (R11,R12) and their pointers
      ✓ R12 is a rectangle that contains the target geometry and its ItemPointer
      ✓ The search sequentially evaluates the entries of each tree node, and dives into the next depth only if an entry matches.
      ✓ Not as fast as a B-tree, because of the sequential evaluation at each depth of the R-tree.
      [Figure: nested R-tree rectangles from (xmin,ymin) to (xmax,ymax) and a search key (Lon,Lat); 〇 marks entries that match, × marks entries that do not]
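The descent described above can be sketched in a few lines of Python. This is an illustration only, not the GiST implementation: the `Node` class, the `root` and `R2` nodes, the rectangle coordinates, and the `tid-*` strings standing in for ItemPointers are all hypothetical. Only the node names R1, R3, R4, R5, R11 and R12 come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Node:
    name: str
    bbox: BBox
    children: List["Node"] = field(default_factory=list)
    item: Optional[str] = None  # set on leaf entries; stands in for an ItemPointer


def contains(bbox: BBox, x: float, y: float) -> bool:
    return bbox[0] <= x <= bbox[2] and bbox[1] <= y <= bbox[3]


def search(node: Node, x: float, y: float, visited: List[str]) -> List[str]:
    """Evaluate this entry; dive into the next depth only if it matches."""
    visited.append(node.name)
    if not contains(node.bbox, x, y):
        return []
    if node.item is not None:
        return [node.item]
    hits: List[str] = []
    for child in node.children:  # sequential evaluation within a tree node
        hits.extend(search(child, x, y, visited))
    return hits


# Hypothetical layout that mirrors the names on the slide.
r11 = Node("R11", (20, 20, 30, 30), item="tid-11")
r12 = Node("R12", (35, 35, 50, 50), item="tid-12")
r1 = Node("R1", (0, 0, 50, 50), children=[
    Node("R3", (0, 0, 20, 20)),
    Node("R4", (20, 20, 50, 50), children=[r11, r12]),
    Node("R5", (0, 30, 15, 50)),
])
root = Node("root", (0, 0, 100, 100), children=[r1, Node("R2", (60, 60, 100, 100))])

visited_deep: List[str] = []
hits_deep = search(root, 40, 40, visited_deep)

visited_shallow: List[str] = []
hits_shallow = search(root, 70, 70, visited_shallow)
```

The two lookups visit very different numbers of entries (eight versus three in this toy tree), which is the per-key variation in work that matters once many GPU threads search the index side by side.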
  • 23. GPU version of GiST-Index
      ▌Overview
      - PG-Strom may utilize a GiST-Index for joining a table of area definitions (polygons; small) with a table of locational data (points; large).
      - Both the area definition table and its index are loaded onto the GPU by GpuJoin.
      - GpuJoin looks at the GiST-index first for rough pruning, using bounding boxes.
      - Then, it evaluates the "collision detection" against the polygon value in the table.
      - The above operations run on thousands of GPU cores in parallel, so we expected better search performance, but...
      [Figure: collision detection of Polygons x Points as a part of GpuJoin; GiST-Index (R-tree); Area Definitions (Polygon); a table with Location Data (Points); thousands of threads search the R-tree index in parallel]
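The "rough pruning first, exact test second" idea can be sketched sequentially in Python. This is not PG-Strom's GpuJoin: the bounding boxes are scanned linearly instead of through an R-tree, the exact test is a textbook ray-casting point-in-polygon check rather than st_contains(), and the area name "C" and the sample points are made up. The polygon is the C-shaped one from the earlier benchmark.

```python
def bounding_box(poly):
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return (min(xs), min(ys), max(xs), max(ys))


def point_in_polygon(x, y, poly):
    """Ray casting: count edge crossings of a ray going in the +x direction."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def join_points_with_areas(points, areas):
    """Filter by bounding box first, then run the exact test on survivors."""
    boxes = {name: bounding_box(poly) for name, poly in areas.items()}
    stats = {"pruned": 0, "exact": 0}
    results = []
    for (x, y) in points:
        for name, poly in areas.items():
            bx1, by1, bx2, by2 = boxes[name]
            if not (bx1 <= x <= bx2 and by1 <= y <= by2):
                stats["pruned"] += 1   # cheap rejection
                continue
            stats["exact"] += 1        # heavy "collision detection"
            if point_in_polygon(x, y, poly):
                results.append(((x, y), name))
    return results, stats


areas = {"C": [(10, 10), (90, 10), (90, 12), (12, 12),
               (12, 88), (90, 88), (90, 90), (10, 90)]}
points = [(5, 5), (50, 50), (11, 50), (50, 11)]
results, stats = join_points_with_areas(points, areas)
```

The point (50, 50) shows why the second stage is needed: it lies inside the bounding box but in the hollow of the C shape, so only the exact test rejects it.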
  • 24. Simple test: random points and St_Contains (2020-Sep)
      10M randomly generated geolocational points; area definition data by the Geospatial Information Authority of Japan.
      [Figure: map with corner labels (123.0, 20.0) and (154.2, 46.2)]

          SELECT n03_001,n03_004,count(*)
            FROM geo_japan j, geopoint p
           WHERE st_contains(j.geom, st_makepoint(x,y))
             AND j.n03_001 like 'Tokyo'
           GROUP BY n03_001,n03_004;

           n03_001 |   n03_004   | count
          ---------+-------------+-------
           Tokyo   | Akiruno     |   105
           Tokyo   | Miyake      |    76
           Tokyo   | Mitaka      |    17
           Tokyo   | Setagaya-ku |    67
           Tokyo   | Chuo-ku     |    12
           Tokyo   | Nakano-ku   |    18
           Tokyo   | Hachijo     |   105
              :          :           :
           Tokyo   | Toshima-ku  |    14
           Tokyo   | Adachi-ku   |    55
           Tokyo   | Aogashima   |     7
           Tokyo   | Ome         |   117
          (63 rows)

      CPU-rev: 30.539s
      GPU-rev: 33.841s (Slow!)
  • 25. Background) GPU's thread scheduling
      - The GPU groups its processor cores into Streaming Multiprocessors (SM)
        ✓ Cores, registers and L1 cache (shared memory) are managed per streaming multiprocessor
      - An SM can run up to 1024 threads simultaneously on its shared 64 cores
        ✓ Like very large-scale hyper-threading on a CPU
      - Threads are scheduled per warp (32 threads); this is called the SIMT (Single-Instruction Multiple-Thread) architecture
        ✓ Uniform workloads, like matrix operations, fully utilize the processor cores
        ✓ If a particular thread tends to consume many cycles, the other threads in the warp must wait.
      [Figure: GPU block diagram (Tesla V100; 80 SMs); Streaming Multiprocessors (64 CUDA cores/SM); Thread Group (1~1024 threads); Warp (32 threads)]
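The last point, that one slow thread holds up its whole warp, can be made concrete with a deliberately simplified model in Python. The assumption here is that a warp finishes only when its slowest thread finishes, and that lanes which finish early do no other useful work; real GPUs hide some of this by switching between warps, which the model ignores. The cycle counts are invented.

```python
WARP_SIZE = 32


def warp_utilization(cycles_per_thread):
    """Fraction of lane-cycles doing useful work while the warp is resident."""
    assert len(cycles_per_thread) == WARP_SIZE
    longest = max(cycles_per_thread)
    return sum(cycles_per_thread) / (WARP_SIZE * longest)


# Uniform workload: every thread needs the same number of cycles.
uniform = warp_utilization([100] * WARP_SIZE)

# Skewed workload: one thread needs 32x more cycles than the rest.
skewed = warp_utilization([100] * (WARP_SIZE - 1) + [3200])
```

In this model the uniform warp is fully utilized, while a single long-running thread pulls utilization down to about 6%.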
  • 26. Background) Internal design of GpuJoin

          SELECT * FROM A, B WHERE A.id = B.id;

      - 512 threads fetch 512 rows from Table-A at once.
      - Each thread references Table-B using GpuHashJoin / GpuNestLoop.
      - N = __syncthreads_count(...)
      - thread-0 allocates the result buffer for N items.
      - Write out the JOIN results (if any).
      - Fetch the next frame, and repeat the above steps.
      GpuHashJoin
      - Hash calculation
      - Search of the hash table
      - Evaluation of the JOIN condition
      GpuNestLoop
      - Evaluation of the JOIN condition
      ➔ Little difference in processing cycles between the threads
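The count, allocate, write sequence on this slide can be emulated sequentially in Python. The sketch assumes one detail the slide does not spell out: that each matching thread writes to the slot given by the number of matching threads before it (an exclusive prefix sum). The function name and the sample rows are hypothetical.

```python
def emit_join_results(rows, matched, result_buffer):
    """Emulate one frame: count matches, reserve slots once, then write."""
    # Stands in for N = __syncthreads_count(found) across the thread group.
    n = sum(1 for m in matched if m)

    # Stands in for "thread-0 allocates result buffer for N items".
    base = len(result_buffer)
    result_buffer.extend([None] * n)

    # Each matching "thread" writes into its own slot.
    # Assumption: slot = number of matching threads with a lower index.
    offset = 0
    for row, m in zip(rows, matched):
        if m:
            result_buffer[base + offset] = row
            offset += 1
    return n


buffer = ["prev"]  # results already written by an earlier frame
written = emit_join_results(["a", "b", "c", "d"],
                            [True, False, True, True],
                            buffer)
```

Reserving the space once per frame, rather than once per matching thread, is what makes the group-wide count a natural synchronization point.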
  • 27. Issues of Index Search on GPU
    Large variance in per-thread processing time leads to synchronization waits!
    Each thread takes a row of Table-A and references Index-A. The cost differs widely between threads:
      × No matched item, decided by just looking at the root node of the R-tree
      × Walks down to a leaf node of the R-tree, but finds no matched item
      〇 Walks down to a leaf node of the R-tree, and evaluates the JOIN condition
    N = __syncthreads_count(...)
    The other GPU cores must stay idle until the longest operation completes.
  • 28. Naive Implementation of GPU GiST-Index Search (2020-Sep)
    Low utilization of GPU cores, due to inter-core synchronization
    Flow of each thread:
      1. Every thread loads a row from the input buffer and extracts the key for the GiST-Index.
      2. Search the GiST-Index by the key. Hit on the GiST-Index? If no: found=false, go to step 4.
      3. Evaluate the JOIN condition. Is the JOIN condition true? If yes: found=true. If no: found=false.
      4. nitems = __syncthreads_count(found);
      5. Write out the JOIN results (if found), then repeat from step 1.
    A thread that finds a matched entry can block the other 511 threads in the same thread group until it completes the evaluation of the JOIN condition. GPU cores are used very inefficiently.
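How costly the per-round barrier of slide 28 can be is easy to estimate with a toy cost model. All cycle counts below (10 cycles for an index miss, 50 for an index hit, 1000 for evaluating the JOIN condition) and the hit rate are hypothetical values chosen for illustration; only the 512-thread group size and the barrier after every round come from the slide.

```python
BLOCK_SIZE = 512  # threads per thread group, as on the slide

def make_rows(n_rows, hit_every, search_miss=10, search_hit=50):
    """Hypothetical workload: (index search cycles, index hit?) per outer row."""
    return [(search_hit, True) if i % hit_every == 0 else (search_miss, False)
            for i in range(n_rows)]

def naive_cost(rows, eval_cycles=1000):
    """Total cycles and core utilization when every round ends at a barrier."""
    total = useful = 0
    for start in range(0, len(rows), BLOCK_SIZE):
        frame = rows[start:start + BLOCK_SIZE]
        per_thread = [s + (eval_cycles if hit else 0) for s, hit in frame]
        total += max(per_thread)    # the barrier waits for the slowest thread
        useful += sum(per_thread)
    util = useful / (total * BLOCK_SIZE) if total else 0.0
    return total, util

rows = make_rows(BLOCK_SIZE * 40, hit_every=64)   # 8 index hits per round
cycles, util = naive_cost(rows)
print(cycles, round(util, 3))
```

With these assumed numbers a handful of index hits per round is enough to hold all 512 threads for the full evaluation time, so only a few percent of the available cycles do useful work.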
  • 29. A New, More Efficient Implementation (2020-Nov)
    Minimize the synchronization points to raise the utilization ratio
    Flow of each thread:
      1. Every thread loads a row from the input buffer and extracts the key for the GiST-Index.
      2. Search the GiST-Index by the key. Hit on the GiST-Index? If no, go back to step 1.
      3. Allocate space in the temporary buffer and save the pointers that hit the GiST-Index.
      4. When consumption of the temporary buffer exceeds 512 items: __syncthreads()
      5. Evaluate the JOIN condition. Is the JOIN condition true? If yes: found=true. If no: found=false.
      6. nitems = __syncthreads_count(found)
      7. Write out the JOIN results, then repeat.
    As long as the temporary buffer has space, threads continue to fetch rows and search the GiST-Index.
    ➔ Then all the threads evaluate the JOIN conditions at once.
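The buffered scheme of slide 29 can be estimated with a toy cost model as well. The sketch below is a rough approximation, not the PG-Strom implementation: the cycle counts (10 for an index miss, 50 for an index hit, 1000 for one JOIN-condition evaluation), the hit rate and the static row-to-thread mapping are all hypothetical. Only the 512-item threshold and the idea of a single barrier before a joint evaluation come from the slide.

```python
BLOCK_SIZE = 512  # threads per thread group and buffer threshold, as on the slide

def make_rows(n_rows, hit_every, search_miss=10, search_hit=50):
    """Hypothetical workload: (index search cycles, index hit?) per outer row."""
    return [(search_hit, True) if i % hit_every == 0 else (search_miss, False)
            for i in range(n_rows)]

def buffered_cost(rows, eval_cycles=1000):
    """Total cycles and utilization when index hits are buffered before evaluation."""
    total = useful = 0
    busy = [0] * BLOCK_SIZE   # search cycles per thread since the last barrier
    pending = 0               # index hits waiting in the temporary buffer

    def drain(n_hits):
        # One barrier, then n_hits threads evaluate the JOIN condition together.
        nonlocal total, useful, busy
        total += max(busy) + eval_cycles
        useful += sum(busy) + n_hits * eval_cycles
        busy = [0] * BLOCK_SIZE

    for i, (search, hit) in enumerate(rows):
        busy[i % BLOCK_SIZE] += search    # static row-to-thread mapping (assumption)
        pending += hit
        if pending == BLOCK_SIZE:         # buffer is full: synchronize and evaluate
            drain(pending)
            pending = 0
    if pending or any(busy):              # flush what is left at end of input
        drain(pending)
    util = useful / (total * BLOCK_SIZE) if total else 0.0
    return total, util

rows = make_rows(BLOCK_SIZE * 40, hit_every=64)
cycles, util = buffered_cost(rows)
print(cycles, round(util, 3))
```

In this model the expensive evaluation is paid once per full buffer rather than once per round of 512 rows, which is why the cycle count drops sharply when index hits are rare.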
  • 30. Simple Test: Random Points and ST_Contains (2020-Nov; the latest)
    Test data: 10M randomly generated geolocation points; rectangle corners shown in the figure: (123.0, 20.0) and (154.2, 46.2). Area definition data by the Geospatial Information Authority of Japan.
    SELECT n03_001, n03_004, count(*)
      FROM geo_japan j, geopoint p
     WHERE st_contains(j.geom, st_makepoint(x,y))
       AND j.n03_001 like 'Tokyo'
     GROUP BY n03_001, n03_004;
     n03_001 |   n03_004   | count
    ---------+-------------+-------
     Tokyo   | Akiruno     |   105
     Tokyo   | Miyake      |    76
     Tokyo   | Mitaka      |    17
     Tokyo   | Setagaya-ku |    67
     Tokyo   | Chuo-ku     |    12
     Tokyo   | Nakano-ku   |    18
     Tokyo   | Hachijo     |   105
        :    |      :      |     :
     Tokyo   | Toshima-ku  |    14
     Tokyo   | Adachi-ku   |    55
     Tokyo   | Aogashima   |     7
     Tokyo   | Ome         |   117
    (63 rows)
    CPU version: 30.539s
    GPU version: 0.316s
    About 100 times faster!! (30.539s / 0.316s ≈ 97x)
  • 31. EXPLAIN of the simple test (1/2) - CPU version
    postgres=# EXPLAIN (analyze, costs off)
               SELECT n03_001,n03_004,count(*)
                 FROM geo_japan j, geopoint p
                WHERE st_contains(j.geom, st_makepoint(x,y))
                  AND j.n03_001 like 'Tokyo'
                GROUP BY n03_001,n03_004;
                                          QUERY PLAN
    --------------------------------------------------------------------------------------------
     Finalize GroupAggregate (actual time=30709.855..30710.080 rows=63 loops=1)
       Group Key: j.n03_001, j.n03_004
       -> Gather Merge (actual time=30709.838..30732.270 rows=244 loops=1)
            Workers Planned: 4
            Workers Launched: 3
            -> Partial GroupAggregate (actual time=30687.466..30687.572 rows=61 loops=4)
                 Group Key: j.n03_001, j.n03_004
                 -> Sort (actual time=30687.452..30687.475 rows=638 loops=4)
                      Sort Key: j.n03_001, j.n03_004
                      Sort Method: quicksort Memory: 73kB
                      -> Nested Loop (actual time=71.496..30686.278 rows=638 loops=4)
                           -> Parallel Seq Scan on geopoint p (actual time=0.012..207.553 rows=2500000 loops=4)
                           -> Index Scan using geo_japan_geom_idx on geo_japan j (actual time=0.012..0.012 rows=0 loops=10000000)
                                Index Cond: (geom ~ st_makepoint(p.x, p.y))
                                Filter: (((n03_001)::text ~~ 'Tokyo'::text) AND st_contains(geom, st_makepoint(p.x, p.y)))
                                Rows Removed by Filter: 0
     Planning Time: 0.156 ms
     Execution Time: 30732.422 ms
    (21 rows)
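A back-of-the-envelope check shows where the roughly 30 seconds of the CPU plan on slide 31 go. The figures are copied from the plan; the calculation assumes that the 10,000,000 Index Scan loops are spread evenly over the four processes (the leader plus three launched workers), and the per-loop time is printed with only three decimals, so the result is an approximation.

```python
# Numbers copied from the EXPLAIN ANALYZE output on slide 31.
index_scan_loops = 10_000_000   # "loops" of the Index Scan, summed over all processes
ms_per_loop = 0.012             # rounded "actual time" of one Index Scan
processes = 4                   # leader + 3 launched workers (loops=4 on the parallel nodes)
nested_loop_ms = 30686.278      # upper "actual time" of the Nested Loop node

# Assumes the loops are spread evenly over the processes.
estimate_ms = index_scan_loops * ms_per_loop / processes
share = estimate_ms / nested_loop_ms
print(f"estimated index-scan time per process: {estimate_ms:.0f} ms")
print(f"share of the Nested Loop time: {share:.0%}")
```

Under these assumptions almost the entire run time is spent in the ten million individual GiST-Index probes, which is the part the GPU version parallelizes.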
  • 32. EXPLAIN of the simple test (2/2) - GPU version
    postgres=# EXPLAIN (analyze, costs off)
               SELECT n03_001,n03_004,count(*)
                 FROM geo_japan j, geopoint p
                WHERE st_contains(j.geom, st_makepoint(x,y))
                  AND j.n03_001 like 'Tokyo'
                GROUP BY n03_001,n03_004;
                                          QUERY PLAN
    --------------------------------------------------------------------------------------------
     GroupAggregate (actual time=329.118..329.139 rows=63 loops=1)
       Group Key: j.n03_001, j.n03_004
       -> Sort (actual time=329.107..329.110 rows=63 loops=1)
            Sort Key: j.n03_001, j.n03_004
            Sort Method: quicksort Memory: 29kB
            -> Custom Scan (GpuPreAgg) (actual time=328.902..328.911 rows=63 loops=1)
                 Reduction: Local
                 Combined GpuJoin: enabled
                 -> Custom Scan (GpuJoin) on fgeopoint p (never executed)
                      Outer Scan: fgeopoint p (never executed)
                      Depth 1: GpuGiSTJoin(plan nrows: 10000000...60840000, actual nrows: 10000000...2553)
                               HeapSize: 7841.91KB (estimated: 3113.70KB), IndexSize: 13.28MB
                               IndexFilter: (j.geom ~ st_makepoint(p.x, p.y)) on geo_japan_geom_idx
                               Rows Fetched by Index: 4952
                               JoinQuals: st_contains(j.geom, st_makepoint(p.x, p.y))
                      -> Seq Scan on geo_japan j (actual time=0.164..17.723 rows=6173 loops=1)
                           Filter: ((n03_001)::text ~~ 'Tokyo'::text)
                           Rows Removed by Filter: 112726
     Planning Time: 0.344 ms
     Execution Time: 340.415 ms
    (20 rows)
    (Slide callout on the highlighted part of the plan: "Portion executed on GPU")
  • 33. Conclusion (1/2)
    ▌GPU-version PostGIS
      ➢ Enhancement for parallel execution of PostGIS functions
      ➢ Also supports the GiST-Index (R-tree) on the geometry type
      ➢ About 100 times faster on "points in area" type workloads
    ▌Expected use scenarios
      ➢ Area marketing analytics
      ➢ Real-time advertisement delivery
      ➢ Push event notifications, etc.
    ➔ GPU+PostGIS lets you run "computing intensive" workloads on your workstation or cloud instance, just as you usually do.
  • 34. Conclusion (2/2)
    ▌Resources
      ➢ GitHub: https://github.com/heterodb/pg-strom
      ➢ Document: http://heterodb.github.io/pg-strom/ja/
      ➢ Contact: Twitter @kkaigai / ✉ kaigai@heterodb.com
    The PG-Strom project welcomes your participation. Please contact us.