This document summarizes Jonathan Katz's experience building a foreign data wrapper (FDW) between two PostgreSQL databases to enable an API for his company VenueBook. He created separate "app" and "api" databases, with the api database using FDWs to access tables in the app database. This allowed inserting and querying data across databases. However, he encountered permission errors and had to grant various privileges on the remote database to make it work properly, demonstrating the importance of permissions management with FDWs.
3. A Bit About Me
• @jkatz05
• Chief Technology Officer @ VenueBook
• Using Postgres since ~2004
• Been using it decently since ~2010
• One day hope to use it well ;-)
• Active Postgres community member
• Co-Chair, PGConf US
• Co-organizer, NYC PostgreSQL User Group
• Director, United States PostgreSQL Association
• Have been to every PGConf.EU except Madrid :(
8. Foreign Data Wrappers in a Nutshell
• Provide a unified interface (i.e. SQL) to access different data sources
  • RDBMS (like Postgres!)
  • NoSQL
  • APIs (HTTP, Twitter, etc.)
  • Internet of things
10. History of FDWs
• Released in 9.1 with a few read-only interfaces
  • SQL/MED
  • Did not include Postgres :(
• 9.3: Writeable FDWs
  • ...and did include Postgres :D
• 9.4: Supports triggers on foreign tables
• 9.5
  • IMPORT FOREIGN SCHEMA
  • Push Down API (WIP)
  • Inheritance children
11. Not Going Anywhere
• 9.6
  • Join Push Down
  • Aggregate API?
  • Parallelism?
  • "Hey we need some data from you, we will check back later"
12. So I was just waiting for a good problem to solve with FDWs
14. Some Background
VenueBook is revolutionizing the way people think about event booking. Our platform lets venues and bookers plan together, creating a smarter and better-connected experience for all. We simplify planning, so you can have more fun!
15. Translation
• We have two main products:
  • A CRM platform that allows venue managers to control everything around an event.
  • A marketplace that allows event planners to source venues and book events.
18. 18
Hey, can we build an API?
Sure, but I would want to run it as a
separate application so that way we can
isolate the load from our primary database.
Okay, that makes sense.
Great. There is a feature in Postgres that
makes it easy to talk between two separate
Postgres databases, so it shouldn't be too
difficult to build.
That sounds good. Let's do it!
There's one catch...
26. 26
# "local" is for Unix domain socket connections only!
local all all trust
Yeah, of course I don't care about authentication settings.
(Pro-tip: "trust" means user privileges don't matter)
28. 28
CREATE TABLE venues (
    id serial PRIMARY KEY,
    name varchar(255) NOT NULL
);

CREATE TABLE events (
    id serial PRIMARY KEY,
    venue_id int REFERENCES venues (id),
    name text NOT NULL,
    total int NOT NULL DEFAULT 0,
    guests int NOT NULL,
    start_time timestamptz NOT NULL,
    end_time timestamptz NOT NULL,
    created_at timestamptz DEFAULT CURRENT_TIMESTAMP NOT NULL
);
And let's pretend this is how I created the schema for it.
29. 29
And this magic function to check for availability.
CREATE FUNCTION get_availability(
    venue_id int,
    start_time timestamptz,
    end_time timestamptz
)
RETURNS bool
AS $$
    SELECT NOT EXISTS(
        SELECT 1
        FROM events
        WHERE
            events.venue_id = $1 AND
            ($2, $3) OVERLAPS (events.start_time, events.end_time)
        LIMIT 1
    );
$$ LANGUAGE SQL STABLE;
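A minimal usage sketch for the function above. It assumes freshly created, empty tables so that the first venue gets id 1; the sample rows are illustrative and mirror values used later in the talk.

```sql
-- Hypothetical sample data; assumes empty tables so the venue gets id 1.
INSERT INTO venues (name) VALUES ('Venue A');

INSERT INTO events (venue_id, name, total, guests, start_time, end_time)
VALUES (1, 'Conference Party', 50000, 400,
        '2015-10-28 18:00', '2015-10-28 21:00');

-- Overlaps the booked 18:00-21:00 slot, so the venue is not available.
SELECT get_availability(1, '2015-10-28 18:00', '2015-10-28 20:00');

-- Earlier the same day, no overlap, so the venue is available.
SELECT get_availability(1, '2015-10-28 12:00', '2015-10-28 14:00');

-- OVERLAPS treats each period as half-open (start <= time < end),
-- so a booking that starts exactly when another ends does not conflict.
SELECT get_availability(1, '2015-10-28 21:00', '2015-10-28 23:00');
```

The half-open behaviour means back-to-back bookings are allowed; if a buffer between events is wanted, the function would have to widen the interval itself.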
32. 32
CREATE TABLE api.users (
    id serial PRIMARY KEY,
    key text UNIQUE NOT NULL,
    name text NOT NULL
);

CREATE TABLE api.venues (
    id serial PRIMARY KEY,
    remote_venue_id int NOT NULL
);

CREATE TABLE api.events (
    id serial PRIMARY KEY,
    user_id int REFERENCES api.users (id) NOT NULL,
    venue_id int REFERENCES api.venues (id) NOT NULL,
    remote_bid_id text,
    ip_address text,
    data json,
    created_at timestamptz DEFAULT CURRENT_TIMESTAMP NOT NULL
);
Our API schema
33. 33
CREATE EXTENSION postgres_fdw;

CREATE SERVER app_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (dbname 'app');

CREATE USER MAPPING FOR CURRENT_USER SERVER app_server;
Our setup to pull the information from the main application
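To confirm what the setup above actually created, the system catalogs can be queried on the "api" database. This is a small sketch rather than part of the original talk; it only reads standard catalog objects.

```sql
-- Foreign servers defined in this database, with their options.
SELECT srvname, srvoptions FROM pg_foreign_server;

-- User mappings per server; umoptions is only visible to
-- sufficiently privileged roles.
SELECT srvname, usename, umoptions FROM pg_user_mappings;
```

In psql, `\des+` and `\deu+` show roughly the same information.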
34. 34
CREATE SCHEMA app;

CREATE FOREIGN TABLE app.venues (
    id int,
    name text
) SERVER app_server OPTIONS (table_name 'venues');
We will isolate the foreign tables in their own schema
38. 38
CREATE FOREIGN TABLE app.venues (
    id int,
    name text
) SERVER app_server OPTIONS (
    table_name 'venues',
    schema_name 'public'
);
If there is a schema mismatch between local and foreign table,
you have to set the schema explicitly.
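On 9.5 and later, IMPORT FOREIGN SCHEMA (mentioned in the history slide) is an alternative to writing each definition by hand. The sketch below assumes the "app" schema does not already contain tables named venues or events; if it does, the import fails.

```sql
-- Requires PostgreSQL 9.5 or later.
-- Creates app.venues and app.events from the remote "public" schema.
IMPORT FOREIGN SCHEMA public
    LIMIT TO (venues, events)
    FROM SERVER app_server
    INTO app;
```

Note that the imported definitions cover every remote column, including created_at, which the hand-written foreign tables in this talk leave out. That likely matters for inserts, since a remote column default is not applied to a column the foreign table declares.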
39. 39
SELECT * FROM app.venues;
id | name
----+--------------
1 | Venue A
2 | Restaurant B
3 | Bar C
4 | Club D
40. 40
CREATE FOREIGN TABLE app.events (
    id int,
    venue_id int,
    name text,
    total int,
    guests int,
    start_time timestamptz,
    end_time timestamptz
) SERVER app_server OPTIONS (
    table_name 'events',
    schema_name 'public'
);
Adding in our foreign table for events
47. WARNING
• This is using a sequence on the local database
  • If you do not want to generate overlapping primary keys, this is not the solution for you.
• Want to use the sequence generating function on the foreign database
  • But FDWs cannot access foreign functions
• However...
49. 49
(on the "app" database)
CREATE SCHEMA api;

CREATE VIEW api.events_id_seq_view AS
    SELECT nextval('public.events_id_seq') AS id;
50. 50
CREATE FOREIGN TABLE app.events_id_seq_view (
    id int
)
SERVER app_server
OPTIONS (
    table_name 'events_id_seq_view',
    schema_name 'api'
);

CREATE FUNCTION app.events_id_seq_nextval() RETURNS int AS $$
    SELECT id FROM app.events_id_seq_view
$$ LANGUAGE SQL;

CREATE FOREIGN TABLE app.events (
    id int DEFAULT app.events_id_seq_nextval(),
    venue_id int,
    name text,
    total int,
    guests int,
    start_time timestamptz,
    end_time timestamptz
) SERVER app_server OPTIONS (
    table_name 'events',
    schema_name 'public'
);
(on the "api" database)
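A quick way to see the sequence trick working, run on the "api" database. This is a sketch that assumes the objects above exist and that the connecting role is allowed to read the remote view.

```sql
-- Each call scans the foreign view, which runs nextval() on "app".
SELECT app.events_id_seq_nextval();

-- A second call advances the remote sequence again,
-- so it returns a later value than the first call.
SELECT app.events_id_seq_nextval();
```

Every read consumes a sequence value, so running these checks leaves gaps in the ids later assigned to events. That is normal for sequences, but worth knowing before testing against real data.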
52. 52
Hey, can we check the availability on the api server
before making an insert on the app server?
53. 53
Sure, we have a function for that on "app" but...
FDWs do not support foreign functions.
And we cannot use a view.
However...
54. dblink
• Written in 2001 by Joe Conway
• Designed to make remote PostgreSQL database calls
• The docs say:
  • See also postgres_fdw, which provides roughly the same functionality using a more modern and standards-compliant infrastructure.
55. 55
-- set up the extensions (if not already installed)
CREATE EXTENSION IF NOT EXISTS plpgsql;
CREATE EXTENSION IF NOT EXISTS dblink;

-- create a local wrapper around the remote get_availability function
CREATE FUNCTION app.get_availability(
    venue_id int,
    start_time timestamptz,
    end_time timestamptz
)
RETURNS bool
AS $get_availability$
DECLARE
    is_available bool;
    remote_sql text;
BEGIN
    -- %L quotes each argument as a SQL literal for the remote query
    remote_sql := format('SELECT get_availability(%L, %L, %L)',
        venue_id, start_time, end_time);
    SELECT availability.is_available
    INTO is_available
    FROM dblink('dbname=app', remote_sql) AS availability(is_available bool);
    RETURN is_available;
EXCEPTION
    -- any failure (connection, permissions, remote error) is reported as NULL
    WHEN others THEN
        RETURN NULL::bool;
END;
$get_availability$ LANGUAGE plpgsql;
(on the "api" database)
56. 56
SELECT app.get_availability(1, '2015-10-28 18:00', '2015-10-28 20:00');
 get_availability
------------------
 f
(1 row)

SELECT app.get_availability(1, '2015-10-28 12:00', '2015-10-28 14:00');
 get_availability
------------------
 t
(1 row)

Works great!
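One way the check and the insert could be combined on the "api" database is sketched below. The date and values are hypothetical, and the statement assumes the app.events foreign table and app.get_availability function defined above.

```sql
-- Insert only when the remote check says the slot is free.
-- IS TRUE treats NULL (the wrapper's "something went wrong" result)
-- the same as "not available", so nothing is inserted in that case.
INSERT INTO app.events (venue_id, name, total, guests, start_time, end_time)
SELECT 1, 'Conference Party', 50000, 400,
       '2015-10-29 18:00'::timestamptz, '2015-10-29 21:00'::timestamptz
WHERE app.get_availability(1, '2015-10-29 18:00', '2015-10-29 21:00') IS TRUE
RETURNING id;
```

This is not atomic: dblink runs the check over its own connection, separate from the one postgres_fdw uses for the insert, so two concurrent callers could both pass the check. A guarantee against double booking would have to be enforced on "app" itself.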
57. Summary So Far...
• We created two separate databases with logical schemas
• We wrote some code using postgres_fdw and dblink that can
  • Read data from "app" to "api"
  • Insert data from "api" to the "app"
    • ...with the help of the sequence trick
  • Make a remote function call
59. (And because we are good developers, we are going to test the deploy configuration in a staging environment, but we can all safely assume that, right? :-)
60. (Note: when I say "superuser" I mean a Postgres superuser)
63. 63
# TYPE  DATABASE  USER  ADDRESS        METHOD
# for the main user
host    app       app   10.0.0.10/32   md5
host    api       api   10.0.0.20/32   md5
# for foreign table access
local   api       app                  md5
local   app       api                  md5
pg_hba.conf setup
65. 65
CREATE SERVER app_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname 'app');
ERROR: permission denied for foreign server app_server
But if we log in as the "api" user and try to run this...
66. 66
As a superuser, grant permission
GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw TO api;
67. 67
CREATE SERVER app_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (dbname 'app');

CREATE FOREIGN TABLE app.venues (
    id int,
    name text
) SERVER app_server OPTIONS (
    table_name 'venues',
    schema_name 'public'
);
Now this works! Let's run a query...
68. 68
SELECT * FROM app.venues;
ERROR: user mapping not found for "api"
69. 69
CREATE USER MAPPING FOR api
SERVER app_server
OPTIONS (
user 'api',
password 'test'
);
So we create the user mapping and...
70. 70
SELECT * FROM app.venues;
ERROR: permission denied for relation venues
CONTEXT: Remote SQL command: SELECT id, name FROM public.venues
You've got to be kidding me...
71. 71
Go to "app" and as a superuser run this
GRANT SELECT ON venues TO api;
GRANT SELECT, INSERT, UPDATE ON events TO api;
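Rather than discovering missing privileges one error at a time, they can be checked up front on "app" with the built-in privilege inquiry functions. A small sketch, using the role and table names from this talk:

```sql
-- Run on "app". Each returns true once the matching GRANT is in place.
SELECT has_table_privilege('api', 'public.venues', 'SELECT');
SELECT has_table_privilege('api', 'public.events', 'INSERT');
SELECT has_table_privilege('api', 'public.events', 'UPDATE');
```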
72. 72
SELECT * FROM app.venues;
id | name
----+--------------
1 | Venue A
2 | Restaurant B
3 | Bar C
4 | Club D
Meanwhile, back on "api"
74. 74
CREATE SCHEMA api;

CREATE VIEW api.events_id_seq_view AS
    SELECT nextval('public.events_id_seq') AS id;
Get things started on the "app" database
75. 75
-- set up the sequence functionality
CREATE FOREIGN TABLE app.events_id_seq_view (
    id int
)
SERVER app_server
OPTIONS (
    table_name 'events_id_seq_view',
    schema_name 'api'
);

CREATE FUNCTION app.events_id_seq_nextval() RETURNS int AS $$
    SELECT id FROM app.events_id_seq_view
$$ LANGUAGE SQL;
Back on the "api" database
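This transcript jumps from slide 75 to slide 84, so the permission steps for the sequence trick are not shown. The following is only a sketch of the grants the "api" role most likely needs on "app" for the view to be readable; it is an assumption based on how PostgreSQL checks schema, view, and sequence privileges, not a reproduction of the missing slides.

```sql
-- Assumed grants, run on "app" as a superuser (or the objects' owner).
GRANT USAGE ON SCHEMA api TO api;                     -- reach objects in the schema
GRANT SELECT ON api.events_id_seq_view TO api;        -- read the view
GRANT USAGE ON SEQUENCE public.events_id_seq TO api;  -- call nextval() on the sequence
```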
84. 84
CREATE FOREIGN TABLE app.events (
    id int DEFAULT app.events_id_seq_nextval(),
    venue_id int,
    name text,
    total int,
    guests int,
    start_time timestamptz,
    end_time timestamptz
) SERVER app_server OPTIONS (
    table_name 'events',
    schema_name 'public'
);
We can now create the foreign table and test the INSERT...
85. 85
INSERT INTO app.events (
    venue_id,
    name,
    total,
    guests,
    start_time,
    end_time
) VALUES (
    1,
    'Conference Party',
    50000,
    400,
    '2015-10-28 18:00',
    '2015-10-28 21:00'
)
RETURNING id;
id
----
2
Yup...we ran "GRANT SELECT, INSERT, UPDATE ON events TO api;" on "app" earlier!
86. 86
CREATE FUNCTION app.get_availability(
    venue_id int,
    start_time timestamptz,
    end_time timestamptz
)
RETURNS bool
AS $get_availability$
DECLARE
    is_available bool;
    remote_sql text;
BEGIN
    -- %L quotes each argument as a SQL literal for the remote query
    remote_sql := format('SELECT get_availability(%L, %L, %L)',
        venue_id, start_time, end_time);
    -- the connection string now names the role and password explicitly
    SELECT availability.is_available
    INTO is_available
    FROM dblink('dbname=app user=api password=test', remote_sql)
        AS availability(is_available bool);
    RETURN is_available;
EXCEPTION
    -- any failure (connection, permissions, remote error) is reported as NULL
    WHEN others THEN
        RETURN NULL::bool;
END;
$get_availability$ LANGUAGE plpgsql;
And install our availability function...
90. We Learned That...
• PostgreSQL has a robust permission system
  • http://www.postgresql.org/docs/current/static/sql-grant.html
  • ...there is much more we could have done too.
• Double the databases, double the problems
• Always have a testing environment that can mimic your production environment
• ...when it all works, it is so sweet.
91. Conclusion
• Foreign data wrappers are incredible
• The postgres_fdw is incredible
  • ...and it is still a work in progress
• Make sure you understand its limitations
• Research what is required to properly install in production