This document discusses storing product and order data as JSON in a database to support an agile development process. It describes creating tables with JSON columns to store this data, and using JSON functions like JSON_VALUE and JSON_TABLE to query and transform the JSON data. Examples are provided of indexing JSON columns for performance and updating product JSON to include unit costs by joining external data. The goal is to enable flexible and rapid evolution of the application through storing data in JSON.
7. 7
Agile Database Development
with JSON
Chris Saxon
Developer Advocate, @ChrisRSaxon & @SQLDaily
blogs.oracle.com/sql
youtube.com/c/TheMagicofSQL
asktom.oracle.com
8. The following is intended to outline our general product direction. It is intended for information purposes
only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing decisions. The development,
release, timing, and pricing of any features or functionality described for Oracle’s products may change
and remains at the sole discretion of Oracle Corporation.
Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and
prospects are “forward-looking statements” and are subject to material risks and uncertainties. A detailed
discussion of these factors and other risks that affect our business is contained in Oracle’s Securities and
Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and Form 10-Q
under the heading “Risk Factors.” These filings are available on the SEC’s website or on Oracle’s website
at http://www.oracle.com/investor. All information in this presentation is current as of September 2019
and Oracle undertakes no duty to update any statement in light of new information or future events.
Safe Harbor
9. User Story #1
We must be able to store
product & order details
So we need to create the tables
and define CRUD operations on them
10. create table products (
product_id integer
not null
primary key,
product_json ##TODO##
not null,
check (
json_data is json
)
);
create table orders (
order_id integer
not null
primary key,
order_json ##TODO##
not null,
check (
json_data is json
)
);
The tables are just a
primary key, JSON column,
& is json constraint
11. create table products (
product_id integer
not null
primary key,
product_json ##TODO##
not null,
check (
json_data is json
)
);
create table orders (
order_id integer
not null
primary key,
order_json ##TODO##
not null,
check (
json_data is json
)
);
But which data type
to use for JSON?!
12. Which data type should you use for JSON?
"Small" documents varchar2
"Large" documents ???
<= 4,000 bytes / 32k
14. create table products (
product_id integer
not null
primary key,
product_json blob
not null,
check (
json_data is json
)
);
create table orders (
order_id integer
not null
primary key,
order_json blob
not null,
check (
json_data is json
)
);
15. insert into products ( product_json )
values ( utl_raw.cast_to_raw ( '{
"productName": "..."
}' ) );
BLOBs need extra processing on insert
16. select product_json from products;
PRODUCT_JSON
7B202274686973223A20227468617422207D
and select to make them human readable
17. select json_serialize (
product_json
returning clob
pretty
) jdata
from products;
JDATA
{
"productName": "..."
}
Added in 19c
json_serialize
converts JSON data to
text; which you can
pretty print for
readability
18. select json_query (
product_json,
'$' returning clob
pretty
) jdata
from products;
JDATA
{
"productName": "..."
}
In earlier releases use
json_query
The clob return type
was added in 18c
19. User Story #2
Customers must be able to
search by price
So we need to query the products table for JSON
where the unitPrice is in the specified range
20. {
"productName": "GEEKWAGON",
"descripion": "Ut commodo in …",
"unitPrice": 35.97,
"bricks": [ {
"colour": "red", "shape": "cube",
"quantity": 13
}, {
"colour": "green", "shape": "cube",
"quantity": 17
}, …
]
}
We need to search for this
value in the documents
21. select * from products p
where p.product_json.unitPrice <= :max_price;
But remember it returns
varchar2
=> implicit conversion!
Use simple dot-notation to access the value
22. select * from products p
where json_value (
product_json,
'$.unitPrice' returning number
) <= :max_price;
json_value gives you more control
So this returns number
=> no implicit conversion! :)
23. select * from products p
where p.product_json.unitPrice.number()
<= :max_price;
From 19c you can state
the return type with
simple dot-notation
24. User Story #3
Customers must be able to view their
orders
Showing order details and a list of what they bought
So we need to join the order productIds to products
25. {
"customerId" : 2,
"orderDatetime" : "2019-01-01T03:25:43",
"products" : [ {
"productId" : 1,
"unitPrice" : 74.95
}, {
"productId" : 10,
"unitPrice" : 35.97
}, …
]
}
We need to extract these
from the product array
27. select json_query (
order_json, '$.products[*].productId'
with array wrapper
)
from orders o;
PRODUCTS
[2,8,5]
[3,9,6]
[1,10,7,4]
...
But to join these to
products, we need to
convert them to rows…
…or with json_query
29. with order_items as (
select order_id, t.*
from orders o, json_table (
order_json
columns (
customerId,
nested products[*] columns (
productId,
unitPrice
) )
) t
)
Simplified syntax 18c
30. with order_items as (
select order_id, t.*
from orders o, json_table (
order_json
columns (
customerId,
nested products[*] columns (
productId,
unitPrice
) )
) t
)
This tells the database to
return a row for each
element in the products
array…
34. User Story #4
Sales must be able to view
today's orders
We need to create a dashboard counting orders
So we need to search for orders placed today
35. {
"customerId" : 2,
"orderDatetime" : "2019-01-01T03:25:43",
"products" : [ {
"productId" : 1,
"unitPrice" : 74.95
}, {
"productId" : 10,
"unitPrice" : 35.97
}, …
]
}
We need to search
for this value in the
documents
36. select * from orders o
where o.order_json.orderDatetime >=
trunc ( sysdate );
ORA-01861: literal does
not match format string
Remember the
implicit conversions?
It fails for dates!
Use simple dot-notation to access the value
37. select * from orders o
where json_value (
order_json,
'$.orderDatetime' returning date
) >= trunc ( sysdate )
So you need to define the
return type; JSON dates
conform to ISO 8601 date
40. select * from orders o
where json_value (
order_json,
'$.orderDatetime' returning date
) >= trunc ( sysdate )
But the query is very slow…
41. select * from orders o
where json_value (
order_json,
'$.orderDatetime' returning date
) >= trunc ( sysdate )
{ "customerId": 1, … }
{ "customerId": 2, … }
…
42. User Story #4b
… and make it fast!
currently the query does a full table scan
To speed it up we need to create an index!
43. create index orders_date_i
on orders ( order_json );
ORA-02327: cannot create index on
expression with datatype LOB
You can't index LOB data
44. create search index orders_json_i
on orders ( order_json )
for json
parameters ( 'sync (on commit)' );
Added in 12.2, a json search
index enables JSON queries
to use an index
JSON Search Indexes
45. select * from orders o
where json_value (
order_json,
'$.orderDatetime' returning date
) >= trunc ( sysdate )
{ "customerId": 1, … }
{ "customerId": 2, … }
…
46. -----------------------------------------------------
| Id | Operation | Name |
-----------------------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | TABLE ACCESS BY INDEX ROWID| ORDERS |
|* 2 | DOMAIN INDEX | ORDERS_JSON_I |
-----------------------------------------------------
With the search index in place,
the optimizer can use it
47. Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(JSON_VALUE("ORDER_JSON" FORMAT JSON ,
'$.orderDatetime' RETURNING TIMESTAMP NULL
ON ERROR) >= TIMESTAMP' 2019-01-15 00:00:00')
2 - access("CTXSYS"."CONTAINS"("O"."ORDER_JSON",
'sdatap(TMS_orderDatetime >=
"2019-01-15T00:00:00+00:00" /orderDatetime)')>0)
Under the covers, this uses Oracle Text
48. create index order_date_i
on orders (
json_value (
order_json,
'$.orderDatetime'
returning date
error on error
null on empty
)
);
It's more efficient to
create a function-
based index,
matching the search
you'll do
This has some other
benefits…
49. create index order_date_i
on orders (
json_value (
order_json,
'$.orderDatetime'
returning date
error on error
null on empty
)
);
Data validation!
If the value is not
a JSON date;
inserts will raise
an exception
50. create index order_date_i
on orders (
json_value (
order_json,
'$.orderDatetime'
returning date
error on error
null on empty
)
);
From 12.2 you can
also raise an error
if the attribute is
not present
51. ------------------------------------------------------------
| Id | Operation | Name |
------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| ORDERS |
|* 2 | INDEX RANGE SCAN | ORDER_DATE_I |
------------------------------------------------------------
The function-based index is more
efficient, so the optimizer will choose
this over the search index
52. Search vs. Function-Based Indexes
JSON Search Index Function-based Index
Applicability Any JSON query Matching function
Performance Slower Faster
Use Ad-hoc queries Application queries
66. "bricks": [ {
"colour": "red",
"shape": "cube",
"quantity": 13
}, {
"colour": "green",
"shape": "cuboid",
"quantity": 17
}, …
] join on
colour, shape
We need to combine the spreadsheet
data with the stored JSON
67. Step 1: transform JSON
to rows-and-columns
Step 3: convert
back to JSON
Step 2: join
the costs
68. Photo by Gus Ruballo on Unsplash
Buckle up!
This will be a bumpy ride!
69. select * from external ( (
colour varchar2(30),
shape varchar2(30),
unit_cost number
)
default directory tmp
location ( 'costs.csv' )
)
From 18c you can query files "on the
fly" with an inline external table
70. select product_id, j.*
from products, json_table (
product_json columns (
nested bricks[*] columns (
pos for ordinality,
colour path '$.colour',
shape path '$.shape',
brick format json path '$'
)
)
) j
Using JSON_table to
extract the bricks as rows
73. with costs as (
select * from external …
), bricks as (
select product_id, j.*
from products, json_table (
…
)
)
select …
from bricks join costs
on …
We've joined the data, but how do
we convert it back to JSON?
75. select json_object (
'colour' value b.colour,
'shape' value b.shape,
'quantity' value b.quantity,
'unitCost' value c.cost
)
from bricks b
join costs c
on b.colour = c.colour
and b.shape = c.shape;
So you can create a brick
object with json_object…
76. select json_mergepatch (
brick,
'{ "unitCost": ' || c.cost || '}'
)
from bricks b
join costs c
on b.colour = c.colour
and b.shape = c.shape;
Add/replace this…
…to this document
… or use
json_mergepatch (19c)
to add it to the brick object
77. {
"colour": "red",
"shape": "cube",
"quantity": 13,
"unitCost": 0.59
}
{
"colour": "green",
"shape": "cuboid",
"quantity": 17,
"unitCost": 0.39
}
This returns a row
for each brick
To combine them
into an array for
each product, use
json_arrayagg
83. {
"productName": "GEEKWAGON",
"descripion": "Ut commodo in …",
"unitPrice": 35.97,
"bricks": [ {
…, "unitCost": 0.59
}, {
…, "unitCost": 0.39
}, …
]
}
Finally!
We've added
unitCost to every
element in the array
We just need to
update the table…
84. update products
set product_json = (
with costs as (
select * from external …
), bricks as (
select …
)
select json_mergepatch …
)
85. …at least we can view order profitability now
That was hard work
Ryan McGuire / Gratisography
86. User Story #7
Create report
prices - discount – total cost
We've got the data;
but want an easier way to query it…
88. desc orders
Name Null? Type
ORDER_ID NOT NULL NUMBER(38)
ORDER_JSON NOT NULL BLOB
ORDER_JSON$customerId NUMBER
ORDER_JSON$orderDatetime VARCHAR2(32)
ORDER_JSON$code VARCHAR2(8)
ORDER_JSON$discountAmount NUMBER
Sadly it only exposes
scalar (non-array) values
91. PRODUCT_ID SHAPE COLOUR
1 cube green
1 cube red
1 cylinder blue
1 cylinder blue
1 cylinder green
1 cylinder green
… … …
The unique key
for a brick is
(colour, shape)
Some products have
duplicate entries
in the bricks array!
We're shipping too
many bricks!
92. User Story #8
FIX ALL THE DATAZ!
We need to remove all the duplicate entries
from the product brick arrays
95. Wrong Data Model
PRODUCTS BRICKS
The JSON models the relationship between
products and bricks as 1:M
This is the wrong data model
the relationship is M:M
96. Fixed It!
PRODUCTS BRICKSPRODUCT_BRICKS
unique (
product_id,
brick_id
)
{ JSON } { JSON }{ JSON }
You need a junction table
between products and bricks
This avoids duplication &
enables constraints
99. select distinct "PRODUCT_JSON$shape" shape,
"PRODUCT_JSON$colour" colour,
"PRODUCT_JSON$unitCost" unit_cost
from product_bricks_vw
Moving from 1:M to M:M
Using the JSON Data Guide
view, you can find all the
unique brick types…
100. with vals as (
select distinct "PRODUCT_JSON$shape" shape,
"PRODUCT_JSON$colour" colour,
"PRODUCT_JSON$unitCost" unit_cost
from product_bricks_vw
)
select rownum brick_id,
v.*
from vals v;
…assign a unique ID to each
( colour, shape ) …
101. create table bricks as
with vals as (
select distinct "PRODUCT_JSON$shape" shape,
"PRODUCT_JSON$colour" colour,
"PRODUCT_JSON$unitCost" unit_cost
from product_bricks_vw
)
select rownum brick_id,
v.*
from vals v;
…and create a table
from the results!
102. create table bricks as
with vals as (
select distinct "PRODUCT_JSON$shape" "shape",
"PRODUCT_JSON$colour" "colour",
"PRODUCT_JSON$unitCost" "unitCost"
from product_bricks_vw
)
select rownum brick_id,
json_object ( v.* ) brick_json
from vals v;
19c simplification
(Storing the values as
JSON if you want)
103. create table product_bricks as
select distinct product_id, brick_id
from product_bricks_vw
join bricks
on ...
Create the Join Table
104. json_mergepatch (
product_json,
'{ "bricks": null }'
)
If you pass a null value for an
attribute to JSON_mergepatch,
it's removed from the source
Removing the bricks array from products