3. Self-Introduction
Suguru Suzuki
Japan Ichiba Section, Japan Mall Group, Rakuten Ichiba Development Department
• Application Engineer
• Joined Rakuten in 2008
• Ichiba TOP / Rakuten Search (all devices)
Yuhei Nishioka
Rakuten Institute of Technology
• Chief Technologist
• Joined Rakuten in 2007
• Semantic Web, Recommender System
6. Rakuten Category
A category is the most basic classification used to divide things according to their nature.
In metaphysics (in particular, ontology), the different kinds or ways of
being are called categories of being or simply categories.
Source of quote: Wikipedia
http://ja.wikipedia.org/wiki/%E3%82%AB%E3%83%86%E3%82%B4%E3%83%AA
Rakuten’s Category is…
Sales area = “売り場” (sales floor)
8. Rakuten Category
Data: Number
• Categories in Rakuten Ichiba: 50,896 genres
• Services using Category: 50 services
• Applications using Category: 100 applications
Services that make effective use of Category (Genre/Tag):
RMS, GMS, Report, TOP page, Search Engine, Rakuten Search, Web Service, Advertisement, Auction, Review, Books, Racoupon, kobo, Auto, Browsing History, Super DB, Affiliate, Ranking, Basket, Mail, Item Page
Many services use Category data!
9. Rakuten Category
Catch up with trends
Good categorization
Easy for users to navigate
A big factor in increasing sales of each item.
15. Data-Driven Optimization
Modify categories by analyzing users’ queries
Past example of data-driven optimization
List of high-frequency queries:
• …
• ホットプレート (hot plate): already exists in the Rakuten category tree
• …
• タジン鍋 (tagine pot): no corresponding genre, so a new category was created (a couple of years ago)
You can now find “タジン鍋” without using search
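The query analysis on this slide can be sketched as follows. This is only an illustration under stated assumptions: the function name, the `min_count` threshold, the exact-match comparison, and the sample data are all hypothetical, and the slides do not describe how the real analysis was done.

```python
from collections import Counter


def queries_without_genre(query_log, genre_names, min_count=2):
    """Return frequent queries that match no existing genre name, most frequent first."""
    counts = Counter(q.strip() for q in query_log if q.strip())
    return [
        (query, n)
        for query, n in counts.most_common()
        if n >= min_count and query not in genre_names
    ]


# Hypothetical sample data, for illustration only.
query_log = ["ホットプレート", "タジン鍋", "タジン鍋", "ホットプレート", "タジン鍋", "土鍋"]
genre_names = {"ホットプレート", "土鍋"}

# Frequent queries with no genre are candidates for a new category.
candidates = queries_without_genre(query_log, genre_names)
print(candidates)
```

A real pipeline would probably need fuzzier matching between queries and genre names; exact string comparison is used here only to keep the idea visible.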
16. Types of queries
Users need browsing functions not only for the category tree but also for other attributes
Ratio of query types (chart): Product Category, Brand, Merchant, Spec, Character, Others
Source: user queries at Rakuten Ichiba in 2013
17. Master Database
Create new master databases for brand, color, and so on
Data structure behind navigation (Data Source → Master Data → Navigation):
• Category Tree (already exists) → Category navigation
• Brand Master a, Brand Master b, Brand Master c → Integration → Unified Brand Master (new) → Brand navigation
• Color Master (new) → Color navigation
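One way to picture the integration step is the sketch below, which merges several brand masters into a single list and remembers where each brand came from. The record layout, the `UnifiedBrand` class, and the naive normalization are assumptions made for illustration; the slides do not show the actual schema or merge logic.

```python
from dataclasses import dataclass, field


@dataclass
class UnifiedBrand:
    brand_id: int
    name: str
    # Maps each source master to the ID the brand has there.
    source_ids: dict = field(default_factory=dict)


def normalize(name):
    """Case- and whitespace-insensitive key (a deliberately naive assumption)."""
    return "".join(name.split()).casefold()


def unify(masters):
    """Merge {master_name: {source_id: brand_name}} into one list of UnifiedBrand."""
    by_key = {}
    for master_name, records in masters.items():
        for source_id, brand_name in records.items():
            key = normalize(brand_name)
            if key not in by_key:
                by_key[key] = UnifiedBrand(len(by_key) + 1, brand_name)
            by_key[key].source_ids[master_name] = source_id
    return list(by_key.values())


# Hypothetical contents of Brand Master a, b, and c.
masters = {
    "a": {"a-1": "SONY", "a-2": "Samsung"},
    "b": {"b-7": "Sony"},
    "c": {"c-3": "samsung", "c-4": "Kalita"},
}
unified = unify(masters)
for brand in unified:
    print(brand.brand_id, brand.name, brand.source_ids)
```

Keeping the source IDs on each unified record lets existing services continue to use their old identifiers while navigation reads from the unified master.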
18. The difficulty identifying brand
Brand name matching is very effective, but two major problems must be solved.
Two major technical problems in brand name matching:
• Different things with the same name
  • カリタ: http://www.kalita.co.jp/ and http://www.carita.jp/
• The same thing with different names
  • Samsung
  • サムスン
19. Check by hand
Brand name matching is very effective, but two major problems must be solved.
Data → Process → Result → check by hand
Original Matching Algorithm:
- Title match
- Synonym check
- Ambiguous word check
- Use other attributes
- …
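The four steps listed for the matching algorithm could fit together roughly as in this sketch. It is not the original algorithm, whose details the slides do not give: the `Brand` record, the synonym table, the use of genre as the "other attribute", and the substring-based title match are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brand:
    brand_id: str
    name: str
    genre: str  # the "other attribute" used to break ties (assumption)


# Hypothetical master data and synonym table.
BRANDS = [
    Brand("b1", "Samsung", "electronics"),
    Brand("b2", "Kalita", "kitchen"),
    Brand("b3", "CARITA", "cosmetics"),
]
SYNONYMS = {
    "samsung": ["b1"], "サムスン": ["b1"],
    "カリタ": ["b2", "b3"],  # ambiguous: two brands share this spelling
    "kalita": ["b2"], "carita": ["b3"],
}
BY_ID = {b.brand_id: b for b in BRANDS}


def match_brand(title, item_genre):
    """Return (status, brands); status is 'matched', 'candidates', or 'unmatched'."""
    lowered = title.casefold()
    ids = []
    for surface, brand_ids in SYNONYMS.items():  # title match + synonym check
        if surface in lowered:
            ids.extend(i for i in brand_ids if i not in ids)
    if not ids:
        return "unmatched", []
    if len(ids) > 1:  # ambiguous word check
        same_genre = [i for i in ids if BY_ID[i].genre == item_genre]
        if len(same_genre) == 1:  # use other attribute
            ids = same_genre
    brands = [BY_ID[i] for i in ids]
    return ("matched" if len(brands) == 1 else "candidates"), brands


print(match_brand("カリタ コーヒーミル", "kitchen"))
print(match_brand("カリタ ギフトセット", "gift"))
```

Items that come back as "candidates" are the ones that would go to the manual check.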
20. Check by hand at low cost
OpenRefine, a useful open-source tool, is very helpful (http://openrefine.org)
Server side: Original Matching Algorithm, API for OpenRefine, other master data
Web interface (OpenRefine):
• ID xxx, Name SONY: SONY [Matched]
• ID yyy, Name カリタ: Kalita [Candidate 1], CARITA [Candidate 2]
• …
Source: http://www.carita.jp/
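The server-side "API for OpenRefine" could return candidates in a shape like the one below. The request and response layout is modeled on the batch-query format of OpenRefine's reconciliation service API (a `queries` JSON object in, a `result` list with `id`, `name`, `type`, `score`, and `match` out); treat the exact fields as an assumption to verify against the OpenRefine documentation. The brand data, the `difflib` scoring, and the threshold are hypothetical stand-ins for the original matching algorithm.

```python
import json
from difflib import SequenceMatcher

# Hypothetical unified brand master: id -> (name, known spellings).
BRAND_MASTER = {
    "b1": ("SONY", ["sony", "ソニー"]),
    "b2": ("Kalita", ["kalita", "カリタ"]),
    "b3": ("CARITA", ["carita", "カリタ"]),
}


def score(query, spellings):
    """Best similarity (0.0 to 1.0) between the query and any known spelling."""
    q = query.casefold()
    return max(SequenceMatcher(None, q, s.casefold()).ratio() for s in spellings)


def reconcile(queries_json, threshold=0.6):
    """Answer a batch of queries: {"q0": {"query": ...}} -> {"q0": {"result": [...]}}."""
    response = {}
    for key, query in json.loads(queries_json).items():
        scored = [
            (score(query["query"], spellings), brand_id, name)
            for brand_id, (name, spellings) in BRAND_MASTER.items()
        ]
        hits = sorted((s for s in scored if s[0] >= threshold), reverse=True)
        exact = [h for h in hits if h[0] == 1.0]
        response[key] = {"result": [
            {
                "id": brand_id,
                "name": name,
                "type": ["brand"],
                "score": round(s * 100, 1),
                # Auto-match only when exactly one brand fits perfectly.
                "match": len(exact) == 1 and s == 1.0,
            }
            for s, brand_id, name in hits
        ]}
    return response


queries = json.dumps({"q0": {"query": "SONY"}, "q1": {"query": "カリタ"}})
result = reconcile(queries)
print(json.dumps(result, ensure_ascii=False, indent=2))
```

With this shape, an unambiguous name is marked as matched, while an ambiguous one such as カリタ is returned as several unmatched candidates for a person to choose from in the web interface.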
21. Color Master
Build the color dictionary automatically as much as possible
Color Dictionary: 16 color groups, 1,871 color names
• Black: 黒, 黒色, black, …
• Blue: blue, navy, …
Techniques:
• Image Processing
• Natural Language Processing
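A color dictionary of this kind can be used as a reverse index from color name to color group. The sketch below assumes a tiny excerpt of the dictionary and a naive substring search over item text; the function name and the matching strategy are illustrative, not the method described in the talk.

```python
# Hypothetical excerpt; the slide's dictionary has 16 groups and 1,871 names.
COLOR_GROUPS = {
    "Black": ["黒", "黒色", "black"],
    "Blue": ["blue", "navy"],
}

# Reverse index: color name -> color group.
NAME_TO_GROUP = {
    name.casefold(): group
    for group, names in COLOR_GROUPS.items()
    for name in names
}


def color_groups_in(text):
    """Return the color groups whose names occur in text, in order of first appearance."""
    lowered = text.casefold()
    found = []
    for name, group in NAME_TO_GROUP.items():
        position = lowered.find(name)
        if position >= 0:
            found.append((position, group))
    ordered = []
    for _, group in sorted(found):
        if group not in ordered:
            ordered.append(group)
    return ordered


print(color_groups_in("Navy jacket with 黒色 buttons"))
```

Plain substring search will also fire inside longer words, so a production version would need tokenization, which is where the natural language processing mentioned on the slide would come in.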
22. Tagging Data for each item
Structured data for each item: Item ID, Category, Brand, Color, …
Sources:
• Merchant input
• Automatic extraction from the item description (in research)
23. Attribute value extraction
• Generate extraction rules using an attribute value database constructed from table data
• Table data: Chateau d’Issan 1994
• Database: …, <Region, Margaux>, <Color, White>, …
• Annotation: an item page that includes a dictionary entry, e.g. “This is a wine from Margaux. ...”
• Rule: “wine from x” => x is a Region
• Values not included in the database can be captured.
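The rule generation on this slide can be illustrated with a small sketch. It assumes that a rule is simply the two words to the left of a known value, turned into a regular expression; the slides do not say how rules are actually induced, so `learn_rules`, `extract`, and the fixed context width are hypothetical.

```python
import re

# Hypothetical attribute-value database built from table data.
DATABASE = {("Region", "Margaux"), ("Color", "White")}


def learn_rules(sentences, database, context_words=2):
    """Turn each known value found in a sentence into a (regex, attribute) rule."""
    rules = set()
    for sentence in sentences:
        for attribute, value in database:
            # Capture the words immediately to the left of the known value.
            pattern = r"((?:\w+\s+){%d})%s\b" % (context_words, re.escape(value))
            match = re.search(pattern, sentence)
            if match:
                context = re.escape(match.group(1).strip())
                rules.add((context + r"\s+(\w+)", attribute))
    return rules


def extract(sentence, rules):
    """Apply learned rules; the extracted values need not be in the database."""
    return {
        (attribute, m.group(1))
        for pattern, attribute in rules
        for m in re.finditer(pattern, sentence)
    }


# Annotated sentence -> rule "wine from x => x is a Region".
rules = learn_rules(["This is a wine from Margaux."], DATABASE)
found = extract("A red wine from Pauillac, aged in oak.", rules)
print(found)
```

The second sentence mentions a region that is absent from the database, which is the case the slide highlights: the learned context, not the dictionary, does the capturing.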
31. Extra - Before Release
Hard to release Category data
• Category data is spread over more than 15 DBs
• The data must be delivered to all 50 services, and new services are sometimes added
Services: RMS, GMS, Report, TOP page, Search Engine, Rakuten Search, Web Service, Advertisement, Auction, Review, Books, Racoupon, kobo, Auto, Browsing History, Super DB, Affiliate, Ranking, Basket, Mail, Item Page
32. Extra - Before Release
Maintenance timetable for a Category Restructuring
Almost 300 tasks are related to a Category Restructuring. Complicated!!!
33. Release
Make releases easier and faster for all services
Legend: already automated / automation in progress
Category Data → API → ServiceA, ServiceB, ServiceC, ServiceD, ServiceE, …
Now improving!
34. Release
■System reconstruction using the API
Before (6 months):
• Data made by hand
• Data shared by dump or Excel
• Tested and operated by each service (ServiceA, ServiceB, ServiceC, ServiceD, ServiceE, …)
• Released in regular maintenance
In progress (every week):
• Data made with a management tool
• New data reflected through the API
• Tested and operated by each service (ServiceA, ServiceB, ServiceC, ServiceD, ServiceE, …)
• Released within the week
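The move from shared dumps to an API can be pictured with the in-memory sketch below, in which each service pulls versioned category data instead of waiting for a file. The `CategoryApi` and `Service` classes, the version counter, and the method names are hypothetical; the slides do not describe the real API.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryApi:
    """In-memory stand-in for the Category API (hypothetical interface)."""
    version: int = 0
    genres: dict = field(default_factory=dict)  # genre_id -> name

    def publish(self, genres):
        """Called by the management tool when new category data is ready."""
        self.genres = dict(genres)
        self.version += 1

    def fetch(self, known_version):
        """Return (version, genres), or None when the caller is already up to date."""
        if known_version == self.version:
            return None
        return self.version, dict(self.genres)


@dataclass
class Service:
    name: str
    version: int = 0
    genres: dict = field(default_factory=dict)

    def sync(self, api):
        """Pull the latest category data; return True if anything changed."""
        update = api.fetch(self.version)
        if update is None:
            return False
        self.version, self.genres = update
        return True


api = CategoryApi()
services = [Service("ServiceA"), Service("ServiceB")]
api.publish({"100": "ホットプレート", "101": "タジン鍋"})
synced = [s.sync(api) for s in services]
print(synced, services[0].genres)
```

Because every service reads from the same source and tracks its own version, a new service only needs to call `sync`, with no extra dump or spreadsheet to distribute.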
35. Release
Easier and faster!!
Aim for operation-free releases
Get rid of dependencies in each service
Category Data → Category API → RMS, GMS, Report, TOP page, Search Engine, Rakuten Search, Web Service, Advertisement, Auction, Books, Review, Racoupon, kobo, Auto, Browsing History, Super DB, Affiliate, Ranking, Basket, Mail, Item Page
36. Release
■Real-time reflection
Category data can be released and searched in real time on Rakuten Search.
• Register: iPhone5s
• Released in real time when needed
• Real-time reference
37. Release
■Real-time reflection
Category data can be released and summarized in Ranking.
• Register: iPhone5s
• Released as daily/weekly Ranking data
38. Release
■Real-time reflection
Category data can be released and used to create landing pages.
• Register: iPhone5s
• Landing pages can be created using the new Category data
41. Finally
Thank you for listening!!
If you have any ideas or questions, please contact us.
Let’s talk about Category with us!!
Suguru Suzuki: @sugsuzuki, sugru.suzuki@mail.rakuten.com
Yuhei Nishioka: @nishiokamegane, yuhei.nishioka@mail.rakuten.com