Uncoupled Regression from Comparison Data
Liyuan Xu
Gatsby Unit@UCL, Former AIP member
(Twitter: @ly9988)
Disclaimer
This talk is mainly based on our NeurIPS 2019 paper
Introduction
Regression Problem
(Coupled) Data: $(x_1, y_1), (x_2, y_2), \dots \sim P_{XY}$
Learn: $f(X) \simeq \mathbb{E}[Y|X]$
Correspondence in data is assumed
Uncoupled Regression Problem
Uncoupled Data: $x_1, x_2, x_3, \dots \sim P_X$ and $y_1, y_2, y_3, \dots \sim P_Y$
Learn: $f(X) \simeq \mathbb{E}[Y|X]$
Regression without data correspondence
Uncoupled Regression
Uncoupled regression is impossible on its own, without further assumptions.
→ What is a practically feasible assumption?
Application of Uncoupled Regression
• Merging two datasets [Carpentier+, 2016]
• $X$: income, $Y$: housing price
• The government publishes $X$; a bank publishes $Y$
How to merge two datasets collected independently?
Application of Uncoupled Regression
• Privacy-Preserving Machine Learning [Xu et al. 2019]
• Consider the case where $Y$ contains sensitive information
• Coupled data $(X_i, Y_i)$ → security incident
• Uncoupled $X_i$ and $Y_i$ → anonymized data
Data Fusion / Matching
Uncoupled Data w. Context $Z$: $(x_1, z_1), (x_2, z_2), \dots \sim P_{XZ}$ and $(y_1, z'_1), (y_2, z'_2), \dots \sim P_{YZ}$
Learn: $f(X) \simeq \mathbb{E}[Y|X]$
Use contextual data $Z$ to merge two distributions
→ Data Fusion / Matching
Isotonic Uncoupled Regression [Carpentier+, 2016]
Uncoupled Data: $x_1, x_2, x_3, \dots \sim P_X$ and $y_1, y_2, y_3, \dots \sim P_Y$
Learn: $f(X) \simeq \mathbb{E}[Y|X]$
Assuming $\mathbb{E}[Y|X]$ is monotonic
Monotonicity makes uncoupled regression feasible
Isotonic Uncoupled Regression [Carpentier+, 2016]
• Advantage
• Consistency is proved [Rigollet et al. 2018]
→ Optimal model can be learned as data increases
• Limitation
• Monotonicity assumption may be too strong
• Are income $X$ and housing price $Y$ really monotonically related?
• Only applicable to the case $X \in \mathbb{R}$
• Need to know the noise distribution
• Solve the problem $Y = f^*(X) + \varepsilon$ with known $P(\varepsilon)$
High-level concept
Message in [Carpentier+, 2016]
Uncoupled Data + Order Info. → Regression
Order info is provided by the monotonicity assumption
Our Idea
Order info is learned from pairwise comparison data
Uncoupled Data + Order Info. → Regression
Problem Setting
• Pairwise Comparison Data
• Originally considered in ranking context
• Sample two data points $(X, Y), (X', Y') \sim P_{X,Y}$
• Obtain Pairwise Comparison Data $(X^+, X^-)$ as
$(X^+, X^-) = \begin{cases} (X, X') & \text{if } Y > Y' \\ (X', X) & \text{if } Y \le Y' \end{cases}$
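The sampling rule above can be sketched in a few lines of Python. This is only an illustration: the sampler `toy_sample` and its distribution (X uniform on [0, 1], Y = 2X plus Gaussian noise) are assumptions made here, not part of the talk. Note that the labels are used only to order each pair and are then discarded.

```python
import random


def make_comparison_pairs(sample_xy, num_pairs, rng):
    """Draw (x_plus, x_minus) pairs; labels y only decide the order of each pair."""
    pairs = []
    for _ in range(num_pairs):
        x, y = sample_xy(rng)
        x_prime, y_prime = sample_xy(rng)
        if y > y_prime:
            pairs.append((x, x_prime))
        else:  # y <= y': the second draw becomes x_plus, as in the definition
            pairs.append((x_prime, x))
    return pairs


def toy_sample(rng):
    """Hypothetical P_{X,Y}: X ~ Unif[0, 1], Y = 2X + Gaussian noise."""
    x = rng.random()
    return x, 2.0 * x + rng.gauss(0.0, 0.1)


pairs = make_comparison_pairs(toy_sample, 5, random.Random(0))
for x_plus, x_minus in pairs:
    print(f"x+ = {x_plus:.3f}, x- = {x_minus:.3f}")
```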
Uncoupled Regression from Pairwise Comparison
Uncoupled Data: $x_1, x_2, x_3, \dots \sim P_X$ and $y_1, y_2, y_3, \dots \sim P_Y$
Pairwise Comparison Data: $(x_1^+, x_1^-), (x_2^+, x_2^-), \dots \sim P_{X^+,X^-}$
Learn: $f(X) \simeq \mathbb{E}[Y|X]$
Uncoupled Regression from Pairwise Comparison
Proposes two approaches:
Risk Approximation & Target Transformation
• Advantage
• Puts no assumption on $\mathbb{E}[Y|X]$
• No need to know the noise distribution
• Limitation
• Not consistent
• Deviation from optimal model is bounded
• Empirically it works
Risk Approximation Approach
Formal Problem Settings
• Data Given:
• Unlabeled Data: $D_X = \{x_1, x_2, \dots, x_n\} \sim P_X$
• Target Set: $D_Y = \{y_1, y_2, \dots, y_n\} \sim P_Y$
• Pairwise Comparison Data: $D_{X^+,X^-} = \{(x_1^+, x_1^-), \dots, (x_m^+, x_m^-)\} \sim P_{X^+,X^-}$
• Goal: Find $f^*$ that satisfies
$f^* = \arg\min_f R(f), \quad R(f) = \mathbb{E}[(f(X) - Y)^2]$
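Risk Approximation
Loss Decomposition
$R(f) = \mathbb{E}_{X,Y}[(f(X) - Y)^2] = \mathbb{E}_X[f^2(X)] - 2\mathbb{E}_{X,Y}[Y f(X)] + \mathrm{const.}$
• $\mathbb{E}_X[f^2(X)]$: estimated from unlabeled data $D_X$
• $\mathbb{E}_{X,Y}[Y f(X)]$: approximated by a linear combination of $\mathbb{E}_{X^+}[f(X^+)]$ and $\mathbb{E}_{X^-}[f(X^-)]$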
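Risk Approximation
Lemma 1 [Xu et al. 2019]
For any function $f$,
$\mathbb{E}_{X^+}[f(X^+)] = 2\mathbb{E}_{X,Y}[F_Y(Y) f(X)]$
$\mathbb{E}_{X^-}[f(X^-)] = 2\mathbb{E}_{X,Y}[(1 - F_Y(Y)) f(X)]$,
where $F_Y$ is the CDF of $Y$
If we can learn $w_1, w_2$ such that
$Y \simeq 2 w_1 F_Y(Y) + 2 w_2 (1 - F_Y(Y))$
then,
$\mathbb{E}_{XY}[Y f(X)] \simeq w_1 \mathbb{E}_{X^+}[f(X^+)] + w_2 \mathbb{E}_{X^-}[f(X^-)]$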
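Lemma 1 can be sanity-checked numerically. The sketch below is a Monte Carlo check on a toy distribution chosen here for convenience (X uniform on [0, 1] and Y = X, so that F_Y(y) = y is known in closed form); the function name `lemma1_check` and the test function f(x) = x² are illustrative assumptions, not taken from the paper.

```python
import random


def lemma1_check(f, num_samples=200_000, seed=0):
    """Monte Carlo estimates of both sides of Lemma 1 on a toy distribution.

    Toy assumption: X ~ Unif[0, 1] and Y = X, so F_Y(y) = y on [0, 1].
    Returns (E[f(X+)], 2E[F_Y(Y) f(X)], E[f(X-)], 2E[(1 - F_Y(Y)) f(X)]).
    """
    rng = random.Random(seed)
    lhs_plus = lhs_minus = rhs_plus = rhs_minus = 0.0
    for _ in range(num_samples):
        x, x_prime = rng.random(), rng.random()
        # Y = X, so comparing labels is the same as comparing features.
        x_plus, x_minus = (x, x_prime) if x > x_prime else (x_prime, x)
        lhs_plus += f(x_plus)
        lhs_minus += f(x_minus)
        # Right-hand sides use the single draw (X, Y) with F_Y(Y) = X.
        rhs_plus += 2.0 * x * f(x)
        rhs_minus += 2.0 * (1.0 - x) * f(x)
    n = float(num_samples)
    return lhs_plus / n, rhs_plus / n, lhs_minus / n, rhs_minus / n


lhs_plus, rhs_plus, lhs_minus, rhs_minus = lemma1_check(lambda x: x * x)
print(f"E[f(X+)] = {lhs_plus:.4f} vs 2E[F f] = {rhs_plus:.4f}")
print(f"E[f(X-)] = {lhs_minus:.4f} vs 2E[(1-F) f] = {rhs_minus:.4f}")
```

The two numbers in each printed line should agree up to sampling noise.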
Risk Approximation
• Step 1: Estimate CDF $\hat F_Y$
• Step 2: Learn weights $\hat w_1, \hat w_2$ for loss
• Step 3: Learn model $\hat f$
Risk Approximation (Step 1)
CDF $F_Y$ is estimated via …
Risk Approximation (Step 2)
Weights $\hat w_1, \hat w_2$ are learned by
$\hat w_1, \hat w_2 = \arg\min_{w_1, w_2} \sum_{i=1}^{|D_Y|} \left( y_i - 2 w_1 \hat F_Y(y_i) - 2 w_2 (1 - \hat F_Y(y_i)) \right)^2$
Recall, we want $Y \simeq 2 w_1 F_Y(Y) + 2 w_2 (1 - F_Y(Y))$
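The weight-fitting step is an ordinary least-squares problem with two features. A minimal sketch follows, assuming the CDF estimate is the standard empirical CDF of the target set (an assumption made here; the slide's own estimator is not shown). The helper names `empirical_cdf` and `fit_weights` are illustrative.

```python
from bisect import bisect_right


def empirical_cdf(ys):
    """Return y -> (number of samples <= y) / n, a standard empirical CDF."""
    sorted_ys = sorted(ys)
    n = len(sorted_ys)
    return lambda y: bisect_right(sorted_ys, y) / n


def fit_weights(ys):
    """Least-squares fit of y ~ 2*w1*F(y) + 2*w2*(1 - F(y)) over the target set.

    Assumes the targets take at least two distinct values; otherwise the
    2x2 normal equations below are singular.
    """
    cdf = empirical_cdf(ys)
    saa = sab = sbb = say = sby = 0.0
    for y in ys:
        a = 2.0 * cdf(y)   # feature multiplying w1
        b = 2.0 - a        # feature multiplying w2, i.e. 2 * (1 - F(y))
        saa += a * a
        sab += a * b
        sbb += b * b
        say += a * y
        sby += b * y
    det = saa * sbb - sab * sab
    w1 = (say * sbb - sby * sab) / det
    w2 = (sby * saa - say * sab) / det
    return w1, w2


# Evenly spaced targets on (1, 5] mimic Y ~ Unif[1, 5].
targets = [1.0 + 4.0 * i / 100 for i in range(1, 101)]
w1_hat, w2_hat = fit_weights(targets)
print(w1_hat, w2_hat)
```

On this evenly spaced toy target set the fit recovers weights close to b/2 and a/2 for a = 1, b = 5.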
Risk Approximation (Step 3)
Model $\hat f$ is learned by
$\hat f = \arg\min_f \frac{1}{|D_X|} \sum_{i=1}^{|D_X|} f(x_i)^2 - \frac{2}{|D_{X^+,X^-}|} \sum_{j=1}^{|D_{X^+,X^-}|} \left( \hat w_1 f(x_j^+) + \hat w_2 f(x_j^-) \right)$
The first term estimates $\mathbb{E}_X[f^2(X)]$; the second term estimates $2\mathbb{E}_{XY}[Y f(X)]$.
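For a linear model the Step 3 objective is quadratic in the parameters, so it can be minimized in closed form. The sketch below does this for a scalar feature and f(x) = θx + c. The toy data (X uniform on [0, 1], Y = 2X + 1, hence Y uniform on [1, 3]) and the function names are assumptions made for illustration; the weights are taken as given rather than fitted.

```python
import random


def fit_linear_risk_approx(xs, pairs, w1, w2):
    """Minimise the Step 3 objective over f(x) = theta * x + c for scalar x.

    Setting the gradient of the quadratic objective to zero gives the system
    [[mean(x^2), mean(x)], [mean(x), 1]] @ [theta, c] = [est_yx, est_y].
    """
    n, m = len(xs), len(pairs)
    mean_x = sum(xs) / n
    mean_x2 = sum(x * x for x in xs) / n
    # Pairwise estimate of E[Y * X]; with f = 1 the same estimate gives w1 + w2.
    est_yx = sum(w1 * xp + w2 * xm for xp, xm in pairs) / m
    est_y = w1 + w2
    det = mean_x2 - mean_x * mean_x
    theta = (est_yx - mean_x * est_y) / det
    c = (mean_x2 * est_y - mean_x * est_yx) / det
    return theta, c


def toy_dataset(n, seed=0):
    """Toy assumption: X ~ Unif[0, 1], Y = 2X + 1, hence Y ~ Unif[1, 3]."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    pairs = []
    for _ in range(n):
        a, b = rng.random(), rng.random()
        # Y is increasing in X, so the larger x is the "winner" x+.
        pairs.append((max(a, b), min(a, b)))
    return xs, pairs


xs, pairs = toy_dataset(100_000)
# For Y ~ Unif[1, 3], the weights (b/2, a/2) = (1.5, 0.5) give zero approximation error.
theta, c = fit_linear_risk_approx(xs, pairs, w1=1.5, w2=0.5)
print(theta, c)
```

No coupled (x, y) pair is used anywhere in the fit; on this toy problem the estimate should land near the true slope 2 and intercept 1.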
Theoretical Property
Theorem 2 [Xu et al. 2019]
For learned $\hat f$, with some assumption,
$R(\hat f) \le R(f^*) + O_p\left( \frac{1}{|D_X|^{1/2}} + \frac{1}{|D_{X^+,X^-}|^{1/2}} \right) + M \, \mathrm{Err}(\hat w_1, \hat w_2)$
Here, $\mathrm{Err}(w_1, w_2)$ is the approximation error
$\mathrm{Err}(w_1, w_2) = \mathbb{E}_Y\left[ (Y - 2 w_1 F_Y(Y) - 2 w_2 (1 - F_Y(Y)))^2 \right]$
→ Approximate loss well, small bias in the model
Theoretical Property
In particular, if $Y \sim \mathrm{Unif}[a, b]$, then $\mathrm{Err}(b/2, a/2) = 0$ in the bound of Theorem 2 [Xu et al. 2019].
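The uniform case can be verified directly from the definition of the approximation error. A short derivation, assuming $a < b$ and using only the CDF of $\mathrm{Unif}[a, b]$:

```latex
\begin{align*}
F_Y(y) &= \frac{y - a}{b - a}, \qquad y \in [a, b], \\
2 \cdot \tfrac{b}{2}\, F_Y(y) + 2 \cdot \tfrac{a}{2}\, \bigl(1 - F_Y(y)\bigr)
  &= a + (b - a)\, F_Y(y) = y, \\
\mathrm{Err}(b/2,\, a/2) &= \mathbb{E}_Y\bigl[(Y - Y)^2\bigr] = 0 .
\end{align*}
```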
Theoretical Property
In general, $\mathrm{Err} > 0$ in the bound of Theorem 2 [Xu et al. 2019].
① Theoretically, it’s inevitable…
② Empirically it works!
Theoretical Property
There exist two distributions
that cannot be distinguished by $P_X$, $P_Y$, $P_{X^+,X^-}$
Theoretical Property
[Figure: probability tables of two joint distributions $P_{XY}$ and $\tilde P_{XY}$ over $(X, Y)$]
Same $P_X$, $P_Y$, $P_{X^+,X^-}$, but $\mathbb{E}_P[Y|X] \ne \mathbb{E}_{\tilde P}[Y|X]$
Empirical Result
• Learn a linear model in UCI datasets
• Uncoupled regression
• Use all features for $D_X$, all targets for $D_Y$
• Note, no correspondence is given
• Generate 5000 pairs for $D_{X^+,X^-}$
• Supervised regression
• Use entire coupled data $(X, Y)$
Empirical Result
• MSE of linear models in UCI datasets
→ Can yield almost the same MSE as supervised learning!
Conclusion So Far
• Uncoupled Regression From Pairwise Comparison
• Solve regression problem given
• Unlabeled data $D_X$
• Set of target values $D_Y$
• Pairwise comparison data $D_{X^+,X^-}$
• Introduced approach based on risk approximation
• Theoretical and empirical results are given
Modeling CDF from Pairwise Comparison Data
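Theoretical Property (Recap)
Theorem 2 [Xu et al. 2019]
For learned $\hat f$, with some assumption,
$R(\hat f) \le R(f^*) + O_p\left( \frac{1}{|D_X|^{1/2}} + \frac{1}{|D_{X^+,X^-}|^{1/2}} \right) + M \, \mathrm{Err}(\hat w_1, \hat w_2)$
In particular, if $Y \sim \mathrm{Unif}[a, b]$, then $\mathrm{Err}(b/2, a/2) = 0$
→ We can learn the optimal model for uniform $Y$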
Predicting Percentile
• Optimize Direct Marketing
• $X$: Customer Feature, $Y$: Probability of Purchase
• Send discount tickets to 1% of potential customers
• CDF $F_Y(Y)$ is more the target of interest than $Y$
• Predicting $Y$ might not be the best idea…
• Due to class imbalance, all $Y$ can be very small
Predicting Percentile
• Sometimes percentile is the target of interest
• Learn $f(X)$ that minimizes $R(f) = \mathbb{E}[(F_Y(Y) - f(X))^2]$
• $F_Y(Y)$ follows $\mathrm{Unif}[0, 1]$
→ We can learn optimal $f$ from pairwise comparison
Motivating Example for Predicting Percentile
• Online Chess Rating
• $X$: User attributes, $Y$: Abstract measure of “Skill”
• Skill is compared by game
• Pairwise comparison data arise naturally
• Want to know the percentile in skill ranking
Simple Solution
• Problem (Recap)
• Given pairwise comparison data $(X^+, X^-)$
• Predict conditional expectation of CDF $\mathbb{E}[F_Y(Y)|X]$
• Simple Solution
• Learn ranking model $r(X)$ from $(X^+, X^-)$
• Transform $r(X)$ to $\mathbb{E}[F_Y(Y)|X]$
Pairwise-Ranking based Approach
• Pairwise Learning to Rank
• Learn ranker $r(X)$ which minimizes rank loss
• e.g. SVMRank, RankBoost
• Given test data $X_{\mathrm{test}}$ and rank model,
$\mathbb{E}[F_Y(Y)|X] \simeq \frac{\text{Rank of } X_{\mathrm{test}} \text{ in entire data}}{\text{Number of entire data}}$
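The rank-to-percentile conversion above can be sketched as follows. The ranker itself is not implemented here; the scores stand in for the outputs r(x) of any trained ranking model and are hypothetical values chosen for illustration.

```python
from bisect import bisect_right


def percentile_from_rank(train_scores, test_score):
    """Fraction of training points whose ranker score is <= the test score."""
    sorted_scores = sorted(train_scores)
    return bisect_right(sorted_scores, test_score) / len(sorted_scores)


# Hypothetical ranker scores r(x) for five training points.
train_scores = [0.2, 1.5, -0.3, 0.9, 2.4]
estimate = percentile_from_rank(train_scores, 1.0)
print(estimate)
```

Three of the five training scores are at or below the test score, so the estimated percentile is 3/5.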
Weakness in Pairwise-Ranking based Approach
• Original goal is to minimize
$R(f) = \mathbb{E}_{X,Y}[(f(X) - F_Y(Y))^2]$,
• Rank model $r(X)$ minimizes the rank loss $R_r(r)$
Small $R_r(r)$ does not necessarily mean small $R(f)$
→ We aim for directly minimizing $R(f)$
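Direct Minimization
Lemma 1 [Xu et al. 2019]
For any function $h$,
$\mathbb{E}_{X^+}[h(X^+)] = 2\mathbb{E}_{X,Y}[F_Y(Y) h(X)]$
$\mathbb{E}_{X^-}[h(X^-)] = 2\mathbb{E}_{X,Y}[(1 - F_Y(Y)) h(X)]$
From this lemma, we have
$R(f) = \mathbb{E}_{X,Y}[(f(X) - F_Y(Y))^2]$
$= \mathbb{E}_X[f^2(X)] - 2\mathbb{E}_{X,Y}[F_Y(Y) f(X)] + \mathrm{const.}$
$= \mathbb{E}_X[f^2(X)] - \mathbb{E}_{X^+}[f(X^+)] + \mathrm{const.}$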
$R(f) \le \hat R(f) + O_p\left( \frac{1}{|D_X|} + \frac{1}{|D_{X^+,X^-}|} \right)$
Empirical Approximation
• The original loss $R(f)$ (without constant)
$R(f) = \mathbb{E}_X[f^2(X)] - \mathbb{E}_{X^+}[f(X^+)]$
• The empirical loss $\hat R(f)$
$\hat R(f) = \frac{1}{|D_X|} \sum_{x_i \in D_X} f^2(x_i) - \frac{1}{|D_{X^+,X^-}|} \sum_{(x_i^+, x_i^-) \in D_{X^+,X^-}} f(x_i^+)$
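The empirical loss is cheap to evaluate for any candidate function. The sketch below compares two candidates on toy data where the true percentile function is known; the data-generating assumption (X uniform on [0, 1], Y = X, so the conditional expectation of F_Y(Y) given X = x is x) and the names `empirical_risk` and `toy_data` are illustrative choices, not from the paper.

```python
import random


def empirical_risk(f, xs, x_plus):
    """R_hat(f): mean of f(x)^2 over D_X minus mean of f(x+) over the pairs."""
    square_term = sum(f(x) ** 2 for x in xs) / len(xs)
    winner_term = sum(f(x) for x in x_plus) / len(x_plus)
    return square_term - winner_term


def toy_data(n, seed=0):
    """Toy assumption: X ~ Unif[0, 1] and Y = X, so E[F_Y(Y) | X = x] = x."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    # Y is increasing in X here, so the winner x+ is the larger of two draws.
    x_plus = [max(rng.random(), rng.random()) for _ in range(n)]
    return xs, x_plus


xs, x_plus = toy_data(20_000)
risk_true = empirical_risk(lambda x: x, xs, x_plus)     # true percentile function
risk_const = empirical_risk(lambda x: 0.5, xs, x_plus)  # constant baseline
print(risk_true, risk_const)
```

Only the winners x+ enter the loss, and the true percentile function should score lower than the constant baseline.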
Summary
• We can learn $\mathbb{E}[F_Y(Y)|X]$ only from $D_X$, $D_{X^+,X^-}$
• Empirical loss to minimize is
$\hat R(f) = \frac{1}{|D_X|} \sum_{x_i \in D_X} f^2(x_i) - \frac{1}{|D_{X^+,X^-}|} \sum_{(x_i^+, x_i^-) \in D_{X^+,X^-}} f(x_i^+)$
Can we use this for the original regression problem?
Target Transform Approach
Target Transformation
• From previous discussion,
• We can learn optimal model for $F_Y(Y)$
• We can learn CDF function $F_Y$.
• Target Transformation Approach [Xu et al. 2019]
1. Learn function $\hat F$ that minimizes $R_F(F) = \mathbb{E}_{X,Y}[(F_Y(Y) - F(X))^2]$
2. Output regression model $\hat f$ as $\hat f = F_Y^{-1}(\hat F(X))$
Target Transformation
• Step 1: Estimate CDF $\hat F_Y$
• Step 2: Learn CDF model $\hat F$
• Step 3: Learn regression model $\hat f$
Target Transformation (Step 1)
CDF $F_Y$ is estimated via …
Target Transformation (Step 2)
Model $\hat F$ is learned by
$\hat F = \arg\min_F \frac{1}{|D_X|} \sum_{i=1}^{|D_X|} F(x_i)^2 - \frac{1}{|D_{X^+,X^-}|} \sum_{j=1}^{|D_{X^+,X^-}|} F(x_j^+)$
The first term estimates $\mathbb{E}_X[F^2(X)]$; the second term estimates $2\mathbb{E}_{XY}[F_Y(Y) F(X)]$.
Target Transformation (Step 3)
Model $\hat f$ is learned by
$\hat f = F_Y^{-1}(\hat F(X))$
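The final transformation step can be sketched with an empirical quantile function standing in for the inverse CDF. This is an assumption made for illustration: the slide writes the inverse of F_Y, and using the generalised inverse of the empirical CDF of the target set is one natural plug-in choice. The percentile model passed in is a hypothetical placeholder for a learned F-hat.

```python
import math


def empirical_quantile(sorted_ys, p):
    """Generalised inverse of the empirical CDF: smallest y with F_hat(y) >= p."""
    n = len(sorted_ys)
    p = min(max(p, 0.0), 1.0)  # a learned percentile may fall outside [0, 1]
    index = max(math.ceil(p * n) - 1, 0)
    return sorted_ys[index]


def target_transform(cdf_model, ys):
    """Compose a learned percentile model with the empirical quantile function."""
    sorted_ys = sorted(ys)  # sort once, reuse for every prediction
    return lambda x: empirical_quantile(sorted_ys, cdf_model(x))


# Hypothetical target set and percentile model (here simply F_hat(x) = x).
target_set = [30.0, 10.0, 40.0, 20.0]
f_hat = target_transform(lambda x: x, target_set)
print(f_hat(0.5), f_hat(0.51))
```

Clipping the predicted percentile to [0, 1] is a design choice of this sketch, needed because an unconstrained model for F-hat is not guaranteed to stay inside the unit interval.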
Experiment on UCI
• RA: Risk Approximation
• TT: Target Transformation
• SVMRank: TT approach where $\hat F$ is learned based on SVMRank
Conclusion
• Uncoupled Regression From Pairwise Comparison
• Solve regression problem given
• Unlabeled data $D_X$
• Set of target values $D_Y$
• Pairwise comparison data $D_{X^+,X^-}$
• Approach based on risk approximation
• Theoretical and empirical results are given
• Approach based on target transformation
• (Theoretical) and empirical results are given
Thank you!
• Follow me on Twitter! (@ly9988)

Uncoupled Regression from Pairwise Comparison Data

  • 1. Uncoupled Regression from Comparison Data Liyuan Xu Gatsby Unit@UCL, Former AIP member (Twitter: @ly9988)
  • 2. Disclaimer This talk is mainly based on our paper in NeurIPS2019
  • 4. Regression Problem (x1, y1), (x2, y2), … (Coupled) Data ∼ PXY f(X) ≃ 𝔼[Y|X] Learn Correspondence in data is assumed
  • 5. Uncoupled Regression Problem Uncoupled Data ∼ PX x1, x2, x3, … ∼ PY y1, y2, y3, … f(X) ≃ 𝔼[Y|X] Learn Regression without data correspondence
  • 6. Uncoupled Regression Uncoupled regression is impossible itself. →What is a practically feasible assumption?
  • 7. Application of Uncoupled Regression • Merging two datasets [Carpentier+, 2016] • : income, housing priceX Y : Government Publish X Bank Publish Y How to merge two datasets collected independently?
  • 8. Application of Uncoupled Regression • Privacy-Preserving Machine Learning [Xu et al. 2019] • Consider the case where $Y$ contains sensitive information • Coupled data $(X_i, Y_i)$: Security Incident
  • 9. Application of Uncoupled Regression • Privacy-Preserving Machine Learning [Xu et al. 2019] • Consider the case where $Y$ contains sensitive information • $X_i$, $Y_i$: Anonymized Data
  • 10. Data Fusion / Matching: Uncoupled data w. context $Z$: $(x_1, z_1), (x_2, z_2), \ldots \sim P_{XZ}$ and $(y_1, z'_1), (y_2, z'_2), \ldots \sim P_{YZ}$. Learn $f(X) \simeq \mathbb{E}[Y|X]$. Use contextual data $Z$ to merge two distributions → Data Fusion / Matching
  • 11. Isometric Uncoupled Regression [Carpentier+, 2016]: Uncoupled data $x_1, x_2, x_3, \ldots \sim P_X$ and $y_1, y_2, y_3, \ldots \sim P_Y$. Learn $f(X) \simeq \mathbb{E}[Y|X]$, assuming $\mathbb{E}[Y|X]$ is monotonic. Monotonicity makes uncoupled regression feasible.
  • 12. Isometric Uncoupled Regression [Carpentier+, 2016] • Advantage • Consistency is proved [Rigollet et al. 2018] → Optimal model can be learned as data increases • Limitation • Monotonicity assumption may be too strong • Is income $X$ really monotonic in housing price $Y$? • Only applicable to the case $X \in \mathbb{R}$ • Need to know the noise distribution • Solves the problem $Y = f^*(x) + \varepsilon$ with known $P(\varepsilon)$
  • 13. High-level concept • Message in [Carpentier+, 2016]: Uncoupled Data + Order Info. → Regression; order info is provided by the monotonicity assumption • Our Idea: Uncoupled Data + Order Info. → Regression; order info is learned from pairwise comparison data
  • 14. Problem Setting • Pairwise Comparison Data • Originally considered in ranking context • Sample two data points $(X, Y), (X', Y') \sim P_{X,Y}$ • Obtain pairwise comparison data $(X^+, X^-)$ as: $X^+ = X,\ X^- = X'$ if $Y > Y'$; $X^+ = X',\ X^- = X$ if $Y \le Y'$
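The construction on slide 14 can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `make_comparison_pairs`, the synthetic data, and the choice to draw pairs by resampling indices of a finite coupled dataset are all assumptions made here.

```python
import numpy as np

def make_comparison_pairs(x, y, n_pairs, rng):
    """Order randomly drawn pairs of rows of x (2-D array) by their targets y."""
    i = rng.integers(0, len(x), size=n_pairs)   # indices of (X, Y)
    j = rng.integers(0, len(x), size=n_pairs)   # indices of (X', Y')
    first_wins = (y[i] > y[j])[:, None]         # Y > Y'  ->  X+ = X, X- = X'
    x_plus = np.where(first_wins, x[i], x[j])   # otherwise X+ = X'
    x_minus = np.where(first_wins, x[j], x[i])  # otherwise X- = X
    return x_plus, x_minus

# Hypothetical synthetic data, used only to exercise the function.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
x = rng.normal(size=(1000, 3))
y = x @ w_true
x_plus, x_minus = make_comparison_pairs(x, y, n_pairs=5000, rng=rng)
```

Only `x_plus` and `x_minus` would be handed to the learner; the targets used to order each pair are discarded.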
  • 15. Uncoupled Regression from Pairwise Comparison: Uncoupled data $x_1, x_2, x_3, \ldots \sim P_X$ and $y_1, y_2, y_3, \ldots \sim P_Y$, plus pairwise comparison data $(x_1^+, x_1^-), (x_2^+, x_2^-), \ldots \sim P_{X^+,X^-}$. Learn $f(X) \simeq \mathbb{E}[Y|X]$.
  • 16. Uncoupled Regression from Pairwise Comparison: Proposes two approaches: Risk Approximation & Target Transformation • Advantage • Puts no assumption on $\mathbb{E}[Y|X]$ • No need to know the noise distribution • Limitation • Not consistent • Deviation from the optimal model is bounded • Empirically it works
  • 18. Formal Problem Settings • Data Given: • Unlabeled Data: $D_X = \{x_1, x_2, \ldots, x_n\} \sim P_X$ • Target Set: $D_Y = \{y_1, y_2, \ldots, y_n\} \sim P_Y$ • Pairwise Comparison Data: $D_{X^+,X^-} = \{(x_1^+, x_1^-), \ldots, (x_m^+, x_m^-)\} \sim P_{X^+,X^-}$ • Goal: Find $f^*$ that satisfies $f^* = \arg\min_f R(f)$, where $R(f) = \mathbb{E}[(f(X) - Y)^2]$
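  • 19. Risk Approximation: Loss Decomposition $R(f) = \mathbb{E}_{X,Y}[(f(X) - Y)^2] = \mathbb{E}_X[f^2(X)] - 2\mathbb{E}_{X,Y}[Y f(X)] + \mathrm{const}$. • $\mathbb{E}_X[f^2(X)]$: estimated from unlabeled data $D_X$ • $\mathbb{E}_{X,Y}[Y f(X)]$: approximated by a linear combination of $\mathbb{E}_{X^+}[f(X^+)]$ and $\mathbb{E}_{X^-}[f(X^-)]$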
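  • 20. Risk Approximation: Lemma 1 [Xu et al. 2019] For any function $f$, $\mathbb{E}_{X^+}[f(X^+)] = 2\mathbb{E}_{X,Y}[F_Y(Y) f(X)]$ and $\mathbb{E}_{X^-}[f(X^-)] = 2\mathbb{E}_{X,Y}[(1 - F_Y(Y)) f(X)]$, where $F_Y$ is the CDF of $Y$. If we can learn $w_1, w_2$ such that $Y \simeq 2 w_1 F_Y(Y) + 2 w_2 (1 - F_Y(Y))$, then $\mathbb{E}_{XY}[Y f(X)] \simeq w_1 \mathbb{E}_{X^+}[f(X^+)] + w_2 \mathbb{E}_{X^-}[f(X^-)]$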
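Lemma 1 on slide 20 can be sanity-checked numerically. The sketch below is a Monte Carlo illustration under assumptions chosen here for convenience: $Y \sim \mathrm{Unif}[0,1]$ so that $F_Y(y) = y$ is known exactly, $X = Y + \text{noise}$, and `np.tanh` as an arbitrary test function.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Two independent draws (X, Y) and (X', Y'); Y ~ Unif[0, 1], hence F_Y(y) = y.
y, y2 = rng.uniform(size=n), rng.uniform(size=n)
x = y + 0.1 * rng.normal(size=n)
x2 = y2 + 0.1 * rng.normal(size=n)

# Build the comparison pair (X+, X-) from the two draws.
x_plus = np.where(y > y2, x, x2)
x_minus = np.where(y > y2, x2, x)

f = np.tanh  # arbitrary test function

lhs_plus = f(x_plus).mean()              # estimates E[f(X+)]
rhs_plus = 2 * (y * f(x)).mean()         # estimates 2 E[F_Y(Y) f(X)]
lhs_minus = f(x_minus).mean()            # estimates E[f(X-)]
rhs_minus = 2 * ((1 - y) * f(x)).mean()  # estimates 2 E[(1 - F_Y(Y)) f(X)]

print(lhs_plus, rhs_plus, lhs_minus, rhs_minus)
```

Each left-hand estimate should agree with its right-hand counterpart up to sampling noise.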
  • 21. Risk Approximation • Step 1: Estimate CDF $\hat F_Y$ • Step 2: Learn weights $\hat w_1, \hat w_2$ for loss • Step 3: Learn model $\hat f$
  • 22. Risk Approximation • Step 1: Estimate CDF $\hat F_Y$ • Step 2: Learn weights $\hat w_1, \hat w_2$ for loss • Step 3: Learn model $\hat f$ • CDF $F_Y$ is estimated via
  • 23. Risk Approximation • Step 1: Estimate CDF $\hat F_Y$ • Step 2: Learn weights $\hat w_1, \hat w_2$ for loss • Step 3: Learn model $\hat f$ • Weights $\hat w_1, \hat w_2$ are learned by $\hat w_1, \hat w_2 = \arg\min_{w_1, w_2} \sum_{i=1}^{|D_Y|} \left(y_i - 2 w_1 \hat F_Y(y_i) - 2 w_2 (1 - \hat F_Y(y_i))\right)^2$ • Recall, we want $Y \simeq 2 w_1 F_Y(Y) + 2 w_2 (1 - F_Y(Y))$
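The weight-fitting step on slide 23 is an ordinary least-squares problem in $(w_1, w_2)$. A minimal sketch follows; using the empirical CDF of $D_Y$ for $\hat F_Y$ is an assumption made here (the transcript does not show the estimator), and the helper names are hypothetical.

```python
import numpy as np

def empirical_cdf(sample, query):
    """Fraction of sample values <= each query value (assumed estimator of F_Y)."""
    sorted_sample = np.sort(sample)
    return np.searchsorted(sorted_sample, query, side="right") / len(sample)

def fit_weights(d_y):
    """Least-squares fit of y ~ 2*w1*F(y) + 2*w2*(1 - F(y)) over the target set."""
    cdf = empirical_cdf(d_y, d_y)
    design = np.column_stack([2 * cdf, 2 * (1 - cdf)])
    w, *_ = np.linalg.lstsq(design, d_y, rcond=None)
    return w[0], w[1]

# Hypothetical target set: Y ~ Unif[2, 6].
rng = np.random.default_rng(0)
d_y = rng.uniform(2.0, 6.0, size=5000)
w1, w2 = fit_weights(d_y)
print(w1, w2)
```

For this uniform target set the fitted weights should land near $(b/2, a/2) = (3, 1)$, the pair for which the approximation is exact.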
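  • 24. Risk Approximation • Step 1: Estimate CDF $\hat F_Y$ • Step 2: Learn weights $\hat w_1, \hat w_2$ for loss • Step 3: Learn model $\hat f$ • Model $f$ is learned by $\hat f = \arg\min_f \frac{1}{|D_X|} \sum_{i=1}^{|D_X|} f(x_i)^2 - \frac{2}{|D_{X^+,X^-}|} \sum_{j=1}^{|D_{X^+,X^-}|} \left(\hat w_1 f(x_j^+) + \hat w_2 f(x_j^-)\right)$, where the first term estimates $\mathbb{E}_X[f^2(X)]$ and the second estimates $2\mathbb{E}_{XY}[Y f(X)]$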
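For a linear model $f(x) = \theta^\top x$ the objective on slide 24 is quadratic in $\theta$ and has a closed-form minimizer. The sketch below is an illustration under assumptions made here: the function name `fit_linear_ra` is hypothetical, the data are synthetic with $Y \sim \mathrm{Unif}[0,1]$, and the weights are set to the exact values $(1/2, 0)$ for that distribution rather than estimated.

```python
import numpy as np

def fit_linear_ra(d_x, x_plus, x_minus, w1, w2):
    """Minimise (1/n) sum f(x_i)^2 - (2/m) sum (w1 f(x+_j) + w2 f(x-_j)), f(x) = theta @ x."""
    second_moment = d_x.T @ d_x / len(d_x)                           # estimates E[x x^T]
    cross_term = w1 * x_plus.mean(axis=0) + w2 * x_minus.mean(axis=0)
    return np.linalg.solve(second_moment, cross_term)                # zero of the gradient

rng = np.random.default_rng(0)
n = 200_000

def sample(size):
    """Synthetic coupled draw: Y ~ Unif[0, 1], X = Y + noise, plus a bias feature."""
    y = rng.uniform(size=size)
    x = y + 0.1 * rng.normal(size=size)
    return np.column_stack([x, np.ones(size)]), y

d_x, _ = sample(n)                         # unlabeled data (targets discarded)
xa, ya = sample(n)
xb, yb = sample(n)
a_wins = (ya > yb)[:, None]
x_plus = np.where(a_wins, xa, xb)
x_minus = np.where(a_wins, xb, xa)

theta_ra = fit_linear_ra(d_x, x_plus, x_minus, w1=0.5, w2=0.0)
theta_sup, *_ = np.linalg.lstsq(xa, ya, rcond=None)  # supervised reference fit
print(theta_ra, theta_sup)
```

In this uniform setting the two parameter vectors should nearly coincide; for other target distributions the approximation error in the weights introduces a bias.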
  • 25. Theoretical Property: Theorem 2 [Xu et al. 2019] For learned $\hat f$, with some assumption, $R(\hat f) \le R(f^*) + O_p\left(\frac{1}{|D_X|^{1/2}} + \frac{1}{|D_{X^+,X^-}|^{1/2}}\right) + M\,\mathrm{Err}(\hat w_1, \hat w_2)$. Here, $\mathrm{Err}(w_1, w_2)$ is the approximation error $\mathrm{Err}(w_1, w_2) = \mathbb{E}_Y\left[(Y - 2 w_1 F_Y(Y) - 2 w_2 (1 - F_Y(Y)))^2\right]$ → Approximate loss well, small bias in the model
  • 26. Theoretical Property: Theorem 2 [Xu et al. 2019] For learned $\hat f$, with some assumption, $R(\hat f) \le R(f^*) + O_p\left(\frac{1}{|D_X|^{1/2}} + \frac{1}{|D_{X^+,X^-}|^{1/2}}\right) + M\,\mathrm{Err}(\hat w_1, \hat w_2)$. Especially, if $Y \sim \mathrm{Unif}[a, b]$ then $\mathrm{Err}(b/2, a/2) = 0$
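The uniform case on slide 26 follows from the explicit form of the CDF. A short worked derivation, using only the definitions of $F_Y$ and $\mathrm{Err}$ from the slides:

```latex
F_Y(y) = \frac{y - a}{b - a} \quad (a \le y \le b)
\;\Longrightarrow\;
y = a + (b - a)\,F_Y(y)
  = 2 \cdot \tfrac{b}{2}\, F_Y(y) + 2 \cdot \tfrac{a}{2}\,\bigl(1 - F_Y(y)\bigr),
\qquad\text{so}\qquad
\mathrm{Err}\!\left(\tfrac{b}{2}, \tfrac{a}{2}\right)
  = \mathbb{E}_Y\!\left[(Y - Y)^2\right] = 0 .
```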
  • 27. Theoretical Property: Theorem 2 [Xu et al. 2019] For learned $\hat f$, with some assumption, $R(\hat f) \le R(f^*) + O_p\left(\frac{1}{|D_X|^{1/2}} + \frac{1}{|D_{X^+,X^-}|^{1/2}}\right) + M\,\mathrm{Err}(\hat w_1, \hat w_2)$. In general, $\mathrm{Err} > 0$ ① Theoretically, it’s inevitable… ② Empirically it works!
  • 28. Theoretical Property: There exist two distributions that cannot be distinguished by $P_X$, $P_Y$, $P_{X^+,X^-}$
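  • 29. Theoretical Property: Two joint distributions $P_{XY}$ and $\tilde P_{XY}$ over $(X, Y)$, with cell probabilities (in extraction order) 1/6 1/8 5/24 1/4 1/8 5/24 1/6 1/6 1/6 1/6 1/6 1/12. Same $P_X$, $P_Y$, $P_{X^+,X^-}$, but $\mathbb{E}_P[Y|X] \neq \mathbb{E}_{\tilde P}[Y|X]$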
  • 30. Empirical Result • Learn a linear model in UCI datasets • Uncoupled regression • Use all features for $D_X$, all targets for $D_Y$ • Note, no correspondence is given • Generate 5000 pairs for $D_{X^+,X^-}$ • Supervised regression • Use entire coupled data $(X, Y)$
  • 31. Empirical Result • MSE of linear models in UCI datasets → Can yield almost the same MSE as supervised learning!
  • 32. Conclusion So Far • Uncoupled Regression From Pairwise Comparison • Solve regression problem given • Unlabeled data $D_X$ • Set of target values $D_Y$ • Pairwise comparison data $D_{X^+,X^-}$ • Introduced approach based on risk approximation • Theoretical and empirical results are given
  • 33. Modeling CDF from Pairwise Comparison Data
  • 34. Theoretical Property (Recap): Theorem 2 [Xu et al. 2019] For learned $\hat f$, with some assumption, $R(\hat f) \le R(f^*) + O_p\left(\frac{1}{|D_X|^{1/2}} + \frac{1}{|D_{X^+,X^-}|^{1/2}}\right) + M\,\mathrm{Err}(\hat w_1, \hat w_2)$. Especially, if $Y \sim \mathrm{Unif}[a, b]$ then $\mathrm{Err}(b/2, a/2) = 0$ → We can learn optimal $Y$
  • 35. Predicting Percentile • Optimize Direct Marketing • $X$: Customer Feature, $Y$: Probability of Purchase • Send discount tickets to 1% of potential customers • CDF $F_Y(Y)$ is more the target of interest than $Y$ • Predicting $Y$ might not be the best idea… • Due to class imbalance, all $Y$ can be very small
  • 36. Predicting Percentile • Sometimes percentile is the target of interest • Learn $f(X)$ that minimizes $R(f) = \mathbb{E}[(F_Y(Y) - f(X))^2]$ • $F_Y(Y)$ follows $\mathrm{Unif}[0, 1]$ → We can learn optimal $f$ from pairwise comparison
  • 37. Motivating Example for Predicting Percentile • Online Chess Rating • $X$: User attributes, $Y$: Abstract measure of “Skill” • Skill is compared by game • Pairwise comparison data is given naturally • Want to know the percentile in skill ranking
  • 38. Simple Solution • Problem (Recap) • Given pairwise comparison data $(X^+, X^-)$ • Predict conditional expectation of CDF $\mathbb{E}[F_Y(Y)|X]$ • Simple Solution • Learn ranking model $r(X)$ from $(X^+, X^-)$ • Transform $r(X)$ to $\mathbb{E}[F_Y(Y)|X]$
  • 39. Pairwise-Ranking based Approach • Pairwise Learn to Rank • Learn ranker $r(X)$ which minimizes rank loss • e.g. SVMRank, RankBoost • Given test data $X_{\mathrm{test}}$ and rank model, $\mathbb{E}[F_Y(Y)|X] \simeq \frac{\text{Rank of } X_{\mathrm{test}} \text{ in entire data}}{\text{Number of entire data}}$
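The rank-to-percentile step on slide 39 might look like the following. This is a sketch under assumptions made here: the ranker's scores are taken as given (no ranker is trained), a higher score is assumed to mean a higher rank, and `percentile_from_scores` is a hypothetical name.

```python
import numpy as np

def percentile_from_scores(train_scores, test_scores):
    """Rank of each test score among the training scores, divided by their count."""
    sorted_scores = np.sort(train_scores)
    rank = np.searchsorted(sorted_scores, test_scores, side="right")
    return rank / len(sorted_scores)

# Hypothetical ranker outputs r(x) on the training data and on three test points.
train_scores = np.array([0.3, -1.2, 2.5, 0.9])
percentiles = percentile_from_scores(train_scores, np.array([1.0, -5.0, 3.0]))
print(percentiles)  # [0.75 0.   1.  ]
```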
  • 40. Weakness in Pairwise-Ranking based Approach • Original goal is to minimize $R(f) = \mathbb{E}_{X,Y}[(f(X) - F_Y(Y))^2]$ • Rank model $r(X)$ minimizes $R_r(r)$ • Small $R_r(r)$ does not necessarily mean small $R(f)$ → We aim for directly minimizing $R(f)$
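  • 41. Direct Minimization: Lemma 1 [Xu et al. 2019] For any function $h$, $\mathbb{E}_{X^+}[h(X^+)] = 2\mathbb{E}_{X,Y}[F_Y(Y) h(X)]$ and $\mathbb{E}_{X^-}[h(X^-)] = 2\mathbb{E}_{X,Y}[(1 - F_Y(Y)) h(X)]$. From this lemma, we have $R(f) = \mathbb{E}_{X,Y}[(f(X) - F_Y(Y))^2] = \mathbb{E}_X[f^2(X)] - 2\mathbb{E}_{X,Y}[F_Y(Y) f(X)] + \mathrm{const} = \mathbb{E}_X[f^2(X)] - \mathbb{E}_{X^+}[f(X^+)] + \mathrm{const}$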
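  • 42. Empirical Approximation • The original loss (without constant): $R(f) = \mathbb{E}_X[f^2(X)] - \mathbb{E}_{X^+}[f(X^+)]$ • The empirical loss: $\hat R(f) = \frac{1}{|D_X|} \sum_{x_i \in D_X} f^2(x_i) - \frac{1}{|D_{X^+,X^-}|} \sum_{(x_i^+, x_i^-) \in D_{X^+,X^-}} f(x_i^+)$ • $R(f) \le \hat R(f) + O_p\left(\frac{1}{|D_X|} + \frac{1}{|D_{X^+,X^-}|}\right)$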
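For a linear model the empirical loss on slide 42 also has a closed-form minimizer: setting the gradient of $\theta^\top A \theta - c^\top \theta$ to zero gives $\theta = \tfrac{1}{2} A^{-1} c$. The sketch below is illustrative only; the function name, the noise-free synthetic data ($Y = X$ with $X \sim \mathrm{Unif}[0,1]$, so $\mathbb{E}[F_Y(Y)|X] = X$), and the bias feature are assumptions made here.

```python
import numpy as np

def fit_linear_cdf_model(d_x, x_plus):
    """Minimise (1/n) sum F(x_i)^2 - (1/m) sum F(x+_j) for F(x) = theta @ x."""
    second_moment = d_x.T @ d_x / len(d_x)   # A, estimates E[x x^T]
    plus_mean = x_plus.mean(axis=0)          # c, estimates E[x+]
    return 0.5 * np.linalg.solve(second_moment, plus_mean)

rng = np.random.default_rng(0)
n = 200_000

def features(values):
    """Stack the raw value with a constant bias feature."""
    return np.column_stack([values, np.ones(len(values))])

# Hypothetical setting: Y = X, so the larger X of each pair is X+.
d_x = features(rng.uniform(size=n))
xa, xb = rng.uniform(size=n), rng.uniform(size=n)
x_plus = features(np.maximum(xa, xb))

theta = fit_linear_cdf_model(d_x, x_plus)
print(theta)
```

In this setting the fitted parameters should be close to slope 1 and intercept 0, i.e. $\hat F(x) \approx x$.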
  • 43. Summary • We can learn $\mathbb{E}[F_Y(Y)|X]$ only from $D_X$, $D_{X^+,X^-}$ • Empirical loss to minimize is $\hat R(f) = \frac{1}{|D_X|} \sum_{x_i \in D_X} f^2(x_i) - \frac{1}{|D_{X^+,X^-}|} \sum_{(x_i^+, x_i^-) \in D_{X^+,X^-}} f(x_i^+)$ • Can we use this for the original regression problem?
  • 45. Target Transformation • From previous discussion, • We can learn optimal model for $F_Y(Y)$ • We can learn CDF function $F_Y$ • Target Transformation Approach [Xu et al. 2019] 1. Learn function $\hat F$ that minimizes $R_F(F) = \mathbb{E}_{X,Y}[(F_Y(Y) - F(X))^2]$ 2. Output regression model $\hat f$ as $\hat f = F_Y^{(-1)}(\hat F(X))$
  • 46. Target Transformation • Step 1: Estimate CDF $\hat F_Y$ • Step 2: Learn CDF model $\hat F$ • Step 3: Learn regression model $\hat f$
  • 47. Target Transformation • Step 1: Estimate CDF $\hat F_Y$ • Step 2: Learn CDF model $\hat F$ • Step 3: Learn regression model $\hat f$ • CDF $F_Y$ is estimated via
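  • 48. Target Transformation • Step 1: Estimate CDF $\hat F_Y$ • Step 2: Learn CDF model $\hat F$ • Step 3: Learn regression model $\hat f$ • Model $\hat F$ is learned by $\hat F = \arg\min_F \frac{1}{|D_X|} \sum_{i=1}^{|D_X|} F(x_i)^2 - \frac{1}{|D_{X^+,X^-}|} \sum_{j=1}^{|D_{X^+,X^-}|} F(x_j^+)$, where the first term estimates $\mathbb{E}_X[F^2(X)]$ and the second estimates $2\mathbb{E}_{XY}[F_Y(Y) F(X)]$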
  • 49. Target Transformation • Step 1: Estimate CDF $\hat F_Y$ • Step 2: Learn CDF model $\hat F$ • Step 3: Learn regression model $\hat f$ • Model $f$ is learned by $\hat f = F_Y^{-1}(\hat F(X))$
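The final step on slide 49 maps predicted CDF values back to target values. A minimal sketch, assuming the inverse CDF is approximated by the empirical quantile function of the target set $D_Y$ (the transcript does not show the estimator) and that predictions outside $[0, 1]$ are clipped; `target_transform` is a hypothetical name.

```python
import numpy as np

def target_transform(d_y, cdf_predictions):
    """Apply an empirical inverse CDF of the target set to predicted CDF values."""
    levels = np.clip(cdf_predictions, 0.0, 1.0)  # a learned F(x) may leave [0, 1]
    return np.quantile(d_y, levels)              # empirical quantile function

# Hypothetical target set and predicted CDF values F(x) for three inputs.
d_y = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
preds = target_transform(d_y, np.array([-0.1, 0.5, 1.2]))
print(preds)  # [10. 30. 50.]
```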
  • 50. Experiment on UCI • RA: Risk Approximation • TT: Target Transformation • SVMRank: TT approach where $\hat F$ is learned based on SVMRank
  • 51. Conclusion • Uncoupled Regression From Pairwise Comparison • Solve regression problem given • Unlabeled data $D_X$ • Set of target values $D_Y$ • Pairwise comparison data $D_{X^+,X^-}$ • Approach based on risk approximation • Theoretical and empirical results are given • Approach based on target transformation • (Theoretical) and empirical results are given
  • 52. Thank you! • Follow me on Twitter! (@ly9988)