
Theory of Domain Adaptation

https://www.alexkulesza.com/pubs/adapt_mlj10.pdf

License: CC Attribution License


- 1. Theory of Domain Adaptation Mark Chang 2019/09/09
- 2. Outlines • Generalization Bound of Learning from Single Domain • Problem of Domain Adaptation • Generalization Bound of Domain Adaptation • Domain Adaptation Example
- 3. Generalization Bound of Learning from Single Domain [Diagram: training data and testing data are both sampled from the same data. The training algorithm changes the hypothesis $h$ to minimize the training error; the same hypothesis $h$ is then evaluated on the testing data, which gives the testing error $\epsilon(h)$.]
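- 4. Generalization Bound of Learning from Single Domain • Learning is feasible when a small training error $\hat{\epsilon}(h)$ implies a small testing error $\epsilon(h)$ • With probability $1-\delta$, the following inequality is satisfied: $\epsilon(h) \le \hat{\epsilon}(h) + \sqrt{\frac{8}{n}\log\left(\frac{4(2n)^d}{\delta}\right)}$, where $n$ is the number of training instances and $d$ is the VC dimension (model complexity)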
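The complexity term of the single-domain bound on slide 4 can be evaluated numerically to see how it shrinks as the number of training instances grows. The following is a minimal sketch, assuming the reconstructed form $\sqrt{\frac{8}{n}\log\frac{4(2n)^d}{\delta}}$; the function name `vc_generalization_gap` and the sample values of `n`, `d`, and `delta` are illustrative and do not come from the slides.

```python
import math


def vc_generalization_gap(n: int, d: int, delta: float) -> float:
    """Return sqrt((8 / n) * log(4 * (2n)^d / delta)).

    n     -- number of training instances
    d     -- VC dimension of the hypothesis class
    delta -- the bound holds with probability at least 1 - delta
    """
    if n <= 0 or d < 0 or not (0.0 < delta < 1.0):
        raise ValueError("need n > 0, d >= 0 and 0 < delta < 1")
    # Work in log space so that (2n)^d never has to be formed explicitly.
    log_term = math.log(4.0) + d * math.log(2.0 * n) - math.log(delta)
    return math.sqrt(8.0 / n * log_term)


if __name__ == "__main__":
    # Illustrative values only: the gap shrinks as n grows with d fixed.
    for n in (1_000, 100_000, 10_000_000):
        gap = vc_generalization_gap(n, d=10, delta=0.05)
        print(f"n={n:>10,d}  gap={gap:.4f}")
```

For small `n` the term can exceed 1, in which case the bound says nothing useful about a 0/1 error; it only becomes informative once `n` is large relative to `d`.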
- 5. Problem of Domain Adaptation https://www.semanticscholar.org/paper/Attribute-Based-Synthetic-Network-(ABS-Net)%3A-more-Lu-Li/2c3138782317a97526a83a7ce264c0c772ddf7e3 • Training data: MNIST • Testing data: MNIST with gray-scale words and background
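- 6. Problem of Domain Adaptation [Diagram: training data and testing data are drawn from the source domain; a second set of testing data is drawn from the target domain.] • The source training error $\hat{\epsilon}_S(h)$ and the source testing error $\epsilon_S(h)$ are related by $\epsilon_S(h) \le \hat{\epsilon}_S(h) + \sqrt{\frac{8}{n}\log\left(\frac{4(2n)^d}{\delta}\right)}$ • The relation between $\epsilon_S(h)$ and the target testing error $\epsilon_T(h)$ is the Generalization Bound of Domain Adaptation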
- 7. Problem of Domain Adaptation • Distance between source feature and target feature [Figure: a source domain compared with target domain 1 (small distance) and target domain 2 (large distance).]
- 8. Problem of Domain Adaptation • Distance between source labeling function and target labeling function • Source domain labels: 1 0 1 0 1 • Target domain 1 labels: 1 0 1 1 0 (small distance) • Target domain 2 labels: 1 1 0 1 0 (large distance) [Figure: each domain shows a row of five features with the labels listed here.]
- 9. Generalization Bound of Domain Adaptation
- 10. Generalization Bound of Domain Adaptation • Source domain data: $D_S$, $f_S$; target domain data: $D_T$, $f_T$ • $\epsilon_T(h) \le \epsilon_S(h) + d_1(D_S, D_T) + \min\left(E_{D_S}[|f_S(x) - f_T(x)|],\ E_{D_T}[|f_S(x) - f_T(x)|]\right)$ • $d_1(D_S, D_T)$ is the distance between source feature $D_S$ and target feature $D_T$ • The $\min(\cdot)$ term is the distance between source labeling function $f_S$ and target labeling function $f_T$
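- 11. The Distance between Source Feature $D_S$ and Target Feature $D_T$ • $d_1(D_S, D_T) = 2 \sup_{B \in \mathcal{B}} \left|\Pr_{D_S}[B] - \Pr_{D_T}[B]\right|$ [Figure: $D_S$ and $D_T$ with a subset $B = B_1 \cup B_2$; $\Pr_{D_S}[B]$ and $\Pr_{D_T}[B]$ are the probabilities of $B$ under each distribution.]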
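- 12. The Distance between Source Feature $D_S$ and Target Feature $D_T$ • $d_1(D_S, D_T) = 2 \sup_{B \in \mathcal{B}} \left|\Pr_{D_S}[B] - \Pr_{D_T}[B]\right|$ [Figure: graphical illustration of the formula over $B$, $D_S$ and $D_T$, in which the total area under each distribution equals 1.]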
- 13. The Distance between Source Feature $D_S$ and Target Feature $D_T$ • Searching for the supremum: $d_1(D_S, D_T) = 2 \sup_{B \in \mathcal{B}} \left|\Pr_{D_S}[B] - \Pr_{D_T}[B]\right|$ [Figure: several candidate subsets $B$ over $D_S$ and $D_T$, built from $B_1$ and $B_2$ (for example $B = B_1 \cup B_2$); one of them is marked as the supremum.]
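The search for the supremum in $d_1$ on slides 11 to 13 can be made concrete for distributions over a small finite set. The sketch below assumes that $\mathcal{B}$ is the collection of all subsets of that finite set, which is a simplification of the measurable subsets used in the slides. Under this assumption the supremum is attained by the set of points where one distribution outweighs the other, so $d_1$ equals $\sum_x |p(x) - q(x)|$. The function names and the example distributions are illustrative.

```python
from itertools import combinations


def d1_brute_force(p, q):
    """2 * max over every subset B of |P(B) - Q(B)| (finite sample space)."""
    n = len(p)
    best = 0.0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            gap = abs(sum(p[i] for i in subset) - sum(q[i] for i in subset))
            best = max(best, gap)
    return 2.0 * best


def d1_closed_form(p, q):
    """Equivalent shortcut for finite sample spaces: the L1 distance."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    # Hypothetical source and target distributions over four outcomes.
    p = [0.5, 0.3, 0.2, 0.0]
    q = [0.1, 0.2, 0.3, 0.4]
    print(round(d1_brute_force(p, q), 6))  # 1.0
    print(round(d1_closed_form(p, q), 6))  # 1.0
```

The brute-force search visits $2^n$ subsets, so it is only practical for tiny examples; it is included here to mirror the "searching for the supremum" picture.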
- 14. The Distance between Source Labeling Function $f_S$ and Target Labeling Function $f_T$ • $\min\left(E_{D_S}[|f_S(x) - f_T(x)|],\ E_{D_T}[|f_S(x) - f_T(x)|]\right)$ • Example 1: source labels 1 0 1 0 1, target labels 1 0 1 1 0, so $E_{D_S}[|f_S(x) - f_T(x)|] = 0.4$ • Example 2: source labels 1 0 1 0 1, target labels 1 1 0 1 0, so $E_{D_S}[|f_S(x) - f_T(x)|] = 0.8$
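The two values on slide 14 can be reproduced by counting label disagreements. This is a minimal sketch, assuming that both $D_S$ and $D_T$ put uniform probability on the same five feature points (the slides show only the labels, so the uniform weighting is an assumption); the function names are illustrative.

```python
def expected_disagreement(f_s, f_t, probs):
    """E[|f_S(x) - f_T(x)|] under the point probabilities in `probs`."""
    return sum(p * abs(a - b) for a, b, p in zip(f_s, f_t, probs))


def labeling_distance(f_s, f_t, probs_s, probs_t):
    """min(E_{D_S}[|f_S - f_T|], E_{D_T}[|f_S - f_T|])."""
    return min(
        expected_disagreement(f_s, f_t, probs_s),
        expected_disagreement(f_s, f_t, probs_t),
    )


if __name__ == "__main__":
    uniform = [0.2] * 5              # assumed: five equally likely points
    source = [1, 0, 1, 0, 1]
    target_1 = [1, 0, 1, 1, 0]       # disagrees with source on 2 of 5 points
    target_2 = [1, 1, 0, 1, 0]       # disagrees with source on 4 of 5 points
    print(round(labeling_distance(source, target_1, uniform, uniform), 3))  # 0.4
    print(round(labeling_distance(source, target_2, uniform, uniform), 3))  # 0.8
```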
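- 15. Problem of $d_1(D_S, D_T)$ • Hard to estimate from finite samples • Can be overestimated • $d_1(D_S, D_T) = 2 \sup_{B \in \mathcal{B}} \left|\Pr_{D_S}[B] - \Pr_{D_T}[B]\right|$ [Figure: $D_S$ and $D_T$ with a subset $B = B_1 \cup B_2 \cup \cdots$ made of many pieces.]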
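- 16. The HΔH-Distance • $d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T) = 2 \sup_{h', h'' \in \mathcal{H}} \left|\Pr_{x \sim D_S}[h'(x) \ne h''(x)] - \Pr_{x \sim D_T}[h'(x) \ne h''(x)]\right|$ [Figure: two hypotheses $h'$ and $h''$ drawn over $D_S$ and $D_T$; they split the feature space into regions where $h'(x) = h''(x)$ and a region where $h'(x) \ne h''(x)$.]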
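- 17. The HΔH-Distance • Searching for the supremum (training): $d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T) = 2 \sup_{h', h'' \in \mathcal{H}} \left|\Pr_{x \sim D_S}[h'(x) \ne h''(x)] - \Pr_{x \sim D_T}[h'(x) \ne h''(x)]\right|$ [Figure: several candidate pairs $h'$, $h''$ over $D_S$ and $D_T$; one pair is marked as the supremum.]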
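- 18. The HΔH-Distance • $d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T)$ can be estimated from finite samples • $d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T) \le \hat{d}_{\mathcal{H}\Delta\mathcal{H}}(U_S, U_T) + 4\sqrt{\frac{1}{m}\log\left(\frac{2(2m)^{2d}}{\delta}\right)}$ • $U_S$: $m$ training samples from the source domain data $D_S$; $U_T$: $m$ training samples from the target domain data $D_T$ • $d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T)$ is the distance between $D_S$ and $D_T$; $\hat{d}_{\mathcal{H}\Delta\mathcal{H}}(U_S, U_T)$ is the distance between $U_S$ and $U_T$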
- 19. The HΔH-Distance • $d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T)$ can alleviate the problem of over-estimation [Figure: $D_S$ and $D_T$ with a region $B$ and two hypotheses $h'$ and $h''$.]
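The empirical HΔH-distance and the sample-size term from slides 16 to 18 can be sketched for a toy hypothesis class. The code below assumes one-dimensional features and a finite class of threshold classifiers $h_t(x) = 1$ if $x \ge t$; this class, the function names, and the sample values are hypothetical stand-ins chosen for illustration, not the setting used in the slides. The supremum over pairs $h', h''$ becomes a maximum over pairs of thresholds.

```python
import math
from itertools import combinations


def threshold_classifier(t, x):
    """Hypothetical hypothesis h_t(x): predict 1 when x >= t, else 0."""
    return 1 if x >= t else 0


def empirical_hdh_distance(u_s, u_t, thresholds):
    """2 * max over pairs (h', h'') of the difference between the fraction of
    source samples and of target samples on which h' and h'' disagree."""
    best = 0.0
    for t1, t2 in combinations(thresholds, 2):
        dis_s = sum(threshold_classifier(t1, x) != threshold_classifier(t2, x)
                    for x in u_s) / len(u_s)
        dis_t = sum(threshold_classifier(t1, x) != threshold_classifier(t2, x)
                    for x in u_t) / len(u_t)
        best = max(best, abs(dis_s - dis_t))
    return 2.0 * best


def hdh_sample_term(m, d, delta):
    """4 * sqrt((1 / m) * log(2 * (2m)^(2d) / delta)), computed in log space."""
    log_term = math.log(2.0) + 2 * d * math.log(2.0 * m) - math.log(delta)
    return 4.0 * math.sqrt(log_term / m)


if __name__ == "__main__":
    u_s = [0.1, 0.2, 0.3, 0.4]     # toy source samples
    u_t = [0.6, 0.7, 0.8, 0.9]     # toy target samples
    print(empirical_hdh_distance(u_s, u_t, [0.0, 0.5, 1.0]))  # 2.0
    print(empirical_hdh_distance(u_s, u_s, [0.0, 0.5, 1.0]))  # 0.0
```

In the first call the thresholds 0.0 and 0.5 disagree on every source sample and on no target sample, so the two domains are perfectly separable by this class and the distance reaches its maximum of 2.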
- 20. The Distance between Source Labeling Function $f_S$ and Target Labeling Function $f_T$ • $\lambda = \epsilon_S(h^*) + \epsilon_T(h^*)$, such that $h^* = \arg\min_{h \in \mathcal{H}} \epsilon_S(h) + \epsilon_T(h)$ • Example 1: source labels 1 0 1 0 1, target labels 1 0 1 1 0, $h^*(x)$ = 1 0 1 0 0, so $\lambda = 0.2 + 0.2 = 0.4$ • Example 2: source labels 1 0 1 0 1, target labels 1 1 0 1 0, $h^*(x)$ = 1 0 0 0 0, so $\lambda = 0.4 + 0.4 = 0.8$
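The combined error $\epsilon_S(h^*) + \epsilon_T(h^*)$ on slide 20 can be checked by brute force on the five-point examples. The sketch below assumes that $\mathcal{H}$ contains every possible binary labeling of the five points and that each point is equally likely in both domains; both are assumptions made for illustration, and the function names are hypothetical. Several hypotheses tie for the minimum, and the search keeps the first one in enumeration order.

```python
from itertools import product


def mismatches(h, f):
    """Number of points on which hypothesis h disagrees with labeling f."""
    return sum(hi != fi for hi, fi in zip(h, f))


def ideal_joint_error(f_s, f_t):
    """Minimize eps_S(h) + eps_T(h) over all binary labelings h.

    Returns (combined error, minimizing hypothesis h*). Errors are counted as
    integers and divided once at the end to avoid floating-point ties.
    """
    n = len(f_s)
    best_count, best_h = None, None
    for h in product((0, 1), repeat=n):
        count = mismatches(h, f_s) + mismatches(h, f_t)
        if best_count is None or count < best_count:
            best_count, best_h = count, h
    return best_count / n, best_h


if __name__ == "__main__":
    source = [1, 0, 1, 0, 1]
    print(ideal_joint_error(source, [1, 0, 1, 1, 0]))  # (0.4, (1, 0, 1, 0, 0))
    print(ideal_joint_error(source, [1, 1, 0, 1, 0]))  # (0.8, (1, 0, 0, 0, 0))
```

Wherever the two labeling functions disagree, any hypothesis must be wrong on one of the two domains, so with an unrestricted class the minimum equals the fraction of points on which $f_S$ and $f_T$ differ.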
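- 21. Generalization Bound of Domain Adaptation • With $d_1$: $\epsilon_T(h) \le \epsilon_S(h) + d_1(D_S, D_T) + \min\left(E_{D_S}[|f_S(x) - f_T(x)|],\ E_{D_T}[|f_S(x) - f_T(x)|]\right)$ • With the HΔH-distance: $\epsilon_T(h) \le \epsilon_S(h) + \frac{1}{2} d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T) + \lambda$, where $\lambda = \epsilon_S(h^*) + \epsilon_T(h^*)$ • In both bounds the second term is the distance between source feature $D_S$ and target feature $D_T$, and the third term is the distance between source labeling function $f_S$ and target labeling function $f_T$ • In the second bound these distances are to be estimated by hypothesis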
- 22. Domain Adaptation Example
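- 23. Domain Adaptation Example • $\epsilon_T(h) \le \epsilon_S(h) + \frac{1}{2} d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T) + \lambda$, where $\lambda = \epsilon_S(h^*) + \epsilon_T(h^*)$ [Figure: "reduce" annotations mark the terms of the bound to be reduced, including $d_{\mathcal{H}\Delta\mathcal{H}}(D_S, D_T)$.]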
- 24. About the Speaker Mark Chang • Email: ckmarkoh at gmail dot com • Facebook: https://www.facebook.com/ckmarkoh.chang
