SalGAN: Visual Saliency Prediction with
Generative Adversarial Networks
arXiv:1701.01081v2 [cs.CV] 9 Jan 2017
Junting Pan, Elisa Sayrol and Xavier Giro-i-Nieto Image Processing Group
Universitat Politecnica de Catalunya (UPC) Barcelona,
abstract
- using BCE as loss (instead of often used MSE)
- adding adversarial loss (seeing our saliency predictor as a generator in
GAN)
- using downsampled predicted saliency map
outline
- motivation
- architecture
- training generator/discriminator
- results
- the impact of BCE
- the impact of downsampling
- adversarial gain
- comparison with SOTA
- qualitative results
- conclusion
motivation
- The diversity of metrics has also resulted in a diversity of loss functions
- MIT300: 8 metrics
- SALICON: 4 metrics (LSUN challenge) + Information Gain
- SalGAN benefits a wide range of metrics, without needing to specify a tailored
loss function.
architecture
generator
- Encoder
- VGG16 (without final pooling, FC)
- pretrained on ImageNet object classification
task
- only the last 2 layers are fine-tuned during saliency task
training (because of limited computational resources)
- Decoder
- same structure as the encoder, in reverse order
- pooling layers are replaced by upsampling layers
- output layer: 1x1 conv + pixel-wise sigmoid
(not softmax)
- weight init: random
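The choice of a pixel-wise sigmoid over a softmax in the output layer can be illustrated with a minimal sketch. The function names and toy values below are hypothetical and not taken from the paper; the point is only that a sigmoid scores each pixel independently in [0, 1], while a softmax forces all pixels to share a total mass of 1.

```python
import math

def sigmoid_map(logits):
    """Apply a sigmoid to each pixel independently."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def softmax_map(logits):
    """Apply a softmax across all pixels (they compete for a total of 1)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Flattened toy logits for a 2x2 map (made-up numbers).
logits = [2.0, 2.0, -1.0, -3.0]
print(sigmoid_map(logits))  # two pixels can both be close to 1
print(softmax_map(logits))  # values always sum to 1
```

With a sigmoid, several regions can be highly salient at the same time, which matches treating each pixel as its own binary decision.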
discriminator
- output: a probability indicating whether the
given saliency map is ground truth
or generated
training Generator
by keeping the discriminator weights constant
training Generator
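D: the probability of
fooling the Discriminator
⇒ the better the generator fools the discriminator,
the smaller the loss.
loss terms: content loss and adversarial loss
- including the content loss makes training more stable
and convergence faster
- hyperparameter α weighting the two terms:
used 0.05
Note: the first 15 epochs are trained with the content loss
only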
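The generator objective described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that α scales the content loss, that the adversarial term is the negative log of the discriminator's "ground truth" probability for the generated map, and the name generator_loss is hypothetical.

```python
import math

ALPHA = 0.05  # weighting hyperparameter quoted on the slide

def generator_loss(content_loss, d_prob, alpha=ALPHA, eps=1e-7):
    """Combine content loss and adversarial loss for the generator.

    content_loss: scalar content loss (e.g. BCE) of the predicted map.
    d_prob: discriminator's probability that the generated map is
            ground truth, i.e. how well the generator fooled it.
    Assumption: alpha multiplies the content term.
    """
    d_prob = min(max(d_prob, eps), 1.0 - eps)  # keep log() finite
    adversarial = -math.log(d_prob)  # small when the discriminator is fooled
    return alpha * content_loss + adversarial

# The better the generator fools the discriminator, the smaller the loss.
print(generator_loss(0.3, 0.9))
print(generator_loss(0.3, 0.1))
```

During this step the discriminator is only evaluated, not updated, in line with keeping its weights constant.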
content loss
mean squared error (baseline)
binary cross entropy (our approach)
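The two content losses can be compared with a small sketch. It assumes saliency maps flattened into lists of values in [0, 1]; the function names and toy values are illustrative and not from the paper.

```python
import math

def mse_loss(pred, target):
    """Mean squared error averaged over all pixels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def bce_loss(pred, target, eps=1e-7):
    """Binary cross entropy averaged over all pixels.

    Each pixel is treated as an independent binary classification;
    eps clamps predictions away from 0 and 1 so log() stays finite.
    """
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total += t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return -total / len(pred)

# Flattened 2x2 toy maps (made-up numbers).
pred = [0.9, 0.2, 0.6, 0.1]
target = [1.0, 0.0, 1.0, 0.0]
print(mse_loss(pred, target), bce_loss(pred, target))
```

BCE penalises confident mistakes much more heavily than MSE, since the log term grows without bound as a prediction approaches the wrong extreme.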
training Discriminator
using generated and ground truth samples
The sign is flipped, so the less the discriminator
is fooled, the lower the loss.
adversarial loss
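The discriminator update can be sketched in the same spirit. This is an assumption-laden illustration rather than the authors' implementation: it uses the standard GAN discriminator loss, where d_real and d_fake are the discriminator's "ground truth" probabilities for a real and a generated saliency map, and the name discriminator_loss is hypothetical.

```python
import math

def discriminator_loss(d_real, d_fake, eps=1e-7):
    """Standard GAN discriminator loss for one real/generated pair.

    d_real: probability assigned to "ground truth" for a real map.
    d_fake: probability assigned to "ground truth" for a generated map.
    The loss is low when d_real is near 1 and d_fake is near 0,
    i.e. when the discriminator is not fooled.
    """
    d_real = min(max(d_real, eps), 1.0 - eps)
    d_fake = min(max(d_fake, eps), 1.0 - eps)
    return -math.log(d_real) - math.log(1.0 - d_fake)

# A discriminator that is not fooled gets a lower loss than an unsure one.
print(discriminator_loss(0.9, 0.1))
print(discriminator_loss(0.5, 0.5))
```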
training
Dataset: SALICON
non-adversarial training
- changing from MSE to BCE
brings an improvement in all
metrics
- treating saliency prediction
as multiple binary
classification problems is more
appropriate
non-adversarial training
- Computing the content loss over
downsampled saliency
maps reduces the
computational resources required
and actually improves
performance.
- the ¼-downsampled
version is used in the later experiments
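Computing the content loss on downsampled maps can be sketched as below. Several details are assumptions made for illustration: average pooling as the downsampling method, "¼" read as a factor of 4 along each axis, and the helper names downsample and downsampled_bce.

```python
import math

def downsample(img, factor):
    """Average-pool a 2D list by `factor` along each axis."""
    h, w = len(img), len(img[0])
    if h % factor or w % factor:
        raise ValueError("image size must be divisible by factor")
    out = []
    for i in range(0, h, factor):
        row = []
        for j in range(0, w, factor):
            block = [img[y][x]
                     for y in range(i, i + factor)
                     for x in range(j, j + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def bce(pred, target, eps=1e-7):
    """Binary cross entropy averaged over two flat lists."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total += t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return -total / len(pred)

def downsampled_bce(pred_map, target_map, factor=4):
    """Downsample both maps, then compute BCE on the smaller grids."""
    small_pred = downsample(pred_map, factor)
    small_target = downsample(target_map, factor)
    flat_pred = [v for row in small_pred for v in row]
    flat_target = [v for row in small_target for v in row]
    return bce(flat_pred, flat_target)

# 4x4 toy maps reduced to a single value each (made-up numbers).
pred_map = [[0.8] * 4 for _ in range(4)]
target_map = [[1.0] * 4 for _ in range(4)]
print(downsampled_bce(pred_map, target_map, factor=4))
```

With a factor of 4 per axis the loss is evaluated on 16 times fewer values, which is where the saving in computation comes from.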
adversarial gain
adversarial gain
- after 100 and 120 epochs, the combined
GAN/BCE loss shows substantial
improvements over BCE for five of six
metrics
- The reason SalGAN fails to improve
NSS may be that GAN training tends to
produce a smoother and more spread-out
estimate of saliency, which may increase
the false positive rate. (NSS checks whether the model
picks up spurious false positives.)
NSS Normalized Scanpath Saliency
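- NSS is very sensitive to false positives.
- it gives low scores to saliency models that respond to irrelevant regions
- in image retrieval applications (feature selection using saliency), having many
false negatives is the worse problem
- NSS is not well suited to this
- reason: a false negative means an important feature has been excluded
- a model with some redundancy, like SalGAN, is better suited
- NSS is differentiable, so it could be optimised directly when important for a
particular application.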
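The sensitivity of NSS to false positives can be seen in a small sketch of the metric. It uses the usual definition (normalise the saliency map to zero mean and unit standard deviation, then average the values at the human fixation locations); the function name nss and the toy maps are illustrative only.

```python
import math

def nss(saliency, fixations):
    """Normalized Scanpath Saliency.

    saliency: 2D list of saliency values.
    fixations: list of (row, col) human fixation locations.
    """
    values = [v for row in saliency for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        raise ValueError("saliency map is constant; NSS is undefined")
    scores = [(saliency[y][x] - mean) / std for y, x in fixations]
    return sum(scores) / len(scores)

fixations = [(1, 1)]            # one fixation, bottom-right pixel
peaked = [[0.0, 0.0],
          [0.0, 1.0]]           # salient only at the fixation
spread = [[0.0, 1.0],
          [0.0, 1.0]]           # also salient at a non-fixated pixel
print(nss(peaked, fixations), nss(spread, fixations))
```

The spread-out map scores lower even though it still covers the fixation, because the extra salient pixel raises the mean and changes the normalisation.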
comparison with SOTA
SalGAN improves on or equals
the performance of all other
models in at least one metric.
qualitative results
1. a successful case: other models fail to detect
saliency.
2. a failure case: SalGAN fails to detect the white ball, as do the
other models
3. limitation of the datasets
a. ground truth: the sign → reading the text (takes more
time)
b. Existing metrics tend to be agnostic to the order in
which areas are attended.
qualitative results
- BCE alone
- locally consistent with the
ground truth
- less smooth
- complex level sets
- over-fitting?
- GAN
- smoother
- simpler level sets
qualitative results
conclusion
- BCE-based content loss is more effective (than MSE) for
- initializing the generator
- regularization term for stabilizing adversarial training
- Adversarial loss improved all metrics except NSS, when compared to
further training on BCE alone.
- Computing the loss on downsampled saliency maps gives improvements and
reduces the computational cost.
- for more performance
- VGG → ResNet
- more careful tuning (particularly the tradeoff between BCE and GAN loss (α))
- ensemble learning (needs more computation, even at prediction time)
- would dark knowledge be effective?
