3. #2 Data: to find Brachial Plexus (BP)
– 420x580 resolution
– 5635 train images with masks, 5508 test;
– ~120 images for each of 47 patients
– 47% of the images don’t have a mask;
– result in RLE encoding
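The RLE submission format above can be sketched as follows; this is a minimal NumPy version assuming the competition's column-major (top-to-bottom, then left-to-right), 1-indexed pixel numbering:

```python
import numpy as np

def rle_encode(mask):
    """Run-length encode a binary mask as 'start length start length ...'.

    Pixels are numbered 1..N in column-major order (top-to-bottom,
    then left-to-right), as in the competition's submission format.
    """
    pixels = mask.flatten(order="F")                   # column-major scan
    padded = np.concatenate([[0], pixels, [0]])
    runs = np.where(padded[1:] != padded[:-1])[0] + 1  # 1-indexed run borders
    runs[1::2] -= runs[::2]                            # length = end - start
    return " ".join(str(x) for x in runs)

mask = np.array([[0, 1],
                 [1, 1]])
print(rle_encode(mask))  # column-major scan 0,1,1,1 -> "2 3"
```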
4. #3 Data: mistakes in the ground truth
– 45 known labeling errors among near-duplicate images
– metric is sensitive to nerve presence mistakes
5. #4 Evaluation
Peculiarities
– (!) Mask presence mistake leads to zero score
– Needs smoothing in the denominator
Loss functions
– 1 – dice
– -dice
– weighted cross entropy (2 classes, per pixel prediction)
Mean mask
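The loss functions above hinge on the smoothed Dice score; a minimal NumPy sketch (the Keras version in the solution operates on tensors, but the arithmetic is the same):

```python
import numpy as np

def dice_coef(y_true, y_pred, smooth=1.0):
    """Soft Dice coefficient with a smoothing term in numerator and denominator.

    The smooth term keeps the score defined (and equal to 1) when both masks
    are empty, which matters here since ~47% of images have no nerve.
    """
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

def dice_loss(y_true, y_pred):
    # either "1 - dice" or "-dice" works as a minimization target
    return 1.0 - dice_coef(y_true, y_pred)
```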
6. #5 Baselines
Score | Description                                | Framework  | Author
----- | ------------------------------------------ | ---------- | ------
0.51  | Empty submission                           | –          | –
0.00  | Top-left pixel                             | –          | –
0.57  | U-Net, at the beginning of the competition | Keras code | Marko Jocic, Kaggler
0.62  | U-Net, almost at the end                   | Torch code | Qure.ai (host)
7. #6 What is U-Net
Overview
– (May 2015 article) “U-Net: Convolutional Networks for Biomedical Image Segmentation”
– Winner of “Grand Challenge for Computer-Automated Detection of Caries in Bitewing
Radiography at ISBI 2015”
– Encoder-decoder architecture with skip connections between matching levels
– Fully convolutional, Drop-out in the middle
– Augmentation: “Smooth deformations using random displacement vectors on a coarse 3 by 3
grid. The displacements are sampled from a Gaussian distribution with 10 pixels standard
deviation.”
8. #7 Another approach, FCN
Overview
– (20 May 2016, article) “Fully Convolutional Networks for Semantic Segmentation”
– VGG-16
– Segmentation prediction on different layers of the net, +upsampling
– Average predictions
9. #8 Starting point, Marko Jocic’s solution
Overview
– Classic U-Net: VGG-like
– Very simple Keras code
– Image resize to 64x80, bicubic interpolation
– Loss = −dice coefficient, averaged per batch, smooth = 1
– Training on whole dataset, no validation
– RLE-encoding function
– Adam optimizer
Training
– 20 epochs, ~30 seconds on a Titan X, memory footprint 800 MB
– Overfits: 0.68 on training -> 0.57 on LB
10. #9 Aspects of the solution: basics
Overfitting basics (+2%)
– Split train/valid, 20% and early stopping patience=5 epochs
• used a random split instead of the more appropriate per-patient split (due to a subtle bug)
– Dropout after each conv layer
General enhancements
– Resolution 64x80 -> 80x112 (+1%)
– ELU instead of ReLU -> faster convergence
11. #10 Aspects of the solution: augmentation
Augmentation*
– flip x, y
– random rotate (5)
– random zoom (0.9, 1.1)
– random channel shift (5.0)
*all transformations should be done with a mask too
All transformations can be applied on the fly with a generator (chosen randomly), but this didn’t
improve results.
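The footnote above, that every geometric transform must be applied to the mask as well, can be sketched as a minimal NumPy example (flips only; the other transforms follow the same pattern):

```python
import numpy as np

def random_flip(image, mask, rng):
    """Apply the same random flips to image and mask so they stay aligned."""
    if rng.random() < 0.5:
        image, mask = image[:, ::-1], mask[:, ::-1]   # flip along x
    if rng.random() < 0.5:
        image, mask = image[::-1, :], mask[::-1, :]   # flip along y
    return image, mask

rng = np.random.default_rng(0)
img = np.arange(12).reshape(3, 4)
msk = (img > 5).astype(np.uint8)
img2, msk2 = random_flip(img, msk, rng)
# mask still marks the same pixels of the (flipped) image
assert (msk2 == (img2 > 5)).all()
```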
Elastic transform:
random displacement fields smoothed by convolving with a Gaussian
Result: no added effect
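The elastic transform above can be sketched as follows; a minimal version in the style of Simard et al., with illustrative `alpha`/`sigma` values that are not taken from the slides:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_transform(image, alpha=34.0, sigma=4.0, rng=None):
    """Elastic deformation: random displacement fields smoothed by a Gaussian.

    alpha scales the displacement magnitude, sigma is the Gaussian std.
    The same (dx, dy) must be reused to warp the corresponding mask.
    """
    rng = rng if rng is not None else np.random.default_rng()
    dx = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    y, x = np.meshgrid(np.arange(image.shape[0]),
                       np.arange(image.shape[1]), indexing="ij")
    coords = np.array([y + dy, x + dx])
    return map_coordinates(image, coords, order=1, mode="reflect")
```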
12. #11 Aspects of the solution: blocks
Modifications of U-Net
– two 3x3 convolutions -> Inception-v3 block
– BNA (batch norm + activation) after each convolution
– BNA after summation
– nxn -> 1xn + nx1 factorized convolutions
Results:
– fewer parameters (1M)
– faster convergence
– LB: +2%
(figures: Inception-v3 block; v3 block with split 1xn/nx1 convolutions)
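The nxn -> 1xn + nx1 factorization above trades one square kernel for two thin ones; a quick parameter count shows the saving (biases ignored, channel count is an example, not from the slides):

```python
def conv_params(kh, kw, c_in, c_out):
    """Number of weights in a conv layer with a kh x kw kernel (no biases)."""
    return kh * kw * c_in * c_out

c = 64  # example channel count
full = conv_params(3, 3, c, c)                             # one 3x3 conv
split = conv_params(1, 3, c, c) + conv_params(3, 1, c, c)  # 1x3 then 3x1
print(full, split)  # 36864 24576 -> a third fewer weights
```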
13. #12 Aspects of the solution: 2nd head, postfilter
2nd head
– mask-presence branch in the middle of the NN (after the encoder part)
• Conv 1x1, sigmoid
• FC=1, sigmoid
– leads to better convergence
Post filter
– presence prob < 0.5 or sum(pixels) < 3000 -> empty mask (+4.5%)
– in the end: combining p_nerve = (p_score + p_segment)/2 -> +0.5%
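The post-filter above can be sketched as follows; threshold values are the ones quoted on the slide, the function name is hypothetical:

```python
import numpy as np

def postfilter(mask, p_score, p_segment, prob_thr=0.5, area_thr=3000):
    """Zero out a predicted mask when the nerve is probably absent.

    p_score comes from the presence head, p_segment from the segmentation
    output; averaging them gave a further +0.5% per the slides.
    """
    p_nerve = (p_score + p_segment) / 2.0
    if p_nerve < prob_thr or mask.sum() < area_thr:
        return np.zeros_like(mask)
    return mask
```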
14. #13 Aspects of the solution: other
Modifications
– Skip connection with Residual blocks (+1%)
– Max pool -> Conv 3x3 with stride=2 (+BNA)
– Ensemble (+1%)
• k-fold 5,6,7,8, average
– Prediction on augmented versions of test images (averaging)
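The test-time augmentation in the last bullet can be sketched as averaging over a flip (a minimal version with one horizontal flip; `predict_fn` is a placeholder for the model's predict call):

```python
import numpy as np

def predict_tta(predict_fn, image):
    """Average predictions over horizontal-flip test-time augmentation."""
    p = predict_fn(image)
    p_flip = predict_fn(image[:, ::-1])[:, ::-1]  # flip back before averaging
    return (p + p_flip) / 2.0
```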
Final result:
– single model 0.694 score
– ensemble 0.70399 (hour before the competition’s end)
– the last submission was human-verified ;) but it didn’t help
code: https://github.com/EdwardTyantov/ultrasound-nerve-segmentation
16. #15 What didn’t help
– Inception Resnet v4
– sequential training of decoder, encoder parts
– more or fewer layers/blocks/n_filters
– pixel clustering
– higher or lower resolution
– dropout, different probs
– Torch version
– deconv layers instead of upsampling
– weight decay for layers
– FCN
– Deepmask architecture
17. #16 Technical
– Ubuntu 14 or 16, CUDA 8, cuDNN 5, latest Keras and Torch
– batch_size = 64 or 128 (depending on GPU memory)
– single model: 2-3 hours on a Titan X / GTX 1080
– ensemble: ~24 hours
18. #17 Other competitors
– train dataset: re-labeling or zeroing out the erroneous masks
– FCN with several heads at different resolutions (regularization)
– post-processing: fit the mask to an ellipse, fill holes
– separate training: mask / no mask
– crop images, super-resolution
– models at different resolutions
– higher resolution
– loss
– smart post-processing
• which obviously led to overfitting on the public leaderboard
– replication padding instead of zero-padding
19. #18 Deepmask (FB)
– no low-level features
– CNN with two heads: mask and score
– training set: (patch, mask, y_k), where y_k = 1 if an object is centered and fully contained in the patch
• mask pixel = 1 if it belongs to the object in the center
– VGG, 8-layers, can be trained
– Training
• joint training, score loss weighted by 1/32
• first (mask) branch trained on positives only
• augmentation: shifts up to 16 px, slight scaling, horizontal flip
– Evaluation
• full image: sliding window with a 16-pixel stride