This document summarizes an R package called bayesImageS that enables Bayesian computation for medical image segmentation using a hidden Potts model. It discusses the statistical model, which involves a hidden Markov random field with a Potts prior on the latent labels. Bayesian computation methods like Gibbs sampling and Metropolis-Hastings using pseudolikelihood approximation are implemented in C++ for efficiency. Experimental results demonstrate the package on a CT electron density phantom and patient radiotherapy data.
bayesImageS: Bayesian computation for medical image segmentation using a hidden Potts model
1. R packages Medical Imaging Statistical Model Bayesian Computation Experimental Results Conclusion
Matt Moores
MRC Biostatistics Unit Seminar Series
July 25, 2017
Acknowledgements
Queensland University of Technology, Brisbane, Australia:
Prof. Kerrie Mengersen
Dr. Fiona Harden
Members of the Volume Analysis Tool project team at the
Radiation Oncology Mater Centre (ROMC):
Cathy Hargrave
A/Prof Michael Poulsen, MD
Timothy Deegan
Emmanuel Baveas
QHealth ethics HREC/12/QPAH/475 and QUT ethics 1200000724
Outline
1 R packages
bayesImageS
2 Medical Imaging
Image-Guided Radiotherapy
Cone-Beam Computed Tomography
3 Statistical Model
Hidden Potts model
Informative priors
4 Bayesian Computation
Chequerboard Gibbs sampler
Pseudolikelihood
5 Experimental Results
ED phantom experiment
Radiotherapy Patient Data
Why write an R package?
Portability
Test bed for new statistical methods
Build on existing code
Research impact
Kudos
Hadley Wickham (2015) R packages
Why C++?
Most statistical algorithms are iterative
Markov chain Monte Carlo
Scalability for large datasets
Rcpp
OpenMP
Eigen or Armadillo
Dirk Eddelbuettel (2013) Seamless R and C++ integration with Rcpp
Inline
One function at a time:
    library(inline)
    sum_logs <- cxxfunction(signature(log_vec = "numeric"), plugin = "RcppArmadillo", body = '
      arma::vec log_prob = Rcpp::as<arma::vec>(log_vec);
      double suml = 0.0;
      double maxl = log_prob.max();
      for (unsigned i=0; i < log_prob.n_elem; i++)
      {
        if (arma::is_finite(log_prob(i)))
          suml += exp(log_prob(i) - maxl);
      }
      return Rcpp::wrap(log(suml) + maxl);
    ')
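The inline function above is the log-sum-exp trick: subtracting the largest log value before exponentiating avoids underflow. The same idea can be sketched without R or Armadillo, in standard C++ only. The names `logSumExp` and `checkLogSumExp` are illustrative and not part of any package; unlike the slide's version, this sketch takes the maximum over the finite entries only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Numerically stable log(sum(exp(x))) over the finite entries of logVals.
// Returns -infinity when no entry is finite (all weights are zero).
double logSumExp(const std::vector<double>& logVals)
{
    double maxLog = -std::numeric_limits<double>::infinity();
    for (double v : logVals)
        if (std::isfinite(v)) maxLog = std::max(maxLog, v);
    if (!std::isfinite(maxLog)) return maxLog;  // nothing finite to sum

    double sum = 0.0;
    for (double v : logVals)
        if (std::isfinite(v)) sum += std::exp(v - maxLog);  // each term is <= 1
    return std::log(sum) + maxLog;
}

// Small usage check: weights 1, 2 and 3 on the log scale should sum to 6.
void checkLogSumExp()
{
    const double total = logSumExp({std::log(1.0), std::log(2.0), std::log(3.0)});
    assert(std::fabs(total - std::log(6.0)) < 1e-12);
}
```

Shifting by the maximum matters most when every log value is very negative, for example two values of -1000, where a direct `exp` would underflow to zero.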
Annotations
Rcpp wrappers generated automatically:
compileAttributes("myRcppPackage")
R package documentation generated automatically:
roxygenize("myRcppPackage")
    //' Compute the effective sample size (ESS) of the particles.
    //'
    //' The ESS is a ``rule of thumb'' for assessing the degeneracy of
    //' the importance distribution:
    //' \deqn{ESS = \frac{(\sum_{q=1}^Q w_q)^2}{\sum_{q=1}^Q w_q^2}}
    //'
    //' @param log_weights logarithms of the importance weights of each particle.
    //' @return the effective sample size, a scalar between 0 and Q
    //' @references
    //' Liu, JS (2001) "Monte Carlo Strategies in Scientific Computing." Springer
    // [[Rcpp::export]]
    double effectiveSampleSize(NumericVector log_weights)
    {
      double sum_wt = sum_logs(log_weights);
      double sum_sq = sum_logs(log_weights + log_weights);
      double res = exp(sum_wt + sum_wt - sum_sq);
      if (std::isfinite(res)) return res;
      else return 0;
    }
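As a rough check of the ESS formula in the annotation above, the same calculation can be written in standard C++ with no Rcpp types. This is only a sketch: `stableLogSum` is a hypothetical stand-in for `sum_logs`, and a `std::vector<double>` replaces `NumericVector`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Stand-in for sum_logs: log(sum(exp(x))) over the finite entries of x.
static double stableLogSum(const std::vector<double>& x)
{
    double maxLog = -std::numeric_limits<double>::infinity();
    for (double v : x)
        if (std::isfinite(v) && v > maxLog) maxLog = v;
    if (!std::isfinite(maxLog)) return maxLog;
    double sum = 0.0;
    for (double v : x)
        if (std::isfinite(v)) sum += std::exp(v - maxLog);
    return std::log(sum) + maxLog;
}

// ESS = (sum w)^2 / sum(w^2), computed from log weights.
double effectiveSampleSize(const std::vector<double>& logWeights)
{
    assert(!logWeights.empty());  // at least one particle is required
    std::vector<double> logSquared(logWeights.size());
    for (std::size_t q = 0; q < logWeights.size(); ++q)
        logSquared[q] = 2.0 * logWeights[q];  // log(w^2) = 2 log(w)
    const double sumWt = stableLogSum(logWeights);
    const double sumSq = stableLogSum(logSquared);
    const double ess = std::exp(2.0 * sumWt - sumSq);
    return std::isfinite(ess) ? ess : 0.0;  // all-zero weights give ESS 0
}
```

With Q equal weights the result is Q, and when a single particle carries all of the weight the result is 1, which are the two extremes of the range.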
Package Skeleton
Create a new R package:
package.skeleton("myPackage", path=".")
Specific skeletons for each C++ library:
Rcpp.package.skeleton("myRcppPackage")
RcppArmadillo.package.skeleton("MyArmadilloPackage")
RcppEigen.package.skeleton("MyEigenPackage")
Common Problems
Rcpp parameters are passed by reference (not copied):
Can rely on R for garbage collection
Memory allocation is slower
Can crash R (and RStudio (and your OS))
R is not thread safe
Cannot call any R functions (even indirectly)
within parallel code!
Drew Schmidt (@wrathematics, 2015) Parallelism, R, and OpenMP
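One common way to respect the thread-safety restriction is to draw every random number serially, before the parallel region, so that the loop body touches only plain C++ data. The sketch below illustrates that pattern under stated assumptions: `std::mt19937` stands in for R's generator, and `preDrawUniforms` and `drawLabels` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draw all uniforms serially, before any parallel region starts.
// std::mt19937 is a stand-in here for R's random number generator.
std::vector<double> preDrawUniforms(std::size_t n, unsigned seed)
{
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    std::vector<double> u(n);
    for (double& x : u) x = unif(gen);
    return u;
}

// Inverse-CDF draw of one label per pixel from its probability vector.
// The loop body uses only standard C++ data, so it can run under OpenMP.
std::vector<int> drawLabels(const std::vector<std::vector<double>>& prob,
                            const std::vector<double>& u)
{
    assert(prob.size() == u.size());
    std::vector<int> label(prob.size(), 0);
#ifdef _OPENMP
    #pragma omp parallel for
#endif
    for (long i = 0; i < static_cast<long>(prob.size()); ++i)
    {
        const int k = static_cast<int>(prob[i].size());
        double cum = 0.0;
        label[i] = k - 1;  // fallback guards against rounding in the sum
        for (int j = 0; j < k; ++j)
        {
            cum += prob[i][j];
            if (u[i] < cum) { label[i] = j; break; }
        }
    }
    return label;
}
```

Because each pixel consumes its own pre-drawn uniform, the output does not depend on how the iterations are scheduled across threads.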
bayesImageS
An R package for Bayesian image segmentation
using the hidden Potts model:
RcppArmadillo for fast computation in C++
OpenMP for parallelism
    library(bayesImageS)
    priors <- list("k"=3, "mu"=rep(0,3), "mu.sd"=sigma,
                   "sigma"=sigma, "sigma.nu"=c(1,1,1), "beta"=c(0,3))
    mh <- list(algorithm="pseudo", bandwidth=0.2)
    result <- mcmcPotts(y, neigh, block, NULL,
                        55000, 5000, priors, mh)
Eddelbuettel & Sanderson (2014) RcppArmadillo: Accelerating R with
high-performance C++ linear algebra. CSDA 71
Bayesian computational methods
bayesImageS supports methods for updating the latent labels:
Chequerboard Gibbs sampling (Winkler 2003)
Swendsen-Wang (1987)
and also methods for updating the smoothing parameter β:
Pseudolikelihood (Rydén & Titterington 1998)
Thermodynamic integration (Gelman & Meng 1998)
Exchange algorithm (Murray, Ghahramani & MacKay 2006)
Approximate Bayesian computation (Grelaud et al. 2009)
Sequential Monte Carlo (ABC-SMC) with pre-computation
(Del Moral, Doucet & Jasra 2012; Moores et al. 2015)
Image-Guided Radiotherapy
Image courtesy of Varian Medical Systems, Inc. All rights reserved.
Segmentation of Anatomical Structures
Radiography courtesy of Cathy Hargrave, Radiation Oncology Mater Centre,
Queensland Health
Physiological Variability
Organ              Ant-Post        Sup-Inf          Left-Right
prostate           0.1 ± 4.1 mm    −0.5 ± 2.9 mm    0.2 ± 0.9 mm
seminal vesicles   1.2 ± 7.3 mm    −0.7 ± 4.5 mm    −0.9 ± 1.9 mm
Table: Distribution of observed translations of the organs of interest

Organ      Volume         Gas
rectum     35–140 cm³     4–26%
bladder    120–381 cm³
Table: Volume variations in the organs of interest
Frank, et al. (2008) Quantification of Prostate and Seminal Vesicle
Interfraction Variation During IMRT. IJROBP 71(3): 813–820.
Electron Density phantom
(a) CIRS Model 062 ED phantom (b) Helical, fan-beam CT scanner
Cone-Beam Computed Tomography
(c) Fan-beam CT (d) Cone-beam CT
Distribution of Pixel Intensity
Figure: histograms of frequency against (a) Hounsfield unit for fan-beam CT and (b) pixel intensity for cone-beam CT, both over the range −1000 to 200.
Hidden Markov Random Field
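Joint distribution of observed pixel intensities $\mathbf{y} = \{y_i\}_{i=1}^n$ and latent labels $\mathbf{z} = \{z_i\}_{i=1}^n$:
$$p(\mathbf{y}, \mathbf{z} \mid \mu, \sigma^2, \beta) = p(\mathbf{y} \mid \mu, \sigma^2, \mathbf{z})\, p(\mathbf{z} \mid \beta) \tag{1}$$
Additive Gaussian noise:
$$y_i \mid z_i = j \;\overset{iid}{\sim}\; \mathcal{N}\left(\mu_j, \sigma_j^2\right) \tag{2}$$
Potts model:
$$\pi(z_i \mid z_{\setminus i}, \beta) = \frac{\exp\left\{\beta \sum_{i \sim \ell} \delta(z_i, z_\ell)\right\}}{\sum_{j=1}^k \exp\left\{\beta \sum_{i \sim \ell} \delta(j, z_\ell)\right\}} \tag{3}$$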
Potts (1952) Proceedings of the Cambridge Philosophical Society 48(1)
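Equation (3) can be evaluated directly for a single pixel once the labels of its neighbours are known. The following standard C++ sketch does so; the function name `pottsConditional` and the use of zero-based labels are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Conditional probabilities pi(z_i = j | neighbours, beta) for j = 0..k-1,
// following equation (3). neighbourLabels holds the current labels of the
// pixels adjacent to pixel i.
std::vector<double> pottsConditional(const std::vector<int>& neighbourLabels,
                                     int k, double beta)
{
    assert(k > 0);
    std::vector<double> prob(k, 0.0);
    for (int l : neighbourLabels)
    {
        assert(l >= 0 && l < k);
        prob[l] += 1.0;  // count of neighbours that have label l
    }
    double total = 0.0;
    for (double& p : prob)
    {
        p = std::exp(beta * p);  // numerator of equation (3)
        total += p;              // denominator sums over all k labels
    }
    for (double& p : prob) p /= total;
    return prob;
}
```

With β = 0 every label is equally likely, whereas larger β concentrates the probability on the label shared by most neighbours. For example, with neighbours labelled {0, 0, 1, 2} and β = log 2, the unnormalised weights are 4, 2 and 2.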
Inverse Temperature
Doubly-intractable likelihood
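$$p(\beta \mid \mathbf{z}) \propto \mathcal{C}(\beta)^{-1}\, \pi(\beta) \exp\{\beta\, S(\mathbf{z})\} \tag{4}$$
The normalising constant has computational complexity $\mathcal{O}(n k^n)$:
$$\mathcal{C}(\beta) = \sum_{\mathbf{z} \in \mathcal{Z}} \exp\{\beta\, S(\mathbf{z})\} \tag{5}$$
$S(\mathbf{z})$ is the sufficient statistic of the Potts model:
$$S(\mathbf{z}) = \sum_{i \sim \ell \in \mathcal{L}} \delta(z_i, z_\ell) \tag{6}$$
where $\mathcal{L}$ is the set of all unique neighbour pairs.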
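To see why the normalising constant is intractable, it helps to compute it by brute force on a tiny lattice. The sketch below enumerates all k^n labellings and sums exp{β S(z)}, as in equations (5) and (6). The names `sufficientStat` and `normalisingConstant`, and the representation of the neighbour pairs as a list of index pairs, are illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Sufficient statistic S(z) from equation (6): the number of neighbour
// pairs whose two pixels share the same label.
int sufficientStat(const std::vector<int>& z,
                   const std::vector<std::pair<int, int>>& edges)
{
    int s = 0;
    for (const auto& e : edges)
        if (z[e.first] == z[e.second]) ++s;
    return s;
}

// Brute-force normalising constant from equation (5): visit all k^n
// labellings of n pixels. Only feasible for very small n.
double normalisingConstant(int n, int k, double beta,
                           const std::vector<std::pair<int, int>>& edges)
{
    assert(n > 0 && k > 0);
    std::vector<int> z(n, 0);
    double total = 0.0;
    while (true)
    {
        total += std::exp(beta * sufficientStat(z, edges));
        // advance z like an odometer in base k
        int pos = 0;
        while (pos < n && ++z[pos] == k) { z[pos] = 0; ++pos; }
        if (pos == n) break;  // wrapped around: every labelling was visited
    }
    return total;
}
```

For two pixels joined by one edge and k = 3, three of the nine labellings have matching labels, so the constant is 3e^β + 6. Each additional pixel multiplies the number of terms by k, which is what rules out this approach for real images.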
Informative Prior for $\mu_j$ and $\sigma_j^2$
Figure: (a) Hounsfield unit for fan-beam CT and (b) pixel intensity for cone-beam CT, each plotted against electron density (0 to 4).
External Field
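$$\pi(z_i \mid \alpha_i, \beta, z_{i \sim \ell}) \propto \exp\left\{\alpha_i(z_i) + \beta \sum_{i \sim \ell} \delta(z_i, z_\ell)\right\} \tag{7}$$
Isotropic translation:
$$\alpha_i(z_i = j) = \frac{1}{n_j} \sum_{\nu \in j} \phi\left(\Delta(\vartheta_i, \vartheta_\nu),\ \mu = 1.2,\ \sigma^2 = 7.3^2\right) \tag{8}$$
where
$\nu \in j$ are the image voxels $\vartheta_\nu$ in object $j$
$\phi(x, \mu, \sigma^2)$ is the normal density function
$\Delta(u, v)$ is the Euclidean distance between the coordinates of voxel $u$ and voxel $v$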
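Equation (8) averages a normal density over the distances from voxel i to every voxel of object j. A minimal standard C++ sketch follows, assuming voxel coordinates are given in millimetres; the `Voxel` struct and the function names are hypothetical, and the defaults μ = 1.2 and σ = 7.3 are taken from the equation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Voxel { double x, y, z; };  // coordinates, assumed to be in mm

// Normal density phi(x; mu, sigma^2), parameterised by the standard deviation.
double normalDensity(double x, double mu, double sigma)
{
    const double pi = 3.14159265358979323846;
    const double u = (x - mu) / sigma;
    return std::exp(-0.5 * u * u) / (sigma * std::sqrt(2.0 * pi));
}

// Euclidean distance Delta(u, v) between two voxels.
double euclideanDistance(const Voxel& a, const Voxel& b)
{
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// alpha_i(z_i = j) from equation (8): the mean normal density of the
// distance between voxel i and each voxel belonging to object j.
double externalField(const Voxel& vi, const std::vector<Voxel>& objectVoxels,
                     double mu = 1.2, double sigma = 7.3)
{
    assert(!objectVoxels.empty() && sigma > 0.0);
    double total = 0.0;
    for (const Voxel& v : objectVoxels)
        total += normalDensity(euclideanDistance(vi, v), mu, sigma);
    return total / static_cast<double>(objectVoxels.size());
}
```

The value is largest for voxels that lie about μ away from the object and decays with distance, so labels are favoured near where the object was expected to be.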
External Field II
(a) Fan-beam CT (b) External Field
External Field
Organ- and patient-specific external field (slice 49, 16mm Inf)
Chequerboard Gibbs II
Algorithm 1 Chequerboard sampling for z
for all blocks b do
    for all pixels i ∈ b do
        for all labels j ∈ 1 . . . k do
            Compute $\lambda_j \leftarrow p(y_i \mid z_i = j)\, \pi(z_i = j \mid z_{i \sim \ell}, \beta)$
        end for
        Draw $z_i \sim \mathrm{Multinomial}(\lambda_1, \ldots, \lambda_k)$
    end for
end for
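Algorithm 1 relies on a partition of the pixels into blocks such that no two neighbours fall in the same block. For a 2D lattice with a first-order (4-neighbour) neighbourhood, which is an assumption of this sketch, the parity of row plus column gives such a partition. The function name `chequerboardBlocks` and the row-major indexing are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Split an nrow x ncol lattice into two blocks by the parity of row + col,
// like the squares of a chequerboard. Pixels are indexed in row-major order.
// With a 4-neighbour structure, every neighbour of a pixel is in the other
// block, so all pixels of one block can be updated simultaneously.
std::vector<std::vector<std::size_t>> chequerboardBlocks(std::size_t nrow,
                                                         std::size_t ncol)
{
    assert(nrow > 0 && ncol > 0);
    std::vector<std::vector<std::size_t>> blocks(2);
    for (std::size_t r = 0; r < nrow; ++r)
        for (std::size_t c = 0; c < ncol; ++c)
            blocks[(r + c) % 2].push_back(r * ncol + c);
    return blocks;
}
```

On a 3 × 3 lattice this places pixels 0, 2, 4, 6 and 8 in the first block and 1, 3, 5 and 7 in the second. Larger neighbourhoods or 3D volumes would need more than two blocks.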
Gibbs sampler in C++
    void gibbsLabels(const arma::umat & neigh, const std::vector<arma::uvec> & blocks,
                     arma::umat & z, arma::umat & alloc, const double beta,
                     const arma::mat & log_xfield)
    {
      // draw every uniform up front: R's RNG must not be called in parallel code
      const Rcpp::NumericVector randU = Rcpp::runif(neigh.n_rows);
      for (unsigned b=0; b < blocks.size(); b++)
      {
        const arma::uvec block = blocks[b];
    #pragma omp parallel for
        for (unsigned i=0; i < block.size(); i++)
        {
          // declared inside the loop so each thread has its own correctly sized
          // vector (a private(log_prob) clause would default-construct an empty one)
          arma::vec log_prob(z.n_cols);
          for (unsigned j=0; j < z.n_cols; j++)
          {
            // count the neighbours of this pixel that currently have label j
            unsigned sum_neigh = 0;
            for (unsigned k=0; k < neigh.n_cols; k++)
            {
              sum_neigh += z(neigh(block[i],k),j);
            }
            log_prob[j] = log_xfield(block[i],j) + beta*sum_neigh;
          }
          double total_llike = sum_logs(log_prob);
          // inverse-CDF draw of the new label from the normalised probabilities
          double cumProb = 0.0;
          z.row(block[i]).zeros();
          for (unsigned j=0; j < log_prob.n_elem; j++)
          {
            cumProb += exp(log_prob[j] - total_llike);
            if (randU[block[i]] < cumProb)
            {
              z(block[i],j) = 1;
              alloc(block[i],j) += 1;
              break;
            }
          }
        }
      }
    }
Pseudolikelihood (PL)
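Algorithm 2 Metropolis-Hastings with PL
Draw proposal $\beta' \sim q(\beta' \mid \beta^\circ)$
Approximate $p(\beta' \mid \mathbf{z})$ and $p(\beta^\circ \mid \mathbf{z})$ using equation (9):
$$\hat{p}_{PL}(\beta \mid \mathbf{z}) \approx \prod_{i=1}^n \frac{\exp\left\{\beta \sum_{i \sim \ell} \delta(z_i, z_\ell)\right\}}{\sum_{j=1}^k \exp\left\{\beta \sum_{i \sim \ell} \delta(j, z_\ell)\right\}} \tag{9}$$
Calculate the M-H ratio $\rho = \dfrac{\hat{p}_{PL}(\beta' \mid \mathbf{z})\, \pi(\beta')\, q(\beta^\circ \mid \beta')}{\hat{p}_{PL}(\beta^\circ \mid \mathbf{z})\, \pi(\beta^\circ)\, q(\beta' \mid \beta^\circ)}$
Draw $u \sim \mathrm{Uniform}[0, 1]$
if $u < \min(1, \rho)$ then
    $\beta \leftarrow \beta'$
else
    $\beta \leftarrow \beta^\circ$
end if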
Rydén & Titterington (1998) JCGS 7(2): 194–211
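The accept/reject step of Algorithm 2 is usually evaluated on the log scale to avoid underflow in the product of equation (9). The sketch below assumes a symmetric proposal, so that the two q terms cancel; that is an assumption of this example rather than something stated on the slide. All function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Log of the M-H ratio rho for a symmetric proposal, where the two proposal
// densities cancel. Inputs are log pseudolikelihoods and log prior densities
// at the proposed and current values of beta.
double logMHRatio(double logPLProposed, double logPLCurrent,
                  double logPriorProposed, double logPriorCurrent)
{
    return (logPLProposed + logPriorProposed) - (logPLCurrent + logPriorCurrent);
}

// Accept when u < min(1, rho), compared on the log scale; u is Uniform[0, 1].
bool acceptProposal(double logRatio, double u)
{
    assert(u >= 0.0 && u <= 1.0);
    return std::log(u) < std::min(0.0, logRatio);
}

// One update of beta: move to the proposal if accepted, else stay put.
double mhUpdate(double betaCurrent, double betaProposed, double logRatio, double u)
{
    return acceptProposal(logRatio, u) ? betaProposed : betaCurrent;
}
```

A proposal that increases the approximate posterior has a positive log ratio and is accepted for any u below 1. One that halves it is accepted only when u is below 0.5.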
Pseudolikelihood in C++
    // Log pseudolikelihood of equation (9).
    // ne(j,i): number of neighbours of pixel i with label j; e[i]: label of pixel i
    double pseudolike(arma::mat & ne, arma::uvec & e,
                      double b, unsigned n, unsigned k)
    {
      double num = 0.0;
      double denom = 0.0;
    #pragma omp parallel for reduction(+:num,denom)
      for (unsigned i=0; i < n; i++)
      {
        num=num+ne(e[i],i);
        double tdenom=0.0;
        for (unsigned j=0; j < k; j++)
        {
          tdenom=tdenom+exp(b*ne(j,i));
        }
        denom=denom+log(tdenom);
      }
      return b*num-denom;
    }
Approximation Error
PL for n = 12, k = 3 in comparison to the exact likelihood
calculated using a brute force method:
Figure: (a) expectation µ and (b) standard deviation σ plotted against β from 0 to 4, exact versus pseudolikelihood.
ED phantom experiment
27 cone-beam CT scans of the ED phantom
Cropped to 376 × 308 pixels and 23 slices
(330 × 270 × 46 mm)
Inner ring of inserts rotated by between 0° and 16°
2D displacement of between 0mm and 25mm
Isotropic external field prior with σ∆ = 7.3mm
9 component Potts model
8 different tissue types, plus water-equivalent background
Priors for noise parameters estimated from 28 fan-beam CT
and 26 cone-beam CT scans
Image Segmentation
Quantification of Segmentation Accuracy
Dice similarity coefficient:
$$\mathrm{DSC}_g = \frac{2 \times |\hat{g} \cap g|}{|\hat{g}| + |g|} \tag{10}$$
where
$\mathrm{DSC}_g$ is the Dice similarity coefficient for label $g$
$|\hat{g}|$ is the count of pixels that were classified with the label $g$
$|g|$ is the number of pixels that are known to truly belong to component $g$
$|\hat{g} \cap g|$ is the count of pixels in $g$ that were labeled correctly
Dice (1945) Measures of the amount of ecologic association between
species. Ecology 26(3): 297–302.
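Equation (10) translates directly into a short function over two label vectors of equal length. This is a sketch in standard C++; the name `diceCoefficient` is hypothetical, and returning NaN when label g appears in neither vector is a choice made here, since the coefficient is undefined in that case.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dice similarity coefficient for label g, following equation (10).
// predicted[i] is the estimated label of pixel i, truth[i] its known label.
double diceCoefficient(const std::vector<int>& predicted,
                       const std::vector<int>& truth, int g)
{
    assert(predicted.size() == truth.size());
    std::size_t nPred = 0, nTrue = 0, nBoth = 0;
    for (std::size_t i = 0; i < predicted.size(); ++i)
    {
        const bool inPred = (predicted[i] == g);
        const bool inTrue = (truth[i] == g);
        if (inPred) ++nPred;            // |g-hat|
        if (inTrue) ++nTrue;            // |g|
        if (inPred && inTrue) ++nBoth;  // |g-hat intersect g|
    }
    if (nPred + nTrue == 0) return std::nan("");  // undefined: label absent
    return 2.0 * static_cast<double>(nBoth) / static_cast<double>(nPred + nTrue);
}
```

For predicted labels {1, 1, 2, 2} against true labels {1, 2, 2, 2}, label 2 has two correct pixels out of 2 predicted and 3 true, giving 4/5 = 0.8.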
Results
Tissue Type      Simple Potts     External Field
Lung (inhale)    0.540 ± 0.037    0.902 ± 0.009
Lung (exhale)    0.172 ± 0.008    0.814 ± 0.022
Adipose          0.059 ± 0.008    0.704 ± 0.062
Breast           0.077 ± 0.011    0.720 ± 0.048
Water            0.174 ± 0.003    0.964 ± 0.003
Muscle           0.035 ± 0.004    0.697 ± 0.076
Liver            0.020 ± 0.007    0.654 ± 0.033
Spongy Bone      0.094 ± 0.014    0.758 ± 0.018
Dense Bone       0.014 ± 0.001    0.616 ± 0.151
Table: Segmentation Accuracy (Dice Similarity Coefficient ± σ)
Moores, et al. (2014) In Proc. XVII Intl Conf. ICCR; J. Phys: Conf. Ser. 489
Summary
Informative priors can dramatically improve segmentation
accuracy for noisy data
inverse regression for µ & σ²
external field prior for z
It is feasible to use MCMC for image analysis of realistic
datasets
but auxiliary variable methods don’t scale well
requires parallelized implementation in C++ or Fortran
RcppArmadillo & OpenMP are a good combination
faster algorithms are available, such as VB or ICM
39. Appendix
For Further Reading I
Moores, Hargrave, Deegan, Poulsen, Harden & Mengersen
An external field prior for the hidden Potts model with application to
cone-beam computed tomography.
CSDA 86: 27–41, 2015.
Moores & Mengersen
bayesImageS: Bayesian methods for image segmentation using a
hidden Potts model.
R package version 0.3-3
https://CRAN.R-project.org/package=bayesImageS
Moores, Drovandi, Mengersen & Robert
Pre-processing for approximate Bayesian computation in image
analysis.
Statistics & Computing 25(1): 23–33, 2015.
Moores, Pettitt & Mengersen
Scalable Bayesian inference for the inverse temperature of a hidden
Potts model.
arXiv:1503.08066 [stat.CO], 2015.
40. Appendix
For Further Reading II
Eddelbuettel & Sanderson
RcppArmadillo: Accelerating R with high-performance C++ linear
algebra.
Comput. Stat. Data Anal. 71: 1054–63, 2014.
Bates & Eddelbuettel
Fast and elegant numerical linear algebra using the RcppEigen
package.
J. Stat. Soft. 52(5): 1–24, 2013.
Eddelbuettel
Seamless R and C++ integration with Rcpp
Springer-Verlag, 2013.
Wickham
R packages
O’Reilly, 2015.
41. Appendix
For Further Reading III
Winkler
Image analysis, random fields and Markov chain Monte Carlo methods
2nd ed., Springer-Verlag, 2003.
Marin & Robert
Bayesian Essentials with R
Springer-Verlag, 2014.
Roberts & Sahu
Updating Schemes, Correlation Structure, Blocking and
Parameterization for the Gibbs Sampler
J. R. Stat. Soc. Ser. B 59(2): 291–317, 1997.
Rydén & Titterington
Computational Bayesian Analysis of Hidden Markov Models
J. Comput. Graph. Stat. 7(2): 194–211, 1998.