1. BioSMACK is a Linux Live CD customized for analysis of genome-wide association studies (GWAS).
2. It provides pre-compiled, pre-installed and pre-configured GWAS software such as PLINK, EIGENSTRAT and STRUCTURE, and runs from a bootable CD or USB drive without installation on the hard disk.
3. Future work includes support for cloud and cluster computing to run parallel GWAS analyses on large datasets.
11. Northern Han
HapMap Chinese
Singapore Chinese
Hong Kong Chinese
121 samples - Chinese University of Hong Kong
12.
13. BioSMACK: a Linux Live CD for Analysis of GWA
What is the Genome-Wide Association Study?
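14. [Timeline figure, 1990-2011: milestones in genomics and Korean genome research]
• HGP started
• Illumina founded
• Affymetrix/Illumina SNP chips developed
• HGP completed
• HapMap completed (2002~)
• KHapMap completed (2003~)
• Illumina, Infinium whole-genome genotyping (100,000 markers)
• First GWAS (age-related macular degeneration), Science
• 23andMe founded
• Korea National Institute of Health, Center for Genome Science established
• Joined the Korea National Institute of Health, Center for Genome Science
• KARE project started
• Science, Breakthrough of the Year: Human Genetic Variation
• Venter
• Watson, Yang Huanming
• 1000 Genomes Project started
• Dr. Kim Seong-jin's whole genome completed
• PGP-10 data released
• Korean GWAS results published in Nature Genetics
• Science: migration routes of Koreans
• Seoul National University publishes Korean whole-genome paper in Nature
• KAREBrowser developed
• Venter publishes paper on DTC services
• 904 published GWAS for 165 traits
• Genome Research Foundation launches the Korean Genome Project
• Public personal genomes released (e.g. Genomes Unzipped)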
15. 1 Analyzing millions of genotypes requires substantial computing power and highly skilled specialists to handle large datasets and multi-step analyses
2 Various software packages (e.g. PLINK, Eigensoft, STRUCTURE and SnpMatrix) have been developed for GWAS
3 Researchers often run into problems when compiling and installing the software and when configuring environment variables and library dependencies
17. What is the Linux Live CD?
18. • Linux is a free, open-source operating system
• Many GWA software packages support Linux
• A Linux Live CD is a customized Linux system that boots from a CD or USB flash drive
19. • Developers can build Linux Live CDs for their own purposes (e.g. biology, chemistry, physics, games)
• For biological data analysis: BioLinux, Open Discovery, GRIMP, BioConductorBuntu and PhyLIS
• GWAS methods are developing rapidly, so there is a need for a Live CD focused on GWAS
20. How is BioSMACK implemented?
21. • Based on open-source software (free to use and redistribute under the GNU General Public License)
• Based on the Ubuntu Linux distribution (v5.5)
• Ubuntu Linux is one of the most popular Linux distributions
• GWA software is pre-compiled, installed and configured
• Command-line and Java Swing-based GUI for running GWA software
• User manual and example data are also included
22. • Call genotypes from genome-wide SNP chips
• Convert raw genotype data to PLINK binary format
• Detect population stratification
• Run association analysis using PLINK
• Estimate genotypes of SNPs that were not observed in the GWAS (imputation)
• Meta-analysis in two-sample comparisons
23. • PLINK
• SnpMatrix
• EIGENSTRAT
• STRUCTURE
• RMETA
• METAL
• IMPUTE
• MACH
• HTML based: table of contents, command descriptions, commands for running the example data
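As a worked illustration of the kind of calculation behind the meta-analysis tools listed here, the sketch below combines per-study effect estimates with fixed-effect inverse-variance weighting. This is a generic textbook method shown under assumptions: it is not taken from the slides, it is not necessarily the default scheme of RMETA or METAL, and the function name `fixed_effect_meta` and the input numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(betas: Sequence[float],
                      ses: Sequence[float]) -> Tuple[float, float, float]:
    """Combine per-study effects with inverse-variance weights.

    Returns the pooled effect, its standard error, and the Z score.
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its variance (1 / se^2).
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


if __name__ == "__main__":
    # Two hypothetical studies reporting an effect for the same SNP.
    beta, se, z = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
    print(f"pooled beta={beta:.3f} se={se:.4f} z={z:.2f}")
```

With equal standard errors the pooled effect is simply the mean of the two estimates, and the pooled standard error shrinks by a factor of the square root of two.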
24.
25. How to install BioSMACK?
26. 1 Download the BioSMACK ISO image file (about 1 GB), freely available at ksnp.cdc.go.kr/biosmack
2 Create a CD/DVD from the ISO image
Or create a USB flash drive from the ISO image
3 Install on the hard disk (erasing the previous operating system)
Or run without installing (boot from the CD/USB flash drive without making changes to the underlying operating system)
27. Results and Future Work
28. 1 Useful for education and for simple on-the-fly analysis without installation or configuration
2 BioSMACK was used on various laptops and netbooks at the 5th workshop of the Asian Institute in Statistical Genetics and Genomics
3 A fully functional GWAS research environment can be set up on any computer within a couple of hours
29. 1 Cloud computing
computing with resources acquired on demand
2 Cluster computing
support for parallel jobs through a job scheduler (e.g. Sun Grid Engine, OpenPBS, Torque)
3 Parallel software
high-performance, parallel, on-demand GWAS analysis
a BioSMACK AMI (Amazon Machine Image) will be supported for cloud computing
parallel job scripts will be supported for HPC
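One way a parallel job script for a scheduler could look is sketched below: the association test is split by chromosome, and one Sun Grid Engine script is generated per chromosome. This is an assumption-laden illustration, not the job scripts planned for BioSMACK: the function names, the job name pattern and the `study` file prefix are hypothetical, and it assumes PLINK's `--chr` option and the SGE `#$ -N` and `#$ -cwd` directives.

```python
from typing import Iterable, List


def sge_job_script(bfile: str, chrom: int) -> str:
    """Return the text of an SGE job script that tests one chromosome with PLINK."""
    out_prefix = f"{bfile}_chr{chrom}"
    lines = [
        "#!/bin/bash",
        f"#$ -N assoc_chr{chrom}",  # job name shown by the scheduler
        "#$ -cwd",                  # run in the directory the job was submitted from
        f"plink --bfile {bfile} --chr {chrom} --assoc --out {out_prefix}",
    ]
    return "\n".join(lines) + "\n"


def all_job_scripts(bfile: str, chroms: Iterable[int] = range(1, 23)) -> List[str]:
    """Generate one job script per autosome (chromosomes 1 to 22 by default)."""
    return [sge_job_script(bfile, c) for c in chroms]


if __name__ == "__main__":
    # "study" is a hypothetical PLINK binary file prefix.
    scripts = all_job_scripts("study")
    print(len(scripts), "job scripts generated")
    print(scripts[0])
```

Each script would be written to a file and submitted with `qsub`; because the chromosomes are tested independently, the jobs can run concurrently on different cluster nodes.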