Education and Experience
- July 00 - Present: Professor of Statistics, Dept of Stat, Harvard University.
- Sep 01 - Present: Professor of Biostatistics (secondary), HSPH
- Nov 00 - Nov 02: Guest Professor, Peking University (see photos here)
- July 03 - Sep 03: Professor (on leave), Dept of Stat, Stanford University
- Sep 00 - Jun 03: Associate Professor (on leave), Dept of Stat, Stanford University.
- Aug 94 - Aug 00: Assistant Professor, Dept of Stat, Stanford University.
- May 96 - Aug 02: Associate Editor, J Amer Statist Assoc
- July 99 - Present: Associate Editor, Statistica Sinica
- Dec 99 - Aug 02: Associate Editor, Biometrics
- Jan 98 - Dec 98: Visiting Faculty, Department of Statistics, UCLA.
- Mar 98 - May 98: Visiting Faculty, Dept. of Math, NUS.
- July 91 - Dec 94: Assistant Professor, Dept. of Stat., Harvard University.
- July 93 - Sep 93: Visiting Faculty, Nat'l Center for Biotech. Info., NIH.
- Sep 88 - Jun 91: Ph.D., Statistics, The University of Chicago.
- Sep 86 - Jun 88: Ph.D. program in Math, Rutgers University
- Aug 81 - July 85: B.S. in Math., Peking University, Beijing, China.
Awards and Honors
- Morningside Gold Medal for Applied Mathematics (晨兴数学奖), 2010. Awarded once every 3 years to outstanding mathematicians of Chinese descent under age 45. ICCM 2010.
- Kuwait Lecturer, Cambridge University, 2008
- Fellow, American Statistical Association, 2005
- Bernoulli Lecturer, Bernoulli Society, 2004
- Fellow, Institute of Mathematical Statistics, 2004
- ISI most-cited mathematical scientists, 2003
- The 2002 COPSS Presidents' Award. Given annually and jointly by five leading statistical societies in North America to a young individual (under age 40) in recognition of outstanding contributions to the profession of statistics. See my award acceptance speech and the list of past COPSS Award recipients.
- The 2002 IMS Medallion Lecturer. Each year IMS nominates 8 Medallion Lectures (also known as Special Invited Lectures) in fields across the IMS's subject range for presentation at different statistical meetings.
- The 2000 Mitchell Prize for the Best Bayesian Application Paper.
- Terman Fellow, Stanford University, 1995-1998.
- CAREER AWARD, National Science Foundation, 1995-1998.
- AMS-SIAM selection program winner, Beijing, 1985.
Research Interests
- Statistical missing data problems, imputation methodology.
- Gibbs sampling and other MCMC methods, rate of convergence.
- Markov structure, graphical models (software BUGS), and genetics.
- Image reconstruction: PET, SPECT, etc.
- Bayesian methodology; even Bill Gates talks about Bayesian ideas!
- Nonparametric hierarchical models, model selection and testing.
- Large-scale computation and optimization, e.g., VLSI design; dynamic systems; computer vision.
- Monte Carlo filters, sequential importance sampling and resampling.



(JASA 1999, 94, 1-15)
(Protein Sci. 1995, 4, 1618-32)
Main Collaborators' Websites
Computational Biology Software
The following downloadable software packages for analyzing biopolymer sequence data, microarray data, gene expression data, literature text data, etc. have been developed in my lab together with various collaborators. Please cite the related articles if you use them in your research. They are listed chronologically:
- MACAW. A self-extracting Windows program developed jointly with people at NCBI. It searches for and aligns a subtly conserved single block motif among multiple sequences, assuming one occurrence in each sequence. See its companion articles published in Science (Lawrence et al. 1993) and J. Am. Statist. Assoc. (Liu 1994).
- Gibbs Motif Sampler. (Use command "tar -xvf gibbs9_95.tar" after the download.) A UNIX (Sun OS) program that searches for multiple motifs with an unknown number of repeats in multiple protein sequences. Its companion articles were published in J. Am. Statist. Assoc. (Liu et al. 1995) and Protein Sci. (Neuwald et al. 1995). A server of the Motif Sampler for discovering both DNA regulatory binding sites and protein sequence motifs can be accessed from the Wadsworth Lab Bioinformatics Center directed by Dr. Chip Lawrence.
- PROBE. (Use command "tar -xvf probe.tar" after the download.) A UNIX (Sun OS) software tool for block-based multiple protein sequence alignment and for database search to detect remote protein homology. Its companion articles appeared in Nucleic Acids Res. (Neuwald et al. 1997) and J. Am. Statist. Assoc. (Liu et al. 1999).
- Bayesian Aligner. A Bayesian pairwise alignment tool; also called 'Bayesian Phylogenetic Footprint.' Its companion article appeared in Bioinformatics.
- BioProspector. An improved
web-interactive algorithm for finding gene regulatory binding motifs. See
the companion article published in the Proceedings of Pacific Symposium on Biocomputing.
- BLADE v2. Bayesian LinkAge DisEquilibrium mapping algorithm based on Liu et al. (2001), published in Genome Research. This executable program was produced by Dr. Xin Lu, with a companion publication, Lu, Niu and Liu (2003), in the same journal.
- HAPLOTYPER for SNP haplotype reconstruction based on the Partition-Ligation method (Niu et al. 2002), published in Am. J. Hum. Genet.
- PL-EM for SNP haplotype
reconstruction based on the Partition-Ligation method and EM Algorithm (Qin et al. 2002) published in Am. J. Hum. Genet.
- EM-DeCODER for SNP haplotype
reconstruction (with Z. Qin and T. Niu)
- MDScan. A new, fast, and accurate algorithm for finding protein-DNA interacting sites (gene regulatory binding motifs) from the 5' untranslated sequences selected by Chromatin-immunoprecipitation microarray (ChIP-array) and other microarray experiments. Its companion paper was published in Nature Biotechnology, 2002.
- BMC. A novel Bayesian algorithm for
putative motif clustering, see the companion paper published in Nature Biotechnology, 2003.
- Motif Regressor. An efficient algorithm for integrating sequence motif discovery with measures from mRNA expression microarray or Chromatin-Immunoprecipitation microarray (ChIP-chip) experiments. Its companion paper was published in Proc. Nat’l Acad. Sci. USA, 2003. A more user-friendly download site is maintained by Erin Conlon here.
- GMS-MP: Gibbs Motif Sampler for Paired Correlation Model. See Zhou & Liu (2004) in Bioinformatics.
- BioOptimizer: A Bayesian scoring
method for comparing and optimizing regulatory motif predictions from
AlignACE, BioProspector, CONSENSUS, and MEME. Read details in Jensen & Liu (2004) in Bioinformatics.
- Smoothing Spline Clustering (SSC)
algorithm (Ma et al. 2006): a data-driven statistical method for clustering
time-series gene expression data. A newer version of the software is
available here.
- BEAM (Bayesian Epistasis Association Mapping):
A powerful Bayesian inference algorithm for detecting marker interactions
in case-control population genetic studies. The companion paper, Zhang and Liu (2007) was published in Nature Genetics.
- Tmod (Toolbox for Motif Discovery): A Windows-based software suite that incorporates 12 different popular sequence motif discovery algorithms such as MEME, BioProspector, AlignACE, GLAM, YMF, etc. It helps researchers compare and combine motif-finding results from different algorithms. Compatible with Windows 2000, XP, Vista, and 7. The companion paper, Sun et al. (2009), has been accepted for publication in Bioinformatics.
- Mining
Biological Literature for Protein-Protein Interactions: Webserver is available!
The companion paper, Chowdhary, Zhang, and Liu (2009)
was published in Bioinformatics.
Programs and Scripts also available here.
Some Talks in Slides
Courses I Have Taught and Am Teaching
1. STAT 115 (Bioinformatics);
2. STAT 220 (Stochastic Processes);
1. STAT 171 (Stochastic Processes)
2. STAT 311 (Monte Carlo Computation)
1. STAT 171 (Stochastic Processes). Meetings: SC 309, TTh 11:30-1.
2. STAT 315 (Comp Biology and Related Topics). Meetings: SC 706, Th 2:30-4.
Students and Visitors
Former Ph.D. Students:
Chiara Sabatti (1998); Associate Professor, Dept Biostatistics, Stanford
Yuguo Chen (2001); Associate Professor, Dept Statistics, UIUC
Xiaole Liu (2002); Associate Professor, Dept Biostatistics and Dana-Farber Cancer Institute, HSPH.
Scott Schmidler (2001, delayed 2002); Associate Professor, Dept Statistics, Duke University
Mayetri Gupta (2003); Associate Professor, Dept Biostatistics, Boston University
Tanya Logvinenko (2003); Assist Prof of Medicine, Institute of CRHPS, Tufts University
Shane Jensen (2004); Associate Professor, Dept Statistics, Wharton School, University of Pennsylvania
Hosung Kang (2005); Quantitative Analyst, Washington Mutual.
Gopika Goswami (2005); Quantitative Analyst, Barclays Bank.
Peng Zhang (HSPH, 2007); Postdoc, Harvard.
W. Evan Johnson (HSPH, 2007); Assist Prof, Brigham Young University, Utah.
Jiajun Gu (DEAS, 2008); Analyst, Swiss Bank
Xiaodan Fan (2008); Assist Prof, Chinese University of Hong Kong
Tingting Zhang (2008, joint with Sam Kou); Assist Prof, Univ of Virginia
Paul Edlefsen (2009, joint with Art Dempster); Assistant Member, Fred Hutchinson Cancer Research Center
Yuan Yuan (2009); Researcher, Google
Wei Zhang (2009); Quantitative Analyst, UBS Investment Bank
Jing Maria Zhang (2009); Dept of Statistics, Yale University
Roee Gutman (2011, joint with Don Rubin); Assistant Prof., Department of Biostatistics, Brown University
Current Ph.D. Students:
Lei Guo
Bo Jiang
Simeng Han
Daniel Fernendez
Associates and Postdoctoral Fellows:
Saunak Sen (1998-1999). Current: UCSF
Tianhua Niu (2000-2001). Current: Brigham and Women's Hospital
Steve Qin (2000-2003). Current: Dept of Biostatistics and Bioinformatics, Emory University
Erin Conlon (2000-2003). Current: Dept of Math, U of Mass, Amherst
Haiyan Huang (2001-2003). Current: Dept of Stat, UC Berkeley
Xin Lu (2001-2004). Current: Dept of Biostat, UC San Diego
Lei Shen (2003-2005). Current: Senior Bioinformatics Scientist, Agencourt Bioscience Co.
Ping Ma (2003-2005). Current: Dept of Statistics, UIUC
Yu Zhang (2004-2006). Current: Dept of Statistics, Penn State U
Cristian Castillo-Davis (2004-2006). Current: Dept of Biology, U of Maryland, College Park.
Lihua Zou (2004-2006). Current: Research Scientist, Dana-Farber Cancer Institute
Guocheng Yuan (2005-2006). Current: Dept of Biostat, Harvard School of Public Health
Wenxuan Zhong (2005-2007). Current: Dept of Statistics, UIUC
Jinfeng Zhang (2004-2007). Current: Florida State University
Rajesh Chowdhary (2006-2008). Current: Marshfield Clinic Research Foundation
Xuxin Liu (2007-2010).
Ke Deng (2008-Present).
Feng Hong (2009-Present).
Rotation Students:
Junni Zhang (2000-2002)
Epaminondas Sourlas (2002)
Su Ying Quek (2001)
Calvin Chiu (2001)
Lihua Zou (2003)
Qing Zhou (2003)
James Signorovitch (HSPH, 2005-2006)
Jiong Du (PKU, 2009-present)
Visitors:
Wen Zhang (2003-2004). Research Scientist.
Xiaobin Dong (2003-2004).
Hongwei Xie (2004-2005): Professor and Chair, Technical University of Defense, China
Xuegong Zhang (2006): Professor and Director of Bioinformatics, Tsinghua U, China
Jianhua Xu (2008-2009): Professor, Nanjing Normal University
Jack Y. Yang (2007-present): Research Assoc, UCSD
Selected Publications and Reports:
My Top Ten Interesting Papers (till 2007)
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996
1995
1994
1993
1992
1991
My Book on Monte Carlo (2001)
Sample chapters from the book: Preface, Table of Contents, Chapter 1, Chapter 4, and Chapter 5.