Education and Experiences


Awards and Honors


Research Interests

  • Statistical missing data problems, imputation methodology.
  • Gibbs sampling and other MCMC methods, rate of convergence.
  • Markov structure, graphical models (software BUGS), and genetics.
  • Image reconstruction: PET, SPECT, etc.
  • Bayesian methodology; Even Bill Gates talks about Bayesian ideas!!
  • Nonparametric hierarchical models, model selection and testing.
  • Large-scale computation and optimization, e.g., VLSI design; Dynamic systems; Computer vision.
  • Monte Carlo filters, Sequential importance sampling and resampling.





(JASA 1999,  94, 1-15)                      (Protein Sci. 1995, 4, 1618-32)

 


Main Collaborators' Websites


Computational Biology Software

The following downloadable software packages for analyzing biopolymer sequence data were developed by my collaborators and me. Please cite the related articles if you use them in your research. They are listed chronologically:

  • MACAW. A self-extracting Windows program developed jointly with people at NCBI. It searches for and aligns a subtly conserved single block motif among multiple sequences, assuming one occurrence in each sequence. See its companion articles published in Science (Lawrence et al. 1993) and J. Am. Statist. Assoc. (Liu 1994).
  • Gibbs Motif Sampler. (Use command "tar -xvf gibbs9_95.tar" after the download.) A UNIX (SunOS) program that searches for multiple motifs with an unknown number of repeats in multiple protein sequences. Its companion articles were published in J. Am. Statist. Assoc. (Liu et al. 1995) and Protein Sci. (Neuwald et al. 1995). A server of the Motif Sampler, for discovering both DNA regulatory binding sites and protein sequence motifs, can be accessed from the Wadsworth Lab Bioinformatics Center directed by Dr. Chip Lawrence.
  • PROBE. (Use command "tar -xvf probe.tar" after the download.) A UNIX (SunOS) software tool for block-based multiple protein sequence alignment and for database search to detect remote protein homology. Its companion articles appeared in Nucleic Acids Res. (Neuwald et al. 1997) and J. Am. Statist. Assoc. (Liu et al. 1999).
  • Bayesian Aligner. A Bayesian pairwise alignment tool; also called 'Bayesian Phylogenetic Footprint.' Its companion article appeared in Bioinformatics.
  • BioProspector. An improved web-interactive algorithm for finding gene regulatory binding motifs. See the companion article published in the Proceedings of Pacific Symposium on Biocomputing.
  • BLADE v2. Bayesian LinkAge DisEquilibrium mapping algorithm based on Liu et al. (2001), published in Genome Research. This executable program was produced by Dr. Xin Lu, with a companion publication, Lu, Niu and Liu (2003), in the same journal.
  • HAPLOTYPER (Users' Documentation). For SNP haplotype reconstruction based on the Partition-Ligation method (Niu et al. 2002), published in Am. J. Hum. Genet.
  • PL-EM. For SNP haplotype reconstruction based on the Partition-Ligation method and the EM algorithm (Qin et al. 2002), published in Am. J. Hum. Genet.
  • EM-DeCODER. For SNP haplotype reconstruction (with Z. Qin and T. Niu).
  • MDScan. A new, fast, and accurate algorithm for finding protein-DNA interacting sites (gene regulatory binding motifs) from the 5' untranslated sequences selected by Chromatin-immunoprecipitation microarray (ChIP-array) and other microarray experiments. Its companion paper was published in Nature Biotechnology, 2002.
  • BMC. A novel Bayesian algorithm for putative motif clustering. See the companion paper published in Nature Biotechnology, 2003.
  • Motif Regressor. An efficient algorithm for integrating sequence motif discovery with measures from mRNA expression microarray or Chromatin-Immunoprecipitation microarray (ChIP-chip) experiments. Its companion paper was published in Proc. Nat'l Acad. Sci. USA, 2003. A more user-friendly download site is maintained by Erin Conlon here.
  • GMS-MP: Gibbs Motif Sampler for Paired Correlation Model. See Zhou & Liu (2004) in Bioinformatics.
  • BioOptimizer: A Bayesian scoring method for comparing and optimizing regulatory motif predictions from AlignACE, BioProspector, CONSENSUS, and MEME. Read details in Jensen & Liu (2004) in Bioinformatics.
  • Smoothing Spline Clustering (SSC) algorithm (Ma et al. 2006): a data-driven statistical method for clustering time-series gene expression data. A newer version of the software is available here.
  • BEAM (Bayesian Epistasis Association Mapping): A powerful Bayesian inference algorithm for detecting marker interactions in case-control population genetic studies. The companion paper, Zhang and Liu (2007), was published in Nature Genetics.
  • Tmod (Toolbox for Motif Discovery): A software suite that incorporates 12 different popular sequence motif discovery algorithms such as MEME, BioProspector, AlignACE, GLAM, YMF, etc. It helps researchers to compare and combine motif finding results from different algorithms.

Some Talks in Slides


Courses I Have Taught and Am Teaching

1.     STAT 115 (Bioinformatics)
2.     STAT 220 (Stochastic Processes)

  • Year 2002-2003:
      • Spring:
          1. STAT 171 (Stochastic Processes)
          2. STAT 311 (Monte Carlo Computation)
  • Year 2003-2004:
      • Autumn: STAT 220 (Bayesian modeling and computation)
      • Spring:
          1. STAT 171 (Stochastic Processes)
          2. STAT 311 (Monte Carlo Computation)
  • Year 2004-2005:
      • Spring:
          1. STAT 171 (Stochastic Processes). Meetings: SC 309, TTh 11:30-1.
          2. STAT 315 (Comp Biology and Related Topics). Meetings: SC 706, Th 2:30-4.



Former Ph.D. Students:

·        Chiara Sabatti (1998); Associate Professor, Dept Biostatistics, Stanford

·        Yuguo Chen (2001); Associate Professor, Dept Statistics, UIUC

·        Xiaole Liu (2002); Associate Professor, Dept Biostatistics and Dana-Farber Cancer Institute, HSPH.

·        Scott Schmidler (2001, delayed 2002); Assistant Professor, Dept Statistics, Duke University

·        Mayetri Gupta (2003); Assistant Professor, Dept Biostatistics, Boston University

·        Tanya Logvinenko (2003); Assist Prof of Medicine, Institute of CRHPS, Tufts University

·        Shane Jensen (2004); Assistant Professor, Dept Statistics, Wharton School, University of Pennsylvania

·        Hosung Kang (2005); Quantitative Analyst, Washington Mutual.

·        Gopika Goswami (2005); Quantitative Analyst, Barclays Bank.

·        Peng Zhang (HSPH, 2007), Postdoc, Harvard.

·        W. Evan Johnson (HSPH, 2007), Assist Prof, Brigham Young University, Utah.

·        Jiajun Gu (DEAS, 2008), Analyst, Swiss Bank

·        Xiaodan Fan (2008), Assist Prof, Chinese University of Hong Kong

·        Tingting Zhang (2008, Joint with Sam Kou), Assist Prof, Univ of Virginia

·        Paul Edlefsen (2009, Joint with Art Dempster), Harvard Fellow

·        Yuan Yuan (2009), Researcher, Google

·        Wei Zhang (2009), Quantitative Analyst, UBS Investment Bank

·        Jing Maria Zhang (2009), Postdoctoral Fellow, Harvard University

Current Ph.D. Students:

·        Roee Gutman (joint with Don Rubin)

·        Lei Guo

·        Bo Jiang

·        Simeng Han

·        Daniel Fernandez

Associates and Postdoctoral fellows:

·        Saunak Sen (1998-1999). Current: UCSF

·        Tim Niu (2000-2001). Current: HSPH and Harvard Medical School

·        Steve Qin (2000-2003). Current: Dept of Biostat, U of Michigan

·        Erin Conlon (2000-2003). Current: Dept of Math, U of Mass, Amherst

·        Haiyan Huang (2001-2003). Current: Dept of Stat, UC Berkeley

·        Xin Lu (2001-2004). Current: Dept of Biostat, UC San Diego

·        Lei Shen (2003-2005). Current: Senior Bioinformatics Scientist, Agencourt Bioscience Co.

·        Ping Ma (2003-2005). Current: Dept of Statistics, UIUC

·        Yu Zhang (2004-2006). Current: Dept of Statistics, Penn State U

·        Cristian Castillo-Davis (2004-2006). Current: Dept of Biology, U of Maryland, College Park.

·        Lihua Zou (2004-2006). Current: Research Scientist, Dana-Farber Cancer Institute

·        Guocheng Yuan (2005-2006). Current: Dept of Biostat, Harvard School of Public Health

·        Wenxuan Zhong (2005-2007). Current: Dept of Statistics, UIUC

·        Jinfeng Zhang (2004-2007). Current: Florida State University

·        Rajesh Chowdhary (2006-2008). Current: Marshfield Clinic Research Foundation

·        Xuxin Liu (2007 – Present).

·        Ke Deng (2008 – Present).

·        Feng Hong (2009 – Present).

Rotation Students:

·        Junni Zhang (2000-2002)

·        Epaminondas Sourlas (2002)

·        Su Ying Quek (2001)

·        Calvin Chiu (2001)

·        Lihua Zou (2003)

·        Qing Zhou (2003)

·        James Signorovitch (HSPH, 2005-2006)

Visitors:

·        Wen Zhang (2003-2004). Research Scientist.

·        Xiaobin Dong (2003-2004).

·        Hongwei Xie (2004-2005): Professor and Chair, Technical University of Defense, China

·        Xuegong Zhang (2006): Professor and Director of Bioinformatics, Tsinghua U, China

·        Jianhua Xu (2008-2009): Professor, Nanjing Normal University

·        Jack Y. Yang (2007-present): Assist Prof, Harvard Medical School

·        Mary Q. Yang (2007-present): Scientist, NIH

 



Selected Publications and Technical Reports:

·  My Ten Interesting Papers

·  2009

·  2008

·  2007

·  2006

·  2005

·  2004

·  2003

·  2002

·  2001

·  2000

·  1999

·  1998

·  1997

·  1996

·  1995

·  1994

·  1993

·  1992

·  1991

 

My Book on Monte Carlo (2001)

Sample chapters from the book: Preface, Table of Contents, Chapter 1, Chapter 4, and Chapter 5.