
news
recent and upcoming talks
 Some fundamental ideas for causal inference on large networks. Yale (Statistics), November 9, 2015, New Haven, CT.
 Designing and analyzing experiments on large networks. Yale (Biostatistics), September 29, 2015, New Haven, CT.
 Some fundamental ideas for causal inference on large networks. MIT (Stochastics & Statistics), September 25, 2015, Cambridge, MA.
 Some fundamental ideas for causal inference on large networks. Cornell Day of Statistics, September 11, 2015, Ithaca, NY.
 Some fundamental ideas for causal inference on large networks. Harvard University (Center of Mathematical Sciences and Applications), August 24–26, 2015, Cambridge, MA.
 Optimal design of experiments in the presence of network interference. Microsoft Research New England (Statistics & Data Science Symposium), June 11–12, 2015, Cambridge, MA.
 Statistics and machine learning challenges in the analysis of large networks. University of Washington (Computer Science), June 2, 2015, Seattle, WA.
 Optimal design of experiments in the presence of network interference. University of Washington (Statistics), June 1, 2015, Seattle, WA.
 Optimal design of experiments in the presence of network interference. University of Chicago (Booth School of Business), April 2, 2015, Chicago, IL.
 Optimal design of experiments in the presence of network interference. National Academy of Sciences (Sackler Symposium on Drawing Causal Inference from Big Data), March 26–27, 2015, Washington, DC.
preprints
 Conditionally specified models for nonignorable missing data mechanisms.
(pdf)
 Variational inference with copula augmentation.
(pdf)
 Optimal design of experiments in the presence of network-correlated outcomes.
(pdf)
 Causal inference for ordinal outcomes.
(pdf)
 Geometric representations of distributions on hypergraphs.
(pdf)
 Bayesian inference from nonignorable network sampling designs.
(pdf)
 Sharp total variation bounds for finitely exchangeable arrays.
(pdf)
 The geometry of 2x2 contingency tables.
(java app)
(source code)
 Estimating cellular pathways from an ensemble of heterogeneous data sources.
(pdf)
selected publications
(see my CV
or Google Scholar
for more publications and bibliographic details)
statistical inference strategies for massive data sets
 Scalable estimation strategies based on stochastic approximations: Classical results and new insights.
Statistics and Computing, 2015.
(pdf)
 Implicit stochastic gradient methods for principled estimation with large data sets.
(pdf)
(a shorter version appeared at ICML 2014)
theory and methods for network data analysis
 Nonparametric estimation and testing of exchangeable graph models.
Journal of Machine Learning Research, W&CP, 2014.
(pdf)
 A consistent total variation estimator for exchangeable graph models.
(pdf)
(a shorter version appeared at ICML 2014)
 Stochastic blockmodel approximation of a graphon: Theory and consistent estimation.
NIPS, 2013.
(pdf)
 Stochastic blockmodels with growing number of classes.
Biometrika, 2012.
(pdf)
 Confidence sets for network structure.
Statistical Analysis and Data Mining, 2011.
(pdf)
(a shorter version appeared at NIPS 2011)
 Graphlets decomposition of a weighted network.
Journal of Machine Learning Research, W&CP (AISTATS), 2011.
(pdf)
(MSR best student paper award, NESS 2012)
 A survey of statistical network models.
Foundations and Trends in Machine Learning, 2010.
(pdf)
 Mixed-membership stochastic blockmodels.
Journal of Machine Learning Research, 2008.
(pdf)
(r code)
(fast code)
(John Van Ryzin award, 2006)
geometry and inference in ill-posed inverse problems
 Estimating latent processes on a network from indirect measurements.
Journal of the American Statistical Association, 2013.
(pdf)
(supp)
(r code)
(IBM best student paper award, NESS 2011)
 Polytope samplers for inference in ill-posed inverse problems.
Journal of Machine Learning Research, W&CP, 2011.
(pdf)
 Tree preserving embedding.
Proceedings of the National Academy of Sciences, 2011.
(pdf)
(r code)
(a shorter version appeared at ICML 2011)
modeling and inference in high-throughput biology
 Estimating a structured covariance matrix from multi-lab measurements in high-throughput biology.
Journal of the American Statistical Association, in press.
(IBM best student paper award, NESS 2013)
(W. J. Youden Award in Interlaboratory Testing, ASA 2015)
 Generalized species sampling priors with latent beta reinforcements.
Journal of the American Statistical Association, 2014.
(pdf)
(data)
 Multi-way blockmodels for analyzing coordinated high-dimensional responses.
Annals of Applied Statistics, 2013.
(pdf)
(supp)
 Analysis and design of RNA sequencing experiments for identifying mRNA isoform regulation.
Nature Methods, 2010.
(pdf)
(supp)
(code)
 Ranking relations using analogies in biological and information networks.
Annals of Applied Statistics, 2010.
(pdf)
(code)
 Predicting cellular growth from gene expression signatures.
PLoS Computational Biology, 2009.
(pdf)
(code & data)
(a shorter version appeared at NIPS 2008)
applications in molecular biology
 Reversible, specific, active aggregates of endogenous proteins assemble upon heat stress.
Cell, 2015.
(pdf)
 Differential stoichiometry among core ribosomal proteins.
Cell Reports, 2015.
(pdf)
 Musashi proteins are post-transcriptional regulators of the epithelial-luminal cell state.
eLife, 2014.
(pdf)
(editor's choice in Science)
 Systems-level dynamic analyses of fate change in murine embryonic stem cells.
Nature, 2009.
(pdf)
(supp)
(F1000)
(news & views, Nat BT)
(editor's choice, Sci Sig)
 Coordination of growth rate, cell cycle, stress response and metabolic activity in yeast.
Molecular Biology of the Cell, 2008.
(pdf)
(code & data)
applied methodology in computational social science
 A model of text for experimentation in the social sciences.
(pdf)
(a shorter version appeared at NIPS 2013)
 A regularization scheme on word occurrence rates that improves estimation and interpretation of topical content (with discussion).
Journal of the American Statistical Association, 2016.
(pdf)
 Predicting traffic volumes and estimating the effects of shocks in massive transportation systems.
Proceedings of the National Academy of Sciences, 2015.
 A natural experiment of social network formation and dynamics.
Proceedings of the National Academy of Sciences, 2015.
 Reconceptualizing the classification of PNAS articles.
Proceedings of the National Academy of Sciences, 2010.
(pdf)
(editorial feature)
 Whose ideas? Whose words? Authorship of the Ronald Reagan radio addresses.
Political Science & Politics, 2007.
(pdf)
(op-ed by Skinner & Rice)
 Who wrote Ronald Reagan's radio addresses?
Bayesian Analysis, 2006.
(pdf)
(tr with detailed predictions)
(notes on Negative Binomial)
theses
 Bayesian mixed-membership models of complex and evolving networks.
Doctoral dissertation, 2007.
(Savage award honorable mention, 2007)
 The theory of weak convergence of probability measures and its applications in statistics.
Undergraduate thesis, 1999.
(Gold medal for best graduates, 1999)
