I work on a wide range of problems in statistics, probability, combinatorics, and their interactions. Specifically, I’m currently working on network-based approaches to understanding data. This includes inference for graphical models and the use of random graph models to study biological, social, and information networks.  Many questions are then of interest, such as:

1. How can we efficiently sample from a complex random graph model in which certain features are constrained?

2. When will Markov chain Monte Carlo (MCMC) be more efficient than sequential importance sampling (SIS), and vice versa? 

3. How can we identify commonly occurring patterns (“motifs”) in a network?

4. How can the parameters be estimated, avoiding commonly encountered degeneracy and convergence problems? 

Recent work in this area: 

A Sequential Importance Sampling Algorithm for Generating Random Graphs with Prescribed Degrees, Joseph Blitzstein and Persi Diaconis, submitted (2006). 

This paper gives an importance sampling algorithm for generating random graphs with given vertex degrees, using the Erdős-Gallai characterization, which efficiently tests whether a given sequence is the degree sequence of some graph. Applications such as simulating a food web and testing an exponential random graph model are given.

Code in R that implements the importance sampling algorithm for graphs is available: GraphAlgorithmR.txt

Generating Connected Random Graphs and Small World Networks, Joseph Blitzstein, submitted (2007).

In many applications, the network under consideration is required to be connected. We show how this additional constraint can be handled. As an example, the technique is applied to networks whose degrees are constrained to all be either 2 or 3, which is an interesting class of networks considered by Rick Durrett in his recent book Random Graph Dynamics. This quantitatively shows the emergence of the small world phenomenon of Watts and Strogatz.