Graph Inference and Graph Matching
Johns Hopkins University
Graphs are widely used in many fields of research, ranging from natural sciences to computer and mathematical sciences. Graph inference is an area of intense research. In this dissertation, we propose several methodologies in graph inference. We focus on statistical inference using graph invariants, vertex nomination, and a divide-and-conquer graph matching technique. We present a comparative power analysis of various graph invariants for testing the hypothesis that the graph has a subgraph with higher edge probability. Given a graph drawn from a kidney-egg random graph model, the null hypothesis is that all edge probabilities are equal. The alternative hypothesis is that there exists a subset of vertices which are more likely to be adjacenct to each other than the rest of the graph. Using Monte Carlo simulations, we estimate the power of several graph invariants acting as test statistics. We discovered that for many choices of parameters in the random graph model, the scan statistic and clustering coefficient often dominate other graph invariants. However, our results indicates that none of the graph invariants considered is uniformly most powerful. Given a graph drawn from a stochastic block model where one block is of particular interest, vertex nomination is the task of creating a list of vertices such that there are an abundance of vertices from the block of interest at the top of the list. Vertex nomination is useful in situations where only a limited number of vertices can be examined and have their block membership checked. We propose several vertex nomination schemes, derive theoretical results for performance, and compare the schemes on simulated and real data. Given two graphs, graph matching is to create a mapping from one set of vertices to the other, such that the edge structure of the graphs is preserved as best as possible. We develop a new method for scaling graph matching algorithms, and prove performance guarantees. Any graph matching algorithm can be scaled using our divide-and-conquer technique. The performance of this technique is demonstrated on large simulated graphs and human brain graphs.
Graph, graph inference, graph matching, vertex nomination, metropolis-hastings, spectral embedding, spectral partitioning, parallelization, divide and conquer,