NOVEL NON-PARAMETRIC METHODS FOR UNCOVERING STRUCTURE IN DATA: ESTIMATION OF PLACE AND GRID CELL FIELDS
Maximum likelihood (ML) estimators of probability density functions (pdfs) are the most popular parametric estimators today because they are often efficient to compute and have several desirable properties: consistency, asymptotic normality, functional invariance, attainment of the lower bound on the variance of the estimated parameters (the Cramér-Rao bound), and fast convergence rates. However, the underlying data are often too complex for the pdf to be parametrized easily. In such cases, non-parametric modeling remains the only option. Existing non-parametric methods, such as kernel density estimation (KDE), orthogonal series density estimates (OSDE) and orthogonal series square-root density estimates (OSSDE), are consistent. However, these estimators do not necessarily have the other properties of parametric ML estimators, and they have slower convergence rates. Non-parametric ML estimation, on the other hand, has not been rigorously studied, because in general the likelihood is hard to maximize in a non-parametric setting. One example of a non-parametric ML estimator is the histogram (or the empirical cumulative distribution function). The histogram is a consistent estimator, but it is discontinuous. Many pdfs in nature are smooth, and hence it is desirable to obtain smooth estimators. This thesis proposes a non-parametric ML estimator over the set of band-limited (BL) "smooth" pdfs, called the BLML estimator. This class contains pdfs whose Fourier transforms have finite support (with a certain cut-off frequency). A semi-closed form of the BLML estimator is derived and its consistency is shown. Although convergence rates are not derived analytically, the BLML estimator converges faster than the KDE and OSSDE methods in simulation. Algorithms for fast computation of the BLML estimators are also proposed, and their computational complexity is determined to be better than that of the KDE and OSSDE methods.
In fact, in simulation these BLML algorithms run an order of magnitude faster than the KDE and OSSDE methods for dense data. Finally, algorithms for estimating the unknown cut-off frequency are proposed. BLML methods are then used to construct an estimator for mutual dependence between different random variables. Mutual dependence measures the Bhattacharyya distance between the joint pdf and the product of the marginal pdfs and, unlike measures such as mutual information, Pearson correlation or distance correlation, is an "ideal" metric for measuring dependencies between random variables. Currently, mutual dependence is not directly estimable from data, and its estimation requires numerical integration, which can introduce errors. The consistency of the BLML estimator for mutual dependence is then proven, and simulations are used to show that its convergence rates are superior to those of the OSSDE-based estimator for mutual dependence and of the estimators for Pearson and distance correlation. Finally, an algorithm to estimate the cut-off frequencies for the BLML estimator for mutual dependence is developed, and its performance on a standard dataset from Wikipedia is shown. BLML methods are then applied to estimate the encoding fields of complex grid and place cells. The recently introduced "Fourier hypothesis" states that grid cell fields are approximately 2-dimensional cosine fields and that place cell fields are linear sums of such grid cell fields (1). Because the BLML methods are closely related to Fourier transforms, BLML estimation is a natural choice for testing this hypothesis. In particular, the conditional intensity functions of 53 place and grid cells are estimated using the BLML methods and the Bayesian framework. The performance of the BLML methods is then compared with that of KDE and of generalized linear models (GLMs), which are state-of-the-art parametric methods in neuroscience.
The BLML methods outperform both the KDE and GLM methods, validating the hypothesis. Further, the BLML methods (along with KDE) also successfully capture the history dependence in the firing patterns of place and grid cells. These methods are thereby able to explain the variance that has previously been observed in the firing patterns of place cells (2) and that has not been explained by other models for complex place fields. Finally, the BLML methods are used to decode the trajectory of a rat (using the firing patterns of the grid and place cells) with considerable accuracy (r² = 0.89).
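
As an illustration of why mutual dependence currently relies on numerical integration, the following minimal Python sketch evaluates a Bhattacharyya-type distance between a joint pdf and the product of its marginals on a grid. It is not the BLML estimator from the thesis. It assumes the standard definition of the Bhattacharyya distance, -ln ∫ sqrt(p q), and a bivariate Gaussian with unit variances and correlation rho, chosen only because a closed form is then available for comparison. The function names, grid width, and grid size are illustrative assumptions.

```python
import numpy as np


def bhattacharyya_gaussian_grid(rho, half_width=8.0, n=801):
    """Grid approximation of the Bhattacharyya distance between a
    zero-mean, unit-variance bivariate Gaussian with correlation rho
    (the joint pdf) and the product of its two standard normal marginals.
    """
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    X, Y = np.meshgrid(x, x, indexing="ij")

    det = 1.0 - rho**2
    # Joint pdf of the correlated pair.
    joint = np.exp(-(X**2 - 2.0 * rho * X * Y + Y**2) / (2.0 * det))
    joint /= 2.0 * np.pi * np.sqrt(det)
    # Product of the marginals, i.e. the pdf under independence.
    product = np.exp(-(X**2 + Y**2) / 2.0) / (2.0 * np.pi)

    # Bhattacharyya coefficient as a Riemann sum over the grid.
    coefficient = np.sum(np.sqrt(joint * product)) * dx * dx
    return -np.log(coefficient)


def bhattacharyya_gaussian_closed_form(rho):
    """Closed form for the same Gaussian pair (equal means)."""
    return 0.5 * np.log((1.0 - rho**2 / 4.0) / np.sqrt(1.0 - rho**2))


if __name__ == "__main__":
    for rho in (0.0, 0.4, 0.8):
        print(rho,
              bhattacharyya_gaussian_grid(rho),
              bhattacharyya_gaussian_closed_form(rho))
```

In this sketch the distance is zero for independent variables (rho = 0) and grows with the strength of the dependence. With real data the joint pdf is unknown and has to be replaced by a density estimate, so the accuracy of the integral depends on both the estimate and the grid.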