LEARNING MULTISCALE APPROXIMATIONS OF FUNCTIONS BETWEEN MANIFOLDS
Johns Hopkins University
In many machine learning applications, data sets lie in a high-dimensional ambient space but have low-dimensional structure: the intrinsic dimension is often much smaller than the ambient dimension. This observation has given rise to studies of manifold learning, in which the low-dimensional structure is a manifold, and dictionary learning, in which it is the set of sparse linear combinations of vectors from a finite dictionary. There has been far less research, however, on transformations between two high-dimensional data sets. Such transformations can be expensive to store and compute, and the high dimensionality of both data sets limits the applicability of existing algorithms. This thesis considers the problem of estimating a function between two high-dimensional data sets whose domain and range are supported on low-dimensional manifolds, given random samples in the domain and corresponding samples in the range perturbed by bounded noise. Geometric Multi-Resolution Analysis (GMRA) constructs low-dimensional geometric multiscale approximations of a data set lying on or near a manifold. We estimate the two unknown manifolds using GMRA and approximate the function locally by multiscale linear maps. We obtain the optimal learning rate up to a logarithmic factor, which depends on the intrinsic dimension of the data rather than the ambient dimension, thereby circumventing the curse of dimensionality in both the domain and the range.
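To make the idea of approximating a function locally by linear maps concrete, the following is a minimal illustrative sketch, not the GMRA construction itself (which builds a multiscale tree of adaptive partitions). It partitions the domain samples into cells with a few Lloyd iterations and fits one least-squares affine map per cell; all function names and parameters are illustrative.

```python
import numpy as np

def fit_local_linear_maps(X, Y, n_cells=8, seed=0):
    """Partition domain samples X into cells and fit an affine map
    Y ~ [x, 1] @ W on each cell. A toy stand-in for the thesis's
    multiscale construction: one scale, a flat partition."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_cells, replace=False)]
    for _ in range(10):  # a few Lloyd (k-means) iterations
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_cells):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    maps = {}
    for k in range(n_cells):
        idx = labels == k
        if idx.sum() < X.shape[1] + 1:
            continue  # too few points for a stable affine fit
        Xa = np.hstack([X[idx], np.ones((idx.sum(), 1))])
        W, *_ = np.linalg.lstsq(Xa, Y[idx], rcond=None)
        maps[k] = W
    return centers, maps

def predict(x, centers, maps):
    # use the affine map of the nearest cell that has one
    ks = sorted(maps)
    d = ((centers[ks] - x) ** 2).sum(-1)
    W = maps[ks[int(np.argmin(d))]]
    return np.hstack([x, 1.0]) @ W
```

On data from a manifold, refining the partition (more, smaller cells) trades variance for bias, which is the trade-off the multiscale analysis in the thesis balances across scales.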
Keywords: machine learning, high-dimensional data, approximation