# Research

## Current Projects in Preparation:

- Approximate $k$-Nearest-Neighbors for Structured Point Clouds in High Dimensions
- Fast Gaussian Elimination for Sparse Matrices with Field Coefficients

The projects above form backbones of manifold learning for point clouds and of persistent homology in computational topology, respectively, but both have much wider applications. The first uses randomization in a crucial way, while the second draws on ideas from algebraic topology and dynamical systems. I would like to study further applications of these and related ideas in the near future; in particular, through work on the second project, I have become interested in new techniques for numerical linear algebra. Here are more details:

## Approximate k-Nearest-Neighbors for Structured Point Clouds in High Dimensions

For data sets that can be viewed as collections of points in a high-dimensional metric space, many algorithms in manifold learning begin by solving the $k$-nearest-neighbor problem: for each query point $q$, find the $k$ points of the data set closest to $q$. If $N$ is the number of points in the data set, algorithms for this problem are appealing when the space required is linear in $N$, the preprocessing time is $O(N \log N)$, and the query time is polylogarithmic in $N$. Unfortunately, if the data is truly high-dimensional, say with ambient dimension $D$ in the thousands, a factor exponential in $D$ arises in at least one of these costs, which is too expensive to be practical. However, one often assumes the data has additional structure, such as being sampled from a manifold of dimension $d$ much smaller than $D$. Under this assumption, known algorithms pay a factor exponential in $d$ instead, which is acceptable when $d = O(\log \log N)$ but still not acceptable when $d$ is closer to $\log N$. Using ideas from concentration of measure and asymptotic geometric analysis, we are designing an algorithm that performs well in the latter regime, together with a modified set of assumptions that remain reasonable for many data sets.
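As a toy illustration of why low intrinsic dimension helps (this is not the project's algorithm, and all names and parameter values below are illustrative), the following Python sketch builds a point cloud lying in a $d$-dimensional subspace of $\mathbb{R}^D$, applies a Johnson–Lindenstrauss-style random projection to a modest target dimension, and compares the neighbors found in the projected space against the exact ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# Structured point cloud: N points in ambient dimension D, but lying
# in a random d-dimensional linear subspace (a crude stand-in for
# "sampled from a manifold of dimension d << D").
N, D, d, k = 500, 1000, 5, 10
latent = rng.normal(size=(N, d))   # intrinsic coordinates
embed = rng.normal(size=(d, D))    # linear embedding into R^D
X = latent @ embed

# Random projection down to dimension m; for data of low intrinsic
# dimension this nearly preserves pairwise distances with high probability.
m = 25
P = rng.normal(size=(D, m)) / np.sqrt(m)
Y = X @ P

def knn(points, q, k):
    """Indices of the k points closest to q (brute force)."""
    dists = np.linalg.norm(points - q, axis=1)
    return set(np.argsort(dists)[:k])

# Compare exact neighbors of the first point with neighbors in the
# projected space, measured by recall (fraction of true neighbors found).
exact = knn(X, X[0], k)
approx = knn(Y, Y[0], k)
recall = len(exact & approx) / k
print(f"recall@{k}: {recall:.2f}")
```

Searching in the projected space costs $O(Nm)$ per query instead of $O(ND)$; the sketch only illustrates the distance-preservation phenomenon, not the sublinear query times discussed above, which require an index on top of the projection.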

## Fast Gaussian Elimination for Sparse Matrices with Field Coefficients

Many applications in numerical computing require solving sparse linear systems:
systems of linear equations with few nonzero coefficients in each equation. Many solvers resort to some form of matrix reduction as a final step; Gaussian elimination, also known as LU decomposition, is the simplest of these. If $N$ is the number of columns of a square matrix with field coefficients, then the number of arithmetic operations is near cubic in $N$ in the worst case. One would hope, however, that because most entries of the matrix of a sparse linear system are $0$, the number of operations should be much smaller, say quadratic in $N$ or less. Using ideas from discrete Morse theory, algebraic topology, and graph theory, we are designing a direct solver for such systems with computation time $O(pNg)$, where $g$ is the number of pivots and $p$ is the maximum number of nonzero entries in each column.