STREAMING CORESETS FOR HIGH-DIMENSIONAL GEOMETRY

dc.contributor.advisorSzalay, Alex
dc.contributor.committeeMemberAurora, Raman
dc.contributor.committeeMemberBraverman, Vladimir
dc.contributor.committeeMemberLu, Fei
dc.contributor.committeeMemberMaggioni, Mauro
dc.creatorLang, Harry
dc.date.accessioned2019-03-07T03:06:09Z
dc.date.available2019-03-07T03:06:09Z
dc.date.created2018-08
dc.date.issued2018-07-20
dc.date.submittedAugust 2018
dc.date.updated2019-03-07T03:06:09Z
dc.description.abstractThis thesis studies clustering problems on data streams, specifically with applications to metric spaces that are either non-Euclidean or of high-dimensionality. The algorithms are introduced to approximate minima of the k-median function, although they can be used to find approximate minima of many other functions are shown in Part I. In Part I, we introduce the first analysis of the effect of stream order on the space required to solve k-median. We use this to forge a connection to random-order streams and provide an optimal O(nk)-time algorithm for k-median in the RAM model. In Part II, we provide a nearly optimal algorithm to maintain a coreset for k-median. Specifically for the case of k-means, we introduce the first coreset to be simultaneously independent of both the dimension and the input size. This gives the first sublinear size coreset for high- dimensional sparse data. In Part III, we apply our techniques to streams where points may be deleted as well as inserted, and construct the first coreset for high-dimensional spaces.
dc.format.mimetypeapplication/pdf
dc.identifier.urihttp://jhir.library.jhu.edu/handle/1774.2/60041
dc.language.isoen_US
dc.publisherJohns Hopkins University
dc.publisher.countryUSA
dc.subjectstreaming, coresets
dc.titleSTREAMING CORESETS FOR HIGH-DIMENSIONAL GEOMETRY
dc.typeThesis
dc.type.materialtext
thesis.degree.departmentMathematics
thesis.degree.disciplineMathematics
thesis.degree.grantorJohns Hopkins University
thesis.degree.grantorKrieger School of Arts and Sciences
thesis.degree.levelDoctoral
thesis.degree.namePh.D.
Files
Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
LANG-DISSERTATION-2018.pdf
Size:
1.15 MB
Format:
Adobe Portable Document Format
License bundle
Now showing 1 - 1 of 1
No Thumbnail Available
Name:
LICENSE.txt
Size:
2.67 KB
Format:
Plain Text
Description: