Show simple item record

dc.contributor.advisorFill, James A.en_US
dc.contributor.authorChestnut, Stephen Roberten_US
dc.date.accessioned2015-09-16T03:37:02Z
dc.date.available2015-09-16T03:37:02Z
dc.date.created2015-05en_US
dc.date.issued2015-02-04en_US
dc.date.submittedMay 2015en_US
dc.identifier.urihttp://jhir.library.jhu.edu/handle/1774.2/37935
dc.description.abstractExact solutions are unattainable for important problems. The calculations are limited by the memory of our computers and the length of time that we can wait for a solution. The field of approximation algorithms has grown to address this problem; it is practically important and theoretically fascinating. We address three questions along these lines. What are the limits of streaming computation? Can we efficiently compute the likelihood of a given network of relationships? How robust are the solutions to combinatorial optimization problems? High speed network monitoring and rapid acquisition of scientific data require the development of space efficient algorithms. In these settings it is impractical or impossible to store all of the data, nonetheless the need for analyzing it persists. Typically, the goal is to compute some simple statistics on the input using sublinear, or even polylogarithmic, space. Our main contributions here are the complete classification of the space necessary for several types of statistics. Our sharpest results characterize the complexity in terms of the domain size and stream length. Furthermore, our algorithms are universal for their respective classes of statistics. A network of relationships, for example friendships or species-habitat pairings, can often be represented as a binary contingency table, which is {0,1}-matrix with given row and column sums. A natural null model for hypothesis testing here is the uniform distribution on the set of binary contingency tables with the same line sums as the observation. However, exact calculation, asymptotic approximation, and even Monte-Carlo approximation of p-values are so-far practically unattainable for many interesting examples. This thesis presents two new algorithms for sampling contingency tables. One is a hybrid algorithm that combines elements of two previously known algorithms. It is intended to exploit certain properties of the margins that are observed in some data sets. Our other algorithm samples from a larger set of tables, but it has the advantage of being fast. The robustness of a system can be assessed from optimal attack strategies. Interdiction problems ask about the worst-case impact of a limited change to an underlying optimization problem. Most interdiction problems are NP-hard, and furthermore, even designing efficient approximation algorithms that allow for estimating the order of magnitude of a worst-case impact has turned out to be very difficult. We suggest a general method to obtain pseudoapproximations for many interdiction problems.en_US
dc.format.mimetypeapplication/pdfen_US
dc.languageen
dc.publisherJohns Hopkins University
dc.subjectapproximation algorithmsen_US
dc.subjectstreamingen_US
dc.subjectsketchingen_US
dc.subjectinterdictionen_US
dc.subjectsampling contingency tablesen_US
dc.subjectcombinatorial optimizationen_US
dc.subjectmatroidsen_US
dc.titleStream sketches, sampling, and sabotageen_US
dc.typeThesisen_US
thesis.degree.disciplineApplied Mathematics & Statisticsen_US
thesis.degree.grantorJohns Hopkins Universityen_US
thesis.degree.grantorWhiting School of Engineeringen_US
thesis.degree.levelDoctoralen_US
thesis.degree.namePh.D.en_US
dc.type.materialtexten_US
thesis.degree.departmentApplied Mathematics and Statisticsen_US
dc.contributor.committeeMemberBasu, Amitabhen_US
dc.contributor.committeeMemberBraverman, Vladimiren_US
dc.contributor.committeeMemberZenklusen, Ricoen_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record