
dc.contributor.advisor: Ahmad, Yanif N.
dc.creator: Ring, Benjamin A
dc.date.accessioned: 2018-05-22T03:41:09Z
dc.date.available: 2018-05-22T03:41:09Z
dc.date.created: 2017-12
dc.date.issued: 2017-08-29
dc.date.submitted: December 2017
dc.identifier.uri: http://jhir.library.jhu.edu/handle/1774.2/58647
dc.description.abstract: Advances in machine learning and streaming systems provide a backbone to transform vast arrays of raw data into valuable information. Leveraging distributed execution, analysis engines can process this information effectively within an iterative data exploration workflow to solve problems at unprecedented rates. However, with increased input dimensionality, a desire to simultaneously share and isolate information, and overlapping and dependent tasks, this process is becoming increasingly difficult to maintain. User interaction derails exploratory progress because lower-level tasks such as tuning parameters, adjusting filters, and monitoring queries require manual oversight. We identify human-in-the-loop management of data generation and distributed analysis as a problem that precludes efficient online, iterative data exploration and delays knowledge discovery and decision making. The flexible and scalable systems implementing the exploration workflow require semi-autonomous methods integrated as architectural support to reduce human involvement. We thus argue that an abstraction layer providing adaptive asynchronous control and consistency management over a series of individual tasks coordinated to achieve a global objective can significantly improve data exploration effectiveness and efficiency. This thesis introduces methodologies that autonomously coordinate distributed execution at a lower level in order to synchronize multiple efforts as part of a common goal. We demonstrate the impact on data exploration through serverless simulation ensemble management and multi-model machine learning, showing improved performance and reduced resource utilization that enable a more productive semi-autonomous exploration workflow. We focus on the specific domains of molecular dynamics and personalized healthcare; however, the contributions are applicable to a wide variety of domains.
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.publisher: Johns Hopkins University
dc.subject: Data Exploration
dc.subject: Adaptive Control
dc.subject: Machine Learning
dc.title: Adaptive Asynchronous Control and Consistency in Distributed Data Exploration Systems
dc.type: Thesis
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Johns Hopkins University
thesis.degree.grantor: Whiting School of Engineering
thesis.degree.level: Doctoral
thesis.degree.name: Ph.D.
dc.date.updated: 2018-05-22T03:41:09Z
dc.type.material: text
thesis.degree.department: Computer Science
dc.contributor.committeeMember: Woolf, Thomas B.
dc.contributor.committeeMember: Burns, Randal
dc.publisher.country: USA
dc.creator.orcid: 0000-0002-1699-3743

