JScholarship is an open repository of works about JHU, scholarship authored by JHU researchers, and Digital Collections of the Sheridan Libraries and Museums.
Oral history of S.P.
(Johns Hopkins University Sheridan Libraries, 2023-03-17)
“SP,” a member of the Johns Hopkins University class of 2023, talks with Kristen Diehl about growing up in Spanish Town, Jamaica, and her decision to attend college in the United States. She shares her early interest in medicine, her experience majoring in molecular and cellular biology, and how her career interests have expanded while attending Hopkins. She also discusses her involvement in student organizations, such as International Students at Hopkins, the Black Student Union, and Hopkins Sport Taekwondo.
The microanatomy of human skin in aging
(Johns Hopkins University, 2023-07-24)
Skin is the largest organ in the body, provides critical barrier functions, and yet visibly ages with time. Some global morphologic changes, such as skin wrinkling and increased fragility, are evident; however, how aging affects the skin at the microanatomical and cellular scales is poorly understood. Here, we developed a deep learning-based workflow to characterize in depth the microanatomical tissue and cellular features that change in human skin with age. We extract 1,090 objective structural features and identify 124 that are significantly affected by age. We identify eight biomarkers of human aging in skin, six of which were previously unknown, including dermal thinning, decreased size/number of hair follicles and sebaceous glands, and progressive horizontal alignment of the extracellular matrix and stromal cells. These biomarkers allow significantly improved prediction of skin age over comparison. These rigorously curated and segmented atlases of normal skin microarchitecture constitute important reference maps for future studies in age-associated diseases of the skin.
Some Topics in Neural Network-based System Identification
(Johns Hopkins University, 2023-07-20)
Neural network-based system identification is a modeling technique that uses a neural network to learn the relationship between the input states and output states of a system. The neural network is trained on a set of input-output pairs, and once trained, it can be used to predict the system response to new inputs. Neural network-based system identification is particularly useful for modeling complex systems with nonlinear dynamics and unknown or time-varying behavior. It is commonly used in control engineering, signal processing, and robotics to model and control complex systems.
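As a rough illustration of the idea described in this abstract (not the thesis's own method), the sketch below trains a very small network on input-output pairs from a simulated system and then uses it to predict the response to new inputs. The plant equation, the `TinyMLP` class, and all hyperparameters are hypothetical choices made only for this example.

```python
import math
import random


def simulate(u, y0=0.0):
    """Hypothetical nonlinear plant: y[k+1] = 0.5 * y[k] + tanh(u[k])."""
    y = [y0]
    for uk in u:
        y.append(0.5 * y[-1] + math.tanh(uk))
    return y


def make_dataset(n, seed):
    """Build (state, input) -> next-state pairs from one simulated run."""
    rng = random.Random(seed)
    u = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    y = simulate(u)
    X = [[y[k], u[k]] for k in range(n)]
    t = [y[k + 1] for k in range(n)]
    return X, t


class TinyMLP:
    """One hidden tanh layer and a linear output (illustrative only)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.W2, h)) + self.b2

    def sgd_step(self, x, target, lr):
        """One stochastic gradient step on the loss 0.5 * (prediction - target)^2."""
        h = self._hidden(x)
        err = sum(w * hj for w, hj in zip(self.W2, h)) + self.b2 - target
        for j, hj in enumerate(h):
            # Gradient w.r.t. the hidden pre-activation, using the old W2[j].
            grad_pre = err * self.W2[j] * (1.0 - hj * hj)
            self.W2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_pre
            for i, xi in enumerate(x):
                self.W1[j][i] -= lr * grad_pre * xi
        self.b2 -= lr * err


def fit(model, X, t, lr=0.02, epochs=50):
    """Plain per-sample SGD over the training pairs."""
    for _ in range(epochs):
        for x, tk in zip(X, t):
            model.sgd_step(x, tk, lr)
    return model


def mse(model, X, t):
    """Mean squared one-step prediction error."""
    return sum((model.predict(x) - tk) ** 2 for x, tk in zip(X, t)) / len(t)
```

A typical use would be to call `make_dataset` twice with different seeds, fit a `TinyMLP(2, 8)` on the first set, and compare `mse` on the second set before and after training to see how well the learned model generalizes to unseen inputs.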
DATA ANALYSIS AND MACHINE LEARNING FOR ENHANCING RESILIENCE TO FIRE, FROM IGNITION MAPPING TO STRUCTURAL AND SYSTEMS MODELING
(Johns Hopkins University, 2023-07-19)
Fire hazards pose significant threats to our communities. Mitigation of fire risk requires an understanding of a range of issues and processes at various scales, such as the occurrence of ignitions in a community, the performance of a building structure under fire, and the efficiency of prevention and protection strategies. In this thesis, we investigate several of these fire safety issues through the lens of data-driven methods. Bayesian methods and machine learning techniques are adopted and tailored to address selected fire hazards and provide contributions toward solving these challenges for a fire resilient built environment. The thesis focuses first on fire ignitions. It investigates the problems of fire following earthquakes and wildfires. These issues are studied at the scale of a city or a region. For fire following earthquakes, a hierarchical Bayesian method is developed to allow modeling with scarce data, while for wildfire ignitions, an ensemble-based machine learning model is adopted. Then, the thesis zooms in to the building scale to assess data-based methods for evaluation of structural fire performance. Surrogate models are derived based on machine learning to capture the capacity of slender steel members in fire. Finally, the thesis investigates a framework to assess system resilience under fire hazards. The framework is applied for resilience assessment of facilities subjected to fires in the process industry. The thesis applies different data-based and modeling approaches to deal with fire hazards from different perspectives and at different scales, with the aim to enhance fire safety and build a more resilient environment against fires for our community.
UNSUPERVISED SEGMENTAL MODELING OF SPEECH FOR LOW RESOURCE APPLICATIONS
(Johns Hopkins University, 2023-07-19)
Voice-enabled interfaces for human-machine interaction have made significant progress in recent years. Most of the success can be attributed to deep neural networks trained on thousands of hours of transcribed data. However, vast amounts of labeled data are not available for most spoken languages worldwide, e.g., regional languages. Here we explore alternate techniques that can learn directly from data with minimal or no manual transcriptions. The representation techniques employed to characterize the speech signal dictate the performance of unsupervised systems. Self-supervised methods such as Contrastive Predictive Coding (CPC) have emerged as a promising technique for representation learning from unlabeled speech data. Based on the observation that the acoustic information, e.g., phones, changes more slowly than the feature extraction rate in CPC, we propose regularization techniques that impose slowness constraints on the features. First, we propose two regularization techniques: Self-expressing constraint and Left-or-Right regularization. Our modifications outperform the baseline CPC in monolingual, cross-lingual, or multilingual settings on the ABX and linear phone classification benchmarks. However, CPC and our modifications mainly look at the audio signal's structure at the frame level. The speech structure exists beyond the frame level, i.e., at the phone level or even higher. We propose a segmental contrastive predictive coding (SCPC) framework to learn from the signal structure at both the frame and phone levels. SCPC is a hierarchical model with three stages trained in an end-to-end manner. In the first stage, the model predicts future feature frames and extracts frame-level representation from the raw waveform. In the second stage, a differentiable boundary detector finds variable-length segments. In the last stage, the model predicts future segments to learn segment representations.
Experiments show that our model outperforms existing phone and word segmentation methods on the TIMIT and Buckeye datasets. In the last part, we explore knowledge distillation from text encoders (e.g., RoBERTa) to speech encoders in an unsupervised manner in a multimodal setting. Text encoders operate at the sub-word level, while speech encoders operate at a much smaller scale, i.e., frames. Our segmental framework allows us to downsample frames and generate sub-words. SCPC enables us to leverage pretrained text encoders in an audio-visual and audio-only setting. We show significant performance improvements on the audio-image retrieval and semantic similarity tasks.
MEASUREMENT AND MITIGATION OF MICROBIAL CONTAMINATION ON VARIOUS PERSONAL PROTECTIVE EQUIPMENT
(Johns Hopkins University, 2023-07-20)
The recent global health crisis has called for methods to measure and mitigate microbial contamination on various personal protective equipment. In particular, this document focused on: the implementation of a modified AATCC-100 with qRT-PCR-assisted absolute and relative quantification for quantifiable tracing of both antimicrobial and microbial behavioral properties; the comparison of pipette tip repurposing efficiencies among lab detergent, ozone, and CAP; and the prospective application of vapor hydrogen peroxide, ozone, and CAP for mask repurposing. Log reductions from the modified AATCC-100 were compared to identify time-dependent antimicrobial properties of silver ion-containing wound dressing samples. The antimicrobial properties of the wound dressing samples diminished as incubation days increased for both PCR and the cell viability assay, while data from qRT-PCR generally produced a lower standard deviation than the culture methods and were hence shown to be more precise. Complementary parallel analysis of samples using both methods better characterized the antimicrobial properties of the tested samples. A parallel analysis using classical methods alongside the application of relative quantification displayed changes in the expression of virulence-related genes. Although molecular assays targeting specific virulence activities are needed to verify the change in activities, relative quantification efficiently provided insight into changes specific to virulence in model organisms. A contamination evaluation protocol was outlined to evaluate the efficacies of the following repurposing methods: washing with a common laboratory detergent, exposure to ozone vapor, and CAP. Efficacy was determined by turnover ratio and log reduction in detectable genomic material of the contaminated products via real-time quantitative PCR (qPCR).
Ozone at 14400 PPM * minute is fully optimized, while CAP shows promising potential after optimization. The application of ozone, hydrogen peroxide, and CAP is further explored for mask repurposing. Although further experimentation with BFE is needed, the minimal change in the physical properties of repurposed masks showed the promising potential of these treatments as non-destructive repurposing methods.
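For readers unfamiliar with the log reduction metric mentioned in this abstract, the following minimal sketch shows the standard base-10 definition. The abstract does not give the thesis's exact calculation, so the function names and the example counts are assumptions used only for illustration.

```python
import math


def log_reduction(count_before, count_after):
    """Standard log10 reduction between two positive counts.

    The counts could be, for example, detected genome copies before and
    after a treatment (hypothetical inputs, not data from the thesis).
    """
    if count_before <= 0 or count_after <= 0:
        raise ValueError("counts must be positive")
    return math.log10(count_before / count_after)


def percent_reduction(log_red):
    """Convert a log10 reduction into a percentage reduction."""
    return 100.0 * (1.0 - 10.0 ** (-log_red))
```

Under this definition, going from 1,000,000 to 1,000 detected copies is a 3-log reduction, which corresponds to removing 99.9% of the detectable material.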
PROGRAMMABLE M13 HYPERPHAGE DISPLAY OF DIVERSE ANIMAL TOXIN LIBRARIES TO DISCOVER MEMBRANE PROTEIN TARGETED THERAPEUTICS
(Johns Hopkins University, 2023-07-19)
Animal toxins and cysteine-reinforced miniproteins present underexplored landscapes for drug discovery due to their unique structural attributes, potent bioactivity, and selective targeting. Recognizing the value of these compounds for therapeutic strategies, and to meet the urgent need for large-scale screening of cysteine-reinforced miniproteins, we devised a high-throughput strategy targeting the discovery of innovative drug candidates. Central to this strategy is the construction of two distinct libraries: the 'Animal Toxin' library, assembled from UniProt database resources, and the 'Metavenome' library, developed to expand the former through sequence-homologous proteins within an extensive metagenomic database. Using programmable phage display integrated with high-throughput oligonucleotide library synthesis, we encoded these libraries with chosen polypeptides, thereby representing a substantial fraction of the cysteine-rich toxin universe for protein-protein interaction studies. We optimized our phage display using a programmable hyperphage technique for M13 phage display. Our hyperphage system enables the fusion of ligands with all five copies of the P3 protein expressed on the phage surface. This polyvalent display system not only enhances binding avidity but also amplifies the sensitivity to detect lower-affinity interactions. Moreover, we have coupled single-round screening with next-generation sequencing (NGS) to evaluate the binding of all library members simultaneously. This combination allows for rapid and efficient identification of potential ligands for target membrane proteins and can even detect interactions with very rare members of a library that might otherwise be outcompeted in the typical multiple-round panning process of traditional phage display.
As an initial demonstration of the utility of our platform, we focused on two distinct receptors, epidermal growth factor receptor (EGFR) and Mas-related G-protein coupled receptor member X4 (MrgprX4), which are implicated in cellular growth processes and pain/itch signaling, respectively. We rediscovered known ligands, identified novel binders, and provided insights into potential binding modalities of these ligands. Our findings demonstrate the potential of our libraries in discovering bioactive ligands and the prospects of these binders as scaffolds for novel therapeutics. Altogether, our platform offers a promising approach for membrane protein-targeted drug discovery and exploring the structural diversity of cysteine-rich miniproteins.
Robust Speaker Recognition using Perceptual and Adversarial Speech Enhancement
(Johns Hopkins University, 2023-07-19)
In Automatic Speaker Verification (ASV), we determine whether the speaker in the test utterance is identical to the previously enrolled speaker. Deep learning has significantly improved ASV performance. However, it is still susceptible to external disturbances and domain mismatches. A standard solution is data augmentation, i.e., adding noise and reverberations to the training data. We focus on developing pre-processing solutions that can be integrated with existing pipelines and advance empirical performance on state-of-the-art systems. For this, we pursue deep learning-based speech enhancement and develop solutions equipped with denoising, domain adaptation, and bandwidth extension (BWE). Existing speech enhancement solutions often lead to degradation in ASV performance, partly due to loss of speaker information. We propose using perceptual/deep features that leverage pre-trained models to handle this. We also prove the effectiveness of our denoiser by showing, through ablation studies, that it complements the missing noise class in the x-vector (training) data augmentation. We also improve the training data for telephony speaker verification, where it is a common practice to downsample higher-bandwidth microphone speech to a lower sampling frequency and apply telephone codecs. We propose to replace this by learning a mapping using a deep feature-based CycleGAN. Our novel technique improves training data and complements the prior techniques, including data augmentation. To handle bandwidth mismatch, we pursue BWE with time-domain architectures. We develop competent Generative Adversarial Networks (GANs): supervised (conditional GAN) and unsupervised (CycleGAN). Our findings indicate that unsupervised learning can give performance close to that of supervised learning. We also pursue joint learning of BWE with domain adaptation. Finally, with our proposed Self-FiLM scheme, we leverage self-supervised representations to guide BWE models better in unknown environments.
In conclusion, we provide evidence that speech enhancement can be used as a pre-processor for improving ASV. By testing on real data acquired from Speaker Recognition Evaluation challenges, we demonstrate the effectiveness of speaker-identity preserving denoisers. We also study the effectiveness of domain adaptation and Self-Supervised Learning to improve bandwidth extension. Our work opens research into a joint investigation of enhancement-related problems and better generative models to assist the x-vector.
Automating the Analysis and Improvement of Dynamic Programming Algorithms with Applications to Natural Language Processing
(Johns Hopkins University, 2023-07-20)
This thesis develops a system for automatically analyzing and improving dynamic programs, such as those that have driven progress in natural language processing and, more generally, computer science for decades. Finding a correct program with the optimal asymptotic runtime can be unintuitive, time-consuming, and error-prone. This thesis aims to automate this laborious process. To this end, we develop an approach based on (1) a high-level, domain-specific language called Dyna for concisely specifying dynamic programs; (2) a general-purpose solver to efficiently execute these programs; (3) a static analysis system that provides type analysis and worst-case time/space complexity analyses; (4) a rich collection of meaning-preserving transformations to programs, which systematizes the repeated insights of numerous authors when speeding up algorithms in the literature; and (5) a search algorithm for identifying a good sequence of transformations that reduce the runtime complexity given an initial, correct program. We show that, in practice, automated search, like the mental search performed by human programmers, can find substantial improvements to the initial program. Empirically, we show that many speed-ups described in the NLP literature could have been discovered automatically by our system. We provide a freely available prototype system at https://github.com/timvieira/dyna-pi
The Coevolution of Topography and Runoff Generation in Humid Landscapes
(Johns Hopkins University, 2023-07-14)
Topography is an important control on runoff generation, as slope and relief affect hydraulic gradients, and curvature affects the convergence or divergence of flow paths. Over long timescales, however, runoff also shapes topography through surface erosion. This coevolution suggests that there may be a close relationship between landscape hydrology and topography that could provide insights into both hydrological and geomorphic processes. However, we do not have a strong theoretical framework for how topography and runoff generation should be linked, nor have there been many studies to determine how these links are expressed in the field. Here I address these areas, focusing on humid climates where runoff is primarily generated through groundwater return flow and precipitation on saturated areas. First, I present a new coupled model of runoff generation and landscape evolution that incorporates fluvial erosion driven by runoff from a shallow aquifer, hillslope diffusion, and uplift. Then, I nondimensionalize the model under the condition of steady and uniform groundwater recharge, and provide a mathematical framework for understanding the link between hillslope length, geomorphic process rates, and subsurface hydrological properties. Next, I explore the hydrological function of coevolved landscapes in more detail, focusing particularly on the emergence of variable source area hydrology. For this aim, I extend the model and nondimensionalization to include evapotranspiration and a simple representation of the vadose zone. I show, among other things, that coevolution with subsurface hydrology can explain why steeper landscapes are likely to have smaller variably saturated areas than landscapes with more gentle topography, and link this difference to subsurface properties and climate. 
Lastly, I test some of the model predictions in the field by exploring the hydrologic and geomorphic differences between two small watersheds in the Piedmont physiographic province near Baltimore that have contrasting subsurface architecture. I show that the site with a thin permeable subsurface has larger variable source areas and shorter hillslopes than the site with a thick permeable subsurface, as predicted by the model. A full parameterization of the model for the two sites suggests that subsurface properties are necessary to explain these differences.