Compositional Linguistic Generalization in Artificial Neural Networks

dc.contributor.advisor: Van Durme, Benjamin
dc.contributor.committeeMember: Smolensky, Paul
dc.contributor.committeeMember: Rawlins, Kyle
dc.contributor.committeeMember: Linzen, Tal
dc.contributor.committeeMember: Frank, Robert
dc.creator: Kim, Najoung
dc.date.accessioned: 2022-02-21T21:43:55Z
dc.date.available: 2022-02-21T21:43:55Z
dc.date.created: 2021-12
dc.date.issued: 2021-08-27
dc.date.submitted: December 2021
dc.date.updated: 2022-02-21T21:43:56Z
dc.description.abstract: Compositionality---the principle that the meaning of a complex expression is built from the meanings of its parts---is considered a central property of human language. This dissertation focuses on compositional generalization, a key benefit of compositionality that enables the production and comprehension of novel expressions. Specifically, this dissertation develops a test for compositional generalization for sequence-to-sequence artificial neural networks (ANNs). Before doing so, I start by developing a test for grammatical category abstraction: an important precondition to compositional generalization, because category membership determines the applicability of compositional rules. Then, I construct a test for compositional generalization based on human generalization patterns discussed in existing linguistic and developmental studies. The test takes the form of semantic parsing (translation from natural language expressions to semantic representations) where the training and generalization sets have systematic gaps that can be filled by composing known parts. The generalization cases fall into two broad categories: lexical and structural, depending on whether generalization to novel combinations of known lexical items and known structures is required, or generalization to novel structures is required. The ANNs evaluated on this test exhibit limited degrees of compositional generalization, implying that the inductive biases of the ANNs and human learners differ substantially. An error analysis reveals that all ANNs tested frequently make generalizations that violate faithfulness constraints (e.g., Emma saw Lina ↝ see'(Emma', Audrey') instead of see'(Emma', Lina')). Adding a glossing task (word-by-word translation)---a task that requires maximally faithful input-output mappings---as an auxiliary objective to the Transformer model (Vaswani et al. 2017) greatly improves generalization, demonstrating that a faithfulness bias can be injected through the auxiliary training approach. However, the improvement is limited to lexical generalization; all models struggle with assigning appropriate semantic representations to novel structures regardless of auxiliary training. This difficulty of structural generalization leaves open questions for both ANN and human learners. I discuss promising directions for improving structural generalization in ANNs, and furthermore propose an artificial language learning study for human subjects analogous to the tests posed to ANNs, which will lead to more detailed characterization of the patterns of structural generalization in human learners.
dc.format.mimetype: application/pdf
dc.identifier.uri: http://jhir.library.jhu.edu/handle/1774.2/66745
dc.language.iso: en_US
dc.publisher: Johns Hopkins University
dc.publisher.country: USA
dc.subject: Compositional Generalization, Artificial Neural Network
dc.title: Compositional Linguistic Generalization in Artificial Neural Networks
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Cognitive Science
thesis.degree.discipline: Cognitive Science
thesis.degree.grantor: Johns Hopkins University
thesis.degree.grantor: Krieger School of Arts and Sciences
thesis.degree.level: Doctoral
thesis.degree.name: Ph.D.
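The abstract's distinction between lexical and structural generalization can be illustrated with a minimal, hypothetical sketch of a semantic-parsing train/generalization split. The sentences, names, and logical forms below are invented for illustration and are not drawn from the dissertation's actual dataset:

```python
# Illustrative sketch (not the dissertation's code or data): a toy
# train/generalization split of the kind the abstract describes.

train = [
    # "Emma" is attested only in subject position during training.
    ("Emma saw Lina",                "see'(Emma', Lina')"),
    ("Lina saw Audrey",              "see'(Lina', Audrey')"),
    # One level of clausal embedding is attested ("that" appears once).
    ("Audrey saw that Lina saw Zoe", "see'(Audrey', see'(Lina', Zoe'))"),
]

# Lexical generalization: a known word in a novel combination with a
# known structure ("Emma" must now be parsed in object position).
lexical_gen = ("Lina saw Emma", "see'(Lina', Emma')")

# Structural generalization: known words in a novel structure (two levels
# of embedding, deeper than anything seen in training).
structural_gen = (
    "Emma saw that Lina saw that Audrey saw Zoe",
    "see'(Emma', see'(Lina', see'(Audrey', Zoe')))",
)

def vocab(sentence):
    return set(sentence.split())

def depth(sentence):
    # Crude embedding-depth proxy: count of complementizer "that".
    return sentence.split().count("that")

train_vocab = set().union(*(vocab(s) for s, _ in train))

# Both generalization cases use only training vocabulary; what differs is
# whether the required structure itself was attested during training.
assert vocab(lexical_gen[0]) <= train_vocab
assert vocab(structural_gen[0]) <= train_vocab
assert depth(structural_gen[0]) > max(depth(s) for s, _ in train)
```

The point of the split is that a learner with a compositional inductive bias can fill both gaps from the training evidence, whereas the abstract reports that the tested ANNs largely manage only the lexical case.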
Files

Original bundle
Name: KIM-DISSERTATION-2021.pdf
Size: 1.75 MB
Format: Adobe Portable Document Format

License bundle
Name: LICENSE.txt
Size: 2.67 KB
Format: Plain Text