177 research outputs found

    Rules, frequency, and predictability in morphological generalization: behavioral and computational evidence from the German plural system

    Morphological generalization, or the task of mapping an unknown word (such as a novel noun Raun) to an inflected form (such as the plural Rauns), has historically proven a contested topic within computational linguistics and cognitive science, e.g., within the past tense debate (Rumelhart and McClelland, 1986; Pinker and Prince, 1988; Seidenberg and Plaut, 2014). Marcus et al. (1995) identified German plural inflection as a key challenge domain to evaluate two competing accounts of morphological generalization: a rule generation view focused on linguistic features of input words, and a type frequency view focused on the distribution of output inflected forms, thought to reflect more domain-general cognitive processes. More recent behavioral and computational research developments support a new view based on predictability, which integrates both input and output distributions. My research uses these methodological innovations to revisit a core dispute of the past tense debate: how do German speakers generalize plural inflection, and can computational learners generalize similarly? This dissertation evaluates the rule generation, type frequency, and predictability accounts of morphological generalization in a series of behavioral and computational experiments with the stimuli developed by Marcus et al. I assess predictions for three aspects of German plural generalization: distribution of infrequent plural classes, influence of grammatical gender, and within-item variability. Overall, I find that speaker behavior is best characterized as frequency-matching to a phonologically-conditioned lexical distribution. This result does not support the rule generation view, and qualifies the predictability view: speakers use some, but not all, available information to reduce uncertainty in morphological generalization.
    Neural and symbolic model predictions are typically overconfident relative to speakers; simple Bayesian models show somewhat higher speaker-like variability and accuracy. All computational models are outperformed by a static phonologically-conditioned lexical baseline, suggesting these models have not learned the selective feature preferences that inform speaker generalization.

    Two Phases of Scaling Laws for Nearest Neighbor Classifiers

    A scaling law refers to the observation that the test performance of a model improves as the amount of training data increases. A fast scaling law implies that one can solve machine learning problems by simply increasing the data and model sizes. Yet, in many cases, the benefit of adding more data can be negligible. In this work, we study the rate of scaling laws of nearest neighbor classifiers. We show that a scaling law can have two phases: in the first phase, the generalization error depends polynomially on the data dimension and decreases fast; whereas in the second phase, the error depends exponentially on the data dimension and decreases slowly. Our analysis highlights the complexity of the data distribution in determining the generalization error. When the data are benignly distributed, our result suggests that a nearest neighbor classifier can achieve a generalization error that depends polynomially, instead of exponentially, on the data dimension.

    Predictive Learning from Real-World Medical Data: Overcoming Quality Challenges

    Randomized controlled trials (RCTs) are the gold standard in medical research but face challenges, especially with specific groups such as pregnant women and newborns. Real-world data (RWD), from sources like electronic medical records and insurance claims, complements RCTs in areas like disease risk prediction and diagnosis. However, RWD's retrospective nature leads to issues such as missing values and data imbalance, requiring intensive data preprocessing. To enhance RWD's quality for predictive modeling, this thesis introduces a suite of algorithms that automatically address RWD's data-quality issues. The AMI-Net method is introduced first; it treats samples as bags of feature-value pairs and unifies them in an embedding space using a multi-instance neural network. It excels in handling incomplete datasets, a frequent issue in real-world scenarios, and shows resilience to noise and class imbalances. AMI-Net's capability to discern informative instances minimizes the effects of low-quality data. The enhanced version, AMI-Net+, improves instance selection, boosting performance and generalization. However, the AMI-Net series initially processes only binary input features, a constraint overcome by AMI-Net3, which supports binary, nominal, ordinal, and continuous features. Despite these advancements, challenges like missing values, data inconsistencies, and labeling errors persist in real-world data. The AMI-Net series also shows promise for regression and multi-task learning, potentially mitigating low-quality data issues. Tested on various hospital datasets, these methods prove effective, though risks of overfitting and bias remain, necessitating further research. Overall, while these methods are promising for clinical studies and other applications, ensuring data quality and reliability is crucial for their success.

    k-Means


    Signatures of dissipative quantum chaos

    Understanding the far-from-equilibrium dynamics of dissipative quantum systems, where dissipation and decoherence coexist with unitary dynamics, is an enormous challenge with immense rewards. Often, the only realistic approach is to forgo a detailed microscopic description and search for signatures of universal behavior shared by collections of many distinct, yet sufficiently similar, complex systems. Quantum chaos provides a powerful statistical framework for addressing this question, relying on symmetries to obtain information not accessible otherwise. This thesis examines how to reconcile chaos with dissipation, proceeding along two complementary lines. In Part I, we apply non-Hermitian random matrix theory to open quantum systems with Markovian dissipation and discuss the relaxation timescales and steady states of three representative examples of increasing physical relevance: single-particle Lindbladians and Kraus maps, open free fermions, and dissipative Sachdev-Ye-Kitaev (SYK) models. In Part II, we investigate the symmetries, correlations, and universality of many-body open quantum systems, classifying several models of dissipative quantum matter. From a theoretical viewpoint, this thesis lays out a generic framework for the study of the universal properties of realistic, chaotic, and dissipative quantum systems. From a practical viewpoint, it provides the concrete building blocks of dynamical dissipative evolution constrained by symmetry, with potential technological impact on the fabrication of complex quantum structures. (Full abstract in the thesis.) Comment: PhD Thesis, University of Lisbon (2023). 264 pages, 54 figures. Partial overlap with arXiv:1905.02155, arXiv:1910.12784, arXiv:2007.04326, arXiv:2011.06565, arXiv:2104.07647, arXiv:2110.03444, arXiv:2112.12109, arXiv:2210.07959, arXiv:2210.01695, arXiv:2211.01650, arXiv:2212.00474, and arXiv:2305.0966

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    LIPIcs, Volume 274, ESA 2023, Complete Volume