425 research outputs found

    Finding structure in language

    Since the Chomskian revolution, it has become apparent that natural language is richly structured: it is naturally represented hierarchically, and complex context-sensitive rules are required to define regularities over these representations. It is widely assumed that the richness of the posited structure has strong nativist implications for mechanisms which might learn natural language, since it seemed unlikely that such structures could be derived directly from the observation of linguistic data (Chomsky, 1965). This thesis investigates the hypothesis that simple statistics of a large, noisy, unlabelled corpus of natural language can be exploited to discover some of the structure which exists in natural language automatically. The strategy is to initially assume no knowledge of the structures present in natural language, save that they might be found by analysing statistical regularities which hold between a word and the words which typically surround it in the corpus. To achieve this, various statistical methods are applied to define similarity between statistical distributions, and to infer a structure for a domain given knowledge of the similarities which hold within it. Using these tools, it is shown that it is possible to form a hierarchical classification of many domains, including words in natural language. When this is done, all the major syntactic categories can be obtained, and the classification is both relatively complete and very much in accord with a standard linguistic conception of how words are classified in natural language. The derived categorisation is then used as the basis of a similar classification of short sequences of words. When these are analysed in the same way, several syntactic categories can be derived, including simple noun phrases, various tensed forms of verbs, and simple prepositional phrases. The same technique can then be applied one level higher, where simple sentences and verb phrases, as well as more complicated noun phrases and prepositional phrases, are shown to be derivable.
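    To make the strategy concrete, here is a minimal sketch in Python of the kind of pipeline the abstract describes; the toy corpus, the one-word context window, the cityblock distance, and the average-link clustering are all illustrative assumptions, not the thesis's actual choices.

```python
# Represent each word by the statistics of its neighbouring words, then
# cluster words hierarchically by distributional similarity, so that
# syntactic categories can emerge from raw text alone.
from collections import Counter, defaultdict
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import pdist

corpus = "the cat sat on the mat the dog sat on the rug a cat saw a dog".split()
vocab = sorted(set(corpus))

# Count words occurring immediately left/right of each target word.
contexts = defaultdict(Counter)
for i, w in enumerate(corpus):
    if i > 0:
        contexts[w]["L:" + corpus[i - 1]] += 1
    if i < len(corpus) - 1:
        contexts[w]["R:" + corpus[i + 1]] += 1

features = sorted({f for c in contexts.values() for f in c})
X = np.array([[contexts[w][f] for f in features] for w in vocab], dtype=float)
X /= X.sum(axis=1, keepdims=True)  # normalise rows to context distributions

# Hierarchical clustering over distributional similarity.
Z = linkage(pdist(X, metric="cityblock"), method="average")
print(dendrogram(Z, labels=vocab, no_plot=True)["ivl"])  # leaf order groups similar words
```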

    Uncertainty Estimation, Explanation and Reduction with Insufficient Data

    Human beings constantly make decisions under uncertainty, trading off swift action against the collection of sufficient evidence. A generalized artificial intelligence (GAI) is naturally expected to navigate uncertainty while still predicting precisely. In this thesis, we propose strategies that underpin machine learning with uncertainty from three perspectives: uncertainty estimation, explanation and reduction. Estimation quantifies the variability in model inputs and outputs, enabling us to evaluate the model's predictive confidence. Explanation provides a tool to interpret the mechanism of uncertainty and to pinpoint opportunities for uncertainty reduction, which focuses on stabilizing model training, especially when data are insufficient. We hope that this thesis can motivate related studies on quantifying predictive uncertainty in deep learning. It also aims to raise awareness among other stakeholders in the fields of smart transportation and automated medical diagnosis, where data insufficiency induces high uncertainty. The thesis is organized into the following sections. Introduction: we justify the necessity of investigating AI uncertainty and clarify the challenges in the latest studies, followed by our research objective. Literature review: we break down the review of state-of-the-art methods into uncertainty estimation, explanation and reduction, and draw comparisons with related fields including meta learning, anomaly detection and continual learning. Uncertainty estimation: we introduce a variational framework, the neural process, which approximates Gaussian processes to handle uncertainty estimation; two variants from the neural process family are proposed to endow neural processes with scalability and continual learning. Uncertainty explanation: we inspect the functional distribution of neural processes to discover the global and local factors that affect the degree of predictive uncertainty. Uncertainty reduction: we validate the proposed uncertainty framework in two scenarios, urban irregular behaviour detection and neurological disorder diagnosis, where intrinsic data insufficiency undermines the performance of existing deep learning models. Conclusion: we provide promising directions for future work and conclude the thesis.
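    For orientation, the following is a minimal, untrained sketch of the neural process idea the thesis builds on; the architecture sizes, the mean aggregation, and the toy sine data are assumptions for illustration only, not the thesis's proposed variants.

```python
# Encode observed (x, y) context points into a permutation-invariant
# representation, then decode a predictive mean and variance at target
# inputs; the variance is what expresses uncertainty where data are scarce.
import torch
import torch.nn as nn

class TinyNeuralProcess(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden))
        self.decoder = nn.Sequential(nn.Linear(hidden + 1, hidden), nn.ReLU(),
                                     nn.Linear(hidden, 2))  # -> (mean, log-variance)

    def forward(self, x_ctx, y_ctx, x_tgt):
        # Mean over context points makes the representation order-invariant.
        r = self.encoder(torch.cat([x_ctx, y_ctx], dim=-1)).mean(dim=0)
        h = torch.cat([r.expand(len(x_tgt), -1), x_tgt], dim=-1)
        mean, log_var = self.decoder(h).chunk(2, dim=-1)
        return mean, log_var.exp()  # predictive mean and variance per target

x_ctx = torch.linspace(-1, 1, 5).unsqueeze(-1)   # 5 observed inputs
y_ctx = torch.sin(3 * x_ctx)                     # toy observations
x_tgt = torch.linspace(-2, 2, 9).unsqueeze(-1)   # query points, some far from data
mean, var = TinyNeuralProcess()(x_ctx, y_ctx, x_tgt)
print(mean.shape, var.shape)  # torch.Size([9, 1]) twice
```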

    What Makes Hollywood Run? Capitalist Power, Risk and the Control of Social Creativity

    This dissertation combines an interest in political economy, political theory and cinema to offer an answer about the pace of the Hollywood film business and its general modes of behaviour. More specifically, it seeks to find out how the largest Hollywood firms attempt to control social creativity such that the art of filmmaking and its related social relations under capitalism do not become financial risks in the pursuit of profit. Controlling the ways people make or watch films, the thesis argues, is an institutional facet of capitalist power. Capitalist power, the ability to control, modify and sometimes limit social creation through the rights of ownership, is the foundation of capital accumulation. For the Hollywood film business, capitalist power is about the ability of business concerns to set the terms that mould the future of cinema. The overall objective of Part I is to outline and rectify some of the methodological problems that obscure our understanding of how capital is accumulated from culture. Marxism stands as the theoretical foil for this argument. Because Marxism defines capital such that only economic activity can create value, it needs to clearly distinguish between economics and politics, yet this is a distinction it is ultimately unable to make. With this backdrop in mind, Part I introduces the capital-as-power approach and uses it as the foundation for an alternative political economic theory of capitalism. The capital-as-power approach views capital not as an economic category, but as a category of power. Consequently, this approach reframes the accumulation of capital as a power process. Part II focuses on the Hollywood film business. It investigates how, and to what extent, major filmed entertainment attempts to accumulate capital by lowering its risk. The process of lowering risk has characterized Hollywood's orientation toward the social-historical character of cinema and mass culture. This push to lower risk has been most apparent since the 1980s. In recent decades, major filmed entertainment has used its oligopolistic control of distribution to institute an order of cinema based on several key strategies: saturation booking, blockbuster cinema and high-concept filmmaking.

    A Defense of Pure Connectionism

    Connectionism is an approach to neural-networks-based cognitive modeling that encompasses the recent deep learning movement in artificial intelligence. It came of age in the 1980s, with its roots in cybernetics and earlier attempts to model the brain as a system of simple parallel processors. Connectionist models center on statistical inference within neural networks with empirically learnable parameters, which can be represented as graphical models. More recent approaches focus on learning and inference within hierarchical generative models. Contra influential and ongoing critiques, I argue in this dissertation that the connectionist approach to cognitive science possesses in principle (and, as is becoming increasingly clear, in practice) the resources to model even the most rich and distinctly human cognitive capacities, such as abstract, conceptual thought and natural language comprehension and production. Consonant with much previous philosophical work on connectionism, I argue that a core principle—that proximal representations in a vector space have similar semantic values—is the key to a successful connectionist account of the systematicity and productivity of thought, language, and other core cognitive phenomena. My work here differs from preceding work in philosophy in several respects: (1) I compare a wide variety of connectionist responses to the systematicity challenge and isolate two main strands that are both historically important and reflected in ongoing work today: (a) vector symbolic architectures and (b) (compositional) vector space semantic models; (2) I consider very recent applications of these approaches, including their deployment on large-scale machine learning tasks such as machine translation; (3) I argue, again on the basis mostly of recent developments, for a continuity in representation and processing across natural language, image processing and other domains; (4) I explicitly link broad, abstract features of connectionist representation to recent proposals in cognitive science similar in spirit, such as hierarchical Bayesian and free energy minimization approaches, and offer a single rebuttal of criticisms of these related paradigms; (5) I critique recent alternative proposals that argue for a hybrid Classical (i.e. serial symbolic)/statistical model of mind; (6) I argue that defending the most plausible form of a connectionist cognitive architecture requires rethinking certain distinctions that have figured prominently in the history of the philosophy of mind and language, such as that between word- and phrase-level semantic content, and between inference and association
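    The core principle named above, that proximal representations in a vector space have similar semantic values, can be illustrated with a toy sketch; the vectors and dimensionality below are made up for the example and stand in for learned embeddings.

```python
# Semantic relatedness falls out of geometric proximity: nearby vectors
# (high cosine similarity) are read as having similar meanings.
import numpy as np

embeddings = {                       # hypothetical 4-d "semantic" vectors
    "cat":  np.array([0.9, 0.8, 0.1, 0.0]),
    "dog":  np.array([0.8, 0.9, 0.2, 0.0]),
    "idea": np.array([0.0, 0.1, 0.9, 0.8]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(embeddings["cat"], embeddings["dog"]))   # high: proximal vectors
print(cosine(embeddings["cat"], embeddings["idea"]))  # low: distant vectors
```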

    Annual Research Report 2020


    Proceedings of minisemester on evolution of interfaces, Sapporo 2010

    conf: Special Project A, Proceedings of minisemester on evolution of interfaces, Sapporo (Department of Mathematics, Hokkaido University, July 12 - August 13, 2010)

    Computer-aided detection and diagnosis of breast cancer in 2D and 3D medical imaging through multifractal analysis

    This Thesis describes the research work performed in the scope of a doctoral research program and presents its conclusions and contributions. The research activities were carried out in industry with Siemens S.A. Healthcare Sector, integrated with a research team. Siemens S.A. Healthcare Sector is one of the world's biggest suppliers of products, services and complete solutions in the medical sector. The company offers a wide selection of diagnostic and therapeutic equipment and information systems. Siemens products for medical imaging and in vivo diagnostics include ultrasound, computed tomography, mammography, digital breast tomosynthesis, magnetic resonance, equipment for angiography and coronary angiography, nuclear imaging, and many others. Siemens has vast experience in healthcare, and at the beginning of this project it was strategically interested in solutions to improve the detection of breast cancer, to increase its competitiveness in the sector. The company owns several patents related to self-similarity analysis, which formed the background of this Thesis. Furthermore, Siemens intended to commercially explore the computer-aided automatic detection and diagnosis field for portfolio integration. Therefore, the deep knowledge acquired by the University of Beira Interior in this area, together with this Thesis, will allow Siemens to apply the most recent scientific progress to the detection of breast cancer, and it is foreseeable that together a new technology with high potential can be developed. The project resulted in the submission of two invention disclosures for evaluation at Siemens A.G., two articles published in peer-reviewed journals indexed in the ISI Science Citation Index, two further articles submitted to peer-reviewed journals, and several international conference papers. This work on computer-aided diagnosis in breast imaging led to innovative software and novel research and development processes, for which the project received the Siemens Innovation Award in 2012. It was very rewarding to carry out such a technological and innovative project in a socially sensitive area such as breast cancer.
    In breast cancer, early detection and correct diagnosis are of the utmost importance for prescribing effective and efficient therapy that increases the survival rate of the disease. Multifractal theory was initially introduced in the context of signal analysis, and its usefulness has been demonstrated in describing the physiological behaviour of biosignals and even in detecting and predicting pathologies. In this Thesis, three multifractal methods were extended to two-dimensional (2D) images and compared for the detection of microcalcifications in mammograms. One of these methods was also adapted to classify breast masses in 2D cross-sections obtained by breast magnetic resonance imaging (MRI) into groups of probably benign masses and masses suspicious of malignancy. A new multifractal analysis method using three-dimensional (3D) lacunarity was proposed to classify breast masses in 3D volumetric breast MRI images. Multifractal analysis revealed differences in the underlying complexity at microcalcification locations relative to normal tissue, enabling good detection accuracy in mammograms. Additionally, tissue features extracted through multifractal analysis made it possible to identify the cases typically recommended for biopsy in 2D breast MRI images. 3D multifractal analysis was effective in classifying benign and malignant breast lesions in 3D breast MRI images, and was more accurate for this classification than the 2D method or the standard analysis of tumour kinetic contrast. In conclusion, multifractal analysis provides useful information for computer-aided detection in mammography and computer-aided diagnosis in 2D and 3D breast MRI, with the potential to complement radiologists' interpretation.
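    As a rough illustration of the lacunarity measure mentioned above, here is a minimal sketch of the classical gliding-box lacunarity on a 2D binary image; the thesis extends such measures to 3D breast MRI, which is not reproduced here, and the synthetic image and box sizes below are illustrative assumptions.

```python
# Gliding-box lacunarity: slide a box of side r over the image, record the
# "mass" (occupied pixels) at every position, and take the ratio of the
# second moment to the squared first moment. Higher values mean gappier,
# more heterogeneous texture at that scale.
import numpy as np

def gliding_box_lacunarity(image: np.ndarray, box_size: int) -> float:
    h, w = image.shape
    masses = np.array([
        image[i:i + box_size, j:j + box_size].sum()
        for i in range(h - box_size + 1)
        for j in range(w - box_size + 1)
    ], dtype=float)
    mean = masses.mean()
    return (masses ** 2).mean() / mean ** 2 if mean > 0 else np.nan

rng = np.random.default_rng(0)
img = (rng.random((64, 64)) < 0.1).astype(int)  # sparse synthetic binary mask
for r in (2, 4, 8, 16):
    print(r, gliding_box_lacunarity(img, r))    # lacunarity across scales
```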

    Compositional Linguistic Generalization in Artificial Neural Networks

    Compositionality---the principle that the meaning of a complex expression is built from the meanings of its parts---is considered a central property of human language. This dissertation focuses on compositional generalization, a key benefit of compositionality that enables the production and comprehension of novel expressions. Specifically, this dissertation develops a test for compositional generalization for sequence-to-sequence artificial neural networks (ANNs). Before doing so, I start by developing a test for grammatical category abstraction: an important precondition to compositional generalization, because category membership determines the applicability of compositional rules. Then, I construct a test for compositional generalization based on human generalization patterns discussed in existing linguistic and developmental studies. The test takes the form of semantic parsing (translation from natural language expressions to semantic representations) where the training and generalization sets have systematic gaps that can be filled by composing known parts. The generalization cases fall into two broad categories: lexical and structural, depending on whether generalization to novel combinations of known lexical items and known structures is required, or generalization to novel structures is required. The ANNs evaluated on this test exhibit limited degrees of compositional generalization, implying that the inductive biases of the ANNs and human learners differ substantially. An error analysis reveals that all ANNs tested frequently make generalizations that violate faithfulness constraints (e.g., Emma saw Lina ↝ see'(Emma', Audrey') instead of see'(Emma', Lina')). Adding a glossing task (word-by-word translation)---a task that requires maximally faithful input-output mappings---as an auxiliary objective to the Transformer model (Vaswani et al. 2017) greatly improves generalization, demonstrating that a faithfulness bias can be injected through the auxiliary training approach. However, the improvement is limited to lexical generalization; all models struggle with assigning appropriate semantic representations to novel structures regardless of auxiliary training. This difficulty of structural generalization leaves open questions for both ANN and human learners. I discuss promising directions for improving structural generalization in ANNs, and furthermore propose an artificial language learning study for human subjects analogous to the tests posed to ANNs, which will lead to more detailed characterization of the patterns of structural generalization in human learners
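    The lexical/structural split described above can be made concrete with a toy example; the sentences and the primed semantic representations below are hypothetical illustrations in the abstract's notation, not items from the actual test.

```python
# Lexical generalization: known words must be composed in combinations never
# seen in training. Structural generalization: a structure itself (e.g. a PP
# inside an object noun phrase) is absent from training.
train = [
    ("Emma saw Lina",    "see'(Emma', Lina')"),
    ("Lina saw Audrey",  "see'(Lina', Audrey')"),
    ("Emma saw the dog", "see'(Emma', dog')"),
]
lexical_generalization = [
    # "Audrey" occurred in training, but never as a subject.
    ("Audrey saw Emma", "see'(Audrey', Emma')"),
]
structural_generalization = [
    # No training example has a PP modifying the object noun phrase.
    ("Emma saw the dog on the mat", "see'(Emma', dog'(on'(mat')))"),
]
# A faithfulness violation would be predicting see'(Emma', Audrey') for
# "Emma saw Lina": the output mentions material absent from the input.
```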

    Tense-aspect processing in second language learners

    This dissertation provides a language processing perspective on the study of second language acquisition (SLA) of tense and aspect. Of special interest are the universal vis-à-vis language-specific dimensions of the temporal and aspectual semantics involved. According to the Aspect Hypothesis (AH, e.g. Andersen & Shirai, 1994), the initial acquisition and subsequent emergence of (perfective) past tense and progressive aspect morphology follow a semantically driven, universal sequence. The AH appeals to a cognitively based prototype account (Shirai & Andersen, 1995), and has gained ample empirical support from offline data in the past two decades. Mounting evidence of transfer, however, has begun to emerge in recent psycholinguistic research, suggesting that grammatical aspectual categories such as the English progressive have a non-trivial influence on principles of information organization in language comprehension among L2 learners and bilingual speakers (Stutterheim & Carroll, 2006). This dissertation undertakes a psycholinguistic investigation of L2 learners’ processing of English past and progressive morphology. Participants included native English speakers as well as English L2 learners from Korean, German, and Mandarin Chinese backgrounds, whose L1s differ systematically with respect to past and progressive morphology. This cross-linguistic design enabled a systematic test of both the prototype and transfer hypotheses in a single study. Three word-by-word self-paced reading experiments examined L2 learners’ automaticity in morphological processing, the universality of tense-aspect prototypes, and aspectual coercion. Experiment I generated evidence that L2 learners were generally capable of detecting tense-aspect morphosyntactic errors online. Reading time results from Experiment II revealed that L2 learners did not show uniform processing advantages afforded by tense-aspect prototypes; instead, L1 effects on prototypes emerged, at least in the evidence from processing L2 tense-aspect distinctions. Experiment III investigated the processing consequences of aspectual coercion in L2 learners, and the results indicated strong L1 influence. The most robust finding across the three experiments is that the L2 learners showed clear L1-based variations in their performance, reflecting a strong tendency for transfer. Notably, these results were obtained after controlling for L2 proficiency and inflected verb form frequencies. These findings implicate a more prominent role for L1 influence in L2 learners’ representation of tense-aspect prototypes than previously assumed.
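    For readers unfamiliar with the paradigm, here is a minimal sketch of how word-by-word self-paced reading data are typically analysed; the reading times, sample size, and example sentences below are synthetic assumptions, not the dissertation's materials or results.

```python
# Compare reading times at the critical verb region between a grammatical and
# an ungrammatical tense-aspect condition; a reliable slowdown on the
# ungrammatical verb indicates online detection of the morphosyntactic error.
import numpy as np

rng = np.random.default_rng(42)
n = 30  # hypothetical number of participants
rt_grammatical   = rng.normal(380, 40, n)  # ms at e.g. "was sleeping"
rt_ungrammatical = rng.normal(430, 40, n)  # ms at e.g. "*was slept"

diff = rt_ungrammatical - rt_grammatical
t = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))  # paired t statistic
print(f"mean slowdown = {diff.mean():.1f} ms, t({n - 1}) = {t:.2f}")
```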