28 research outputs found

    Finding structure in language

    Since the Chomskian revolution, it has become apparent that natural language is richly structured, being naturally represented hierarchically, and requiring complex context-sensitive rules to define regularities over these representations. It is widely assumed that the richness of the posited structure has strong nativist implications for mechanisms which might learn natural language, since it seemed unlikely that such structures could be derived directly from the observation of linguistic data (Chomsky 1965). This thesis investigates the hypothesis that simple statistics of a large, noisy, unlabelled corpus of natural language can be exploited to discover some of the structure which exists in natural language automatically. The strategy is to initially assume no knowledge of the structures present in natural language, save that they might be found by analysing statistical regularities which pertain between a word and the words which typically surround it in the corpus. To achieve this, various statistical methods are applied to define similarity between statistical distributions, and to infer a structure for a domain given knowledge of the similarities which pertain within it. Using these tools, it is shown that it is possible to form a hierarchical classification of many domains, including words in natural language. When this is done, it is shown that all the major syntactic categories can be obtained, and the classification is both relatively complete, and very much in accord with a standard linguistic conception of how words are classified in natural language. Once this has been done, the categorisation derived is used as the basis of a similar classification of short sequences of words. If these are analysed in a similar way, then several syntactic categories can be derived. These include simple noun phrases, various tensed forms of verbs, and simple prepositional phrases. Once this has been done, the same technique can be applied one level higher, and at this level simple sentences and verb phrases, as well as more complicated noun phrases and prepositional phrases, are shown to be derivable.

    Genome Sequence of Striga asiatica Provides Insight into the Evolution of Plant Parasitism

    Parasitic plants in the genus Striga, commonly known as witchweeds, cause major crop losses in sub-Saharan Africa and pose a threat to agriculture worldwide. An understanding of Striga parasite biology, which could lead to agricultural solutions, has been hampered by the lack of genome information. Here, we report the draft genome sequence of Striga asiatica with 34,577 predicted protein-coding genes, which reflects gene family contractions and expansions that are consistent with a three-phase model of parasitic plant genome evolution. Striga seeds germinate in response to host-derived strigolactones (SLs) and then develop a specialized penetration structure, the haustorium, to invade the host root. A family of SL receptors has undergone a striking expansion, suggesting a molecular basis for the evolution of broad host range among Striga spp. We found that genes involved in lateral root development in non-parasitic model species are coordinately induced during haustorium development in Striga, suggesting a pathway that was partly co-opted during the evolution of the haustorium. In addition, we found evidence for horizontal transfer of host genes as well as retrotransposons, indicating gene flow to S. asiatica from hosts. Our results provide valuable insights into the evolution of parasitism and a key resource for the future development of Striga control strategies. Peer reviewed

    A machine learning approach to the unsupervised segmentation of mitochondria in subcellular electron microscopy data

    Recent advances in cellular and subcellular microscopy demonstrated its potential towards unravelling the mechanisms of various diseases at the molecular level. The biggest challenge in both human- and computer-based visual analysis of micrographs is the variety of nanostructures and mitochondrial morphologies. The state of the art is, however, dominated by supervised manual data annotation and early attempts to automate the segmentation process were based on supervised machine learning techniques which require large datasets for training. Given a minimal number of training sequences or none at all, unsupervised machine learning formulations, such as spectral dimensionality reduction, are known to be superior in detecting salient image structures. This thesis presents three major contributions developed around the spectral clustering framework which is proven to capture perceptual organization features. Firstly, we approach the problem of mitochondria localization. We propose a novel grouping method for the extracted line segments which describes the normal mitochondrial morphology. Experimental findings show that the clusters obtained successfully model the inner mitochondrial membrane folding and therefore can be used as markers for the subsequent segmentation approaches. Secondly, we developed an unsupervised mitochondria segmentation framework. This method follows the evolutional ability of human vision to extrapolate salient membrane structures in a micrograph. Furthermore, we designed robust non-parametric similarity models according to Gestaltic laws of visual segregation. Experiments demonstrate that such models automatically adapt to the statistical structure of the biological domain and return optimal performance in pixel classification tasks under the wide variety of distributional assumptions. The last major contribution addresses the computational complexity of spectral clustering. Here, we introduced a new anticorrelation-based spectral clustering formulation with the objective to improve both the speed and the quality of segmentation. The experimental findings showed the applicability of our dimensionality reduction algorithm to very large scale problems as well as asymmetric, dense and non-Euclidean datasets.

    The 11th Conference of PhD Students in Computer Science


    Sense and Respond

    Over the past century, the manufacturing industry has undergone a number of paradigm shifts: from the Ford assembly line (1900s) and its focus on efficiency to the Toyota production system (1960s) and its focus on effectiveness and JIDOKA; from flexible manufacturing (1980s) to reconfigurable manufacturing (1990s) (both following the trend of mass customization); and from agent-based manufacturing (2000s) to cloud manufacturing (2010s) (both deploying the value stream complexity into the material and information flow, respectively). The next natural evolutionary step is to provide value by creating industrial cyber-physical assets with human-like intelligence. This will only be possible by further integrating strategic smart sensor technology into the manufacturing cyber-physical value creating processes in which industrial equipment is monitored and controlled for analyzing compression, temperature, moisture, vibrations, and performance. For instance, in the new wave of the ‘Industrial Internet of Things’ (IIoT), smart sensors will enable the development of new applications by interconnecting software, machines, and humans throughout the manufacturing process, thus enabling suppliers and manufacturers to rapidly respond to changing standards. This reprint of “Sense and Respond” aims to cover recent developments in the field of industrial applications, especially smart sensor technologies that increase the productivity, quality, reliability, and safety of industrial cyber-physical value-creating processes.

    Combining SOA and BPM Technologies for Cross-System Process Automation

    This paper summarizes the results of an industry case study that introduced a cross-system business process automation solution based on a combination of SOA and BPM standard technologies (i.e., BPMN, BPEL, WSDL). Besides discussing major weaknesses of the existing custom-built solution and comparing them against experiences with the developed prototype, the paper presents a course of action for transforming the current solution into the proposed solution. This includes a general approach, consisting of four distinct steps, as well as specific action items that are to be performed for every step. The discussion also covers language and tool support and challenges arising from the transformation.

    A Molecular Anthropological Study of Altaian Histories Utilizing Population Genetics and Phylogeography

    This dissertation explores the genetic histories of several populations living in the Altai Republic of Russia. It employs an approach combining methods from population genetics and phylogeography to characterize genetic diversity in these populations, and places the results in a molecular anthropological context. Previously, researchers used anthropological, historical, ethnographic and linguistic evidence to categorize the indigenous inhabitants of the Altai into two groups – northern and southern Altaians. Genetic data obtained in this study were therefore used to determine whether these anthropological groupings resulted from historical processes involving different source populations, and if the observed geographical and anthropological separation between northern and southern Altaians also represented a genetic boundary between them. These comparisons were made by examining mitochondrial DNA (mtDNA) coding region single nucleotide polymorphisms (SNPs), control region sequences (including HVS1), and several complete mitochondrial genomes. Variation in the non-recombining portion of the Y-chromosome (NRY) was characterized with biallelic markers and short tandem repeat (STR) haplotypes. Overall, this work provided a high-resolution data set for both uniparentally inherited genetic marker systems. The resulting data were analyzed using both population genetic and phylogeographic methods. Northern Altaians (Chelkan, Kumandin and Tubalar) were distinct from the southern Altaians (Altai-kizhi) with both genetic systems, yet the Tubalar consistently showed evidence of admixture with southern Altaians, reflecting differences in the origin and population history of northern and southern groups as well as between ethnic northern Altaian populations. These results complement the observation of cultural differences as noted by anthropological/ethnographic research on Altaian populations. These differences likely reinforced and maintained the genetic differences between ethnic groups (i.e., a cultural barrier to genetic exchange). Therefore, biological and cultural lines of evidence suggest separate origins for northern and southern Altaians. Phylogeographic analysis of mtDNA and NRY haplotypes examined the impact of different historical events on genetic diversity in Altaians, including Neolithic expansions, the introduction of Kurgan cultures, the spread of Altaic-speakers, and the intrusion of the Mongol Empire. These insights also allowed for a greater understanding of the peopling of Siberia itself. The cultures of Altaian peoples ultimately helped to shape their current genetic variation.

    The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE)
