
    Holistic corpus-based dialectology

    This paper is concerned with sketching future directions for corpus-based dialectology. We advocate a holistic approach to the study of geographically conditioned linguistic variability, and we present a suitable methodology, 'corpus-based dialectometry', in exactly this spirit. Specifically, we argue that in order to live up to the potential of the corpus-based method, practitioners need to (i) abandon their exclusive focus on individual linguistic features in favor of the study of feature aggregates, (ii) draw on computationally advanced multivariate analysis techniques (such as multidimensional scaling, cluster analysis, and principal component analysis), and (iii) aid interpretation of empirical results by marshalling state-of-the-art data visualization techniques. To exemplify this line of analysis, we present a case study which explores joint frequency variability of 57 morphosyntax features in 34 dialects all over Great Britain.

    Fish assemblages and indicator species: reef fishes off the southeastern United States

    For many fish stocks, resource management cannot be based on stock assessment because data are insufficient, a situation that requires alternative approaches to management. One possible approach is to manage data-limited stocks as part of an assemblage and to determine the status of the entire unit by a data-rich indicator species. The utility of this approach was evaluated in analyses of 15 years of commercial and 34 years of recreational logbook data from reef fisheries off the southeastern United States coast. Multivariate statistical analyses successfully revealed three primary assemblages. Within assemblages, however, there was little evidence of synchrony in population dynamics of member species, and thus, no support for the use of indicator species. Nonetheless, assemblages could prove useful as management units. Their identification offers opportunities for implementing management to address such ecological considerations as bycatch and species interrelations.

    Measuring Global Similarity between Texts

    We propose a new similarity measure between texts which, contrary to the current state-of-the-art approaches, takes a global view of the texts to be compared. We have implemented a tool to compute our textual distance and conducted experiments on several corpora of texts. The experiments show that our methods can reliably identify different global types of texts. Comment: Submitted to SLSP 201

    The Inferred Cardiogenic Gene Regulatory Network in the Mammalian Heart

    Cardiac development is a complex, multiscale process encompassing cell fate adoption, differentiation and morphogenesis. To elucidate pathways underlying this process, a recently developed algorithm to reverse engineer gene regulatory networks was applied to time-course microarray data obtained from the developing mouse heart. Approximately 200 genes of interest were input into the algorithm to generate putative network topologies that are capable of explaining the experimental data via model simulation. To cull specious network interactions, thousands of putative networks were merged and filtered to generate scale-free, hierarchical networks that are statistically significant and biologically relevant. The networks were validated with known gene interactions and used to predict regulatory pathways important for the developing mammalian heart. The areas under the precision-recall curve and the receiver operating characteristic curve are 9% and 58%, respectively. Of the top 10 ranked predicted interactions, 4 have already been validated. The algorithm was further tested using a network enriched with known interactions and another depleted of them. The inferred networks contained more interactions for the enriched network than for the depleted network. In all test cases, maximum performance of the algorithm was achieved when the purely data-driven method of network inference was combined with a data-independent, functional-based association method. Lastly, the network generated from the list of approximately 200 genes of interest was expanded using gene-profile uniqueness metrics to include approximately 900 additional known mouse genes and to form the most likely cardiogenic gene regulatory network. The resultant network supports known regulatory interactions and contains several novel cardiogenic regulatory interactions. The method outlined herein provides an informative approach to network inference and leads to clear testable hypotheses related to gene regulation.

    Computational strategies for dissecting the high-dimensional complexity of adaptive immune repertoires

    The adaptive immune system recognizes antigens via an immense array of antigen-binding antibodies and T-cell receptors, the immune repertoire. The interrogation of immune repertoires is of high relevance for understanding the adaptive immune response in disease and infection (e.g., autoimmunity, cancer, HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the quantitative and molecular-level profiling of immune repertoires, thereby revealing the high-dimensional complexity of the immune receptor sequence landscape. Several methods for the computational and statistical analysis of large-scale AIRR-seq data have been developed to resolve immune repertoire complexity in order to understand the dynamics of adaptive immunity. Here, we review the current research on (i) diversity, (ii) clustering and network, (iii) phylogenetic and (iv) machine learning methods applied to dissect, quantify and compare the architecture, evolution, and specificity of immune repertoires. We summarize outstanding questions in computational immunology and propose future directions for systems immunology towards coupling AIRR-seq with the computational discovery of immunotherapeutics, vaccines, and immunodiagnostics. Comment: 27 pages, 2 figures