31,371 research outputs found

    Protein sectors: statistical coupling analysis versus conservation

    Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed "sectors". The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. Comment: 36 pages, 17 figures

    t-Exponential Memory Networks for Question-Answering Machines

    Recent advances in deep learning have brought to the fore models that can make multiple computational steps in the service of completing a task; these are capable of describing long-term dependencies in sequential data. Novel recurrent attention models over possibly large external memory modules constitute the core mechanisms that enable these capabilities. Our work addresses learning subtler and more complex underlying temporal dynamics in language modeling tasks that deal with sparse sequential data. To this end, we improve upon these recent advances by adopting concepts from the field of Bayesian statistics, namely variational inference. Our proposed approach consists in treating the network parameters as latent variables with a prior distribution imposed over them. Our statistical assumptions go beyond the standard practice of postulating Gaussian priors. Indeed, to allow for handling outliers, which are prevalent in long observed sequences of multivariate data, multivariate t-exponential distributions are imposed. On this basis, we proceed to infer corresponding posteriors; these can be used for inference and prediction at test time, in a way that accounts for the uncertainty in the available sparse training data. Specifically, to allow for our approach to best exploit the merits of the t-exponential family, our method considers a new t-divergence measure, which generalizes the concept of the Kullback-Leibler divergence. We perform an extensive experimental evaluation of our approach, using challenging language modeling benchmarks, and illustrate its superiority over existing state-of-the-art techniques.

    Replication and discovery of musculoskeletal QTLs in LG/J and SM/J advanced intercross lines

    AR056280 awarded to DAB and AL. AIHC supported by IMS and Elphinstone Scholarship from the University of Aberdeen. GRV supported by Medical Research Scotland (Vac-929-2016). Peer reviewed. Publisher PD

    Handbook Of Liquid Crystal Research

    K2P$^2$ - A photometry pipeline for the K2 mission

    With the loss of a second reaction wheel, resulting in the inability to point continuously and stably at the same field of view, the NASA Kepler satellite recently entered a new mode of observation known as the K2 mission. The data from this redesigned mission present a specific challenge: the targets systematically drift in position on a ~6 hour time scale, inducing a significant instrumental signal in the photometric time series, which greatly impacts the ability to detect planetary signals and perform asteroseismic analysis. Here we detail our version of a reduction pipeline for K2 target pixel data, which automatically: defines masks for all targets in a given frame; extracts the target's flux and position time series; corrects the time series based on the apparent movement on the CCD (either in 1D or 2D) combined with the correction of instrumental and/or planetary signals via the KASOC filter (Handberg & Lund 2014), thus rendering the time series ready for asteroseismic analysis; computes power spectra for all targets; and identifies potential contaminations between targets. From a test of our pipeline on a sample of targets from the K2 campaign 0, the recovery of data for multiple targets increases the amount of potential light curves by a factor ${\geq}10$. Our pipeline could be applied to the upcoming TESS (Ricker et al. 2014) and PLATO 2.0 (Rauer et al. 2013) missions. Comment: 14 pages, 20 figures, Accepted for publication in The Astrophysical Journal (ApJ)

    Multi-document Summarization Based on Sentence Clustering Improved Using Topic Words

    Information in the form of news text has become one of the most important commodities in the information era. A great deal of news is produced every day, but these news items often present the same contextual content with different narratives. A method is therefore needed to gather this information into a simple summary. The subtasks involved in multi-document summarization include sentence extraction, topic detection, representative sentence extraction, and representative sentences. In this paper, we propose a new method for representing sentences based on keywords from the text topic using Latent Dirichlet Allocation (LDA). The method consists of three basic steps. First, we cluster the sentences in the document set using similarity histogram clustering (SHC). Next, the clusters are ranked using cluster importance. Finally, representative sentences are selected by the topics identified with LDA. The proposed method was tested on the DUC2004 dataset. The results show averages of 0.3419 and 0.0766 for ROUGE-1 and ROUGE-2, respectively. In addition, from the reader's perspective, the proposed method presents the representative sentences in a coherent and well-ordered arrangement, which can ease reading comprehension and reduce the time needed to read the summary.
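
    The ROUGE-1 and ROUGE-2 scores quoted in this abstract measure unigram and bigram overlap between a generated summary and a reference summary. As a rough illustration only, the sketch below computes a recall-style ROUGE-N with clipped n-gram counts. The abstract does not say which ROUGE variant (recall, precision, or F-score) or which tokenization was used, so the whitespace tokenization, the lowercasing, and the function names `ngrams` and `rouge_n_recall` are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Recall-style ROUGE-N: clipped n-gram matches / n-grams in the reference.

    Assumption: lowercased, whitespace-split tokens stand in for whatever
    preprocessing an official ROUGE evaluation would apply.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram can be matched at most as often as it occurs
    # in the candidate (clipped counting).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    summary = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n_recall(summary, gold, n=1))
    print(rouge_n_recall(summary, gold, n=2))
```

    Scores reported on DUC2004 are normally averaged over many document sets and several reference summaries, so a single-pair function like this one only shows the core counting step.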

    Correspondence matching with modal clusters

    The modal correspondence method of Shapiro and Brady aims to match point-sets by comparing the eigenvectors of a pairwise point proximity matrix. Although elegant by means of its matrix representation, the method is notoriously susceptible to differences in the relational structure of the point-sets under consideration. In this paper, we demonstrate how the method can be rendered robust to structural differences by adopting a hierarchical approach. To do this, we place the modal matching problem in a probabilistic setting in which the correspondences between pairwise clusters can be used to constrain the individual point correspondences. We demonstrate the utility of the method on a number of synthetic and real-world point-pattern matching problems
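
    For orientation, the baseline that this abstract builds on can be sketched in a few lines: form a Gaussian proximity matrix for each point-set, take its eigenvectors, and pair points whose rows of the eigenvector (modal) matrices are mutually closest. The sketch below is not the hierarchical, probabilistic method the paper proposes. It is a simplified version of the basic idea, and it compares absolute values of eigenvector components as a shortcut for handling eigenvector sign ambiguity. The function names, the `sigma` and `n_modes` parameters, and the use of NumPy are assumptions made for this example.

```python
import numpy as np


def modal_matrix(points, sigma):
    """Eigenvectors of the Gaussian proximity matrix, strongest mode first."""
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    proximity = np.exp(-sq_dist / (2.0 * sigma ** 2))
    eigvals, eigvecs = np.linalg.eigh(proximity)  # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order]


def modal_correspondences(a, b, sigma=1.0, n_modes=None):
    """Return (i, j) pairs where a[i] and b[j] are mutual nearest neighbours
    in modal feature space.

    Assumption: absolute values of the components are compared, a
    simplification that sidesteps the arbitrary sign of each eigenvector.
    """
    va, vb = modal_matrix(a, sigma), modal_matrix(b, sigma)
    k = min(len(a), len(b)) if n_modes is None else n_modes
    fa, fb = np.abs(va[:, :k]), np.abs(vb[:, :k])
    dist = np.linalg.norm(fa[:, None, :] - fb[None, :, :], axis=-1)
    best_b = dist.argmin(axis=1)  # closest b-row for every a-row
    best_a = dist.argmin(axis=0)  # closest a-row for every b-row
    return [(i, int(j)) for i, j in enumerate(best_b) if best_a[j] == i]


if __name__ == "__main__":
    a = np.array([[0.0, 0.0], [1.0, 0.2], [2.5, 1.0], [0.7, 2.0], [3.0, 3.1]])
    perm = [2, 0, 4, 1, 3]
    b = a[perm] + np.array([10.0, -5.0])  # shuffled, translated copy
    print(modal_correspondences(a, b))
```

    Because the proximity matrix depends only on pairwise distances, a shuffled and translated copy of a point-set is matched exactly. Adding or removing points changes the eigenvectors globally, which is the sensitivity to structural differences that the abstract describes.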