
    Probabilistic Model Incorporating Auxiliary Covariates to Control FDR

    Controlling the False Discovery Rate (FDR) while leveraging side information in multiple hypothesis testing is an emerging research topic in modern data science. Existing methods rely on test-level covariates while ignoring metrics about those covariates. This strategy may not be optimal for complex large-scale problems, where indirect relations often exist among test-level covariates and auxiliary metrics or covariates. We incorporate auxiliary covariates among test-level covariates in a deep black-box framework, named NeurT-FDR, which boosts statistical power and controls FDR for multiple hypothesis testing. Our method parametrizes the test-level covariates as a neural network and adjusts the auxiliary covariates through a regression framework, which enables flexible handling of high-dimensional features as well as efficient end-to-end optimization. We show that NeurT-FDR makes substantially more discoveries in three real datasets compared to competitive baselines. Comment: Short version of NeurT-FDR, accepted at CIKM 2022. arXiv admin note: substantial text overlap with arXiv:2101.0980
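For readers unfamiliar with FDR control, the sketch below shows the classical Benjamini-Hochberg step-up procedure, which uses p-values only and no covariates. It is not NeurT-FDR and is not taken from the abstract; the function name `benjamini_hochberg` and the default `alpha` are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Classical BH step-up procedure (illustrative baseline, not NeurT-FDR).

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k such that p_(k) <= alpha * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

Covariate-aware methods such as the one described above aim to make more discoveries than a covariate-free baseline of this kind at the same nominal FDR level.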

    Optimal model-free prediction from multivariate time series

    Forecasting a time series from multivariate predictors is a challenging problem, especially with model-free approaches. Most techniques, such as nearest-neighbor prediction, quickly suffer from the curse of dimensionality and overfitting for more than a few predictors, which has limited their application mostly to the univariate case. Therefore, selection strategies are needed that harness the available information as efficiently as possible. Since the right combination of predictors often matters, ideally all subsets of possible predictors should be tested for their predictive power, but the exponentially growing number of combinations makes such an approach computationally prohibitive. Here a prediction scheme that overcomes this strong limitation is introduced. It uses a causal pre-selection step that drastically reduces the number of possible predictors to the most predictive set of causal drivers, making a globally optimal search scheme tractable. The information-theoretic optimality is derived and practical selection criteria are discussed. As demonstrated for multivariate nonlinear stochastic delay processes, the optimal scheme can even be less computationally expensive than commonly used sub-optimal schemes like forward selection. The method suggests a general framework to apply the optimal model-free approach to select variables and subsequently fit a model to further improve a prediction or learn statistical dependencies. The performance of this framework is illustrated on a climatological index of the El Niño Southern Oscillation. Comment: 14 pages, 9 figures

    Ultrasonic Signal and Battery Voltage Tester for Underwater Locator Beacons (original title: Alat Uji Sinyal Ultrasonik dan Tegangan Baterai pada Underwater Locator Beacon)

    Every aircraft carries a flight data recorder (FDR) and every ship carries a voyage data recorder (VDR), commonly known to the general public as a black box. If an aircraft crashes into water or a ship sinks, the FDR or VDR becomes crucial evidence for determining the cause of the accident and how to avoid it in the future. Locating an FDR or VDR in a vast body of water requires a special device called an Underwater Locator Beacon (ULB), which automatically emits a signal when submerged. A properly functioning ULB is the main factor in finding the black box easily. The ULB is mounted on the data recording unit of the FDR or VDR. Under water, the ULB emits an ultrasonic signal at a frequency of 37.5 kHz ± 1 kHz, with a duration of 0.01 seconds at 1-second intervals. To test a ULB, it is inserted into the socket provided on the tester. The tester first checks the ULB battery voltage and then tests the frequency emitted by the ULB. The results are shown as text on a Liquid Crystal Display (LCD), as light from a Light Emitting Diode (LED), and as sound from a buzzer. The test results of this undergraduate thesis show that the ultrasound section, the ATMega88PA microcontroller, and the DC-to-DC converter work well, but that the voltage measurement lacks precision. This is because the reference voltage supplied by the DC-to-DC converter is not sufficiently stable.
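The signal figures quoted in the abstract (37.5 kHz ± 1 kHz, 0.01 s pulses once per second) can be turned into a simple pass/fail check. The sketch below is only an illustration of that arithmetic, not the thesis firmware; the constant and function names are assumptions.

```python
# Signal parameters as quoted in the abstract.
NOMINAL_HZ = 37_500.0    # nominal ULB frequency
TOLERANCE_HZ = 1_000.0   # allowed deviation (± 1 kHz)
PULSE_S = 0.01           # pulse duration in seconds
PERIOD_S = 1.0           # pulse repetition interval in seconds


def frequency_in_tolerance(measured_hz):
    """Return True if a measured frequency lies within 37.5 kHz ± 1 kHz."""
    return abs(measured_hz - NOMINAL_HZ) <= TOLERANCE_HZ


def duty_cycle():
    """Fraction of each interval during which the beacon is transmitting."""
    return PULSE_S / PERIOD_S


def cycles_per_pulse():
    """Number of ultrasonic cycles in one pulse at the nominal frequency."""
    return NOMINAL_HZ * PULSE_S
```

With these figures the beacon transmits for 1% of each interval, and each pulse contains about 375 cycles at the nominal frequency, which is the window a tester has for measuring the frequency.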

    Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site

    We introduce a novel method to screen the promoters of a set of genes with shared biological function against a precompiled library of motifs, and to find those motifs that are statistically over-represented in the gene set. The gene sets were obtained from the functional Gene Ontology (GO) classification; for each set and motif we optimized the sequence similarity score threshold independently for every location window (measured with respect to the TSS), taking into account the location-dependent nucleotide heterogeneity along the promoters of the target genes. We performed a high-throughput analysis, searching the promoters (from 200 bp downstream to 1000 bp upstream of the TSS) of more than 8000 human and 23,000 mouse genes, for 134 functional Gene Ontology classes and 412 known DNA motifs. When combined with binding site and location conservation between human and mouse, the method identifies with high probability functional binding sites that regulate groups of biologically related genes. We found many location-sensitive functional binding events and showed that they clustered close to the TSS. Our method and findings were put to several experimental tests. By allowing a "flexible" threshold and combining our functional-class and location-specific search method with conservation between human and mouse, we are able to reliably identify functional TF binding sites. This is an essential step towards constructing regulatory networks and elucidating the design principles that govern transcriptional regulation of expression. The promoter region proximal to the TSS appears to be of central importance for regulation of transcription in human and mouse, just as it is in bacteria and yeast. Comment: 31 pages, including Supplementary Information and figures

    A foundation for provitamin A biofortification of maize: genome-wide association and genomic prediction models of carotenoid levels.

    Efforts are underway to develop crops with improved levels of provitamin A carotenoids to help combat dietary vitamin A deficiency. As a global staple crop with considerable variation in kernel carotenoid composition, maize (Zea mays L.) could have a widespread impact. We performed a genome-wide association study (GWAS) of quantified seed carotenoids across a panel of maize inbreds ranging from light yellow to dark orange in grain color to identify some of the key genes controlling maize grain carotenoid composition. Significant associations at the genome-wide level were detected within the coding regions of zep1 and lut1, carotenoid biosynthetic genes not previously shown to impact grain carotenoid composition in association studies, as well as within the previously associated lcyE and crtRB1 genes. We leveraged existing biochemical and genomic information to identify 58 a priori candidate genes relevant to the biosynthesis and retention of carotenoids in maize to test in a pathway-level analysis. This revealed dxs2 and lut5, genes not previously associated with kernel carotenoids. In genomic prediction models, markers that targeted a small set of quantitative trait loci associated with carotenoid levels in prior linkage studies were as effective as genome-wide markers for predicting carotenoid traits. Based on GWAS, pathway-level analysis, and genomic prediction studies, we outline a flexible strategy involving a small number of genes that can be selected for rapid conversion of elite white grain germplasm, with minimal amounts of carotenoids, to orange grain versions containing high levels of provitamin A.