11 research outputs found
A step towards a reinforcement learning de novo genome assembler
Reinforcement learning has proven very promising for solving complex tasks
without human supervision during the learning process.
However, its successful applications have predominantly focused on fictional
and entertainment problems, such as games. This work therefore aims
to shed light on the application of reinforcement learning to a
relevant real-world problem: genome assembly. By expanding the only
approach found in the literature that addresses this problem, we carefully
explored the aspects of intelligent agent learning, performed by the Q-learning
algorithm, to understand its suitability to be applied in scenarios whose
characteristics are more similar to those faced by real genome projects. The
improvements proposed here include changing the previously proposed reward
system and including state space exploration optimization strategies based on
dynamic pruning and mutual collaboration with evolutionary computing. These
improvements were tested on 23 new environments with larger inputs than those
used previously. All these environments are freely available on the internet
so that the scientific community can build on this research. The results
suggest consistent performance gains from the proposed improvements;
however, they also demonstrate their limitations, especially those related to
the high dimensionality of the state and action spaces. Finally, we present
paths that can be followed to tackle genome assembly efficiently in real
scenarios, considering recent successful reinforcement learning applications
from other domains that deal with high-dimensional inputs, including deep
reinforcement learning.
Atrial fibrillation genetic risk differentiates cardioembolic stroke from other stroke subtypes
Objective: We sought to assess whether genetic risk factors for atrial fibrillation (AF) can explain cardioembolic stroke risk. Methods: We evaluated genetic correlations between a prior genetic study of AF and AF in the presence of cardioembolic stroke using genome-wide genotypes from the Stroke Genetics Network (N = 3,190 AF cases, 3,000 cardioembolic stroke cases, and 28,026 referents). We tested whether a previously validated AF polygenic risk score (PRS) was associated with cardioembolic and other stroke subtypes after accounting for AF clinical risk factors. Results: We observed strong correlation between previously reported genetic risk for AF, AF in the presence of stroke, and cardioembolic stroke (Pearson’s r = 0.77 and 0.76, respectively, across SNPs with p < 4.4 × 10^-4 in the prior AF meta-analysis). An AF PRS, adjusted for clinical AF risk factors, was associated with cardioembolic stroke (odds ratio (OR) per standard deviation (sd) = 1.40, p = 1.45 × 10^-48), explaining ∼20% of the heritable component of cardioembolic stroke risk. The AF PRS was also associated with stroke of undetermined cause (OR per sd = 1.07, p = 0.004), but no other primary stroke subtypes (all p > 0.1). Conclusions: Genetic risk for AF is associated with cardioembolic stroke, independent of clinical risk factors. Studies are warranted to determine whether AF genetic risk can serve as a biomarker for strokes caused by AF.
Kleber Padovani's Quick Files
The Quick Files feature was discontinued and its files were migrated into this Project on March 11, 2022. The file URLs will still resolve properly, and the Quick Files logs are available in the Project’s Recent Activity.
A dataset of multi-functional ecological traits of Brazilian bees (data validation)
This project contains the source code used to validate the data provided by https://doi.org/10.6084/m9.figshare.7100525.v4, as well as instructions to reproduce all executed validations.
geneRFinder: gene finding in distinct metagenomic data complexities
Script files, model, datasets and supplementary data
A metagenomic survey of soil microbial communities: script files, read quality reports and supplementary sample metadata
This project contains supplementary files used to produce the data presented in the paper entitled A metagenomic survey of soil microbial communities along a rehabilitation chronosequence after iron ore mining
Machine learning meets genome assembly
Motivation: With the recent advances in DNA sequencing technologies, the study of the genetic composition of living organisms has become more accessible for researchers. Several advances have been achieved because of it, especially in the health sciences. However, many challenges which emerge from the complexity of sequencing projects remain unsolved. Among them is the task of assembling DNA fragments from previously unsequenced organisms, which is classified as an NP-hard (nondeterministic polynomial time hard) problem, for which no efficient computational solution with reasonable execution time exists. However, several tools that produce approximate solutions have been used with results that have facilitated scientific discoveries, although there is ample room for improvement. As with other NP-hard problems, machine learning algorithms have been one of the approaches used in recent years in an attempt to find better solutions to the DNA fragment assembly problem, although still at a low scale. Results: This paper presents a broad review of pioneering literature comprising artificial intelligence-based DNA assemblers, particularly the ones that use machine learning, to provide an overview of state-of-the-art approaches and to serve as a starting point for further study in this field.
Tracking the bamboo borer using optical flow
This article proposes using the optical flow technique, in particular the algorithm of [Lucas and Kanade 1981], to track the bamboo borer (Dinoderus minutus) in an experimental environment. Euclidean distance was used to analyze which window size is most suitable for detecting the insect. It is concluded that a filter window of size 2 gives the best result.
Atrial fibrillation genetic risk differentiates cardioembolic stroke from other stroke subtypes
Objective: We sought to assess whether genetic risk factors for atrial fibrillation (AF) can explain cardioembolic stroke risk. Methods: We evaluated genetic correlations between a previous genetic study of AF and AF in the presence of cardioembolic stroke using genome-wide genotypes from the Stroke Genetics Network (N = 3,190 AF cases, 3,000 cardioembolic stroke cases, and 28,026 referents). We tested whether a previously validated AF polygenic risk score (PRS) was associated with cardioembolic and other stroke subtypes after accounting for AF clinical risk factors. Results: We observed a strong correlation between previously reported genetic risk for AF, AF in the presence of stroke, and cardioembolic stroke (Pearson r = 0.77 and 0.76, respectively, across SNPs with p < 4.4 × 10^-4 in the previous AF meta-analysis). An AF PRS, adjusted for clinical AF risk factors, was associated with cardioembolic stroke (odds ratio (OR) per standard deviation (sd) = 1.40, p = 1.45 × 10^-48), explaining ∼20% of the heritable component of cardioembolic stroke risk. The AF PRS was also associated with stroke of undetermined cause (OR per sd = 1.07, p = 0.004), but no other primary stroke subtypes (all p > 0.1). Conclusions: Genetic risk of AF is associated with cardioembolic stroke, independent of clinical risk factors. Studies are warranted to determine whether AF genetic risk can serve as a biomarker for strokes caused by AF.