QueryOR: a comprehensive web platform for genetic variant analysis and prioritization
Background: Whole genome and exome sequencing are contributing to the extraordinary progress in the study of
human genetic variants. In this fast developing field, appropriate and easily accessible tools are required to facilitate
data analysis.
Results: Here we describe QueryOR, a web platform suitable for searching among known candidate genes as well
as for finding novel gene-disease associations. QueryOR combines several innovative features that make it comprehensive,
flexible and easy to use. Instead of being designed on specific datasets, it works on a general XML schema specifying
formats and criteria of each data source. Thanks to this flexibility, new criteria can be easily added for future
expansion. Currently, up to 70 user-selectable criteria are available, including a wide range of gene and variant features.
Moreover, rather than progressively discarding variants taking one criterion at a time, the prioritization is achieved by a
global positive selection process that considers all transcript isoforms, thus producing reliable results. QueryOR is easy
to use and its intuitive interface allows users to handle different kinds of inheritance as well as features related to sharing
variants in different patients. QueryOR is suitable for investigating single patients, families or cohorts.
Conclusions: QueryOR is a comprehensive and flexible web platform suited to easy, user-driven variant
prioritization. It is freely available for academic institutions at http://queryor.cribi.unipd.it/
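The contrast between progressive discarding and global positive selection described in the abstract above can be illustrated with a minimal sketch. This is not QueryOR's implementation: the criteria, the field names (`effect`, `conservation`, `coding`), the scoring rule (the highest count of satisfied criteria over all isoforms) and the choice to filter on the first isoform only are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    isoforms: list  # one dict of annotations per transcript isoform (hypothetical layout)


# Hypothetical criteria; each maps one isoform annotation to True or False.
CRITERIA = {
    "damaging": lambda iso: iso["effect"] in {"missense", "stop_gained"},
    "conserved": lambda iso: iso["conservation"] >= 0.8,
    "coding": lambda iso: iso["coding"],
}


def sequential_filter(variants):
    """Progressive discarding: drop a variant as soon as its first isoform
    fails a criterion (assumed baseline, for comparison only)."""
    kept = list(variants)
    for test in CRITERIA.values():
        kept = [v for v in kept if test(v.isoforms[0])]
    return kept


def positive_score(variant):
    """Positive selection (assumed rule): the highest number of satisfied
    criteria found on any isoform of the variant."""
    return max(
        sum(test(iso) for test in CRITERIA.values())
        for iso in variant.isoforms
    )


def prioritize(variants):
    """Rank all variants by score instead of discarding any of them."""
    return sorted(variants, key=positive_score, reverse=True)


v1 = Variant("v1", [
    {"effect": "synonymous", "conservation": 0.2, "coding": True},
    {"effect": "missense", "conservation": 0.9, "coding": True},
])
v2 = Variant("v2", [
    {"effect": "missense", "conservation": 0.5, "coding": True},
])

ranked = prioritize([v2, v1])
filtered = sequential_filter([v1, v2])
```

In this toy data, the sequential filter loses `v1` because its first isoform is synonymous, even though its second isoform satisfies every criterion. The ranking keeps both variants and places `v1` first.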
BlogForever D2.6: Data Extraction Methodology
This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report then proposes a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research for the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of the building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimation, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies were adopted broadly by the building industry, due to common challenges including (1) lack of large-scale labeled data to train and validate the model, (2) lack of model transferability, which limits a model trained with one data-rich building from being used in another building with limited data, (3) lack of strong justification of the costs and benefits of deploying machine learning, and (4) performance that might not be reliable and robust for the stated goals, as a method might work for some buildings but not generalize to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, and can inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science.
XML Schema Clustering with Semantic and Hierarchical Similarity Measures
With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic meaning and context of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.