
    A unified framework for managing provenance information in translational research

    Background: A critical aspect of the NIH Translational Research roadmap, which seeks to accelerate the delivery of "bench-side" discoveries to the patient's "bedside," is the management of the provenance metadata that keeps track of the origin and history of data resources as they traverse the path from the bench to the bedside and back. A comprehensive provenance framework is essential for researchers to verify the quality of data, reproduce scientific results published in peer-reviewed literature, validate the scientific process, and associate trust values with data and results. Traditional approaches to provenance management have focused on only partial sections of the translational research life cycle, and they do not incorporate "domain semantics", which is essential to support domain-specific querying and analysis by scientists.

    Results: We identify a common set of challenges in managing provenance information across the pre-publication and post-publication phases of data in the translational research lifecycle. We define the semantic provenance framework (SPF), underpinned by the Provenir upper-level provenance ontology, to address these challenges in the four stages of provenance metadata:
    (a) Provenance collection - during data generation
    (b) Provenance representation - to support interoperability and reasoning, and to incorporate domain semantics
    (c) Provenance storage and propagation - to allow efficient storage and seamless propagation of provenance as data are transferred across applications
    (d) Provenance query - to support queries of increasing complexity over large data sizes, and to support knowledge discovery applications
    We apply the SPF to two exemplar translational research projects, the Semantic Problem Solving Environment for Trypanosoma cruzi (T. cruzi SPSE) and the Biomedical Knowledge Repository (BKR) project, to demonstrate its effectiveness.

    Conclusions: The SPF provides a unified framework to effectively manage the provenance of translational research data during the pre- and post-publication phases. The framework is underpinned by an upper-level provenance ontology, Provenir, which is extended to create domain-specific provenance ontologies that facilitate provenance interoperability, seamless propagation of provenance, automated querying, and analysis.
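
    The framework hinges on representing provenance in an ontology so that domain-specific extensions remain queryable through shared upper-level terms. As a rough illustration of that idea only, the sketch below uses rdflib to model a hypothetical domain class specializing an upper-level provenance class and answers a simple derivation query with SPARQL; the namespaces, class names, and properties are invented stand-ins, not the actual Provenir or T. cruzi SPSE vocabulary.

```python
# A minimal sketch (not the paper's implementation) of ontology-based provenance:
# a hypothetical domain class specializes an upper-level provenance class, a few
# provenance triples are collected, and a SPARQL query retrieves the derivation
# history. All URIs, class names, properties, and data are illustrative assumptions.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

UPPER = Namespace("http://example.org/upper-provenance#")   # stand-in upper-level ontology
DOMAIN = Namespace("http://example.org/parasite-domain#")   # hypothetical domain extension
EX = Namespace("http://example.org/data/")

g = Graph()
g.bind("up", UPPER)
g.bind("dom", DOMAIN)

# Domain-specific process class extends the upper-level Process class.
g.add((DOMAIN.GeneKnockoutProcess, RDFS.subClassOf, UPPER.Process))

# Provenance collected during data generation: a sample derived from a knockout run.
g.add((EX.run42, RDF.type, DOMAIN.GeneKnockoutProcess))
g.add((EX.run42, UPPER.performed_by, Literal("lab-technician-3")))
g.add((EX.sample7, UPPER.derived_from, EX.run42))

# Provenance query: which process produced each sample, and who performed it?
q = """
SELECT ?sample ?process ?agent WHERE {
  ?sample  <http://example.org/upper-provenance#derived_from> ?process .
  ?process <http://example.org/upper-provenance#performed_by> ?agent .
}
"""
for sample, process, agent in g.query(q):
    print(sample, process, agent)
```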

    “What if There's Something Wrong with Her?”‐How Biomedical Technologies Contribute to Epistemic Injustice in Healthcare

    While there is a steadily growing literature on epistemic injustice in healthcare, there are few discussions of the role that biomedical technologies play in harming patients in their capacity as knowers. Through an analysis of newborn and pediatric genetic and genomic sequencing technologies (GSTs), I argue that biomedical technologies can lead to epistemic injustice through two primary pathways: epistemic capture and value partitioning. I close by discussing the larger ethical and political context of critical analyses of GSTs and their broader implications for just and equitable healthcare delivery

    Challenges in the Analysis of Mass-Throughput Data: A Technical Commentary from the Statistical Machine Learning Perspective

    Sound data analysis is critical to the success of modern molecular medicine research that involves the collection and interpretation of mass-throughput data. The novel nature and high dimensionality of such datasets pose a series of nontrivial data analysis problems. This technical commentary discusses the problems of over-fitting, error estimation, the curse of dimensionality, causal versus predictive modeling, integration of heterogeneous types of data, and the lack of standard protocols for data analysis. We attempt to shed light on the nature and causes of these problems and to outline viable methodological approaches to overcoming them.
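
    To make the over-fitting and error-estimation points concrete, the sketch below fits a classifier to purely random high-dimensional data: resubstitution accuracy looks nearly perfect while cross-validated accuracy stays near chance. The dataset shape, classifier, and settings are arbitrary choices for illustration, not taken from the commentary.

```python
# A toy demonstration of over-fitting and optimistic error estimation in
# high-dimensional data: with far more features than samples and labels that
# are pure noise, training (resubstitution) accuracy is near-perfect while
# cross-validated accuracy hovers around chance. All sizes are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples, n_features = 40, 5000            # "mass-throughput" shape: p >> n
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, 2, size=n_samples)      # labels carry no real signal

clf = LogisticRegression(C=1e6, max_iter=5000)   # effectively unregularized
clf.fit(X, y)
print("resubstitution accuracy:", clf.score(X, y))        # close to 1.0

cv_acc = cross_val_score(LogisticRegression(C=1e6, max_iter=5000), X, y, cv=5)
print("5-fold cross-validated accuracy:", round(cv_acc.mean(), 2))  # about 0.5
```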

    Evaluation of the current knowledge limitations in breast cancer research: a gap analysis

    BACKGROUND A gap analysis was conducted to determine which areas of breast cancer research, if targeted by researchers and funding bodies, could produce the greatest impact on patients. METHODS Fifty-six Breast Cancer Campaign grant holders and prominent UK breast cancer researchers participated in a gap analysis of current breast cancer research. Before, during and following the meeting, groups in seven key research areas participated in cycles of presentation, literature review and discussion. Summary papers were prepared by each group and collated into this position paper highlighting the research gaps, with recommendations for action. RESULTS Gaps were identified in all seven themes. General barriers to progress were lack of financial and practical resources, and poor collaboration between disciplines. Critical gaps in each theme included: (1) genetics (knowledge of genetic changes, their effects and interactions); (2) initiation of breast cancer (how developmental signalling pathways cause ductal elongation and branching at the cellular level and influence stem cell dynamics, and how their disruption initiates tumour formation); (3) progression of breast cancer (deciphering the intracellular and extracellular regulators of early progression, tumour growth, angiogenesis and metastasis); (4) therapies and targets (understanding who develops advanced disease); (5) disease markers (incorporating intelligent trial design into all studies to ensure new treatments are tested in patient groups stratified using biomarkers); (6) prevention (strategies to prevent oestrogen-receptor negative tumours and the long-term effects of chemoprevention for oestrogen-receptor positive tumours); (7) psychosocial aspects of cancer (the use of appropriate psychosocial interventions, and the personal impact of all stages of the disease among patients from a range of ethnic and demographic backgrounds). CONCLUSION Through recommendations to address these gaps with future research, the long-term benefits to patients will include: better estimation of risk in families with breast cancer and strategies to reduce risk; better prediction of drug response and patient prognosis; improved tailoring of treatments to patient subgroups and development of new therapeutic approaches; earlier initiation of treatment; more effective use of resources for screening populations; and an enhanced experience for people with or at risk of breast cancer and their families. The challenge to funding bodies and researchers in all disciplines is to focus on these gaps and to drive advances in knowledge into improvements in patient care

    Data Mining

    Data mining is a branch of computer science that is used to automatically extract meaningful, useful knowledge and previously unknown, hidden, interesting patterns from a large amount of data to support the decision-making process. This book presents recent theoretical and practical advances in the field of data mining. It discusses a number of data mining methods, including classification, clustering, and association rule mining. This book brings together many different successful data mining studies in various areas such as health, banking, education, software engineering, animal science, and the environment
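
    As a small, generic illustration of two of the techniques such a book typically covers, the sketch below runs a classifier and a clustering algorithm on a standard dataset; the dataset and algorithm choices are arbitrary and are not drawn from the chapters themselves.

```python
# A generic illustration of two data mining tasks: supervised classification
# (learning labels, then evaluating on held-out data) and unsupervised
# clustering (grouping the same observations without labels). The dataset and
# algorithms are arbitrary choices, not examples taken from the book.
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Classification: fit on a training split, score on a held-out test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", round(tree.score(X_te, y_te), 3))

# Clustering: group the observations into three clusters, ignoring the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster sizes:", [int((kmeans.labels_ == k).sum()) for k in range(3)])
```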

    The FAIR Guiding Principles for scientific data management and stewardship

    There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders—representing academia, industry, funding agencies, and scholarly publishers—have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.
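
    One way to picture the machine-actionability the principles emphasize is a metadata record that software can parse without human help. The sketch below builds a schema.org-style dataset description in Python; the identifiers, URLs, and field choices are placeholders and represent one possible encoding rather than anything mandated by the FAIR Principles.

```python
# A sketch of a machine-actionable metadata record in the spirit of FAIR:
# a persistent identifier (Findable), a resolvable access URL and licence
# (Accessible, Reusable), and shared vocabulary terms (Interoperable).
# The DOI, URLs, and field choices are placeholders, and this schema.org-style
# JSON-LD encoding is one option among many, not a FAIR requirement.
import json

record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "@id": "https://doi.org/10.1234/example-dataset",        # placeholder identifier
    "name": "Example assay measurements",
    "description": "Illustrative dataset record used to sketch FAIR-style metadata.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "keywords": ["example", "metadata", "reuse"],
    "distribution": {
        "@type": "DataDownload",
        "contentUrl": "https://example.org/data/assay.csv",   # placeholder URL
        "encodingFormat": "text/csv",
    },
}

# Serializing to JSON-LD lets both people and software agents read the record.
print(json.dumps(record, indent=2))
```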

    Competent Program Evolution, Doctoral Dissertation, December 2006

    Heuristic optimization methods are adaptive when they sample problem solutions based on knowledge of the search space gathered from past sampling. Recently, competent evolutionary optimization methods have been developed that adapt via probabilistic modeling of the search space. However, their effectiveness requires the existence of a compact problem decomposition in terms of prespecified solution parameters. How can we use these techniques to effectively and reliably solve program learning problems, given that program spaces will rarely have compact decompositions? One method is to manually build a problem-specific representation that is more tractable than the general space. But can this process be automated? My thesis is that the properties of programs and program spaces can be leveraged as inductive bias to reduce the burden of manual representation-building, leading to competent program evolution. The central contributions of this dissertation are a synthesis of the requirements for competent program evolution, and the design of a procedure, meta-optimizing semantic evolutionary search (MOSES), that meets these requirements. In support of my thesis, experimental results are provided to analyze and verify the effectiveness of MOSES, demonstrating scalability and real-world applicability
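
    The core idea of adapting "via probabilistic modeling of the search space" can be illustrated with a much simpler relative of such methods: a univariate estimation-of-distribution algorithm on the OneMax bitstring problem, sketched below. This toy is a stand-in for the general idea only; it is not MOSES, which evolves programs over automatically built representations.

```python
# A toy estimation-of-distribution algorithm (univariate marginals) on the
# OneMax problem: sample candidate bitstrings from a probabilistic model,
# select the best, and re-estimate the model from them. This illustrates
# "adaptation via probabilistic modeling of the search space" only; it is not
# MOSES, and all sizes and rates below are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
n_bits, pop_size, n_select, n_gens = 50, 100, 30, 40
p = np.full(n_bits, 0.5)        # model: one independent Bernoulli parameter per bit
best = 0

for gen in range(n_gens):
    pop = (rng.random((pop_size, n_bits)) < p).astype(int)   # sample from the model
    fitness = pop.sum(axis=1)                                # OneMax: number of ones
    best = max(best, int(fitness.max()))
    elite = pop[np.argsort(fitness)[-n_select:]]             # keep the top solutions
    p = 0.9 * elite.mean(axis=0) + 0.1 * 0.5                 # refit model, with smoothing

print(f"best fitness found: {best} / {n_bits}")
```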

    Graphical Model approaches for Biclustering

    In many scientific areas, it is crucial to group (cluster) a set of objects based on a set of observed features. This operation, widely known as clustering, has been exploited in very different scenarios, ranging from economics to psychology to biology. Going a step further, there are contexts where it is crucial to group objects and simultaneously identify the features that distinguish those objects from the others. In gene expression analysis, for instance, the identification of subsets of genes showing a coherent pattern of expression in subsets of objects/samples can provide crucial information about active biological processes. Such information, which cannot be retrieved by classical clustering approaches, can be extracted with so-called biclustering, a class of approaches that aim to simultaneously cluster both rows and columns of a given data matrix (where each row corresponds to a different object/sample and each column to a different feature). The problem of biclustering, also known as co-clustering, has recently been exploited in a wide range of scenarios such as bioinformatics, market segmentation, data mining, text analysis, and recommender systems. Many approaches have been proposed to address the biclustering problem, each characterized by different properties such as interpretability, effectiveness, or computational complexity. A recent trend involves the exploitation of sophisticated computational models (Graphical Models) to face the intrinsic complexity of biclustering and to retrieve very accurate solutions. Graphical Models represent the decomposition of a global objective function into a set of smaller, local functions, each defined over a subset of the variables. The advantage of using Graphical Models lies in the fact that the graphical representation can highlight useful hidden properties of the objective function under consideration, and the smaller local problems can be analysed with less computational effort. Because it is difficult to obtain a model that is both representative and solvable, and because biclustering is a complex and challenging problem, there are few promising approaches in the literature that use Graphical Models for biclustering.

    This thesis is set in the scenario described above and investigates the exploitation of Graphical Models to face the biclustering problem. We explored different types of Graphical Models, in particular Factor Graphs and Bayesian Networks. We present three novel algorithms (with extensions) and evaluate these techniques using available benchmark datasets. All the models have been compared with state-of-the-art competitors, and the results show that Factor Graph approaches lead to solid and efficient solutions for datasets of moderate size, whereas Bayesian Networks can manage huge datasets, with the drawback that setting their parameters can be nontrivial. As another contribution of the thesis, we widen the range of biclustering applications by studying the suitability of these approaches in some Computer Vision problems where biclustering had never been adopted before. Summarizing, with this thesis we provide evidence that Graphical Model techniques can have a significant impact in the biclustering scenario. Moreover, we demonstrate that biclustering techniques are flexible and can produce effective solutions in very different fields of application.
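
    As a minimal illustration of what biclustering computes, the sketch below plants block structure in a synthetic matrix and recovers row and column groups simultaneously. Spectral co-clustering is used purely because it is readily available in scikit-learn; it is a classical algorithm, not one of the Factor Graph or Bayesian Network approaches developed in the thesis, and all sizes and parameters are arbitrary.

```python
# A minimal biclustering run: plant four blocks in a synthetic data matrix,
# then recover row and column groups simultaneously. Spectral co-clustering is
# used only because it ships with scikit-learn; it is a classical algorithm,
# not one of the Factor Graph / Bayesian Network approaches from the thesis.
from sklearn.cluster import SpectralCoclustering
from sklearn.datasets import make_biclusters
from sklearn.metrics import consensus_score

# 300 objects x 40 features containing 4 planted (and shuffled) biclusters.
data, rows, cols = make_biclusters(shape=(300, 40), n_clusters=4,
                                   noise=5, random_state=0)

model = SpectralCoclustering(n_clusters=4, random_state=0).fit(data)

# Agreement between recovered and planted biclusters (1.0 means perfect recovery).
score = consensus_score(model.biclusters_, (rows, cols))
print("consensus score with the planted biclusters:", round(score, 3))
```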

    A verified genomic reference sample for assessing performance of cancer panels detecting small variants of low allele frequency

    Background: Oncopanel genomic testing, which identifies important somatic variants, is increasingly common in medical practice and especially in clinical trials. Currently, there is a paucity of reliable genomic reference samples having a suitably large number of pre-identified variants for properly assessing oncopanel assay analytical quality and performance. The FDA-led Sequencing and Quality Control Phase 2 (SEQC2) consortium analyzes ten diverse cancer cell lines individually and as a pool, termed Sample A, to develop a reference sample with a suitably large number of coding positions with known variant positives and negatives for properly evaluating oncopanel analytical performance.

    Results: In reference Sample A, we identify more than 40,000 variants down to 1% allele frequency, with more than 25,000 variants having less than 20% allele frequency and 1,653 variants in COSMIC-related genes. This is 5-100x more than existing commercially available samples. We also identify an unprecedented number of negative positions in coding regions, allowing statistical rigor in assessing limit of detection, sensitivity, and precision. Over 300 loci are randomly selected and independently verified via droplet digital PCR with 100% concordance. The Agilent normal reference Sample B can be admixed with Sample A to create new samples with a similar number of known variants at much lower allele frequencies than exist in Sample A natively, including known variants with allele frequencies down to 0.02%, a range suitable for assessing liquid biopsy panels.

    Conclusion: These new reference samples and their admixtures provide superior capability for performing oncopanel quality control, assessing analytical accuracy, and validating small to large oncopanels and liquid biopsy assays.
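
    The admixture idea rests on simple dilution arithmetic: a variant present only in Sample A at frequency VAF_A is expected at roughly f * VAF_A in a mixture containing a fraction f of Sample A. The sketch below works through a few illustrative values; the fractions and frequencies are assumptions for illustration, not the consortium's actual mixing design, and copy-number and input-mass differences are ignored.

```python
# Back-of-the-envelope dilution: for a variant present in Sample A at
# frequency vaf_a and absent from the normal Sample B, mixing a fraction f of
# A with (1 - f) of B gives an expected frequency of roughly f * vaf_a
# (assuming equal DNA input and ignoring copy-number effects). The values
# below are illustrative, not the consortium's actual mixing design.
def admixed_vaf(vaf_a: float, fraction_a: float) -> float:
    """Expected allele frequency of an A-only variant in an A/B mixture."""
    return vaf_a * fraction_a

for vaf_a in (0.20, 0.05, 0.01):           # native allele frequencies in Sample A
    for fraction_a in (0.10, 0.01):        # share of Sample A in the admixture
        print(f"native VAF {vaf_a:.0%}, {fraction_a:.0%} Sample A "
              f"-> expected VAF {admixed_vaf(vaf_a, fraction_a):.3%}")
```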