221 research outputs found

    Big data: the potential role of research data management and research data registries

    Universities generate and hold increasingly vast quantities of research data, both in the form of large, well-structured datasets and, more often, in the form of a long tail of small, distributed datasets which collectively amount to ‘Big Data’ and offer significant potential for reuse. However, unlike big data, these collections of small data are often less well curated and are usually very difficult to find, thereby reducing their potential reuse value. The Digital Curation Centre (DCC) works to support UK universities to better manage and expose their research data so that its full value may be realised. With a focus on tapping into this long tail of small data, this presentation will cover two main DCC services: DMPonline, which helps researchers to identify potentially valuable research data and to plan for its longer-term retention and reuse; and the UK pilot research data registry and discovery service (RDRDS), which will help to ensure that research data produced in UK HEIs can be found, understood, and reused. Initially we will introduce participants to the role of data management planning in opening up dialogue between researchers and library services to ensure potentially valuable research data are managed appropriately and made available for reuse where feasible. DMPs provide institutions with valuable insights into the scale of their data holdings, highlight any ethical and legal requirements that need to be met, and enable planning for dissemination and reuse. We will also introduce the DCC’s DMPonline, a tool to help researchers write DMPs, which can be customised by institutions and integrated with other systems to simplify and enhance the management and reuse of data. In the second part of the presentation we will focus on making selected research data more visible for reuse and explore the potential value of local and national research data registries. In particular we will highlight the Jisc-funded RDRDS pilot to establish a UK national service that aggregates metadata relating to data collections held in research institutions and subject data centres. The session will conclude by considering some of the opportunities we may collaboratively explore in facilitating the management, aggregation and reuse of research data.

    Emerging good practice in managing research data and research information within UK Universities

    Sound data-intensive science depends upon effective research data and information management. Efficient and interoperable research information systems will be crucial for enabling and exploiting data-intensive research; however, it is equally important that a research ecosystem is cultivated within research-intensive institutions that fosters sustainable communication, cooperation and support among a diverse range of research-related staff. Researchers, librarians, administrators, ethics advisors, and IT professionals all have a vital contribution to make in ensuring that research data and related information are available, visible, understandable and usable over the mid to long term. This paper will provide a summary of several ongoing initiatives that the Jisc-funded Digital Curation Centre (DCC) is currently involved with in the UK and internationally to help staff within higher education institutions prepare to meet funding body mandates relating to research data management and sharing and to engage fully in the digital agenda.

    Optimal Completion of Incomplete Gene Trees in Polynomial Time Using OCTAL

    Here we introduce the Optimal Tree Completion Problem, a general optimization problem that involves completing an unrooted binary tree (i.e., adding missing leaves) so as to minimize its distance from a reference tree on a superset of the leaves. More formally, given a pair of unrooted binary trees (T, t) where T has leaf set S and t has leaf set R, a subset of S, we wish to add all the leaves from S \ R to t so as to produce a new tree t' on leaf set S that has the minimum distance to T. We show that when the distance is defined by the Robinson-Foulds (RF) distance, an optimal solution can be found in polynomial time. We also present OCTAL, an algorithm that solves this RF Optimal Tree Completion Problem exactly in quadratic time. We report on a simulation study where we complete estimated gene trees using a reference tree that is based on a species tree estimated from a multi-locus dataset. OCTAL produces completed gene trees that are closer to the true gene trees than an existing heuristic approach, but the accuracy of the completed gene trees computed by OCTAL depends on how topologically similar the estimated species tree is to the true gene tree. Hence, under conditions with relatively low gene tree heterogeneity, OCTAL can be used to provide highly accurate completions of estimated gene trees. We close with a discussion of future research.
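
The quantity minimized in the abstract above, the Robinson-Foulds distance, can be illustrated with a minimal sketch. This is not OCTAL and does not complete trees; it only computes the RF distance between two trees on the same leaf set. The nested-tuple tree encoding and the function names (`leaf_set`, `bipartitions`, `rf_distance`) are assumptions made for illustration, and the distance is taken here as the unnormalized count of bipartitions found in exactly one of the two trees.

```python
def leaf_set(tree):
    """Return the set of leaf labels of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial bipartitions (one per internal edge) of a tree."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)  # fixed leaf used to pick one canonical side per split
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - below
        # Keep only splits with at least two leaves on each side.
        if len(below) >= 2 and len(other) >= 2:
            # Store the side that does not contain the anchor leaf, so the
            # same edge gives the same key however the tree is written down.
            splits.add(other if anchor in below else below)
        return below

    visit(tree)
    return splits


def rf_distance(tree_a, tree_b):
    """Count the bipartitions present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("RF distance needs both trees on the same leaf set")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))
```

For example, under this encoding `rf_distance((("a", "b"), "c", ("d", "e")), (("a", "c"), "b", ("d", "e")))` compares two five-leaf trees that share only the split separating `d` and `e` from the rest.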

    A Model for Analysing and Grading the Quality of Scientific Authorities Presented to State Legislative Committees

    Longitudinal studies have confirmed that human brains continue to mature and restructure throughout adolescence, with the prefrontal cortex, responsible for executive functions, maturing into an individual’s twenties. Studies examining adolescent decision-making demonstrate that young people prioritise rewards when assessing risk, take more risks in ‘hot’ contexts and are more likely to take risks when in the presence of their peers. These findings have motivated arguments that the immaturity of an adolescent brain could impact on culpability for criminal offences; a point recognised by the US Supreme Court in 2005: From a moral standpoint it would be misguided to equate the failings of a minor with those of an adult, for a greater possibility exists that a minor's character deficiencies will be reformed. Indeed, “[t]he relevance of youth as a mitigating factor derives from the fact that the signature qualities of youth are transient; as individuals mature, the impetuousness and recklessness that may dominate in younger years can subside.” Since 2007, states have begun to ‘Raise the Age’ and move towards a national consensus of 18 for the upper age limit of juvenile court jurisdiction. Vermont has even gone beyond this, raising the age limit to 20. Little is known, however, about the extent to which, one, the evidential body of adolescent brain science is informing this legislative movement, or, two, robust science is presented to legislative decision-makers and by whom. We have developed a model for analysing and grading the quality of scientific arguments and authorities presented to legislative committees examining ‘Raise the Age’ legislation and have applied it to four states: Connecticut, Vermont, Michigan and Wisconsin. The former two were selected as states which had already raised, or were repeatedly attempting to raise, the age of juvenile jurisdiction above 18, and the latter two were states which, as of 2018, had not reached the national consensus of 18. Almost 700 pieces of evidence were examined, assessing criteria including whether studies were peer-reviewed, performed in humans, randomised controlled trials or opinion-based. Testimony was also categorised by author and a thematic analysis conducted. Our research has shown that campaign organisations, academia, religious groups, police chiefs and parents regularly provide testimony in this public process and that the themes of funding, recidivism and serious offences are repeatedly referenced. The model tells us that overall, although detailed scientific arguments about brain science and culpability are made to the legislature, poor quality evidence is provided to support these and, most often, there is a lack of scientific evidence entirely. This paper provides a summary of the results from Connecticut, Michigan, Vermont and Wisconsin. Part I discusses the methodology and development of the analysis model and Part II offers conclusions about the quality of science referenced, who participates, and the themes discussed in public committee testimony.

    TRACTION: Fast Non-Parametric Improvement of Estimated Gene Trees

    Gene tree correction aims to improve the accuracy of a gene tree by using computational techniques along with a reference tree (and in some cases available sequence data). It is an active area of research when dealing with gene tree heterogeneity due to duplication and loss (GDL). Here, we study the problem of gene tree correction where gene tree heterogeneity is instead due to incomplete lineage sorting (ILS, a common problem in eukaryotic phylogenetics) and horizontal gene transfer (HGT, a common problem in bacterial phylogenetics). We introduce TRACTION, a simple polynomial time method that provably finds an optimal solution to the RF-Optimal Tree Refinement and Completion Problem, which seeks a refinement and completion of an input tree t with respect to a given binary tree T so as to minimize the Robinson-Foulds (RF) distance. We present the results of an extensive simulation study evaluating TRACTION within gene tree correction pipelines on 68,000 estimated gene trees, using estimated species trees as reference trees. We explore accuracy under conditions with varying levels of gene tree heterogeneity due to ILS and HGT. We show that TRACTION matches or improves the accuracy of well-established methods from the GDL literature under conditions with HGT and ILS, and ties for best under the ILS-only conditions. Furthermore, TRACTION ties for fastest on these datasets. TRACTION is available at https://github.com/pranjalv123/TRACTION-RF and the study datasets are available at https://doi.org/10.13012/B2IDB-1747658_V1

    Using climate change models to inform the recovery of the western ground parrot Pezoporus flaviventris

    Translocation of species to areas of former habitat after threats have been mitigated is a common conservation action. However, the long-term success of reintroduction relies on identification of currently available habitat and areas that will remain, or become, habitat in the future. Commonly, a short-term view is taken, focusing on obvious and assumed threats such as predators and habitat degradation. However, in areas subject to significant climate change, challenges include correctly identifying variables that define habitat, and considering probable changes over time. This poses challenges with species such as the western ground parrot Pezoporus flaviventris, which was once relatively common in near-coastal south-western Australia, an area subject to major climate change. This species has declined to one small population, estimated to comprise < 150 individuals. Reasons for the decline include altered fire regimes, introduced predators and habitat clearing. The establishment of new populations is a high priority, but the extent to which a rapidly changing climate has affected, and will continue to affect, this species remains largely conjecture, and understanding probable climate change impacts is essential to the prioritization of potential reintroduction sites. We developed high-resolution species distribution models and used these to investigate climate change impacts on current and historical distributions, and identify locations that will remain, or become, bioclimatically suitable habitat in the future. This information has been given to an expert panel to identify and prioritize areas suitable for site-specific management and/or translocation.

    Advancing Divide-And-Conquer Phylogeny Estimation Using Robinson-Foulds Supertrees

    One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life is enormously computationally challenging, as all the current most accurate methods are either heuristics for NP-hard optimization problems or Bayesian MCMC methods that sample from tree space. One of the most promising approaches for improving scalability and accuracy for phylogeny estimation uses divide-and-conquer: a set of species is divided into overlapping subsets, trees are constructed on the subsets, and then merged together using a "supertree method". Here, we present Exact-RFS-2, the first polynomial-time algorithm to find an optimal supertree of two trees, using the Robinson-Foulds Supertree (RFS) criterion (a major approach in supertree estimation that is related to maximum likelihood supertrees), and we prove that finding the RFS of three input trees is NP-hard. We also present GreedyRFS (a greedy heuristic that operates by repeatedly using Exact-RFS-2 on pairs of trees, until all the trees are merged into a single supertree). We evaluate Exact-RFS-2 and GreedyRFS, and show that they have better accuracy than the current leading heuristic for RFS

    Psychotic symptoms in adolescence index risk for suicidal behavior: findings from 2 population-based case-control clinical interview studies.

    CONTEXT: Recent evidence from both clinical and population research has pointed to psychotic symptoms as potentially important markers of risk for suicidal behavior. However, to our knowledge, there have been no epidemiological studies to date that have reported data on psychotic symptoms and suicidality in individuals who have been clinically assessed for suicidal behavior. OBJECTIVES: To explore associations between psychotic symptoms in nonpsychotic adolescents and risk for suicidal behavior in (1) the general population, (2) adolescents with psychiatric disorder, and (3) adolescents with suicidal ideation. DESIGN: Two independently conducted case-control clinical interview studies. SETTING: Population-based studies in Ireland. PARTICIPANTS: Study 1 included 212 adolescents aged 11 to 13 years. Study 2 included 211 adolescents aged 13 to 15 years. Participants were recruited from schools. MAIN OUTCOME MEASURES: Suicidal behavior and psychotic symptoms, assessed by semi-structured diagnostic clinical interview. RESULTS: Psychotic symptoms were associated with a 10-fold increased odds of any suicidal behavior (ideation, plans, or acts) in both the early and middle adolescence studies (odds ratio [OR], 10.23; 95% CI, 3.25-32.26; P < .001 and OR, 10.5; 95% CI, 3.14-35.17; P < .001, respectively). Adolescents with depressive disorders who also experienced psychotic symptoms were at a nearly 14-fold increased odds of more severe suicidal behavior (suicide plans and suicide acts) compared with adolescents with depressive disorders who did not experience psychotic symptoms (OR, 13.7; 95% CI, 2.1-89.6). Among all adolescents with suicidal ideation, those who also reported psychotic symptoms had a nearly 20-fold increased odds of suicide plans and suicide acts compared with adolescents with suicidal ideation who did not report psychotic symptoms (OR, 19.6; 95% CI, 1.8-216.1). CONCLUSIONS: Psychotic symptoms are strongly associated with increased risk for suicidal behavior in the general adolescent population and in adolescents with (nonpsychotic) psychiatric disorder. In both studies, an absolute majority of adolescents with more severe suicidal behavior (suicidal plans and acts) reported psychotic symptoms when directly questioned about this as part of a psychiatric interview. Assessment of psychotic symptoms should form a key part of suicide risk assessment.

    Identification and characterization of prodromal risk syndromes in young adolescents in the community: a population-based clinical interview study.

    While a great deal of research has been conducted on prodromal risk syndromes in relation to help-seeking individuals who present to the clinic, there is a lack of research on prodromal risk syndromes in the general population. The current study aimed first to establish whether prodromal risk syndromes could be detected in non-help-seeking community-based adolescents and secondly to characterize this group in terms of Axis-1 psychopathology and general functioning. We conducted in-depth clinical interviews with a population sample of 212 school-going adolescents in order to assess for prodromal risk syndromes, Axis-1 psychopathology, and global (social/occupational) functioning. Between 0.9% and 8% of the community sample met criteria for a risk syndrome, depending on varying disability criteria. The risk syndrome group had a higher prevalence of co-occurring nonpsychotic Axis-1 psychiatric disorders (OR = 4.77, 95% CI = 1.81-12.52; P < .01) and poorer global functioning (F = 24.5, df = 1, P < .0001) compared with controls. Individuals in the community who fulfill criteria for prodromal risk syndromes demonstrate strong similarities with clinically presenting risk syndrome patients not just in terms of psychotic symptom criteria but also in terms of co-occurring psychopathology and global functioning.

    Making Sense: Talking Data Management with Researchers

    Incremental is one of eight projects in the JISC Managing Research Data programme funded to identify institutional requirements for digital research data management and pilot relevant infrastructure. Our findings concur with those of other Managing Research Data projects, as well as with several previous studies. We found that many researchers: (i) organise their data in an ad hoc fashion, posing difficulties with retrieval and re-use; (ii) store their data on all kinds of media without always considering security and back-up; (iii) are positive about data sharing in principle though reluctant in practice; (iv) believe back-up is equivalent to preservation. The key difference between our approach and that of other Managing Research Data projects is the type of infrastructure we are piloting. While the majority of these projects focus on developing technical solutions, we are focusing on the need for ‘soft’ infrastructure, such as one-to-one tailored support, training, and easy-to-find, concise guidance that breaks down some of the barriers information professionals have unintentionally built with their use of specialist terminology. We are employing a bottom-up approach as we feel that to support the step-by-step development of sound research data management practices, you must first understand researchers’ needs and perspectives. Over the life of the project, Incremental staff will act as mediators, assisting researchers and local support staff to understand the data management requirements within which they are expected to work, and will determine how these can be addressed within research workflows and the existing technical infrastructure. Our primary goal is to build data management capacity within the Universities of Cambridge and Glasgow by raising awareness of basic principles so everyone can manage their data to a certain extent. We will ensure our lessons can be picked up and used by other institutions. Our affiliation with the Digital Curation Centre and Digital Preservation Coalition will assist in this and all outputs will be released under a Creative Commons licence.