648 research outputs found

    Discovering Restricted Regular Expressions with Interleaving

    Discovering a concise schema from given XML documents is an important problem in XML applications. In this paper, we focus on the problem of learning an unordered schema from a given set of XML examples, which is in fact the problem of learning a restricted regular expression with interleaving from positive example strings. Schemas with interleaving can reveal meaningful knowledge that previous inference techniques cannot disclose. Moreover, inferring the minimal schema with interleaving is challenging: the problem of finding a minimal schema with interleaving is shown to be NP-hard. Therefore, we develop an approximation algorithm and a heuristic solution that tackle the problem using techniques different from known inference algorithms. We conduct experiments on real-world data sets to demonstrate the effectiveness of our approaches. Our heuristic algorithm is shown to produce results that are very close to optimal. Comment: 12 pages
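As a rough illustration of what "interleaving" means for unordered XML content, the sketch below checks whether a sequence of child element names is an interleaving (shuffle) of two ordered sequences. It is not the inference algorithm from the paper; the function name `is_interleaving` and the element names are hypothetical and chosen only for this example.

```python
from functools import lru_cache


def is_interleaving(word, left, right):
    """Return True if `word` merges `left` and `right` while keeping
    the internal order of each of the two sequences."""
    if len(word) != len(left) + len(right):
        return False

    @lru_cache(maxsize=None)
    def match(i, j):
        # i symbols of `left` and j symbols of `right` are already consumed.
        if i == len(left) and j == len(right):
            return True
        k = i + j  # position of the next symbol in `word`
        if i < len(left) and left[i] == word[k] and match(i + 1, j):
            return True
        if j < len(right) and right[j] == word[k] and match(i, j + 1):
            return True
        return False

    return match(0, 0)


# Hypothetical child-element sequences of an XML record.
name_part = ("first", "last")   # must appear in this order
contact_part = ("email",)       # may appear anywhere relative to the name

ok = is_interleaving(("first", "email", "last"), name_part, contact_part)
bad = is_interleaving(("last", "first", "email"), name_part, contact_part)
```

An ordinary regular expression would have to enumerate every allowed ordering explicitly, which is why an interleaving operator can keep an unordered schema concise.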

    Absolute abundance estimates from shallow water baited underwater camera surveys; a stochastic modelling approach tested against field data

    This research was supported by a University of Glasgow Faculty Scholarship to KMD, a Collaborative Gearing Scheme grant from the Natural Environment Research Council and the British Antarctic Survey (CGS-77) and an ASSEMBLE infrastructure access grant to DMB. Baited underwater cameras (BUCs) are becoming a popular tool to monitor fish and invertebrate populations within protected and inshore environments where trawl surveys are unsuitable. Modelling the arrival times of deep-sea grenadiers using an inverse square relationship has enabled abundance estimates, comparable to those from bottom trawl surveys, to be gathered from deep-sea baited camera surveys. Baited underwater camera systems in shallow water environments are, however, currently limited to relative comparisons of assemblages based on simple metrics such as MaxN (the maximum number of fish seen at any one time). This study describes a stochastic simulation approach used to model the behaviour of fish and invertebrates around a BUC system so that absolute abundance estimates can be generated from arrival patterns. Species-specific models were developed for two tropical reef fishes, the black tip grouper (Epinephelus fasciatus) and moray eel (Gymnothorax spp.), and for two Antarctic scavengers, the asteroid (Odontaster validus) and the nemertean worm (Parbolasia corrugatus). A sensitivity analysis explored the impact of input parameters on the arrival patterns (MaxN, time to the arrival of the first individual and time to reach MaxN) generated by the model for each species. The sensitivity analysis showed a particularly strong link between MaxN and abundance, indicating that this model could be used to generate absolute abundances from existing or future MaxN data. In effect, it allows the slope of the MaxN vs. abundance relationship to be estimated. Arrival patterns generated by each model were used to estimate population abundance for the focal species, and these estimates were compared to data from underwater visual census transects. Using a Bland-Altman analysis, baited underwater camera data processed using this model were shown to generate absolute abundance estimates that were comparable to underwater visual census data. Postprint. Peer reviewed.
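The two summary quantities mentioned in this abstract can be sketched in a few lines. The example below computes MaxN from per-frame counts and the standard Bland-Altman bias with 95% limits of agreement (bias ± 1.96 standard deviations of the paired differences). It does not reproduce the study's stochastic model; the function names and all numbers are hypothetical, made up purely for illustration.

```python
import statistics


def max_n(counts_per_frame):
    """MaxN: the largest number of individuals visible in any single frame."""
    return max(counts_per_frame, default=0)


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The bias is the mean of the paired differences; the limits of
    agreement are bias +/- 1.96 sample standard deviations.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical counts of one species in successive video frames.
frames = [0, 1, 3, 2, 3, 1]
peak = max_n(frames)

# Hypothetical abundance estimates at three sites from two methods
# (camera-based model vs. visual census); not data from the study.
camera = [10.0, 12.0, 14.0]
census = [9.0, 13.0, 13.0]
bias, lower, upper = bland_altman(camera, census)
```

Two methods are usually judged comparable when the bias is close to zero and the limits of agreement are narrow enough for the intended use.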

    Motif Discovery through Predictive Modeling of Gene Regulation

    We present MEDUSA, an integrative method for learning motif models of transcription factor binding sites by incorporating promoter sequence and gene expression data. We use a modern large-margin machine learning approach, based on boosting, to enable feature selection from the high-dimensional search space of candidate binding sequences while avoiding overfitting. At each iteration of the algorithm, MEDUSA builds a motif model whose presence in the promoter region of a gene, coupled with activity of a regulator in an experiment, is predictive of differential expression. In this way, we learn motifs that are functional and predictive of regulatory response rather than motifs that are simply overrepresented in promoter sequences. Moreover, MEDUSA produces a model of the transcriptional control logic that can predict the expression of any gene in the organism, given the sequence of the promoter region of the target gene and the expression state of a set of known or putative transcription factors and signaling molecules. Each motif model is either a k-length sequence, a dimer, or a PSSM that is built by agglomerative probabilistic clustering of sequences with similar boosting loss. By applying MEDUSA to a set of environmental stress response expression data in yeast, we learn motifs whose ability to predict differential expression of target genes outperforms motifs from the TRANSFAC dataset and from a previously published candidate set of PSSMs. We also show that MEDUSA retrieves many experimentally confirmed binding sites associated with environmental stress response from the literature. Comment: RECOMB 200

    A new multivariable 6-psi-6 summation formula

    By multidimensional matrix inversion, combined with an A_r extension of Jackson's 8-phi-7 summation formula by Milne, a new multivariable 8-phi-7 summation is derived. By a polynomial argument this 8-phi-7 summation is transformed to another multivariable 8-phi-7 summation which, by taking a suitable limit, is reduced to a new multivariable extension of the nonterminating 6-phi-5 summation. The latter is then extended, by analytic continuation, to a new multivariable extension of Bailey's very-well-poised 6-psi-6 summation formula. Comment: 16 pages

    Working with Research Integrity—Guidance for Research Performing Organisations: The Bonn PRINTEGER Statement

    This document presents the Bonn PRINTEGER Consensus Statement: Working with Research Integrity—Guidance for research performing organisations. The aim of the statement is to complement existing instruments by focusing specifically on institutional responsibilities for strengthening integrity. It takes into account the daily challenges and organisational contexts of most researchers. The statement intends to make research integrity challenges recognisable from the work-floor perspective, providing concrete advice on organisational measures to strengthen integrity. The statement, which was concluded on February 7th 2018, provides guidance on the following key issues: § 1. Providing information about research integrity § 2. Providing education, training and mentoring § 3. Strengthening a research integrity culture § 4. Facilitating open dialogue § 5. Wise incentive management § 6. Implementing quality assurance procedures § 7. Improving the work environment and work satisfaction § 8. Increasing transparency of misconduct cases § 9. Opening up research § 10. Implementing safe and effective whistle-blowing channels § 11. Protecting the alleged perpetrators § 12. Establishing a research integrity committee and appointing an ombudsperson § 13. Making explicit the applicable standards for research integrity. Merit, Expertise and Measurement

    A framework for complexity in palliative care: A qualitative study with patients, family carers and professionals

    Background: Palliative care patients are often described as complex but evidence on complexity is limited. We need to understand complexity, including at individual patient-level, to define specialist palliative care, characterise palliative care populations and meaningfully compare interventions/outcomes. Aim: To explore palliative care stakeholders’ views on what makes a patient more or less complex and insights on capturing complexity at patient-level. Design: In-depth qualitative interviews, analysed using Framework analysis. Participants/setting: Semi-structured interviews across six UK centres with patients, family, professionals, managers and senior leads, purposively sampled by experience, background, location and setting (hospital, hospice and community). Results: 65 participants provided an understanding of complexity, which extended far beyond the commonly used physical, psychological, social and spiritual domains. Complexity included how patients interact with family/professionals, how services respond to needs and societal perspectives on care. ‘Pre-existing’, ‘cumulative’ and ‘invisible’ complexity are further important dimensions to delivering effective palliative and end-of-life care. The dynamic nature of illness and needs over time was also profoundly influential. Adapting Bronfenbrenner’s Ecological Systems Theory, we categorised findings into the microsystem (person, needs and characteristics), chronosystem (dynamic influences of time), mesosystem (interactions with family/health professionals), exosystem (palliative care services/systems) and macrosystem (societal influences). Stakeholders found it acceptable to capture complexity at the patient-level, with perceived benefits for improving palliative care resource allocation. Conclusion: Our conceptual framework encompasses additional elements beyond physical, psychological, social and spiritual domains and advances systematic understanding of complexity within the context of palliative care. This framework helps capture patient-level complexity and target resource provision in specialist palliative care.

    A Study of the S=1/2 Alternating Chain using Multiprecision Methods

    In this paper we present results for the ground state and low-lying excitations of the S=1/2 alternating Heisenberg antiferromagnetic chain. Our more conventional techniques include perturbation theory about the dimer limit and numerical diagonalization of systems of up to 28 spins. A novel application of multiple precision numerical diagonalization allows us to determine analytical perturbation series to high order; the results found using this approach include ninth-order perturbation series for the ground state energy and one-magnon gap, which were previously known only to third order. We also give the fifth-order dispersion relation and third-order exclusive neutron scattering structure factor for one-magnon modes and numerical and analytical binding energies of S=0 and S=1 two-magnon bound states. Comment: 16 pages, 9 figures. For submission to Phys. Rev. B. PICT files of figs available at http://csep2.phy.ornl.gov/theory_group/people/barnes/barnes.htm

    Dapagliflozin: a sodium glucose cotransporter 2 inhibitor in development for type 2 diabetes

    Type 2 diabetes mellitus (T2DM) is a growing worldwide epidemic. Patients face lifelong therapy to control hyperglycemia and prevent the associated complications. There are many medications, with varying mechanisms, available for the treatment of T2DM, but almost all target the declining insulin sensitivity and secretion that are associated with disease progression. Medications with such insulin-dependent mechanisms of action often lose efficacy over time, and there is increasing interest in the development of new antidiabetes medications that are not dependent upon insulin. One such approach is the inhibition of renal glucose reuptake. Dapagliflozin, the first of a class of selective sodium glucose cotransporter 2 inhibitors, reduces renal glucose reabsorption and is currently under development for the treatment of T2DM. Here, we review the literature relating to the preclinical and clinical development of dapagliflozin.

    Quantitative PCR tissue expression profiling of the human SGLT2 gene and related family members

    SGLT2 (for “Sodium GLucose coTransporter” protein 2) is the major protein responsible for glucose reabsorption in the kidney, and its inhibition has been the focus of drug discovery efforts to treat type 2 diabetes. To clarify the human tissue distribution of expression of SGLT2 and related members of this cotransporter class, we performed TaqMan™ (Applied Biosystems, Foster City, CA, USA) quantitative polymerase chain reaction (PCR) analysis of SGLT2 and other sodium/glucose transporter genes on RNAs from 72 normal tissues from three different individuals. We consistently observe that SGLT2 is highly kidney-specific while SGLT5 is highly kidney-abundant; SGLT1, sodium-dependent amino acid transporter (SAAT1), and SGLT4 are highly abundant in small intestine and skeletal muscle; SGLT6 is expressed in the central nervous system; and sodium myoinositol cotransporter is ubiquitously expressed across all human tissues.