
    Marked Gibbs point processes with unbounded interaction: an existence result

    We construct marked Gibbs point processes in $\mathbb{R}^d$ under quite general assumptions. Firstly, we allow for interaction functionals that may be unbounded and whose range is not assumed to be uniformly bounded. Indeed, our typical interaction admits an a.s. finite but random range. Secondly, the random marks -- attached to the locations in $\mathbb{R}^d$ -- belong to a general normed space $\mathcal{S}$. They are not bounded, but their law should admit a super-exponential moment. The approach used here relies on the so-called entropy method and large-deviation tools in order to prove tightness of a family of finite-volume Gibbs point processes. An application to infinite-dimensional interacting diffusions is also presented.
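
As a point of reference, a super-exponential moment assumption on the mark law $\mu$ on $\mathcal{S}$ is often formalised as the finiteness of all exponential moments; this is a typical formulation of such a condition, not necessarily the exact hypothesis of the paper:

\[
  \int_{\mathcal{S}} e^{t \| m \|} \, \mu(\mathrm{d}m) \;<\; \infty \qquad \text{for every } t > 0 .
\]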

    Locality properties for discrete and continuum Widom-Rowlinson models in random environments

    We consider the Widom-Rowlinson model, in which hard disks of two possible colors are subject to a hard-core repulsion between particles of different colors, in quenched random environments. These random environments model spatially dependent preferences for the attachment of disks. We investigate the possibility of representing the joint process of environment and infinite-volume Widom-Rowlinson measure in terms of continuous (quasilocal) Papangelou intensities. We show that this is not always possible: in the case of the symmetric Widom-Rowlinson model on a non-percolating environment, we can explicitly construct a discontinuity coming from the environment. This is a new phenomenon for systems of continuous particles, but it can be understood as a continuous-space echo of a simpler non-locality phenomenon known to appear for the diluted Ising model (Griffiths singularity random field) on the lattice, as we explain in the course of the proof.
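
For orientation, the Papangelou intensity of a two-color hard-disk model of Widom-Rowlinson type can be written schematically as follows, where the disk radius $r$ and activity $z$ are illustrative and the environment dependence is suppressed:

\[
  \lambda\big((x,\sigma) \mid \omega\big) \;=\; z \, \mathbf{1}\big\{ |x-y| \ge 2r \ \text{for all } (y,\sigma') \in \omega \text{ with } \sigma' \neq \sigma \big\}.
\]

Quasilocality then asks that $\omega \mapsto \lambda((x,\sigma) \mid \omega)$ depend continuously on $\omega$, with a vanishing influence of disks far from $x$.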

    Diffusion dynamics for an infinite system of two-type spheres and the associated depletion effect

    We consider a random diffusion dynamics for an infinite system of hard spheres of two different sizes evolving in ℝ^d, its reversible probability measure, and its projection on the subset of the large spheres. The main feature is the occurrence of an attractive short-range dynamical interaction -- known in the physics literature as a depletion interaction -- between the large spheres, which is induced by the hidden presence of the small ones. By considering the asymptotic limit for such a system when the density of the particles is high, we also obtain a constructive dynamical approach to the famous discrete geometry problem of maximising the contact number of n identical spheres in ℝ^d. As support material, we provide numerical simulations in the form of movies.
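
As a small illustration of the contact-number quantity mentioned above, the following sketch counts touching pairs in a finite configuration of identical spheres; the configuration, radius, and tolerance are hypothetical and only meant to show the computation, not the dynamics studied in the paper.

```python
import numpy as np

def contact_number(centers: np.ndarray, radius: float, tol: float = 1e-9) -> int:
    """Count pairs of identical spheres that touch, i.e. whose centre
    distance equals 2*radius (up to a numerical tolerance)."""
    n = len(centers)
    contacts = 0
    for i in range(n):
        for j in range(i + 1, n):
            dist = np.linalg.norm(centers[i] - centers[j])
            if abs(dist - 2.0 * radius) <= tol:
                contacts += 1
    return contacts

# Illustrative example: four unit spheres at the vertices of a regular
# tetrahedron with edge length 2 touch pairwise, giving 6 contacts.
tetra = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float)
tetra *= 2.0 / np.linalg.norm(tetra[0] - tetra[1])  # rescale edges to length 2
print(contact_number(tetra, radius=1.0))  # expected output: 6
```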

    Data Management Plans in the genomics research revolution of Africa: Challenges and Recommendations

    Drafting and writing a data management plan (DMP) is increasingly seen as a key part of the academic research process. A DMP is a document that describes how a researcher will collect, document, describe, share, and preserve the data generated as part of a research project. The DMP illustrates the importance of utilizing best practices through all stages of working with data, while ensuring the accessibility, quality, and longevity of the data. The benefits of writing a DMP include compliance with funder and institutional mandates; making research more transparent (for reproduction and validation purposes) and FAIR (findable, accessible, interoperable, reusable); protecting data subjects; and ensuring compliance with the General Data Protection Regulation (GDPR) and/or local data protection policies. In this review, we highlight the importance of a DMP in modern biomedical research, explaining both the rationale and the current best practices associated with DMPs. In addition, we outline various funders' requirements concerning DMPs and discuss open-source tools that facilitate the development and implementation of a DMP. Finally, we discuss DMPs in the context of African research and the considerations that need to be made in this regard.

    Proposed minimum information guideline for kidney disease—research and clinical data reporting: a cross-sectional study

    Objective: This project aimed to develop and propose a standardised reporting guideline for kidney disease research and clinical data reporting, in order to improve kidney disease data quality and integrity and to combat the challenges associated with the management of 'Big Data'.
    Methods: A list of recommendations was proposed for the reporting guideline based on a systematic review and consolidation of previously published data collection and reporting standards, including PhenX measures and Minimal Information about a Proteomics Experiment (MIAPE). Thereafter, these recommendations were reviewed by domain specialists using an online survey developed in Research Electronic Data Capture (REDCap). Following interpretation and consolidation of the survey results, the recommendations were mapped to existing ontologies using Zooma, the Ontology Lookup Service, and the BioPortal search engine. Additionally, an associated eXtensible Markup Language (XML) schema was created for the REDCap implementation to increase user-friendliness and adoption.
    Results: The online survey was completed by 53 respondents; the majority were dual clinician-researchers (57%), based in Australia (35%), Africa (33%), and North America (22%). Data elements within the reporting standard were identified as participant-level, study-level, and experiment-level information, further subdivided into essential or optional information.
    Conclusion: The reporting guideline is readily employable for kidney disease research projects and is also adaptable for clinical use. Its adoption in kidney disease research can increase data quality and the value of data for long-term preservation, ensuring researchers gain the maximum benefit from their collected and generated data.
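
For readers who want to automate a similar ontology-mapping step, the sketch below shows one way to look up a candidate data element against the EMBL-EBI Ontology Lookup Service REST API; the endpoint, query term, and response fields are assumptions based on the public OLS documentation and are not part of the study itself.

```python
import requests

def search_ols(term: str, rows: int = 5):
    """Query the EMBL-EBI Ontology Lookup Service (OLS) search endpoint
    for ontology terms matching a free-text data element name."""
    url = "https://www.ebi.ac.uk/ols/api/search"
    resp = requests.get(url, params={"q": term, "rows": rows}, timeout=30)
    resp.raise_for_status()
    docs = resp.json().get("response", {}).get("docs", [])
    # Each hit is expected to carry a human-readable label, a CURIE-style
    # identifier and the source ontology; fall back gracefully if absent.
    return [(d.get("label"), d.get("obo_id"), d.get("ontology_name")) for d in docs]

# Hypothetical usage: map the data element "serum creatinine" to candidate terms.
for label, curie, ontology in search_ols("serum creatinine"):
    print(f"{curie}\t{ontology}\t{label}")
```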

    Ten simple rules for providing effective bioinformatics research support.

    Life scientists are increasingly turning to high-throughput sequencing technologies in their research programs, owing to the enormous potential of these methods. In parallel, the number of core facilities that provide bioinformatics support is also increasing. Notably, the generation of complex, large datasets has necessitated the development of bioinformatics support core facilities that aid laboratory scientists with cost-effective and efficient data management, analysis, and interpretation. In this article, we address the challenges (related to communication, good laboratory practice, and data handling) that may be encountered in core support facilities when providing bioinformatics support, drawing on our own experiences working as support bioinformaticians on multidisciplinary research projects. Most importantly, the article proposes a list of guidelines that outline how these challenges can be preemptively avoided and effectively managed to increase the value of outputs to the end user, covering the entire research project lifecycle, including experimental design, data analysis, and data management (i.e., sharing and storage). In addition, we highlight the importance of clear and transparent communication, comprehensive preparation, appropriate handling of samples and data using monitoring systems, and the use of appropriate tools and standard operating procedures to provide effective bioinformatics support.

    Nonnegative principal component analysis for mass spectral serum profiles and biomarker discovery

    Background: As a novel cancer diagnostic paradigm, mass spectrometric serum proteomic pattern diagnostics has been reported to be superior to conventional serologic cancer biomarkers. However, its clinical use is not yet fully validated. An important factor preventing this young technology from becoming a mainstream cancer diagnostic paradigm is that robustly identifying cancer molecular patterns from high-dimensional protein expression data remains a challenge in machine learning and oncology research. As a well-established dimension reduction technique, PCA is widely integrated into pattern recognition analysis to discover cancer molecular patterns. However, its global feature selection mechanism prevents it from capturing local features, which may make high-performance proteomic pattern discovery difficult, because only features interpreting global data behavior are used to train a learning machine.
    Methods: In this study, we develop a nonnegative principal component analysis algorithm and present a nonnegative principal component analysis based support vector machine algorithm with sparse coding to conduct high-performance proteomic pattern classification. Moreover, we propose a nonnegative principal component analysis based filter-wrapper biomarker capturing algorithm for mass spectral serum profiles.
    Results: We demonstrate the superiority of the proposed algorithm by comparison with six peer algorithms on four benchmark datasets. Moreover, we illustrate that nonnegative principal component analysis can be effectively used to capture meaningful biomarkers.
    Conclusion: Our analysis suggests that nonnegative principal component analysis effectively conducts local feature selection for mass spectral profiles and contributes to improving sensitivities and specificities in the subsequent classification, as well as to meaningful biomarker discovery.
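
To make the general idea concrete, the following sketch extracts nonnegative principal components with a simple projected power iteration and feeds the projections to a support vector machine; this is a simplified illustration on made-up data, not the authors' exact algorithm or datasets.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def nonnegative_pcs(X, n_components=5, n_iter=200):
    """Extract nonnegative loading vectors by projected power iteration
    with deflation: maximise variance subject to w >= 0 and ||w|| = 1."""
    Xc = X - X.mean(axis=0)          # centre the features
    cov = Xc.T @ Xc / (len(Xc) - 1)  # sample covariance matrix
    components = []
    for _ in range(n_components):
        w = np.abs(np.random.default_rng(0).normal(size=cov.shape[0]))
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            w = np.clip(cov @ w, 0.0, None)   # power step + nonnegativity
            norm = np.linalg.norm(w)
            if norm == 0:
                break
            w /= norm
        components.append(w)
        cov = cov - (w @ cov @ w) * np.outer(w, w)   # deflate
    return np.array(components)

# Hypothetical spectra: 100 samples x 500 m/z intensity features, binary labels.
rng = np.random.default_rng(1)
X = np.abs(rng.normal(size=(100, 500)))
y = rng.integers(0, 2, size=100)

W = nonnegative_pcs(X, n_components=5)
scores = cross_val_score(SVC(kernel="linear"), X @ W.T, y, cv=5)
print("cross-validated accuracy:", scores.mean())
```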

    Graph similarity through entropic manifold alignment

    In this paper we decouple the problem of measuring graph similarity into two sequential steps. The first step is the linearization of the quadratic assignment problem (QAP) in a low-dimensional space, given by the embedding trick. The second step is the evaluation of an information-theoretic distributional measure, which relies on deformable manifold alignment. The proposed measure is a normalized conditional entropy, which induces a positive definite kernel when symmetrized. We use bypass entropy estimation methods to compute an approximation of the normalized conditional entropy. Our approach, which is purely topological (i.e., it does not rely on node or edge attributes, although it can potentially accommodate them as additional sources of information), is competitive with state-of-the-art graph matching algorithms as a source of correspondence-based graph similarity, but its complexity is linear instead of cubic (although the complexity of the similarity measure is quadratic). We also determine that the best embedding strategy for graph similarity is provided by commute time embedding, and we conjecture that this is related to its invertibility property, since the inverse of the embeddings obtained using our method can be used as a generative sampler of graph structure. The work of the first and third authors was supported by the projects TIN2012-32839 and TIN2015-69077-P of the Spanish Government. The work of the second author was supported by a Royal Society Wolfson Research Merit Award.
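
Since commute time embedding plays a central role above, here is a minimal sketch of how such an embedding can be computed from a graph's Laplacian spectrum; the small example graph is hypothetical, and the code illustrates only the standard construction, not the full similarity pipeline of the paper.

```python
import numpy as np

def commute_time_embedding(adj: np.ndarray) -> np.ndarray:
    """Embed the nodes of an undirected graph so that squared Euclidean
    distances between rows equal commute times of the random walk."""
    deg = adj.sum(axis=1)
    vol = deg.sum()                      # graph volume = sum of degrees
    lap = np.diag(deg) - adj             # combinatorial Laplacian
    eigvals, eigvecs = np.linalg.eigh(lap)
    keep = eigvals > 1e-10               # drop the trivial zero eigenvalue(s)
    # Node i gets coordinates sqrt(vol) * phi_k(i) / sqrt(lambda_k).
    return np.sqrt(vol) * eigvecs[:, keep] / np.sqrt(eigvals[keep])

# Hypothetical 4-node path graph: the commute time between two adjacent nodes
# equals vol * effective resistance = 6 * 1 = 6 (the squared embedding distance).
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
emb = commute_time_embedding(path)
print(np.linalg.norm(emb[0] - emb[1]) ** 2)   # approximately 6.0
```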