170 research outputs found

    Hidden Markov model speed heuristic and iterative HMM search procedure

    BACKGROUND: Profile hidden Markov models (profile-HMMs) are sensitive tools for remote protein homology detection, but the main scoring algorithms, Viterbi or Forward, require considerable time to search large sequence databases. RESULTS: We have designed a series of database filtering steps, HMMERHEAD, that are applied prior to the scoring algorithms, as implemented in the HMMER package, in an effort to reduce search time. Using this heuristic, we obtain a 20-fold decrease in Forward and a 6-fold decrease in Viterbi search time with a minimal loss in sensitivity relative to the unfiltered approaches. We then implemented an iterative profile-HMM search method, JackHMMER, which employs the HMMERHEAD heuristic. Due to our search heuristic, we eliminated the subdatabase creation that is common in current iterative profile-HMM approaches. On our benchmark, JackHMMER detects 14% more remote protein homologs than SAM's iterative method T2K. CONCLUSIONS: Our search heuristic, HMMERHEAD, significantly reduces the time needed to score a profile-HMM against large sequence databases. This search heuristic allowed us to implement an iterative profile-HMM search method, JackHMMER, which detects significantly more remote protein homologs than SAM's T2K and NCBI's PSI-BLAST

    Eigenvectors of the discrete Laplacian on regular graphs - a statistical approach

    In an attempt to characterize the structure of eigenvectors of random regular graphs, we investigate the correlations between the components of the eigenvectors associated to different vertices. In addition, we provide numerical observations, suggesting that the eigenvectors follow a Gaussian distribution. Following this assumption, we reconstruct some properties of the nodal structure which were observed in numerical simulations, but were not explained so far. We also show that some statistical properties of the nodal pattern cannot be described in terms of a percolation model, as opposed to the suggested correspondence for eigenvectors of 2 dimensional manifolds. Comment: 28 pages, 11 figures

    Geometric characterization of nodal domains: the area-to-perimeter ratio

    In an attempt to characterize the distribution of forms and shapes of nodal domains in wave functions, we define a geometric parameter - the ratio $\rho$ between the area of a domain and its perimeter, measured in units of the wavelength $1/\sqrt{E}$. We show that the distribution function $P(\rho)$ can distinguish between domains in which the classical dynamics is regular or chaotic. For separable surfaces, we compute the limiting distribution, and show that it is supported by an interval, which is independent of the properties of the surface. In systems which are chaotic, or in random-waves, the area-to-perimeter distribution has substantially different features which we study numerically. We compare the features of the distribution for chaotic wave functions with the predictions of the percolation model to find agreement, but only for nodal domains which are big with respect to the wavelength scale. This work is also closely related to, and provides a new point of view on isoperimetric inequalities. Comment: 22 pages, 11 figures

    Personal health promotion at US medical schools: a quantitative study and qualitative description of deans' and students' perceptions

    BACKGROUND: Prior literature has shown that physicians with healthy personal habits are more likely to encourage patients to adopt similar habits. However, despite the possibility that promoting medical student health might therefore efficiently improve patient outcomes, no one has studied whether such promotion happens in medical school. We therefore wished to describe both typical and outstanding personal health promotion environments experienced by students in U.S. medical schools. METHODS: We collected information through four different modalities: a literature review, written surveys of medical school deans and students, student and dean focus groups, and site visits at and interviews with medical schools with reportedly outstanding student health promotion programs. RESULTS: We found strong correlations between deans' and students' perceptions of their schools' health promotion environments, including consistent support of the idea of schools' encouraging healthy student behaviors, with less consistent follow-through by schools on this concept. Though students seemed to have thought little about the relationships between their own personal and clinical health promotion practices, deans felt strongly that faculty members should model healthy behaviors. CONCLUSIONS: Deans' support of the relationship between physicians' personal and clinical health practices, and concern about their institutions' acting on this relationship augurs well for the role of student health promotion in the future of medical education. Deans seem to understand their students' health environment, and believe it could and should be improved; if this is acted on, it could create important positive changes in medical education and in disease prevention

    Isospectral discrete and quantum graphs with the same flip counts and nodal counts

    The existence of non-isomorphic graphs which share the same Laplace spectrum (to be referred to as isospectral graphs) leads naturally to the following question: What additional information is required in order to resolve isospectral graphs? It was suggested by Band, Shapira and Smilansky that this might be achieved by either counting the number of nodal domains or the number of times the eigenfunctions change sign (the so-called flip count). Recently examples of (discrete) isospectral graphs with the same flip count and nodal count have been constructed by K. Ammann by utilising Godsil-McKay switching. Here we provide a simple alternative mechanism that produces systematic examples of both discrete and quantum isospectral graphs with the same flip and nodal counts. Comment: 16 pages, 4 figures

    Data standards can boost metabolomics research, and if there is a will, there is a way.

    Thousands of articles using metabolomics approaches are published every year. With the increasing amounts of data being produced, mere description of investigations as text in manuscripts is not sufficient to enable re-use anymore: the underlying data needs to be published together with the findings in the literature to maximise the benefit from public and private expenditure and to take advantage of an enormous opportunity to improve scientific reproducibility in metabolomics and cognate disciplines. Reporting recommendations in metabolomics started to emerge about a decade ago and were mostly concerned with inventories of the information that had to be reported in the literature for consistency. In recent years, metabolomics data standards have developed extensively, to include the primary research data, derived results and the experimental description and importantly the metadata in a machine-readable way. This includes vendor independent data standards such as mzML for mass spectrometry and nmrML for NMR raw data that have both enabled the development of advanced data processing algorithms by the scientific community. Standards such as ISA-Tab cover essential metadata, including the experimental design, the applied protocols, association between samples, data files and the experimental factors for further statistical analysis. Altogether, they pave the way for both reproducible research and data reuse, including meta-analyses. Further incentives to prepare standards compliant data sets include new opportunities to publish data sets, but also require a little "arm twisting" in the author guidelines of scientific journals to submit the data sets to public repositories such as the NIH Metabolomics Workbench or MetaboLights at EMBL-EBI. In the present article, we look at standards for data sharing, investigate their impact in metabolomics and give suggestions to improve their adoption

    A genetic algorithm-Bayesian network approach for the analysis of metabolomics and spectroscopic data: application to the rapid detection of Bacillus spores and identification of Bacillus species

    Background: The rapid identification of Bacillus spores and bacterial identification are paramount because of their implications in food poisoning, pathogenesis and their use as potential biowarfare agents. Many automated analytical techniques such as Curie-point pyrolysis mass spectrometry (Py-MS) have been used to identify bacterial spores, giving rise to large amounts of analytical data. This high number of features makes interpretation of the data extremely difficult. We analysed Py-MS data from 36 different strains of aerobic endospore-forming bacteria encompassing seven different species. These bacteria were grown axenically on nutrient agar and vegetative biomass and spores were analyzed by Curie-point Py-MS. Results: We develop a novel genetic algorithm-Bayesian network algorithm that accurately identifies and selects a small subset of key relevant mass spectra (biomarkers) to be further analysed. Once identified, this subset of relevant biomarkers was then used to identify Bacillus spores successfully and to identify Bacillus species via a Bayesian network model specifically built for this reduced set of features. Conclusions: This final compact Bayesian network classification model is parsimonious, computationally fast to run and its graphical visualization allows easy interpretation of the probabilistic relationships among selected biomarkers. In addition, we compare the features selected by the genetic algorithm-Bayesian network approach with the features selected by partial least squares-discriminant analysis (PLS-DA). The classification accuracy results show that the set of features selected by the GA-BN is far superior to PLS-DA

    COordination of Standards in MetabOlomicS (COSMOS): facilitating integrated metabolomics data access

    Metabolomics has become a crucial phenotyping technique in a range of research fields including medicine, the life sciences, biotechnology and the environmental sciences. This necessitates the transfer of experimental information between research groups, as well as potentially to publishers and funders. After the initial efforts of the metabolomics standards initiative, minimum reporting standards were proposed which included the concepts for metabolomics databases. Built by the community, standards and infrastructure for metabolomics are still needed to allow storage, exchange, comparison and re-utilization of metabolomics data. The Framework Programme 7 EU Initiative ‘coordination of standards in metabolomics’ (COSMOS) is developing a robust data infrastructure and exchange standards for metabolomics data and metadata. This is to support workflows for a broad range of metabolomics applications within the European metabolomics community and the wider metabolomics and biomedical communities’ participation. Here we announce our concepts and efforts asking for re-engagement of the metabolomics community, academics and industry, journal publishers, software and hardware vendors, as well as those interested in standardisation worldwide (addressing missing metabolomics ontologies, complex-metadata capturing and XML based open source data exchange format), to join and work towards updating and implementing metabolomics standards