
    Predictability study on the aftershock sequence following the 2011 Tohoku-Oki, Japan, earthquake: first results

    Although no deterministic and reliable earthquake precursor is known to date, we are steadily gaining insight into probabilistic forecasting that draws on the space–time characteristics of earthquake clustering. Clustering-based models aiming to forecast earthquakes within the next 24 hours are under test in the global project 'Collaboratory for the Study of Earthquake Predictability' (CSEP). The 2011 March 11 magnitude 9.0 Tohoku-Oki earthquake in Japan provides a unique opportunity to test the existing 1-day CSEP models against its unprecedentedly active aftershock sequence. The original CSEP experiment performs tests after the catalogue is finalized, to avoid bias due to poor data quality. This study departs from that tradition and uses the preliminary catalogue revised and updated by the Japan Meteorological Agency (JMA), which is often incomplete but is immediately available. This study is intended as a first step towards operability-oriented earthquake forecasting in Japan. Encouragingly, at least one model passed the test in most combinations of target day and testing method, even though the models could not take account of the megaquake in advance and the catalogue used for forecast generation was incomplete. However, all models have only limited forecasting power for the period immediately after the quake. Our conclusion does not change when the preliminary JMA catalogue is replaced by the finalized one, implying that the models perform stably under the catalogue replacement and are applicable to operational earthquake forecasting. However, we emphasize the need for further research on model improvement to assure the reliability of forecasts for the days immediately after the main quake. Seismicity is expected to remain high in all parts of Japan over the coming years. Our results present a way to answer the urgent need to promote research on time-dependent earthquake predictability to prepare for subsequent large earthquakes in the near future in Japan.
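
    As a hedged illustration of the kind of count-based consistency check used in CSEP-style testing, the sketch below implements a Poisson N-test: a model's forecast rate for a target day is compared with the observed number of aftershocks. The function name and the example numbers are hypothetical, not taken from the study.

        # Minimal sketch of a CSEP-style N-test under a Poisson assumption.
        from scipy.stats import poisson

        def n_test(forecast_rate, observed_count, alpha=0.05):
            """Two-sided Poisson consistency test of a forecast event count."""
            delta1 = 1.0 - poisson.cdf(observed_count - 1, forecast_rate)  # P(N >= obs)
            delta2 = poisson.cdf(observed_count, forecast_rate)            # P(N <= obs)
            return delta1, delta2, min(delta1, delta2) > alpha / 2.0

        # Hypothetical: a model forecasting 120 aftershocks on a day with 95 observed
        print(n_test(120.0, 95))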

    PPLook: an automated data mining tool for protein-protein interaction

    Background: Extracting and visualizing protein-protein interactions (PPIs) from the text literature is a meaningful topic in protein science, since it assists the identification of interactions among proteins, yet tools to extract PPIs, visualize them and classify the results are lacking. Results: We developed a PPI search system, termed PPLook, which automatically extracts and visualizes PPIs from text. Given a query protein name, PPLook can search a dataset for other proteins interacting with it using a keyword-dictionary pattern-matching algorithm, and can display topological parameters such as the number of nodes, edges and connected components. The visualization component of PPLook allows the interaction relationships among proteins to be viewed in three-dimensional space, based on the OpenGL graphics interface. PPLook can also select a protein semantic class, count the proteins of that semantic class which interact with the query protein, and count the articles in which interactions involving the query protein appear. Moreover, PPLook provides heterogeneous search and a user-friendly graphical interface. Conclusions: PPLook is an effective tool for biologists and biosystem developers who need to access PPI information from the literature. PPLook is freely available for non-commercial users at http://meta.usc.edu/softs/PPLook.
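
    To make the keyword-dictionary pattern-matching idea concrete, here is a minimal sketch of co-mention extraction of the sort described; the protein dictionary, interaction keywords and sentence are invented for illustration and are not PPLook's actual data or API.

        # Toy dictionary-based PPI extraction: report proteins co-mentioned
        # with a query protein in sentences containing an interaction keyword.
        import re

        PROTEINS = {"TP53", "MDM2", "BRCA1"}                 # hypothetical dictionary
        KEYWORDS = {"binds", "interacts with", "phosphorylates"}

        def find_partners(sentence, query):
            hits = set()
            if query in sentence and any(k in sentence for k in KEYWORDS):
                for name in PROTEINS - {query}:
                    if re.search(r"\b" + re.escape(name) + r"\b", sentence):
                        hits.add(name)
            return hits

        print(find_partners("MDM2 binds TP53 and promotes its degradation.", "TP53"))
        # -> {'MDM2'}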

    Comparative Genetic Mapping and Discovery of Linkage Disequilibrium Across Linkage Groups in White Clover (Trifolium repens L.)

    White clover (Trifolium repens L.) is an allotetraploid species (2n = 4X = 32) that is widely distributed in temperate regions and cultivated as a forage legume. In this study, we developed expressed sequence tag (EST)-derived simple sequence repeat (SSR) markers, constructed linkage maps, and performed comparative mapping with other legume species. A total of 7982 ESTs that could be assembled into 5400 contigs and 2582 singletons were generated. Using the EST sequences obtained, 1973 primer pairs amplifying EST-derived SSR markers were designed and used, together with previously published SSR markers, for linkage analysis of 188 F1 progenies generated by a cross between two Japanese plants, '273-7' and 'T17-349'. An integrated linkage map was constructed by combining parental-specific maps; it consisted of 1743 SSR loci on 16 homeologous linkage groups with a total length of 2511 cM. The primer sequences of the developed EST-SSR markers and their map positions are available at http://clovergarden.jp/. Linkage disequilibrium (LD) was observed on 9 of the 16 linkage groups of a parental-specific map. The genome structures were compared among white clover, red clover (T. pratense L.), Medicago truncatula, and Lotus japonicus, and macrosynteny was observed across the four legume species. Surprisingly, the genome structure was more highly conserved between white clover and M. truncatula than between the two clover species.
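
    As a sketch of how pairwise LD of the kind reported here is quantified, the snippet below computes r^2 from two-locus haplotype counts; the counts are hypothetical, not data from the study.

        # r^2 between two biallelic loci from haplotype counts n_AB, n_Ab, n_aB, n_ab.
        def r_squared(n_AB, n_Ab, n_aB, n_ab):
            n = n_AB + n_Ab + n_aB + n_ab
            p_A = (n_AB + n_Ab) / n          # allele A frequency at locus 1
            p_B = (n_AB + n_aB) / n          # allele B frequency at locus 2
            D = n_AB / n - p_A * p_B         # disequilibrium coefficient
            return D * D / (p_A * (1 - p_A) * p_B * (1 - p_B))

        print(r_squared(40, 10, 10, 40))     # -> 0.36 for these toy counts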

    Text Mining the History of Medicine

    Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestion of synonyms of user-entered query terms, exploration of different concepts mentioned within search results, or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, owing to differences and changes in vocabulary, terminology, language structure and style compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid-19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible, semantically oriented search system. The novel resources are available for research purposes, while the processing pipeline and its modules may be used and configured within the Argo TM platform.
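
    As a hedged sketch of the synonym-suggestion functionality described above, the snippet below expands a query with historical variants; the two lexicon entries reflect genuine historical usage, but the data structure is invented for illustration and is not the system's actual resource format.

        # Query expansion with historical synonym/variant lists.
        SYNONYMS = {
            "tuberculosis": ["phthisis", "consumption"],   # 19th-century terms
            "typhoid": ["enteric fever"],
        }

        def expand_query(term):
            return [term] + SYNONYMS.get(term.lower(), [])

        print(expand_query("tuberculosis"))
        # -> ['tuberculosis', 'phthisis', 'consumption']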

    Enhanced annealing of mismatched oligonucleotides using a novel melting curve assay allows efficient in vitro discrimination and restriction of a single nucleotide polymorphism

    Background: Many SNP discrimination strategies employ natural restriction endonucleases to discriminate between allelic states. However, SNPs are often not associated with a restriction site, and therefore a number of attempts have been made to generate sequence-adaptable restriction endonucleases. In this study, a simple, sequence-adaptable SNP discrimination mechanism between a 'wild-type' and 'mutant' template is demonstrated. This model differs from other artificial restriction endonuclease models as cis- rather than trans-orientated regions of single-stranded DNA were generated and cleaved, and therefore overcomes potential issues of either inefficient or non-specific binding when only a single variant is targeted. Results: A series of mismatch 'bubbles' spanning 0-5 bp surrounding a point mutation was generated and analysed for sensitivity to S1 nuclease. In this model, generation of oligonucleotide-mediated ssDNA mismatch 'bubbles' in the presence of S1 nuclease resulted in the selective degradation of the mutant template while maintaining wild-type template integrity. Increasing the size of the mismatch increased the rate of mutant sequence degradation, until a threshold above which discrimination was lost and the wild-type sequence was degraded. This level of fine discrimination was possible due to the development of a novel high-resolution melting curve assay to empirically determine changes in Tm (~5.0°C per base-pair mismatch) and to optimise annealing conditions (~18.38°C below Tm) of the mismatched oligonucleotide sets. Conclusions: The in vitro 'cleavage bubble' model presented is sequence-adaptable as determined by the binding oligonucleotide and hence has the potential to be tailored to discriminate between any two or more SNPs. Furthermore, the demonstrated fluorometric assay has broad application potential, offering a rapid, sensitive and high-throughput means to determine Tm and annealing rates as an alternative to conventional hybridisation detection strategies.
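
    The two empirical constants reported above (a Tm drop of ~5.0°C per base-pair mismatch, and annealing at ~18.38°C below Tm) combine into a simple calculation; the sketch below and its example Tm are illustrative, not part of the published assay.

        # Annealing temperature for a mismatched oligonucleotide, assuming a
        # linear Tm drop per mismatch and a fixed offset below Tm (both values
        # from the abstract); the matched-duplex Tm of 65 C is hypothetical.
        def annealing_temperature(tm_matched, n_mismatches,
                                  drop_per_mismatch=5.0, offset=18.38):
            return tm_matched - drop_per_mismatch * n_mismatches - offset

        print(annealing_temperature(65.0, 3))   # -> 31.62 for a 3-bp bubble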

    Mining metabolites: extracting the yeast metabolome from the literature

    Text mining methods have added considerably to our capacity to extract biological knowledge from the literature. Recently the field of systems biology has begun to model and simulate metabolic networks, requiring knowledge of the set of molecules involved. While genomics and proteomics technologies are able to supply the macromolecular parts list, the metabolites are less easily assembled. Most metabolites are known and reported through the scientific literature, rather than through large-scale experimental surveys, so it is important to recover them from the literature. Here we present a novel tool to automatically identify metabolite names in the literature, and associate structures where possible, to define the reported yeast metabolome. With ten-fold cross validation on a manually annotated corpus, our recognition tool generates an f-score of 78.49 (precision of 83.02) and demonstrates greater suitability for identifying metabolite names than existing recognition tools for general chemical molecules. The metabolite recognition tool has been applied to the literature covering an important model organism, the yeast Saccharomyces cerevisiae, to define its reported metabolome. By coupling to ChemSpider, a major chemical database, we have identified structures for much of the reported metabolome and, where structure identification fails, been able to suggest extensions to ChemSpider. Our manually annotated gold-standard data on 296 abstracts are available as supplementary materials. Metabolite names and, where appropriate, structures are also available as supplementary materials.
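
    The reported f-score and precision determine the recall via the standard definition F = 2PR/(P + R), i.e. R = FP/(2P - F); a quick check:

        # Recall implied by the reported F-score and precision.
        def recall_from_f_and_p(f, p):
            return f * p / (2 * p - f)

        print(recall_from_f_and_p(78.49, 83.02))   # ~74.4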

    Using Workflows to Explore and Optimise Named Entity Recognition for Chemistry

    Chemistry text mining tools should be interoperable and adaptable regardless of system-level implementation, installation or even programming issues. We aim to abstract the functionality of these tools from the underlying implementation via reconfigurable workflows for automatically identifying chemical names. To achieve this, we refactored an established named entity recogniser in the chemistry domain, OSCAR, and studied the impact of each component on the net performance. We developed two reconfigurable workflows from OSCAR using an interoperable text mining framework, U-Compare. These workflows can be altered using the drag-and-drop mechanism of the graphical user interface of U-Compare, and they provide a platform to study the relationship between text mining components such as tokenisation and named entity recognition (using maximum entropy Markov model (MEMM) and pattern recognition based classifiers). Results indicate that, for chemistry in particular, eliminating the noise generated by tokenisation leads to slightly better performance in terms of named entity recognition (NER) accuracy. Poor tokenisation translates into poorer input to the classifier components, which in turn leads to an increase in Type I or Type II errors, thus lowering the overall performance. On the Sciborg corpus, the workflow-based system, which uses a new tokeniser whilst retaining the same MEMM component, increases the F-score from 82.35% to 84.44%. On the PubMed corpus, it recorded an F-score of 84.84%, against 84.23% for OSCAR.
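
    A toy illustration of why tokenisation dominates downstream NER accuracy: an entity mention can only be matched if token boundaries align with it. The two tokenisers below are simplistic stand-ins, not OSCAR's or U-Compare's actual components.

        import re

        def naive_tokenise(text):
            return text.split()                    # whitespace only

        def punct_aware_tokenise(text):
            # keep commas/hyphens inside names, split off trailing punctuation
            return re.findall(r"\w[\w,\-()]*\w|\w|[^\w\s]", text)

        sentence = "We dissolved 2,2-dimethylbutane, then heated it."
        entity = "2,2-dimethylbutane"
        for tok in (naive_tokenise, punct_aware_tokenise):
            print(tok.__name__, entity in tok(sentence))
        # naive_tokenise False   (the token carries the trailing comma)
        # punct_aware_tokenise True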

    Centrality Dependence of Charged Particle Multiplicity in Au-Au Collisions at sqrt(s_NN)=130 GeV

    We present results for the charged-particle multiplicity distribution at mid-rapidity in Au - Au collisions at sqrt(s_NN) = 130 GeV measured with the PHENIX detector at RHIC. For the 5% most central collisions we find dN_ch/dη|η=0 = 622 ± 1 (stat) ± 41 (syst). The results, analyzed as a function of centrality, show a steady rise of the particle density per participating nucleon with centrality. Comment: 307 authors, 43 institutions, 6 pages, 4 figures, 1 table. Minor changes to figure labels and text to meet PRL requirements. One author added: M. Hibino of Waseda University.

    Flow Measurements via Two-particle Azimuthal Correlations in Au + Au Collisions at sqrt(s_NN) = 130 GeV

    Two-particle azimuthal correlation functions are presented for charged hadrons produced in Au + Au collisions at sqrt(s_NN) = 130 GeV at RHIC. The measurements permit determination of elliptic flow without event-by-event estimation of the reaction plane. The extracted elliptic flow values v_2 show significant sensitivity to both the collision centrality and the transverse momenta of emitted hadrons, suggesting rapid thermalization and relatively strong velocity fields. When scaled by the eccentricity of the collision zone, epsilon, the scaled elliptic flow shows little or no dependence on centrality for charged hadrons with relatively low p_T. A breakdown of this epsilon scaling is observed for charged hadrons with p_T > 1.0 GeV/c for the most central collisions. Comment: 6 pages, RevTeX 3, 4 figures, 307 authors, submitted to Phys. Rev. Lett. on 11 April 2002. Plain text data tables for the points plotted in figures for this and previous PHENIX publications are (will be made) publicly available at http://www.phenix.bnl.gov/phenix/WWW/run/phenix/papers.htm
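
    A hedged sketch of the underlying method: for pairs of particles, the average of cos 2(phi_i - phi_j) equals v_2^2, so elliptic flow can be extracted without estimating the reaction plane. The toy Monte Carlo below is illustrative only, not the PHENIX analysis code.

        import numpy as np

        rng = np.random.default_rng(0)
        v2_true = 0.06
        # Sample azimuthal angles from dN/dphi ~ 1 + 2 v2 cos(2 phi) by rejection.
        phis = []
        while len(phis) < 20000:
            phi = rng.uniform(0.0, 2.0 * np.pi)
            if rng.uniform(0.0, 1.0 + 2.0 * v2_true) < 1.0 + 2.0 * v2_true * np.cos(2.0 * phi):
                phis.append(phi)
        phis = np.array(phis)

        # Average cos 2(delta phi) over random distinct pairs; sqrt recovers v2.
        i, j = rng.integers(0, len(phis), size=(2, 200000))
        mask = i != j
        c2 = np.mean(np.cos(2.0 * (phis[i[mask]] - phis[j[mask]])))
        print(np.sqrt(max(c2, 0.0)))   # ~0.06, the input v2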

    Measurement of the mid-rapidity transverse energy distribution from sqrt(s_NN) = 130 GeV Au+Au collisions at RHIC

    The first measurement of energy produced transverse to the beam direction at RHIC is presented. The mid-rapidity transverse energy density per participating nucleon rises steadily with the number of participants, closely paralleling the rise in charged-particle density, such that E_T/N_ch remains relatively constant as a function of centrality. The energy density calculated via Bjorken's prescription for the 2% most central Au+Au collisions at sqrt(s_NN) = 130 GeV is at least epsilon_Bj = 4.6 GeV/fm^3, which is a factor of 1.6 larger than found at sqrt(s_NN) = 17.2 GeV (Pb+Pb at CERN). Comment: 307 authors, 6 pages, 4 figures, 1 table, submitted to PRL 4/18/2001; revised version submitted to PRL 5/24/2001
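
    Bjorken's prescription referenced above estimates the initial energy density from the transverse energy per unit rapidity, eps_Bj = (dE_T/dy) / (tau_0 * pi * R^2), with formation time tau_0 conventionally taken as 1 fm/c and R the transverse radius of the overlap zone. The input numbers below are illustrative, not the paper's measured values.

        import math

        def bjorken_energy_density(dEt_dy_gev, radius_fm, tau0_fm=1.0):
            # Energy density in GeV/fm^3 for a cylindrical overlap zone.
            return dEt_dy_gev / (tau0_fm * math.pi * radius_fm ** 2)

        print(bjorken_energy_density(600.0, 6.5))   # ~4.5 GeV/fm^3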