614 research outputs found

    Moment tensors for rapid characterization of megathrust earthquakes: the example of the 2011 M9 Tohoku-oki, Japan earthquake

    The rapid detection and characterization of megathrust earthquakes is a difficult task given their large rupture zone and duration. These events produce very strong ground vibrations in the near field that can cause weak-motion instruments to clip, and they are also capable of generating large-scale tsunamis. The 2011 M9 Tohoku-oki earthquake that occurred offshore Japan is one member of a series of great earthquakes for which extended geophysical observations are available. Here, we test an automated scanning algorithm for great earthquakes using continuous very long-period (100-200 s) seismic records from K-NET strong-motion seismograms of the earthquake. By continuously performing the cross-correlation of data and Green's functions (GFs) in a moment tensor analysis, we show that the algorithm automatically detects, locates and determines source parameters including the moment magnitude and mechanism of the great Tohoku-oki earthquake within 8 min of its origin time. The method does not saturate. We also show that quasi-finite-source GFs, which take into account the effects of a finite source, in a single-point source moment tensor algorithm better fit the data, especially in the near field. We show that this technique allows the correct characterization of the earthquake using a limited number of stations. This can yield information usable for tsunami early warning.
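
The continuous scanning idea in the abstract above can be illustrated with a much simpler stand-in: sliding a known template along a trace and scoring each position by normalized cross-correlation. This is only a generic matched-filter sketch, not the authors' moment tensor inversion with Green's functions; the function names, the toy template and the 0.9 threshold are all assumptions made for illustration.

```python
import math


def normalized_correlation(window, template):
    """Pearson-style correlation between two equal-length sequences."""
    n = len(template)
    mean_w = sum(window) / n
    mean_t = sum(template) / n
    num = sum((w - mean_w) * (t - mean_t) for w, t in zip(window, template))
    den = math.sqrt(
        sum((w - mean_w) ** 2 for w in window)
        * sum((t - mean_t) ** 2 for t in template)
    )
    # A flat window or template has zero variance; score it as no match.
    return num / den if den > 0 else 0.0


def scan(trace, template, threshold=0.9):
    """Slide the template along the trace and return (index, score) of the
    best-matching window, or (None, score) if it stays below the
    (hypothetical) detection threshold."""
    n = len(template)
    best_index, best_score = None, -1.0
    for i in range(len(trace) - n + 1):
        score = normalized_correlation(trace[i:i + n], template)
        if score > best_score:
            best_index, best_score = i, score
    if best_score < threshold:
        return None, best_score
    return best_index, best_score


# Toy data: the template appears, scaled by 3, starting at sample 10.
TEMPLATE = [0, 1, 0, -1, 0]
TRACE = [0] * 10 + [0, 3, 0, -3, 0] + [0] * 10
DETECTION = scan(TRACE, TEMPLATE)
```

Because the score is normalized by the energy of both the window and the template, the scaled copy of the template in the toy trace still correlates perfectly; in a real system the template would be replaced by synthetic waveforms for each candidate source.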

    Normalizing biomedical terms by minimizing ambiguity and variability

    Background: One of the difficulties in mapping biomedical named entities, e.g. genes, proteins, chemicals and diseases, to their concept identifiers stems from the potential variability of the terms. Soft string matching is a possible solution to the problem, but its inherent heavy computational cost discourages its use when the dictionaries are large or when real time processing is required. A less computationally demanding approach is to normalize the terms by using heuristic rules, which enables us to look up a dictionary in a constant time regardless of its size. The development of good heuristic rules, however, requires extensive knowledge of the terminology in question and thus is the bottleneck of the normalization approach. Results: We present a novel framework for discovering a list of normalization rules from a dictionary in a fully automated manner. The rules are discovered in such a way that they minimize the ambiguity and variability of the terms in the dictionary. We evaluated our algorithm using two large dictionaries: a human gene/protein name dictionary built from BioThesaurus and a disease name dictionary built from UMLS. Conclusions: The experimental results showed that automatically discovered rules can perform comparably to carefully crafted heuristic rules in term mapping tasks, and the computational overhead of rule application is small enough that a very fast implementation is possible. This work will help improve the performance of term-concept mapping tasks in biomedical information extraction especially when good normalization heuristics for the target terminology are not fully known.
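
The rule-based alternative to soft string matching described in the abstract above can be sketched as follows: every dictionary term is rewritten by a fixed list of rules once, and a mention is then resolved with a single hash lookup. The two rules, the toy dictionary and the concept identifiers below are invented for illustration; the paper discovers its rules automatically from the dictionary, which this sketch does not attempt.

```python
import re

# Hypothetical hand-written rules (illustrative only).
RULES = [
    (re.compile(r"[-_/]"), " "),  # treat common separators as spaces
    (re.compile(r"\s+"), " "),    # collapse runs of whitespace
]


def normalize(term):
    """Lower-case the term and apply each rewrite rule in order."""
    out = term.lower()
    for pattern, replacement in RULES:
        out = pattern.sub(replacement, out)
    return out.strip()


def build_index(dictionary):
    """Map each normalized term to the set of concept IDs it can denote.
    A set with more than one ID means the rules made the term ambiguous."""
    index = {}
    for term, concept_id in dictionary.items():
        index.setdefault(normalize(term), set()).add(concept_id)
    return index


def lookup(index, mention):
    """Average-case constant-time lookup of a normalized mention."""
    return index.get(normalize(mention), set())


# Toy dictionary with made-up concept identifiers.
TOY_DICTIONARY = {"NF-kappa B": "C001", "IL 2": "C002"}
INDEX = build_index(TOY_DICTIONARY)
```

Storing a set of identifiers per normalized key makes the trade-off visible: more aggressive rules reduce variability (more mentions hit a key) but can raise ambiguity (more identifiers share a key).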

    PPLook: an automated data mining tool for protein-protein interaction

    Background: Extracting and visualizing protein-protein interactions (PPIs) from the text literature is a meaningful topic in protein science, as it assists the identification of interactions among proteins. There is a lack of tools to extract PPIs and to visualize and classify the results. Results: We developed a PPI search system, termed PPLook, which automatically extracts and visualizes PPIs from text. Given a query protein name, PPLook can search a dataset for other proteins interacting with it by using a keyword-dictionary pattern-matching algorithm, and display topological parameters such as the number of nodes, edges, and connected components. The visualization component of PPLook enables us to view the interaction relationships among the proteins in a three-dimensional space based on the OpenGL graphics interface technology. PPLook can also select a protein semantic class, count the proteins of that semantic class which interact with the query protein, and count the articles in which an interaction involving the query protein appears. Moreover, PPLook provides heterogeneous search and a user-friendly graphical interface. Conclusions: PPLook is an effective tool for biologists and biosystem developers who need to access PPI information from the literature. PPLook is freely available for non-commercial users at http://meta.usc.edu/softs/PPLook.
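
The topological parameters mentioned in the abstract above (nodes, edges, connected components) are straightforward to compute once interactions have been extracted as protein pairs. The sketch below assumes the extraction step has already produced such pairs; the function name and the toy protein labels are hypothetical and nothing here reflects PPLook's actual implementation.

```python
from collections import defaultdict


def topology(pairs):
    """Count nodes, undirected edges and connected components of an
    interaction network given as an iterable of (protein, protein) pairs."""
    adjacency = defaultdict(set)
    edges = set()
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
        edges.add(frozenset((a, b)))  # (A, B) and (B, A) are the same edge

    seen = set()
    components = 0
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal marks everything reachable from `start`.
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)

    return {"nodes": len(adjacency), "edges": len(edges), "components": components}


# Toy interaction list with placeholder protein names.
TOY_PAIRS = [("A", "B"), ("B", "C"), ("D", "E"), ("B", "A")]
SUMMARY = topology(TOY_PAIRS)
```

Representing each edge as a `frozenset` deduplicates interactions reported in both directions, which matters when the same pair is extracted from several sentences.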

    Supporting the education evidence portal via text mining

    The UK Education Evidence Portal (eep) provides a single, searchable, point of access to the contents of the websites of 33 organizations relating to education, with the aim of revolutionizing work practices for the education community. Use of the portal alleviates the need to spend time searching multiple resources to find relevant information. However, the combined content of the websites of interest is still very large (over 500 000 documents and growing). This means that searches using the portal can produce very large numbers of hits. As users often have limited time, they would benefit from enhanced methods of performing searches and viewing results, allowing them to drill down to information of interest more efficiently, without having to sift through potentially long lists of irrelevant documents. The Joint Information Systems Committee (JISC)-funded ASSIST project has produced a prototype web interface to demonstrate the applicability of integrating a number of text-mining tools and methods into the eep, to facilitate an enhanced searching, browsing and document-viewing experience. New features include automatic classification of documents according to a taxonomy, automatic clustering of search results according to similar document content, and automatic identification and highlighting of key terms within documents.

    Predictability study on the aftershock sequence following the 2011 Tohoku-Oki, Japan, earthquake: first results

    Although no deterministic and reliable earthquake precursor is known to date, we are steadily gaining insight into probabilistic forecasting that draws on space–time characteristics of earthquake clustering. Clustering-based models aiming to forecast earthquakes within the next 24 hours are under test in the global project ‘Collaboratory for the Study of Earthquake Predictability’ (CSEP). The 2011 March 11 magnitude 9.0 Tohoku-Oki earthquake in Japan provides a unique opportunity to test the existing 1-day CSEP models against its unprecedentedly active aftershock sequence. The original CSEP experiment performs tests after the catalogue is finalized to avoid bias due to poor data quality. However, this study differs from this tradition and uses the preliminary catalogue revised and updated by the Japan Meteorological Agency (JMA), which is often incomplete but is immediately available. This study is intended as a first step towards operability-oriented earthquake forecasting in Japan. Encouragingly, at least one model passed the test in most combinations of the target day and the testing method, although the models could not take account of the megaquake in advance and the catalogue used for forecast generation was incomplete. However, it can also be seen that all models have only limited forecasting power for the period immediately after the quake. Our conclusion does not change when the preliminary JMA catalogue is replaced by the finalized one, implying that the models perform stably over the catalogue replacement and are applicable to operational earthquake forecasting. However, we emphasize the need for further research on model improvement to assure the reliability of forecasts for the days immediately after the main quake. Seismicity is expected to remain high in all parts of Japan over the coming years. Our results present a way to answer the urgent need to promote research on time-dependent earthquake predictability to prepare for subsequent large earthquakes in the near future in Japan. Published; pp. 653-658; 3.1. Fisica dei terremoti; JCR Journal; restricted.

    Text Mining the History of Medicine

    Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, owing to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid 19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while the processing pipeline and its modules may be used and configured within the Argo TM platform.

    An EST-SSR Linkage Map of Raphanus sativus and Comparative Genomics of the Brassicaceae

    Raphanus sativus (2n = 2x = 18) is a widely cultivated member of the family Brassicaceae, for which genomic resources are available only to a limited extent in comparison to many other members of the family. To promote more genetic and genomic studies and to enhance breeding programmes of R. sativus, we have prepared genetic resources such as complementary DNA libraries, expressed sequence tags (ESTs), simple sequence repeat (SSR) markers and a genetic linkage map. A total of 26 606 ESTs have been collected from seedlings, roots, leaves, and flowers, and clustered into 10 381 unigenes. Similarities were observed between the expression patterns of transcripts from R. sativus and those from representative members of the genera Arabidopsis and Brassica, indicating their functional relatedness. The EST sequence data were used to design 3800 SSR markers and consequently 630 polymorphic SSR loci and 213 reported marker loci have been mapped onto nine linkage groups, covering 1129.2 cM with an average distance of 1.3 cM between loci. Comparison of the mapped EST-SSR marker positions in R. sativus with the genome sequence of A. thaliana indicated that the Brassicaceae members have evolved from a common ancestor. It appears that genomic fragments corresponding to those of A. thaliana have been doubled and tripled in R. sativus. The genetic map developed here is expected to provide a standard map for the genetics, genomics, and molecular breeding of R. sativus as well as of related species. The resources are available at http://marker.kazusa.or.jp/Daikon