132 research outputs found

    Multiscale estimation of the field-aligned current density

    Field-aligned currents (FACs) in the magnetosphere–ionosphere (M–I) system exhibit a range of spatial and temporal scales that are linked to key dynamic coupling processes. To disentangle the scale dependence in magnetic field signatures of auroral FACs and to characterize their geometry and orientation, Bunescu et al. (2015) introduced the multiscale FAC analyzer framework based on minimum variance analysis (MVA) of magnetic time series segments. In the present report this approach is carried further to include in the analysis framework a FAC density scalogram, i.e., a multiscale representation of the FAC density time series. The new technique is validated and illustrated using synthetic data consisting of overlapping sheets of FACs at different scales. The method is applied to Swarm data showing both large-scale and quiet aurora as well as mesoscale FAC structures observed during more disturbed conditions. We show both planar and non-planar FAC structures as well as uniform and non-uniform FAC density structures. For both synthetic and Swarm data, the multiscale analysis is applied by two scale sampling schemes, namely the linear and logarithmic scanning of the FAC scale domain. The local FAC density is compared with the input FAC density for the synthetic data, whereas for the Swarm data we cross-check the results with well-established single- and dual-spacecraft techniques. All the multiscale information provides a new visualization tool for the complex FAC signatures that complements other FAC analysis tools.

    Probabilistic Bag-Of-Hyperlinks Model for Entity Linking

    Many fundamental problems in natural language processing rely on determining what entities appear in a given text. Commonly referred to as entity linking, this step is a fundamental component of many NLP tasks such as text understanding, automatic summarization, semantic search or machine translation. Name ambiguity, word polysemy, context dependencies and a heavy-tailed distribution of entities contribute to the complexity of this problem. We here propose a probabilistic approach that makes use of an effective graphical model to perform collective entity disambiguation. Input mentions (i.e., linkable token spans) are disambiguated jointly across an entire document by combining a document-level prior of entity co-occurrences with local information captured from mentions and their surrounding context. The model is based on simple sufficient statistics extracted from data, thus relying on few parameters to be learned. Our method does not require extensive feature engineering, nor an expensive training procedure. We use loopy belief propagation to perform approximate inference. The low complexity of our model makes this step sufficiently fast for real-time usage. We demonstrate the accuracy of our approach on a wide range of benchmark datasets, showing that it matches, and in many cases outperforms, existing state-of-the-art methods.

    Consolidating the set of known human protein-protein interactions in preparation for large-scale mapping of the human interactome

    BACKGROUND: Extensive protein interaction maps are being constructed for yeast, worm, and fly to ask how the proteins organize into pathways and systems, but no such genome-wide interaction map yet exists for the set of human proteins. To prepare for studies in humans, we wished to establish tests for the accuracy of future interaction assays and to consolidate the known interactions among human proteins. RESULTS: We established two tests of the accuracy of human protein interaction datasets and measured the relative accuracy of the available data. We then developed and applied natural language processing and literature-mining algorithms to recover from Medline abstracts 6,580 interactions among 3,737 human proteins. A three-part algorithm was used: first, human protein names were identified in Medline abstracts using a discriminator based on conditional random fields, then interactions were identified by the co-occurrence of protein names across the set of Medline abstracts, filtering the interactions with a Bayesian classifier to enrich for legitimate physical interactions. These mined interactions were combined with existing interaction data to obtain a network of 31,609 interactions among 7,748 human proteins, accurate to the same degree as the existing datasets. CONCLUSION: These interactions and the accuracy benchmarks will aid interpretation of current functional genomics data and provide a basis for determining the quality of future large-scale human protein interaction assays. Projecting from the approximately 15 interactions per protein in the best-sampled interaction set to the estimated 25,000 human genes implies more than 375,000 interactions in the complete human protein interaction network. This set therefore represents no more than 10% of the complete network
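
    The projection at the end of this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the abstract; the variable names are illustrative, and the direct multiplication (one protein per gene, no correction for counting each interaction from both partners) is an assumption inferred from the quoted total of 375,000, not a method stated by the authors.

```python
# Figures quoted in the abstract.
interactions_per_protein = 15        # "approximately 15 interactions per protein"
estimated_human_genes = 25_000       # "estimated 25,000 human genes"
consolidated_interactions = 31_609   # size of the combined network

# Assumption: one protein per gene and a direct product, which is what
# the abstract's figure of 375,000 implies.
projected_total = interactions_per_protein * estimated_human_genes

# Fraction of the projected network covered by the consolidated set.
coverage = consolidated_interactions / projected_total

print(f"projected interactions: {projected_total:,}")
print(f"coverage of consolidated set: {coverage:.1%}")
```

    Under these assumptions the consolidated set covers roughly 8% of the projected network, consistent with the abstract's statement that it represents no more than 10%.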

    Ranking deep web text collections for scalable information extraction

    Information extraction (IE) systems discover structured information from natural language text, to enable much richer querying and data mining than possible directly over the unstructured text. Unfortunately, IE is generally a computationally expensive process, and hence improving its efficiency, so that it scales over large volumes of text, is of critical importance. State-of-the-art approaches for scaling the IE process focus on one text collection at a time. These approaches prioritize the extraction effort by learning keyword queries to identify the “useful” documents for the IE task at hand, namely, those that lead to the extraction of structured “tuples.” These approaches, however, do not attempt to predict which text collections are useful for the IE task (and hence merit further processing) and which ones will not contribute any useful output (and hence should be ignored altogether, for efficiency). In this paper, we focus on an especially valuable family of text sources, the so-called deep web collections, whose (remote) contents are only accessible via querying. Specifically, we introduce and study techniques for ranking deep web collections for an IE task, to prioritize the extraction effort by focusing on collections with substantial numbers of useful documents for the task. We study both (adaptations of) state-of-the-art resource selection strategies for distributed information retrieval, and IE-specific approaches. Our extensive experimental evaluation over realistic deep web collections, and for several different IE tasks, shows the merits and limitations of the alternative families of approaches, and provides a roadmap for addressing this critically important building block for efficient, scalable information extraction.

    Correlations between improvement in functional status and increased quality of life in patients with lumbar discopathy of occupational etiology

    Universitatea de Medicină şi Farmacie, Craiova, Departamentul IV – Specialităţi medicale II; Spitalul Clinic Judeţean de Urgenţă, Craiova, Clinica de Medicina Muncii. National scientific-practical conference with international participation "Sănătatea ocupaţională: probleme și realizări" (Occupational Health: Problems and Achievements), first edition, 5-7 June 2014. Studies on quality of life in patients with low back pain have shown the impact of pain on their daily activities and social life. Patients were evaluated clinically, by laboratory tests and functionally at baseline and then at 3 months, 6 months and one year. We showed a statistically significant correlation between improvement in functional status (RMDQ score) and quality of life (HAQ) for patients with chronic occupational low back pain. Improvements in the evaluated clinical and functional parameters have a significant impact on the quality of life of patients with chronic occupational low back pain.

    Linguistic feature analysis for protein interaction extraction

    Background: The rapid growth of the amount of publicly available reports on biomedical experimental results has recently caused a boost of text mining approaches for protein interaction extraction. Most approaches rely implicitly or explicitly on linguistic, i.e., lexical and syntactic, data extracted from text. However, only a few attempts have been made to evaluate the contribution of the different feature types. In this work, we contribute to this evaluation by studying the relative importance of deep syntactic features, i.e., grammatical relations, shallow syntactic features (part-of-speech information) and lexical features. For this purpose, we use a recently proposed approach that uses support vector machines with structured kernels. Results: Our results reveal that the contribution of the different feature types varies for the different data sets on which the experiments were conducted. The smaller the training corpus compared to the test data, the more important the role of grammatical relations becomes. Moreover, deep syntactic information based classifiers prove to be more robust on heterogeneous texts where no or only limited common vocabulary is shared. Conclusion: Our findings suggest that grammatical relations play an important role in the interaction extraction task. Moreover, the net advantage of adding lexical and shallow syntactic features is small relative to the number of added features. This implies that efficient classifiers can be built by using only a small fraction of the features that are typically being used in recent approaches.