
    Collocation analysis for UMLS knowledge-based word sense disambiguation

    BACKGROUND: The effectiveness of knowledge-based word sense disambiguation (WSD) approaches depends in part on the information available in the reference knowledge resource. Off the shelf, these resources are not optimized for WSD and might lack terms to model the context properly. In addition, they might include noisy terms which contribute to false positives in the disambiguation results. METHODS: We analyzed some collocation types which could improve the performance of knowledge-based disambiguation methods. Collocations are obtained by extracting candidate collocations from MEDLINE and then assigning them to one of the senses of an ambiguous word. We performed this assignment either using semantic group profiles or a knowledge-based disambiguation method. In addition to collocations, we used second-order features from a previously implemented approach. Specifically, we measured the effect of these collocations in two knowledge-based WSD methods. The first method, AEC, uses the knowledge from the UMLS to collect examples from MEDLINE which are used to train a Naïve Bayes approach. The second method, MRD, builds a profile for each candidate sense based on the UMLS and compares the profile to the context of the ambiguous word. We have used two WSD test sets which contain disambiguation cases which are mapped to UMLS concepts. The first one, the NLM WSD set, was developed manually by several domain experts and contains words with high frequency occurrence in MEDLINE. The second one, the MSH WSD set, was developed automatically using the MeSH indexing in MEDLINE. It contains a larger set of words and covers a larger number of UMLS semantic types. RESULTS: The results indicate an improvement after the use of collocations, although the approaches have different performance depending on the data set. In the NLM WSD set, the improvement is larger for the MRD disambiguation method using second-order features. Assignment of collocations to a candidate sense based on UMLS semantic group profiles is more effective in the AEC method. In the MSH WSD set, the increment in performance is modest for all the methods. Collocations combined with the MRD disambiguation method have the best performance. The MRD disambiguation method and second-order features provide an insignificant change in performance. The AEC disambiguation method gives a modest improvement in performance. Assignment of collocations to a candidate sense based on knowledge-based methods has better performance. CONCLUSIONS: Collocations improve the performance of knowledge-based disambiguation methods, although results vary depending on the test set and method used. Generally, the AEC method is sensitive to query drift. Using AEC, just a few selected terms provide a large improvement in disambiguation performance. The MRD method handles noisy terms better but requires a larger set of terms to improve performance.
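
    The profile-comparison idea behind the MRD method can be illustrated with a minimal sketch. The abstract does not say how profiles are weighted or which similarity measure is used, so the bag-of-words counts and cosine similarity below are assumptions, and the sense labels and profile texts are hypothetical toy data rather than UMLS content.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context, sense_profiles):
    """Return the sense whose profile is most similar to the context.

    sense_profiles maps a sense label to a string of profile terms
    (an assumed stand-in for definitions, synonyms and related terms).
    """
    ctx = Counter(tokenize(context))
    scores = {
        sense: cosine(ctx, Counter(tokenize(profile)))
        for sense, profile in sense_profiles.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy profiles for the ambiguous word "cold".
profiles = {
    "cold_temperature": "low temperature freezing weather exposure",
    "common_cold": "viral infection nose throat cough rhinovirus",
}
context = ("The patient presented with a cough and sore throat "
           "caused by a rhinovirus infection.")
best_sense, sense_scores = disambiguate(context, profiles)
```

    In this sketch, adding a collocation to a sense amounts to appending its terms to that sense's profile string, which is one way to read the abstract's remark that MRD needs a larger set of terms before performance improves.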

    Employment Expectations and Gross Flows by Type of Work Contract

    There is growing interest in understanding firms’ temporary and permanent employment practices and how institutional changes shape them. Using data on Spanish establishments, we examine: (a) how employers adjust temporary and permanent job and worker flows to prior employment expectations, and (b) how the 1994 and 1997 labour reforms promoting permanent employment affected establishments’ employment practices. Generally, establishments’ prior employment expectations are realized through changes in all job and worker flows. However, establishments uniquely rely on temporary hires as a buffer to confront diminishing long-run employment expectations. None of the reforms significantly affected establishments’ net temporary or permanent employment flows. http://deepblue.lib.umich.edu/bitstream/2027.42/40032/3/wp646.pd

    Knowledge-based biomedical word sense disambiguation: comparison of approaches

    BACKGROUND: Word sense disambiguation (WSD) algorithms attempt to select the proper sense of ambiguous terms in text. Resources like the UMLS provide a reference thesaurus to be used to annotate the biomedical literature. Statistical learning approaches have produced good results, but the size of the UMLS makes the production of training data infeasible to cover all the domain. METHODS: We present research on existing WSD approaches based on knowledge bases, which complement the studies performed on statistical learning. We compare four approaches which rely on the UMLS Metathesaurus as the source of knowledge. The first approach compares the overlap of the context of the ambiguous word to the candidate senses based on a representation built out of the definitions, synonyms and related terms. The second approach collects training data for each of the candidate senses to perform WSD based on queries built using monosemous synonyms and related terms. These queries are used to retrieve MEDLINE citations. Then, a machine learning approach is trained on this corpus. The third approach is a graph-based method which exploits the structure of the Metathesaurus network of relations to perform unsupervised WSD. This approach ranks nodes in the graph according to their relative structural importance. The last approach uses the semantic types assigned to the concepts in the Metathesaurus to perform WSD. The context of the ambiguous word and semantic types of the candidate concepts are mapped to Journal Descriptors. These mappings are compared to decide among the candidate concepts. Results are provided estimating accuracy of the different methods on the WSD test collection available from the NLM. CONCLUSIONS: We have found that the last approach achieves better results compared to the other methods. The graph-based approach, using the structure of the Metathesaurus network to estimate the relevance of the Metathesaurus concepts, does not perform well compared to the first two methods. In addition, the combination of methods improves the performance over the individual approaches. On the other hand, the performance is still below statistical learning trained on manually produced data and below the maximum frequency sense baseline. Finally, we propose several directions to improve the existing methods and to improve the Metathesaurus to be more effective in WSD.

    Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation

    BACKGROUND: Evaluation of Word Sense Disambiguation (WSD) methods in the biomedical domain is difficult because the available resources are either too small or too focused on specific types of entities (e.g. diseases or genes). We present a method that can be used to automatically develop a WSD test collection using the Unified Medical Language System (UMLS) Metathesaurus and the manual MeSH indexing of MEDLINE. We demonstrate the use of this method by developing such a data set, called MSH WSD. METHODS: In our method, the Metathesaurus is first screened to identify ambiguous terms whose possible senses consist of two or more MeSH headings. We then use each ambiguous term and its corresponding MeSH heading to extract MEDLINE citations where the term and only one of the MeSH headings co-occur. The term found in the MEDLINE citation is automatically assigned the UMLS CUI linked to the MeSH heading. Each instance has been assigned a UMLS Concept Unique Identifier (CUI). We compare the characteristics of the MSH WSD data set to the previously existing NLM WSD data set. RESULTS: The resulting MSH WSD data set consists of 106 ambiguous abbreviations, 88 ambiguous terms and 9 which are a combination of both, for a total of 203 ambiguous entities. For each ambiguous term/abbreviation, the data set contains a maximum of 100 instances per sense obtained from MEDLINE. We evaluated the reliability of the MSH WSD data set using existing knowledge-based methods and compared their performance to that of the results previously obtained by these algorithms on the pre-existing data set, NLM WSD. We show that the knowledge-based methods achieve different results but keep their relative performance except for the Journal Descriptor Indexing (JDI) method, whose performance is below the other methods. CONCLUSIONS: The MSH WSD data set allows the evaluation of WSD algorithms in the biomedical domain. Compared to previously existing data sets, MSH WSD contains a larger number of biomedical terms/abbreviations and covers the largest set of UMLS Semantic Types. Furthermore, the MSH WSD data set has been generated automatically reusing already existing annotations and, therefore, can be regenerated from subsequent UMLS versions.
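
    The citation-selection step described in the Methods can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the record layout (`pmid`, `text`, `mesh`), the word-boundary term match, and the placeholder identifiers `CUI_A` and `CUI_B` are all hypothetical, and the toy citations are invented for the example. Only the rule itself (keep a citation when the term co-occurs with exactly one candidate MeSH heading, up to 100 instances per sense) comes from the abstract.

```python
import re


def select_instances(citations, term, heading_to_cui, max_per_sense=100):
    """Collect citation ids per sense for one ambiguous term.

    citations: iterable of dicts with keys 'pmid', 'text' and 'mesh'
               (a set of MeSH headings); this layout is an assumption.
    heading_to_cui: maps each candidate MeSH heading to a sense identifier.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    instances = {cui: [] for cui in heading_to_cui.values()}
    for citation in citations:
        # The ambiguous term must appear in the citation text.
        if not pattern.search(citation["text"]):
            continue
        # Exactly one candidate heading may be indexed; otherwise skip.
        matched = [h for h in heading_to_cui if h in citation["mesh"]]
        if len(matched) != 1:
            continue
        cui = heading_to_cui[matched[0]]
        if len(instances[cui]) < max_per_sense:
            instances[cui].append(citation["pmid"])
    return instances


# Invented toy records; the ids and the CUI placeholders are not real.
toy_citations = [
    {"pmid": "toy-1", "text": "Exposure to cold reduced enzyme activity.",
     "mesh": {"Cold Temperature"}},
    {"pmid": "toy-2", "text": "Zinc lozenges for the cold.",
     "mesh": {"Common Cold"}},
    {"pmid": "toy-3", "text": "Cold weather and cold symptoms.",
     "mesh": {"Cold Temperature", "Common Cold"}},
    {"pmid": "toy-4", "text": "An unrelated study.",
     "mesh": {"Common Cold"}},
]
senses = {"Cold Temperature": "CUI_A", "Common Cold": "CUI_B"}
result = select_instances(toy_citations, "cold", senses)
```

    Here `toy-3` is dropped because both candidate headings are indexed, and `toy-4` is dropped because the term does not occur in its text.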

    Identification of animal species housed and herding practices in ancient sediments from the Vallone Inferno rock-shelter (Scillato, Sicily, Italy) using faecal biomarkers, hormones, and their metabolites

    Interest in identifying the animal species housed in caves or rock-shelters used as livestock pens, and in herding management throughout prehistoric and historic ages, is increasing as a way to better understand the development of pastoral activities. In this manuscript, a method for the quantification of β-sterol/phytosterols, bile acids, hormones and hormone metabolites has been developed to determine the main pastoral activities carried out in the Vallone Inferno rock-shelter (Scillato, Sicily, Italy) from the Middle Neolithic to the Early Middle Age. According to the results obtained, the main animals housed in the rock-shelter gradually changed from ovicaprids in the Middle Neolithic to pigs in the Early Middle Age. Additionally, new proxies (progesterone/Ʃbile acids and metabolites of progesterone/Ʃbile acids) were used to detect high hormonal activity in Early Middle Age samples, related to female pig management.

    A phase II study of Yondelis® (trabectedin, ET-743) as a 24-h continuous intravenous infusion in pretreated advanced breast cancer

    Yondelis® (trabectedin, ET-743) is a novel marine-derived anticancer compound found in the ascidian Ecteinascidia turbinata. It is currently under phase II/III development in breast cancer, hormone refractory prostate cancer, sarcomas and ovarian cancer. Activity in breast cancer experimental models has been reported, and preliminary evidence of activity in this setting during the phase I programme has also been observed. The present study assessed the activity and feasibility of trabectedin in women with advanced breast cancer previously treated with conventional therapies. Patients with advanced disease previously treated with at least one but not more than two regimens that included taxanes or anthracyclines as palliative therapy were eligible. Trabectedin 1.5 mg m−2 was administered as a 24-h continuous infusion every 3 weeks. Patients were kept on therapy until disease progression, unacceptable toxicity or patient refusal. Twenty-seven patients were included between April 1999 and September 2000. Their median age was 54 years (range: 36–67) and 63% of them had two metastatic sites. Twenty-two patients were performance status 1. All patients had previously received anthracyclines, and 23 out of 27 patients had received taxanes. Of 21 patients with measurable disease, three confirmed partial responses, one unconfirmed partial response and two minor responses (49 and 32% tumour shrinkage) were observed; six patients had stable disease. Median survival was 10 months (95% confidence interval: 4.88–15.18). Transient and noncumulative transaminitis was observed in most of the patients. The pharmacokinetic profile of trabectedin in this patient population is in line with the overall data available with this schedule. The policy of dose adjustments based on the intercycle peaks of bilirubin and alkaline phosphatase appears to have a positive impact on the therapeutic index of trabectedin. Trabectedin can induce response and tumour control in previously treated advanced breast cancer, with manageable toxicity, thus warranting further development as a single agent or in combination regimens.

    Somatic and germline analysis of a familial Rothmund-Thomson syndrome in two siblings with osteosarcoma

    Rothmund–Thomson syndrome (RTS) is characterized by a rash that begins in the first few months of life and eventually develops into poikiloderma. Associated symptoms are alterations in the teeth, sparse hair, thin eyebrows, lack of eyelashes, low stature, bone abnormalities, hematological illnesses, gastrointestinal disease, malnutrition, cataracts, and predisposition to cancer, principally to bone tumors and skin cancer. Diagnostic certainty is provided by a genetic study that detects pathogenic variants of the RECQL4 gene. We present a familial case of RTS in two siblings from a Portuguese family, both diagnosed with osteosarcoma. Genomic analysis (203 genes) of both tumors and germline analysis of the RECQL4 gene were performed, confirming the syndrome in the family. The relevance of clinical recognition of the hallmarks of the disease, and thus of early diagnosis with early intervention, is highlighted.

    La acústica submarina y su desarrollo desde la creación del Instituto de Acústica [Underwater acoustics and its development since the founding of the Instituto de Acústica]

    PACS: 43.30.Xm; 43.30.Yj; 43.30.Vh; 43.30.Nb; 43.30.Ma. Published in Vol. XXXI, no. 3-4 (third and fourth quarters of 2000) of the Revista de Acústica, a special issue dedicated to the 25th anniversary of the Instituto de Acústica of the C.S.I.C. Underwater acoustics was one of the lines of acoustics research pursued at the Instituto de Acústica from its earliest days. This paper describes how the work began, how it developed, and the current state of the field, covering activities from 1969, when the Underwater Tank was installed, up to the present. Peer reviewed.