
    Event-based media monitoring methodology for Human Rights Watch

    Executive Summary: This report, prepared by a team of researchers from the University of Minnesota for Human Rights Watch (HRW), investigates the use of event-based media monitoring (EMM) to review its application, identify its strengths and weaknesses, and offer suggestions on how HRW can better utilize EMM in its own work. Media monitoring systems include both human-operated (manual) and automated systems, both of which we review throughout the report. The process begins with the selection of news sources, proceeds to the development of a coding manual (for manual searches) or “dictionary” (for automated searches), continues with gathering data, and concludes with the coding of news stories. EMM enables the near real-time tracking of events reported by the media, allowing researchers to get a sense of the scope of and trends in an event, but there are limits to what EMM can accomplish on its own. The media will only cover a portion of a given event, so information will always be missing from EMM data. EMM also introduces research biases of various kinds; mitigating these biases requires careful selection of media sources and clearly defined coding manuals or dictionaries. In manual EMM, coding the gathered data requires human researchers to apply codebook rules in order to collect consistent data from each story they read. In automated EMM, computers apply the dictionary directly to the news stories, automatically picking up the desired information. There are trade-offs in each system. Automated EMM can code stories far more quickly, but the software may incorrectly code stories, requiring manual corrections. Conversely, manual EMM allows for a more nuanced analysis, but the investment of time and effort may diminish the tool’s utility. We believe that both manual and automated EMM, when deployed correctly, can effectively support human rights research and advocacy.

    Automated schema matching techniques: an exploratory study

    Manual schema matching is a problem for many database applications that use multiple data sources, including data warehousing and e-commerce applications. Current research attempts to address this problem by developing algorithms to automate aspects of the schema-matching task. In this paper, an approach using an external dictionary facilitates automated discovery of the semantic meaning of database schema terms. An experimental study was conducted to evaluate the performance and accuracy of five schema-matching techniques with the proposed approach, called SemMA. The proposed approach and results are compared with two existing semi-automated schema-matching approaches, and suggestions for future research are made.

    Extracting (good) discourse examples from an oral specialised corpus of wine tasting interactions

    This article outlines the semi-automated extraction of dictionary examples used in the compilation of a professional online dictionary of wine tasting. Named OenoLex Bourgogne, this dictionary was started to respond to the demand for a lexicographic information tool from the French wine industry of Burgundy, the Bureau Interprofessionnel des Vins de Bourgogne.

    Using Freebase, An Automatically Generated Dictionary, And A Classifier To Identify A Person's Profession In Tweets

    Algorithms for classifying pre-tagged person entities in tweets into one of eight profession categories are presented. A classifier using a semi-supervised learning algorithm that takes into consideration the local context surrounding the entity in the tweet, hashtag information, and topic signature scores is described. In addition to the classifier, this research investigates two dictionaries containing the professions of persons. These two dictionaries are used in their own classification algorithms, which are independent of the classifier. The method for creating the first dictionary dynamically from the web and the algorithm that accesses this dictionary to classify a person into one of the eight profession categories are explained next. The second dictionary is Freebase, an openly available online database that is maintained by its online community. The algorithm that uses Freebase for classifying a person into one of the eight professions is described. The results show that classifications made using the automatically constructed dictionary, Freebase, or the classifier are all moderately successful. The results also show that classifications made with the automatically constructed person dictionary are slightly more accurate than classifications made using Freebase. Various hybrid methods combining the classifier and the two dictionaries are also explained. The results of those hybrid methods show significant improvement over any of the individual methods.

    An engineering approach to knowledge acquisition by the interactive analysis of dictionary definitions

    It has long been recognised that everyday dictionaries are a potential source of lexical and world knowledge of the type required by many Natural Language Processing (NLP) systems. This research presents a semi-automated approach to the extraction of rich semantic relationships from dictionary definitions. The definitions are taken from the recently published "Cambridge International Dictionary of English" (CIDE). The thesis illustrates how many of the innovative features of CIDE can be exploited during the knowledge acquisition process. The approach introduced in this thesis uses the LOLITA NLP system to extract and represent semantic relationships, along with a human operator to resolve the different forms of ambiguity which exist within dictionary definitions. Such a strategy combines the strengths of both participants in the acquisition process: automated procedures provide consistency in the construction of complex and inter-related semantic relationships, while the human participant can use his or her knowledge to determine the correct interpretation of a definition. This semi-automated strategy eliminates the weakness of many existing approaches because it guarantees feasibility and correctness: feasibility is ensured by exploiting LOLITA's existing NLP capabilities so that humans with minimal linguistic training can resolve the ambiguities within dictionary definitions; and correctness is ensured because incorrectly interpreted definitions can be manually eliminated. The feasibility and correctness of the solution are supported by the results of an evaluation which is presented in detail in the thesis.

    Feasibility of automated 3-dimensional magnetic resonance imaging pancreas segmentation.

    Purpose: With the advent of MR guided radiotherapy, internal organ motion can be imaged simultaneously during treatment. In this study, we evaluate the feasibility of pancreas MRI segmentation using state-of-the-art segmentation methods. Methods and material: T2 weighted HASTE and T1 weighted VIBE images were acquired on 3 patients and 2 healthy volunteers for a total of 12 imaging volumes. A novel dictionary learning (DL) method was used to segment the pancreas and compared to mean-shift merging (MSM), distance regularized level set (DRLS), and graph cuts (GC). The segmentation results were compared to manual contours using Dice's index (DI), Hausdorff distance and shift of the-center-of-the-organ (SHIFT). Results: All VIBE images were successfully segmented by at least one of the auto-segmentation methods, with DI >0.83 and SHIFT ≤2 mm using the best automated segmentation method. The automated segmentation error of HASTE images was significantly greater. DL is statistically superior to the other methods in Dice's overlapping index. For the Hausdorff distance and SHIFT measurement, DRLS and DL performed slightly superior to the GC method, and substantially superior to MSM. DL required the least human supervision and was faster to compute. Conclusion: Our study demonstrated potential feasibility of automated segmentation of the pancreas on MRI images with minimal human supervision at the beginning of imaging acquisition. The achieved accuracy is promising for organ localization.