2,979 research outputs found

    Functional genre in Illinois State Government digital documents

    Provisions for collecting or archiving digital documents can be informed by knowledge of the genres of documents likely to be encountered. Although different aspects of collecting and curation may classify documents into genres based on differing criteria (e.g., size, file format, subject), this document addresses classification based on the functional role the document plays in state government, akin to Toms (2001), but here specifically Illinois State Government (ISG). The classifications listed herein are based on an overview of ISG digital documents, encountered in over nine years of gathering and archiving work with and for the Illinois State Library (ISL), and on discussions with practitioners in cataloging and in government documents librarianship. This report states definitions and includes examples of each genre. State government documents are interesting in this regard in that they are presumably somewhat comparable to both federal government documents and business documents. Perhaps surprisingly, there are also portions of the State Web that are somewhat less than businesslike, either in tone or in technological proficiency of implementation. In this respect state government digital documents may also be useful approximations to documents produced either personally or by small activities. Having a list of government document genres can inform work in information promulgation (e.g., through website design, or the design of a series of printed materials), and the grouping of documents for digital library or archival purposes.
    Library of Congress / NDIIPP-2 A6075; unpublished; not peer reviewed

    Information Extraction from Biomedical Text Using Machine Learning

    Inadequate drug experimental data and the use of unlicensed drugs may cause adverse drug reactions, especially in pediatric populations. Every year the U.S. Food and Drug Administration approves human prescription drugs for marketing. The labels associated with these drugs include information about clinical trials and drug response in pediatric populations. In order for doctors to make an informed decision about the safety and effectiveness of these drugs for children, there is a need to analyze complex and often unstructured drug labels. In this work, first, an exploratory analysis of drug labels using a Natural Language Processing pipeline is performed. Second, Machine Learning algorithms have been employed to build baseline binary classification models to identify pediatric text in unstructured drug labels. Third, a series of experiments has been executed to evaluate the accuracy of the model. The prototype is able to classify pediatrics-related text with a recall of 0.93 and a precision of 0.86.
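The abstract reports recall and precision but no combined score. As a rough illustration only, the F1 score (the harmonic mean of the two) can be derived from the reported figures. The function name `f1_score` below is our own, and the resulting value is a derived figure, not one stated in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract: precision 0.86, recall 0.93.
f1 = f1_score(precision=0.86, recall=0.93)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.894"
```

Because the harmonic mean is pulled toward the smaller of the two values, the derived F1 sits closer to the precision than to the recall.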

    An Ontology based Text-to-Picture Multimedia m-Learning System

    Multimedia Text-to-Picture is the process of building a mental representation from words associated with images. From the research aspect, multimedia instructional message items are illustrations of material using words and pictures that are designed to promote user realization. Illustrations can be presented in a static form such as images, symbols, icons, figures, tables, charts, and maps, or in a dynamic form such as animation or video clips. Due to the intuitiveness and vividness of visual illustration, many text-to-picture systems have been proposed in the literature, such as Word2Image, Chat with Illustrations, and many others discussed in the literature review chapter of this thesis. However, we found that some common limitations exist in these systems, especially for the presented images. In fact, the retrieved materials are not fully suitable for educational purposes. Many of them are not context-based and do not take into consideration the needs of learners (i.e., they are general-purpose images). Manually finding the required pedagogic images to illustrate educational content for learners is inefficient and requires considerable effort. In addition, the available learning systems that mine text based on keyword or sentence selection provide incomplete pedagogic illustrations. This is because words and their semantically related terms are not considered during the process of finding illustrations. In this dissertation, we propose new approaches based on the semantic conceptual graph and semantically distributed weights to mine optimal illustrations that match Arabic text in the children’s story domain. We combine these approaches with the best keyword and sentence selection algorithms in order to improve the retrieval of images matching the Arabic text. Our findings show significant improvements in modelling Arabic vocabulary with the most meaningful images and the best coverage of the domain in discourse.
We also develop a mobile Text-to-Picture System that has two novel features: (1) a conceptual graph visualization (CGV) and (2) a visual illustrative assessment. The CGV shows the relationship between terms associated with a picture. It enables the learners to discover the semantic links between Arabic terms and improve their understanding of Arabic vocabulary. The assessment component allows the instructor to automatically follow up on the performance of learners. Our experiments demonstrate the efficiency of our multimedia text-to-picture system in enhancing the learners’ knowledge and boosting their comprehension of Arabic vocabulary.

    Multi-Source Spatial Entity Extraction and Linkage

    User Provisioning Processes in Identity Management addressing SAP Campus Management

    This document is the report of an ISWA working team on a WUSKAR case study. The study addresses the need for meta-directory synchronisation with a proprietary SAP R/3 system in the context of an identity management system. Early tasks concern identifying the exact requirements and scenarios, modelling the synchronisation process, identifying which relevant data is to be processed, and proposing templates for the matching and transformation process. Intermediate tasks relate to the technical aspects of the case study, as well as to task division, progress management, and the regular review of strategic and technical choices.

    Converting Medical Service Provider Data into a Unified Format for Processing

    Most organizations process flat files regularly. There are different options for processing files, including SQL Server Integration Services (SSIS), BizTalk, SQL import jobs, and other Extract, Transform, and Load (ETL) processes. All of these options have very strict requirements for file formats. If the format of the file changes, all of these options throw a catastrophic error, and implementing a fix to handle the new format is difficult. With each of the methods, the new format needs to be configured in the development environment, and the data flow must be modified to process all of the changes. Due to the inflexibility of options in processing flat files, there was a request by Dr. Corliss to build an alternative solution. The team of Ivan Paez, Niharika Jain, and Brandon Krugman created an alternative solution called FileParser. While the solution originally was built to meet the needs of Dr. Corliss and the GasDay team at Marquette University, the end result was a file parser that allows additional flexibility in processing a variety of flat file formats. This thesis provides an alternative way to parse data, transform a flat file, and consume the data into a generic format; this process is called Provider Processing. Provider File Processing consists of the FileParser command line executable handling the file parsing and data transformation. After FileParser generates a provider output file, a health insurance domain-specific command line executable called DelegatedProviderProcessing performs data cleansing and address normalization, and imports the provider output file into an internal database. The difference between the strict format examples and Provider Processing is that if the format of the input files changes, Provider Processing can adapt to the change with minimal work.
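The abstract does not describe how FileParser is implemented. Purely as an illustration of the general idea of tolerating format changes, the sketch below maps whatever header names and delimiter a flat file uses onto a fixed generic record. Everything in it (`FIELD_ALIASES`, `detect_delimiter`, `parse_flat_file`, and the field names) is hypothetical and not taken from the thesis.

```python
import csv
import io

# Hypothetical alias table: header spellings mapped to generic field names.
FIELD_ALIASES = {
    "provider_name": {"provider name", "name", "provider"},
    "address": {"address", "street address", "addr"},
    "zip": {"zip", "zip code", "postal code"},
}


def detect_delimiter(header_line: str, candidates: str = ",|\t;") -> str:
    """Pick the candidate character that occurs most often in the header line."""
    return max(candidates, key=header_line.count)


def parse_flat_file(text: str) -> list[dict[str, str]]:
    """Parse delimited text into records keyed by generic field names."""
    lines = text.splitlines()
    if not lines:
        return []
    reader = csv.DictReader(
        io.StringIO(text, newline=""), delimiter=detect_delimiter(lines[0])
    )
    # Map each recognised source column to its generic name; ignore the rest.
    column_map = {}
    for column in reader.fieldnames or []:
        key = column.strip().lower()
        for generic, aliases in FIELD_ALIASES.items():
            if key in aliases:
                column_map[column] = generic
    return [
        {generic: (row[column] or "").strip() for column, generic in column_map.items()}
        for row in reader
    ]


# Two layouts of the same data: different delimiter, column order, and headers.
layout_a = "Provider Name,Address,Zip\nAcme Clinic,1 Main St,53201\n"
layout_b = "zip code|name|addr\n53201|Acme Clinic|1 Main St\n"
records_a = parse_flat_file(layout_a)
records_b = parse_flat_file(layout_b)
```

Under these assumptions, a change in column order, header wording, or delimiter needs at most a new entry in the alias table rather than a reworked data flow.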