1,953 research outputs found

    Identification and characterization of diseases on social web

    Get PDF
    [no abstract]

    Private Semi-supervised Knowledge Transfer for Deep Learning from Noisy Labels

    Full text link
    Deep learning models trained on large-scale data have achieved encouraging performance in many real-world tasks. Meanwhile, publishing those models trained on sensitive datasets, such as medical records, could pose serious privacy concerns. To counter these issues, one of the current state-of-the-art approaches is the Private Aggregation of Teacher Ensembles, or PATE, which has achieved promising results in preserving model utility while providing a strong privacy guarantee. PATE combines an ensemble of "teacher" models trained on sensitive data and transfers the knowledge to a "student" model through the noisy aggregation of the teachers' votes, which label the unlabeled public data on which the student model is trained. However, the knowledge, i.e., the voted labels, learned by the student is noisy due to the private aggregation. Learning directly from noisy labels can significantly impact the accuracy of the student model. In this paper, we propose the PATE++ mechanism, which combines current advanced noisy-label training mechanisms with the original PATE framework to enhance its accuracy. A novel structure of Generative Adversarial Nets (GANs) is developed to integrate them effectively. In addition, we develop a novel noisy-label detection mechanism for semi-supervised model training to further improve student model performance when training with noisy labels. We evaluate our method on Fashion-MNIST and SVHN and show improvements over the original PATE on all measures.
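
    The mechanism the paper builds on is PATE's noisy vote aggregation: each teacher votes for a class, Laplace noise is added to the per-class vote counts, and the noisy argmax becomes the (possibly wrong) label the student trains on. The sketch below illustrates only that aggregation step, assuming NumPy is available; the noise scale gamma and other names are illustrative, not the paper's settings.

    import numpy as np

    def noisy_aggregate(teacher_votes, num_classes, gamma=0.05, rng=None):
        """Label one public example from an ensemble of teacher votes.

        teacher_votes: 1-D array of class indices, one vote per teacher.
        gamma: inverse scale of the Laplace noise; larger gamma means less
               noise and a weaker privacy guarantee. The returned label can
               be wrong ("noisy") by design.
        """
        rng = rng or np.random.default_rng()
        counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
        counts += rng.laplace(loc=0.0, scale=1.0 / gamma, size=num_classes)
        return int(np.argmax(counts))

    # Example: 250 teachers voting on a 10-class problem (e.g. Fashion-MNIST).
    rng = np.random.default_rng(0)
    votes = rng.integers(0, 10, size=250)
    print(noisy_aggregate(votes, num_classes=10, rng=rng))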

    User Interfaces to the Web of Data based on Natural Language Generation

    Get PDF
    We explore how Virtual Research Environments based on Semantic Web technologies support research interactions with RDF data in various stages of corpus-based analysis, analyze the Web of Data in terms of human readability, derive labels from variables in SPARQL queries, apply Natural Language Generation to improve user interfaces to the Web of Data by verbalizing SPARQL queries and RDF graphs, and present a method to automatically induce RDF graph verbalization templates via distant supervision.
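
    One concrete step the abstract mentions is deriving human-readable labels from the variables in a SPARQL query. A toy sketch of that idea, assuming variables follow camelCase or snake_case naming; this is an illustration, not the authors' actual method.

    import re

    def label_from_variable(var: str) -> str:
        """Derive a human-readable label from a SPARQL variable name.

        '?birthPlace' -> 'birth place', '?authored_by' -> 'authored by'.
        """
        name = var.lstrip("?$")
        name = name.replace("_", " ")
        # Split camelCase boundaries into separate words.
        name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
        return name.lower()

    def verbalize_triple(subject: str, predicate: str, obj: str) -> str:
        """Render one triple pattern as a simple English clause."""
        return (f"{label_from_variable(subject)} has "
                f"{label_from_variable(predicate)} {label_from_variable(obj)}")

    print(label_from_variable("?birthPlace"))               # birth place
    print(verbalize_triple("?person", "?birthPlace", "?city"))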

    LabelVizier: Interactive Validation and Relabeling for Technical Text Annotations

    Full text link
    With the rapid accumulation of text data produced by data-driven techniques, the task of extracting "data annotations"--concise, high-quality data summaries from unstructured raw text--has become increasingly important. Recent advances in weak supervision and crowd-sourcing techniques provide promising solutions for efficiently creating annotations (labels) for large-scale technical text data. However, such annotations may fail in practice because of changes in annotation requirements, application scenarios, and modeling goals, in which case label validation and relabeling by domain experts are required. To address this issue, we present LabelVizier, a human-in-the-loop workflow that incorporates domain knowledge and user-specific requirements to reveal actionable insights into annotation flaws and then produce better-quality labels for large-scale multi-label datasets. We implement our workflow as an interactive notebook to facilitate flexible error profiling, in-depth annotation validation for three error types, and efficient annotation relabeling on different data scales. We evaluated the efficiency and generalizability of our workflow with two use cases and four expert reviews. The results indicate that LabelVizier is applicable in various application scenarios and assists domain experts with different knowledge backgrounds in efficiently improving technical text annotation quality. (Comment: 10 pages, 5 figures)
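
    To make the idea of error profiling concrete, the sketch below flags multi-label records that are empty or that disagree with a simple keyword heuristic; the record structure and heuristic are hypothetical, and in a human-in-the-loop workflow such as the one described, flagged items would still go to a domain expert for relabeling rather than being auto-corrected.

    def profile_annotations(records, keyword_hints):
        """Flag suspicious multi-label annotations for expert review.

        records: list of {"text": str, "labels": set[str]} dicts.
        keyword_hints: {label: [keywords]} heuristic used only to surface
                       candidate errors, never to auto-correct them.
        """
        flagged = []
        for i, rec in enumerate(records):
            text = rec["text"].lower()
            labels = rec["labels"]
            if not labels:
                flagged.append((i, "missing labels"))
                continue
            for label, keywords in keyword_hints.items():
                hit = any(k in text for k in keywords)
                if hit and label not in labels:
                    flagged.append((i, f"possible missing label: {label}"))
                if not hit and label in labels:
                    flagged.append((i, f"possible spurious label: {label}"))
        return flagged

    records = [
        {"text": "Pump seal failure caused a hydraulic leak", "labels": {"electrical"}},
        {"text": "Sensor calibration drift on unit 3", "labels": set()},
    ]
    hints = {"hydraulic": ["hydraulic", "pump"], "electrical": ["wiring", "voltage"]}
    for idx, issue in profile_annotations(records, hints):
        print(idx, issue)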

    Web Data Extraction, Applications and Techniques: A Survey

    Full text link
    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for performing data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather the large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media, and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential of cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed to work in a given domain in other domains. (Comment: Knowledge-Based Systems)
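
    For readers unfamiliar with the basic technique the survey covers, here is a minimal wrapper-style extractor built only on Python's standard library; the target markup and field names are assumed for illustration and do not come from the survey.

    from html.parser import HTMLParser

    class ProductParser(HTMLParser):
        """Extract (name, price) pairs from markup shaped like
        <span class="name">...</span><span class="price">...</span>."""

        def __init__(self):
            super().__init__()
            self.records, self._field, self._current = [], None, {}

        def handle_starttag(self, tag, attrs):
            cls = dict(attrs).get("class", "")
            if tag == "span" and cls in ("name", "price"):
                self._field = cls

        def handle_data(self, data):
            if self._field:
                self._current[self._field] = data.strip()
                self._field = None
                if len(self._current) == 2:
                    self.records.append(self._current)
                    self._current = {}

    html = '<span class="name">Widget</span><span class="price">9.99</span>'
    parser = ProductParser()
    parser.feed(html)
    print(parser.records)   # [{'name': 'Widget', 'price': '9.99'}]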

    SPECIMEN LABELING IMPROVEMENT PROJECT: SLIP

    Get PDF
    Blood specimens are labeled at the time of acquisition in order to identify and match the specimen, label, and order to the patient. While the labeling process is not new, it is frequently laden with errors (Brown, Smith, & Sherfy, 2011). Wrong blood in tube (WBIT) poses significant risk. Multiple factors contribute to mislabeling errors, including lax policies, limited technological solutions, decentralized labeling processes, multi-tasking, clinician distraction, and insufficient education and training of staff. To reduce blood specimen labeling errors, a large academic medical center implemented an innovative technological solution for specimen labeling that integrates patient identification, physician order, and laboratory specimen identification through barcode technology that interfaces with the electronic medical record at the point of care. A failure mode, effects, and criticality analysis (FMECA) was completed to identify system failure points and to design the workflow prior to training staff. Four failure points were identified and eliminated through workflow adjustments with the new system. Staff training using simulation highlighted system safety points. This quality improvement process, applied across adult and pediatric acute and critical care units, produced dramatic pre/post-intervention reductions in blood specimen labeling errors.
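
    A toy sketch of the point-of-care match such a barcode system enforces: the scanned wristband, the electronic order, and the specimen label must all resolve to the same patient and order before collection proceeds. The identifiers and record structure here are hypothetical, not the medical center's actual system.

    def verify_specimen_label(wristband_id, order, label):
        """Return a list of mismatches; an empty list means the specimen
        label, physician order, and patient wristband all agree."""
        problems = []
        if order["patient_id"] != wristband_id:
            problems.append("order is not for the scanned patient")
        if label["patient_id"] != wristband_id:
            problems.append("label is not for the scanned patient")
        if label["order_id"] != order["order_id"]:
            problems.append("label does not match the physician order")
        return problems

    order = {"order_id": "ORD-1001", "patient_id": "P-42", "test": "CBC"}
    label = {"order_id": "ORD-1001", "patient_id": "P-42"}
    print(verify_specimen_label("P-42", order, label))  # [] -> safe to collect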

    Blockchain-based life cycle assessment: An implementation framework and system architecture

    Get PDF
    Life cycle assessment (LCA) is widely used for assessing the environmental impacts of a product or service. Collecting reliable data is a major challenge in LCA due to the complexities involved in tracking and quantifying inputs and outputs at multiple supply chain stages. Blockchain technology offers an ideal solution to overcome this challenge in sustainable supply chain management. Its use in combination with internet-of-things (IoT) and big data analytics and visualization can help organizations achieve operational excellence in conducting LCA for improving supply chain sustainability. This research develops a framework to guide the implementation of Blockchain-based LCA. It proposes a system architecture that integrates the use of Blockchain, IoT, and big data analytics and visualization. The proposed implementation framework and system architecture were validated by practitioners experienced with Blockchain applications. The research also analyzes system implementation costs and discusses potential issues and solutions, as well as managerial and policy implications.
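
    The tamper evidence such an architecture relies on can be illustrated with a minimal hash-linked ledger of LCA records (for example, IoT-reported emissions per supply-chain stage). The field names and structure below are assumptions for illustration, not the paper's design.

    import hashlib, json, time

    def add_block(chain, record):
        """Append an LCA data record (e.g. stage and kg CO2e) to a hash-linked chain."""
        prev_hash = chain[-1]["hash"] if chain else "0" * 64
        block = {"timestamp": time.time(), "record": record, "prev_hash": prev_hash}
        block["hash"] = hashlib.sha256(
            json.dumps(block, sort_keys=True).encode()
        ).hexdigest()
        chain.append(block)
        return chain

    def is_valid(chain):
        """Recompute each hash link to detect tampering with any earlier record."""
        for i, block in enumerate(chain):
            body = {k: v for k, v in block.items() if k != "hash"}
            digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if digest != block["hash"]:
                return False
            if i and block["prev_hash"] != chain[i - 1]["hash"]:
                return False
        return True

    chain = []
    add_block(chain, {"stage": "raw material", "kg_co2e": 12.4})
    add_block(chain, {"stage": "manufacturing", "kg_co2e": 30.1})
    print(is_valid(chain))  # True; editing any earlier record breaks validation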

    Beauty Shouldn’t Cause Pain: A Makeover Proposal for the FDA’s Cosmetics Regulation

    Get PDF
    The American cosmetics industry is not required by the Food and Drug Administration (FDA) to conduct pre-market safety assessments of cosmetics. The FDA only reviews personal care products when people voluntarily report problems. Further, companies continue to test cosmetics on animals, despite the FDA’s recommendation that manufacturers seek more humane and accurate testing methods. Although the FDA does not require animal testing for product safety or premarket approval, the United States is one of the largest users of laboratory animals for product testing. There are two pending pieces of legislation which, if passed, would be the first acts of cosmetic regulation in over eighty years: the Safe Cosmetics and Personal Care Products Act (Safe Cosmetics Act) and the Personal Care Products Safety Act (Personal Care Act). This note discusses the reasons the bills should pass and examines the FDA’s current personal care product regulatory scheme. Section II examines recent events in the media that brought awareness to the current regulatory system’s inadequacies and to chemicals of concern. Section III details the current federal legislation governing American cosmetics and the proposed legislation. Section IV discusses the European Union’s and California’s stronger approaches to cosmetic regulation. Section V proposes adding an animal testing ban and legal definitions for cosmetic terms to the pending legislation. Section VI discusses consumer education as a temporary alternative until stronger legislation is passed.

    Lessons from Nutritional Labeling on the 20th Anniversary of the NLEA: Applying the History of Food Labeling to the Future of Household Chemical Labeling

    Get PDF
    To remedy the defects of the current household chemical labeling system, Senator Al Franken of Minnesota and Representative Steve Israel of New York introduced legislation in the 111th Congress that would mandate labeling of “household cleaning products and similar products” with disclosure of all ingredients. This Note adopts that position as a starting point and proposes a new labeling scheme for all household chemicals modeled on the “Nutrition Facts” label mandated for food products. The Note reviews the history of food labeling regulation, examines present household chemical regulations, and proposes a new regulatory regime that learns from the successes and failures of food labeling past and present. Part II discusses the history of food and nutritional labeling since 1900. Part III features an overview of current household chemical labeling regulations. Part IV contains a brief introduction to some of the chemicals found in household products, including some of the known and suspected health and environmental concerns they may pose. Part V analyzes potential regulatory solutions to the problems presented by the current state of household chemical labeling and suggests some forms that a new labeling scheme should adopt.