2,313 research outputs found

    Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias

    Large language models (LLMs) have recently been leveraged as training data generators for various natural language processing (NLP) tasks. While previous research has explored different approaches to training models using generated data, it generally relies on simple class-conditional prompts, which may limit the diversity of the generated data and inherit the systematic biases of the LLM. Thus, we investigate training data generation with diversely attributed prompts (e.g., specifying attributes like length and style), which have the potential to yield diverse and attributed generated data. Our investigation focuses on datasets with high cardinality and diverse domains, wherein we demonstrate that attributed prompts outperform simple class-conditional prompts in terms of the resulting model's performance. Additionally, we present a comprehensive empirical study on data generation encompassing vital aspects like bias, diversity, and efficiency, and highlight three key observations: firstly, synthetic datasets generated by simple prompts exhibit significant biases, such as regional bias; secondly, attribute diversity plays a pivotal role in enhancing model performance; lastly, attributed prompts achieve the performance of simple class-conditional prompts while utilizing only 5% of the querying cost of ChatGPT associated with the latter. We release the generated dataset and the prompts used to facilitate future research. The data and code will be available at https://github.com/yueyu1030/AttrPrompt. Comment: Work in progress. A shorter version is accepted to the ICML DMLR workshop

    Probabilistic Graphical Models for Credibility Analysis in Evolving Online Communities

    One of the major hurdles preventing the full exploitation of information from online communities is the widespread concern regarding the quality and credibility of user-contributed content. Prior works in this domain operate on a static snapshot of the community, make strong assumptions about the structure of the data (e.g., relational tables), or consider only shallow features for text classification. To address these limitations, we propose probabilistic graphical models that can leverage the joint interplay between multiple factors in online communities (such as user interactions, community dynamics, and textual content) to automatically assess the credibility of user-contributed online content, as well as the expertise of users and its evolution, with user-interpretable explanations. To this end, we devise new models based on Conditional Random Fields for different settings, such as incorporating partial expert knowledge for semi-supervised learning and handling discrete labels as well as numeric ratings for fine-grained analysis. This enables applications such as extracting reliable side-effects of drugs from user-contributed posts in health forums and identifying credible content in news communities. Online communities are dynamic, as users join and leave, adapt to evolving trends, and mature over time. To capture these dynamics, we propose generative models based on the Hidden Markov Model, Latent Dirichlet Allocation, and Brownian Motion to trace the continuous evolution of user expertise and their language model over time. This allows us to identify expert users and credible content jointly over time, improving state-of-the-art recommender systems by explicitly considering the maturity of users. This also enables applications such as identifying helpful product reviews and detecting fake and anomalous reviews with limited information. Comment: PhD thesis, Mar 201

    Can Carbon Sinks be Operational? An RFF Workshop Summary

    An RFF Workshop brought together experts from around the world to assess the feasibility of using biological sinks to sequester carbon as part of a global atmospheric mitigation effort. The chapters of these proceedings are a result of that effort. Although the intent of the workshop was not to generate a consensus, a number of studies suggest that sinks could be a relatively inexpensive and effective carbon management tool. The chapters cover a variety of aspects and topics related to the monitoring and measurement of carbon in biological systems. They tend to support the view that carbon sequestration using biological systems is technically feasible with relatively good precision and at relatively low cost. Thus carbon sinks can be operational. Keywords: carbon, sinks, global warming, sequestration, forests

    Networks and trust: systems for understanding and supporting internet security

    Includes bibliographical references. 2022 Fall. This dissertation takes a systems-level view of the multitude of existing trust management systems to make sense of when, where and how (or, in some cases, if) each is best utilized. Trust is a belief by one person that by transacting with another person (or organization) within a specific context, a positive outcome will result. Trust serves as a heuristic that enables us to simplify the dozens of decisions we make each day about whom we will transact with. In today's hyperconnected world, in which for many people the bulk of their daily transactions related to business, entertainment, news, and even critical services like healthcare take place online, we tend to rely even more on heuristics like trust to help us simplify complex decisions. Thus, trust plays a critical role in online transactions. For this reason, over the past several decades researchers have developed a plethora of trust metrics and trust management systems for use in online systems. These systems have been most frequently applied to improve recommender systems and reputation systems. They have been designed for and applied to varied online systems including peer-to-peer (P2P) filesharing networks, e-commerce platforms, online social networks, messaging and communication networks, sensor networks, distributed computing networks, and others. However, comparatively little research has examined the effects on individuals, organizations or society of the presence or absence of trust in online sociotechnical systems. Using these existing trust metrics and trust management systems, we design a set of experiments to benchmark the performance of these existing systems, which rely heavily on network analysis methods. Drawing on the experiments' results, we propose a heuristic decision-making framework for selecting a trust management system for use in online systems.
In this dissertation we also investigate several related but distinct aspects of trust in online sociotechnical systems. Using network/graph analysis methods, we examine how trust (or lack of trust) affects the performance of online networks in terms of security and quality of service. We explore the structure and behavior of online networks including Twitter, GitHub, and Reddit through the lens of trust. We find that higher levels of trust within a network are associated with more spread of misinformation (a form of cybersecurity threat, according to the US CISA) on Twitter. We also find that higher levels of trust in open source developer networks on GitHub are associated with a higher incidence of cybersecurity vulnerabilities. Using the experimental and empirical findings described above, we apply the Systems Engineering Process to design and prototype a trust management tool for use on Reddit, which we dub Coni the Trust Moderating Bot. Coni is, to the best of our knowledge, the first trust management tool designed specifically for use on the Reddit platform. Through our work with Coni, we develop and present a blueprint for constructing a Reddit trust tool which not only measures trust levels, but can use these trust levels to take actions on Reddit to improve the quality of submissions within the community (a subreddit).

    The Algorithm Game

    Most of the discourse on algorithmic decisionmaking, whether it comes in the form of praise or warning, assumes that algorithms apply to a static world. But automated decisionmaking is a dynamic process. Algorithms attempt to estimate some difficult-to-measure quality about a subject using proxies, and the subjects in turn change their behavior in order to game the system and get better treatment for themselves (or, in some cases, to protest the system). These behavioral changes can then prompt the algorithm to make corrections. The moves and countermoves create a dance that has great import to the fairness and efficiency of a decisionmaking process. And this dance can be structured through law. Yet existing law lacks a clear policy vision or even a coherent language to foster productive debate. This Article provides the foundation. We describe gaming and countergaming strategies using credit scoring, employment markets, criminal investigation, and corporate reputation management as key examples. We then show how the law implicitly promotes or discourages these behaviors, with mixed effects on accuracy, distributional fairness, efficiency, and autonomy.

    Towards Reliable Online Feedback : The Impact of User Preference and Visual Cues in Rating Scales and User Ratings

    With the growing dependence on online shopping and service providers, consumer ratings and reviews help users decide between good and bad options. Reliable and useful ratings can ensure better consumer service, product sales, and brand management. Any underlying bias or external factors affecting users' emotional stability can corrupt the credibility of user feedback. Prior studies suggest that the visual representation and design elements provided with a rating scale can affect the user's responses, especially if the rating scales have visual labels that generate an emotional response in users. Since a number of rating scale designs are used in e-commerce sites and recommender systems, it is also important that users get a say in which rating scale they are comfortable using. Online marketplaces still do not provide a way to take the user's own choice into account in this matter. This preferential choice of scales can make users more involved in the rating process and help get the best response from them. Earlier research has already shown that users have specific personalized preferences when it comes to using rating scales to give feedback online. Further emphasis is required on how this preference and visual cues together can produce a more reliable online feedback mechanism. This thesis aims to investigate whether users' preferences in rating scales influence the reliability and authenticity of their ratings. It also explores users' reactions to certain visual cues in rating scales, and how users' preferences of rating scale are influenced by such visual elements. A within-subject study (n = 187) was conducted to collect user ratings of popular products with six different rating scale designs, using two types of visual icons (stars and emojis) and two colour metaphors (a warm-cool metaphor and a traffic-light metaphor).
Statistical analysis of the survey shows that users prefer the scale with the most visually informative design (traffic-light metaphor colours with emoji icons). It also shows that users tend to give their true ratings on the scales they prefer most, rather than on the scale design they are most familiar with. The rating score analysis also demonstrates a positive shift and better consistency in the ratings given on more visually rich scales. Based on these results, it can be concluded that user involvement is desirable in selecting rating scale designs, and that meaningful visual cues can contribute to getting more accurate (truthful) rating scores from users. The proposed approach of a user-preference-based rating system is novel because I elicited the user's own opinion on what their accurate or "true" rating is, rather than relying only on analysing the data received from the rating scores. This work can offer insights for online rating scale designs to improve the rating decision quality of users and help online business platforms obtain more credible feedback from customers, which can significantly improve their services and user satisfaction.

    Dietary Supplement Company Evaluation

    Due to the lack of clear regulations and the excessive amount of subjective marketing, it is difficult to determine which dietary supplement companies are producing quality products, which makes recommending companies challenging for healthcare professionals. Dr. Brian Cotter and his team from Newton Square Chiropractic in Worcester, MA help patients improve their quality of life through a variety of services, including nutritional counseling. This project will help local healthcare professionals and everyday consumers make confident decisions when recommending or purchasing dietary supplements. The results emphasize the importance of accessible scientific factors for selecting good-quality products.