428 research outputs found

    Using Data Analytics to Filter Insincere Posts from Online Social Networks A case study: Quora Insincere Questions

    The internet in general and Online Social Networks (OSNs) in particular continue to play a significant role in our lives, with information uploaded and exchanged on a massive scale. With such high importance and attention, abuses of these media for different purposes are common. Driven by goals such as marketing and financial gain, some users use OSNs to post misleading or insincere content. In this context, we utilized a real-world dataset posted by Quora on Kaggle.com to evaluate different mechanisms and algorithms for filtering insincere and spam content. We evaluated different preprocessing and analysis models. Moreover, we analyzed the cognitive effort users made in writing their posts and whether it can improve prediction accuracy. We report the best models in terms of insincerity prediction accuracy.

    The metabolic regimes of 356 rivers in the United States

    A national-scale quantification of metabolic energy flow in streams and rivers can improve understanding of the temporal dynamics of in-stream activity, links between energy cycling and ecosystem services, and the effects of human activities on aquatic metabolism. The two dominant terms in aquatic metabolism, gross primary production (GPP) and aerobic respiration (ER), have recently become practical to estimate for many sites due to improved modeling approaches and the availability of requisite model inputs in public datasets. We assembled inputs from the U.S. Geological Survey and National Aeronautics and Space Administration for October 2007 to January 2017. We then ran models to estimate daily GPP, ER, and the gas exchange rate coefficient for 356 streams and rivers across the continental United States. We also gathered potential explanatory variables and spatial information for cross-referencing this dataset with other datasets of watershed characteristics. This dataset offers a first national assessment of many-day time series of metabolic rates for up to 9 years per site, with a total of 490,907 site-days of estimates.
    We thank Jill Baron and the USGS Powell Center for financial support for this collaborative effort (Powell Center Working Group title: "Continental-scale overview of stream primary productivity, its links to water quality, and consequences for aquatic carbon biogeochemistry"). Additional financial support came from the USGS NAWQA program and Office of Water Information. NSF grants DEB-1146283 and EF1442501 partially supported ROH. A post-doctoral grant from the Basque Government partially supported MA. NAG was supported by the U.S. Department of Energy's Office of Science, Biological and Environmental Research. Oak Ridge National Laboratory is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725. Leah Colasuonno provided expert logistical support of our working group meetings. The developers of USGS ScienceBase were very helpful both in hosting this dataset and in responding to our requests. Randy Hunt and Mike Fienen of the USGS Wisconsin Modeling Center graciously provided access to their HTCondor cluster. Mike Vlah provided detailed and insightful reviews of the data and metadata.

    A Survey on Image Mining Techniques: Theory and Applications

    Image mining is a vital technique used to mine knowledge directly from images. Image segmentation is the primary phase in image mining. Image mining is simply an extension of data mining to the field of image processing. It deals with hidden knowledge extraction, image data association, and additional patterns that are not explicitly stored in the images. It is an interdisciplinary field that integrates techniques from computer vision, image processing, data mining, machine learning, databases, and artificial intelligence. The most important function of the mining is to generate all significant patterns without prior information about the patterns. Rule mining has been adapted to huge image databases. Mining has been done on integrated collections of images and their related data. Numerous studies have been carried out on image mining. This paper presents a survey of various image mining techniques that were proposed earlier in the literature. This paper also provides a brief overview of future research and improvements. Keywords: Data Mining, Image Mining, Knowledge Discovery, Segmentation, Machine Learning, Artificial Intelligence, Rule Mining, Datasets

    SEDIMENT BASED TURBIDITY ANALYSES FOR REPRESENTATIVE SOUTH CAROLINA SOILS

    Construction activities have been recognized to have significant impacts on the environment. Excess sediment from construction sites is frequently deposited into nearby surface waters, negatively altering the chemical, physical and biological properties of the water body. This environmental concern has led to strict laws concerning erosion and sediment control, such as imposing permit conditions that limit the concentration of suspended solids that can be present in effluent water from construction sites. However, sediment concentration measurements are not routinely used to detect and correct short-term problems or permit violations because laboratory analysis of sediment concentrations is time-consuming and costly. Nevertheless, timely, accurate field estimation of sediment loading could be facilitated through the development of empirical relationships between suspended solids and turbidity. Previous research indicates that turbidity measurements may be a more practical method of estimating sediment loads by indirectly relating sediment concentration to turbidity. In addition, recognition of turbidity as an indicator of pollution in surface runoff from disturbed areas has resulted in efforts by the U.S. Environmental Protection Agency (EPA) to implement turbidity effluent limitation guidelines to control the discharge of pollutants from construction sites. Therefore, given the importance of a proposed turbidity limit, the focus of this research is to determine relationships between representative soils and corresponding turbidity as a function of suspended sediment concentration and sediment settling. Turbidity is not only a function of suspended sediment concentration, but also of particle size, shape, and composition, so this research was needed to analyze turbidity responses based on sediment characteristics of representative South Carolina soils.
First, accuracy and precision of commercially available nephelometers needed to be quantified for use in subsequent sediment/surface water analysis and potential regulatory compliance. Analysis of accuracy and precision for instruments showed that even though meters may be very precise, they could also be inaccurate. However, three of the four meters that performed well provided statistically accurate and precise results. It was also found that formazin calibration standards may be a better standard than AMCO EPA standards for surface water analysis. Utilizing representative South Carolina soils, both relationships of turbidity to sediment concentration and turbidity to settling time were used to form mathematical correlations. Turbidity versus suspended sediment concentration and turbidity versus settling time correlated well when topsoil and subsoils were classified based on their predominant South Carolina region and their measured clay content. Derived trends for suspended sediment concentration to turbidity correlated well with either a linear or log relationship (R² values ranging from 0.7945 to 0.9846) as opposed to previous research utilizing a power function or the assumption of a one-to-one relationship. For the correlation of turbidity and sediment settling time, trends were well correlated with a power function (R² values ranging from 0.7674 to 0.9347). This relationship suggests Stokes' Law was followed, where smaller particles remain in suspension longer and contribute more to turbidity compared to soils with less clay content. Altogether, results of this research provide a step in determining potential site-specific equations relating sediment concentration to turbidity and sediment settling time to turbidity. With this knowledge, results could ultimately aid in the design of future sediment basins in South Carolina and provide information for potential regulatory compliance.
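The settling behaviour the abstract attributes to Stokes' Law can be illustrated with a minimal sketch. It is not taken from the thesis: the function names are hypothetical, and the default particle density (quartz-like, 2650 kg/m³), water density, and viscosity (about 20 °C) are assumed values chosen only for illustration. Stokes' Law itself is strictly valid only for small spheres settling slowly in still water.

```python
def stokes_settling_velocity(diameter_m,
                             particle_density=2650.0,  # kg/m^3, assumed quartz-like sediment
                             fluid_density=1000.0,     # kg/m^3, assumed fresh water
                             viscosity=1.0e-3,         # Pa*s, assumed water near 20 C
                             g=9.81):                  # m/s^2
    """Terminal settling velocity (m/s) of a small sphere under Stokes' Law."""
    return (particle_density - fluid_density) * g * diameter_m ** 2 / (18.0 * viscosity)


def settling_time_s(depth_m, diameter_m):
    """Time (s) for a particle to settle through a still water column of the given depth."""
    return depth_m / stokes_settling_velocity(diameter_m)


# Hypothetical particle sizes: a clay-sized grain (2 micrometres) and a
# silt-sized grain (20 micrometres) settling through 1 m of water.
clay_time = settling_time_s(1.0, 2e-6)
silt_time = settling_time_s(1.0, 2e-5)
print(f"clay: {clay_time / 3600:.1f} h, silt: {silt_time / 3600:.2f} h")
```

Because velocity scales with the square of the diameter, a grain ten times smaller takes about one hundred times longer to settle, which is consistent with the abstract's observation that clay-rich soils stay in suspension longer and contribute more to turbidity.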

    Demystifying Social Bots: On the Intelligence of Automated Social Media Actors

    Recently, social bots, (semi-) automatized accounts in social media, gained global attention in the context of public opinion manipulation. Dystopian scenarios like the malicious amplification of topics, the spreading of disinformation, and the manipulation of elections through “opinion machines” created headlines around the globe. As a consequence, much research effort has been put into the classification and detection of social bots. Yet, it is still unclear how easily an average online media user can purchase social bots, which platforms they target, where they originate from, and how sophisticated these bots are. This work provides a much needed new perspective on these questions. By providing insights into the markets of social bots in the clearnet and darknet as well as an exhaustive analysis of freely available software tools for automation during the last decade, we shed light on the availability and capabilities of automated profiles in social media platforms. Our results confirm the increasing importance of social bot technology but also uncover an as yet unknown discrepancy between theoretical and practically achieved artificial intelligence in social bots: while the literature reports a high degree of intelligence for chat bots and assumes the same for social bots, the observed degree of intelligence in social bot implementations is limited. In fact, the overwhelming majority of available services and software are of a supportive nature and merely provide modules of automation instead of fully fledged “intelligent” social bots.

    The Effects of Confirmation Bias and Susceptibility to Deception on an Individual’s Choice to Share Information

    As deception in cyberspace becomes more dynamic, research in this area should also take a dynamic approach to battling deception and false information. Research has previously shown that people are no better than chance at detecting deception. Deceptive information in cyberspace, specifically on social media, is not exempt from this pitfall. Current practices in social media rely on users to detect false information and use appropriate discretion when deciding to share information online. This is ineffective and will predictably end with users being unable to discern true from false information at all, as deceptive information becomes more difficult to distinguish from true information. To proactively combat inaccurate and deceptive information on social media, research must be conducted to understand not only the interaction effects of false content and user characteristics, but also the user behavior that stems from this interaction. This study investigated the effects of confirmation bias and susceptibility to deception on an individual’s choice to share information, specifically to understand how these factors relate to the sharing of false controversial information. Dissertation/Thesis: Masters Thesis, Human Systems Engineering 201