
    Popular Ensemble Methods: An Empirical Study

    An ensemble consists of a set of individually trained classifiers (such as neural networks or decision trees) whose predictions are combined when classifying novel instances. Previous research has shown that an ensemble is often more accurate than any of the single classifiers in the ensemble. Bagging (Breiman, 1996c) and Boosting (Freund and Schapire, 1996; Schapire, 1990) are two relatively new but popular methods for producing ensembles. In this paper we evaluate these methods on 23 data sets using both neural networks and decision trees as our classification algorithms. Our results clearly indicate a number of conclusions. First, while Bagging is almost always more accurate than a single classifier, it is sometimes much less accurate than Boosting. On the other hand, Boosting can create ensembles that are less accurate than a single classifier, especially when using neural networks. Analysis indicates that the performance of the Boosting methods is dependent on the characteristics of the data set being examined. In fact, further results show that Boosting ensembles may overfit noisy data sets, thus decreasing their performance. Finally, consistent with previous studies, our work suggests that most of the gain in an ensemble's performance comes in the first few classifiers combined; however, relatively large gains can be seen up to 25 classifiers when Boosting decision trees.
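As a rough illustration of the Bagging idea summarized in this abstract (train each classifier on a bootstrap resample, then combine predictions by majority vote), here is a minimal sketch. It is not the paper's experimental setup: one-dimensional threshold "stumps" stand in for the neural networks and decision trees, the toy data set is invented, and every function name (`train_stump`, `bagging`, `predict_ensemble`) is hypothetical.

```python
import random
from collections import Counter


def predict_stump(stump, x):
    """Classify x with a threshold rule; polarity -1 flips the direction."""
    threshold, polarity = stump
    return 1 if polarity * (x - threshold) >= 0 else 0


def train_stump(points):
    """Pick the (threshold, polarity) pair with the fewest training errors."""
    best = None
    for threshold, _ in points:
        for polarity in (1, -1):
            stump = (threshold, polarity)
            errors = sum(1 for x, y in points if predict_stump(stump, x) != y)
            if best is None or errors < best[0]:
                best = (errors, stump)
    return best[1]


def bagging(points, n_models, rng):
    """Train n_models stumps, each on a bootstrap sample drawn with replacement."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]
        models.append(train_stump(sample))
    return models


def predict_ensemble(models, x):
    """Combine the individual predictions by unweighted majority vote."""
    votes = Counter(predict_stump(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data (assumed for illustration): label is 1 when x >= 5, else 0.
data = [(x, 1 if x >= 5 else 0) for x in range(10)]
models = bagging(data, n_models=25, rng=random.Random(0))
print(predict_ensemble(models, 0), predict_ensemble(models, 9))
```

The ensemble size of 25 echoes the figure mentioned in the abstract. Boosting would differ by reweighting the training examples after each round instead of resampling them uniformly.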

    An Empirical Study of AI Generated Text Detection Tools

    Since ChatGPT emerged as a major AI-generated content (AIGC) model, providing high-quality responses across a wide range of applications (including software development and maintenance), it has attracted considerable interest. ChatGPT has great promise, but its misuse could cause serious problems, especially in education and public safety. Several AIGC detectors are available, and they have all been tested on genuine text. However, more study is needed to see how effective they are on multi-domain ChatGPT material. This study aims to fill this gap by creating a multi-domain dataset for testing the state-of-the-art APIs and tools, used by universities and other research institutions, for detecting artificially generated content. A large dataset consisting of articles, abstracts, stories, news, and product reviews was created for this study. The second step is to use the newly created dataset to evaluate six tools. The six AI text identification systems, "GPTkit," "GPTZero," "Originality," "Sapling," "Writer," and "Zylalab," have accuracy rates between 55.29% and 97.0%. Although all the tools fared well in the evaluations, Originality was particularly effective across the board. Comment: 15 pages, 4 figures, 2 tables, 42 references.

    Cyber-crime Science = Crime Science + Information Security

    Cyber-crime Science is an emerging area of study aiming to prevent cyber-crime by combining security protection techniques from Information Security with empirical research methods used in Crime Science. Information Security research has developed techniques for protecting the confidentiality, integrity, and availability of information assets but is less strong on the empirical study of the effectiveness of these techniques. Crime Science studies the effect of crime prevention techniques empirically in the real world, and proposes improvements to these techniques based on the findings. Combining both approaches, Cyber-crime Science transfers and further develops Information Security techniques to prevent cyber-crime, and empirically studies the effectiveness of these techniques in the real world. In this paper we review the main contributions of Crime Science as of today, illustrate its application to a typical Information Security problem, namely phishing, explore the interdisciplinary structure of Cyber-crime Science, and present an agenda for research in Cyber-crime Science in the form of a set of suggested research questions.

    From Large Urban to Small Rural Schools: An Empirical Study of National Board Certification and Teaching Effectiveness Final Report

    The National Board for Professional Teaching Standards (NBPTS) is a professional organization that provides national certification to teachers who apply for and meet the Board's standards of performance for "accomplished" educators. This study responds to a request from the NBPTS to analyze National Board certification among high school teachers in understudied subject areas and locales to help fill gaps in the research literature. The research team selected two new locales for this analysis, the Commonwealth of Kentucky and the Chicago public schools. Chicago, a racially and ethnically diverse city with a population of more than 2.8 million, has one of the largest urban school districts in the country. Kentucky, by contrast, is a largely rural state with some suburban and urban areas, including the Louisville/Jefferson County metro area, population 750,000. Together, these two locales encompass a full range of public school settings

    Currently offered intercultural training in Germany and Great Britain: an empirical study

    The article by Yvonne Knoll, Currently Offered Intercultural Training in Germany and Great Britain, undertakes an empirical comparison of the training landscape in these two countries and identifies development potential for intercultural training offerings.

    Implications of industry 4.0 on financial performance: an empirical study

    In this thesis, we explore the relationship between Industry 4.0 technologies and financial performance. After presenting the fourth industrial revolution and an analysis of management articles and reviews, we describe the database used. Finally, the research questions are investigated through t-tests and a multiple linear regression model.