Popular Ensemble Methods: An Empirical Study
An ensemble consists of a set of individually trained classifiers (such as
neural networks or decision trees) whose predictions are combined when
classifying novel instances. Previous research has shown that an ensemble is
often more accurate than any of the single classifiers in the ensemble. Bagging
(Breiman, 1996c) and Boosting (Freund and Schapire, 1996; Schapire, 1990) are two
relatively new but popular methods for producing ensembles. In this paper we
evaluate these methods on 23 data sets using both neural networks and decision
trees as our classification algorithm. Our results clearly indicate a number of
conclusions. First, while Bagging is almost always more accurate than a single
classifier, it is sometimes much less accurate than Boosting. On the other
hand, Boosting can create ensembles that are less accurate than a single
classifier -- especially when using neural networks. Analysis indicates that
the performance of the Boosting methods is dependent on the characteristics of
the data set being examined. In fact, further results show that Boosting
ensembles may overfit noisy data sets, thus decreasing their performance.
Finally, consistent with previous studies, our work suggests that most of the
gain in an ensemble's performance comes in the first few classifiers combined;
however, relatively large gains can be seen up to 25 classifiers when Boosting
decision trees.
An Empirical Study of AI Generated Text Detection Tools
Since ChatGPT has emerged as a major AIGC model, providing high-quality
responses across a wide range of applications (including software development
and maintenance), it has attracted much interest from many individuals. ChatGPT
has great promise, but there are serious problems that might arise from its
misuse, especially in the realms of education and public safety. Several AIGC
detectors are available, and they have all been tested on genuine text.
However, more study is needed to see how effective they are for multi-domain
ChatGPT material. This study aims to fill this need by creating a multi-domain
dataset for testing the state-of-the-art APIs and tools for detecting
artificially generated information used by universities and other research
institutions. A large dataset consisting of articles, abstracts, stories, news,
and product reviews was created for this study. The second step is to use the
newly created dataset to put six tools through their paces. Six different
artificial intelligence (AI) text identification systems, including "GPTkit,"
"GPTZero," "Originality," "Sapling," "Writer," and "Zylalab," have accuracy
rates between 55.29% and 97.0%. Although all the tools fared well in the
evaluations, Originality was particularly effective across the board.
Comment: 15 pages, 4 figures, 2 tables, 42 references
Conceptions Of Deprivation: An Empirical Study
This is a study of the concept of 'deprivation'. It aims to discover what the concept means to teachers and others responsible for the education and welfare of children living in a housing estate which is officially recognised as 'deprived'. A social construction approach is used, as the idea is examined at what is in essence a 'folk' level as far as the contributing professionals are concerned. The recorded impressions held by these professional people provide the data for the investigation, and eight teachers working at the local primary school serve as key witnesses.
Before examining the largely tape-recorded evidence collected in the field-work phase of the project, attention is given to the way in which the word 'deprivation' is used, and an attempt is made to identify underlying ideas held by users of the concept: it is suspected that the label 'deprived child' may be a factor when underachievement occurs in schools serving neighbourhoods of the kind here considered.
The difficulty of usefully surveying the wide literature on deprivation is discussed and attention is drawn to the sterility of studies in this field which attempt to negate the influence of ideology: it is postulated that a full understanding of the concept of deprivation is unlikely to be gained solely from measurement of the generally-used criteria. Nevertheless, indices of deprivation as revealed, for example, in the Census are noticed and comparisons are made between the research area and the country as a whole. Even so, as it is the subjective reality of witnesses that is being sought, this research project is in the tradition of sociological phenomenology.
Five groups of hypotheses have been set up against which to measure possible ways in which children come to be categorised as 'deprived', and in a further group of hypotheses an attempt is made to measure the implications of such categorization before formulating operational advice of particular significance to teachers serving in neighbourhoods seen as deprived.
Cyber-crime Science = Crime Science + Information Security
Cyber-crime Science is an emerging area of study aiming to prevent cyber-crime by combining security protection techniques from Information Security with empirical research methods used in Crime Science. Information security research has developed techniques for protecting the confidentiality, integrity, and availability of information assets but is less strong on the empirical study of the effectiveness of these techniques. Crime Science studies the effect of crime prevention techniques empirically in the real world, and proposes improvements to these techniques based on this. Combining both approaches, Cyber-crime Science transfers and further develops Information Security techniques to prevent cyber-crime, and empirically studies the effectiveness of these techniques in the real world. In this paper we review the main contributions of Crime Science as of today, illustrate its application to a typical Information Security problem, namely phishing, explore the interdisciplinary structure of Cyber-crime Science, and present an agenda for research in Cyber-crime Science in the form of a set of suggested research questions.
From Large Urban to Small Rural Schools: An Empirical Study of National Board Certification and Teaching Effectiveness Final Report
The National Board for Professional Teaching Standards (NBPTS) is a professional organization that provides national certification to teachers who apply for and meet the Board's standards of performance for "accomplished" educators. This study responds to a request from the NBPTS to analyze National Board certification among high school teachers in understudied subject areas and locales to help fill gaps in the research literature. The research team selected two new locales for this analysis, the Commonwealth of Kentucky and the Chicago public schools. Chicago, a racially and ethnically diverse city with a population of more than 2.8 million, has one of the largest urban school districts in the country. Kentucky, by contrast, is a largely rural state with some suburban and urban areas, including the Louisville/Jefferson County metro area, population 750,000. Together, these two locales encompass a full range of public school settings.
Currently offered intercultural training in Germany and Great Britain: an empirical study
Yvonne Knoll's article, Currently Offered Intercultural Training in Germany and Great Britain, undertakes an empirical comparison of the training landscape in these two countries and identifies development potential for intercultural training offerings.
Implications of industry 4.0 on financial performance: an empirical study
With this thesis, we explore the relationship between industry 4.0 technologies and financial performance. After presenting the fourth industrial revolution and an analysis of management articles and reviews, we describe the database used. Finally, the research questions are investigated through t-tests and a multiple linear regression model.