Search CORE

42,172 research outputs found

AntiPlag: Plagiarism Detection on Electronic Submissions of Text Based Assignments

Author: Deegalla S.
Jahan M. A. C. Akmal
Jiffriya M. A. C.
Ragel R. G.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/03/2014

Plagiarism is a growing issue in academia and a constant concern in universities and other academic institutions. The situation is becoming even worse with the availability of ample resources on the web. This paper focuses on creating an effective and fast tool for plagiarism detection for text-based electronic assignments. Our plagiarism detection tool, named AntiPlag, is developed using the tri-gram sequence matching technique. Three sets of text-based assignments were tested by AntiPlag and the results were compared against an existing commercial plagiarism detection tool. AntiPlag showed better results in terms of false positives compared to the commercial tool due to the pre-processing steps performed in AntiPlag. In addition, to improve the detection latency, AntiPlag applies a data clustering technique, making it four times faster than the commercial tool considered. AntiPlag could be used to isolate plagiarized text-based assignments from non-plagiarized assignments easily. Therefore, we present AntiPlag, a fast and effective tool for plagiarism detection on text-based electronic assignments.
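
The tri-gram matching idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say whether AntiPlag uses word or character tri-grams, how it scores overlap, or what its pre-processing and clustering steps are, so everything below is an assumption: the function names are hypothetical, tri-grams are taken over lowercased words, and overlap is scored with the Jaccard coefficient.

```python
import re


def word_trigrams(text):
    """Return the set of word tri-grams in `text` after light normalisation.

    Lowercasing and stripping punctuation stand in for the (unspecified)
    pre-processing steps mentioned in the abstract.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def trigram_similarity(a, b):
    """Jaccard overlap of the word tri-gram sets of two texts, in [0, 1]."""
    ta, tb = word_trigrams(a), word_trigrams(b)
    if not ta or not tb:
        return 0.0  # texts shorter than three words share no tri-grams
    return len(ta & tb) / len(ta | tb)


original = "The quick brown fox jumps over the lazy dog"
copied = "the quick brown fox jumps over a sleeping cat"
unrelated = "Plagiarism detection tools compare student submissions"

print(trigram_similarity(original, copied))     # shares 4 of 10 distinct tri-grams
print(trigram_similarity(original, unrelated))  # shares none
```

A pair of submissions would be flagged when the score exceeds some threshold; the threshold, like the rest of this sketch, is not taken from the paper.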

arXiv.org e-Print Archive

Crossref

Hierarchical classification for multiple, distributed web databases

Author: Yang Hui
Zhang Minjie
Publication date: 01/01/2004

The proliferation of online information resources increases the importance of effective and efficient distributed searching. Our research aims to provide an alternative hierarchical categorization and search capability based on a Bayesian network learning algorithm. Our proposed approach, which is grounded in automatic textual analysis of the subject content of online web databases, attempts to address the database selection problem by first classifying web databases into a hierarchy of topic categories. The experimental results reported demonstrate that such a classification approach not only effectively reduces the class search space, but also helps to significantly improve classification accuracy.

Open Research Online (The Open University)

White Rose Research Online

Mapping the Bid Behavior of Conference Referees

Author: Bollen Johan
Rodriguez Marko A.
Van de Sompel Herbert
Publication venue: 'Elsevier BV'
Publication date: 24/05/2006

The peer-review process, in its present form, has been repeatedly criticized. Of the many critiques, ranging from publication delays to referee bias, this paper will focus specifically on the issue of how submitted manuscripts are distributed to qualified referees. Unqualified referees, without the proper knowledge of a manuscript's domain, may reject a perfectly valid study or, potentially more damaging, unknowingly accept a faulty or fraudulent result. In this paper, referee competence is analyzed with respect to referee bid data collected from the 2005 Joint Conference on Digital Libraries (JCDL). The analysis of the referee bid behavior provides a validation of the intuition that referees are bidding on conference submissions with regard to the subject domain of the submission. Unfortunately, this relationship is not strong and therefore suggests that there exist other factors beyond subject domain that may be influencing referees to bid for particular submissions.

arXiv.org e-Print Archive

CiteSeerX

Topic Modelling of Everyday Sexism Project Entries

Author: Eccles Kathryn
Melville Sophie
Yasseri Taha
Publication date: 05/04/2018

The Everyday Sexism Project documents everyday examples of sexism reported by volunteer contributors from all around the world. It collected 100,000 entries in 13+ languages within the first 3 years of its existence. The content of reports in various languages submitted to Everyday Sexism is a valuable source of crowdsourced information with great potential for feminist and gender studies. In this paper, we take a computational approach to analyze the content of reports. We use topic-modelling techniques to extract emerging topics and concepts from the reports, and to map the semantic relations between those topics. The resulting picture closely resembles and adds to that arrived at through qualitative analysis, showing that this form of topic modelling could be useful for sifting through datasets that had not previously been subject to any analysis. More precisely, we come up with a map of topics for two different resolutions of our topic model and discuss the connection between the identified topics. In the low-resolution picture, for instance, we found Public space/Street, Online, Work related/Office, Transport, School, Media harassment, and Domestic abuse. Among these, the strongest connection is between Public space/Street harassment and Domestic abuse and sexism in personal relationships. The strength of the relationships between topics illustrates the fluid and ubiquitous nature of sexism, with no single experience being unrelated to another.

Comment: preprint, under review

arXiv.org e-Print Archive

Oxford University Research Archive

Investigation of the use of navigation tools in web-based learning: A data mining approach

Author: Chen SY
Liu X
Minetou CG
Publication venue: 'Informa UK Limited'
Publication date: 10/01/2008

Web-based learning is widespread in educational settings. The popularity of Web-based learning is in great measure because of its flexibility. Multiple navigation tools provide some of this flexibility. Different navigation tools offer different functions. Therefore, it is important to understand how the navigation tools are used by learners with different backgrounds, knowledge, and skills. This article presents two empirical studies in which data-mining approaches were used to analyze learners' navigation behavior. The results indicate that prior knowledge and subject content are two potential factors influencing the use of navigation tools. In addition, the lack of appropriate use of navigation tools may adversely influence learning performance. The results have been integrated into a model that can help designers develop Web-based learning programs and other Web-based applications that can be tailored to learners' needs.

Brunel University Research Archive

Green OFDMA Resource Allocation in Cache-Enabled CRAN

Author: Stephen Reuben George
Zhang Rui
Publication date: 13/12/2016

Cloud radio access network (CRAN), in which remote radio heads (RRHs) are deployed to serve users in a target area and connected to a central processor (CP) via limited-capacity links termed the fronthaul, is a promising candidate for next-generation wireless communication systems. Due to the content-centric nature of future wireless communications, it is desirable to cache popular contents beforehand at the RRHs, to reduce the burden on the fronthaul and achieve energy saving through cooperative transmission. This motivates our study in this paper on energy-efficient transmission in an orthogonal frequency division multiple access (OFDMA)-based CRAN with multiple RRHs and users, where the RRHs can prefetch popular contents. We consider a joint optimization of the user-subcarrier (SC) assignment, RRH selection and transmit power allocation over all the SCs to minimize the total transmit power of the RRHs, subject to the RRHs' individual fronthaul capacity constraints and the users' minimum rate constraints, while taking into account the caching status at the RRHs. Although the problem is non-convex, we propose a Lagrange duality based solution, which can be efficiently computed with good accuracy. We compare, by simulations, the minimum transmit power required by the proposed algorithm with different caching strategies against the case without caching; the simulations show significant energy saving with caching.

Comment: Presented in IEEE Online Conference on Green Communications (Online GreenComm), Nov. 2016 (Invited Paper)

arXiv.org e-Print Archive

Crossref

Reinforcement machine learning for predictive analytics in smart cities

Author: Anagnostopoulos Christos
Kolomvatsos Kostas
Publication venue: 'MDPI AG'
Publication date: 01/06/2017

The digitization of our lives causes a shift in data production as well as in the required data management. Numerous nodes are capable of producing huge volumes of data in our everyday activities. Sensors, personal smart devices and the Internet of Things (IoT) paradigm lead to a vast infrastructure that covers all aspects of activity in modern societies. In most cases, the critical issue for public authorities (usually local ones, like municipalities) is the efficient management of data to support novel services. The reason is that analytics provided on top of the collected data could help in the delivery of new applications that will facilitate citizens’ lives. However, the provision of analytics demands intelligent techniques for the underlying data management. The best-known technique is the separation of huge volumes of data into a number of partitions and their parallel management to limit the time required for the delivery of analytics. Afterwards, analytics requests in the form of queries can be executed to derive the knowledge necessary for supporting intelligent applications. In this paper, we define the concept of a Query Controller (QC) that receives queries for analytics and assigns each of them to a processor placed in front of each data partition. We discuss an intelligent process for query assignments that adopts Machine Learning (ML). We adopt two learning schemes, i.e., Reinforcement Learning (RL) and clustering. We report on the comparison of the two schemes and elaborate on their combination. Our aim is to provide an efficient framework to support the decision making of the QC, which should swiftly select the appropriate processor for each query. We provide mathematical formulations for the discussed problem and present simulation results. Through a comprehensive experimental evaluation, we reveal the advantages of the proposed models and describe the outcomes while comparing them with a deterministic framework.
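
To make the Query Controller idea concrete, here is a minimal sketch of a controller that learns which processor to send queries to. It is not the paper's model: the abstract does not specify the RL formulation, so an epsilon-greedy bandit is swapped in as a stand-in. The class name, the `simulate` helper, the latency figures and the reward (negative observed latency) are all hypothetical.

```python
import random


class QueryController:
    """Hypothetical QC: picks a processor per query with epsilon-greedy."""

    def __init__(self, n_processors, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * n_processors        # queries sent to each processor
        self.avg_reward = [0.0] * n_processors  # running mean reward per processor

    def select(self):
        """Explore with probability epsilon, otherwise pick the best-known processor."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda p: self.avg_reward[p])

    def update(self, processor, reward):
        """Fold one observed reward into the running mean for that processor."""
        self.counts[processor] += 1
        n = self.counts[processor]
        self.avg_reward[processor] += (reward - self.avg_reward[processor]) / n


def simulate(latencies, n_queries=2000, seed=0):
    """Send queries to processors with made-up mean latencies and noisy responses."""
    qc = QueryController(len(latencies), seed=seed)
    noise = random.Random(seed + 1)
    for _ in range(n_queries):
        p = qc.select()
        observed = latencies[p] * noise.uniform(0.8, 1.2)
        qc.update(p, -observed)  # lower latency means higher reward
    return qc


qc = simulate([5.0, 1.0, 3.0])
print(qc.counts)  # most queries end up on the processor with the lowest latency
```

The bandit ignores query content entirely; a scheme closer to the abstract would condition the choice on features of each query and of each data partition.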

Directory of Open Access Journals

Enlighten