    AntiPlag: Plagiarism Detection on Electronic Submissions of Text Based Assignments

    Plagiarism is a growing problem in academia and a persistent concern for universities and other academic institutions. The situation is worsening with the availability of ample resources on the web. This paper focuses on creating an effective and fast plagiarism detection tool for text-based electronic assignments. Our tool, named AntiPlag, is built on the tri-gram sequence matching technique. Three sets of text-based assignments were tested with AntiPlag, and the results were compared against an existing commercial plagiarism detection tool. AntiPlag produced fewer false positives than the commercial tool, owing to its pre-processing steps. In addition, to reduce detection latency, AntiPlag applies a data clustering technique that makes it four times faster than the commercial tool considered. AntiPlag can be used to easily separate plagiarized text-based assignments from non-plagiarized ones. We therefore present AntiPlag as a fast and effective tool for plagiarism detection on text-based electronic assignments.
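
The abstract names tri-gram sequence matching but does not give AntiPlag's pre-processing or scoring rules. The following is a minimal sketch of the general idea only, assuming word-level tri-grams, lowercase/punctuation normalization, and a Jaccard overlap score; the function names `trigrams` and `trigram_similarity` are hypothetical and are not taken from the paper.

```python
import re


def trigrams(text):
    """Return the set of word tri-grams in `text`.

    Assumed pre-processing (not from the paper): lowercase the text and
    keep only alphanumeric word tokens.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def trigram_similarity(doc_a, doc_b):
    """Jaccard overlap of the two documents' tri-gram sets, in [0.0, 1.0].

    Jaccard is an assumed scoring choice for illustration; the tool may
    score matches differently.
    """
    set_a, set_b = trigrams(doc_a), trigrams(doc_b)
    if not set_a or not set_b:
        # A document shorter than three words has no tri-grams to compare.
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)
```

Under these assumptions, two submissions that differ only in case or punctuation score 1.0, and pairs scoring above a chosen threshold would be flagged for manual review.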

    Mapping the Bid Behavior of Conference Referees

    The peer-review process, in its present form, has been repeatedly criticized. Of the many critiques, ranging from publication delays to referee bias, this paper focuses specifically on how submitted manuscripts are distributed to qualified referees. Unqualified referees, lacking proper knowledge of a manuscript's domain, may reject a perfectly valid study or, potentially more damaging, unknowingly accept a faulty or fraudulent result. In this paper, referee competence is analyzed with respect to referee bid data collected from the 2005 Joint Conference on Digital Libraries (JCDL). The analysis of referee bid behavior validates the intuition that referees bid on conference submissions with regard to the subject domain of the submission. Unfortunately, this relationship is not strong, which suggests that there exist other factors beyond subject domain that may be influencing referees to bid for particular submissions.

    Topic Modelling of Everyday Sexism Project Entries

    The Everyday Sexism Project documents everyday examples of sexism reported by volunteer contributors from all around the world. It collected 100,000 entries in 13+ languages within the first 3 years of its existence. The content of reports in various languages submitted to Everyday Sexism is a valuable source of crowdsourced information with great potential for feminist and gender studies. In this paper, we take a computational approach to analyzing the content of the reports. We use topic-modelling techniques to extract emerging topics and concepts from the reports, and to map the semantic relations between those topics. The resulting picture closely resembles, and adds to, that arrived at through qualitative analysis, showing that this form of topic modelling could be useful for sifting through datasets that have not previously been subject to any analysis. More precisely, we produce a map of topics for two different resolutions of our topic model and discuss the connections between the identified topics. In the low-resolution picture, for instance, we found Public space/Street, Online, Work related/Office, Transport, School, Media harassment, and Domestic abuse. Among these, the strongest connection is between Public space/Street harassment and Domestic abuse and sexism in personal relationships. The strength of the relationships between topics illustrates the fluid and ubiquitous nature of sexism, with no single experience being unrelated to another. Comment: preprint, under review.

    Green OFDMA Resource Allocation in Cache-Enabled CRAN

    Cloud radio access network (CRAN), in which remote radio heads (RRHs) are deployed to serve users in a target area and are connected to a central processor (CP) via limited-capacity links termed the fronthaul, is a promising candidate for next-generation wireless communication systems. Due to the content-centric nature of future wireless communications, it is desirable to cache popular content beforehand at the RRHs, to reduce the burden on the fronthaul and save energy through cooperative transmission. This motivates our study of energy-efficient transmission in an orthogonal frequency division multiple access (OFDMA)-based CRAN with multiple RRHs and users, where the RRHs can prefetch popular content. We consider a joint optimization of the user-subcarrier (SC) assignment, RRH selection, and transmit power allocation over all the SCs to minimize the total transmit power of the RRHs, subject to the RRHs' individual fronthaul capacity constraints and the users' minimum rate constraints, while taking into account the caching status at the RRHs. Although the problem is non-convex, we propose a Lagrange duality based solution, which can be efficiently computed with good accuracy. Using simulations, we compare the minimum transmit power required by the proposed algorithm under different caching strategies against the case without caching; the results show significant energy savings with caching. Comment: Presented in IEEE Online Conference on Green Communications (Online GreenComm), Nov. 2016 (Invited Paper).

    Reinforcement machine learning for predictive analytics in smart cities

    The digitization of our lives causes a shift in data production as well as in the required data management. Numerous nodes are capable of producing huge volumes of data in our everyday activities. Sensors, personal smart devices, and the Internet of Things (IoT) paradigm lead to a vast infrastructure that covers all aspects of activity in modern societies. In most cases, the critical issue for public authorities (usually local ones, such as municipalities) is the efficient management of data to support novel services. The reason is that analytics provided on top of the collected data could help deliver new applications that facilitate citizens' lives. However, the provision of analytics demands intelligent techniques for the underlying data management. The best-known technique is to separate huge volumes of data into a number of partitions and manage them in parallel, limiting the time required to deliver analytics. Afterwards, analytics requests in the form of queries can be executed to derive the knowledge necessary for supporting intelligent applications. In this paper, we define the concept of a Query Controller (QC) that receives queries for analytics and assigns each of them to a processor placed in front of each data partition. We discuss an intelligent process for query assignment that adopts Machine Learning (ML). We adopt two learning schemes, i.e., Reinforcement Learning (RL) and clustering. We report on the comparison of the two schemes and elaborate on their combination. Our aim is to provide an efficient framework to support the decision making of the QC, which should swiftly select the appropriate processor for each query. We provide mathematical formulations for the discussed problem and present simulation results. Through a comprehensive experimental evaluation, we reveal the advantages of the proposed models and describe the outcomes while comparing them with a deterministic framework.
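
The abstract does not specify the QC's learning formulation. Purely as an illustration of how a controller might learn which processor to assign a query to, here is a minimal sketch assuming a simple epsilon-greedy scheme with a running-mean reward per processor (for example, the negative of the observed response time). The class `QueryController` and its methods are hypothetical names, not the paper's model.

```python
import random


class QueryController:
    """Hypothetical epsilon-greedy query controller (illustrative only)."""

    def __init__(self, n_processors, epsilon=0.1, seed=None):
        self.epsilon = epsilon                # probability of exploring
        self.rng = random.Random(seed)        # private RNG for repeatability
        self.counts = [0] * n_processors      # rewards observed per processor
        self.values = [0.0] * n_processors    # running mean reward per processor

    def assign(self):
        """Pick a processor index for the next query."""
        if self.rng.random() < self.epsilon:
            # Explore: try a uniformly random processor.
            return self.rng.randrange(len(self.values))
        # Exploit: choose the processor with the best mean reward so far.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, processor, reward):
        """Fold an observed reward into the processor's running mean."""
        self.counts[processor] += 1
        n = self.counts[processor]
        self.values[processor] += (reward - self.values[processor]) / n
```

In use, the controller would call `assign()` for each incoming query, execute the query on the chosen processor, and pass the measured reward back through `update()`. A clustering step, as mentioned in the abstract, could be layered on top by keeping one such controller per query cluster, though that combination is also an assumption here.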