
    An Algorithm to Determine Peer-Reviewers

    The peer-review process is the most widely accepted certification mechanism for officially accepting the written results of researchers within the scientific community. An essential component of peer review is the identification of competent referees to review a submitted manuscript. This article presents an algorithm to automatically determine the most appropriate reviewers for a manuscript by way of a co-authorship network data structure and a relative-rank particle-swarm algorithm. This approach is novel in that it is not limited to a pre-selected set of referees, is computationally efficient, requires no human intervention, and, in some instances, can automatically identify conflict-of-interest situations. A useful application of this algorithm would be open commentary peer-review systems, because it provides a weighting for each referee with respect to their expertise in the domain of a manuscript. The algorithm is validated using referee bid data from the 2005 Joint Conference on Digital Libraries.
    Comment: Rodriguez, M.A., Bollen, J., "An Algorithm to Determine Peer-Reviewers", Conference on Information and Knowledge Management, in press, ACM, LA-UR-06-2261, October 2008; ISBN:978-1-59593-991-
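
    The general idea, a particle swarm seeded at the manuscript's authors and spread through the co-authorship network, can be sketched roughly as below. This is an illustrative approximation, not the paper's exact relative-rank algorithm; the graph library, parameters, and function names are assumptions.

```python
# Rough sketch of particle-swarm referee ranking over a co-authorship
# network (illustrative; not the paper's exact relative-rank algorithm).
import random
from collections import defaultdict

import networkx as nx  # assumed graph library


def rank_referees(coauthors: nx.Graph, manuscript_authors,
                  n_particles=1000, decay=0.85, max_steps=20, seed=0):
    """Seed particles at the manuscript's authors and random-walk the
    network; the energy deposited at a node is that researcher's referee
    weight for the manuscript."""
    rng = random.Random(seed)
    authors = list(manuscript_authors)
    energy = defaultdict(float)
    for _ in range(n_particles):
        node, strength = rng.choice(authors), 1.0
        for _ in range(max_steps):
            neighbors = list(coauthors.neighbors(node))
            if not neighbors:
                break
            node = rng.choice(neighbors)
            strength *= decay            # particle loses energy per hop
            energy[node] += strength
    for a in authors:                    # crude conflict-of-interest filter
        energy.pop(a, None)
    return sorted(energy.items(), key=lambda kv: kv[1], reverse=True)
```

    The returned per-referee weighting is what an open commentary system could use directly; excluding the seed authors themselves is the simplest form of the conflict-of-interest check mentioned above.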

    Incorporating latent semantic indexing into a neural network model for information retrieval


    Automatic Paper-to-reviewer Assignment, based on the Matching Degree of the Reviewers

    There are a number of issues involved in organizing a conference. Among these, assigning conference papers to reviewers is one of the most difficult tasks, and performing the assignment automatically is the most crucial part. In this paper, we address the issue of paper-to-reviewer assignment and propose a method to model the reviewers based on the matching degree between the reviewers and the papers, combining a preference-based approach and a topic-based approach. We explain the assignment algorithm and show evaluation results in comparison with the Hungarian algorithm.
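
    A minimal sketch of this kind of combined matching degree, assuming preference (bid) scores and topic similarities are already computed as matrices; the linear weighting `alpha` is an illustrative assumption, and the Hungarian algorithm (SciPy's linear_sum_assignment) plays the role of the comparison baseline named in the abstract.

```python
# Hedged sketch: combine preference and topic scores into a matching
# degree, then solve the assignment with the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment


def assign_papers(preference, topic_similarity, alpha=0.5):
    """preference, topic_similarity: (n_reviewers, n_papers) arrays in [0, 1].
    Returns (reviewer, paper) index pairs maximizing total matching degree."""
    matching_degree = alpha * preference + (1 - alpha) * topic_similarity
    rows, cols = linear_sum_assignment(-matching_degree)  # negate: maximize
    return list(zip(rows, cols))


# Toy example: 3 reviewers, 3 papers.
pref = np.array([[0.9, 0.1, 0.4], [0.2, 0.8, 0.3], [0.5, 0.6, 0.9]])
topic = np.array([[0.7, 0.2, 0.1], [0.3, 0.9, 0.2], [0.4, 0.3, 0.8]])
print(assign_papers(pref, topic))  # pairs (0, 0), (1, 1), (2, 2)
```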

    Significance Testing Against the Random Model for Scoring Models on Top k Predictions

    Performance at top k predictions, where instances are ranked by a (learned) scoring model, has been used as an evaluation metric in machine learning for various reasons, such as when the entire corpus is unknown (e.g., the web) or when the results are to be used by a person with limited time or resources (e.g., ranking financial news stories where the investor only has time to look at relatively few stories per day). This evaluation metric is primarily used to report whether the performance of a given method is significantly better than that of other (baseline) methods. It has not, however, been used to show whether the result is significant when compared to the simplest of baselines: the random model. If no model outperforms the random model at a given confidence level, then the results may not be worth reporting. This paper introduces a technique to analyze the expected performance of the top k predictions from the random model given k and a p-value on an evaluation dataset D. The technique is based on the realization that the number of positives seen in the top k predictions follows a hypergeometric distribution, which has well-defined statistical density functions. As this distribution is discrete, we show that parametric estimations based on a binomial distribution are almost always in complete agreement with the discrete distribution and that, if they differ, an interpolation of the discrete bounds gets very close to the parametric estimations. The technique is demonstrated on results from three prior published works, where it clearly shows that even though performance is greatly increased (sometimes over 100%) with respect to the expected performance of the random model (at p = 0.5), these results, although qualitatively impressive, are not always as significant (p = 0.1) as the improvements might suggest. The technique is used to show, given k, both how many positive instances are needed to achieve a specific significance threshold and how significant a given top k performance is. When used in a more global setting, the technique can identify the crossover points, with respect to k, at which a method becomes significant for a given p. Lastly, the technique is used to generate a complete confidence curve, which shows a general trend over all k and visually shows where a method is significantly better than the random model over all values of k.
    Information Systems Working Papers Series
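
    Since the hypergeometric distribution is available in standard statistics libraries, the test described above reduces to a survival-function lookup. A minimal sketch with SciPy; the variable names and example numbers are illustrative, not taken from the paper.

```python
# Sketch of the core test: under the random model, the number of positives
# in the top k is hypergeometric, so significance is a tail probability.
from scipy.stats import hypergeom


def top_k_p_value(N, P, k, h):
    """N: dataset size, P: total positives, k: cutoff,
    h: positives observed in the top k.
    Returns P(X >= h) under a random ranking."""
    return hypergeom.sf(h - 1, N, P, k)


def positives_needed(N, P, k, p=0.1):
    """Smallest h for which a top-k result is significant at level p."""
    for h in range(min(k, P) + 1):
        if top_k_p_value(N, P, k, h) <= p:
            return h
    return None


# Example: 10,000 instances, 500 positives; the top 100 contains 15.
# The random model expects 5 positives, so 15 is far out in the tail.
print(top_k_p_value(10_000, 500, 100, 15))  # ~1e-4, well below p = 0.1
print(positives_needed(10_000, 500, 100))   # 9
```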

    Citation recommendation via proximity full-text citation analysis and supervised topical prior

    Many publications are now available electronically and online, which has had a significant effect while also bringing several challenges. With the objective of enhancing citation recommendation through innovative text and graph mining algorithms along with full-text citation analysis, we utilized proximity-based citation contexts extracted from a large number of full-text publications, then used a publication/citation topic distribution to generate a novel citation graph and calculate each publication's topical importance. The importance score can be utilized as a new means of enhancing recommendation performance. Experiments with full-text citation data showed that the novel method significantly (p < 0.001) enhances citation recommendation performance.
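
    One plausible reading of the topical-importance computation is a topic-weighted, personalized PageRank over the citation graph. The sketch below is an assumption about the general shape of such a score, not the paper's exact formulation; the graph and topic distributions are illustrative inputs.

```python
# Illustrative sketch: publication topical importance as topic-weighted,
# personalized PageRank over a citation graph (assumed formulation).
import networkx as nx


def topical_importance(citations: nx.DiGraph, topic_dist, topic_id):
    """topic_dist: {paper: list of topic probabilities}. Edges are
    weighted by how strongly the cited paper matches the topic; PageRank
    then propagates importance along citation links."""
    for u, v in citations.edges():
        citations[u][v]["weight"] = float(topic_dist[v][topic_id])
    # Personalize toward papers that themselves discuss the topic.
    personalization = {p: float(d[topic_id]) for p, d in topic_dist.items()}
    return nx.pagerank(citations, weight="weight",
                       personalization=personalization)
```

    Papers that are both heavily cited and topically on-target receive the highest scores, which could then serve as the additional ranking signal the abstract describes.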

    Distributed Denial of Service Attack Detection

    Distributed Denial of Service (DDoS) attacks on web applications have been a persistent threat. Successful attacks can make a service inaccessible to legitimate users and damage business reputation. Most research effort on DDoS has focused on network layer attacks. Existing approaches to application layer DDoS attack mitigation have limitations, such as the inability to detect low-rate DDoS or attacks targeting resource files. In this work, we propose DDoS attack detection using concepts from information retrieval and machine learning. We include two popular concepts from information retrieval: Term Frequency-Inverse Document Frequency (TF-IDF) and Latent Semantic Indexing (LSI). We analyzed web server log data generated in a distributed environment. Our evaluation results indicate that while all the approaches can detect various ranges of attacks, the information retrieval approaches can identify attacks ongoing within a given session. All the approaches can detect three well-known application level DDoS attacks (trivial, intermediate, advanced). Further, these approaches can help an administrator identify new patterns of DDoS attacks.
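
    A minimal sketch of the information-retrieval side of such a detector, assuming each session's request sequence from the server log is treated as a document: TF-IDF vectorization, LSI via truncated SVD, and a distance-from-benign-centroid flag. The features, threshold, and session representation are illustrative assumptions, not the paper's exact pipeline.

```python
# Hedged sketch: TF-IDF + LSI over per-session request "documents", with
# sessions far from the benign centroid flagged as suspect.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer


def fit_detector(benign_sessions, n_topics=10):
    """benign_sessions: list of strings, each a session's requested URLs."""
    vectorizer = TfidfVectorizer(token_pattern=r"\S+")
    lsi = TruncatedSVD(n_components=n_topics, random_state=0)
    X = lsi.fit_transform(vectorizer.fit_transform(benign_sessions))
    centroid = X.mean(axis=0)
    # Flag anything farther from the benign centroid than 99% of training.
    threshold = np.percentile(np.linalg.norm(X - centroid, axis=1), 99)
    return vectorizer, lsi, centroid, threshold


def looks_like_attack(session, vectorizer, lsi, centroid, threshold):
    x = lsi.transform(vectorizer.transform([session]))[0]
    return np.linalg.norm(x - centroid) > threshold
```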

    Conceptualizing and qualifying disruptive business models

    Purpose – This paper aims to elaborate a set of characteristics that conceptualize and qualify a disruptive business model.
    Design/methodology/approach – The literature on disruptive business models is analyzed using the latent semantic analysis (LSA) technique, complemented by content analysis, to obtain a more precise qualification and conceptualization of disruptive business models.
    Findings – The results describe concepts already present in the theory. However, the findings highlighted by the LSA bring new perspectives to the analysis of disruptive business models, little discussed in the literature, which reveal important considerations to be made on this subject.
    Research limitations/implications – The technique used has a limitation concerning the choice of the number of singular values. As this remains an open problem in the literature, the authors worked not only with the cost-benefit ratio of adding each new dimension to the analysis but also with a saturation criterion over the terms presented.
    Practical implications – The presented set of characteristics can be used by managers as a validation tool to identify whether a business model is disruptive.
    Originality/value – The originality of this paper is the achievement of a consolidated set of characteristics that conceptualize and qualify disruptive business models, obtained through an in-depth LSA-based analysis of the literature, considering the difficulty of finding precise concepts on this subject.
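
    The dimensionality question flagged under the limitations, how many singular values to retain, can be framed as a saturation check on explained variance. A rough sketch under that assumption; the corpus, stop-word choice, and 1% gain cutoff are illustrative, not the authors' settings.

```python
# Sketch: pick the LSA dimensionality where the explained-variance gain
# of each additional singular value saturates (illustrative criterion).
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer


def saturation_k(documents, max_k=50, min_gain=0.01):
    """Return the smallest k after which each extra LSA dimension adds
    less than `min_gain` of the explained variance."""
    X = TfidfVectorizer(stop_words="english").fit_transform(documents)
    svd = TruncatedSVD(n_components=min(max_k, X.shape[1] - 1)).fit(X)
    for k, gain in enumerate(svd.explained_variance_ratio_, start=1):
        if gain < min_gain:
            return max(k - 1, 1)
    return len(svd.explained_variance_ratio_)
```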

    Proximity-based document representation for named entity retrieval
