259 research outputs found

    Wikipedia vandalism detection: combining natural language, metadata, and reputation features

    Wikipedia is an online encyclopedia which anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith, introducing spam and other inappropriate content. In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatio-temporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The performance of the resulting joint system improves on the state of the art set by all previous methods and establishes a new baseline for Wikipedia vandalism detection. We examine in detail the contribution of the three approaches, both for the task of discovering fresh vandalism, and for the task of locating vandalism in the complete set of Wikipedia revisions.

    The authors from Universitat Politècnica de València also thank the MICINN research project TEXT-ENTERPRISE 2.0 TIN2009-13391-C04-03 (Plan I+D+i). UPenn contributions were supported in part by ONR MURI N00014-07-1-0907. This research was partially supported by award 1R01GM089820-01A1 from the National Institute Of General Medical Sciences, and by ISSDM, a UCSC-LANL educational collaboration.

    Adler, BT.; Alfaro, LD.; Mola Velasco, SM.; Rosso, P.; West, AG. (2011). Wikipedia vandalism detection: combining natural language, metadata, and reputation features. In Computational Linguistics and Intelligent Text Processing. Springer Verlag (Germany). 6609:277-288. https://doi.org/10.1007/978-3-642-19437-5_23

    Monte Carlo Procedure for Protein Design

    A new method for sequence optimization in protein models is presented. The approach inherits its basic philosophy from recent work by Deutsch and Kurosky [Phys. Rev. Lett. 76, 323 (1996)], maximizing conditional probabilities rather than minimizing energy functions, and is based upon a novel and very efficient multisequence Monte Carlo scheme. By construction, the method ensures that the designed sequences represent good folders thermodynamically. A bootstrap procedure for the sequence space search is devised, making very large chains feasible. The algorithm is successfully explored on the two-dimensional HP model with chain lengths N=16, 18 and 32.
    Comment: 7 pages LaTeX, 4 Postscript figures; minor change
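
    For readers unfamiliar with the two-dimensional HP model named in this abstract, the following is a minimal sketch of how a conformation's energy is commonly scored. It assumes the standard convention of -1 per pair of H monomers that are lattice neighbours but not chain neighbours; the abstract does not state the energy function, and the name `hp_energy` is hypothetical. It illustrates the model only, not the paper's multisequence Monte Carlo scheme.

```python
def hp_energy(sequence, coords):
    """Energy of an HP chain on a square lattice.

    Assumption (not from the abstract): the standard HP convention,
    -1 for every pair of 'H' monomers that sit on adjacent lattice
    sites without being consecutive along the chain.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    # Consecutive monomers must occupy neighbouring lattice sites.
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            raise ValueError("chain is not connected on the lattice")

    index_at = {pos: i for i, pos in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so each contact is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy
```

    As a usage example, the four-monomer chain "HPPH" folded around a unit square, with coordinates (0,0), (1,0), (1,1), (0,1), has a single H-H contact between its two ends, while the same sequence laid out in a straight line has none.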

    Predicting the Next Best View for 3D Mesh Refinement

    3D reconstruction is a core task in many applications such as robot navigation or site inspection. Finding the best poses to capture part of the scene is one of the most challenging topics, known as Next Best View. Recently, many volumetric methods have been proposed; they choose the Next Best View by reasoning over a 3D voxelized space and by finding which pose minimizes the uncertainty encoded in the voxels. Such methods are effective, but they do not scale well since the underlying representation requires a huge amount of memory. In this paper we propose a novel mesh-based approach which focuses on the worst reconstructed region of the environment mesh. We define a photo-consistent index to evaluate the 3D mesh accuracy, and an energy function over the worst regions of the mesh which takes into account the mutual parallax with respect to the previous cameras, the angle of incidence of the viewing ray to the surface and the visibility of the region. We test our approach over a well-known dataset and achieve state-of-the-art results.
    Comment: 13 pages, 5 figures, to be published in IAS-1

    Retention of improvement in gait stability over 14 weeks due to trip-perturbation training is dependent on perturbation dose

    © 2018 Elsevier Ltd. Perturbation training is an emerging approach to reduce fall risk in the elderly. This study examined potential differences in retention of improvements in reactive gait stability over 14 weeks resulting from unexpected trip-like gait perturbations. Twenty-four healthy middle-aged adults (41–62 years) were assigned randomly to either a single perturbation group (SINGLE, n = 9) or a group subjected to eight trip-like gait perturbations (MULTIPLE, n = 15). While participants walked on a treadmill, a custom-built brake-and-release system was used to unexpectedly apply resistance during swing phase to the lower right limb via an ankle strap. The anteroposterior margin of stability (MoS) was calculated as the difference between the anterior boundary of the base of support and the extrapolated centre of mass at foot touchdown for the perturbed step and the first recovery step during the first and second (MULTIPLE group only) perturbation trials for the initial walking session and retention-test walking 14 weeks later. Group MULTIPLE retained the improvements in reactive gait stability to the perturbations (increased MoS at touchdown for perturbed and first recovery steps; p < 0.01). However, in group SINGLE no differences in MoS were detected after 14 weeks compared to the initial walking session. These findings provide evidence for the requirement of a threshold trip-perturbation dose if adaptive changes in the human neuromotor system over several months, aimed at the improvement in fall-resisting skills, are to occur.
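
    The margin of stability described in this abstract can be sketched as below. The abstract gives only the difference between the anterior boundary of the base of support and the extrapolated centre of mass; the formula used here for the extrapolated centre of mass (position plus velocity divided by sqrt(g / l), the inverted-pendulum form widely used in the gait literature) is an assumption, as are the function names, the parameter names, and the choice of pendulum length. The study's exact computation may differ.

```python
import math


def extrapolated_com(com_pos, com_vel, pendulum_length, g=9.81):
    """Extrapolated centre of mass in the anteroposterior direction.

    Assumption (not stated in the abstract): the inverted-pendulum form
    XCoM = position + velocity / sqrt(g / l), with l in metres.
    """
    if pendulum_length <= 0:
        raise ValueError("pendulum_length must be positive")
    omega0 = math.sqrt(g / pendulum_length)
    return com_pos + com_vel / omega0


def margin_of_stability(bos_anterior, com_pos, com_vel, pendulum_length, g=9.81):
    """MoS = anterior boundary of the base of support minus XCoM.

    Positive values mean the extrapolated centre of mass lies behind the
    anterior boundary (a stable configuration in this convention).
    """
    return bos_anterior - extrapolated_com(com_pos, com_vel, pendulum_length, g)
```

    All positions are in metres along the walking direction and velocity is in metres per second, evaluated at foot touchdown; a larger MoS at touchdown corresponds to the improvement the abstract reports for the MULTIPLE group.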

    Passenger car data – a new source of real-time weather information for nowcasting, forecasting, and road safety

    Presentation given at the 3rd European Nowcasting Conference, held at AEMET headquarters in Madrid from 24 to 26 April 2019.