284 research outputs found

    Wikipedia vandalism detection: combining natural language, metadata, and reputation features

    Wikipedia is an online encyclopedia which anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith, introducing spam and other inappropriate content. In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatio-temporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The performance of the resulting joint system improves on the state of the art set by all previous methods and establishes a new baseline for Wikipedia vandalism detection. We examine in detail the contribution of the three approaches, both for the task of discovering fresh vandalism, and for the task of locating vandalism in the complete set of Wikipedia revisions.
    The authors from Universitat Politècnica de València also thank the MICINN research project TEXT-ENTERPRISE 2.0 TIN2009-13391-C04-03 (Plan I+D+i). UPenn contributions were supported in part by ONR MURI N00014-07-1-0907. This research was partially supported by award 1R01GM089820-01A1 from the National Institute Of General Medical Sciences, and by ISSDM, a UCSC-LANL educational collaboration.
    Adler, BT.; Alfaro, LD.; Mola Velasco, SM.; Rosso, P.; West, AG. (2011). Wikipedia vandalism detection: combining natural language, metadata, and reputation features. In Computational Linguistics and Intelligent Text Processing. Springer Verlag (Germany). 6609:277-288. https://doi.org/10.1007/978-3-642-19437-5_23

    Intersubject Regularity in the Intrinsic Shape of Human V1

    Previous studies have reported considerable intersubject variability in the three-dimensional geometry of the human primary visual cortex (V1). Here we demonstrate that much of this variability is due to extrinsic geometric features of the cortical folds, and that the intrinsic shape of V1 is similar across individuals. V1 was imaged in ten ex vivo human hemispheres using high-resolution (200 μm) structural magnetic resonance imaging at high field strength (7 T). Manual tracings of the stria of Gennari were used to construct a surface representation, which was computationally flattened into the plane with minimal metric distortion. The intrinsic shape of V1 was determined from the boundary of the planar representation of the stria. An ellipse provided a simple parametric shape model that was a good approximation to the boundary of flattened V1. The aspect ratio of the best-fitting ellipse was found to be consistent across subjects, with a mean of 1.85 and standard deviation of 0.12. Optimal rigid alignment of size-normalized V1 produced greater overlap than that achieved by previous studies using different registration methods. A shape analysis of published macaque data indicated that the intrinsic shape of macaque V1 is also stereotyped, and similar to the human V1 shape. Previous measurements of the functional boundary of V1 in human and macaque are in close agreement with these results.
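    The aspect-ratio measurement described in this abstract can be illustrated with a minimal sketch. It is not the authors' fitting procedure: it assumes a moment-based estimate (the square root of the ratio of the covariance eigenvalues of the outline points), and the function names and the synthetic test outline are hypothetical. The estimate is exact only when the outline points are sampled evenly in the ellipse parameter; unevenly traced boundaries would bias it.

```python
import math


def aspect_ratio_from_boundary(points):
    """Estimate the aspect ratio of a closed planar outline from its second moments."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form eigenvalues of the symmetric 2x2 covariance matrix.
    mean = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    return math.sqrt((mean + spread) / (mean - spread))


def sample_ellipse(a, b, angle, n=360):
    """Hypothetical test outline: an ellipse with semi-axes a and b, rotated by angle."""
    pts = []
    for k in range(n):
        t = 2 * math.pi * k / n
        x, y = a * math.cos(t), b * math.sin(t)
        pts.append((x * math.cos(angle) - y * math.sin(angle),
                    x * math.sin(angle) + y * math.cos(angle)))
    return pts


# Synthetic outline whose aspect ratio is set to the mean reported in the abstract.
ratio = aspect_ratio_from_boundary(sample_ellipse(1.85, 1.0, math.radians(30)))
print(round(ratio, 3))
```

    Because the estimate depends only on the eigenvalues of the covariance matrix, it is unaffected by rotation and translation of the outline, which is the sense in which the shape is "intrinsic" in this sketch.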

    Simplified modeling of EM field coupling to complex cable bundles

    In this contribution, the "Equivalent Cable Bundle Method" is used to simplify large cable bundles, and it is extended to differential signal lines. The main focus is on the reduction of twisted-pair cables. Furthermore, the process presented here allows cables whose wires are situated quite close to each other to be taken into account. The procedure is based on a new approach to calculating the geometry of the simplified cable and uses the fact that the line parameters do not uniquely correspond to a certain geometry. For this reason, an optimization algorithm is applied.
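    The remark that line parameters do not uniquely determine a geometry can be seen in the simplest textbook case. The sketch below is not the Equivalent Cable Bundle Method itself: it assumes a single lossless round wire above a perfectly conducting ground plane in free space, for which the per-unit-length inductance and capacitance depend only on the ratio of height to radius. The function name and the example dimensions are hypothetical.

```python
import math

MU_0 = 4e-7 * math.pi       # vacuum permeability in H/m
EPS_0 = 8.8541878128e-12    # vacuum permittivity in F/m


def wire_over_ground(height, radius):
    """Return (L, C) per unit length for a round wire above a perfect ground plane.

    height is measured from the plane to the wire axis; both arguments are in metres.
    """
    if height <= radius:
        raise ValueError("the wire axis must lie more than one radius above the plane")
    k = math.acosh(height / radius)
    inductance = MU_0 / (2 * math.pi) * k      # H/m
    capacitance = 2 * math.pi * EPS_0 / k      # F/m
    return inductance, capacitance


# Two different geometries with the same height-to-radius ratio.
geom_a = wire_over_ground(height=0.05, radius=0.001)
geom_b = wire_over_ground(height=0.10, radius=0.002)
print(geom_a)
print(geom_b)
```

    Both geometries yield the same parameters, so a solver that is given only the target parameters has a family of valid geometries to choose from. This is why selecting one by optimization, as the abstract describes, is a natural choice.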

    Combining gemcitabine, oxaliplatin and capecitabine (GEMOXEL) for patients with advanced pancreatic carcinoma (APC): a phase I/II trial

    Background: Gemcitabine remains the mainstay of palliative treatment of advanced pancreatic carcinoma (APC). Adding capecitabine or a platinum derivative each significantly prolonged survival in recent meta-analyses. The purpose of this study was to determine dose, safety and preliminary efficacy of a first-line regimen combining all three classes of active cytotoxic drugs in APC. Patients and methods: Chemotherapy-naive patients with locally advanced or metastatic, histologically proven adenocarcinoma of the pancreas were treated with a 21-day regimen of gemcitabine [1000 mg/m² day (d) 1, d8], escalating doses of oxaliplatin (80-130 mg/m² d1) and capecitabine (650-800 mg/m² b.i.d. d1-d14). The recommended dose (RD), determined in the phase I part of the study by interpatient dose escalation in cohorts of three to six patients, was further studied in a two-stage phase II part with the primary end point of response rate by RECIST criteria. Results: Forty-five patients were treated with a total of 203 treatment cycles. Thrombocytopenia and diarrhea were the toxic effects limiting the dose to an RD of gemcitabine 1000 mg/m² d1, d8; oxaliplatin 130 mg/m² d1 and capecitabine 650 mg/m² b.i.d. d1-d14. Central independent radiological review showed partial remissions in 41% [95% confidence interval (CI) 26% to 56%] of patients and disease stabilization in 37% (95% CI 22% to 52%) of patients. Conclusion: This triple combination is feasible and clearly met the predefined efficacy criteria, warranting further investigation.

    ARE MUSCLE FORCES RELEVANT IN THE AGE RELATED RISE OF INJURIES IN ADOLESCENT SOCCER PLAYERS?

    The purpose of this study was to compare the kinematics and kinetics of soccer inside passes between three age groups (U12, U16, U23). Using 3D movement analysis and inverse dynamics, hip joint kinematics and adductor muscle forces were calculated. SPM analysis showed significant differences in adduction angle and velocity and in the muscle forces of adductor longus and gracilis. Comparison of the muscle forces shows a rapid increase from the youngest children to the adolescents, while the difference between the adolescents and adults is only minor. It seems reasonable that the fast development of muscle forces in adolescents, compared to the slower development of the tendons, is a factor in the sudden rise in injury incidence at the beginning of puberty. Therefore, adolescent players should be trained with caution to avoid early injuries.

    HIP JOINT LOAD AND MUSCLE STRESS IN SOCCER INSIDE PASSING

    Studies investigating the mechanisms of adductor injuries in soccer have concentrated on full effort kicks. The purpose of this study was a kinetic analysis of the inside pass. Using infrared cameras and inverse dynamics, hip joint moments and adductor muscle stress were calculated during the swing phase of the pass. Moments in the transverse plane were nearly as high as in full effort kicks reported previously. Muscle stress in the m. gracilis reached up to 450 kPa. Considering the repetitive nature of inside passes in modern soccer, adductor muscles undergo high loads in matches and training. This might help explain the high incidence of adductor injuries. Practitioners should therefore consider the load-recovery relation even in inside pass training. Specific strength training programs for the adductor and abductor muscle groups should be developed.

    Mathematical Modelling of Optical Coherence Tomography

    In this chapter a general mathematical model of Optical Coherence Tomography (OCT) is presented on the basis of the electromagnetic theory. OCT produces high resolution images of the inner structure of biological tissues. Images are obtained by measuring the time delay and the intensity of the backscattered light from the sample considering also the coherence properties of light. The scattering problem is considered for a weakly scattering medium located far enough from the detector. The inverse problem is to reconstruct the susceptibility of the medium given the measurements for different positions of the mirror. Different approaches are addressed depending on the different assumptions made about the optical properties of the sample. This procedure is applied to a full field OCT system and an extension to standard (time and frequency domain) OCT is briefly presented. Comment: 28 pages, 5 figures, book chapter

    Geometric representations for minimalist grammars

    We reformulate minimalist grammars as partial functions on term algebras for strings and trees. Using filler/role bindings and tensor product representations, we construct homomorphisms for these data structures into geometric vector spaces. We prove that the structure-building functions as well as simple processors for minimalist languages can be realized by piecewise linear operators in representation space. We also propose harmony, i.e. the distance of an intermediate processing step from the final well-formed state in representation space, as a measure of processing complexity. Finally, we illustrate our findings by means of two particular arithmetic and fractal representations. Comment: 43 pages, 4 figures
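    Filler/role binding with tensor products, as mentioned in this abstract, can be sketched generically. The example below is not the paper's arithmetic or fractal representation: it assumes small hand-picked filler vectors and orthonormal role vectors, and all names and values are hypothetical. Each filler is bound to a role by an outer product, the bindings are superposed by addition, and a filler is recovered by contracting the result with its role vector.

```python
def bind(filler, role):
    """Bind a filler vector to a role vector with the outer (tensor) product."""
    return [[f * r for r in role] for f in filler]


def superpose(a, b):
    """Add two bindings of the same shape elementwise."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def unbind(tensor, role):
    """Recover the filler bound to role; exact only when the roles are orthonormal."""
    return [sum(t * r for t, r in zip(row, role)) for row in tensor]


# Hypothetical fillers (symbols) and orthonormal roles (positions in a two-node tree).
fillers = {"the": [1.0, 0.0, 0.0], "cat": [0.0, 1.0, 0.0]}
roles = {"left": [1.0, 0.0], "right": [0.0, 1.0]}

state = superpose(bind(fillers["the"], roles["left"]),
                  bind(fillers["cat"], roles["right"]))

recovered_left = unbind(state, roles["left"])
recovered_right = unbind(state, roles["right"])
print(recovered_left, recovered_right)
```

    Binding, superposition and unbinding are all linear maps on the entries of the representation, which gives some intuition for why structure-building operations can be expressed as (piecewise) linear operators in the representation space.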