Wikipedia vandalism detection: combining natural language, metadata, and reputation features
Wikipedia is an online encyclopedia which anyone can edit.
While most edits are constructive, about 7% are acts of vandalism. Such
behavior is characterized by modifications made in bad faith, introducing
spam and other inappropriate content.
In this work, we present the results of an effort to integrate three of the
leading approaches to Wikipedia vandalism detection: a spatio-temporal
analysis of metadata (STiki), a reputation-based system (WikiTrust),
and natural language processing features. The performance of the resulting
joint system improves on the state of the art set by all previous methods
and establishes a new baseline for Wikipedia vandalism detection. We
examine in detail the contribution of the three approaches, both for the
task of discovering fresh vandalism, and for the task of locating vandalism
in the complete set of Wikipedia revisions.
The authors from Universitat Politècnica de València also thank the MICINN research project TEXT-ENTERPRISE 2.0 TIN2009-13391-C04-03 (Plan I+D+i). UPenn contributions were supported in part by ONR MURI N00014-07-1-0907. This research was partially supported by award 1R01GM089820-01A1 from the National Institute of General Medical Sciences, and by ISSDM, a UCSC-LANL educational collaboration.
Adler, BT.; Alfaro, LD.; Mola Velasco, SM.; Rosso, P.; West, AG. (2011). Wikipedia vandalism detection: combining natural language, metadata, and reputation features. In Computational Linguistics and Intelligent Text Processing. Springer Verlag (Germany). 6609:277-288. https://doi.org/10.1007/978-3-642-19437-5_23
Intersubject Regularity in the Intrinsic Shape of Human V1
Previous studies have reported considerable intersubject variability in the three-dimensional geometry of the human primary visual cortex (V1). Here we demonstrate that much of this variability is due to extrinsic geometric features of the cortical folds, and that the intrinsic shape of V1 is similar across individuals. V1 was imaged in ten ex vivo human hemispheres using high-resolution (200 μm) structural magnetic resonance imaging at high field strength (7 T). Manual tracings of the stria of Gennari were used to construct a surface representation, which was computationally flattened into the plane with minimal metric distortion. The intrinsic shape of V1 was determined from the boundary of the planar representation of the stria. An ellipse provided a simple parametric shape model that was a good approximation to the boundary of flattened V1. The aspect ratio of the best-fitting ellipse was found to be consistent across subjects, with a mean of 1.85 and standard deviation of 0.12. Optimal rigid alignment of size-normalized V1 produced greater overlap than that achieved by previous studies using different registration methods. A shape analysis of published macaque data indicated that the intrinsic shape of macaque V1 is also stereotyped, and similar to the human V1 shape. Previous measurements of the functional boundary of V1 in human and macaque are in close agreement with these results.
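The aspect-ratio measurement described above can be illustrated with a minimal sketch. The abstract does not say how the best-fitting ellipse was computed, so the moment-based estimate below is an assumption, and the function names `aspect_ratio` and `demo_ellipse` are hypothetical. The estimate is exact only for boundary points sampled evenly in the ellipse's parametric angle.

```python
import numpy as np

def aspect_ratio(points):
    """Estimate the major/minor axis ratio of a planar boundary.

    Assumption: the boundary is roughly elliptical and sampled evenly in
    the parametric angle, so the covariance eigenvalues are proportional
    to the squared semi-axes.
    """
    pts = np.asarray(points, dtype=float)
    cov = np.cov(pts.T)                # 2x2 covariance of the x, y coordinates
    evals = np.linalg.eigvalsh(cov)    # eigenvalues in ascending order
    return float(np.sqrt(evals[1] / evals[0]))

def demo_ellipse(ratio=1.85, angle=0.3, n=400):
    """Synthetic rotated ellipse with the given aspect ratio (illustration only)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    xy = np.stack([ratio * np.cos(t), np.sin(t)], axis=1)
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return xy @ rot.T

if __name__ == "__main__":
    print(aspect_ratio(demo_ellipse()))
```

Because the estimate depends only on the ratio of the covariance eigenvalues, it is unaffected by rotation and by uniform scaling of the boundary.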
Simplified modeling of EM field coupling to complex cable bundles
In this contribution, the "Equivalent Cable Bundle Method" is
used to simplify large cable bundles, and it is extended to
differential signal lines. The main focus is on the reduction
of twisted-pair cables. Furthermore, the process presented here makes it
possible to take into account cables whose wires lie close to each
other. The procedure is based on a new approach to calculating the geometry of
the simplified cable and exploits the fact that the line parameters do not
uniquely correspond to a certain geometry. For this reason, an optimization
algorithm is applied.
Observation error statistics for Doppler radar radial wind superobservations assimilated into the DWD COSMO-KENDA system
Currently in operational numerical weather prediction (NWP) the density of high-resolution observations, such as Doppler radar radial winds (DRWs), is severely reduced, in part to avoid violating the assumption of uncorrelated observation errors. Improving the quantity of observations used and the impact that they have on the forecast requires an accurate specification of the observation uncertainties. Observation uncertainties can be estimated using a simple diagnostic that utilises the statistical averages of observation-minus-background and observation-minus-analysis residuals. We are the first to use a modified form of the diagnostic to estimate spatial correlations for observations used in an operational ensemble data assimilation system. The uncertainties for DRW superobservations assimilated into the Deutscher Wetterdienst convection-permitting NWP model are estimated and compared to previous uncertainty estimates for DRWs. The new results show that most diagnosed standard deviations are smaller than those used in the assimilation; hence it may be feasible to assimilate DRWs using reduced error standard deviations. However, some of the estimated standard deviations are considerably larger than those used in the assimilation; these large errors highlight areas where the observation processing system may be improved. The error correlation length scales are larger than the observation separation distance and influenced by both the superobbing procedure and observation operator. This is supported by comparing these results to our previous study using Met Office data. Our results suggest that DRW error correlations may be reduced by improving the superobbing procedure and observation operator; however, any remaining correlations should be accounted for in the assimilation.
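The residual-based diagnostic mentioned above can be sketched in its standard form, commonly attributed to Desroziers et al.: the observation error covariance is estimated as the average outer product of the observation-minus-analysis and observation-minus-background residuals. The abstract does not specify the modified form used for the ensemble system, so this is only an illustration. The synthetic setup (two observations, an identity observation operator, an optimal gain) and the names `desroziers_R` and `simulate` are assumptions.

```python
import numpy as np

def desroziers_R(d_ob, d_oa):
    """Estimate R as the sample mean of d_oa d_ob^T (one sample per row)."""
    d_ob = np.asarray(d_ob, dtype=float)
    d_oa = np.asarray(d_oa, dtype=float)
    return d_oa.T @ d_ob / d_ob.shape[0]

def simulate(n_samples=200_000, seed=0):
    """Synthetic check with known covariances (all values are made up)."""
    rng = np.random.default_rng(seed)
    B = np.array([[1.0, 0.3], [0.3, 1.0]])   # background error covariance
    R = np.array([[0.5, 0.2], [0.2, 0.5]])   # true observation error covariance
    eb = rng.multivariate_normal(np.zeros(2), B, size=n_samples)
    eo = rng.multivariate_normal(np.zeros(2), R, size=n_samples)
    d_ob = eo - eb                            # observation minus background (H = I)
    K = B @ np.linalg.inv(B + R)              # optimal gain for this toy system
    d_oa = d_ob @ (np.eye(2) - K).T           # observation minus analysis
    return R, desroziers_R(d_ob, d_oa)

if __name__ == "__main__":
    R_true, R_est = simulate()
    print(np.round(R_est, 3))
```

In this toy setting the gain is built from the true covariances, so the estimate converges to the true matrix, including its off-diagonal entries. In an operational system the assumed statistics are imperfect, and the diagnosed matrix is best read as an indication of where they may be wrong.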
Combining gemcitabine, oxaliplatin and capecitabine (GEMOXEL) for patients with advanced pancreatic carcinoma (APC): a phase I/II trial
Background: Gemcitabine remains the mainstay of palliative treatment of advanced pancreatic carcinoma (APC). Adding capecitabine or a platinum derivative each significantly prolonged survival in recent meta-analyses. The purpose of this study was to determine dose, safety and preliminary efficacy of a first-line regimen combining all three classes of active cytotoxic drugs in APC. Patients and methods: Chemotherapy-naive patients with locally advanced or metastatic, histologically proven adenocarcinoma of the pancreas were treated with a 21-day regimen of gemcitabine [1000 mg/m² day (d) 1, d8], escalating doses of oxaliplatin (80-130 mg/m² d1) and capecitabine (650-800 mg/m² b.i.d. d1-d14). The recommended dose (RD), determined in the phase I part of the study by interpatient dose escalation in cohorts of three to six patients, was further studied in a two-stage phase II part with the primary end point of response rate by RECIST criteria. Results: Forty-five patients were treated with a total of 203 treatment cycles. Thrombocytopenia and diarrhea were the toxic effects limiting the dose to an RD of gemcitabine 1000 mg/m² d1, d8; oxaliplatin 130 mg/m² d1 and capecitabine 650 mg/m² b.i.d. d1-d14. Central independent radiological review showed partial remissions in 41% [95% confidence interval (CI) 26% to 56%] of patients and disease stabilization in 37% (95% CI 22% to 52%) of patients. Conclusion: This triple combination is feasible and clearly met the predefined efficacy criteria, warranting further investigation.
ARE MUSCLE FORCES RELEVANT IN THE AGE RELATED RISE OF INJURIES IN ADOLESCENT SOCCER PLAYERS?
The purpose of this study was to compare the kinematics and kinetics of soccer inside passes between three age groups (U12, U16, U23). Using 3D movement analysis and inverse dynamics, hip joint kinematics and adductor muscle forces were calculated. SPM analysis showed significant differences in adduction angle and velocity and in the muscle forces of adductor longus and gracilis. Comparison of the muscle forces shows a rapid increase from the youngest children to the adolescents, while the difference between the adolescents and adults is only minor. It seems reasonable that the fast development of muscle forces in adolescents, compared to the slower development of the tendons, is a factor in the sudden rise in injury incidence at the beginning of puberty. Therefore, adolescent players should be trained with caution to avoid early injuries.
HIP JOINT LOAD AND MUSCLE STRESS IN SOCCER INSIDE PASSING
Studies investigating the mechanisms of adductor injuries in soccer have concentrated on full effort kicks. The purpose of this study was a kinetic analysis of the inside pass. Using infrared cameras and inverse dynamics, hip joint moments and adductor muscle stress were calculated during the swing phase of the pass. Moments in the transverse plane were nearly as high as in full effort kicks reported previously. Muscle stress in the m. gracilis reached up to 450 kPa. Considering the repetitive nature of inside passes in modern soccer, adductor muscles undergo high loads in matches and training. This might help explain the high incidence of adductor injuries. Practitioners should therefore consider the load-recovery relation even in inside pass training. Specific strength training programs for the adductor and abductor muscle groups should be developed.
Mathematical Modelling of Optical Coherence Tomography
In this chapter a general mathematical model of Optical Coherence Tomography
(OCT) is presented on the basis of the electromagnetic theory. OCT produces
high resolution images of the inner structure of biological tissues. Images are
obtained by measuring the time delay and the intensity of the backscattered
light from the sample considering also the coherence properties of light. The
scattering problem is considered for a weakly scattering medium located far
enough from the detector. The inverse problem is to reconstruct the
susceptibility of the medium given the measurements for different positions of
the mirror. Different approaches are addressed depending on the different
assumptions made about the optical properties of the sample. This procedure is
applied to a full field OCT system and an extension to standard (time and
frequency domain) OCT is briefly presented.
Comment: 28 pages, 5 figures, book chapter
Geometric representations for minimalist grammars
We reformulate minimalist grammars as partial functions on term algebras for
strings and trees. Using filler/role bindings and tensor product
representations, we construct homomorphisms for these data structures into
geometric vector spaces. We prove that the structure-building functions as well
as simple processors for minimalist languages can be realized by piecewise
linear operators in representation space. We also propose harmony, i.e. the
distance of an intermediate processing step from the final well-formed state in
representation space, as a measure of processing complexity. Finally, we
illustrate our findings by means of two particular arithmetic and fractal
representations.
Comment: 43 pages, 4 figures
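The filler/role binding idea mentioned above can be illustrated with a minimal sketch of a tensor product representation for strings. It assumes one-hot (orthonormal) filler and role vectors, which is a simplification and not the arithmetic or fractal representations of the paper. The names `encode` and `decode` are hypothetical.

```python
import numpy as np

def encode(symbols, alphabet, n_roles):
    """Bind each symbol (filler) to its position (role) and sum the outer products."""
    fillers = np.eye(len(alphabet))    # one-hot filler vectors, one per symbol
    roles = np.eye(n_roles)            # one-hot role vectors, one per position
    T = np.zeros((len(alphabet), n_roles))
    for pos, sym in enumerate(symbols):
        T += np.outer(fillers[alphabet.index(sym)], roles[pos])
    return T

def decode(T, alphabet):
    """Unbind each role by contracting with its role vector, then read off the filler."""
    roles = np.eye(T.shape[1])
    out = []
    for pos in range(T.shape[1]):
        filler = T @ roles[pos]        # exact recovery because the roles are orthonormal
        if filler.any():               # positions with no bound symbol stay empty
            out.append(alphabet[int(np.argmax(filler))])
    return "".join(out)

if __name__ == "__main__":
    T = encode("cab", "abc", n_roles=4)
    print(decode(T, "abc"))
```

Because binding and unbinding are linear in the representation, operations on the symbolic structure correspond to linear maps on the encoded vectors. That is the setting in which piecewise linear operators on representation space become natural.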