17 research outputs found

    Gender impacts on rural cycling decisions : a case study of Bugesera district, Rwanda

    Papers presented virtually at the 41st International Southern African Transport Conference on 10-13 July 2023. In recent years, cycling mobility has attracted increasing interest from researchers. However, most available data on bicycling has focused on transport planning and policy development to address urban issues. Only a few studies have sought to understand rural cyclists' daily mobility decisions. The Global Positioning System (GPS) is an innovative tool that addresses spatial differences, even from a gender perspective. The study, therefore, assesses the impact of men's and women's cycling decisions in rural communities of low-income countries. The study targeted bike taxi customers and owners. Fifty participants of different genders, social backgrounds and occupations were recruited and handed a GPS device to collect their travel tracks in Nyamata and Mayange, Bugesera, Rwanda. The tracks collected contained road network data, distance (origin-destination), elevation and speed. Additional information was obtained through a survey, focus group discussions and mapping of participants' daily activities. Limited gender inequality between male and female cyclists confirms that travelling speed has no impact on cycling decisions, reinforcing the notion that cultural norms and the lack of bicycle education, among many others, are the main barriers to more female cycling in Sub-Saharan Africa. Creating policies that encourage bicycle education at the school level, and teaching the value of cycling for health and the environment, will help destigmatize cycling and remove cultural norms and restrictions.

    Survey of Educational Modelling Languages (EMLs)

    CEN/ISSS WS/LT Learning Technologies Workshop. The report compares several approaches to educational modelling. The work has been performed under the umbrella of the CEN/ISSS, the European workshop for learning technologies (http://dspace.ou.nl/bitstream/1820/227/2/eml-report-cen-isss.pdf).

    Overview of PAN 2018. Author identification, author profiling, and author obfuscation

    PAN 2018 explores several authorship analysis tasks enabling a systematic comparison of competitive approaches and advancing research in digital text forensics. More specifically, this edition of PAN introduces a shared task in cross-domain authorship attribution, where texts of known and unknown authorship belong to distinct domains, and another task in style change detection that distinguishes between single author and multi-author texts. In addition, a shared task in multimodal author profiling examines, for the first time, a combination of information from both texts and images posted by social media users to estimate their gender. Finally, the author obfuscation task studies how a text by a certain author can be paraphrased so that existing author identification tools are confused and cannot recognize the similarity with other texts of the same author. New corpora have been built to support these shared tasks. A relatively large number of software submissions (41 in total) was received and evaluated. Best paradigms are highlighted while baselines indicate the pros and cons of submitted approaches. The work at the Universitat Politècnica de València was funded by the MINECO research project SomEMBED (TIN2015-71147-C2-1-P). Stamatatos, E.; Rangel-Pardo, FM.; Tschuggnall, M.; Stein, B.; Kestemont, M.; Rosso, P.; Potthast, M. (2018). Overview of PAN 2018. Author identification, author profiling, and author obfuscation. Lecture Notes in Computer Science. 11018:267-285. https://doi.org/10.1007/978-3-319-98932-7_25

    Feature-based Time Series Analytics

    Time series analytics is a fundamental prerequisite for decision-making as well as automation and occurs in several applications such as energy load control, weather research, and consumer behavior analysis. It encompasses time series engineering, i.e., the representation of time series exhibiting important characteristics, and data mining, i.e., the application of the representation to a specific task. Due to the exhaustive data gathering, which results from the "Industry 4.0" vision and its shift towards automation and digitalization, time series analytics is undergoing a revolution. Big datasets with very long time series are gathered, which is challenging for engineering techniques. Traditionally, one focus has been on raw-data-based or shape-based engineering. These techniques assess the time series' similarity in shape, which is only suitable for short time series. Another focus has been on model-based engineering. It assesses the time series' similarity in structure, which is suitable for long time series but requires larger models or time-consuming modeling. Feature-based engineering tackles these challenges by efficiently representing time series and comparing their similarity in structure. However, current feature-based techniques are unsatisfactory as they are designed for specific data-mining tasks. In this work, we introduce a novel feature-based engineering technique. It efficiently provides a short representation of time series, focusing on their structural similarity. Based on a design rationale, we derive important time series characteristics such as the long-term and cyclically repeated characteristics as well as distribution and correlation characteristics. Moreover, we define a feature-based distance measure for their comparison. Both the representation technique and the distance measure provide desirable properties regarding storage and runtime. Subsequently, we introduce techniques based on our feature-based engineering and apply them to important data-mining tasks such as time series generation, time series matching, time series classification, and time series clustering. First, our feature-based generation technique outperforms state-of-the-art techniques regarding the accuracy of evolved datasets. Second, with our features, a matching method retrieves a match for a time series query much faster than with current representations. Third, our features provide discriminative characteristics to classify datasets as accurately as state-of-the-art techniques, but orders of magnitude faster. Finally, our features recommend an appropriate clustering of time series, which is crucial for subsequent data-mining tasks. All these techniques are assessed on datasets from the energy, weather, and economic domains and thus demonstrate the applicability to real-world use cases. The findings demonstrate the versatility of our feature-based engineering and suggest several courses of action in order to design and improve analytical systems for the paradigm shift of Industry 4.0.

    Social search in collaborative tagging networks : the role of ties

    [no abstract]

    Prediction of user behaviour on the web

    The Web has become a ubiquitous environment for human interaction, communication, and data sharing. As a result, large amounts of data are produced. This data can be utilised by building predictive models of user behaviour in order to support business decisions. However, the fast pace of modern businesses is putting pressure on industry to provide faster and better decisions. This thesis addresses this challenge by proposing a novel methodology for efficient prediction of user behaviour. The problems concerned are: (i) modelling user behaviour on the Web, (ii) choosing and extracting features from data generated by user behaviour, and (iii) choosing a Machine Learning (ML) set-up for efficient prediction. First, a novel Time-Varying Attributed Graph (TVAG) is introduced and then a TVAG-based model for modelling user behaviour on the Web is proposed. TVAGs capture temporal properties of user behaviour through the time-varying component of the features of the graph nodes and edges. Second, the proposed model allows features to be extracted for further ML predictions. However, extracting the features and building the model may be an unacceptably hard and long process. Thus, a guideline for efficient feature extraction from the TVAG-based model is proposed. Third, a method for choosing an ML set-up to build an accurate and fast predictive model is proposed and evaluated. Finally, a deep learning architecture for predicting user behaviour on the Web is proposed and evaluated. To sum up, the main contribution to knowledge of this work is in developing the methodology for fast and efficient predictions of user behaviour on the Web. The methodology is evaluated on datasets from a few Web platforms, namely Stack Exchange, Twitter, and Facebook.

    Helmholtz Portfolio Theme Large-Scale Data Management and Analysis (LSDMA)

    The Helmholtz Association funded the "Large-Scale Data Management and Analysis" portfolio theme from 2012 to 2016. Four Helmholtz centres, six universities and another research institution in Germany joined to enable data-intensive science by optimising data life cycles in selected scientific communities. In our Data Life Cycle Labs, data experts performed joint R&D together with scientific communities. The Data Services Integration Team focused on generic solutions applied by several communities.