5,124 research outputs found

    Cross-domain authorship attribution combining instance-based and profile-based features notebook for PAN at CLEF 2019

    Being able to identify the author of an unknown text is crucial. Although it is a well-studied field, it is still an open problem, since a standard approach has yet to be found. In this notebook, we propose our model for the Authorship Attribution task of PAN 2019, which focuses on a cross-domain setting covering four languages: French, Italian, English, and Spanish. We use n-grams of characters, words, stemmed words, and distorted text. Our model has an SVM for each feature, combined in an ensemble architecture. Our final results outperform the baseline given by PAN in almost every problem. With this model, we reach second place in the task with an F1-score of 68%.

    Data Science and Knowledge Discovery

    Data Science (DS) is gaining significant importance in the decision process due to a mix of various areas, including Computer Science, Machine Learning, Math and Statistics, domain/business knowledge, software development, and traditional research. In the business field, DS's application allows using scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data to support the decision process. After collecting the data, it is crucial to discover the knowledge. In this step, Knowledge Discovery (KD) tasks are used to create knowledge from structured and unstructured sources (e.g., text, data, and images). The output needs to be in a readable and interpretable format. It must represent knowledge in a manner that facilitates inferencing. KD is applied in several areas, such as education, health, accounting, energy, and public administration. This book includes fourteen excellent articles which discuss this trending topic and present innovative solutions to show the importance of Data Science and Knowledge Discovery to researchers, managers, industry, society, and other communities. The chapters address several topics, such as Data Mining, Deep Learning, Data Visualization and Analytics, Semantic Data, Geospatial and Spatio-Temporal Data, Data Augmentation, and Text Mining.

    Are Social Networks Watermarking Us or Are We (Unawarely) Watermarking Ourself?

    In the last decade, Social Networks (SNs) have deeply changed many aspects of society, and one of the most widespread behaviours is the sharing of pictures. However, malicious users often exploit shared pictures to create fake profiles, leading to the growth of cybercrime. Thus, keeping in mind this scenario, authorship attribution and verification through image watermarking techniques are becoming more and more important. In this paper, we first investigate how thirteen of the most popular SNs treat uploaded pictures in order to identify a possible implementation of image watermarking techniques by the respective SNs. Second, we test the robustness of several image watermarking algorithms on these thirteen SNs. Finally, we verify whether a method based on the Photo-Response Non-Uniformity (PRNU) technique, which is usually used in digital forensics or image forgery detection, can be successfully used as a watermarking approach for authorship attribution and verification of pictures on SNs. The proposed method is sufficiently robust, in spite of the fact that pictures are often downgraded during the process of uploading to the SNs. Moreover, in comparison to conventional watermarking methods, the proposed method can successfully pass through different SNs, solving related problems such as profile linking and fake profile detection. The results of our analysis on a real dataset of 8400 pictures show that the proposed method is more effective than other watermarking techniques and can help to address serious questions about privacy and security on SNs. Moreover, the proposed method paves the way for the definition of multi-factor online authentication mechanisms based on robust digital features.

    Meeting in the Middle: Towards Successful Multidisciplinary Bioimage Analysis Collaboration

    With an increase in subject knowledge expertise required to solve specific biological questions, experts from different fields need to collaborate to address increasingly complex issues. To successfully collaborate, everyone involved in the collaboration must take steps to "meet in the middle". We thus present a guide on truly cross-disciplinary work using bioimage analysis as a showcase, where it is required that the expertise of biologists, microscopists, data analysts, clinicians, engineers, and physicists meet. We discuss considerations and best practices from the perspective of both users and technology developers, while offering suggestions for working together productively and how this can be supported by institutes and funders. Although this guide uses bioimage analysis as an example, the guiding principles of these perspectives are widely applicable to other cross-disciplinary work.

    A systematic survey of online data mining technology intended for law enforcement

    As an increasing amount of crime takes on a digital aspect, law enforcement bodies must tackle an online environment generating huge volumes of data. With manual inspections becoming increasingly infeasible, law enforcement bodies are optimising online investigations through data-mining technologies. Such technologies must be well designed and rigorously grounded, yet no survey of the online data-mining literature exists which examines their techniques, applications and rigour. This article remedies this gap through a systematic mapping study describing online data-mining literature which visibly targets law enforcement applications, using evidence-based practices in survey making to produce a replicable analysis which can be methodologically examined for deficiencies.

    Three-dimensional multiphase flow computational fluid dynamics models for proton exchange membrane fuel cell: a theoretical development

    A review of published three-dimensional, computational fluid dynamics models for proton exchange membrane fuel cells that account for multiphase flow is presented. The models can be categorized as models for transport phenomena, geometry or operating condition effects, and thermal effects. The influences of heat and water management on the fuel cell performance have been repeatedly addressed, and these still remain two central issues in proton exchange membrane fuel cell technology. The strengths and weaknesses of the models, the modelling assumptions, and the model validation are discussed. The salient numerical features of the models are examined, and an overview of the most commonly used computational fluid dynamics codes for the numerical modelling of proton exchange membrane fuel cells is given. Comprehensive three-dimensional multiphase flow computational fluid dynamics models accounting for the major transport phenomena inside a complete cell have been developed. However, it has been noted that more research is required to develop models that include, among other things, the detailed composition and structure of the catalyst layers, the effects of water droplet movement in the gas flow channels, the consideration of phase change in both the anode and the cathode sides of the fuel cell, and dissolved water transport.