48 research outputs found

    Shall I post this now? Optimized, delay-based privacy protection in social networks

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10115-016-1010-4. Despite the several advantages commonly attributed to social networks, such as the ease and immediacy of communicating with acquaintances and friends, they also pose significant privacy threats when inexperienced or even irresponsible users recklessly publish sensitive material. A different but equally significant privacy risk arises when social networks profile the online activity of their users based on the timestamps of the users' interactions with the network. To thwart this commonly neglected type of attack, this paper proposes an optimized deferral mechanism for messages in online social networks. The solution intelligently delays certain messages posted by end users so that the online activity profile observed by the attacker does not reveal any time-based sensitive information, while preserving the usability of the system. Experimental results, as well as a proposed architecture implementing this approach, demonstrate the suitability and feasibility of our mechanism. Peer Reviewed. Postprint (author's final draft)

    Identifying Experts in Question & Answer Portals: A Case Study on Data Science Competencies in Reddit

    The key to the success of Question & Answer (Q&A) platforms is their users, who provide high-quality answers to challenging questions posted across various topics of interest. Recently, the expert finding problem has attracted much attention in information retrieval research. In this work, we inspect the feasibility of a supervised learning model to identify data science experts in Reddit. Our method is based on manual coding results in which two data science experts labelled comments as expert, non-expert or out-of-scope. We present a semi-supervised approach using the activity behaviour of every user, including Natural Language Processing (NLP), crowdsourced and user feature sets. We conclude that the NLP and user feature sets contribute the most to the identification of these three classes, which means that this method can generalise well within the domain. Moreover, we present different types of users, which can be helpful to detect various types of users in the future.

    A Survey on Data-Driven Evaluation of Competencies and Capabilities Across Multimedia Environments

    The rapid evolution of technology directly impacts the skills and jobs needed in the next decade. Users can, intentionally or unintentionally, develop different skills by creating, interacting with, and consuming the content from online environments and portals where informal learning can emerge. These environments generate large amounts of data; therefore, big data can have a significant impact on education. Moreover, the educational landscape has been shifting from a focus on contents to a focus on competencies and capabilities that will prepare our society for an unknown future during the 21st century. Therefore, the main goal of this literature survey is to examine diverse technology-mediated environments that can generate rich data sets through the users’ interaction and where data can be used to explicitly or implicitly perform a data-driven evaluation of different competencies and capabilities. We thoroughly and comprehensively surveyed the state of the art to identify and analyse digital environments, the data they are producing and the capabilities they can measure and/or develop. Our survey revealed four key multimedia environments that include sites for content sharing & consumption, video games, online learning and social networks that fulfilled our goal. Moreover, different methods were used to measure a large array of diverse capabilities such as expertise, language proficiency and soft skills. Our results prove the potential of the data from diverse digital environments to support the development of lifelong and lifewide 21st-century capabilities for the future society

    Fundamentos de Programación: Catálogo de ejercicios y soluciones (Programming Fundamentals: Catalog of Exercises and Solutions)

    ©2022. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0

    Identifying Professional Photographers Through Image Quality and Aesthetics in Flickr

    The use of social media, and specifically of photo and video sharing platforms, has risen markedly in recent years. These sites have proved their ability to yield rich data sets through their users' interaction, which can be used to perform a data-driven evaluation of capabilities. Nevertheless, this study reveals a lack of suitable data sets and evaluation processes for photo and video sharing platforms. Our first contribution is therefore the creation of one of the largest labelled multimodal data sets from Flickr, which has been open sourced as part of this contribution. Based on these data, we explored machine learning models and concluded that it is feasible to predict whether a user is a professional photographer or not from self-reported occupation labels and several feature representations drawn from the user, photo and crowdsourced sets. We also examined the relationship between the aesthetics and technical quality of a picture and the social activity around that picture. Finally, we described which characteristics differentiate professional photographers from non-professionals. As far as we know, the results presented in this work represent an important novelty in the identification of users' expertise, which researchers from various domains can use for different applications.

    COSMOS: Collaborative, Seamless and Adaptive Sentinel for the Internet of Things

    The Internet of Things (IoT) became established during the last decade as an emerging technology with considerable potential and applicability. Its paradigm of everything connected together has penetrated the real world, with smart devices embedded in many everyday appliances. Such intelligent objects are able to communicate autonomously through existing network infrastructures, thus producing a tighter integration between the real world and computer-based systems. On the downside, the great benefit that the IoT paradigm brings to our lives comes with severe security issues, since the information exchanged among the objects frequently remains unprotected from malicious attackers. The paper at hand proposes COSMOS (Collaborative, Seamless and Adaptive Sentinel for the Internet of Things), a novel sentinel to protect smart environments from cyber threats. Our sentinel shields the IoT devices using multiple defensive rings, resulting in more accurate and robust protection. Additionally, we discuss the current deployment of the sentinel on a commodity device (i.e., Raspberry Pi). Exhaustive experiments are conducted on the sentinel, demonstrating that it performs reliably even under heavy stress. Each defensive layer is tested and reaches a remarkable performance, thus proving the applicability of COSMOS in a distributed and dynamic scenario such as the IoT. To make the proposed sentinel easier to use, we further developed a friendly and easy-to-use COSMOS App, so that end-users can manage sentinel(s) directly from their own devices (e.g., smartphone).

    A Big Data Architecture for Early Identification and Categorization of Dark Web Sites

    The dark web has become notorious for its association with illicit activities, and there is a growing need for systems to automate the monitoring of this space. This paper proposes an end-to-end scalable architecture for the early identification of new Tor sites and the daily analysis of their content. The solution is built using an Open Source Big Data stack for data serving with Kubernetes, Kafka, Kubeflow, and MinIO. It continuously discovers onion addresses in different sources (threat intelligence, code repositories, web-Tor gateways, and Tor repositories), downloads the HTML from Tor, deduplicates the content using MinHash LSH, and categorizes it with BERTopic modeling (SBERT embedding, UMAP dimensionality reduction, HDBSCAN document clustering and c-TF-IDF topic keywords). In 93 days, the system identified 80,049 onion services and characterized 90% of them, addressing the challenge of Tor volatility. A disproportionate amount of repeated content is found, with only 6.1% unique sites. From the HTML files of the dark sites, 31 different low-level topics are extracted, manually labeled, and grouped into 11 high-level topics. The five most popular included sexual and violent content, repositories, search engines, carding, cryptocurrencies, and marketplaces. During the experiments, we identified 14 sites with 13,946 clones that shared a suspiciously similar mirroring rate per day, suggesting an extensive common phishing network. Among the related works, this study is the most representative characterization of onion services based on topics to date.
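
    The deduplication step named in this abstract (MinHash LSH) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the shingle size, signature length, band count and all function names below are assumptions chosen for illustration only.

```python
import hashlib
import re

NUM_HASHES = 64  # assumed signature length, not taken from the paper


def shingles(text, k=5):
    """Return the set of k-word shingles of a text (k is an assumed value)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed, token):
    """Seeded 64-bit hash; each seed acts as one independent hash function."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(text, num_hashes=NUM_HASHES, k=5):
    """MinHash signature: the minimum hash of the shingle set per seed."""
    sh = shingles(text, k)
    return tuple(min(_hash(seed, s) for s in sh) for seed in range(num_hashes))


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching positions estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def lsh_buckets(signatures, bands=16):
    """Group document ids whose signatures agree on at least one full band."""
    if not signatures:
        return []
    rows = len(next(iter(signatures.values()))) // bands
    buckets = {}
    for doc_id, sig in signatures.items():
        for b in range(bands):
            key = (b, sig[b * rows:(b + 1) * rows])
            buckets.setdefault(key, set()).add(doc_id)
    return [ids for ids in buckets.values() if len(ids) > 1]
```

    Documents that land in the same bucket are candidate duplicates, and `estimated_jaccard` can then confirm each pair. More bands with fewer rows per band make the candidate search more permissive; the values above are arbitrary.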

    Mobility in collaborative alert systems building trust through reputation

    Part 3: WCNS 2011 Workshop; International audience. Collaborative Intrusion Detection Networks (CIDN) are usually composed of a set of nodes working together to detect distributed intrusions that cannot be easily recognized with traditional intrusion detection architectures. In this approach, every node could potentially collaborate to provide its vision of the system and report the alarms being detected at the network, service and/or application levels. This approach includes considering mobile nodes that will be entering and leaving the network in an ad hoc manner. However, for this alert information to be useful in the context of CIDNs, certain trust and reputation mechanisms are needed to determine the credibility of a particular mobile node and of the alerts it provides. This is the main objective of this paper, which presents an inter-domain trust and reputation model, together with an architecture for inter-domain collaboration, with the main aim of improving the detection accuracy in CIDN systems while users move from one security domain to another. Document type: Part of book or chapter of book

    SCORPION Cyber Range: Fully Customizable Cyberexercises, Gamification and Learning Analytics to Train Cybersecurity Competencies

    It is undeniable that we are witnessing an unprecedented digital revolution. However, recent years have been characterized by an explosion of cyberattacks, making cybercrime one of the most profitable businesses on the planet. That is why training in cybersecurity is increasingly essential to protect the assets of cyberspace. One of the most vital tools to train cybersecurity competencies is the Cyber Range, a virtualized environment that simulates realistic networks. The paper at hand introduces SCORPION, a fully functional and virtualized Cyber Range, which manages the authoring and automated deployment of scenarios. In addition, SCORPION includes several elements to improve student motivation, such as a gamification system with medals, points, and rankings, among other elements. This gamification system includes an adaptive learning module that adapts the cyberexercise based on the users' performance. Moreover, SCORPION leverages learning analytics that collect and process telemetric and biometric user data, including heart rate through a smartwatch, which are available to instructors through a dashboard. Finally, we developed a case study where SCORPION obtained 82.10% in usability and 4.57 out of 5 in usefulness from the viewpoint of a student and an instructor. The positive evaluation results are promising, indicating that SCORPION can become an effective, motivating, and advanced cybersecurity training tool to help fill current gaps in this context. Comment: 31 pages