17 research outputs found

    Bot and gender detection of twitter accounts using distortion and LSA notebook for PAN at CLEF 2019

    In this work, we present our approach to the Author Profiling task of PAN 2019. The task is divided into two sub-problems, bot detection and gender detection, for two languages: English and Spanish. We address each sub-problem and each language differently. For bot detection, we use an ensemble architecture for accounts that write in English and a single SVM for those that write in Spanish. For gender detection, we use a single SVM architecture for both languages, but we pre-process the tweets differently. Our final models achieve accuracy above 90% in the bot detection task, and 84.17% and 77.61% in the gender detection task for English and Spanish, respectively.

    Cross-domain authorship attribution combining instance-based and profile-based features notebook for PAN at CLEF 2019

    Being able to identify the author of an unknown text is crucial. Although authorship attribution is a well-studied field, it remains an open problem, since a standard approach has yet to be found. In this notebook, we propose our model for the Authorship Attribution task of PAN 2019, which focuses on a cross-domain setting covering four languages: French, Italian, English, and Spanish. We use n-grams of characters, words, stemmed words, and distorted text. Our model uses an SVM for each feature type within an ensemble architecture. Our final results outperform the baseline given by PAN in almost every problem. With this model, we reached second place in the task with an F1-score of 68%.

    A Light in the Dark Web: Linking Dark Web aliases to real internet identities

    Most users have several Internet names. On Facebook or LinkedIn, for example, people usually appear under their real name. On other standard websites, such as forums, people often use aliases to protect their real identities from other users, with no real privacy against the website and the authorities. Aliases in the Dark Web are different: users expect strong identity protection. In this paper, we show that using both “open” aliases (aliases used in the standard Web) and Dark Web aliases can be dangerous per se. Indeed, we develop tools to link Dark Web aliases to open aliases. For the first time, we perform a massive-scale experiment on real scenarios, first between two Dark Web forums, then between the Dark Web forums and standard forums. Because of the large number of possible pairs, we first reduce the search space, cutting the number of potential matches down to a small set of candidates, and then select the correct alias among these candidates. We show that our methodology has excellent precision, from 87% to 94%, and recall around 80%.

    The parallel lives of autonomous systems: ASN allocations vs. BGP

    Autonomous Systems (ASes) exist in two dimensions on the Internet: the administrative and the operational one. Regional Internet Registries (RIRs) rule the former, while BGP rules the latter. In this work, we reconstruct the lives of the ASes on both dimensions, performing a joint analysis that covers 17 years of data. For the administrative dimension, we leverage delegation files published by RIRs to report the daily status of Internet resources they allocate. For the operational dimension, we characterize the temporal activity of ASNs in the Internet control plane using BGP data collected by the RouteViews and RIPE RIS projects. We present a methodology to extract insights about AS life cycles, including dealing with pitfalls affecting authoritative public datasets. We then perform a joint analysis to establish the relationship (or lack thereof) between these two dimensions for all allocated ASNs and all ASNs visible in BGP. We characterize the usual behaviors and the specific differences between RIRs and historical resources, and we measure the discrepancies between the two "parallel" lives. We find discrepancies and misalignments that reveal useful insights, and we highlight through examples the potential of this new lens to help pinpoint malicious BGP activity and various types of misconfigurations. This study illuminates a largely unexplored aspect of the Internet global routing system and provides methods and data to support broader studies that relate to security, policy, and network management.

    Map-following skills in left and right brain-damaged patients with and without hemineglect

    Map-following tasks require a "semantic interpretation" of the map, which could be affected by left brain damage, and "superimposition of the map upon the space," which could be compromised by right lesions and particularly by the presence of hemineglect. Participants followed a pathway depicted on a map of a real environment. The pathway included four left and four right turns. A legend explained the meaning of each symbol that appeared on the map. Our results showed no deficits in left brain-damaged patients, but poor performance in right brain-damaged patients affected by hemineglect. This deficit can be ascribed to their impaired egocentric frame of reference, but we cannot exclude a prevalent role of the right hemisphere in their use of the allocentric information on the map despite the presence of hemineglect. Indeed, three right brain-damaged patients without hemineglect showed a specific deficit in performing the task. We discuss the results in light of the possible impairment of the parietomedial temporal pathway, which supports spatial navigation and could be responsible for the patients' deficit.