24 research outputs found

    Cluster-based traffic filtering as a defense against distributed denial-of-service attacks (Klusterointipohjainen liikenteensuodatus puolustuksena hajautettuja palvelunestohyökkäyksiä vastaan)

    Distributed Denial of Service (DDoS) attacks are considered one of the major security threats in the current Internet. Although many solutions have been suggested for DDoS defense, real progress in fighting those attacks is still missing. In this work, we analyze and experiment with cluster-based filtering for DDoS defense. In cluster-based filtering, unsupervised learning is used to create a normal profile of the network traffic, and the filter for DDoS attacks is then based on this normal profile. We focus on the scenario in which the cluster-based filter is deployed at the target network and serves for proactive or reactive defense. A game-theoretic model is created for the scenario, making it possible to model the defender and attacker strategies as mathematical optimization tasks. The obtained optimal strategies are then experimentally evaluated. In the testbed setup, the hierarchical heavy hitters (HHH) algorithm is applied to traffic clustering and the Differentiated Services (DiffServ) quality-of-service (QoS) architecture is used for deploying the cluster-based filter on a Linux router. The theoretical results suggest that cluster-based filtering is an effective method for DDoS defense, unless the attacker is able to send traffic which perfectly imitates the normal traffic distribution. The experimental outcome confirms the theoretical results and shows the high effectiveness of cluster-based filtering in proactive and reactive DDoS defense.

    Distributed denial-of-service attacks are one of the greatest security challenges of today's Internet. Although numerous defense mechanisms have been developed against these attacks, none of them offers complete protection. This work studies clustering-based traffic filtering and its use as a defense against denial-of-service attacks. In cluster-based filtering, the filter learns the normal traffic distributions on its own. These distributions can then be used to filter out the excess traffic caused by a denial-of-service attack. The master's thesis studies a scenario in which both proactive and reactive cluster-based defense methods are used. In addition, a game-theoretic model of the scenario is formulated, which makes it possible to study different attack and defense methods analytically. The analytical results are evaluated experimentally on a Linux router using the Hierarchical Heavy Hitter clustering algorithm and the DiffServ architecture. The theoretical results of the thesis show that clustering-based filtering is an effective defense against denial-of-service attacks unless the attacker is able to imitate the normal traffic distribution while carrying out the attack. The experimental results confirm the theoretical results and demonstrate the effectiveness of cluster-based filtering against denial-of-service attacks.
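
    The abstract names hierarchical heavy hitters (HHH) as the traffic clustering step but gives no implementation detail. The following is a minimal sketch of an exact, offline, one-dimensional HHH over IPv4 source prefixes, not the thesis's implementation. The function name `hierarchical_heavy_hitters`, the prefix levels, and the threshold are assumptions chosen for illustration.

```python
from collections import Counter
from ipaddress import ip_network


def hierarchical_heavy_hitters(sources, threshold, levels=(32, 24, 16, 8)):
    """Exact offline 1-D HHH over IPv4 source addresses (illustrative only).

    A prefix is reported when the packets it covers, excluding packets already
    attributed to a reported more-specific prefix, reach `threshold`.
    """
    # Packets per source address that no reported prefix has claimed yet.
    residual = Counter(sources)
    reported = {}
    # Walk from the most specific level to the least specific one.
    for prefix_len in sorted(levels, reverse=True):
        groups = {}
        for addr in residual:
            prefix = str(ip_network(f"{addr}/{prefix_len}", strict=False))
            groups.setdefault(prefix, []).append(addr)
        for prefix, addrs in groups.items():
            total = sum(residual[a] for a in addrs)
            if total >= threshold:
                reported[prefix] = total
                # Claimed packets no longer count toward coarser prefixes.
                for a in addrs:
                    del residual[a]
    return reported


# Synthetic example traffic (made up for this sketch).
packets = ["10.0.0.1"] * 5 + [f"10.0.1.{i}" for i in range(1, 5)] + ["192.168.1.1"]
clusters = hierarchical_heavy_hitters(packets, threshold=4)
```

    In a filter of the kind the abstract describes, clusters learned from normal traffic would form the normal profile. How the profile is mapped onto DiffServ classes is not specified in the abstract and is left out here.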

    The role of the Big Geographic Sort in Online News Circulation among U.S. Reddit Users

    Past research has attributed the circulation of online news to two main factors, individual characteristics (e.g., a person's information literacy) and social media effects (e.g., algorithm-mediated information diffusion), and has overlooked a third one: the critical mass created by the offline self-segregation of Americans into like-minded geographical regions such as states (a phenomenon called 'The Big Sort'). We hypothesized that this latter factor matters for the online spreading of news not least because online interactions, despite having the potential of being global, end up being localized: interaction probability is known to rapidly decay with distance. Upon analysis of more than 8M Reddit comments containing news links spanning four years, from January 2016 to December 2019, we found that Reddit did not work as a 'hype machine' for news (in contrast to what previous work reported for other platforms, circulation was not mainly caused by platform-facilitated network effects). Rather, news circulation in Reddit worked as a supply-and-demand system: news items scaled linearly with the number of users in each state (with a scaling exponent β ≈ 1 and a goodness of fit R² ≈ 0.95). Furthermore, deviations from such a universal pattern were best explained by state-level personality and cultural factors (R² ≈ {0.12, 0.39}), rather than socioeconomic conditions (R² ≈ {0.15, 0.29}) or political characteristics (R² ≈ {0.06, 0.21}). Higher-than-expected circulation of any type of news was found in states characterised by residents who tend to be less diligent in terms of their personality (low in conscientiousness) and by loose cultures understating the importance of adherence to norms (low in cultural tightness). Interestingly, the combination of those factors with low levels of education was then associated with the circulation of a particular type of news, that is, misinformation. These results suggest that online interactions are geographically bounded and, as such, news circulation cannot be studied purely as an Internet phenomenon but should be grounded in a user's offline cultural environment, which has become increasingly segregated over the decades and is admittedly hard to change.
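
    A scaling exponent such as the β reported above is commonly estimated by ordinary least squares on log-transformed counts. The sketch below assumes that approach; the abstract does not say how the fit was done. The function name and the data are hypothetical, and the numbers are synthetic, not the study's.

```python
import math


def fit_scaling_exponent(users, news_items):
    """Fit news_items ≈ c * users**beta by least squares in log-log space.

    Returns (beta, c, r_squared). Illustrative only.
    """
    xs = [math.log(u) for u in users]
    ys = [math.log(n) for n in news_items]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                      # slope = scaling exponent
    c = math.exp(mean_y - beta * mean_x)  # prefactor from the intercept
    r_squared = sxy ** 2 / (sxx * syy)    # squared Pearson correlation
    return beta, c, r_squared


# Synthetic per-state counts (made up): exactly three news items per user.
state_users = [1_000, 5_000, 20_000, 100_000]
state_news = [3 * u for u in state_users]
beta, c, r_squared = fit_scaling_exponent(state_users, state_news)
```

    With perfectly linear synthetic data the fit recovers β = 1 up to floating-point error. Real state-level counts would scatter around the fitted line, and the residuals are what the abstract relates to personality, cultural, socioeconomic, and political factors.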

    Malware distributions and graph structure of the Web

    Knowledge about the graph structure of the Web is important for understanding this complex socio-technical system and for devising proper policies supporting its future development. Knowledge about the differences between clean and malicious parts of the Web is important for understanding potential threats to its users and for devising protection mechanisms. In this study, we apply data science methods to a large crawl of surface and deep Web pages with the aim of increasing such knowledge. To accomplish this, we answer the following questions. Which theoretical distributions explain important local characteristics and network properties of websites? How do these characteristics and properties differ between clean and malicious (malware-affected) websites? What is the predictive power of local characteristics and network properties for classifying malware websites? To the best of our knowledge, this is the first large-scale study describing the differences in global properties between malicious and clean parts of the Web. In other words, our work builds on and bridges the gap between Web science, which tackles large-scale graph representations, and Web cyber security, which is concerned with malicious activities on the Web. The results presented herein can also help antivirus vendors in devising approaches to improve their detection algorithms.

    On the adoption of e-moped sharing systems

    Recent years have witnessed the emergence of novel shared mobility solutions that provide broad on-demand access to transportation. The widespread adoption of these solutions, particularly electric mopeds (e-mopeds), is expected to bring important benefits such as the reduction of noise, atmospheric pollution, and road congestion, with extensive repercussions on liveability and quality of life in urban areas. Currently, almost no effort has been devoted to exploring the adoption patterns of e-moped sharing services; as a result, optimal management and allocation of vehicles remains a problem for service managers. In this study, we tested the hypothesis that the adoption of electric mopeds depends on the built environment and demographic aspects of each neighbourhood. In detail, we singled out three features concerning the area characteristics (distance from centre, walkability, concentration of places) and one concerning the population (education index). The results obtained on a real-world case study show the strong impact these factors have in determining the adoption of e-moped sharing services. Finally, we analysed the possible role that electric moped sharing can play in social equalization by studying the interactions between rich and poor neighbourhoods. The results indicate that communities within a city tend to aggregate by wealth and isolate themselves from one another (social isolation): very few interactions, in terms of trajectories, were observed between the richest and poorest areas of the city under study.

    The Healthy States of America: Creating a Health Taxonomy with Social Media

    Since the uptake of social media, researchers have mined online discussions to track the outbreak and evolution of specific diseases or chronic conditions such as influenza or depression. To broaden the set of diseases under study, we developed a Deep Learning tool for Natural Language Processing that extracts mentions of virtually any medical condition or disease from unstructured social media text. With that tool at hand, we processed Reddit and Twitter posts, analyzed the clusters of the two resulting co-occurrence networks of conditions, and discovered that they correspond to well-defined categories of medical conditions. This resulted in the creation of the first comprehensive taxonomy of medical conditions automatically derived from online discussions. We validated the structure of our taxonomy against the official International Statistical Classification of Diseases and Related Health Problems (ICD-11), finding that our clusters match 20 of the 22 official categories. Based on the mentions of our taxonomy's sub-categories in Reddit posts geo-referenced in the U.S., we were then able to compute disease-specific health scores. Unlike raw counts of disease mentions, or counts that ignore our taxonomy's structure, our disease-specific health scores were found to be causally linked with the officially reported prevalences of 18 conditions.

    Dream Content Discovery from Reddit with an Unsupervised Mixed-Method Approach

    Dreaming is a fundamental but not fully understood part of human experience that can shed light on our thought patterns. Traditional dream analysis practices, while popular and aided by over 130 unique scales and rating systems, have limitations. Mostly based on retrospective surveys or lab studies, they are hard to apply on a large scale and struggle to show the importance of, and connections between, different dream themes. To overcome these issues, we developed a new, data-driven mixed-method approach for identifying topics in free-form dream reports through natural language processing. We tested this method on 44,213 dream reports from Reddit's r/Dreams subreddit, where we found 217 topics, grouped into 22 larger themes: the most extensive collection of dream topics to date. We validated our topics by comparing them to the widely used Hall and van de Castle scale. Going beyond traditional scales, our method can find unique patterns in different dream types (like nightmares or recurring dreams), understand topic importance and connections, and observe changes in collective dream experiences over time and around major events, like the COVID-19 pandemic and the recent Russo-Ukrainian war. We envision that the applications of our method will provide valuable insights into the intricate nature of dreaming.
    Comment: 20 pages, 6 figures, 4 tables, 4 pages of supplementary information

    Population structure of the invasive Atlantic blue crab, Callinectes sapidus on the Eastern Adriatic coast (Croatia, Montenegro)

    The Atlantic blue crab, Callinectes sapidus, is a highly invasive species that poses a significant threat to Mediterranean ecosystems. In the last two decades, it has become established in several marine and estuarine areas of the eastern Adriatic Sea, resulting in a decline in commercial catches and damage to fishing gear. This article reviews the current status of the blue crab invasion in Montenegro and Croatia and analyses its abundance and population structure. Overall, 619 crabs were sampled (male:female ratio, 1:1.91). Both carapace width and weight differed significantly between males and females, with males having a wider carapace and greater weight. There was also a significant difference in carapace width and weight among sites. For the total population, the mean male and female carapace width was 130.3 ± 30.8 and 108.8 ± 41.4 mm, respectively, and the mean male and female weight was 187.2 ± 85.6 and 132.5 ± 39.1 g, respectively. The coefficient b relating weight to carapace width was significant at all locations, although it varied between males and females. This work also documents the impacts of the blue crab invasion on local ecosystems and provides a comprehensive overview of population structure, shedding light on this important aspect of blue crab ecology.
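
    The abstract does not state the model behind coefficient b. A common convention in fisheries biology, assumed here rather than taken from the article, is the allometric size-weight relationship, with W the weight in g, CW the carapace width in mm, and a and b fitted constants:

```latex
W = a \, CW^{\,b}
\quad\Longleftrightarrow\quad
\log W = \log a + b \log CW
```

    Under this assumed form, b is the slope of a log-log regression of weight on carapace width, and b = 3 corresponds to isometric growth.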

    Mitigating DDoS attacks with cluster-based traffic filtering

    https://www.ester.ee/record=b5400302*es

    An Empirical Study in Evaluating the E-learning Dimension of Blended Model

    The purpose of this study is to offer a methodological framework for assessing e-learning capacities in a blended educational scheme, based on users' experiences, with the ultimate goal of getting a clearer view of what should be improved within existing systems of this kind. Accordingly, a survey of sixty postgraduate students and ten experienced instructors was conducted at “Mediterranean” University in Montenegro. An assessment model based on a particular combination of binary and Saaty's matrix approaches was used to add a quantitative dimension to the issue, with the intention of generating more specific directions for redesigning and improving key features of contemporary web-based e-learning systems (WBEL) in the blended environment, making them more valuable and user-friendly.
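
    The abstract does not spell out how the binary and Saaty's matrix approaches were combined. As a point of reference only, the sketch below shows how priority weights are usually derived from a single Saaty pairwise comparison matrix, using the row geometric mean approximation. The function name, the criteria, and the matrix values are hypothetical and not taken from the study.

```python
import math


def saaty_priorities(matrix):
    """Derive priority weights from a reciprocal pairwise comparison matrix.

    Uses the row geometric mean approximation and returns
    (weights, consistency_index). Illustrative only.
    """
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    weights = [g / total for g in geo_means]
    # Estimate the principal eigenvalue from (A w)_i / w_i, averaged over rows.
    lambda_max = sum(
        sum(matrix[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ) / n
    # Consistency index: 0 for a perfectly consistent matrix.
    consistency_index = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    return weights, consistency_index


# Hypothetical comparison of three e-learning features on Saaty's scale:
# entry [i][j] states how strongly feature i is preferred over feature j.
comparisons = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights, consistency_index = saaty_priorities(comparisons)
```

    The example matrix is perfectly consistent, so the consistency index is zero up to rounding. Survey-derived matrices are typically less consistent, and the index is the usual check on whether the judgments are usable.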

    How Circadian Rhythms Extracted from Social Media Relate to Physical Activity and Sleep

    Circadian rhythm has been linked to both physical and mental health at an individual level in prior research. Such a link at population level has long been hypothesized but never tested, largely because of a lack of data. To partly fill this gap in the literature, we need a dataset on population-level circadian rhythms, a dataset on population-level health conditions, and strong associations between these two partly independent sets. Recent work has shown that affect in social media data relates to population-level circadian rhythms. Building upon that work, we extracted five circadian rhythm metrics from 6M Reddit posts across 18 major cities (for which the number of residents is highly correlated with the number of users), and paired them with three ground-truth health metrics (daily number of steps, sleep quantity, and sleep quality) extracted from 233K wearable users in these cities. We found that rhythms of online activity approximated sleeping patterns rather than alertness levels, as the literature had previously hypothesized. Despite that, we found that these rhythms, when computed at two specific times of the day (i.e., late at night and early morning), were still predictive of the three ground-truth health metrics: in general, healthier cities had morning spikes on social media, night dips, and expressions of positive affect. These results suggest that circadian rhythms on social media, if taken at two specific times of the day and operationalized with literature-driven metrics, can approximate the temporal evolution of people's shared underlying biological rhythm as it relates to physical activity (R² = 0.492), sleep quantity (R² = 0.765), and sleep quality (R² = 0.624).