54,215 research outputs found

    Classifying Web Exploits with Topic Modeling

    Full text link
    This short empirical paper investigates how well topic modeling and database meta-data characteristics can classify web and other proof-of-concept (PoC) exploits for publicly disclosed software vulnerabilities. By using a dataset comprised of over 36 thousand PoC exploits, near a 0.9 accuracy rate is obtained in the empirical experiment. Text mining and topic modeling are a significant boost factor behind this classification performance. In addition to these empirical results, the paper contributes to the research tradition of enhancing software vulnerability information with text mining, providing also a few scholarly observations about the potential for semi-automatic classification of exploits in the existing tracking infrastructures.Comment: Proceedings of the 2017 28th International Workshop on Database and Expert Systems Applications (DEXA). http://ieeexplore.ieee.org/abstract/document/8049693

    Predicting Cyber Events by Leveraging Hacker Sentiment

    Full text link
    Recent high-profile cyber attacks exemplify why organizations need better cyber defenses. Cyber threats are hard to accurately predict because attackers usually try to mask their traces. However, they often discuss exploits and techniques on hacking forums. The community behavior of the hackers may provide insights into groups' collective malicious activity. We propose a novel approach to predict cyber events using sentiment analysis. We test our approach using cyber attack data from 2 major business organizations. We consider 3 types of events: malicious software installation, malicious destination visits, and malicious emails that surpassed the target organizations' defenses. We construct predictive signals by applying sentiment analysis on hacker forum posts to better understand hacker behavior. We analyze over 400K posts generated between January 2016 and January 2018 on over 100 hacking forums both on surface and Dark Web. We find that some forums have significantly more predictive power than others. Sentiment-based models that leverage specific forums can outperform state-of-the-art deep learning and time-series models on forecasting cyber attacks weeks ahead of the events

    Understanding Communication Patterns in MOOCs: Combining Data Mining and qualitative methods

    Full text link
    Massive Open Online Courses (MOOCs) offer unprecedented opportunities to learn at scale. Within a few years, the phenomenon of crowd-based learning has gained enormous popularity with millions of learners across the globe participating in courses ranging from Popular Music to Astrophysics. They have captured the imaginations of many, attracting significant media attention - with The New York Times naming 2012 "The Year of the MOOC." For those engaged in learning analytics and educational data mining, MOOCs have provided an exciting opportunity to develop innovative methodologies that harness big data in education.Comment: Preprint of a chapter to appear in "Data Mining and Learning Analytics: Applications in Educational Research

    Crowdsourcing Cybersecurity: Cyber Attack Detection using Social Media

    Full text link
    Social media is often viewed as a sensor into various societal events such as disease outbreaks, protests, and elections. We describe the use of social media as a crowdsourced sensor to gain insight into ongoing cyber-attacks. Our approach detects a broad range of cyber-attacks (e.g., distributed denial of service (DDOS) attacks, data breaches, and account hijacking) in an unsupervised manner using just a limited fixed set of seed event triggers. A new query expansion strategy based on convolutional kernels and dependency parses helps model reporting structure and aids in identifying key event characteristics. Through a large-scale analysis over Twitter, we demonstrate that our approach consistently identifies and encodes events, outperforming existing methods.Comment: 13 single column pages, 5 figures, submitted to KDD 201
    • …
    corecore