The Looming Threat of Fake and LLM-generated LinkedIn Profiles: Challenges and Opportunities for Detection and Prevention
In this paper, we present a novel method for detecting fake and Large
Language Model (LLM)-generated profiles in the LinkedIn Online Social Network
immediately upon registration and before establishing connections. Early fake
profile identification is crucial to maintaining the platform's integrity since
it prevents imposters from acquiring the private and sensitive information of
legitimate users and from gaining an opportunity to increase their credibility
for future phishing and scamming activities. This work uses textual information
provided in LinkedIn profiles and introduces the Section and Subsection Tag
Embedding (SSTE) method to enhance the discriminative characteristics of these
data for distinguishing between legitimate profiles and those created by
imposters manually or by using an LLM. Additionally, the dearth of a large
publicly available LinkedIn dataset motivated us to collect 3600 LinkedIn
profiles for our research. We will release our dataset publicly for research
purposes. This is, to the best of our knowledge, the first large publicly
available LinkedIn dataset for fake LinkedIn account detection. Within our
paradigm, we assess static and contextualized word embeddings, including GloVe,
Flair, BERT, and RoBERTa. We show that the suggested method can distinguish
between legitimate and fake profiles with an accuracy of about 95% across all
word embeddings. In addition, we show that SSTE has a promising accuracy for
identifying LLM-generated profiles, despite the fact that no LLM-generated
profiles were employed during the training phase, and can achieve an accuracy
of approximately 90% when only 20 LLM-generated profiles are added to the
training set. This is a significant finding, since the proliferation of several
LLMs in the near future makes it extremely challenging to design a single
system that can identify profiles created with various LLMs.
Comment: 33rd ACM Conference on Hypertext and Social Media (HT '23)
An Empirical Study on Android for Saving Non-shared Data on Public Storage
With millions of apps that can be downloaded from official or third-party
markets, Android has become one of the most popular mobile platforms today.
These apps help people in all kinds of ways and thus have access to a great deal
of user data, which generally falls into three categories: sensitive data, data to
be shared with other apps, and non-sensitive data not to be shared with others.
For the first and second types of data, Android provides very good storage
models: an app's private sensitive data are saved to its private folder that
can only be accessed by the app itself, and the data to be shared are saved to
public storage (either the external SD card or the emulated SD card area on
internal FLASH memory). But for the last type, i.e., an app's non-sensitive and
non-shared data, there is a big problem in Android's current storage model
which essentially encourages an app to save its non-sensitive data to shared
public storage that can be accessed by other apps. At first glance, this seems
harmless, as those data are non-sensitive after all, but it implicitly
assumes that app developers could correctly identify all sensitive data and
prevent all possible information leakage from private-but-non-sensitive data.
In this paper, we will demonstrate that this is an invalid assumption with a
thorough survey on information leaks of those apps that had followed Android's
recommended storage model for non-sensitive data. Our studies showed that
highly sensitive information from billions of users can be easily stolen by
exploiting this problematic storage model. Although our empirical
studies are based on a limited set of apps, the identified problems are not
isolated or accidental bugs in the apps investigated. On the contrary,
the problem is rooted in the vulnerable storage model recommended by Android.
To mitigate the threat, we also propose a defense framework
Using Social Media to Promote STEM Education: Matching College Students with Role Models
STEM (Science, Technology, Engineering, and Mathematics) fields have become
increasingly central to U.S. economic competitiveness and growth. The shortage
in the STEM workforce has brought the promotion of STEM education to the forefront. The rapid
growth of social media usage provides a unique opportunity to predict users'
real-life identities and interests from online texts and photos. In this paper,
we propose an innovative approach by leveraging social media to promote STEM
education: matching Twitter college student users with diverse LinkedIn STEM
professionals using a ranking algorithm based on the similarities of their
demographics and interests. We share the belief that increasing STEM presence
in the form of introducing career role models who share similar interests and
demographics will inspire students to develop interests in STEM-related fields
and emulate their models. Our evaluation on 2,000 real college students
demonstrated the accuracy of our ranking algorithm. We also design a novel
implementation that recommends matched role models to the students.
Comment: 16 pages, 8 figures, accepted by ECML/PKDD 2016, Industrial Track
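The abstract above describes ranking LinkedIn professionals by how similar their demographics and interests are to a student's. The following is a minimal sketch of that general idea, not the authors' algorithm: the field names, the equal weighting (`demo_weight`), the exact-match demographic score, and the cosine similarity over interest vectors are all assumptions made for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_role_models(student, professionals, demo_weight=0.5):
    """Return professional names sorted from best to worst match.

    Hypothetical scoring: a weighted sum of
    - the fraction of the student's demographic attributes that match, and
    - the cosine similarity of the interest vectors.
    """
    scored = []
    attrs = student["demographics"]
    for name, prof in professionals.items():
        matches = sum(attrs[k] == prof["demographics"].get(k) for k in attrs)
        demo_score = matches / max(len(attrs), 1)
        interest_score = cosine(student["interests"], prof["interests"])
        score = demo_weight * demo_score + (1 - demo_weight) * interest_score
        scored.append((score, name))
    # Highest combined score first.
    return [name for _, name in sorted(scored, reverse=True)]


# Made-up example data; interest vectors are counts over three toy topics.
student = {"demographics": {"gender": "F", "region": "NY"}, "interests": [1, 0, 1]}
professionals = {
    "A": {"demographics": {"gender": "F", "region": "NY"}, "interests": [1, 0, 1]},
    "B": {"demographics": {"gender": "M", "region": "CA"}, "interests": [0, 1, 0]},
    "C": {"demographics": {"gender": "F", "region": "CA"}, "interests": [1, 1, 1]},
}
ranking = rank_role_models(student, professionals)
```

In this toy data, professional A matches the student on every attribute and interest, B on none, and C partially, so the ranking orders them A, C, B.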
Analyzing Social and Stylometric Features to Identify Spear phishing Emails
Spear phishing is a complex targeted attack in which an attacker harvests
information about the victim prior to the attack. This information is then used
to create sophisticated, genuine-looking attack vectors, luring the victim into
compromising confidential information. What makes spear phishing different, and
more powerful than normal phishing, is this contextual information about the
victim. Online social media services can be one such source for gathering vital
information about an individual. In this paper, we characterize and examine a
true positive dataset of spear phishing, spam, and normal phishing emails from
Symantec's enterprise email scanning service. We then present a model to detect
spear phishing emails sent to employees of 14 international organizations, by
using social features extracted from LinkedIn. Our dataset consists of 4,742
targeted attack emails sent to 2,434 victims, and 9,353 non-targeted attack
emails sent to 5,912 non-victims; and publicly available information from their
LinkedIn profiles. We applied various machine learning algorithms to this
labeled data, and achieved an overall maximum accuracy of 97.76% in identifying
spear phishing emails. We used a combination of social features from LinkedIn
profiles, and stylometric features extracted from email subjects, bodies, and
attachments. However, we achieved a slightly better accuracy of 98.28% without
the social features. Our analysis revealed that social features extracted from
LinkedIn do not help in identifying spear phishing emails. To the best of our
knowledge, this is one of the first attempts to make use of a combination of
stylometric features extracted from emails, and social features extracted from
an online social network to detect targeted spear phishing emails.
Comment: Detection of spear phishing using social media feature
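To put the reported accuracies in context, the short calculation below uses only the counts and percentages quoted in the abstract above. It assumes, purely for illustration, a two-class setup of targeted versus non-targeted emails; the paper's actual evaluation splits may differ, so the baseline here is a rough reference point rather than a result from the paper.

```python
# Counts quoted in the abstract.
targeted = 4742        # targeted attack emails
non_targeted = 9353    # non-targeted attack emails
total = targeted + non_targeted

# Accuracy of a trivial classifier that always predicts the larger class
# (assumption: one binary task over all emails).
majority_baseline = max(targeted, non_targeted) / total

# Accuracies quoted in the abstract.
with_social = 0.9776      # stylometric + LinkedIn social features
without_social = 0.9828   # stylometric features only
gap = without_social - with_social

print(f"total emails:       {total}")
print(f"majority baseline:  {majority_baseline:.2%}")
print(f"gain without social features: {gap * 100:.2f} percentage points")
```

Under that assumption, both reported accuracies sit well above the always-predict-the-majority baseline, while the difference between the two feature sets is about half a percentage point.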
An empirical analysis of SNS users and their privacy and security awareness of risks associated with sharing SNS profiles (online identities)
Social networking sites (SNS) like MySpace, Facebook and LinkedIn now have hundreds of millions of users. In this paper, a quantitative approach was used to analyse primary data collected about SNS users. Our findings show that SNS users are predominantly younger adults with higher education and income levels. SNSs are more likely to be used for maintaining existing friendships, as opposed to establishing new ones, and for building business networks. SNS users either have poor levels of privacy and security awareness or high levels of complacency in relation to SNS profile sharing and sharing their identity online.