25,422 research outputs found
Machine Learning in Automated Text Categorization
The automated categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last ten years, due to the
increased availability of documents in digital form and the ensuing need to
organize them. In the research community the dominant approach to this problem
is based on machine learning techniques: a general inductive process
automatically builds a classifier by learning, from a set of preclassified
documents, the characteristics of the categories. The advantages of this
approach over the knowledge engineering approach (consisting in the manual
definition of a classifier by domain experts) are a very good effectiveness,
considerable savings in terms of expert manpower, and straightforward
portability to different domains. This survey discusses the main approaches to
text categorization that fall within the machine learning paradigm. We will
discuss in detail issues pertaining to three different problems, namely
document representation, classifier construction, and classifier evaluation.Comment: Accepted for publication on ACM Computing Survey
Solutions to Detect and Analyze Online Radicalization : A Survey
Online Radicalization (also called Cyber-Terrorism or Extremism or
Cyber-Racism or Cyber- Hate) is widespread and has become a major and growing
concern to the society, governments and law enforcement agencies around the
world. Research shows that various platforms on the Internet (low barrier to
publish content, allows anonymity, provides exposure to millions of users and a
potential of a very quick and widespread diffusion of message) such as YouTube
(a popular video sharing website), Twitter (an online micro-blogging service),
Facebook (a popular social networking website), online discussion forums and
blogosphere are being misused for malicious intent. Such platforms are being
used to form hate groups, racist communities, spread extremist agenda, incite
anger or violence, promote radicalization, recruit members and create virtual
organi- zations and communities. Automatic detection of online radicalization
is a technically challenging problem because of the vast amount of the data,
unstructured and noisy user-generated content, dynamically changing content and
adversary behavior. There are several solutions proposed in the literature
aiming to combat and counter cyber-hate and cyber-extremism. In this survey, we
review solutions to detect and analyze online radicalization. We review 40
papers published at 12 venues from June 2003 to November 2011. We present a
novel classification scheme to classify these papers. We analyze these
techniques, perform trend analysis, discuss limitations of existing techniques
and find out research gaps
- …