Heuristic Feature Selection for Clickbait Detection
We study feature selection as a means to optimize the baseline clickbait
detector employed at the Clickbait Challenge 2017. The challenge's task is to
score the "clickbaitiness" of a given tweet on a scale from 0 (no
clickbait) to 1 (strong clickbait). Unlike most other approaches submitted to
the challenge, the baseline approach is based on manual feature engineering and
does not compete out of the box with many of the deep learning-based
approaches. We show that scaling up feature selection efforts to heuristically
identify better-performing feature subsets catapults the performance of the
baseline classifier to second rank overall, beating 12 other competing
approaches and improving over the baseline performance by 20%. This
demonstrates that traditional classification approaches can still keep up with
deep learning on this task.
Comment: Clickbait Challenge 201
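As a rough illustration of heuristic feature subset search for a clickbait-strength regressor, the sketch below runs a greedy forward selection. It is not the method from the paper: the ridge regressor, the mean squared error criterion, the stopping rule, and all function names (`fit_ridge`, `predict`, `mse`, `greedy_forward_selection`) are assumptions chosen for the example.

```python
import numpy as np


def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression; the small penalty also covers the bias for simplicity."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict(X, w):
    """Predict scores and clip them to the [0, 1] clickbait-strength scale."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.clip(Xb @ w, 0.0, 1.0)


def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


def greedy_forward_selection(X_tr, y_tr, X_va, y_va, max_features=None):
    """Add one feature at a time, keeping the one that lowers validation MSE most.

    Stops when no remaining feature improves the validation error
    (an assumed stopping rule) or when max_features is reached.
    """
    n_features = X_tr.shape[1]
    limit = n_features if max_features is None else max_features
    selected = []
    # Start from a constant predictor (training mean) as the reference error.
    best = mse(y_va, np.full_like(y_va, y_tr.mean()))
    while len(selected) < limit:
        scores = {}
        for j in range(n_features):
            if j in selected:
                continue
            cols = selected + [j]
            w = fit_ridge(X_tr[:, cols], y_tr)
            scores[j] = mse(y_va, predict(X_va[:, cols], w))
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:
            break  # no candidate improves the held-out error
        selected.append(j_best)
        best = scores[j_best]
    return selected, best
```

Calling `greedy_forward_selection(X_train, y_train, X_val, y_val)` on a feature matrix returns the chosen column indices and the validation error of that subset. A greedy search only explores a small part of the subset space, so it should be read as one possible heuristic rather than the search strategy the authors used.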
The Clickbait Challenge 2017: Towards a Regression Model for Clickbait Strength
Clickbait has grown to become a nuisance to social media users and social
media operators alike. Malicious content publishers misuse social media to
manipulate as many users as possible to visit their websites using clickbait
messages. Machine learning technology may help to handle this problem, giving
rise to automatic clickbait detection. To accelerate progress in this
direction, we organized the Clickbait Challenge 2017, a shared task inviting
the submission of clickbait detectors for a comparative evaluation. A total of
13 detectors were submitted, achieving significant improvements over the
previous state of the art in terms of detection performance. Also, many of the
submitted approaches have been published open source, rendering them
reproducible and a good starting point for newcomers. While the 2017 challenge
has passed, we maintain the evaluation system and respond to new registrations
in support of the ongoing research on better clickbait detectors.