116 research outputs found
A Novel Gesture-based CAPTCHA Design for Smart Devices
CAPTCHAs have been widely used in Web applications to prevent service abuse. With the evolution of the computing environment from desktop computing to ubiquitous computing, more and more users are accessing Web applications on smart devices, where touch-based interactions are dominant. However, the majority of CAPTCHAs are designed for use on computers and laptops, which does not reflect this shift in interaction style well. In this paper, we propose a novel CAPTCHA design that exploits the convenience of the touch interface while retaining the needed security. This is achieved through a hybrid challenge that takes advantage of humans' cognitive abilities. A prototype was also developed and found to be more user-friendly than conventional CAPTCHAs in a preliminary user acceptance test.
Designing Mobile Friendly CAPTCHAs: An Exploratory Study.
CAPTCHAs (Completely Automated Public Turing Tests to Tell Computers and Humans Apart) are one of the most widely used authentication mechanisms for preventing online service abuse. With the advent of mobile computing, mobile devices such as smartphones and tablets have become the primary way people access the Internet. As a result, increasing attention has been paid to designing CAPTCHAs that are mobile-friendly. Although such CAPTCHAs generally show advantages over traditional ones, it is still unclear what the best practices are for designing a CAPTCHA scheme that is easy to use on mobile devices. In this paper, we present an exploratory study that develops a more holistic view of the usability issues of interactive CAPTCHAs to inform design guidance. This is done by investigating the usability performance of seven mobile-friendly CAPTCHA schemes representing five different CAPTCHA types.
BeCAPTCHA: Behavioral bot detection using touchscreen and mobile sensors benchmarked on HuMIdb
In this paper we study the suitability of a new generation of CAPTCHA methods based on smartphone interactions. The heterogeneous flow of data generated during interaction with a smartphone can be used to model human behavior and improve bot-detection algorithms. To this end, we propose BeCAPTCHA, a CAPTCHA method based on analysing the touchscreen information obtained during a single drag-and-drop task in combination with accelerometer data. The goal of BeCAPTCHA is to determine whether the drag-and-drop task was performed by a human or a bot. We evaluate the method against fake samples synthesized with Generative Adversarial Networks and with handcrafted methods. Our results suggest the potential of mobile sensors to characterize human behavior and to develop a new generation of CAPTCHAs. The experiments are evaluated on HuMIdb (Human Mobile Interaction database), a novel multimodal mobile database that comprises 14 mobile sensors acquired from 600 users. HuMIdb is freely available to the research community. This work has been supported by the projects PRIMA, Spain (H2020-MSCA-ITN-2019-860315), TRESPASS-ETN, Spain (H2020-MSCA-ITN-2019-860813), BIBECA RTI2018-101248-B-I00 (MINECO/FEDER), and BioGuard, Spain (Ayudas Fundación BBVA a Equipos de Investigación Científica 2017). Spanish Patent Application P20203006.
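The underlying idea, separating a scripted drag-and-drop from a human one through simple motion statistics, can be sketched as follows. The features, threshold, and synthetic traces are illustrative assumptions for this sketch, not BeCAPTCHA's actual trained model.

```python
# Sketch: behavioral bot detection from a single drag-and-drop trace.
# Feature choice and threshold are illustrative, not BeCAPTCHA's model.
import math
from statistics import mean, stdev

def trajectory_features(points):
    """points: time-ordered (t, x, y) touchscreen samples of one gesture."""
    vels = []
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt > 0:
            vels.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return {
        "mean_velocity": mean(vels),
        "velocity_jitter": stdev(vels) if len(vels) > 1 else 0.0,
    }

def looks_human(feats, jitter_threshold=5.0):
    # Scripted bots tend to move at near-constant speed; human strokes
    # show irregular velocity. A real system would train a classifier.
    return feats["velocity_jitter"] > jitter_threshold

# Perfectly uniform (bot-like) trace vs. an irregular (human-like) one.
bot = [(i * 0.01, i * 2.0, i * 2.0) for i in range(50)]
human = [(i * 0.01, i * 2.0 + (i % 7) * 1.5, i * 2.0 + (i % 5) * 2.0)
         for i in range(50)]
print(looks_human(trajectory_features(bot)))    # → False
print(looks_human(trajectory_features(human)))  # → True
```

In the paper's setting, such touchscreen features would be combined with accelerometer statistics and fed to a classifier trained against both GAN-synthesized and handcrafted fake trajectories.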
Toward Robust Video Event Detection and Retrieval Under Adversarial Constraints
The continuous stream of videos that are uploaded and shared on the Internet has been leveraged by computer vision researchers for a myriad of detection and retrieval tasks, including gesture detection, copy detection, face authentication, etc. However, the existing state-of-the-art event detection and retrieval techniques fail to deal with several real-world challenges (e.g., low resolution, low brightness, and noise) under adversarial constraints. This dissertation focuses on these challenges in realistic scenarios and demonstrates practical methods to address the problem of robustness and efficiency within video event detection and retrieval systems in five application settings (namely, CAPTCHA decoding, face liveness detection, reconstructing typed input on mobile devices, video confirmation attack, and content-based copy detection). Specifically, for CAPTCHA decoding, I propose an automated approach which can decode moving-image object recognition (MIOR) CAPTCHAs faster than humans. I showed that not only are there inherent weaknesses in current MIOR CAPTCHA designs, but that several obvious countermeasures (e.g., extending the length of the codeword) are not viable. More importantly, my work highlights the fact that the choice of underlying hard problem selected by the designers of a leading commercial solution falls into a solvable subclass of computer vision problems. For face liveness detection, I introduce a novel approach to bypass modern face authentication systems. More specifically, by leveraging a handful of pictures of the target user taken from social media, I show how to create realistic, textured, 3D facial models that undermine the security of widely used face authentication solutions. 
My framework makes use of virtual reality (VR) systems, incorporating along the way the ability to perform animations (e.g., raising an eyebrow or smiling) of the facial model, in order to trick liveness detectors into believing that the 3D model is a real human face. I demonstrate that such VR-based spoofing attacks constitute a fundamentally new class of attacks that point to serious weaknesses in camera-based authentication systems. For reconstructing typed input on mobile devices, I propose a method that successfully transcribes the text typed on a keyboard by exploiting video of the user typing, even from significant distances and from repeated reflections. This feat allows us to reconstruct typed input from the image of a mobile phone's screen on a user's eyeball as reflected through a nearby mirror, extending the privacy threat to include situations where the adversary is located around a corner from the user. To assess the viability of a video confirmation attack, I explored a technique that exploits the emanations of changes in light to reveal the programs being watched. I leverage the key insight that the observable emanations of a display (e.g., a TV or monitor) during presentation of the viewing content induce a distinctive flicker pattern that can be exploited by an adversary. My proposed approach works successfully in a number of practical scenarios, including (but not limited to) observations of light effusions through the windows, on the back wall, or off the victim's face. My empirical results show that I can successfully confirm hypotheses while capturing short recordings (typically less than 4 minutes long) of the changes in brightness from the victim's display from a distance of 70 meters. Lastly, for content-based copy detection, I take advantage of a new temporal feature to index a reference library in a manner that is robust to the popular spatial and temporal transformations in pirated videos. 
My technique narrows the detection gap in the important area of temporal transformations applied by would-be pirates. My large-scale evaluation on real-world data shows that I can successfully detect infringing content from movies and sports clips with 90.0% precision at a 71.1% recall rate, and can achieve that accuracy at an average time expense of merely 5.3 seconds, outperforming the state of the art by an order of magnitude.
Doctor of Philosophy
Mathematical Expression Recognition based on Probabilistic Grammars
Mathematical notation is well known and used all over the
world. Humankind has evolved from simple methods for representing
counts to today's well-defined mathematical notation, able to express
complex problems. Furthermore, mathematical expressions constitute a
universal language in scientific fields, and many information
resources containing mathematics have been created during the last
decades. However, in order to efficiently access all that information,
scientific documents have to be digitized or produced directly in
electronic formats.
Although most people are able to understand and produce mathematical
information, introducing math expressions into electronic devices
requires learning specific notations or using editors. Automatic
recognition of mathematical expressions aims at filling this gap
between the knowledge of a person and the input accepted by
computers. This way, printed documents containing math expressions
could be automatically digitized, and handwriting could be used for
direct input of math notation into electronic devices.
This thesis is devoted to developing an approach for mathematical
expression recognition. In this document we propose an approach for
recognizing any type of mathematical expression (printed or
handwritten) based on probabilistic grammars. To this end, we
develop a formal statistical framework that derives several
probability distributions. Throughout the document, we deal with the
definition and estimation of all these probabilistic sources of
information. Finally, we define the parsing algorithm that globally
computes the most probable mathematical expression for a given input
according to the statistical framework.
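In one dimension, the idea of globally computing the most probable expression for a given input can be illustrated with a Viterbi-style CYK pass over a probabilistic grammar in Chomsky normal form. The toy grammar and probabilities below are assumptions for illustration; the thesis develops two-dimensional grammars over math notation.

```python
# Sketch: most probable derivation under a toy probabilistic grammar,
# computed with a Viterbi-style CYK dynamic program.
from collections import defaultdict

def viterbi_cyk(tokens, unary, binary, start="Expr"):
    n = len(tokens)
    best = defaultdict(dict)  # (i, j) -> {nonterminal: best probability}
    # Fill width-1 spans from terminal (unary) rules.
    for i, tok in enumerate(tokens):
        for (lhs, rhs), p in unary.items():
            if rhs == tok:
                best[(i, i + 1)][lhs] = max(best[(i, i + 1)].get(lhs, 0.0), p)
    # Combine adjacent spans with binary rules, keeping the best probability.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (lhs, (b, c)), p in binary.items():
                    pb = best[(i, k)].get(b, 0.0)
                    pc = best[(k, j)].get(c, 0.0)
                    if pb and pc:
                        cand = p * pb * pc
                        if cand > best[(i, j)].get(lhs, 0.0):
                            best[(i, j)][lhs] = cand
    return best[(0, n)].get(start, 0.0)

# Toy grammar for expressions like "1 + 2" (probabilities are made up).
unary = {("Num", "1"): 0.5, ("Num", "2"): 0.5, ("Plus", "+"): 1.0}
binary = {("Expr", ("Num", "Rest")): 1.0, ("Rest", ("Plus", "Num")): 1.0}
print(viterbi_cyk(["1", "+", "2"], unary, binary))  # → 0.25
```

Each cell stores the best probability of deriving a span from a nonterminal; the value returned for the start symbol over the whole input is the probability of the most probable parse.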
An important point in this study is to provide objective performance
evaluation and report results using public data and standard
metrics. We inspected the problems of automatic evaluation in this
field and looked for the best solutions. We also report several
experiments using public databases and we participated in several
international competitions. Furthermore, we have released most of the
software developed in this thesis as open source.
We also explore some of the applications of mathematical expression
recognition. In addition to the direct applications of transcription
and digitization, we report two important proposals. First, we
developed mucaptcha, a method to tell humans and computers apart by
means of math handwriting input, which represents a novel application
of math expression recognition. Second, we tackled the problem of
layout analysis of structured documents using the statistical
framework developed in this thesis, because both are two-dimensional
problems that can be modeled with probabilistic grammars.
The approach developed in this thesis for mathematical expression
recognition has obtained good results at different levels. It has
produced several scientific publications in international conferences
and journals, and has been awarded in international competitions.
Álvaro Muñoz, F. (2015). Mathematical Expression Recognition based on Probabilistic Grammars [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/51665
How WEIRD is Usable Privacy and Security Research? (Extended Version)
In human factor fields such as human-computer interaction (HCI) and
psychology, researchers have been concerned that participants mostly come from
WEIRD (Western, Educated, Industrialized, Rich, and Democratic) countries. This
WEIRD skew may hinder understanding of diverse populations and their cultural
differences. The usable privacy and security (UPS) field has inherited many
research methodologies from these human factor fields. We conducted a
literature review to understand the extent to which participant samples in UPS
papers were from WEIRD countries and the characteristics of the methodologies
and research topics in each user study recruiting Western or non-Western
participants. We found that the skew toward WEIRD countries in UPS is greater
than that in HCI. Geographic and linguistic barriers in the study methods and
recruitment methods may cause researchers to conduct user studies locally. In
addition, many papers did not report participant demographics, which could
hinder the replication of the reported studies, leading to low reproducibility.
To improve geographic diversity, we offer suggestions, including
facilitating replication studies, addressing geographic and linguistic issues in
study and recruitment methods, and facilitating research on topics relevant to
non-WEIRD populations.
Comment: This paper is the extended version of the paper presented at USENIX
SECURITY 202
Cryptographic Protocols for Privacy Enhancing Technologies: From Privacy Preserving Human Attestation to Internet Voting
The desire for privacy is often associated with the intention to hide certain
aspects of our thoughts or actions due to some illicit activity. This is a
narrow understanding of privacy, and only a marginal fragment of the motivations
for undertaking an action with a desired level of privacy. The right not to
be subjected to arbitrary interference with our privacy is part of the Universal
Declaration of Human Rights (Article 12) and, above that, a requisite for
our freedom. Developing as a person freely, which results in the development
of society, requires actions to be done without a watchful eye. While
awareness of privacy in the context of modern technologies is not widespread,
it is clearly understood, as can be seen in the context of elections,
that in order to make a free choice one needs to maintain one's privacy. So
why demand privacy when electing our government, but not when selecting
our daily interests, the books we read, the sites we browse, or the people we
encounter? It is a popular belief that the data we expose about ourselves
would not be exploited if one is a law-abiding citizen. Nothing could be
further from the truth: this data is used daily for commercial purposes,
because users' data has value. To make matters worse, data has also been used
for political purposes without the user's consent or knowledge. However, the
benefits that data can bring to individuals seem endless, and not using this
data at all seems an extreme solution. In recent years, legislative efforts
have tried to provide mechanisms for users to decide what is done with their
data, and to define a framework in which companies can use user data, but
always with the users' consent. However, these attempts take time to gain
traction, and have unfortunately not been very successful since their introduction.
In this thesis we explore the possibility of constructing cryptographic protocols
to provide a technical, rather than legislative, solution to the privacy
problem. In particular, we focus on two aspects of society: browsing and
internet voting. These two activities shape our lives in one way or another, and
require high levels of privacy to provide a safe environment for humans to
engage in them freely. However, the two problems call for opposite solutions.
On the one hand, elections are a well-established event in society that has
been around for millennia, and privacy and accountability are well-rooted
requirements for such events. This might be why their digitalisation
is falling behind other acts of our society
(banking, shopping, reading, etc.). On the other hand, browsing is a recently
introduced activity, but one that has quickly gained traction given the
possibilities it opens with such ease. We now have access to whatever we
can imagine (except for voting) at the distance of a click. However, the data
we generate while browsing is extremely sensitive, and most of it is disclosed
to third parties under the claim of improving the user experience
(targeted recommendations, ads, or bot detection).
Chapter 1 motivates why resolving such a problem is necessary for the
progress of digital society. It then introduces the problem that this thesis
aims to resolve, together with the methodology. In Chapter 2 we introduce
some technical concepts used throughout the thesis, and we review the
state of the art and its limitations.
In Chapter 3 we focus on a mechanism to provide private browsing. In
particular, we focus on how to provide a safer and more private way to
perform human attestation. Determining whether a user is a human or a bot
is important for the survival of the online world. However, the existing
mechanisms are either invasive or pose a burden on the user. We present a
solution based on a machine learning model that distinguishes between humans
and bots using natural events of normal browsing (such as touching the screen
of a phone) to make its prediction. To ensure that no private data leaves
the user's device, we evaluate the model on the device rather than sending
the data over the wire. To provide assurance that the expected model has
been evaluated, the user's device generates a cryptographic proof. However,
this opens an important question: can we achieve a high level of accuracy
without incurring prohibitive battery consumption? We provide a positive
answer to this question in this work, and show that a privacy-preserving
solution can be achieved while keeping accuracy high and the user's
performance overhead low.
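A minimal sketch of this pipeline, on-device scoring plus a verifiable binding to a published model, might look as follows. The feature names and weights are invented for illustration, and a plain SHA-256 commitment stands in for the thesis's cryptographic proof, which in the real construction proves correct evaluation without revealing the inputs.

```python
# Sketch: on-device human-attestation score bound to a committed model.
# Feature names and weights are invented; a SHA-256 commitment over the
# weights merely identifies which model was evaluated.
import hashlib
import json
import math

WEIGHTS = {"touch_pressure": 1.8, "swipe_jitter": 2.1, "bias": -2.0}

def model_commitment(weights):
    """Deterministic digest identifying the exact model parameters."""
    blob = json.dumps(weights, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

def predict_human(features, weights=WEIGHTS):
    """Tiny logistic model evaluated locally; raw features never leave."""
    z = weights["bias"] + sum(weights[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))  # score in (0, 1), higher = more human

# The device ships only (score, commitment); a verifier recomputes the
# commitment from the published model to check which model was evaluated.
score = predict_human({"touch_pressure": 0.9, "swipe_jitter": 1.2})
tag = model_commitment(WEIGHTS)
print(round(score, 2), tag[:8])
```

The point of the design is that the raw behavioral features stay on the device; only the score and the model binding cross the wire.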
In Chapter 4 we focus on the problem of internet voting. Internet voting
means voting remotely, and therefore in an uncontrolled environment.
This means that anyone could be voting under the supervision of a coercer,
which makes coercion-resistance the main goal of the protocols presented:
we need to build a protocol that allows a voter to escape the act of
coercion. We present two proposals with the main goal of providing
a usable and scalable coercion-resistant protocol. The two offer different
trade-offs. The first provides a coercion-resistance mechanism with
linear-time filtering, but a slightly weaker notion of coercion-resistance.
The second has a slightly higher (poly-logarithmic) complexity, but
provides a stronger notion of coercion-resistance. Both solutions are based
on the same idea: allowing the voter to cast several votes (such that only
the last one is counted) in a way that cannot be determined by a coercer.
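The shared mechanism, tallying only the last ballot cast per credential, can be sketched in plain form below. In the actual protocols the ballots are encrypted and the filtering is performed verifiably, so a coercer cannot tell whether a later ballot overrode the coerced one; the credential and party names here are illustrative.

```python
# Sketch: linear-time "last vote counts" filtering for re-voting schemes.
# In a real protocol this pass runs over encrypted, verifiable ballots.
from collections import Counter

def tally(ballots):
    """ballots: time-ordered list of (credential, choice) pairs."""
    last = {}
    for cred, choice in ballots:  # later ballots overwrite earlier ones
        last[cred] = choice
    return Counter(last.values())

ballots = [
    ("cred-A", "party-1"),  # cast under coercion...
    ("cred-B", "party-2"),
    ("cred-A", "party-2"),  # ...later overridden by the free vote
]
print(tally(ballots))  # → Counter({'party-2': 2})
```

A single dictionary pass gives the linear filtering cost mentioned above; the stronger variant pays a poly-logarithmic overhead to hide more about the re-voting pattern.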
Finally, in Chapter 5, we conclude the thesis and show how our results
push the state of the art one step further. We concisely summarise our
contributions and clearly describe the next steps. The results
presented in this work argue against the two main claims made against
privacy-preserving solutions: that privacy is not practical, and that higher
levels of privacy result in lower levels of security.
Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. President: Agustín Martín Muñoz. Secretary: José María de Fuentes García-Romero de Tejada. Vocal: Alberto Peinado Domínguez.