The weak password problem: chaos, criticality, and encrypted p-CAPTCHAs
Vulnerabilities related to weak passwords are a pressing global economic and
security issue. We report a novel, simple, and effective approach to address
the weak password problem. Building upon chaotic dynamics, criticality at phase
transitions, CAPTCHA recognition, and computational round-off errors we design
an algorithm that strengthens security of passwords. The core idea of our
method is to split a long and secure password into two components. The first
component is memorized by the user. The second component is transformed into a
CAPTCHA image and then protected using evolution of a two-dimensional dynamical
system close to a phase transition, in such a way that standard brute-force
attacks become ineffective. We expect our approach to have wide applications
for authentication and encryption technologies.
Comment: 5 pages, 6 figures
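The abstract's reliance on chaotic dynamics and round-off errors can be illustrated with a minimal sketch. This is not the authors' algorithm: the paper uses a two-dimensional dynamical system close to a phase transition, whereas the example below swaps in the one-dimensional logistic map purely as a stand-in, and the function names are hypothetical. It only shows the general property that two starting states differing by a tiny amount end up on unrelated trajectories, which is why an attacker's "almost right" guess would not help.

```python
def logistic_orbit(x0: float, steps: int, r: float = 4.0) -> list[float]:
    """Iterate the logistic map x -> r*x*(1-x) and return every visited state."""
    orbit = [x0]
    for _ in range(steps):
        x = orbit[-1]
        orbit.append(r * x * (1.0 - x))
    return orbit


def max_divergence(x0: float, delta: float, steps: int) -> float:
    """Largest gap between two orbits whose starting points differ by delta."""
    a = logistic_orbit(x0, steps)
    b = logistic_orbit(x0 + delta, steps)
    return max(abs(p - q) for p, q in zip(a, b))
```

Calling `max_divergence(0.3, 1e-12, 5)` gives a gap that is still tiny, while `max_divergence(0.3, 1e-12, 100)` gives a gap comparable to the size of the whole interval, since in the chaotic regime (r = 4) the error roughly doubles at each step.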
Making defeating CAPTCHAs harder for bots
For a number of years, many websites have used CAPTCHAs to filter out
interactions by bots. However, attackers have found ways to circumvent CAPTCHAs
by programming bots to solve or bypass them, or even relay them for humans to
solve. In order to reduce the chances of success of such attacks, CAPTCHAs can
be strengthened by the addition of certain safeguards. In this paper, we
discuss seven existing safeguards as well as five novel safeguards designed to
make circumventing CAPTCHAs harder. These safeguards are not mutually exclusive
and can add multiple layers of protection to a CAPTCHA. We further provide a
high-level comparison of their effectiveness in addressing the threat posed by
CAPTCHA-defeating techniques. In order to focus on safeguards that are usable,
we restrict our attention to those which have minimal adverse effect on the
user experience.
CAPTCHA Types and Breaking Techniques: Design Issues, Challenges, and Future Research Directions
The proliferation of the Internet and mobile devices has given
malicious bots access to genuine resources and data. Bots may instigate
phishing, unauthorized access, denial-of-service, and spoofing attacks, to
name a few. Authentication and testing mechanisms that verify end users
and keep malicious programs from infiltrating services and data are
strong defenses against malicious bots. The Completely Automated Public
Turing test to tell Computers and Humans Apart (CAPTCHA) is an authentication
process that confirms the user is human before access is granted. This
paper provides an in-depth survey on CAPTCHAs and focuses on two main things:
(1) a detailed discussion on various CAPTCHA types along with their advantages,
disadvantages, and design recommendations, and (2) an in-depth analysis of
different CAPTCHA breaking techniques. The survey is based on over two hundred
studies on the subject conducted from 2003 to date. The analysis
reinforces the need to design more attack-resistant CAPTCHAs while keeping
their usability intact. The paper also highlights the design challenges and
open issues related to CAPTCHAs, and provides useful
recommendations for breaking CAPTCHAs.
Foundations, Properties, and Security Applications of Puzzles: A Survey
Cryptographic algorithms have been used not only to create robust ciphertexts
but also to generate cryptograms that, contrary to the classic goal of
cryptography, are meant to be broken. These cryptograms, generally called
puzzles, require the use of a certain amount of resources to be solved, hence
introducing a cost that is often regarded as a time delay---though it could
involve other metrics as well, such as bandwidth. These powerful features have
made puzzles the core of many security protocols, acquiring increasing
importance in the IT security landscape. The concept of a puzzle has
subsequently been extended to other types of schemes that do not use
cryptographic functions, such as CAPTCHAs, which are used to discriminate
humans from machines. Overall, puzzles have experienced a renewed interest with
the advent of Bitcoin, which uses a CPU-intensive puzzle as proof of work. In
this paper, we provide a comprehensive study of the most important puzzle
construction schemes available in the literature, categorizing them according
to several attributes, such as resource type, verification type, and
applications. We have redefined the term puzzle by collecting and integrating
the scattered notions used in different works, to cover all the existing
applications. Moreover, we provide an overview of the possible applications,
identifying key requirements and different design approaches. Finally, we
highlight the features and limitations of each approach, providing a useful
guide for the future development of new puzzle schemes.
Comment: This article has been accepted for publication in ACM Computing
Surveys
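The CPU-intensive proof-of-work puzzle mentioned in the abstract can be sketched as follows. This is a minimal hash-based example in the general style of such puzzles, not Bitcoin's actual block format and not a scheme taken from the survey; the function names, the 8-byte nonce encoding, and the choice of SHA-256 over a plain challenge string are all assumptions made for illustration. The solver searches for a nonce whose hash starts with a given number of zero bits, so solving costs many hash evaluations while verifying costs one.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()  # zeros before the first set bit
        break
    return bits


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Cheap check: a single hash evaluation."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Expensive search: about 2**difficulty hash evaluations on average."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce
```

Each extra bit of `difficulty` doubles the expected work for `solve` while leaving `verify` unchanged, which is the asymmetry that lets a puzzle impose a tunable cost on the solver.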
Avatar CAPTCHA: telling computers and humans apart via face classification and mouse dynamics
Bots are automated computer programs that execute malicious scripts and predefined functions on an affected computer. They pose cybersecurity threats and are among the most sophisticated and common cybercrime tools today. They spread viruses, generate spam, steal sensitive personal information, rig online polls, and commit other types of online crime and fraud. They sneak into unprotected systems through the Internet by seeking vulnerable entry points, and they access the system’s resources as a human user does. The question is how to counter them: how do we block bots while still allowing human users to access system resources? One solution is a CAPTCHA (Completely Automated Public Turing Test to tell Computers and Humans Apart), a program that can generate and grade tests that most humans can pass but computers cannot. It is used as a tool to distinguish humans from malicious bots. CAPTCHAs are a class of Human Interactive Proofs (HIPs) meant to be easily solvable by humans and economically infeasible for computers.
Text CAPTCHAs are very popular and commonly used. For each challenge, they generate a sequence of characters by distorting standard fonts and ask users to identify the characters and type them out. However, they are vulnerable to character segmentation attacks by bots, depend on the English language, and are becoming too complex for people to solve. One alternative is the image CAPTCHA, which uses images instead of text and requires users to identify certain images to solve the challenge. Image CAPTCHAs are user-friendly and convenient for human users and pose a much more challenging problem for bots.
In today’s Internet world, user profiling and user identification have gained considerable significance. Identity theft and similar crimes can be prevented by restricting resources to authorized users. To respond to a security breach in a timely way, frequent user verification is needed.
However, this process must be passive, transparent, and non-obtrusive. For such a system to be practical, it must be accurate, efficient, and difficult to forge. Behavioral biometric systems are usually less prominent than traditional biometric systems, but they offer numerous significant advantages over them. Collecting behavioral data is non-obtrusive and cost-effective because it requires no special hardware. While behavioral traits are not unique enough to provide reliable human identification, they have been shown to be highly accurate in identity verification. In accomplishing everyday tasks, human beings use different styles and strategies and apply unique skills and knowledge. These define the behavioral traits of the user. Behavioral biometrics attempts to quantify these traits to profile users and establish their identity.
Human-computer interaction (HCI)-based biometrics comprise the interaction strategies and styles between a human and a computer. These unique user traits are quantified to build profiles for identification. A specific category of HCI-based biometrics, known as Mouse Dynamics, is based on recording human interactions with the mouse as the input device. By monitoring the mouse activity a user produces while interacting with the GUI, a unique profile can be created that helps identify that user. Mouse-based verification approaches do not record sensitive user credentials such as usernames and passwords, and thus avoid privacy issues.
An image CAPTCHA is proposed that incorporates Mouse Dynamics to help fortify it. It displays random images obtained from Yahoo’s Flickr. To solve the challenge, the user must identify and select a certain class of images. Two theme-based challenges have been designed: Avatar CAPTCHA and Zoo CAPTCHA. The former displays human and avatar faces, whereas the latter displays different animal species.
Besides presenting the dynamically selected images, the CAPTCHA records how each user interacts with the mouse while attempting to solve it (mouse clicks, mouse movements, cursor screen coordinates, and so on), non-obtrusively and at regular time intervals. These recorded mouse movements constitute the Mouse Dynamics Signature (MDS) of the user. The MDS provides an additional security technique for separating humans from bots. The security of the CAPTCHA is tested by an adversary executing a mouse bot that attempts to solve the CAPTCHA challenges.
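As a rough illustration of how cursor samples recorded at regular intervals might be turned into a signature, the sketch below summarises a trace of (time, x, y) samples. The abstract does not specify which features make up the MDS, so the feature names and definitions here (path length, straightness, mean speed, speed spread) are hypothetical choices, not the thesis's method.

```python
import math
from statistics import mean, pstdev

Sample = tuple[float, float, float]  # (time in seconds, x, y) - assumed layout


def mouse_features(samples: list[Sample]) -> dict[str, float]:
    """Summarise a cursor trace; the features are illustrative only."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    path_length = 0.0
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        path_length += step
        if t1 > t0:  # skip samples that share a timestamp
            speeds.append(step / (t1 - t0))
    if not speeds:
        raise ValueError("timestamps must increase at least once")
    (_, xs, ys), (_, xe, ye) = samples[0], samples[-1]
    direct = math.hypot(xe - xs, ye - ys)
    return {
        "path_length": path_length,
        # 1.0 means a perfectly straight path from first to last sample
        "straightness": direct / path_length if path_length > 0 else 1.0,
        "mean_speed": mean(speeds),
        "speed_stdev": pstdev(speeds),
    }
```

A scripted cursor that moves in a straight line at constant speed would score a straightness of 1.0 and a speed spread of 0, whereas a human trace would typically show some curvature and speed variation; how such features are actually thresholded or classified is outside what the abstract states.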
Mathematical Expression Recognition based on Probabilistic Grammars
[EN] Mathematical notation is well known and used all over the
world. Humankind has progressed from simple methods for representing
counts to today's well-defined mathematical notation, able to account for
complex problems. Furthermore, mathematical expressions constitute a
universal language in scientific fields, and many information
resources containing mathematics have been created during the last
decades. However, in order to efficiently access all that information,
scientific documents have to be digitized or produced directly in
electronic formats.
Although most people are able to understand and produce mathematical
information, introducing math expressions into electronic devices
requires learning specific notations or using editors. Automatic
recognition of mathematical expressions aims at filling this gap
between the knowledge of a person and the input accepted by
computers. This way, printed documents containing math expressions
could be automatically digitized, and handwriting could be used for
direct input of math notation into electronic devices.
This thesis is devoted to developing an approach for mathematical
expression recognition. In this document we propose an approach for
recognizing any type of mathematical expression (printed or
handwritten) based on probabilistic grammars. In order to do so, we
develop a formal statistical framework from which several
probability distributions are derived. Throughout the document, we deal with the
definition and estimation of all these probabilistic sources of
information. Finally, we define the parsing algorithm that globally
computes the most probable mathematical expression for a given input
according to the statistical framework.
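The idea of a parsing algorithm that computes the most probable expression under a probabilistic grammar can be sketched with a Viterbi-style CYK parser. This is a simplified one-dimensional stand-in: the thesis works with two-dimensional structure and additional probabilistic sources, none of which are modelled here, and the toy grammar, its probabilities, and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form.
# Lexical rules: (symbol, token) -> probability
LEXICAL = {("Expr", "x"): 0.3, ("Expr", "1"): 0.3, ("Op", "+"): 1.0}
# Binary rules: (parent, left child, right child) -> probability
BINARY = {("Expr", "Expr", "OpExpr"): 0.4, ("OpExpr", "Op", "Expr"): 1.0}


def viterbi_cyk(tokens, start="Expr"):
    """Return (probability, tree) of the most probable parse, or (0.0, None)."""
    n = len(tokens)
    best = defaultdict(dict)  # best[(i, j)][symbol] = (probability, tree)
    for i, tok in enumerate(tokens):
        for (sym, word), p in LEXICAL.items():
            if word == tok:
                best[(i, i + 1)][sym] = (p, (sym, tok))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for (parent, left, right), p in BINARY.items():
                    if left in best[(i, k)] and right in best[(k, j)]:
                        lp, lt = best[(i, k)][left]
                        rp, rt = best[(k, j)][right]
                        cand = p * lp * rp
                        if cand > best[(i, j)].get(parent, (0.0, None))[0]:
                            best[(i, j)][parent] = (cand, (parent, lt, rt))
    return best[(0, n)].get(start, (0.0, None))
```

For the token sequence `["x", "+", "1"]` the parser multiplies the rule probabilities along the single valid derivation (0.4 × 0.3 × 1.0 × 1.0 × 0.3); when several derivations exist, it keeps only the highest-scoring one for each span and symbol, which is what makes the search global rather than greedy.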
An important point in this study is to provide objective performance
evaluation and report results using public data and standard
metrics. We inspected the problems of automatic evaluation in this
field and looked for the best solutions. We also report several
experiments using public databases and we participated in several
international competitions. Furthermore, we have released most of the
software developed in this thesis as open source.
We also explore some of the applications of mathematical expression
recognition. In addition to the direct applications of transcription
and digitization, we report two important proposals. First, we
developed mucaptcha, a method to tell humans and computers apart by
means of math handwriting input, which represents a novel application
of math expression recognition. Second, we tackled the problem of
layout analysis of structured documents using the statistical
framework developed in this thesis, because both are two-dimensional
problems that can be modeled with probabilistic grammars.
The approach developed in this thesis for mathematical expression
recognition has obtained good results at different levels. It has
produced several scientific publications in international conferences
and journals, and has received awards in international competitions.
Álvaro Muñoz, F. (2015). Mathematical Expression Recognition based on Probabilistic Grammars [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/51665