20 research outputs found
Multi-Factor Authentication: A Survey
Digitalization now pervades every aspect of modern society. One of the key enablers for keeping this process secure is authentication. It covers many different areas of a hyper-connected world, including online payments, communications, and access rights management. This work sheds light on the evolution of authentication systems from Single-Factor Authentication (SFA) through Two-Factor Authentication (2FA) towards Multi-Factor Authentication (MFA). In particular, MFA is expected to be used for human-to-everything interactions by enabling fast, user-friendly, and reliable authentication when accessing a service. This paper surveys the already available and emerging sensors (factor providers) that allow a user to be authenticated with the system directly or by involving the cloud. The corresponding challenges from the user as well as the service provider perspective are also reviewed. An MFA system based on a reversed Lagrange polynomial within Shamir’s Secret Sharing (SSS) scheme is further proposed to enable more flexible authentication. This solution covers the cases of authenticating the user even if some of the factors are mismatched or absent. Our framework allows for qualifying the missing factors by authenticating the user without disclosing sensitive biometric data to the verification entity. Finally, a vision of the future trends in MFA is discussed.
Peer reviewed
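To make the threshold idea behind Shamir's Secret Sharing concrete, the following is a minimal sketch of the textbook (k, n) scheme with Lagrange interpolation at zero. It is not the reversed-polynomial variant proposed in the paper, whose details the abstract does not give. The field modulus `PRIME` and the function names `split_secret` and `recover_secret` are assumptions chosen for illustration.

```python
import secrets

# Assumed field modulus for illustration: the Mersenne prime 2**127 - 1.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares):
    """Split `secret` into `num_shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Evaluate the interpolating polynomial at x = 0 via Lagrange basis terms."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        inv_den = pow(den, PRIME - 2, PRIME)
        secret = (secret + yi * num * inv_den) % PRIME
    return secret


if __name__ == "__main__":
    shares = split_secret(123456789, threshold=3, num_shares=5)
    # Any three of the five shares are enough; two shares are "absent".
    print(recover_secret([shares[0], shares[2], shares[4]]))
```

In an MFA reading of this sketch, each share would stand for one authentication factor, so a user presenting any `threshold` matching factors could still be verified when the remaining ones are missing.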
Development of Technologies for the Detection of (Cyber)Bullying Actions: The BullyBuster Project
Bullying and cyberbullying are harmful social phenomena that involve the intentional, repeated use of power to intimidate or harm others. The ramifications of these actions are felt not just at the individual level but also pervasively throughout society, necessitating immediate attention and practical solutions. The BullyBuster project pioneers a multi-disciplinary approach, integrating artificial intelligence (AI) techniques with psychological models to comprehensively understand and combat these issues. In particular, employing AI in the project allows the automatic identification of potentially harmful content by analyzing linguistic patterns and behaviors in various data sources, including photos and videos. This timely detection enables alerts to relevant authorities or moderators, allowing for rapid interventions and potential harm mitigation. This paper, a culmination of previous research and advancements, details the potential for significantly enhancing cyberbullying detection and prevention by focusing on the system’s design and the novel application of AI classifiers within an integrated framework. Our primary aim is to evaluate the feasibility and applicability of such a framework in a real-world application context. The proposed approach is shown to tackle the pervasive issue of cyberbullying effectively
Análise de distribuições de distâncias entre palavras genómicas
The investigation of DNA has been one of the most developed areas of
research in this and in the last century. However, there is a long way to go
to fully understand the DNA code. With the growing volume of sequenced DNA
data, mathematical methods play an important role in addressing the need
for efficient quantitative techniques for the detection of regions of interest
and overall characteristics in these sequences.
A feature of interest in the study of genomic words is their spatial distribution
along a DNA sequence, which can be characterized by the distances between
words. Counting such distances provides discrete distributions that may
be analyzed from a statistical point of view. In this work we explore the
distances between genomic words as a mathematical descriptor of DNA
sequences. The main goal is to design, develop and apply statistical methods
specially designed for their distributions, in order to capture information
about the primary and secondary structure of DNA.
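As a hedged illustration of the descriptor described above, the sketch below computes the distances between consecutive occurrences of a word in a sequence and the resulting empirical distribution. The function names and the choice to count overlapping occurrences are assumptions made for this example, not details taken from the thesis.

```python
from collections import Counter


def word_positions(sequence, word):
    """Return the start index of every occurrence of `word` (overlaps included)."""
    positions = []
    start = sequence.find(word)
    while start != -1:
        positions.append(start)
        start = sequence.find(word, start + 1)
    return positions


def inter_word_distances(sequence, word):
    """Distances between consecutive occurrences of `word` in `sequence`."""
    positions = word_positions(sequence, word)
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_distribution(sequence, word):
    """Empirical relative frequency of each inter-word distance."""
    distances = inter_word_distances(sequence, word)
    if not distances:
        return {}
    counts = Counter(distances)
    total = len(distances)
    return {d: counts[d] / total for d in sorted(counts)}


if __name__ == "__main__":
    # "ACG" starts at indices 0, 3 and 8, so the distances are 3 and 5.
    print(inter_word_distances("ACGACGTTACG", "ACG"))
    print(distance_distribution("ACGACGTTACG", "ACG"))
```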
The characterization of empirical inter-word distance distributions involves
the problem of the exponential increase in the number of distributions
as the word length increases, leading to the need for data reduction.
Moreover, if the data can be validly clustered, the class labels may provide
a meaningful description of similarities and differences between sets of
distributions. Therefore, we explore the potential of inter-word distance
distributions to yield a word clustering that highlights similar patterns
of word distributions as well as summarized characteristics of each set of
distributions.
With the aim of performing comparative studies between genomic sequences
and defining species signatures, we deduce exact distributions of inter-word
distances under random scenarios. Based on these theoretical distributions,
we define genomic signatures of species able to discriminate between species
and to capture their evolutionary relation. We presume that the study of
distribution similarities and the clustering procedure allow identifying words
whose distance distribution strongly differs from a reference distribution or
from the global behaviour of the majority of the words. One of the key topics
of our research focuses on the establishment of procedures that capture
distance distributions with atypical behaviours, herein referred to as atypical
distributions.
In the genomic context, words with an atypical distance distribution may
be related to some biological function (motifs). We expect that our
results may be used to provide some sort of classification of sequences,
identifying evolutionary patterns and allowing for the prediction of functional
properties, thereby contributing to the advancement of knowledge about
DNA sequences.
Doctoral Programme in Mathematics
Machine learning approaches to video activity recognition: from computer vision to signal processing
244 p. The research presented focuses on classification techniques for two different but related tasks, such that the second can be considered part of the first: human action recognition in videos and sign language recognition. In the first part, the starting hypothesis is that transforming the signals of a video with the Common Spatial Patterns algorithm (CSP, commonly used in electroencephalography systems) can yield new features that are useful for the subsequent classification of the videos with supervised classifiers. Several experiments were carried out on various datasets, including one created during this research from the point of view of a humanoid robot, with the aim of deploying the developed recognition system to improve human-robot interaction. In the second part, the previously developed techniques were applied to sign language recognition; in addition, a method based on decomposing signs is proposed for recognizing them, which adds the possibility of better explainability. The ultimate goal is to develop a sign language tutor capable of guiding users through the learning process, informing them of the errors they make and the reasons for those errors.
Discriminative dimensionality reduction: variations, applications, interpretations
Schulz A. Discriminative dimensionality reduction: variations, applications, interpretations. Bielefeld: Universität Bielefeld; 2017. The amount of digital data increases rapidly as a result of advances in information and sensor technology. Because the data sets grow with respect to their size, complexity and dimensionality, they are no longer easily accessible to a human user. The framework of dimensionality reduction addresses this problem by aiming to visualize complex data sets in two dimensions while preserving the relevant structure. While these methods can provide significant insights, the problem formulation of structure preservation is ill-posed in general and can lead to undesired effects.
In this thesis, the concept of discriminative dimensionality reduction is investigated as a particularly promising way to indicate relevant structure by specifying auxiliary data.
The goal is to overcome challenges in data inspection and to investigate to what extent discriminative dimensionality reduction methods can yield an improvement. The main scientific contributions are the following:
(I) The most popular techniques for discriminative dimensionality reduction are based on the Fisher metric. However, they are restricted in their applicability to complex settings: they can only be employed for fixed data sets, i.e. new data cannot be included in an existing embedding; only data provided in vectorial representation can be processed; and they are designed for discrete-valued auxiliary data and cannot be applied to real-valued data. We propose solutions to overcome these challenges.
(II) Complex data are not the only thing inaccessible to humans: the same holds for trained machine learning models, which often constitute black box models. In order to provide an intuitive interface to such models, we propose a general framework that makes it possible to visualize high-dimensional functions, such as regression or classification functions, in two dimensions.
(III) Although nonlinear dimensionality reduction techniques illustrate the structure of the data very well, they suffer from the fact that there is no explicit relationship between the original features and the obtained projection. We propose a methodology to create such a connection, thus making it possible to understand the importance of the features.
(IV) Although linear mappings constitute a very popular tool, a direct interpretation of their weights as feature relevance can be misleading. We propose a methodology which enables a valid interpretation by providing relevance bounds for each feature.
(V) The problem of transfer learning without given correspondence information between the source and target space and without labels is particularly challenging. Here, we utilize the structure-preserving property of dimensionality reduction methods to transfer knowledge in a latent space given by dimensionality reduction.
Comparison of Text Mining Models for Food and Dietary Constituent Named-Entity Recognition
Published version. Peer reviewed
Políticas de Copyright de Publicações Científicas em Repositórios Institucionais: O Caso do INESC TEC
The progressive transformation of scientific practices, driven by the development of new Information and Communication Technologies (ICT), has made it possible to increase access to information, gradually moving towards an opening of the research cycle. In the long term, this will help resolve a difficulty that researchers have faced: barriers, whether geographical or financial, that limit access. Although scientific production is dominated mostly by large commercial publishers and is subject to the rules they impose, the Open Access movement, whose first public declaration, the Budapest Declaration (BOAI), dates from 2002, proposes significant changes that benefit both authors and readers. The movement has been gaining importance in Portugal since 2003, when the first institutional repository at the national level was established. Institutional repositories emerged as a tool for disseminating the scientific production of an institution, with the aim of opening up research results both before publication and peer review (preprint) and after (postprint), and consequently increasing the visibility of the work of a researcher and of the respective institution. The present study, which analyzed the copyright policies of the most relevant scientific publications of INESC TEC, showed not only that publishers are increasingly adopting policies that allow publications to be self-archived in institutional repositories, but also that awareness-raising work remains to be done, not only among researchers but also within the institution and society as a whole. 
The production of a set of recommendations, including the implementation of an institutional policy that encourages self-archiving in the repository of publications produced within the institution, serves as a starting point for greater recognition of the scientific production of INESC TEC.
Mathematics in Software Reliability and Quality Assurance
This monograph concerns the mathematical aspects of software reliability and quality assurance and consists of 11 technical papers in this emerging area. Included are the latest research results related to formal methods and design, automatic software testing, software verification and validation, coalgebra theory, automata theory, hybrid system and software reliability modeling and assessment.