60 research outputs found
Making Users Indistinguishable: Attribute-wise Unlearning in Recommender Systems
With the growing privacy concerns in recommender systems, recommendation
unlearning, i.e., forgetting the impact of specific learned targets, is getting
increasing attention. Existing studies predominantly use training data, i.e.,
model inputs, as the unlearning target. However, we find that attackers can
extract private information, e.g., gender, race, and age, from a trained model
even if it has not been explicitly encountered during training. We call this
unseen information an attribute and treat it as the unlearning target. To
protect the sensitive attribute of users, Attribute Unlearning (AU) aims to
degrade attacking performance and make target attributes indistinguishable. In
this paper, we focus on a strict but practical setting of AU, namely
Post-Training Attribute Unlearning (PoT-AU), where unlearning can only be
performed after the training of the recommendation model is completed. To
address the PoT-AU problem in recommender systems, we design a two-component
loss function that consists of i) a distinguishability loss, which makes attribute
labels indistinguishable to attackers, and ii) a regularization loss, which
prevents drastic changes in the model that would negatively affect
recommendation performance. Specifically, we investigate two types of
distinguishability measurements, i.e., user-to-user and
distribution-to-distribution. We use the stochastic gradient descent algorithm
to optimize our proposed loss. Extensive experiments on three real-world
datasets demonstrate the effectiveness of our proposed methods.
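The two-component loss described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact distinguishability measures, so the code below assumes a simple distribution-level stand-in: the squared distance between the mean embeddings of two attribute groups. The regularization term is assumed to be the mean squared change from the original embeddings. All names (`two_component_loss`, `gradient_step`, `lam`) are hypothetical, and the sketch uses full-batch gradient descent rather than the stochastic variant the authors mention.

```python
def group_means(emb, labels):
    """Mean embedding of each attribute group (labels are 0 or 1)."""
    dim = len(emb[0])
    means = {}
    for g in (0, 1):
        rows = [v for v, lab in zip(emb, labels) if lab == g]
        means[g] = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
    return means


def two_component_loss(emb, orig, labels, lam=1.0):
    """Return (total, distinguishability, regularization) for the toy loss."""
    means = group_means(emb, labels)
    # Assumed distinguishability term: squared gap between group means.
    dist = sum((a - b) ** 2 for a, b in zip(means[0], means[1]))
    # Assumed regularization term: mean squared drift from the trained model.
    reg = sum((e - o) ** 2 for ev, ov in zip(emb, orig)
              for e, o in zip(ev, ov)) / len(emb)
    return dist + lam * reg, dist, reg


def gradient_step(emb, orig, labels, lam=1.0, lr=0.1):
    """One full-batch gradient-descent step using the analytic gradient."""
    means = group_means(emb, labels)
    counts = {g: labels.count(g) for g in (0, 1)}
    n = len(emb)
    updated = []
    for vec, ref, lab in zip(emb, orig, labels):
        sign = 1.0 if lab == 0 else -1.0
        row = []
        for d in range(len(vec)):
            g_dist = sign * 2.0 * (means[0][d] - means[1][d]) / counts[lab]
            g_reg = 2.0 * (vec[d] - ref[d]) / n
            row.append(vec[d] - lr * (g_dist + lam * g_reg))
        updated.append(row)
    return updated


# Toy "post-training" embeddings: two users per attribute group.
original = [[1.0, 0.0], [1.2, 0.1], [-1.0, 0.0], [-0.8, 0.1]]
labels = [0, 0, 1, 1]
embeddings = [row[:] for row in original]
_, dist_before, _ = two_component_loss(embeddings, original, labels)
for _ in range(50):
    embeddings = gradient_step(embeddings, original, labels)
_, dist_after, reg_after = two_component_loss(embeddings, original, labels)
print(dist_before, dist_after, reg_after)
```

The weight `lam` controls the trade-off: a larger value keeps the embeddings closer to the trained model at the cost of leaving the groups more separable.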
Enhanced image reconstruction of electrical impedance tomography using simultaneous algebraic reconstruction technique and K-means clustering
Electrical impedance tomography (EIT), a non-ionizing tomography method, is widely used in fields such as engineering and medicine. This study applies an iterative process to reconstruct EIT images using the simultaneous algebraic reconstruction technique (SART) algorithm combined with K-means clustering. The reconstruction starts by defining the finite element method (FEM) model and filtering the measurement data with a Butterworth low-pass filter. The next step solves the EIT inverse problem with the SART algorithm. The SART results are then classified using K-means clustering and thresholding. The reconstruction results were evaluated with the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and normalized root mean square error (NRMSE), and compared with the one-step Gauss-Newton (GN) method and total variation regularization based on iteratively reweighted least squares (TV-IRLS). The evaluation shows that the proposed reconstruction method achieves the highest average PSNR and SSIM among the compared methods, 24.24 and 0.94 respectively, and the lowest average NRMSE, 0.04. The performance evaluation also shows that the proposed method is faster than the other methods.
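To make the SART step in this abstract concrete, here is a minimal sketch of the SART update on a small linear system. It is an illustration only: the tiny non-negative matrix `A` is a hypothetical stand-in for a linearized EIT sensitivity matrix, and the FEM model, Butterworth filtering, and K-means post-processing from the abstract are not reproduced. The function name `sart` and its parameters are assumptions.

```python
def sart(A, b, relaxation=1.0, iterations=200):
    """Solve A x = b with the SART update.

    Assumes A has non-negative entries and no all-zero row or column.
    """
    rows, cols = len(A), len(A[0])
    row_sums = [sum(A[i]) for i in range(rows)]
    col_sums = [sum(A[i][j] for i in range(rows)) for j in range(cols)]
    x = [0.0] * cols
    for _ in range(iterations):
        # Residual of each measurement, normalized by its row sum.
        scaled = []
        for i in range(rows):
            predicted = sum(A[i][j] * x[j] for j in range(cols))
            scaled.append((b[i] - predicted) / row_sums[i])
        # Back-project all residuals at once, normalized by column sums.
        for j in range(cols):
            correction = sum(A[i][j] * scaled[i] for i in range(rows))
            x[j] += relaxation * correction / col_sums[j]
    return x


A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0]]
b = [3.0, 5.0, 4.0]  # measurements generated by x_true = [1, 2, 3]
estimate = sart(A, b)
print([round(v, 4) for v in estimate])
```

Unlike row-by-row ART, every measurement contributes to each update simultaneously, which is what the "simultaneous" in SART refers to.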
Credit Card Fraud Detection with Subspace Learning-based One-Class Classification
In an increasingly digitalized commerce landscape, the proliferation of
credit card fraud and the evolution of sophisticated fraudulent techniques have
led to substantial financial losses. Automating credit card fraud detection is
a viable way to accelerate detection, reducing response times and minimizing
potential financial losses. However, addressing this challenge is complicated
by the highly imbalanced nature of the datasets, where genuine transactions
vastly outnumber fraudulent ones. Furthermore, the high number of dimensions
within the feature set gives rise to the "curse of dimensionality". In this
paper, we investigate subspace learning-based approaches centered on One-Class
Classification (OCC) algorithms, which excel in handling imbalanced data
distributions and possess the capability to anticipate and counter the
transactions carried out by yet-to-be-invented fraud techniques. The study
highlights the potential of subspace learning-based OCC algorithms by
investigating the limitations of current fraud detection strategies and the
specific challenges of credit card fraud detection. These algorithms integrate
subspace learning into the data description; hence, the models transform the
data into a lower-dimensional subspace optimized for OCC. Through rigorous
experimentation and analysis, the study validated that the proposed approach
helps tackle the curse of dimensionality and the imbalanced nature of credit
card data for automatic fraud detection to mitigate financial losses caused by
fraudulent activities.
Comment: 6 pages, 1 figure, 2 tables. Accepted at IEEE Symposium Series on
Computational Intelligence 202
Particle swarm optimization for linear support vector machines based classifier selection
Particle swarm optimization is a metaheuristic technique widely applied to various optimization problems, including parameter selection for classification techniques. This paper presents an approach to optimizing linear support vector machine classifiers that combines selecting a classifier from a family of similar classifiers with parameter optimization. Experimental results indicate that the proposed heuristics can achieve competitive or even better results than similar techniques and approaches, and can be used as a solver for various classification tasks.
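As a rough illustration of the parameter-optimization half of this approach, the sketch below runs a basic global-best particle swarm over a single parameter. It makes several assumptions not found in the abstract: the search variable is the base-10 logarithm of an SVM penalty parameter, and `validation_error` is a hypothetical smooth stand-in for a cross-validated error curve rather than a trained classifier. The classifier-selection step from the paper is not modeled.

```python
import random


def pso_minimize(objective, low, high, particles=20, iterations=100,
                 inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimize a 1-D objective on [low, high] with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [rng.uniform(low, high) for _ in range(particles)]
    vel = [0.0] * particles
    best_pos = pos[:]                        # personal best positions
    best_val = [objective(p) for p in pos]   # personal best values
    g = min(range(particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g], best_val[g]  # global best so far
    for _ in range(iterations):
        for i in range(particles):
            r1, r2 = rng.random(), rng.random()
            # Velocity blends momentum, pull to own best, pull to swarm best.
            vel[i] = (inertia * vel[i]
                      + cognitive * r1 * (best_pos[i] - pos[i])
                      + social * r2 * (g_pos - pos[i]))
            pos[i] = min(high, max(low, pos[i] + vel[i]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val < g_val:
                    g_pos, g_val = pos[i], val
    return g_pos, g_val


def validation_error(log_c):
    """Hypothetical stand-in for cross-validated SVM error at C = 10**log_c."""
    return 0.1 + (log_c - 1.0) ** 2


best_log_c, best_err = pso_minimize(validation_error, -3.0, 3.0)
print(best_log_c, best_err)
```

In practice the objective would train and validate a classifier at each candidate value, which makes each evaluation far more expensive than this toy curve.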
Machine learning methods for sign language recognition: a critical review and analysis.
Sign language is an essential tool for bridging the communication gap between hearing and hearing-impaired people. However, the diversity of over 7000 present-day sign languages, with variability in motion position, hand shape, and position of body parts, makes automatic sign language recognition (ASLR) a complex problem. To overcome this complexity, researchers are investigating better ways of developing ASLR systems that provide intelligent solutions, and they have demonstrated remarkable success. This paper analyses the research published on intelligent systems in sign language recognition over the past two decades. A total of 649 publications related to decision support and intelligent systems for sign language recognition (SLR) are extracted from the Scopus database and analysed. The extracted publications are analysed using the bibliometric software VOSViewer to (1) obtain the publications' temporal and regional distributions and (2) create the cooperation networks between affiliations and authors and identify productive institutions in this context. Moreover, reviews of techniques for vision-based sign language recognition are presented. Various feature extraction and classification techniques used in SLR to achieve good results are discussed. The literature review presented in this paper shows the importance of incorporating intelligent solutions into sign language recognition systems and reveals that perfect intelligent systems for sign language recognition remain an open problem. Overall, it is expected that this study will facilitate knowledge accumulation and the creation of intelligent-based SLR, and provide readers, researchers, and practitioners with a roadmap to guide future directions.
Exploring attributes, sequences, and time in Recommender Systems: From classical to Point-of-Interest recommendation
Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defense: 08-07-2021
Since the emergence of the Internet and the spread of digital communications
throughout the world, the amount of data stored on the Web has been
growing exponentially. In this new digital era, a large number of companies
have emerged with the purpose of filtering the information available on the
web and providing users with interesting items. The algorithms and models
used to recommend these items are called Recommender Systems. These
systems are applied to a large number of domains, from music, books, or
movies to dating or Point-of-Interest (POI), which is an increasingly popular
domain where users receive recommendations of different places when
they arrive in a city.
In this thesis, we focus on exploiting the use of contextual information, especially
temporal and sequential data, and apply it in novel ways in both
traditional and Point-of-Interest recommendation. We believe that this type
of information can be used not only for creating new recommendation models
but also for developing new metrics for analyzing the quality of these
recommendations. In one of our first contributions we propose different
metrics, some of them derived from previously existing frameworks, using
this contextual information. Besides, we also propose an intuitive algorithm
that is able to provide recommendations to a target user by exploiting the
last common interactions with other similar users of the system.
At the same time, we conduct a comprehensive review of the algorithms
that have been proposed in the area of POI recommendation between 2011
and 2019, identifying the common characteristics and methodologies used.
Once this classification of the algorithms proposed to date is completed, we
design a mechanism to recommend complete routes (not only independent
POIs) to users, making use of reranking techniques. In addition, due to the
great difficulty of making recommendations in the POI domain, we propose
the use of data aggregation techniques to use information from different
cities to generate POI recommendations in a given target city.
In the experimental work we present our approaches on different datasets
belonging to both classical and POI recommendation. The results obtained
in these experiments confirm the usefulness of our recommendation proposals,
in terms of ranking accuracy and other dimensions like novelty, diversity,
and coverage, and the appropriateness of our metrics for analyzing temporal
information and biases in the recommendations produced.
Statistical Approaches for Binary and Categorical Data Modeling
Nowadays a massive amount of data is generated as the development of technology and services has accelerated. The demand for data clustering as a means of gaining knowledge has therefore increased in many sectors, such as medical sciences, risk assessment, and product sales. Moreover, binary data is widely used in applications including market basket data and text document analysis. Because the classic, widely used k-means method is inappropriate for clustering binary data, we propose an improvement of the K-medoids algorithm that uses binary similarity measures instead of the Euclidean distance generally deployed in clustering algorithms. In addition to the K-medoids clustering method, agglomerative hierarchical clustering methods based on Gaussian probability models have recently been shown to be efficient in different applications. However, the emergence of pattern recognition applications in which the features are binary or integer-valued demands extending research efforts to such data types. We propose a hierarchical clustering framework for clustering categorical data based on Multinomial and Bernoulli mixture models. We have compared two widely used density-based distances, namely Bhattacharyya and Kullback-Leibler. The merits of our proposed clustering frameworks have been shown through extensive experiments on text clustering, binary image categorization, and image categorization.
The development of generative/discriminative approaches for classifying different kinds of data has attracted scholars' attention. Considering the strengths and weaknesses of both approaches, several hybrid learning approaches that combine the desirable properties of both have been developed. Our contribution is to combine Support Vector Machines (SVMs) and a Bernoulli mixture model in order to classify binary data. We propose using a Bernoulli mixture model to generate probabilistic kernels for SVMs based on information divergence. These kernels make intelligent use of unlabeled binary data to achieve good data discrimination. We evaluate the proposed hybrid learning approach by classifying binary and texture images.
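The K-medoids idea described in this abstract, clustering binary vectors with a binary similarity measure in place of Euclidean distance, can be sketched as follows. The abstract does not say which binary measures were used, so the Jaccard distance here is an assumption chosen for illustration, and the function names are hypothetical. The sketch uses the simple alternating (Voronoi-style) K-medoids iteration rather than any specific variant from the thesis.

```python
def jaccard_distance(a, b):
    """1 - |a AND b| / |a OR b| for two equal-length 0/1 vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def assign(points, medoids, distance):
    """Map each medoid index to the indices of the points nearest to it."""
    clusters = {m: [] for m in medoids}
    for idx, p in enumerate(points):
        nearest = min(medoids, key=lambda m: distance(p, points[m]))
        clusters[nearest].append(idx)
    return clusters


def k_medoids(points, initial_medoids, distance, max_iter=20):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        clusters = assign(points, medoids, distance)
        updated = []
        for m, members in clusters.items():
            if not members:          # keep a medoid whose cluster emptied
                updated.append(m)
                continue
            # The new medoid minimizes total distance to its cluster members.
            updated.append(min(members, key=lambda c: sum(
                distance(points[c], points[o]) for o in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(points, medoids, distance)


points = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
medoids, clusters = k_medoids(points, [0, 1], jaccard_distance)
print(medoids, clusters)
```

Because medoids are always actual data points, the cluster representatives stay binary, which is one reason a medoid-based method suits this data better than averaging-based k-means.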
Rejection-oriented learning without complete class information
Machine Learning is commonly used to support decision-making in numerous, diverse contexts. Its usefulness in this regard is unquestionable: there are complex systems built on top of machine learning techniques whose descriptive and predictive capabilities go far beyond those of human beings. However, these systems still have limitations, and analysing them makes it possible to estimate their applicability and confidence in various cases. This is of interest because abstaining from providing a response is preferable to making a mistake in doing so. In the context of classification-like tasks, the indication of such an inconclusive output is called rejection. The research that culminated in this thesis led to the conception, implementation, and evaluation of rejection-oriented learning systems for two distinct tasks: open set recognition and data stream clustering. These systems were derived from the WiSARD artificial neural network, which had rejection modelling incorporated into its functioning. This text details and discusses these realizations. It also presents experimental results that allow assessing the scientific and practical importance of the proposed state-of-the-art methodology.
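The following minimal sketch shows one way rejection can be attached to a WiSARD-style weightless classifier. It is not the thesis's method: the abstract gives no details of its rejection modelling, so the fixed threshold on the best discriminator response is an assumption, as are the class names and the fixed (non-random) input-to-RAM mapping, which is used here only to keep the example reproducible.

```python
class Discriminator:
    """One WiSARD discriminator: RAM nodes, each addressed by a tuple of bits."""

    def __init__(self, mapping):
        self.mapping = mapping                # one tuple of input indices per RAM
        self.rams = [set() for _ in mapping]  # addresses seen during training

    def _addresses(self, bits):
        return [tuple(bits[i] for i in idx) for idx in self.mapping]

    def train(self, bits):
        for ram, addr in zip(self.rams, self._addresses(bits)):
            ram.add(addr)

    def response(self, bits):
        """Fraction of RAM nodes that recognize their address."""
        hits = sum(addr in ram
                   for ram, addr in zip(self.rams, self._addresses(bits)))
        return hits / len(self.rams)


class RejectingWiSARD:
    """WiSARD with an assumed rejection rule: abstain below a response threshold."""

    def __init__(self, n_bits, tuple_size, threshold=0.75):
        self.mapping = [tuple(range(i, min(i + tuple_size, n_bits)))
                        for i in range(0, n_bits, tuple_size)]
        self.threshold = threshold
        self.discriminators = {}

    def train(self, bits, label):
        if label not in self.discriminators:
            self.discriminators[label] = Discriminator(self.mapping)
        self.discriminators[label].train(bits)

    def predict(self, bits):
        """Return the best label, or None when the input is rejected."""
        if not self.discriminators:
            return None
        scores = {lab: d.response(bits)
                  for lab, d in self.discriminators.items()}
        label = max(scores, key=scores.get)
        return label if scores[label] >= self.threshold else None


model = RejectingWiSARD(n_bits=8, tuple_size=2)
model.train([1, 1, 1, 1, 0, 0, 0, 0], "a")
model.train([0, 0, 0, 0, 1, 1, 1, 1], "b")
known = model.predict([1, 1, 1, 1, 0, 0, 0, 0])
noisy = model.predict([1, 1, 1, 1, 0, 0, 0, 1])
unknown = model.predict([1, 0, 1, 0, 1, 0, 1, 0])
print(known, noisy, unknown)
```

Returning `None` for low-confidence inputs is what lets such a classifier flag samples from classes it never saw, the core requirement of open set recognition.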