Cooperative Training in Multiple Classifier Systems
Multiple classifier systems have been shown to be an effective technique for classification.
The success of multiple classifiers does not entirely depend on the base classifiers
and/or the aggregation technique. Other parameters, such as training data, feature
attributes, and correlation among the base classifiers may also contribute to the
success of multiple classifiers. In addition, the interaction of these parameters with each other may affect the performance of multiple classifiers. In the present study, we examine some of these interactions and investigate their effects on the performance of classifier ensembles.
The proposed research introduces a different direction in the field of multiple
classifier systems. We attempt to understand and compare ensemble methods from
the cooperation perspective. In this thesis, we narrow our focus to cooperation at the training level. We first developed measures to estimate the degree and type of cooperation among training data partitions. These measures enabled us to evaluate the diversity and correlation among a set of disjoint and overlapped partitions. With the aid of properly selected measures and training information, we proposed two new data partitioning approaches: Cluster, De-cluster, and Selection (CDS) and Cooperative Cluster, De-cluster, and Selection (CO-CDS). In the end, a
comprehensive comparative study was conducted where we compared our proposed
training approaches with several other approaches in terms of robustness of their
usage, resultant classification accuracy and classification stability.
Experimental assessment of CDS and CO-CDS training approaches validates
their robustness as compared to other training approaches. In addition, this study
suggests that: 1) cooperation is generally beneficial and 2) classifier ensembles that
cooperate through sharing information have higher generalization ability compared
to the ones that do not share training information.
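To make the distinction between disjoint and overlapped training partitions concrete, the following is a minimal sketch. It is not the CDS or CO-CDS procedure, which the abstract does not specify. The function names, the neighbour-borrowing scheme, and the use of mean pairwise Jaccard overlap as a stand-in for a "degree of cooperation" measure are all assumptions made for illustration.

```python
import random


def disjoint_partitions(indices, k, seed=0):
    """Split sample indices into k disjoint, near-equal parts (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [set(shuffled[i::k]) for i in range(k)]


def overlapped_partitions(indices, k, share, seed=0):
    """Start from disjoint parts, then let each part borrow a fraction
    `share` of the next part's samples (an assumed sharing scheme)."""
    rng = random.Random(seed)
    parts = disjoint_partitions(indices, k, seed)
    result = []
    for i, part in enumerate(parts):
        neighbour = sorted(parts[(i + 1) % k])
        borrowed = rng.sample(neighbour, int(share * len(neighbour)))
        result.append(part | set(borrowed))
    return result


def mean_jaccard(parts):
    """Mean pairwise Jaccard overlap: 0 for disjoint parts, larger when data is shared."""
    pairs = [(a, b) for i, a in enumerate(parts) for b in parts[i + 1:]]
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    print(mean_jaccard(disjoint_partitions(range(30), 3)))
    print(mean_jaccard(overlapped_partitions(range(30), 3, 0.5)))
```

Each partition would then train one base classifier; the overlap score only describes how much training data the base classifiers have in common, not how accurate the resulting ensemble is.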
Multimedia
The ubiquitous and effortless digital data capture and processing capabilities offered by most devices today have led to an unprecedented penetration of multimedia content in our everyday life. To make the most of this phenomenon, the rapidly increasing volume and usage of digitised content requires constant re-evaluation and adaptation of multimedia methodologies to meet the relentless change in requirements from both the user and system perspectives. Advances in Multimedia provides readers with an overview of the ever-growing field of multimedia by bringing together research studies and surveys from different subfields that highlight these aspects. The main topics of the book include multimedia management in peer-to-peer structures and wireless networks, security characteristics in multimedia, semantic gap bridging for multimedia content, and novel multimedia applications.
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up a
possible solution accordingly.
Radio Galaxy Zoo: Knowledge Transfer Using Rotationally Invariant Self-Organising Maps
With the advent of large scale surveys the manual analysis and classification
of individual radio source morphologies is rendered impossible as existing
approaches do not scale. The analysis of complex morphological features in the
spatial domain is a particularly important task. Here we discuss the challenges
of transferring crowdsourced labels obtained from the Radio Galaxy Zoo project
and introduce a proper transfer mechanism via quantile random forest
regression. By using parallelized rotation and flipping invariant Kohonen-maps,
image cubes of Radio Galaxy Zoo selected galaxies formed from the FIRST radio
continuum and WISE infrared all sky surveys are first projected down to a
two-dimensional embedding in an unsupervised way. This embedding can be seen as
a discretised space of shapes with the coordinates reflecting morphological
features as expressed by the automatically derived prototypes. We find that
these prototypes have reconstructed physically meaningful processes across two
channel images at radio and infrared wavelengths in an unsupervised manner. In
the second step, images are compared with those prototypes to create a
heat-map, which is the morphological fingerprint of each object and the basis
for transferring the user-generated labels. These heat-maps reduce the
feature space by a factor of 248 and can be used as the basis for
subsequent ML methods. Using an ensemble of decision trees we achieve upwards
of 85.7% and 80.7% accuracy when predicting the number of components and peaks
in an image, respectively, using these heat-maps. We also question the
currently used discrete classification schema and introduce a continuous scale
that better reflects the uncertainty in transition between two classes, caused
by sensitivity and resolution limits.
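The heat-map step described above, comparing an image with every prototype on the map, can be sketched as follows. This is a simplified illustration rather than the authors' implementation. It assumes small square images stored as nested lists, Euclidean distance as the similarity measure, and invariance only to 90-degree rotations and a horizontal flip; the function names are hypothetical.

```python
def rotate90(img):
    """Rotate a square image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip(img):
    """Mirror an image horizontally."""
    return [row[::-1] for row in img]


def variants(img):
    """The 8 images reachable by 90-degree rotations and a horizontal flip."""
    out, current = [], img
    for _ in range(4):
        out.append(current)
        out.append(flip(current))
        current = rotate90(current)
    return out


def distance(a, b):
    """Euclidean distance between two images of equal shape."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) ** 0.5


def heat_map(img, prototypes):
    """For each prototype on the 2D map, keep the smallest distance over all
    rotated/flipped variants of the image. `prototypes` is a grid of images."""
    vs = variants(img)
    return [[min(distance(v, p) for v in vs) for p in row] for row in prototypes]


if __name__ == "__main__":
    image = [[1, 0], [0, 0]]
    grid = [[[[0, 1], [0, 0]], [[0, 0], [0, 0]]]]  # a 1 x 2 map of 2 x 2 prototypes
    print(heat_map(image, grid))
```

Because the minimum is taken over all variants, a rotated or mirrored copy of the same image yields the same heat-map, which is the property that lets the heat-map act as a compact morphological fingerprint.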
QUIS-CAMPI: Biometric Recognition in Surveillance Scenarios
Concerns about individuals' security have justified the increasing number of surveillance
cameras deployed both in private and public spaces. However, contrary to popular belief,
these devices are in most cases used solely for recording, instead of feeding intelligent analysis
processes capable of extracting information about the observed individuals. Thus, even though
video surveillance has already proved to be essential for solving multiple crimes, obtaining relevant
details about the subjects that took part in a crime depends on the manual inspection
of recordings. As such, the current goal of the research community is the development of
automated surveillance systems capable of monitoring and identifying subjects in surveillance
scenarios. Accordingly, the main goal of this thesis is to improve the performance of biometric
recognition algorithms in data acquired from surveillance scenarios. In particular, we aim at
designing a visual surveillance system capable of acquiring biometric data at a distance (e.g.,
face, iris or gait) without requiring human intervention in the process, as well as devising biometric
recognition methods robust to the degradation factors resulting from the unconstrained
acquisition process.
Regarding the first goal, the analysis of the data acquired by typical surveillance systems
shows that large acquisition distances significantly decrease the resolution of biometric samples,
and thus their discriminability is not sufficient for recognition purposes. In the literature,
diverse works point out Pan Tilt Zoom (PTZ) cameras as the most practical way for acquiring
high-resolution imagery at a distance, particularly when using a master-slave configuration. In
the master-slave configuration, the video acquired by a typical surveillance camera is analyzed
for obtaining regions of interest (e.g., car, person) and these regions are subsequently imaged
at high-resolution by the PTZ camera. Several methods have already shown that this configuration
can be used for acquiring biometric data at a distance. Nevertheless, these methods
failed at providing effective solutions to the typical challenges of this strategy, restraining its
use in surveillance scenarios. Accordingly, this thesis proposes two methods to support the development
of a biometric data acquisition system based on the cooperation of a PTZ camera
with a typical surveillance camera. The first proposal is a camera calibration method capable
of accurately mapping the coordinates of the master camera to the pan/tilt angles of the PTZ
camera. The second proposal is a camera scheduling method for determining - in real-time -
the sequence of acquisitions that maximizes the number of different targets obtained, while
minimizing the cumulative transition time. In order to achieve the first goal of this thesis,
both methods were combined with state-of-the-art approaches of the human monitoring field
to develop a fully automated surveillance system capable of acquiring biometric data at a distance and
without human cooperation, designated as QUIS-CAMPI system.
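The scheduling objective described above (observe as many different targets as possible while keeping the cumulative transition time low) can be illustrated with a simple greedy heuristic. This is not the scheduling method proposed in the thesis. It is a nearest-target-first sketch under assumed conditions: targets are static (pan, tilt) angles in degrees, pan and tilt move simultaneously at a fixed speed, and each acquisition takes a fixed dwell time. All names and parameter values are hypothetical.

```python
def transition_time(a, b, speed_deg_per_s=60.0):
    """Assumed PTZ model: pan and tilt move together, so the slower axis dominates."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1])) / speed_deg_per_s


def greedy_schedule(start, targets, budget_s, dwell_s=1.0):
    """Visit the closest unvisited target first until the time budget runs out.

    targets: dict mapping a target name to its (pan, tilt) angles.
    Returns the visiting order and the total time spent.
    """
    position, elapsed, order = start, 0.0, []
    remaining = dict(targets)
    while remaining:
        # Ties are broken by name so the result is deterministic.
        name = min(remaining, key=lambda n: (transition_time(position, remaining[n]), n))
        cost = transition_time(position, remaining[name]) + dwell_s
        if elapsed + cost > budget_s:
            break  # the cheapest move no longer fits, so no other move does
        elapsed += cost
        position = remaining.pop(name)
        order.append(name)
    return order, elapsed


if __name__ == "__main__":
    people = {"a": (30.0, 0.0), "b": (60.0, 0.0), "c": (-90.0, 0.0)}
    print(greedy_schedule((0.0, 0.0), people, budget_s=10.0))
```

A greedy order is fast enough for real-time use but is not guaranteed to be optimal; with moving targets, the remaining observation window of each subject would also have to enter the cost.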
The QUIS-CAMPI system is the basis for pursuing the second goal of this thesis. The analysis
of the performance of the state-of-the-art biometric recognition approaches shows that these
approaches attain almost ideal recognition rates in unconstrained data. However, this performance
is incongruous with the recognition rates observed in surveillance scenarios. Taking into
account the drawbacks of current biometric datasets, this thesis introduces a novel dataset comprising
biometric samples (face images and gait videos) acquired by the QUIS-CAMPI system at a
distance ranging from 5 to 40 meters and without human intervention in the acquisition process.
This set allows an objective assessment of the performance of state-of-the-art biometric recognition
methods in data that truly encompass the covariates of surveillance scenarios. As such, this set
was exploited for promoting the first international challenge on biometric recognition in the wild. This thesis describes the evaluation protocols adopted, along with the results obtained
by the nine methods specially designed for this competition. In addition, the data acquired by
the QUIS-CAMPI system were crucial for accomplishing the second goal of this thesis, i.e., the
development of methods robust to the covariates of surveillance scenarios. The first proposal
regards a method for detecting corrupted features in biometric signatures inferred by a redundancy
analysis algorithm. The second proposal is a caricature-based face recognition approach
capable of enhancing the recognition performance by automatically generating a caricature
from a 2D photo. The experimental evaluation of these methods shows that both approaches
contribute to improving the recognition performance in unconstrained data.
Growing concern about the security of individuals has justified the increase in the number
of video surveillance cameras installed in both private and public spaces. However, contrary
to common belief, these devices are, in most cases, used only for recording and are not
connected to any kind of intelligent software capable of inferring, in real time, information
about the observed individuals. Thus, although video surveillance has proved essential in
solving several crimes, its use is still confined to providing videos that have to be inspected
manually to extract relevant information about the subjects involved in the crime. As such,
the main challenge currently facing the scientific community is the development of automated
systems capable of monitoring and identifying individuals in video surveillance environments.
The main goal of this thesis is to extend the applicability of biometric recognition systems
to video surveillance environments. More specifically, it aims to 1) design a video surveillance
system that can acquire biometric data at long distances (e.g., images of the face or iris, or
gait videos) without requiring the cooperation of the individuals in the process; and 2) develop
biometric recognition methods robust to the degradation factors inherent in the data acquired
by this type of system.
Regarding the first goal, the analysis of the data acquired by typical video surveillance systems
shows that, because of the capture distance, the sampled biometric traits are not discriminative
enough to guarantee acceptable recognition rates. In the literature, several works advocate the
use of Pan Tilt Zoom (PTZ) cameras to acquire high-resolution images at a distance, particularly
in master-slave mode. In the master-slave configuration, an intelligent analysis module selects
regions of interest (e.g., cars, people) from the video acquired by a video surveillance camera,
and the PTZ camera is steered to acquire those regions at high resolution. Several methods have
already shown that this configuration can be used to acquire biometric data at a distance, yet
they did not solve some of the problems associated with this strategy, which has prevented its
use in video surveillance environments. This thesis therefore proposes two methods to enable the
acquisition of biometric data in video surveillance environments using a PTZ camera assisted by
a typical video surveillance camera. The first is a calibration method capable of accurately
mapping the coordinates of the master camera to the angle of the PTZ (slave) camera without the
aid of other optical devices. The second method determines the order in which a set of subjects
will be observed by the PTZ camera. The proposed method determines in real time the sequence of
observations that maximizes the number of different subjects observed while minimizing the
total transition time between subjects. To achieve the first goal of this thesis, the two
proposed methods were combined with advances in the field of human monitoring to develop the
first fully automated video surveillance system capable of acquiring biometric data at long
distances without requiring the cooperation of the individuals in the process, designated the
QUIS-CAMPI system.
The QUIS-CAMPI system is the starting point for the research related to the second goal of this
thesis. The analysis of the performance of state-of-the-art biometric recognition methods shows
that they achieve almost perfect recognition rates on data acquired without constraints (e.g.,
recognition rates above 99% on the LFW dataset). However, this performance is not corroborated
by the results observed in video surveillance environments, which suggests that current datasets
do not truly contain the degradation factors typical of video surveillance environments. Taking
into account the weaknesses of current biometric datasets, this thesis introduces a new set of
biometric data (face images and gait videos) acquired by the QUIS-CAMPI system at a maximum
distance of 40 m and without the cooperation of the subjects in the acquisition process. This
set allows an objective evaluation of the performance of state-of-the-art methods in recognizing
individuals in images and videos captured in a real video surveillance environment. As such,
this set was used to promote the first competition on biometric recognition in uncontrolled
environments. This thesis describes the evaluation protocols used, as well as the results
obtained by 9 methods specially designed for this competition. In addition, the data acquired by
the QUIS-CAMPI system were essential for the development of two methods to increase robustness
to the degradation factors observed in video surveillance environments. The first is a method
for detecting corrupted features in biometric signatures by analyzing the redundancy between
subsets of features. The second is a face recognition method based on caricatures automatically
generated from a single photo of the subject. The experiments carried out show that both methods
reduce error rates in data acquired in an uncontrolled manner.
An empirical study on credit evaluation of SMEs based on detailed loan data
Small and micro-sized Enterprises (SMEs) are an important part of the Chinese economic system. Establishing a credit evaluation model for SMEs can help financial intermediaries reveal the credit risk of enterprises and reduce the cost of acquiring information about them. It can also serve as a guide for investors, which in turn helps companies with good credit.
This thesis conducts an empirical study based on data from a Chinese bank on loans granted to SMEs. The study aims to develop a data-driven model that can accurately predict whether or not a given loan has an acceptable risk from the bank's perspective. Furthermore, we test different methods for dealing with the problems of unbalanced classes and unreliable samples. Lastly, the importance of the variables is analyzed. Remaining Unpaid Principal, Floating Interest Rate, Time Until Maturity Date, Real Interest Rate, and Amount of Loan all have significant effects on the final result of the prediction. The main contribution of this study is a credit evaluation model for small and micro enterprises, which not only helps commercial banks accurately identify the credit risk of these enterprises, but also helps the enterprises overcome their credit difficulties.
Small and micro enterprises are an important part of the Chinese economic system. Defining a credit evaluation model for these enterprises can help financial intermediaries reveal the enterprises' credit risk and reduce the cost of acquiring information about them. It can also serve as a guide for investors, which in turn helps enterprises with good credit.
This thesis presents an empirical study based on data from a Chinese bank on loans granted to small and micro enterprises. The study aims to develop an empirical model that can accurately predict whether or not a given loan has an acceptable risk from the bank's point of view. In addition, different methods are tested for dealing with the problems of unbalanced classes and of samples that do not reflect the real problem being modelled. Finally, the relative importance of the variables is analysed. The outstanding debt amount, the floating interest rate, the time until the maturity date, the real interest rate, and the loan amount all have significant effects on the final prediction. The main contribution of this study is thus the construction of a credit evaluation model that helps commercial banks accurately identify the credit risk of small and micro enterprises and also helps these enterprises overcome their credit difficulties.
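The abstract does not say which methods were tested for the unbalanced-class problem. As one common option, the sketch below shows random oversampling, which duplicates minority-class loans until every class is as frequent as the largest one. The function name, the toy loan records, and the class labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes
    have as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


if __name__ == "__main__":
    # Toy data: 8 acceptable loans and 2 risky ones (illustrative values only).
    loans = [{"amount": 100 + i} for i in range(10)]
    risk = ["acceptable"] * 8 + ["risky"] * 2
    _, balanced = random_oversample(loans, risk)
    print(Counter(balanced))
```

Oversampling should be applied to the training split only; duplicating samples before the train/test split leaks copies of test loans into training and inflates the measured accuracy.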