PVSNet: Palm Vein Authentication Siamese Network Trained using Triplet Loss and Adaptive Hard Mining by Learning Enforced Domain Specific Features
Designing an end-to-end deep learning network to match the biometric features
with limited training samples is an extremely challenging task. To address this
problem, we propose a new way to design an end-to-end deep CNN framework i.e.,
PVSNet that works in two major steps: first, an encoder-decoder network is used
to learn generative domain-specific features followed by a Siamese network in
which convolutional layers are pre-trained in an unsupervised fashion as an
autoencoder. The proposed model is trained via triplet loss function that is
adjusted for learning feature embeddings in a way that minimizes the distance
between embedding-pairs from the same subject and maximizes the distance with
those from different subjects, with a margin. In particular, a triplet Siamese
matching network using an adaptive margin based hard negative mining has been
suggested. The hyper-parameters associated with the training strategy, like the
adaptive margin, have been tuned to make the learning more effective on
biometric datasets. In extensive experimentation, the proposed network
outperforms most of the existing deep learning solutions on three types of
typical vein datasets, which clearly demonstrates the effectiveness of our
proposed method.
Comment: Accepted in 5th IEEE International Conference on Identity, Security
and Behavior Analysis (ISBA), 2019, Hyderabad, India
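The margin-based objective described in the PVSNet abstract can be illustrated with a minimal sketch. It assumes Euclidean distance between embeddings and a fixed margin; the paper's adaptive margin schedule and network are not reproduced here, and the function names and toy vectors are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def hardest_negative(anchor, negatives):
    """Pick the negative embedding closest to the anchor (the hardest one)."""
    return min(negatives, key=lambda n: euclidean(anchor, n))


# Toy 2-D embeddings (illustrative values only, not taken from the paper).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same subject, distance 1.0 from the anchor
negatives = [[3.0, 4.0],     # different subject, distance 5.0 (easy)
             [0.0, 1.5]]     # different subject, distance 1.5 (hard)

hard = hardest_negative(anchor, negatives)
loss_hard = triplet_loss(anchor, positive, hard, margin=1.0)          # 0.5
loss_easy = triplet_loss(anchor, positive, negatives[0], margin=1.0)  # 0.0
```

The easy negative already lies farther away than the positive by more than the margin, so it contributes no loss; mining the hard negative is what produces a useful training signal.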
Identification and Security Implications of Biometrics
The usage of biometrics has become more frequent over the past couple of decades, notably due to technological advancements. Evolving technology in the field of biometrics has also led to increased accuracy of associated software, which has provided the opportunity to use a multitude of different human characteristics for identification and/or verification purposes. The current study assessed the usage of biometrics in casinos, hospitals, and law enforcement agencies using a survey methodology. Results indicated that privacy concerns related to the use of biometrics may not be as prevalent as indicated in the literature. Additionally, results indicated that the utilization of biometrics has led to increased accuracy in identification and verification processes, led to enhanced security, and would be highly recommended to other institutions. Information obtained from the literature notes the racial bias in facial recognition technologies due to algorithmic development based solely upon features of Caucasian individuals. Efforts need to be made to create facial recognition algorithms that are more racially and ethnically diverse.
Pathway to Future Symbiotic Creativity
This report presents a comprehensive view of our vision on the development
path of the human-machine symbiotic art creation. We propose a classification
of the creative system with a hierarchy of 5 classes, showing the pathway of
creativity evolving from a mimic-human artist (Turing Artists) to a Machine
artist in its own right. We begin with an overview of the limitations of the
Turing Artists then focus on the top two-level systems, Machine Artists,
emphasizing machine-human communication in art creation. In art creation, it is
necessary for machines to understand humans' mental states, including desires,
appreciation, and emotions; humans also need to understand machines' creative
capabilities and limitations. The rapid development of immersive environments
and their further evolution into the new concept of the metaverse enable symbiotic art
creation through unprecedented flexibility of bi-directional communication
between artists and art manifestation environments. By examining the latest
sensor and XR technologies, we illustrate the novel way for art data collection
to constitute the base of a new form of human-machine bidirectional
communication and understanding in art creation. Based on such communication
and understanding mechanisms, we propose a novel framework for building future
Machine artists, which comes with the philosophy that a human-compatible AI
system should be based on the "human-in-the-loop" principle rather than the
traditional "end-to-end" dogma. By proposing a new form of inverse
reinforcement learning model, we outline the platform design of machine
artists, demonstrate its functions and showcase some examples of technologies
we have developed. We also provide a systematic exposition of the ecosystem for
AI-based symbiotic art form and community with an economic model built on NFT
technology. Ethical issues for the development of machine artists are also
discussed
A PROBABILISTIC APPROACH TO THE CONSTRUCTION OF A MULTIMODAL AFFECT SPACE
Understanding affective signals from others is crucial for both human-human and human-agent interaction. The automatic analysis of emotion is by and large addressed as a pattern recognition problem which is grounded in early psychological theories of emotion. Suitable features are first extracted and then used as input to classification (discrete emotion recognition) or regression (continuous affect detection). In this thesis, differently from many computational models in the literature, we draw on a simulationist approach to the analysis of facially displayed emotions - e.g., in the course of a face-to-face interaction between an expresser and an observer. At the heart of such a perspective lies the enactment of the perceived emotion in the observer. We propose a probabilistic framework based on a deep latent representation of a continuous affect space, which can be exploited for both the estimation and the enactment of affective states in a multimodal space. Namely, we consider the observed facial expression together with physiological activations driven by internal autonomic activity. The rationale behind the approach lies in the large body of evidence from affective neuroscience showing that when we observe emotional facial expressions, we react with congruent facial mimicry. Further, in more complex situations, affect understanding is likely to rely on a comprehensive representation grounding the reconstruction of the state of the body associated with the displayed emotion. We show that our approach can address such problems in a unified and principled perspective, thus avoiding ad hoc heuristics while minimising learning efforts. Moreover, our model improves the inferred belief through the adoption of an inner loop of measurements and predictions within the central affect state-space, which realises the dynamics of the affect enactment. The results achieved so far have been obtained by adopting two publicly available multimodal corpora.
Learning Birds-Eye View Representations for Autonomous Driving
Over the past few years, progress towards the ambitious goal of widespread fully-autonomous vehicles on our roads has accelerated dramatically. This progress has been spurred largely by the success of highly accurate LiDAR sensors, as well as the use of detailed high-resolution maps, which together allow a vehicle to navigate its surroundings effectively. Often, however, one or both of these resources may be unavailable, whether due to cost, sensor failure, or the need to operate in an unmapped environment. The aim of this thesis is therefore to demonstrate that it is possible to build detailed three-dimensional representations of traffic scenes using only 2D monocular camera images as input. Such an approach faces many challenges: most notably that 2D images do not provide explicit 3D structure. We overcome this limitation by applying a combination of deep learning and geometry to transform image-based features into an orthographic birds-eye view representation of the scene, allowing algorithms to reason in a metric, 3D space. This approach is applied to solving two challenging perception tasks central to autonomous driving.
The first part of this thesis addresses the problem of monocular 3D object detection, which involves determining the size and location of all objects in the scene. Our solution was based on a novel convolutional network architecture that processed features in both the image and birds-eye view perspective. Results on the KITTI dataset showed that this network outperformed existing works at the time, and although more recent works have improved on these results, we conducted extensive analysis to find that our solution performed well in many difficult edge-case scenarios such as objects close to or distant from the camera.
In the second part of the thesis, we consider the related problem of semantic map prediction. This consists of estimating a birds-eye view map of the world visible from a given camera, encoding both static elements of the scene such as pavement and road layout, as well as dynamic objects such as vehicles and pedestrians. This was accomplished using a second network that built on the experience from the previous work and achieved convincing performance on two real-world driving datasets. By formulating the maps as an occupancy grid map (a widely used representation from robotics), we were able to demonstrate how predictions could be accumulated across multiple frames, and that doing so further improved the robustness of maps produced by our system.
Toyota Motors Europ
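As a rough illustration of how occupancy predictions can be accumulated across frames, the sketch below uses the standard log-odds update from robotics. It assumes every frame has already been aligned to a common grid and that each cell holds a probability strictly between 0 and 1; the thesis does not state its fusion rule in this abstract, so the update rule and the function name `accumulate` are assumptions.

```python
import math


def logit(p):
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def accumulate(frames, prior=0.5):
    """Fuse per-frame occupancy probabilities cell by cell in log-odds space.

    frames: list of 2-D grids (lists of rows) with identical shape,
    assumed to be registered to the same world coordinates.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    log_odds = [[logit(prior)] * cols for _ in range(rows)]
    for frame in frames:
        for r in range(rows):
            for c in range(cols):
                # Each observation adds its evidence relative to the prior.
                log_odds[r][c] += logit(frame[r][c]) - logit(prior)
    return [[sigmoid(v) for v in row] for row in log_odds]


# Two toy 1x2 frames (illustrative values only).
frames = [[[0.7, 0.5]],
          [[0.7, 0.4]]]
fused = accumulate(frames)
```

Two agreeing observations of 0.7 fuse to roughly 0.845 in the first cell, which is the sense in which repeated consistent predictions make the map more confident than any single frame.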
Electroencephalographic Responses to Frictional Stimuli: Measurement Setup and Processing Pipeline
Tactility is a key sense in the human interaction with the environment. The understanding of
tactile perception has become an exciting area in industrial, medical and scientific research with an
emphasis on the development of new haptic technologies. Surprisingly, the quantification of tactile
perception has, compared to other senses, only recently become a field of scientific investigation.
The overall goal of this emerging scientific discipline is an understanding of the causal chain
from the contact of the skin with materials to the brain dynamics representing recognition of
and emotional reaction to the materials. Each link in this chain depends on individual and
environmental factors ranging from the influence of humidity on contact formation to the role of
attention for the perception of touch.
This thesis reports on the research of neural correlates to the frictional stimulation of the human
fingertip. Event-related electroencephalographic potentials (ERPs) upon the change in fingertip
friction are measured and studied, when pins of a programmable Braille-display were brought into
skin contact. In order to contribute to the understanding of the causal chain mentioned above,
this work combines two research areas which are usually not connected to each other, namely
tribology and neuroscience. The goal of the study is to evaluate contributions of friction to the
process of haptic perception. Key contributions of this thesis are:
1) Development of a setup to simultaneously record physical forces and ERPs upon tactile
stimulation.
2) Implementation of a dedicated signal processing pipeline for the statistical analysis of ERP
amplitudes, latencies and instantaneous phases.
3) Interpretation of skin friction data and extraction of neural correlates with respect to varying
friction intensities.
The tactile stimulation of the fingertip upon raising and lowering of different lines of Braille-pins
(one, three and five) caused pronounced N50 and P100 components in the event-related ERP sequences,
which is in line with the current literature. Friction between the fingertip and the
Braille-system exhibited a characteristic temporal development which is attributed to viscoelastic
skin relaxation. Although the force stimuli varied by a factor of two between the different Braille-patterns,
no significant differences were observed between the amplitudes and latencies of ERPs
after standard across-trial averaging. Thus, for the first time a phase measure for estimating single-trial
interactions of somatosensory potentials is proposed. Results show that instantaneous phase
coherency is evoked by friction, and that higher friction induces stronger and more time-localized
phase coherency.
Tactility is a central sense in our interaction with the environment. The effort to gain
well-founded insight into tactile perception is receiving considerable interest
in industrial, medical and scientific research, mostly with a focus on
the development of haptic technologies. Surprisingly, however, the scientific
quantification of tactile perception has, compared with other sensory modalities, only
recently become an emerging field of research. The focus of this discipline is to describe the cognitive and
emotional reaction following physical contact with materials, and to understand the causal
chain from touch to reaction. The individual links of this chain
are subject to both individual and external influences, which range from the air humidity
during contact to the role of attention in perception.
The present work investigates neural correlates following
frictional stimulation of the human finger. To this end, changes in friction produced by
contact of the human fingertip with switchable pins of a Braille display
were examined and the corresponding neural correlates recorded. To contribute to the
understanding of the causal chain mentioned above, approaches from two usually
unconnected research areas, namely tribology and neuroscience,
are combined. The main contributions of this work are:
1) Realization of a measurement setup for the simultaneous recording of forces and event-related
potentials following tactile stimulation of the fingertip.
2) Construction of a dedicated signal processing chain for the statistical analysis of
stimulation-dependent EEG amplitudes, latencies and instantaneous phases.
3) Interpretation of the acquired friction data and extraction of neural correlates with respect to
varying stimulation intensities.
Our results show that tactile stimulation of the fingertip upon raising and lowering
of Braille pins leads to significant N50 and P100 components in the event-related potentials,
in line with the current literature. The friction between the fingertip
and the Braille system showed a characteristic signal course, which is attributable to viscoelastic
skin relaxation. Despite intensity differences of a factor of two
between the stimulation patterns, no significant differences were found
between the simply averaged amplitudes of the evoked potentials. For the first time, a
phase measure was applied to identify differences between somatosensory single-trial
interactions. In contrast to the amplitude and latency analysis, this phase analysis showed
clearer and more significant differences between the stimulation paradigms. It is
concluded that coherence between the instantaneous phases is induced by friction events
and that stronger friction makes this coherence stronger and
more localized over time.
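The idea of measuring phase coherency across single trials, as described in the abstract above, can be sketched with the widely used inter-trial phase coherence: the length of the mean unit phasor over trials at one time point. This is an assumption about the general family of measure, not the thesis's exact definition, and it presumes the instantaneous phases (in radians) have already been extracted from the EEG; the function name and sample values are hypothetical.

```python
import cmath
import math


def phase_coherence(phases):
    """Length of the mean unit phasor across trials.

    Returns a value near 1 when the phases are aligned across trials
    and near 0 when they are spread evenly around the circle.
    """
    mean_vector = sum(cmath.exp(1j * phi) for phi in phases) / len(phases)
    return abs(mean_vector)


# Toy single-trial phases at one time point (illustrative values only).
aligned = [0.10, 0.12, 0.08, 0.11]                        # tightly clustered
scattered = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]  # evenly spread

coherence_aligned = phase_coherence(aligned)
coherence_scattered = phase_coherence(scattered)
```

Because the measure ignores amplitude, it can separate conditions whose averaged ERP amplitudes and latencies look alike, which matches the motivation given in the abstract.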
Machine Learning for Biomedical Application
Biomedicine is a multidisciplinary branch of medical science that consists of many scientific disciplines, e.g., biology, biotechnology, bioinformatics, and genetics; moreover, it covers various medical specialties. In recent years, this field of science has developed rapidly. This means that a large amount of data has been generated, due to (among other reasons) the processing, analysis, and recognition of a wide range of biomedical signals and images obtained through increasingly advanced medical imaging devices. The analysis of these data requires the use of advanced IT methods, which include those related to the use of artificial intelligence, and in particular machine learning. This is a summary of the Special Issue “Machine Learning for Biomedical Application”, briefly outlining selected applications of machine learning in the processing, analysis, and recognition of biomedical data, mostly regarding biosignals and medical images.