On Monte Carlo methods for the Dirichlet process mixture model, and the selection of its precision parameter prior
Two issues commonly faced by users of Dirichlet process mixture models are: 1) how to appropriately select a hyperprior for its precision parameter alpha, and 2) the typically slow mixing of the MCMC chain produced by conditional Gibbs samplers based on its stick-breaking representation, as opposed to marginal collapsed Gibbs samplers based on the Polya urn, which have smaller integrated autocorrelation times.
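The integrated autocorrelation time used above to compare samplers can be estimated directly from an MCMC trace. The following is a minimal sketch of the standard windowed estimator (Sokal's adaptive truncation heuristic), not the thesis's own implementation:

```python
import numpy as np

def integrated_autocorrelation_time(x, c=5.0):
    """Estimate the integrated autocorrelation time of a 1-D MCMC trace
    using the windowed-sum estimator with adaptive truncation."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()
    # Autocovariance via FFT, normalised to an autocorrelation function.
    f = np.fft.rfft(x, n=2 * n)
    acf = np.fft.irfft(f * np.conjugate(f))[:n].real
    acf /= acf[0]
    # tau = 1 + 2 * sum_{t>=1} rho(t), truncated once the window m >= c * tau.
    tau = 1.0
    for m in range(1, n):
        tau += 2.0 * acf[m]
        if m >= c * tau:
            break
    return max(tau, 1.0)

# For i.i.d. draws the estimate should be close to 1; correlated chains
# (such as slowly mixing conditional Gibbs samplers) give larger values.
rng = np.random.default_rng(0)
print(integrated_autocorrelation_time(rng.standard_normal(20000)))
```

A marginal sampler with shorter autocorrelation times yields a smaller tau, and hence a larger effective sample size, for the same chain length.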
In this thesis, we analyse the most common approaches to hyperprior selection for alpha, we identify their limitations, and we propose a new methodology to overcome them.
To address slow mixing, we make three contributions. Firstly, we revisit three label-switching Metropolis moves from the literature (Hastie et al., 2015; Papaspiliopoulos and Roberts, 2008), improve them, and introduce a fourth move. Secondly, we revisit two i.i.d. sequential importance samplers which operate in the collapsed space (Liu, 1996; S. N. MacEachern et al., 1999), and we develop a new sequential importance sampler for the stick-breaking parameters of Dirichlet process mixtures, which operates in the stick-breaking space and has minimal integrated autocorrelation time. Thirdly, we introduce the i.i.d. transcoding algorithm which, conditional on a partition of the data, can infer which specific stick in the stick-breaking construction each observation originated from. We use it as a building block to develop the transcoding sampler, which removes the need for label-switching Metropolis moves in the conditional stick-breaking sampler: it uses the better-performing marginal sampler (or any other sampler) to drive the MCMC chain and augments its exchangeable partition posterior with conditional i.i.d. stick-breaking parameter inferences after the fact, thereby inheriting the driving sampler's shorter autocorrelation times.
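The stick-breaking construction referenced throughout this abstract can be illustrated with a short truncated sketch (an illustration of the standard construction, not the samplers developed in the thesis):

```python
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking construction
    of a Dirichlet process with precision parameter alpha: each weight is
    a Beta(1, alpha) fraction of the stick length still remaining."""
    betas = rng.beta(1.0, alpha, size=truncation)          # break proportions
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    weights = betas * remaining                            # broken-off lengths
    return weights

rng = np.random.default_rng(1)
w = stick_breaking_weights(alpha=2.0, truncation=50, rng=rng)
print(w.sum())  # approaches 1 as the truncation level grows
```

Smaller alpha concentrates mass on the first few sticks, which is why the choice of hyperprior for alpha directly shapes the induced prior on the number of clusters.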
Decoding spatial location of attended audio-visual stimulus with EEG and fNIRS
When analyzing complex scenes, humans often focus their attention on an object at a particular spatial location in the presence of background noise and irrelevant visual objects. The ability to decode the attended spatial location would facilitate brain computer interfaces (BCI) for complex scene analysis. Here, we tested two different neuroimaging technologies and investigated their capability to decode audio-visual spatial attention in the presence of competing stimuli from multiple locations. For functional near-infrared spectroscopy (fNIRS), we targeted the dorsal frontoparietal network, including the frontal eye field (FEF) and intraparietal sulcus (IPS), as well as the superior temporal gyrus/planum temporale (STG/PT), all of which were shown in previous functional magnetic resonance imaging (fMRI) studies to be activated by auditory, visual, or audio-visual spatial tasks. We found that fNIRS provides robust decoding of attended spatial locations for most participants and correlates with behavioral performance. Moreover, we found that FEF makes a large contribution to decoding performance. Surprisingly, the performance was significantly above chance level 1 s after cue onset, which is well before the peak of the fNIRS response.
For electroencephalography (EEG), while there are several successful EEG-based algorithms, to date all of them have focused exclusively on the auditory modality, where eye-related artifacts are minimized or controlled. Successful integration into more ecologically typical usage requires careful consideration of eye-related artifacts, which are inevitable. We showed that fast and reliable decoding can be achieved with or without an ocular-artifact removal algorithm. Our results show that EEG and fNIRS are promising platforms for compact, wearable technologies that could be applied to decode the attended spatial location and reveal the contributions of specific brain regions during complex scene analysis.
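At their core, the decoding pipelines described above are classifiers over multichannel features. A hypothetical minimal sketch on simulated band-power-like features, using a nearest-class-centroid rule (not the authors' algorithms), shows the general shape of such a decoder:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated features: trials x channels, for two attended spatial locations.
n_trials, n_channels = 100, 32
left = rng.normal(0.0, 1.0, (n_trials, n_channels)) + 0.8   # class 0
right = rng.normal(0.0, 1.0, (n_trials, n_channels)) - 0.8  # class 1
X = np.vstack([left, right])
y = np.array([0] * n_trials + [1] * n_trials)

# Random train/test split, fit per-class centroids, classify by nearest one.
idx = rng.permutation(len(y))
train, test = idx[:150], idx[150:]
centroids = np.array([X[train][y[train] == c].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
pred = dists.argmin(axis=1)
accuracy = (pred == y[test]).mean()
print(accuracy)  # well above the 0.5 chance level for this class separation
```

Real EEG/fNIRS decoders add artifact handling, channel selection, and stronger models, but the train/classify/compare-to-chance structure is the same.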
Learning disentangled speech representations
A variety of informational factors are contained within the speech signal, and a single short recording of speech reveals much more than the spoken words. The best method to extract and represent informational factors from the speech signal ultimately depends on which informational factors are desired and how they will be used. In addition, methods sometimes capture more than one informational factor at the same time, such as speaker identity, spoken content, and speaker prosody.
The goal of this dissertation is to explore different ways to deconstruct the speech signal into abstract representations that can be learned and later reused in various speech technology tasks. This task of deconstruction, also known as disentanglement, is a form of distributed representation learning. As a general approach to disentanglement, there are guiding principles for what a learned representation should contain and how it should function. In particular, learned representations should contain all of the requisite information in a more compact manner, be interpretable, remove nuisance factors of irrelevant information, be useful in downstream tasks, and be independent of the task at hand. The learned representations should also be able to answer counterfactual questions.
In some cases, learned speech representations can be re-assembled in different ways according to the requirements of downstream applications. For example, in a voice conversion task, the speech content is retained while the speaker identity is changed. And in a content-privacy task, some targeted content may be concealed without affecting how surrounding words sound. While there is no single-best method to disentangle all types of factors, some end-to-end approaches demonstrate a promising degree of generalization to diverse speech tasks.
This thesis explores a variety of use cases for disentangled representations, including phone recognition, speaker diarization, linguistic code-switching, voice conversion, and content-based privacy masking. Speech representations can also be utilised for automatically assessing the quality and authenticity of speech, such as automatic MOS ratings or detecting deep fakes. The meaning of the term "disentanglement" is not well defined in previous work, and it has acquired several meanings depending on the domain (e.g. image vs. speech). Sometimes the term "disentanglement" is used interchangeably with the term "factorization". This thesis proposes that disentanglement of speech is distinct, and offers a viewpoint of disentanglement that can be considered both theoretically and practically.
Optimising acoustic cavitation for industrial application
The ultrasonic horn is one of the most commonly used acoustic devices in laboratories and industry. For its efficient application to cavitation-mediated processes, the cavitation generated at its tip as a function of tip-vibration amplitude still needs to be studied in detail. High-speed imaging and acoustic detection are used to investigate the cavitation generated at the tip of an ultrasonic horn operating at a fundamental frequency, f0, of 20 kHz. Tip-vibration amplitudes are sampled at fine increments across the range of available input powers. The primary bubble cluster under the tip is found to undergo subharmonic periodic collapse, with concurrent shock wave emission, at frequencies of f0/m, with m increasing through integer values with increasing tip-vibration amplitude. The contribution of periodic shock waves to the noise spectra of the acoustic emissions is confirmed. Transitional input powers, for which the value of m is indistinct and shock wave emission irregular and inconsistent, are identified through the Vrms of the acoustic detector output. For cavitation applications mediated by bubble collapse, sonication at transitional powers may lead to inefficient processing. The ultrasonic horn is also deployed to investigate the role of shock waves in the fragmentation of intermetallic crystals, nominally for the ultrasonic treatment of aluminium melt, and in a novel two-horn configuration for potential cavitation enhancement effects. An experiment investigating nitrogen fixation via cavitation generated by focused ultrasound exposures is also described. Vrms from the acoustic detector is again used to quantify the acoustic emissions for comparison with the sonochemical nitrite yield and for optimisation of sonication protocols at constant input energy. The findings revealed that acoustic cavitation can be enhanced at constant input energy through optimisation of the pulse duration and pulse interval. Anomalous results may be due to inadequate assessment of the nitrate generated. The studies presented in this thesis illustrate means of improving the cavitation efficiency of the acoustic devices used, which may be important for selected industrial processes.
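The subharmonic signature described above, emission peaks at f0/m appearing in the noise spectrum, can be illustrated on a toy signal. This is a hedged sketch of the spectral idea only, not the authors' acquisition or processing chain:

```python
import numpy as np

fs = 1_000_000          # sample rate, Hz
f0 = 20_000             # horn fundamental frequency, Hz
m = 2                   # subharmonic order: one collapse every m-th cycle
t = np.arange(0, 0.01, 1 / fs)

# Toy emission: the driving tone plus a periodic-shock component at f0/m.
signal = np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * (f0 / m) * t)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), 1 / fs)
# The strongest component below f0 should sit at f0/m = 10 kHz.
below = freqs < f0 - 1000
peak_freq = freqs[below][spectrum[below].argmax()]
print(peak_freq)
```

In the thesis's measurements the same idea runs in reverse: the value of m is read off from where the subharmonic peaks fall, and transitional powers show no clean f0/m peak at all.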
Machine learning for managing structured and semi-structured data
As the digitalization of private, commercial, and public sectors advances rapidly, an increasing amount of data is becoming available. In order to gain insights or knowledge from these enormous amounts of raw data, a deep analysis is essential. The immense volume requires highly automated processes with minimal manual interaction. In recent years, machine learning methods have taken on a central role in this task. In addition to the individual data points, their interrelationships often play a decisive role, e.g. whether two patients are related to each other or whether they are treated by the same physician. Hence, relational learning is an important branch of research, which studies how to harness this explicitly available structural information between different data points. Recently, graph neural networks have gained importance. These can be considered an extension of convolutional neural networks from regular grids to general (irregular) graphs.
Knowledge graphs play an essential role in representing facts about entities in a machine-readable way. While great efforts are made to store as many facts as possible in these graphs, they often remain incomplete, i.e., true facts are missing. Manual verification and expansion of the graphs is becoming increasingly difficult due to the large volume of data and must therefore be assisted or substituted by automated procedures which predict missing facts. The field of knowledge graph completion can be roughly divided into two categories: Link Prediction and Entity Alignment. In Link Prediction, machine learning models are trained to predict unknown facts between entities based on the known facts. Entity Alignment aims at identifying shared entities between graphs in order to link several such knowledge graphs based on some provided seed alignment pairs.
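Link prediction is commonly cast as scoring candidate (head, relation, tail) triples with learned embeddings and ranking the candidates. A minimal TransE-style scoring sketch (an illustration of the general setup with random stand-in embeddings, not the specific models studied in the thesis):

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, dim = 5, 2, 16

# Randomly initialised embeddings stand in for trained ones.
E = rng.normal(size=(n_entities, dim))   # entity embeddings
R = rng.normal(size=(n_relations, dim))  # relation embeddings

def transe_score(h, r, t):
    """TransE interaction function: a plausible triple has E[h] + R[r]
    close to E[t], so a lower distance means a more plausible fact."""
    return np.linalg.norm(E[h] + R[r] - E[t])

# Rank all candidate tail entities for the query (head=0, relation=1, ?).
scores = np.array([transe_score(0, 1, t) for t in range(n_entities)])
ranking = scores.argsort()
print(ranking)  # entity indices ordered from most to least plausible tail
```

The interaction function is exactly the kind of interchangeable model component whose effect the thesis's large-scale experimental study analyses.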
In this thesis, we present important advances in the field of knowledge graph completion. For Entity Alignment, we show how to reduce the number of required seed alignments while maintaining performance through novel active learning techniques. We also discuss the power of textual features and show that graph-neural-network-based methods have difficulties with noisy alignment data. For Link Prediction, we demonstrate how to improve the prediction for entities unknown at training time by exploiting additional metadata on individual statements, often available in modern graphs. Supported by results from a large-scale experimental study, we present an analysis of the effect of individual components of machine learning models, e.g., the interaction function or loss criterion, on the task of link prediction. We also introduce a software library that simplifies the implementation and study of such components and makes them accessible to a wide research community, ranging from relational learning researchers to applied fields such as the life sciences. Finally, we propose a novel metric for evaluating ranking results, as used for both completion tasks. It allows for easier interpretation and comparison, especially in cases with different numbers of ranking candidates, as encountered in the de facto standard evaluation protocols for both tasks.
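Both completion tasks are evaluated by ranking candidates and recording the rank of the correct entity. A sketch of the standard rank-based metrics, mean reciprocal rank and Hits@k, gives the baseline context; the thesis's novel metric itself is not reproduced here:

```python
def mrr_and_hits(true_ranks, k=10):
    """Mean reciprocal rank and Hits@k, given the 1-based rank of the
    correct entity within each query's candidate ranking."""
    mrr = sum(1.0 / r for r in true_ranks) / len(true_ranks)
    hits = sum(1 for r in true_ranks if r <= k) / len(true_ranks)
    return mrr, hits

# Four queries where the correct entity ranked 1st, 3rd, 12th, and 2nd.
mrr, hits = mrr_and_hits([1, 3, 12, 2], k=10)
print(mrr, hits)
```

Note that such scores are not directly comparable across evaluations with different numbers of ranking candidates, which is the comparability problem the proposed metric addresses.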
Brain signal recognition using deep learning
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Brain Computer Interface (BCI) has the potential to offer a new generation of applications independent of muscular activity and controlled by the human brain. Brain imaging technologies are used to transfer cognitive tasks into control commands for a BCI system. Electroencephalography (EEG) serves as the best available non-invasive solution for extracting signals from the brain. On the other hand, speech is the primary means of communication, but for patients suffering from locked-in syndrome there is no easy way to communicate. Therefore, an ideal communication system for locked-in patients is a thought-to-speech BCI system.
This research aims to investigate methods for the recognition of imagined speech from EEG signals using deep learning techniques. In order to design an optimal imagined speech recognition BCI, a variety of issues had to be addressed. These include: 1) proposing a new feature extraction and classification framework for the recognition of imagined speech from EEG signals, 2) grammatical class recognition of imagined words from EEG signals, and 3) discriminating different cognitive tasks associated with speech in the brain, such as overt speech, covert speech, and visual imagery. In this work, machine learning and deep learning methods were used to analyze EEG signals.
For recognition of imagined speech from EEG signals, a new EEG database was collected while the participants mentally spoke (imagined speech) the presented words. Along with imagined speech, EEG data was recorded for visual imagery (imagining a scene or an image) and overt speech (verbal speech). Spectro-temporal and spatio-temporal domain features were investigated for the classification of imagined words from EEG signals. Further, a deep learning framework using a convolutional network and an attention mechanism was implemented for learning features in the spatial, temporal, and spectral domains. The method achieved a recognition rate of 76.6% for three binary word pairs. These experiments show that deep learning algorithms are well suited to imagined speech recognition from EEG signals due to their ability to interpret features from non-linear and non-stationary signals. Grammatical classes of imagined words from EEG signals were also recognized using a multi-channel convolutional network framework. This method was extended to a multi-level recognition system for multi-class classification of imagined words, which achieved an accuracy of 52.9% for 10 words, a substantial improvement over previous work.
In order to investigate the differences between imagined speech, verbal speech, and visual imagery in EEG signals, we used multivariate pattern analysis (MVPA). MVPA identified the time segments in which the neural oscillations for the different cognitive tasks were linearly separable. Further, the frequencies that yield the most discrimination between the different cognitive tasks were also explored. A framework was proposed to discriminate two cognitive tasks based on the spatio-temporal patterns in EEG signals. The proposed method used the K-means clustering algorithm to find the best electrode combination and a convolutional-attention network for feature extraction and classification. The proposed method achieved high recognition rates of 82.9% and 77.7%.
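The K-means step described above groups electrodes by similarity of their features so that a good combination can be chosen. A minimal Lloyd's-algorithm sketch on hypothetical per-electrode features (illustrative only, not the thesis's pipeline):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm: returns cluster labels and centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid, then recompute means.
        labels = np.linalg.norm(X[:, None] - centroids[None], axis=2).argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = X[labels == c].mean(axis=0)
    return labels, centroids

# Toy per-electrode feature vectors (e.g. discriminative power measures);
# electrodes in the same cluster carry similar information.
rng = np.random.default_rng(1)
features = np.vstack([rng.normal(0, 0.3, (16, 2)), rng.normal(3, 0.3, (16, 2))])
labels, _ = kmeans(features, k=2)
print(labels)
```

Selecting one representative electrode per cluster then yields a compact channel combination to feed the downstream classifier.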
The results of this research suggest that a communication-based BCI system can be designed using deep learning methods. Further, this work adds to the existing knowledge in the field of communication-based BCI systems.
Ion interactions in ionic liquids
Ionic liquids are being intensively investigated as more sustainable chemicals for many applications due to their advantageous physicochemical properties. The main reason for this interest is their synthetic flexibility: there is an effectively unlimited number of possible combinations of anions and cations, which can lead to compounds with very distinct properties. However, while research has progressed rapidly towards developing ionic liquids for specific applications, we still lack fundamental knowledge of the interactions taking place inside them. As a result, many models fail to provide accurate predictions of the properties or reactivity of ionic liquids. This thesis presents fundamental research on the structure-property relationships of ionic liquids, emphasising aspects such as the effects of specific functional groups or of symmetry on their physical properties. Furthermore, Electron Paramagnetic Resonance spectroscopy is introduced as a versatile tool for characterising the chemical environment inside an ionic liquid. Using nitroxide spin probes as ‘spies’, we try to identify the relationship between the bulk properties (e.g. viscosity) and the interactions the radicals experience in their immediate environment. The results indicate that, depending on the examined ionic liquid and the spin probe used, there is a plethora of microscale interactions which cannot be deduced from physicochemical studies of bulk properties.
"Disconnecting Something From Anything": Fetishized Objects, Alienated Subjects, and Literary Modernism
This dissertation explores modernist attitudes toward the commodity and the process of commodification under late capitalism. Some modernists, notably those commonly referred to as the "men of 1914," lament a reversal of the presumed proper relationship between subject and object, in which people become passive as a result of the mechanical routines of the workplace, and objects gain perverse independence from their human creators. My dissertation suggests that there is a feminist alternative to this familiar, hegemonic modernist critique in the work of Gertrude Stein, Djuna Barnes, and Virginia Woolf. For Stein, Barnes, and Woolf, the problem with commodification is not passive subjects and animated objects, but, to the contrary, domineering subjects and a fungible object world. Stein, Barnes, and Woolf seek not to reclaim humanity's world-creating powers, but to re-enchant the world of things and discover modes of ethical passivity that enable a more receptive, hospitable relationship to alterity.
In articulating this alternative critique, I distinguish my position from two strains of modernist scholarship: one that acknowledges only one critique of commodification—that of the "men of 1914"—and a wave of scholarship that sees itself as, in the words of Kathryn Simpson, "exploding the myth [...] of modernist writers' and artists' absolute disinterest, detachment and contempt for popular and consumer culture" (1). While I align myself with the latter contingent, I differentiate my position through a consideration of the ways in which certain modernists reformulate a critique of the commodity in less absolutist and naïve terms. I argue that Stein, Barnes, and Woolf advance immanent critiques that do not presume to stand outside the commodity industry but draw power from certain tensions within commodification. Specifically, their critique is animated by a paradox: by exaggerating the alienation and fetishism characteristic of commodification, they hope to combat the commodity's reifying logic.
The new age of fear: an analysis of crisis framing by right-wing populist parties in Greece and France
From the 2009 Eurozone economic downturn, to the 2015 mass movement of forcibly displaced migrants, and the current COVID-19 pandemic, crises have seemingly become a ‘new normal’ feature of European politics. During this decade, rolling crises generated a wave of public discontent that damaged the legitimacy of national governments and the European Union and heralded a renaissance of populism. The central message of populist parties, which helped them rise in popularity or enter parliament for the first time, is simple but very effective: democratic representation has been undermined by national and global elites. This has provoked a wealth of studies seeking to explain the rise or breakthrough of populist fringe parties; far less attention has been paid to how crises transform not only the demand side, but also the supply of populist arguments.
This thesis seeks to address this imbalance by synthesising insights from the crisis framing literature, which facilitates an understanding and operationalisation of populism as a style of discourse. To assess how far-right parties employ this discourse, and the implications of this for their electoral prospects, a comparative case-study design is employed, exploring the discourse of two parties: the National Rally (NR) in France and Golden Dawn (GD) in Greece. Their ideologically similar profiles but differential electoral performance allow for a more nuanced analysis of their respective framing strategies.
The thesis examines the discourse of the two parties' MPs on a month-by-month basis over a four-year period, 2012-2015 for GD and 2012-2013 and 2016-2017 for NR, using the NVivo software. Their respective discourses are quantified and broken down into four key areas, Foreign Policy, the Economy, the Political System, and Society, analysing the content, frequency, and salience of key crisis frames. Discourse analysis of excerpts adds a qualitative element that showcases the substantial differences between the two case studies. The analysis demonstrates that references to ‘the people’ and anti-elitism were the centrepieces of each party's discourse, with strong nativist and nationalist elements.
The two parties were extremely similar in the diagnostic stage of their framing and in the way in which they attributed blame for the crises. However, their discursive strategies diverge regarding their proposed solutions. Golden Dawn remained a single-issue party in terms of discourse, since it never presented a comprehensive plan for ending the crises. As a result, Golden Dawn's discourse remained one-dimensional throughout its brief period of success, centred solely on attributing blame and attacking its political opponents and the European Union. On the other hand, National Rally's framing was more elaborate and ambitious, both in the variety of issues raised and, especially, in the solutions it advocated. This, it is argued, contributed to the evolution of NR into a mainstream competitor that is no longer dependent on a niche part of the electoral market, while the inability of GD to develop equally successful crisis frames offers a unique insight into why the party failed electorally and was unable to enter Parliament in the 2019 elections. The overall analysis produces a rich framework that maps out the key elements of populist crisis discourse by far-right parties, which has implications for electoral politics and for our understanding of populism more broadly.
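Quantifying the frequency and salience of crisis frames across the four thematic areas, as done here with NVivo coding, amounts at its simplest to counting coded references per category. A toy illustration with hypothetical frame keywords (the actual coding scheme is the thesis's own and far richer):

```python
from collections import Counter

# Hypothetical crisis-frame keywords per thematic area (illustrative only).
frames = {
    "Economy": ["austerity", "debt", "unemployment"],
    "Political System": ["elites", "corruption", "establishment"],
    "Society": ["people", "nation", "identity"],
    "Foreign Policy": ["sovereignty", "brussels", "borders"],
}

speech = ("the corrupt elites and the establishment in brussels impose "
          "austerity and debt on the people while the nation loses its "
          "sovereignty and its borders")

# Count keyword occurrences, then aggregate per thematic area.
words = Counter(speech.split())
counts = {area: sum(words[w] for w in kws) for area, kws in frames.items()}
print(counts)
```

Tracking such counts month by month is what makes the content, frequency, and salience of each frame comparable across the two parties.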