
    An Analytical Performance Evaluation on Multiview Clustering Approaches

    Machine learning encompasses a wide variety of approaches, one of which is clustering. Given a collection of data points, a clustering algorithm assigns each point to a group such that points in the same group share similar attributes and characteristics, while points in different groups differ substantially. Recent advances in information-collection technology make it possible to gather data from multiple sources and analyse them from multiple perspectives; such data are known as multiview data. Conventional clustering algorithms operate on a single view, yet real-world data are messy, complex, and can be clustered in different ways depending on how they are interpreted. In recent years, multiview clustering (MVC) has attracted increasing attention because it aims to exploit complementary and consensus information across views. Most existing methods, however, support only the single-clustering scenario, in which a single clustering is used to partition the data. Investigation of the multiview data format is therefore essential. This study focuses on multiview clustering approaches and analytically evaluates how well they perform relative to one another.
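
    As a point of reference for the ideas above, the following is a minimal sketch (not from the paper) of two simple multiview clustering baselines, assuming scikit-learn and NumPy are available and that each view is a matrix with one row per sample: early fusion concatenates the views before k-means, while late fusion builds a consensus co-association matrix from per-view partitions.

    import numpy as np
    from sklearn.cluster import KMeans, SpectralClustering

    def feature_fusion_clustering(views, n_clusters):
        """Early fusion: concatenate all views column-wise and run k-means once."""
        fused = np.hstack(views)
        return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(fused)

    def consensus_clustering(views, n_clusters):
        """Late fusion: average per-view co-association matrices, then cluster the consensus."""
        n = views[0].shape[0]
        coassoc = np.zeros((n, n))
        for X in views:
            labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
            coassoc += (labels[:, None] == labels[None, :]).astype(float)
        coassoc /= len(views)  # fraction of views in which two samples co-cluster
        return SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                                  random_state=0).fit_predict(coassoc)

    # Example usage with two synthetic views of the same 200 samples.
    rng = np.random.default_rng(0)
    view1 = rng.normal(size=(200, 10))
    view2 = rng.normal(size=(200, 5))
    print(feature_fusion_clustering([view1, view2], n_clusters=3)[:10])
    print(consensus_clustering([view1, view2], n_clusters=3)[:10])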

    Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective

    This paper takes a problem-oriented perspective and presents a comprehensive review of transfer learning methods, both shallow and deep, for cross-dataset visual recognition. Specifically, it categorises cross-dataset recognition into seventeen problems based on a set of carefully chosen data and label attributes. Such a problem-oriented taxonomy has allowed us to examine how different transfer learning approaches tackle each problem and how well each problem has been researched to date. This comprehensive problem-oriented review of the advances in transfer learning reveals not only the challenges in transfer learning for visual recognition, but also the problems that have scarcely been studied (eight of the seventeen). This survey not only presents an up-to-date technical review for researchers, but also provides a systematic approach and a reference for machine learning practitioners to categorise a real problem and look up a possible solution accordingly.
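
    For illustration, here is a minimal sketch (not from the survey) of one common shallow transfer strategy such reviews cover: reusing a source-trained feature extractor and fitting only a light classifier on the target dataset. The extractor below is a placeholder, and all data are synthetic assumptions.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    def extract_features(images):
        """Placeholder for a source-trained feature extractor (e.g. a frozen CNN backbone)."""
        return images.reshape(len(images), -1)  # assumption: flattening stands in for real features

    # Hypothetical target-domain data: a few hundred labelled examples.
    rng = np.random.default_rng(0)
    target_images = rng.normal(size=(300, 32, 32))
    target_labels = rng.integers(0, 5, size=300)

    X = extract_features(target_images)
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    clf.fit(X[:200], target_labels[:200])            # train on the small target training split
    print("target accuracy:", clf.score(X[200:], target_labels[200:]))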

    Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

    Automatic speech recognition (ASR) based on deep learning (DL) has recently become an important challenge: it requires large-scale training datasets and substantial computational and storage resources. Moreover, DL techniques, and machine learning (ML) approaches in general, assume that training and testing data come from the same domain, with the same input feature space and data distribution. This assumption does not hold in many real-world artificial intelligence (AI) applications. There are also situations where gathering real data is challenging, expensive, or the events of interest occur only rarely, so the data requirements of DL models cannot be met. Deep transfer learning (DTL) has been introduced to overcome these issues; it helps develop high-performing models from real datasets that are small, or slightly different from but related to, the training data. This paper presents a comprehensive survey of DTL-based ASR frameworks to shed light on the latest developments and to help academics and professionals understand current challenges. Specifically, after presenting the DTL background, a well-designed taxonomy is adopted to organise the state of the art. A critical analysis then identifies the limitations and advantages of each framework, and a comparative study highlights the current challenges before deriving opportunities for future research.
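
    A minimal PyTorch sketch of the fine-tuning style of DTL described above: an acoustic encoder pre-trained on a large source corpus is frozen, and only a new CTC output head is trained on a small target dataset. The encoder, checkpoint path, vocabulary size, and dummy batch are illustrative assumptions, not a specific published model.

    import torch
    import torch.nn as nn

    class AcousticEncoder(nn.Module):
        """Stand-in for a source-pretrained speech encoder (e.g. loaded from a checkpoint)."""
        def __init__(self, n_mels=80, hidden=256):
            super().__init__()
            self.rnn = nn.GRU(n_mels, hidden, num_layers=3, batch_first=True, bidirectional=True)
        def forward(self, feats):                # feats: (batch, time, n_mels)
            out, _ = self.rnn(feats)
            return out                           # (batch, time, 2 * hidden)

    encoder = AcousticEncoder()
    # encoder.load_state_dict(torch.load("source_asr_encoder.pt"))  # hypothetical checkpoint
    for p in encoder.parameters():               # freeze the transferred layers
        p.requires_grad = False

    vocab_size = 32                              # assumption: target-language character set
    head = nn.Linear(512, vocab_size)            # new task-specific layer trained from scratch
    ctc = nn.CTCLoss(blank=0)
    opt = torch.optim.Adam(head.parameters(), lr=1e-4)

    # One illustrative step on a dummy batch (replace with the real target-domain loader).
    feats = torch.randn(4, 200, 80)
    targets = torch.randint(1, vocab_size, (4, 20))
    log_probs = head(encoder(feats)).log_softmax(-1).transpose(0, 1)   # (time, batch, vocab)
    loss = ctc(log_probs, targets,
               torch.full((4,), 200, dtype=torch.long),
               torch.full((4,), 20, dtype=torch.long))
    loss.backward()
    opt.step()
    print("fine-tuning loss:", loss.item())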

    Learning from limited labeled data - Zero-Shot and Few-Shot Learning

    Human beings have the remarkable ability to recognize novel visual concepts after observing only a few or zero examples of them. Deep learning, however, often requires a large amount of labeled data to achieve good performance. Labeled instances are expensive, difficult, or even infeasible to obtain because the distribution of training instances among labels naturally exhibits a long tail. It is therefore of great interest to investigate how to learn efficiently from limited labeled data. This thesis concerns an important subfield of learning from limited labeled data, namely low-shot learning. The setting assumes the availability of many labeled examples from known classes, and the goal is to learn novel classes from only a few (few-shot learning) or zero (zero-shot learning) training examples. To this end, we have developed a series of multi-modal learning approaches to facilitate knowledge transfer from known classes to novel classes for a wide range of visual recognition tasks, including image classification, semantic image segmentation and video action recognition. More specifically, this thesis makes the following contributions. First, as there is no agreed-upon zero-shot image classification benchmark, we define a new benchmark by unifying both the evaluation protocols and the data splits of publicly available datasets. Second, in order to tackle labeled data scarcity, we propose feature generation frameworks that synthesize data in the visual feature space for novel classes. Third, we extend zero-shot learning and few-shot learning to the semantic segmentation task and propose a challenging benchmark for it. We show that incorporating semantic information into a semantic segmentation network is effective in segmenting novel classes. Finally, we develop better video representations for the few-shot video classification task and leverage weakly-labeled videos via an efficient retrieval method.
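
    A minimal sketch (not the thesis code) of the feature-generation idea mentioned above: a small generator, conditioned on a class embedding, synthesizes visual features for novel classes so that an ordinary classifier can be trained on them. All names, dimensions and the semantic embeddings are illustrative assumptions, and the generator is assumed to have been trained on seen-class features beforehand.

    import torch
    import torch.nn as nn

    class ConditionalFeatureGenerator(nn.Module):
        """Maps (class embedding, noise) -> synthetic visual feature vector."""
        def __init__(self, attr_dim=300, noise_dim=100, feat_dim=2048):
            super().__init__()
            self.noise_dim = noise_dim
            self.net = nn.Sequential(
                nn.Linear(attr_dim + noise_dim, 1024), nn.LeakyReLU(0.2),
                nn.Linear(1024, feat_dim), nn.ReLU(),
            )
        def forward(self, attrs):
            z = torch.randn(attrs.size(0), self.noise_dim, device=attrs.device)
            return self.net(torch.cat([attrs, z], dim=1))

    # Hypothetical semantic embeddings (e.g. word vectors) for 10 novel classes.
    novel_attrs = torch.randn(10, 300)
    gen = ConditionalFeatureGenerator()          # assumed pre-trained on seen-class features

    # Synthesize 50 features per novel class and fit a softmax classifier on them.
    attrs = novel_attrs.repeat_interleave(50, dim=0)
    labels = torch.arange(10).repeat_interleave(50)
    with torch.no_grad():
        synth_feats = gen(attrs)

    clf = nn.Linear(2048, 10)
    opt = torch.optim.Adam(clf.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(20):                          # a few illustrative training steps
        opt.zero_grad()
        loss = loss_fn(clf(synth_feats), labels)
        loss.backward()
        opt.step()
    print("classifier loss on synthetic novel-class features:", loss.item())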

    Leveraging literals for knowledge graph embeddings

    Knowledge graphs (KGs) represent structured facts composed of entities and the relations that hold between them. To maximise the efficiency of KG applications, it is beneficial to embed KGs into a low-dimensional vector space. KGs follow the Open World Assumption (OWA), i.e. missing information is regarded as potentially true, which often limits their use in real-world application scenarios. Link prediction (LP) for completing KGs is therefore of great importance. LP can be performed in two different modes, transductive and inductive: the former requires that all entities in the test data are present in the training data, while the latter also allows previously unseen entities in the test data. This thesis investigates the use of literals in transductive and inductive LP, since KGs contain numerous numerical and textual literals that carry essential semantics. Dedicated benchmark datasets are introduced for evaluating these LP methods. In particular, a novel KG embedding (KGE) method, RAILD, is proposed, which exploits textual literals together with contextual graph information for LP. The goal of RAILD is to close the existing research gap of learning embeddings for relations unseen during training. To this end, an architecture is proposed that combines language models (LMs) with network embeddings: powerful pre-trained LMs such as BERT are fine-tuned for LP using textual descriptions of entities and relations. In addition, a new algorithm, WeiDNeR, is introduced to generate a relation network that serves to learn graph-based relation embeddings with a network embedding model; the vector representations of these relations are combined for LP. Furthermore, another novel embedding model, LitKGE, is presented, which uses numerical literals for transductive LP and aims to generate numerical features for entities through graph traversal. For this purpose, a further algorithm, WeiDNeR_Extended, is introduced, which creates a network of object and datatype properties; numerical entity features are then generated from the property paths extracted from this network. Moreover, the use of a multilingual LM for encoding entity descriptions in different natural languages for LP is investigated. For the evaluation of the KGE models, the benchmark datasets LiterallyWikidata and Wikidata68K were created. The promising results obtained with the proposed models open up interesting questions for future research on KGEs and their downstream applications.
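
    A minimal sketch (not RAILD or LitKGE themselves) of the general idea of leveraging literals in a KGE model: each entity representation combines a learned structural embedding with projected features derived from textual and numeric literals, and triples are scored with a standard DistMult-style function. All dimensions and feature sources are illustrative assumptions.

    import torch
    import torch.nn as nn

    class LiteralAwareDistMult(nn.Module):
        """DistMult scoring where entity embeddings are fused with literal-derived features."""
        def __init__(self, n_entities, n_relations, text_dim, num_dim, dim=200):
            super().__init__()
            self.ent = nn.Embedding(n_entities, dim)       # structural embedding
            self.rel = nn.Embedding(n_relations, dim)
            self.text_proj = nn.Linear(text_dim, dim)      # e.g. encoded entity descriptions
            self.num_proj = nn.Linear(num_dim, dim)        # e.g. numeric literal values
        def entity_repr(self, idx, text_feats, num_feats):
            return self.ent(idx) + self.text_proj(text_feats[idx]) + self.num_proj(num_feats[idx])
        def score(self, h, r, t, text_feats, num_feats):
            eh = self.entity_repr(h, text_feats, num_feats)
            et = self.entity_repr(t, text_feats, num_feats)
            return (eh * self.rel(r) * et).sum(dim=-1)     # higher score = more plausible triple

    # Hypothetical precomputed literal features for 1000 entities.
    text_feats = torch.randn(1000, 768)    # e.g. sentence embeddings of entity descriptions
    num_feats = torch.randn(1000, 16)      # e.g. normalised numeric attribute values

    model = LiteralAwareDistMult(n_entities=1000, n_relations=50, text_dim=768, num_dim=16)
    h = torch.tensor([0, 1]); r = torch.tensor([3, 7]); t = torch.tensor([42, 99])
    print(model.score(h, r, t, text_feats, num_feats))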

    A transfer learning approach to drug resistance classification in mixed HIV dataset

    Funding: This research is funded by the Tertiary Education Trust Fund (TETFund), Nigeria. As we advance towards individualized therapy, the 'one-size-fits-all' regimen is gradually paving the way for adaptive techniques that address the complexities of failed treatments. Treatment failure is associated with factors such as poor drug adherence, adverse side effects/reactions, co-infection, lack of follow-up, drug-drug interaction and more. This paper implements a transfer learning approach that classifies patients' response to failed treatments due to adverse drug reactions. The research is motivated by the need for early detection of patients' response to treatments and for the generation of domain-specific datasets to balance under-represented classification data, typical of low-income countries located in Sub-Saharan Africa. A soft computing model was pre-trained to cluster CD4+ counts and viral loads of treatment change episodes (TCEs) processed from two disparate sources: the Stanford HIV drug resistance database (https://hivdb.stanford.edu), the control dataset, and locally sourced patients' records from selected health centers in Akwa Ibom State, Nigeria, the mixed dataset. Experiments were run on both datasets using a traditional 2-layer neural network (NN) and a 5-layer deep neural network (DNN), with odd dropout neuron distributions resulting in the following configurations: NN (Parienti et al., 2004) [32], NN (Deniz et al., 2018) [53] and DNN [9 7 5 3 1]. To discern knowledge of failed treatment, DNN1 [9 7 5 3 1] and DNN2 [9 7 5 3 1] were introduced to model both datasets and only the TCEs of patients at risk of drug resistance, respectively. Classification results revealed fewer misclassifications, with the DNN architecture yielding the best performance measures. The transfer learning approach with the DNN2 [9 7 3 1] configuration produced superior classification results compared to the other variants/configurations, with a classification accuracy of 99.40% and RMSE values of 0.0056, 0.0510, and 0.0362 for the test, train, and overall datasets, respectively. The proposed system therefore indicates good generalization and can serve as decision-making support to clinicians/physicians for predicting patients at risk of adverse drug reactions. Although class-imbalanced classification is typical of disease problems and reduces the reliance that can be placed on classification accuracy alone, the proposed system still compares favorably with the literature and can be hybridized to improve its precision and recall rates.
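
    A minimal sketch (not the authors' implementation) of the transfer-learning pattern the abstract describes: a DNN is pre-trained on the control dataset, its earliest layers are frozen, and the network is then fine-tuned on the smaller mixed dataset. The layer widths follow the [9 7 5 3 1] description; the two-feature inputs, dropout rate, and data tensors are placeholder assumptions.

    import torch
    import torch.nn as nn

    def make_dnn(sizes=(9, 7, 5, 3, 1)):
        """5-layer DNN roughly following the [9 7 5 3 1] configuration from the abstract."""
        layers, in_dim = [], 2                       # assumption: inputs are (CD4+ count, viral load)
        for i, width in enumerate(sizes):
            layers.append(nn.Linear(in_dim, width))
            if i < len(sizes) - 1:
                layers += [nn.ReLU(), nn.Dropout(0.2)]
            in_dim = width
        return nn.Sequential(*layers)

    def train(model, X, y, epochs=100, lr=1e-3):
        opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=lr)
        loss_fn = nn.BCEWithLogitsLoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(model(X).squeeze(-1), y)
            loss.backward()
            opt.step()
        return loss.item()

    # Placeholder tensors standing in for the control (Stanford) and mixed (local) TCE datasets.
    Xc, yc = torch.randn(500, 2), torch.randint(0, 2, (500,)).float()
    Xm, ym = torch.randn(120, 2), torch.randint(0, 2, (120,)).float()

    model = make_dnn()
    train(model, Xc, yc)                             # pre-train on the control dataset
    for p in model[:4].parameters():                 # freeze the earliest transferred layers
        p.requires_grad = False
    print("fine-tune loss on mixed dataset:", train(model, Xm, ym, epochs=50))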
    • 

    corecore