4,796 research outputs found

    Processing and Linking Audio Events in Large Multimedia Archives: The EU inEvent Project

    In the inEvent EU project [1], we aim at structuring, retrieving, and sharing large archives of networked, and dynamically changing, multimedia recordings, mainly consisting of meetings, videoconferences, and lectures. More specifically, we are developing an integrated system that performs audiovisual processing of multimedia recordings, and labels them in terms of interconnected “hyper-events” (a notion inspired by hyper-texts). Each hyper-event is composed of simpler facets, including audio-video recordings and metadata, which are then easier to search, retrieve and share. In the present paper, we mainly cover the audio processing aspects of the system, including speech recognition, speaker diarization and linking (across recordings), the use of these features for hyper-event indexing and recommendation, and the search portal. We present initial results for feature extraction from lecture recordings using the TED talks. Index Terms: Networked multimedia events; audio processing: speech recognition; speaker diarization and linking; multimedia indexing and searching; hyper-events.

    A Comparative Study of Pairwise Learning Methods based on Kernel Ridge Regression

    Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction or network inference problems. During the last decade kernel methods have played a dominant role in pairwise learning. They still obtain a state-of-the-art predictive performance, but a theoretical analysis of their behavior has been underexplored in the machine learning literature. In this work we review and unify existing kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression and a linear matrix filter arise naturally as a special case of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency and spectral filtering properties. Our theoretical results provide valuable insights in assessing the advantages and limitations of existing pairwise learning methods. Comment: arXiv admin note: text overlap with arXiv:1606.0427
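
The closed-form solution mentioned in the abstract above can be illustrated with a minimal sketch. It assumes two symmetric positive semi-definite kernel matrices `K` and `G` (one per object type), a fully observed label matrix `Y`, and a ridge parameter `lam`. These names, and the function names, are illustrative assumptions and are not taken from the paper. The sketch solves `K A G + lam * A = Y` through the eigendecompositions of `K` and `G`, which avoids forming the Kronecker product explicitly.

```python
import numpy as np

def kronecker_krr_fit(K, G, Y, lam):
    """Solve K @ A @ G + lam * A = Y for the dual coefficients A.

    K: (n, n) symmetric PSD kernel over the first object type (assumed).
    G: (m, m) symmetric PSD kernel over the second object type (assumed).
    Y: (n, m) label matrix, one entry per pair.
    lam: ridge regularisation strength, lam > 0.
    """
    evals_k, U = np.linalg.eigh(K)   # K = U diag(evals_k) U^T
    evals_g, V = np.linalg.eigh(G)   # G = V diag(evals_g) V^T
    Y_rot = U.T @ Y @ V              # labels in the joint eigenbasis
    # Each eigen-pair (i, j) is shrunk by 1 / (evals_k[i] * evals_g[j] + lam).
    A_rot = Y_rot / (np.outer(evals_k, evals_g) + lam)
    return U @ A_rot @ V.T

def kronecker_krr_predict(K_new, G_new, A):
    """Predict pair labels from kernel evaluations against the training objects."""
    return K_new @ A @ G_new.T
```

With `n` and `m` training objects, this costs two eigendecompositions of sizes `n` and `m` rather than one linear solve of size `n * m`, which is the kind of efficiency the abstract alludes to.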

    Missing Value Imputation With Unsupervised Backpropagation

    Many data mining and data analysis techniques operate on dense matrices or complete tables of data. Real-world data sets, however, often contain unknown values. Even many classification algorithms that are designed to operate with missing values still exhibit deteriorated accuracy. One approach to handling missing values is to fill in (impute) the missing values. In this paper, we present a technique for unsupervised learning called Unsupervised Backpropagation (UBP), which trains a multi-layer perceptron to fit to the manifold sampled by a set of observed point-vectors. We evaluate UBP with the task of imputing missing values in datasets, and show that UBP is able to predict missing values with significantly lower sum-squared error than other collaborative filtering and imputation techniques. We also demonstrate with 24 datasets and 9 supervised learning algorithms that classification accuracy is usually higher when randomly-withheld values are imputed using UBP, rather than with other methods.
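
The evaluation idea described in the abstract above (withhold known values at random, impute them, and score the result by sum-squared error) can be sketched as follows. This is not UBP: the imputer here is a plain column-mean baseline, swapped in only to keep the example short, and all function names are hypothetical.

```python
import random

def withhold(table, fraction, seed=0):
    """Copy `table` and hide a random fraction of its cells (set to None).

    Returns the masked copy and the list of hidden (row, col) positions.
    """
    rng = random.Random(seed)
    cells = [(r, c) for r in range(len(table)) for c in range(len(table[0]))]
    hidden = rng.sample(cells, int(fraction * len(cells)))
    masked = [list(row) for row in table]
    for r, c in hidden:
        masked[r][c] = None
    return masked, hidden

def impute_column_mean(masked):
    """Baseline imputer (an assumption, not UBP): fill None with the column mean."""
    means = []
    for c in range(len(masked[0])):
        observed = [row[c] for row in masked if row[c] is not None]
        # Fall back to 0.0 if every value in the column was withheld.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[c] if v is None else v for c, v in enumerate(row)]
            for row in masked]

def sum_squared_error(truth, imputed, hidden):
    """Sum of squared differences over the withheld cells only."""
    return sum((truth[r][c] - imputed[r][c]) ** 2 for r, c in hidden)
```

Any other imputer with the same input and output shape could be dropped in place of `impute_column_mean` and compared on the same withheld cells.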

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

    Estimating Optimal Weights in Hybrid Recommender Systems


    An Ensemble Model-Based Recommendation Approach for Consumer Decision-Making System

    A recommendation system can suggest items aligned with diverse user interests by leveraging multiple sources of information. While many recommendation systems heavily rely on the collaborative filtering (CF) approach—where user preference data is combined with others to predict additional items of potential interest—this study introduces a novel weighted recommendation system to enhance consumer decision-making using CF. The methodology includes the development of equations to calculate the weights for both the product and review, as well as to determine the similarity between consumer reviews. To ensemble the model, Random Forest (RF), Support Vector Machine (SVM), and Logistic Regression (LR) are employed in the methodology. The study considers Ensemble Classifiers (RF+SVM+LR) to implement the results, aiming for improved outcomes compared to prior research. The proposed model is trained and tested using an open-source dataset on Kaggle's website. Numerical analysis of the proposed model reveals superior performance, outperforming conventional methods in terms of accuracy (0.821), precision (0.802), recall (0.821), F-measure (0.833), error rate (0.100), and more.
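
As a rough illustration of how three classifiers' outputs might be combined and scored with the metrics named in the abstract above, the sketch below uses unweighted majority voting over per-model label lists. The abstract does not specify the combination rule, so majority voting is an assumption, and the function names and the binary-label setting are hypothetical. The F-measure is computed as the harmonic mean of precision and recall.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list, one vote per model per sample.

    With three models and two classes there is always a strict majority.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]

def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure: harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}
```

For example, passing the label lists produced by an RF, an SVM and an LR model to `majority_vote` yields a single prediction list that `binary_metrics` can score against the ground truth.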

    Predictive Accuracy of Recommender Algorithms

    Recommender systems present a customized list of items based upon user or item characteristics with the objective of reducing a large number of possible choices to a smaller ranked set most likely to appeal to the user. A variety of algorithms for recommender systems have been developed and refined including applications of deep learning neural networks. Recent research reports point to a need to perform carefully controlled experiments to gain insights about the relative accuracy of different recommender algorithms, because studies evaluating different methods have not used a common set of benchmark data sets, baseline models, and evaluation metrics. The dissertation used publicly available sources of ratings data with a suite of three conventional recommender algorithms and two deep learning (DL) algorithms in controlled experiments to assess their comparative accuracy. Results for the non-DL algorithms conformed well to published results and benchmarks. The two DL algorithms did not perform as well and illuminated known challenges implementing DL recommender algorithms as reported in the literature. Model overfitting is discussed as a potential explanation for the weaker performance of the DL algorithms and several regularization strategies are reviewed as possible approaches to improve predictive error. Findings justify the need for further research in the use of deep learning models for recommender systems.

    Pole-to-Pole Connections: Similarities between Arctic and Antarctic Microbiomes and Their Vulnerability to Environmental Change

    Acknowledgments JK acknowledges the Carl Zeiss foundation for PhD funding, the Marie-Curie COFUND-BEIPD PostDoc fellowship for PostDoc funding, FNRS travel funding and the logistical and financial support by UNIS. JK and FK acknowledge the Natural Environment Research Council (NERC) Antarctic Funding Initiative AFI-CGS-70 (collaborative gearing scheme) and logistic support from the British Antarctic Survey (BAS) for field work in Antarctica. JK and CZ acknowledge the Excellence Initiative at the University of Tübingen funded by the German Federal Ministry of Education and Research and the German Research Foundation (DFG). FH, AV, and PB received funding from MetaHIT (HEALTH-F4-2007-201052), Microbios (ERC-AdG-502 669830) and the European Molecular Biology Laboratory (EMBL). We thank members of the Bork group at EMBL for helpful discussions. We acknowledge the EMBL Genomics Core Facility for sequencing support and Y. P. Yuan and the EMBL Information Technology Core Facility for support with high-performance computing and EMBL for financial support. PC is supported by NERC core funding to the BAS “Biodiversity, Evolution and Adaptation” Team. MB was funded by Helge Ax:son Johnsons Stiftelse and PUT1317. DRD acknowledges the DFG funded project DI698/18-1 Dietrich and the Marie Curie International Research Staff Exchange Scheme Fellowship (PIRSES-GA-2011-295223). Operations in the Canadian High Arctic were supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), ArcticNet and the Polar Continental Shelf Program (PCSP). We are also grateful to the TOTAL Foundation (Paris) and the UK NERC (WP 4.3 of Oceans 2025 core funding to FCK at the Scottish Association for Marine Science) for funding the expedition to Baffin Island and within this context Olivier Dargent and Dr. Pieter van West for sample collection, and the Spanish Ministry of Science and Technology through project LIMNOPOLAR (POL200606635 and CGL2005-06549-C02-01/ANT to AQ as well as CGL2005-06549-C02-02/ANT to AC, the last of these co-financed by European FEDER funds). We are grateful for funding from the MASTS pooling initiative (The Marine Alliance for Science and Technology for Scotland), funded by the Scottish Funding Council (HR09011) and contributing institutions. Supplementary Material The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2017.00137/full#supplementary-material Peer reviewed.

    Large Scale Visual Recommendations From Street Fashion Images

    We describe a completely automated large scale visual recommendation system for fashion. Our focus is to efficiently harness the availability of large quantities of online fashion images and their rich meta-data. Specifically, we propose four data driven models in the form of Complementary Nearest Neighbor Consensus, Gaussian Mixture Models, Texture Agnostic Retrieval and Markov Chain LDA for solving this problem. We analyze relative merits and pitfalls of these algorithms through extensive experimentation on a large-scale data set and baseline them against existing ideas from color science. We also illustrate key fashion insights learned through these experiments and show how they can be employed to design better recommendation systems. Finally, we also outline a large-scale annotated data set of fashion images (Fashion-136K) that can be exploited for future vision research.