Processing and Linking Audio Events in Large Multimedia Archives: The EU inEvent Project
In the inEvent EU project [1], we aim at structuring, retrieving, and sharing large archives of networked, and dynamically changing, multimedia recordings, mainly consisting of meetings, videoconferences, and lectures. More specifically, we are developing an integrated system that performs audiovisual processing of multimedia recordings, and labels them in terms of interconnected "hyper-events" (a notion inspired by hyper-texts). Each hyper-event is composed of simpler facets, including audio-video recordings and metadata, which are then easier to search, retrieve and share. In the present paper, we mainly cover the audio processing aspects of the system, including speech recognition, speaker diarization and linking (across recordings), the use of these features for hyper-event indexing and recommendation, and the search portal. We present initial results for feature extraction from lecture recordings using the TED talks. Index Terms: Networked multimedia events; audio processing: speech recognition; speaker diarization and linking; multimedia indexing and searching; hyper-events.
A Comparative Study of Pairwise Learning Methods based on Kernel Ridge Regression
Many machine learning problems can be formulated as predicting labels for a
pair of objects. Problems of that kind are often referred to as pairwise
learning, dyadic prediction or network inference problems. During the last
decade kernel methods have played a dominant role in pairwise learning. They
still obtain a state-of-the-art predictive performance, but a theoretical
analysis of their behavior has been underexplored in the machine learning
literature.
In this work we review and unify existing kernel-based algorithms that are
commonly used in different pairwise learning settings, ranging from matrix
filtering to zero-shot learning. To this end, we focus on closed-form efficient
instantiations of Kronecker kernel ridge regression. We show that independent
task kernel ridge regression, two-step kernel ridge regression and a linear
matrix filter arise naturally as a special case of Kronecker kernel ridge
regression, implying that all these methods implicitly minimize a squared loss.
In addition, we analyze universality, consistency and spectral filtering
properties. Our theoretical results provide valuable insights in assessing the
advantages and limitations of existing pairwise learning methods.
Comment: arXiv admin note: text overlap with arXiv:1606.0427
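For orientation, the closed-form instantiation of Kronecker kernel ridge regression that the abstract above refers to is commonly written as below. This is a sketch in notation assumed here rather than taken from the abstract: K and G are the symmetric kernel matrices over the two object types, Y is the label matrix (rows indexed by the first object type, columns by the second), \lambda is the regularization parameter, A holds the dual parameters, and F the fitted values.

```latex
\begin{align}
\operatorname{vec}(A) &= (G \otimes K + \lambda I)^{-1} \operatorname{vec}(Y), \\
K &= U \Sigma U^\top, \qquad G = V S V^\top, \\
A &= U \bigl[ (U^\top Y V) \odot E \bigr] V^\top,
  \qquad E_{ij} = \frac{1}{\sigma_i s_j + \lambda}, \\
F &= K A G .
\end{align}
```

Here \sigma_i and s_j are the eigenvalues on the diagonals of \Sigma and S, and \odot is the elementwise product. The third line follows from the first because (G \otimes K)\operatorname{vec}(A) = \operatorname{vec}(K A G), so the system K A G + \lambda A = Y decouples in the two eigenbases and no Kronecker product has to be formed explicitly.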
Missing Value Imputation With Unsupervised Backpropagation
Many data mining and data analysis techniques operate on dense matrices or
complete tables of data. Real-world data sets, however, often contain unknown
values. Even many classification algorithms that are designed to operate with
missing values still exhibit deteriorated accuracy. One approach to handling
missing values is to fill in (impute) the missing values. In this paper, we
present a technique for unsupervised learning called Unsupervised
Backpropagation (UBP), which trains a multi-layer perceptron to fit to the
manifold sampled by a set of observed point-vectors. We evaluate UBP with the
task of imputing missing values in datasets, and show that UBP is able to
predict missing values with significantly lower sum-squared error than other
collaborative filtering and imputation techniques. We also demonstrate with 24
datasets and 9 supervised learning algorithms that classification accuracy is
usually higher when randomly-withheld values are imputed using UBP, rather than
with other methods.
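The evaluation protocol described above (withhold known values, impute them, and score the imputations by sum-squared error against the withheld truth) can be sketched as follows. This is a minimal illustration only: it uses column-mean imputation as a stand-in baseline, not UBP itself, and the names `column_mean_impute`, `sum_squared_error` and the toy matrix are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Matrix = List[List[Optional[float]]]


def column_mean_impute(data: Matrix) -> List[List[float]]:
    """Fill each missing entry (None) with the mean of the observed values in its column."""
    n_cols = len(data[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in data if row[j] is not None]
        # Fall back to 0.0 when a column has no observed values at all.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in data]


def sum_squared_error(
    imputed: List[List[float]],
    truth: List[List[float]],
    withheld: List[Tuple[int, int]],
) -> float:
    """Sum of squared differences, computed over the withheld cells only."""
    return sum((imputed[i][j] - truth[i][j]) ** 2 for i, j in withheld)


# Hypothetical toy data: a complete matrix from which two cells are withheld.
truth = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
withheld = [(0, 1), (2, 0)]
masked: Matrix = [
    [None if (i, j) in withheld else v for j, v in enumerate(row)]
    for i, row in enumerate(truth)
]

imputed = column_mean_impute(masked)
sse = sum_squared_error(imputed, truth, withheld)
print(f"SSE on withheld cells: {sse}")
```

Any other imputation method can be compared under the same protocol by swapping it in for `column_mean_impute` and keeping the withheld cells fixed.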
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
An Ensemble Model-Based Recommendation Approach for Consumer Decision-Making System
A recommendation system can suggest items aligned with diverse user interests by leveraging multiple sources of information. While many recommendation systems heavily rely on the collaborative filtering (CF) approach, where a user's preference data is combined with that of other users to predict additional items of potential interest, this study introduces a novel weighted recommendation system to enhance consumer decision-making using CF. The methodology includes the development of equations to calculate the weights for both the product and review, as well as to determine the similarity between consumer reviews. To build the ensemble, Random Forest (RF), Support Vector Machine (SVM), and Logistic Regression (LR) are employed. The study uses the ensemble of classifiers (RF+SVM+LR) to produce the results, aiming for improved outcomes compared to prior research. The proposed model is trained and tested using an open-source dataset on Kaggle's website. Numerical analysis of the proposed model reveals superior performance, outperforming conventional methods in terms of accuracy (0.821), precision (0.802), recall (0.821), F-measure (0.833), error rate (0.100), and more.
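The abstract above does not say how the three classifiers are combined, so the sketch below assumes simple hard (majority) voting and shows how accuracy, precision, recall and F-measure are computed from the combined labels. The per-model predictions are hypothetical placeholders, not outputs of RF, SVM or LR trained on the Kaggle dataset, and `majority_vote` and `binary_metrics` are illustrative names.

```python
from collections import Counter
from typing import Dict, List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model label lists by hard voting (one vote per model per sample).

    With three models and binary labels a tie cannot occur.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels and per-model predictions (stand-ins for RF, SVM and LR outputs).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
rf_pred = [1, 0, 1, 0, 0, 1, 0, 1]
svm_pred = [1, 0, 0, 1, 0, 0, 0, 1]
lr_pred = [0, 1, 1, 1, 0, 0, 1, 0]

ensemble = majority_vote([rf_pred, svm_pred, lr_pred])
metrics = binary_metrics(y_true, ensemble)
print(ensemble, metrics)
```

Soft voting over predicted probabilities or a learned weighting of the models would be equally plausible readings of "ensemble"; only the combining function would change.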
Predictive Accuracy of Recommender Algorithms
Recommender systems present a customized list of items based upon user or item characteristics with the objective of reducing a large number of possible choices to a smaller ranked set most likely to appeal to the user. A variety of algorithms for recommender systems have been developed and refined, including applications of deep learning neural networks. Recent research reports point to a need to perform carefully controlled experiments to gain insights about the relative accuracy of different recommender algorithms, because studies evaluating different methods have not used a common set of benchmark data sets, baseline models, and evaluation metrics. The dissertation used publicly available sources of ratings data with a suite of three conventional recommender algorithms and two deep learning (DL) algorithms in controlled experiments to assess their comparative accuracy. Results for the non-DL algorithms conformed well to published results and benchmarks. The two DL algorithms did not perform as well and illuminated known challenges in implementing DL recommender algorithms, as reported in the literature. Model overfitting is discussed as a potential explanation for the weaker performance of the DL algorithms, and several regularization strategies are reviewed as possible approaches to reduce predictive error. Findings justify the need for further research in the use of deep learning models for recommender systems.
Pole-to-Pole Connections : Similarities between Arctic and Antarctic Microbiomes and Their Vulnerability to Environmental Change
Acknowledgments: JK acknowledges the Carl Zeiss foundation for PhD funding, the Marie-Curie COFUND-BEIPD PostDoc fellowship for PostDoc funding, FNRS travel funding and the logistical and financial support by UNIS. JK and FK acknowledge the Natural Environment Research Council (NERC) Antarctic Funding Initiative AFI-CGS-70 (collaborative gearing scheme) and logistic support from the British Antarctic Survey (BAS) for field work in Antarctica. JK and CZ acknowledge the Excellence Initiative at the University of Tübingen funded by the German Federal Ministry of Education and Research and the German Research Foundation (DFG). FH, AV, and PB received funding from MetaHIT (HEALTH-F4-2007-201052), Microbios (ERC-AdG-502 669830) and the European Molecular Biology Laboratory (EMBL). We thank members of the Bork group at EMBL for helpful discussions. We acknowledge the EMBL Genomics Core Facility for sequencing support and Y. P. Yuan and the EMBL Information Technology Core Facility for support with high-performance computing and EMBL for financial support. PC is supported by NERC core funding to the BAS "Biodiversity, Evolution and Adaptation" Team. MB was funded by Helge Ax:son Johnsons Stiftelse and PUT1317. DRD acknowledges the DFG funded project DI698/18-1 Dietrich and the Marie Curie International Research Staff Exchange Scheme Fellowship (PIRSES-GA-2011-295223). Operations in the Canadian High Arctic were supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), ArcticNet and the Polar Continental Shelf Program (PCSP). We are also grateful to the TOTAL Foundation (Paris) and the UK NERC (WP 4.3 of Oceans 2025 core funding to FCK at the Scottish Association for Marine Science) for funding the expedition to Baffin Island and within this context Olivier Dargent and Dr. Pieter van West for sample collection, and the Spanish Ministry of Science and Technology through project LIMNOPOLAR (POL200606635 and CGL2005-06549-C02-01/ANT to AQ as well as CGL2005-06549-C02-02/ANT to AC, the last of these co-financed by European FEDER funds). We are grateful for funding from the MASTS pooling initiative (The Marine Alliance for Science and Technology for Scotland), funded by the Scottish Funding Council (HR09011) and contributing institutions. Supplementary Material: The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2017.00137/full#supplementary-material
Large Scale Visual Recommendations From Street Fashion Images
We describe a completely automated large scale visual recommendation system
for fashion. Our focus is to efficiently harness the availability of large
quantities of online fashion images and their rich meta-data. Specifically, we
propose four data driven models in the form of Complementary Nearest Neighbor
Consensus, Gaussian Mixture Models, Texture Agnostic Retrieval and Markov Chain
LDA for solving this problem. We analyze relative merits and pitfalls of these
algorithms through extensive experimentation on a large-scale data set and
baseline them against existing ideas from color science. We also illustrate key
fashion insights learned through these experiments and show how they can be
employed to design better recommendation systems. Finally, we also outline a
large-scale annotated data set of fashion images (Fashion-136K) that can be
exploited for future vision research.