10 research outputs found

    Probabilistic Models over Ordered Partitions with Application in Learning to Rank

    Get PDF
    This paper addresses the general problem of modelling and learning rank data with ties. We propose a probabilistic generative model, that models the process as permutations over partitions. This results in super-exponential combinatorial state space with unknown numbers of partitions and unknown ordering among them. We approach the problem from the discrete choice theory, where subsets are chosen in a stagewise manner, reducing the state space per each stage significantly. Further, we show that with suitable parameterisation, we can still learn the models in linear time. We evaluate the proposed models on the problem of learning to rank with the data from the recently held Yahoo! challenge, and demonstrate that the models are competitive against well-known rivals.Comment: 19 pages, 2 figure

    Automatic Quality Estimation for ASR System Combination

    Get PDF
    Recognizer Output Voting Error Reduction (ROVER) has been widely used for system combination in automatic speech recognition (ASR). In order to select the most appropriate words to insert at each position in the output transcriptions, some ROVER extensions rely on critical information such as confidence scores and other ASR decoder features. This information, which is not always available, highly depends on the decoding process and sometimes tends to over estimate the real quality of the recognized words. In this paper we propose a novel variant of ROVER that takes advantage of ASR quality estimation (QE) for ranking the transcriptions at "segment level" instead of: i) relying on confidence scores, or ii) feeding ROVER with randomly ordered hypotheses. We first introduce an effective set of features to compensate for the absence of ASR decoder information. Then, we apply QE techniques to perform accurate hypothesis ranking at segment-level before starting the fusion process. The evaluation is carried out on two different tasks, in which we respectively combine hypotheses coming from independent ASR systems and multi-microphone recordings. In both tasks, it is assumed that the ASR decoder information is not available. The proposed approach significantly outperforms standard ROVER and it is competitive with two strong oracles that e xploit prior knowledge about the real quality of the hypotheses to be combined. Compared to standard ROVER, the abs olute WER improvements in the two evaluation scenarios range from 0.5% to 7.3%

    A cross-benchmark comparison of 87 learning to rank methods

    Get PDF
    Learning to rank is an increasingly important scientific field that comprises the use of machine learning for the ranking task. New learning to rank methods are generally evaluated on benchmark test collections. However, comparison of learning to rank methods based on evaluation results is hindered by the absence of a standard set of evaluation benchmark collections. In this paper we propose a way to compare learning to rank methods based on a sparse set of evaluation results on a set of benchmark datasets. Our comparison methodology consists of two components: (1) Normalized Winning Number, which gives insight in the ranking accuracy of the learning to rank method, and (2) Ideal Winning Number, which gives insight in the degree of certainty concerning its ranking accuracy. Evaluation results of 87 learning to rank methods on 20 well-known benchmark datasets are collected through a structured literature search. ListNet, SmoothRank, FenchelRank, FSMRank, LRUF and LARF are Pareto optimal learning to rank methods in the Normalized Winning Number and Ideal Winning Number dimensions, listed in increasing order of Normalized Winning Number and decreasing order of Ideal Winning Number

    Online Mental Fatigue Monitoring via Indirect Brain Dynamics Evaluation

    Full text link
    Driver mental fatigue leads to thousands of traffic accidents. The increasing quality and availability of low-cost electroencephalogram (EEG) systems offer possibilities for practical fatigue monitoring. However, non-data-driven methods, designed for practical, complex situations, usually rely on handcrafted data statistics of EEG signals. To reduce human involvement, we introduce a data-driven methodology for online mental fatigue detection: self-weight ordinal regression (SWORE). Reaction time (RT), referring to the length of time people take to react to an emergency, is widely considered an objective behavioral measure for mental fatigue state. Since regression methods are sensitive to extreme RTs, we propose an indirect RT estimation based on preferences to explore the relationship between EEG and RT, which generalizes to any scenario when an objective fatigue indicator is available. In particular, SWORE evaluates the noisy EEG signals from multiple channels in terms of two states: shaking state and steady state. Modeling the shaking state can discriminate the reliable channels from the uninformative ones, while modeling the steady state can suppress the task-nonrelevant fluctuation within each channel. In addition, an online generalized Bayesian moment matching (online GBMM) algorithm is proposed to online-calibrate SWORE efficiently per participant. Experimental results with 40 participants show that SWORE can maximally achieve consistent with RT, demonstrating the feasibility and adaptability of our proposed framework in practical mental fatigue estimation

    Concept of Interactive Machine Learning in Urban Design Problems

    Get PDF
    This work presents a concept of interactive machine learning in a human design process. An urban design problem is viewed as  a multiple-criteria optimization problem. The outlined feature  of an urban design problem is the dependence of a design  goal on a context of the problem. We model the design goal  as a randomized fitness measure that depends on the context.  In terms of multiple-criteria decision analysis (MCDA), the  defined measure corresponds to a subjective expected utility  of a user.  In the first stage of the proposed approach we let the algorithm  explore a design space using clustering techniques. The second  stage is an interactive design loop; the user makes a proposal,  then the program optimizes it, gets the user’s feedback and  returns back the control over the application interface

    Supervised Preference Models: Data and Storage, Methods, and Tools for Application

    Get PDF
    In this thesis, we present a variety of models commonly known as pairwise comparisons, discrete choice and learning to rank under one paradigm that we call preference models. We discuss these approaches together with the intention to show that these belong to the same family and show a unified notation to express these. We focus on supervised machine learning approaches to predict preferences, present existing approaches and identify gaps in the literature. We discuss reduction and aggregation, a key technique used in this field and identify that there are no existing guidelines for how to create probabilistic aggregations, which is a topic we begin exploring. We also identify that there are no machine learning interfaces in Python that can account well for hosting a variety of types of preference models and giving a seamless user experience when it comes to using commonly recurring concepts in preference models, specifically, reduction, aggregation and compositions of sequential decision making. Therefore, we present our idea of what such software should look like in Python and show the current state of the development of this package which we call skpref

    Proceedings of the 8th Workshop on Semantic Ambient Media Experiences (SAME 2016): Smart Cities for Better Living with HCI and UX - SEACHI Extended Papers

    Get PDF
    Digital and interactive technologies are becoming increasingly embedded in everyday lives of people around the world. Application of technologies such as real-time, context-aware, and interactive technologies; augmented and immersive realities; social media; and location-based services has been particularly evident in urban environments where technological and sociocultural infrastructures enable easier deployment and adoption as compared to non-urban areas. There has been growing consumer demand for new forms of experiences and services enabled through these emerging technologies. We call this ambient media, as the media is embedded in the natural human living environment. The 8th Semantic Ambient Media Workshop Experience (SAME) Proceedings where based on a collaboration between the SEACHI Workshop Smart Cities for Better Living with HCI and UX, which has been organized by UX Indonesia and was held in conjunction with Computers and Human-Computer Interaction (CHI) 2016 in San Jose, CA USA. The extended versions of the workshop papers are freely available through http://www.ambientmediaassociation.org/Journal under open access by the International Ambient Media Association (iAMEA). iAMEA is hosting the international open access journal entitled “International Journal on Information Systems and Management in Creative eMediaâ€, and the international open access series “International Series on Information Systems and Management in Creative eMedia†(see http://www.ambientmediaassociation.org). The International Ambient Media Association (AMEA) organizes the Semantic Ambient Media (SAME) workshop series, which took place in 2008 in conjunction with ACM Multimedia 2008 in Vancouver, Canada; in 2009 in conjunction with AmI 2009 in Salzburg, Austria; in 2010 in conjunction with AmI 2010 in Malaga, Spain; in 2011 in conjunction with Communities and Technologies 2011 in Brisbane, Australia; in 2012 in conjunction with Pervasive 2012 in Newcastle, UK; and in 2013 in conjunction with C&T 2013 in Munich, Germany; and in 2014 in conjunction with NordCHI 2014 in Helsinki, Finland. The workshop organizers present you a fascinating crossover of latest cutting edge views on the topic of ambient media, and hope you will be enjoying the reading. We also would like to thank all the contributors, as only with their enthusiasm the workshop can become a success. At least we would like to thank the lovely organizing team of CHI 2016, the SEACHI 2016 organisers, and our programme committee members

    Extending low-rank matrix factorizations for emerging applications

    Get PDF
    Low-rank matrix factorizations have become increasingly popular to project high dimensional data into latent spaces with small dimensions in order to obtain better understandings of the data and thus more accurate predictions. In particular, they have been widely applied to important applications such as collaborative filtering and social network analysis. In this thesis, I investigate the applications and extensions of the ideas of the low-rank matrix factorization to solve several practically important problems arise from collaborative filtering and social network analysis. A key challenge in recommendation system research is how to effectively profile new users, a problem generally known as \emph{cold-start} recommendation. In the first part of this work, we extend the low-rank matrix factorization by allowing the latent factors to have more complex structures --- decision trees to solve the problem of cold-start recommendations. In particular, we present \emph{functional matrix factorization} (fMF), a novel cold-start recommendation method that solves the problem of adaptive interview construction based on low-rank matrix factorizations. The second part of this work considers the efficiency problem of making recommendations in the context of large user and item spaces. Specifically, we address the problem through learning binary codes for collaborative filtering, which can be viewed as restricting the latent factors in low-rank matrix factorizations to be binary vectors that represent the binary codes for both users and items. In the third part of this work, we investigate the applications of low-rank matrix factorizations in the context of social network analysis. Specifically, we propose a convex optimization approach to discover the hidden network of social influence with low-rank and sparse structure by modeling the recurrent events at different individuals as multi-dimensional Hawkes processes, emphasizing the mutual-excitation nature of the dynamics of event occurrences. The proposed framework combines the estimation of mutually exciting process and the low-rank matrix factorization in a principled manner. In the fourth part of this work, we estimate the triggering kernels for the Hawkes process. In particular, we focus on estimating the triggering kernels from an infinite dimensional functional space with the Euler Lagrange equation, which can be viewed as applying the idea of low-rank factorizations in the functional space.Ph.D
    corecore