49 research outputs found

    Deep Neural Networks for Visual Reasoning, Program Induction, and Text-to-Image Synthesis.

    Deep neural networks excel at pattern recognition, especially in the setting of large-scale supervised learning. A combination of better hardware, more data, and algorithmic improvements has yielded breakthroughs in image classification, speech recognition and other perception problems. The research frontier has shifted towards the weak side of neural networks: reasoning, planning, and (like all machine learning algorithms) creativity. How can we advance along this frontier using the same generic techniques that are so effective in pattern recognition, i.e., gradient descent with backpropagation? In this thesis I develop neural architectures with new capabilities in visual reasoning, program induction and text-to-image synthesis. I propose two models that disentangle the latent visual factors of variation that give rise to images, and enable analogical reasoning in the latent space. I show how to augment a recurrent network with a memory of programs that enables the learning of compositional structure for more data-efficient and generalizable program induction. Finally, I develop a generative neural network that translates descriptions of birds, flowers and other categories into compelling natural images. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/135763/1/reedscot_1.pd

    Neural recommender models for sparse and skewed behavioral data

    Modern online platforms offer recommendations and personalized search and services to a large and diverse user base while still aiming to acquaint users with the broader community on the platform. Prior work backed by large volumes of user data has shown that user retention is reliant on catering to their specific eccentric tastes, in addition to providing them popular services or content on the platform. Long-tailed distributions are a fundamental characteristic of human activity, owing to the bursty nature of human attention. As a result, we often observe skew in data facets that involve human interaction. While there are superficial similarities to Zipf's law in textual data and other domains, the challenges with user data extend further. Individual words may have skewed frequencies in the corpus, but the long-tail words by themselves do not significantly impact downstream text-mining tasks. On the contrary, while sparse users (a majority on most online platforms) contribute little to the training data, they are equally crucial at inference time. Perhaps more so, since they are likely to churn. In this thesis, we study platforms and applications that elicit user participation in rich social settings incorporating user-generated content, user-user interaction, and other modalities of user participation and data generation. For instance, users on the Yelp review platform participate in a follower-followee network and also create and interact with review text (two modalities of user data). Similarly, community question-answer (CQA) platforms incorporate user interaction and collaboratively authored content over diverse domains and discussion threads. Since user participation is multimodal, we develop generalizable abstractions beyond any single data modality. 
Specifically, we aim to address the distributional mismatch that occurs with user data independent of dataset specifics: while a minority of the users generates most training samples, it is insufficient to learn the preferences of only this subset of users. The data's overall skew and individual users' sparsity are closely interlinked: sparse users with uncommon preferences are under-represented. Thus, we propose to treat these problems jointly with a skew-aware grouping mechanism that iteratively sharpens the identification of preference groups within the user population. As a result, we improve user characterization, content recommendation, and activity prediction (+6-22% AUC, +6-43% AUC, +12-25% RMSE over state-of-the-art baselines), primarily for users with sparse activity. The size of the item or content inventories compounds the skew problem. Recommendation models can achieve very high aggregate performance while recommending only a tiny proportion of the inventory (as little as 5%) to users. We propose a data-driven solution guided by the aggregate co-occurrence information across items in the dataset. We specifically note that different co-occurrences are not equally significant; for example, some co-occurring items are easily substituted while others are not. We develop a self-supervised learning framework where the aggregate co-occurrences guide the recommendation problem while providing room to learn these variations among the item associations. As a result, we improve coverage to ~100% (up from 5%) of the inventory and increase long-tail item recall by up to 25%. We also note that the skew and sparsity problems repeat across data modalities. For instance, social interactions and review content both exhibit aggregate skew, although individual users who actively generate reviews may not participate socially and vice versa. 
In such cases, it is necessary to weight and merge the different data sources differentially for each user for inference tasks. We show that the problem is inherently adversarial since the user participation modalities compete to describe a user accurately. We develop a framework to unify these representations while algorithmically tackling mode collapse, a well-known pitfall with adversarial models. A more challenging but important instantiation of sparsity is the few-shot or cross-domain setting. We may only have a single or a few interactions for users or items in the sparse domains or partitions. We show that contextualizing user-item interactions helps us infer behavioral invariants in the dense domain, allowing us to correlate sparse participants to their active counterparts (resulting in 3x faster training, ~19% recall gains in multi-domain settings). Finally, we consider the multi-task setting, where the platform incorporates multiple distinct recommendation and prediction tasks for each user. A single user representation is insufficient for users who exhibit different preferences along each dimension. At the same time, it is counter-productive to handle correlated prediction or inference tasks in isolation. We develop a multi-faceted representation approach grounded on residual learning with heterogeneous knowledge graph representations, which provides us with an expressive data representation for specialized domains and applications with multimodal user data. We achieve knowledge sharing by unifying task-independent and task-specific representations of each entity with a unified knowledge graph framework. In each chapter, we also discuss and demonstrate how the proposed frameworks directly incorporate a wide range of gradient-optimizable recommendation and behavior models, maximizing their applicability and pertinence to user-centered inference tasks and platforms.

    Representation Learning Methods for Sequential Information in Marketing and Customer Level Transactions

    The rapid growth of data generated by businesses has surpassed human capabilities to produce actionable insights. Modern marketing applications depend on vast amounts of customer labelled data and supervised machine learning algorithms to predict customer behaviour and their potential next actions. However, this process requires significant effort in data pre-processing and the involvement of domain experts, which can be costly and time-consuming. This work reviews representation learning techniques as an alternative approach to feature engineering, aiming to eliminate the need for hand-crafted features and accelerate the process of extracting insights from data. Techniques such as Bayesian neural networks, general embeddings, and encoding-decoding architectures are explored to compress information obtained directly from raw input data into a dense probabilistic space. This thesis introduces the necessary technical aspects of neural networks and representation learning, from traditional methods like principal component analysis (PCA) and embeddings, to latent variable and generative methods that use deep neural networks, such as variational auto-encoders and Bayesian neural networks. It also explores the theoretical background of survival analysis and recommender systems, which serve as the foundation for the applications presented in this work: predicting when individuals are likely to end their relationship with a business in non-contractual settings, and which items individuals are most likely to interact with in their next purchase. Experiments conducted on real-world retail and benchmark datasets demonstrate comparable predictive performance and superior computational efficiency when compared to existing methods.

    Recent Developments in Recommender Systems: A Survey

    In this technical survey, we comprehensively summarize the latest advancements in the field of recommender systems. The objective of this study is to provide an overview of the current state-of-the-art in the field and highlight the latest trends in the development of recommender systems. The study starts with a comprehensive summary of the main taxonomy of recommender systems, including personalized and group recommender systems, and then delves into the category of knowledge-based recommender systems. In addition, the survey analyzes the robustness, data bias, and fairness issues in recommender systems, summarizing the evaluation metrics used to assess the performance of these systems. Finally, the study provides insights into the latest trends in the development of recommender systems and highlights the new directions for future research in the field

    Knowledge and Reasoning for Image Understanding

    Image Understanding is a long-established discipline in computer vision, which encompasses a body of advanced image processing techniques that are used to locate (“where”), characterize and recognize (“what”) objects, regions, and their attributes in the image. However, the notion of “understanding” (and the goal of artificially intelligent machines) goes beyond factual recall of the recognized components and includes reasoning and thinking beyond what can be seen (or perceived). Understanding is often evaluated by asking questions of increasing difficulty. Thus, the expected functionalities of an intelligent Image Understanding system can be expressed in terms of the functionalities that are required to answer questions about an image. Answering questions about images requires primarily three components: Image Understanding, question (natural language) understanding, and reasoning based on knowledge. Any question that asks beyond what can be directly seen requires modeling of commonsense (or background/ontological/factual) knowledge and reasoning. Knowledge and reasoning have seen scarce use in image understanding applications. In this thesis, we demonstrate the utility of incorporating background knowledge and using explicit reasoning in image understanding applications. We first present a comprehensive survey of the previous work that utilized background knowledge and reasoning in understanding images. This survey outlines the limited use of commonsense knowledge in high-level applications. We then present a set of vision and reasoning-based methods to solve several applications and show that these approaches benefit in terms of accuracy and interpretability from the explicit use of knowledge and reasoning. 
We propose novel knowledge representations of images, knowledge acquisition methods, and a new implementation of an efficient probabilistic logical reasoning engine that can utilize publicly available commonsense knowledge to solve applications such as visual question answering and image puzzles. Additionally, we identify the need for new datasets that explicitly require external commonsense knowledge to solve. We propose the new task of Image Riddles, which requires a combination of vision and reasoning based on ontological knowledge, and we collect a sufficiently large dataset to serve as an ideal testbed for vision and reasoning research. Lastly, we propose end-to-end deep architectures that can combine vision, knowledge and reasoning modules together and achieve large performance boosts over state-of-the-art methods. Dissertation/Thesis, Doctoral Dissertation, Computer Science 201

    Memory-based preferential choice in large option spaces

    Whether adding songs to a playlist or groceries to a shopping basket, everyday decisions often require us to choose among an innumerable set of options. Laboratory studies of preferential choice have made considerable progress in describing how people navigate fixed sets of options. Yet, questions remain about how well this generalises to more complex, everyday choices. In this thesis, I ask how people navigate large option spaces, focusing particularly on how long-term memory supports decisions. In the first project, I explore how large option spaces are structured in the mind. A topic model trained on the purchasing patterns of consumers uncovered an intuitive set of themes that centred primarily around goals (e.g., tomatoes go well in a salad), suggesting that representations are geared to support action. In the second project, I explore how such representations are queried during memory-based decisions, where options must be retrieved from memory. Using a large dataset of over 100,000 online grocery shops, results revealed that consumers query multiple systems of associative memory when determining what to choose next. Attending to certain knowledge sources, as estimated by a cognitive model, predicted important retrieval errors, such as the propensity to forget or add unwanted products. In the final project, I ask how preferences could be learned and represented in large option spaces, where most options are untried. A cognitive model of sequential decision making is proposed, which learns preferences over choice attributes, allowing for the generalisation of preferences to unseen options, by virtue of their similarity to previous choices. This model explains the reduced exploration behaviour observed in the supermarket and preferential choices in more controlled laboratory settings. 
Overall, this suggests that consumers depend on associative systems in long-term memory when navigating large spaces of options, enabling inferences about the conceptual properties and subjective value of novel options

    Utilizing AI/ML methods for measuring data quality

    High-quality data is crucial for trusted data-based decisions. A considerable part of current approaches to measuring data quality involves expensive, expert and time-consuming work that requires manual effort to achieve adequate results. Furthermore, these approaches are prone to error and do not take full advantage of the potential of AI. A possible solution is to explore state-of-the-art ML-based methods that use the potential of AI to overcome these issues. A significant part of the thesis deals with data quality theory, which provides a comprehensive insight into the field of data quality. Four state-of-the-art ML-based methods were discovered in the existing literature, and one novel method based on Autoencoders (AE) was proposed. 
Experiments with AE and Association Rule Mining using NLP were conducted. The proposed AE-based methods proved able to detect potential data quality defects in real-world datasets. The Association Rule Mining approach was able to extract business rules for a given business question, but it required significant preprocessing effort. Alternative non-AI methods were also analyzed but required reliance on expert and domain knowledge.