301 research outputs found

    Neural Networks for Information Retrieval

    Get PDF
    Machine learning plays a role in many aspects of modern IR systems, and deep learning is applied in all of them. The fast pace of modern-day research has given rise to many different approaches for many different IR problems. The amount of information available can be overwhelming both for junior students and for experienced researchers looking for new research topics and directions. Additionally, it is interesting to see what key insights into IR problems the new technologies are able to give us. The aim of this full-day tutorial is to give a clear overview of current tried-and-trusted neural methods in IR and how they benefit IR research. It covers key architectures, as well as the most promising future directions.Comment: Overview of full-day tutorial at SIGIR 201

    Generating Diverse and Meaningful Captions

    Get PDF
    Image Captioning is a task that requires models to acquire a multi-modal understanding of the world and to express this understanding in natural language text. While the state-of-the-art for this task has rapidly improved in terms of n-gram metrics, these models tend to output the same generic captions for similar images. In this work, we address this limitation and train a model that generates more diverse and specific captions through an unsupervised training approach that incorporates a learning signal from an Image Retrieval model. We summarize previous results and improve the state-of-the-art on caption diversity and novelty. We make our source code publicly available online.Comment: Accepted for presentation at The 27th International Conference on Artificial Neural Networks (ICANN 2018

    Waffling around for Performance: Visual Classification with Random Words and Broad Concepts

    Full text link
    The visual classification performance of vision-language models such as CLIP has been shown to benefit from additional semantic knowledge from large language models (LLMs) such as GPT-3. In particular, averaging over LLM-generated class descriptors, e.g. "waffle, which has a round shape", can notably improve generalization performance. In this work, we critically study this behavior and propose WaffleCLIP, a framework for zero-shot visual classification which simply replaces LLM-generated descriptors with random character and word descriptors. Without querying external models, we achieve comparable performance gains on a large number of visual classification tasks. This allows WaffleCLIP to both serve as a low-cost alternative, as well as a sanity check for any future LLM-based vision-language model extensions. We conduct an extensive experimental study on the impact and shortcomings of additional semantics introduced with LLM-generated descriptors, and showcase how - if available - semantic context is better leveraged by querying LLMs for high-level concepts, which we show can be done to jointly resolve potential class name ambiguities. Code is available here: https://github.com/ExplainableML/WaffleCLIP.Comment: Accepted to ICCV 2023. Main paper with 9 page

    Dynamic Key-Value Memory Networks for Knowledge Tracing

    Full text link
    Knowledge Tracing (KT) is a task of tracing evolving knowledge state of students with respect to one or more concepts as they engage in a sequence of learning activities. One important purpose of KT is to personalize the practice sequence to help students learn knowledge concepts efficiently. However, existing methods such as Bayesian Knowledge Tracing and Deep Knowledge Tracing either model knowledge state for each predefined concept separately or fail to pinpoint exactly which concepts a student is good at or unfamiliar with. To solve these problems, this work introduces a new model called Dynamic Key-Value Memory Networks (DKVMN) that can exploit the relationships between underlying concepts and directly output a student's mastery level of each concept. Unlike standard memory-augmented neural networks that facilitate a single memory matrix or two static memory matrices, our model has one static matrix called key, which stores the knowledge concepts and the other dynamic matrix called value, which stores and updates the mastery levels of corresponding concepts. Experiments show that our model consistently outperforms the state-of-the-art model in a range of KT datasets. Moreover, the DKVMN model can automatically discover underlying concepts of exercises typically performed by human annotations and depict the changing knowledge state of a student.Comment: To appear in 26th International Conference on World Wide Web (WWW), 201

    Worst-case bounds on the quality of max-product fixed-points

    Get PDF
    We study worst-case bounds on the quality of any fixed point assignment of the max-product algorithm for Markov Random Fields (MRF). We start proving a bound independent of the MRF structure and parameters. Afterwards, we show how this bound can be improved for MRFs with particular structures such as bipartite graphs or grids. Our results provide interesting insight into the behavior of max-product. For example, we prove that max-product provides very good results (at least 90% of the optimal) on MRFs with large variable-disjoint cycles (MRFs in which all cycles are variable-disjoint, namely that they do not share any edge and in which each cycle contains at least 20 variables)

    Web Search of Fashion Items with Multimodal Querying

    Full text link
    In this paper, we introduce a novel multimodal fashion search paradigm where e-commerce data is searched with a multimodal query composed of both an image and text. In this setting, the query image shows a fashion product that the user likes and the query text allows to change certain product attributes to fit the product to the user’s desire. Multimodal search gives users the means to clearly express what they are looking for. This is in contrast to current e-commerce search mechanisms, which are cumbersome and often fail to grasp the customer’s needs. Multimodal search requires intermodal representations of visual and textual fashion attributes which can be mixed and matched to form the user’s desired product, and which have a mechanism to indicate when a visual and textual fashion attribute represent the same concept. With a neural network, we induce a common, multimodal space for visual and textual fashion attributes where their inner product measures their semantic similarity. We build a multimodal retrieval model which operates on the obtained intermodal representations and which ranks images based on their relevance to a multimodal query. We demonstrate that our model is able to retrieve images that both exhibit the necessary query image attributes and satisfy the query texts. Moreover, we show that our model substantially outperforms two state-of-the-art retrieval models adapted to multimodal fashion search.status: accepte

    Association of Human Leukocyte Antigens Class II Variants with Susceptibility to Hidradenitis Suppurativa in a Caucasian Spanish Population

    Get PDF
    Hidradenitis suppurativa (HS) is a chronic inflammatory cutaneous disease of the hair follicle typically presenting recurrent, painful, and inflamed lesions on the inverse areas of the body. Although its pathogenesis remains unknown, the immune system appears to play a potential role. To date, two previous studies have not found any association between the Human Leukocyte Antigen system (HLA) and HS. In this study we analyzed the HLA-A, -B, -C; and DRB1, -DQA1, and ?DQB1 allele distribution in 106 HS patients and 262 healthy controls from a Caucasian population in Cantabria (northern Spain). HLA-A*29 and B*50 were significantly more common in HS patients and A*30 and B*37 in controls, but these associations disappeared after statistical correction. DRB1*07, DQA1*02, and DQB1*02 were significantly more common in controls (p 0.026, p 0.0012, and p 0.0005, respectively) and the HLA allele DQB1*03:01 was significantly more common in HS patients (p 0.00007) after the Bonferroni correction. The DRB1*07~DQA1*02~DQB1*02 haplotype was significantly more common in controls (p < 0.0005). This is the first study showing an association between HLA-class II and HS. Our results suggest that HLA-II alleles (DRB1*07, DQA1*02, DQB1*02, and DQB1*03:01) and the DRB1*07~DQA1*02~DQB1*02 haplotype could influence resistance or susceptibility to HS

    Prediction Errors of Molecular Machine Learning Models Lower than Hybrid DFT Error

    Get PDF
    We investigate the impact of choosing regressors and molecular representations for the construction of fast machine learning (ML) models of 13 electronic ground-state properties of organic molecules. The performance of each regressor/representation/property combination is assessed using learning curves which report out-of-sample errors as a function of training set size with up to ∼118k distinct molecules. Molecular structures and properties at the hybrid density functional theory (DFT) level of theory come from the QM9 database [Ramakrishnan et al. Sci. Data 2014, 1, 140022] and include enthalpies and free energies of atomization, HOMO/LUMO energies and gap, dipole moment, polarizability, zero point vibrational energy, heat capacity, and the highest fundamental vibrational frequency. Various molecular representations have been studied (Coulomb matrix, bag of bonds, BAML and ECFP4, molecular graphs (MG)), as well as newly developed distribution based variants including histograms of distances (HD), angles (HDA/MARAD), and dihedrals (HDAD). Regressors include linear models (Bayesian ridge regression (BR) and linear regression with elastic net regularization (EN)), random forest (RF), kernel ridge regression (KRR), and two types of neural networks, graph convolutions (GC) and gated graph networks (GG). Out-of sample errors are strongly dependent on the choice of representation and regressor and molecular property. Electronic properties are typically best accounted for by MG and GC, while energetic properties are better described by HDAD and KRR. The specific combinations with the lowest out-of-sample errors in the ∼118k training set size limit are (free) energies and enthalpies of atomization (HDAD/KRR), HOMO/LUMO eigenvalue and gap (MG/GC), dipole moment (MG/GC), static polarizability (MG/GG), zero point vibrational energy (HDAD/KRR), heat capacity at room temperature (HDAD/KRR), and highest fundamental vibrational frequency (BAML/RF). We present numerical evidence that ML model predictions deviate from DFT (B3LYP) less than DFT (B3LYP) deviates from experiment for all properties. Furthermore, out-of-sample prediction errors with respect to hybrid DFT reference are on par with, or close to, chemical accuracy. The results suggest that ML models could be more accurate than hybrid DFT if explicitly electron correlated quantum (or experimental) data were available
    corecore