
    The Emerging Trends of Multi-Label Learning

    Exabytes of data are generated daily by humans, creating a growing need for new efforts to address the grand challenges that big data poses for multi-label learning. For example, extreme multi-label classification is an active and rapidly growing research area that deals with classification tasks involving an extremely large number of classes or labels, and building multi-label classification models from massive data with only limited supervision is becoming increasingly valuable in practical applications. Beyond these, tremendous effort has gone into harnessing the strong learning capability of deep learning to better capture label dependencies in multi-label learning, which is the key for deep learning to address real-world classification tasks. However, there has been a lack of systematic studies that focus explicitly on analyzing the emerging trends and new challenges of multi-label learning in the era of big data. It is imperative to call for a comprehensive survey to fulfill this mission and delineate future research directions and new applications.
    Comment: Accepted to TPAMI 202

    LONGITUDINAL ANALYSIS FOR ORDINAL DATA THROUGH MULTILEVEL AND ITEM RESPONSE MODELING: APPLICATIONS TO CHILD OBSERVATION RECORD (COR)

    The large proportion of children from low-SES backgrounds and the widening achievement gap between disadvantaged children and their more advantaged peers are drawing national attention to investment in early childhood education (Reid, Kagan, Hilton, & Potter, 2015). As evidenced in many research studies (e.g., Heckman & Masterov, 2007), early childhood experiences are critical to children's development. High-quality early learning is necessary for children, especially disadvantaged children, to narrow the school-readiness gap and to build a foundation for more advanced academic, social, and cognitive skills. Importantly, high-quality assessment tools (i.e., those with reliability and validity evidence supporting the use and interpretation of their scores) are a necessary component of early childhood education. The Preschool Child Observation Record (Preschool OnlineCOR; High/Scope, 2003) is an observational assessment tool created by the High/Scope Educational Research Foundation, a nonprofit research and training organization focused on early childhood development. This study had three aims. First, a comprehensive evaluation of the psychometric qualities of the Preschool OnlineCOR was conducted, including dimensionality, reliability, item properties, and measurement invariance. Second, a short form of the Preschool OnlineCOR (COR-ST) was developed. Third, the study examined the growth trajectories of 3- and 4-year-old children on the general factor measured by the COR-ST through traditional multilevel growth modeling and more advanced multilevel growth modeling approaches, including multilevel item response theory modeling (MLIRT; Kamata, 2001; Liu & Hedeker, 2006; Pastor & Beretvas, 2006; Sulis & Toland, 2016) and multiple indicator multilevel modeling (MIML; Muthén & Muthén, 1998-2015). 
The traditional multilevel growth model consists of two steps: 1) one score is calculated to represent the general factor for each child at each time point from the raw item scores, through either classical test theory (CTT) or item response theory (IRT) approaches, and 2) the growth trajectory is modeled from the overall COR-ST score at each time point. In comparison, the more advanced approaches, the MLIRT and MIML growth models, model the item-level information (item scores) directly, simultaneously estimating item-level parameters, change over time, and the effects of child predictors.
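The two-step procedure above can be sketched in a few lines. This is a minimal illustration only: the item names, scores, and time points are invented, step 1 uses a simple CTT-style sum score, and step 2 fits a single straight-line trajectory by ordinary least squares rather than the multilevel models the study actually uses; the MLIRT/MIML alternatives would instead model the item scores directly.

```python
def total_score(item_scores):
    """Step 1 (CTT-style): one summary score = sum of raw item scores."""
    return sum(item_scores)

def fit_linear_growth(times, scores):
    """Step 2: least-squares intercept and slope of score over time."""
    n = len(times)
    mt = sum(times) / n
    ms = sum(scores) / n
    sxx = sum((t - mt) ** 2 for t in times)
    sxy = sum((t - mt) * (s - ms) for t, s in zip(times, scores))
    slope = sxy / sxx
    intercept = ms - slope * mt
    return intercept, slope

# Hypothetical child: three items observed at three time points (months).
waves = {0: [2, 3, 2], 6: [3, 4, 3], 12: [4, 5, 4]}
times = sorted(waves)
scores = [total_score(waves[t]) for t in times]   # [7, 10, 13]
intercept, slope = fit_linear_growth(times, scores)
print(intercept, slope)  # 7.0 0.5 -> half a point of growth per month
```

The point of the comparison in the text is that this pipeline discards item-level information at step 1, whereas MLIRT and MIML estimate item parameters and growth jointly.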

    Slot Filling

    Slot filling (SF) is the task of automatically extracting facts about particular entities from unstructured text and populating a knowledge base (KB) with these facts. These structured KBs enable applications such as structured web queries and question answering. SF is typically framed as a query-oriented setting of the related task of relation extraction. Throughout this thesis, we reflect on how SF is a task with many distinct problems. We demonstrate that recall is a major limiter of SF system performance. We contribute an analysis of typical SF recall loss, and find that a substantial amount of loss occurs early in the SF pipeline. We confirm that accurate NER and coreference resolution are required for high-recall SF. We measure upper bounds using a naïve graph-based semi-supervised bootstrapping technique, and find that only 39% of results are reachable using a typical feature space. We expect that this graph-based technique will be directly useful for extraction, and this leads us to frame SF as a label propagation task. We focus on a detailed graph representation of the task which reflects the behaviour and assumptions we want to model based on our analysis, including modifying the label propagation process to model multiple types of label interaction. Analysing the graph, we find that a large number of errors occur in very close proximity to training data, and identify that this is a major concern for propagation. While some conflicts are caused by a lack of sufficient disambiguating context (we explore adding contextual features to address this), many are caused by subtle annotation problems. We find that the lack of a standard for how explicit expressions of relations must be in text makes consistent annotation difficult. Using a strict definition of explicitness results in 20% of correct annotations being removed from a standard dataset. 
We contribute several annotation-driven analyses of this problem, exploring the definition of slots and the effect of the lack of a concrete definition of explicitness: annotation schemas do not specify how explicit expressions of relations must be, leaving large scope for disagreement between annotators. Additionally, applications may require relatively strict or relaxed evidence for extractions, but this is not considered in annotation tasks. We demonstrate that annotators frequently disagree on instances, depending on differences in their world knowledge and their thresholds for making probabilistic inferences. SF is fundamental to enabling many knowledge-based applications, and this work motivates modelling and evaluating SF to better target these tasks.
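The label propagation framing described above can be sketched with a toy version of the standard algorithm (the graph, edge weights, and seed labels here are invented, and the thesis's actual representation is richer, modelling multiple types of label interaction): seed nodes keep their labels clamped, while unlabelled nodes repeatedly take the weight-averaged score of their neighbours.

```python
def propagate(edges, seeds, nodes, iters=50):
    """edges: {(u, v): weight}; seeds: {node: score in [0, 1]}."""
    # Build a symmetric adjacency structure.
    nbrs = {n: {} for n in nodes}
    for (u, v), w in edges.items():
        nbrs[u][v] = w
        nbrs[v][u] = w
    score = {n: seeds.get(n, 0.5) for n in nodes}
    for _ in range(iters):
        new = {}
        for n in nodes:
            if n in seeds:                  # clamp the training data
                new[n] = seeds[n]
            else:                           # weighted average of neighbours
                total = sum(nbrs[n].values())
                new[n] = sum(w * score[m] for m, w in nbrs[n].items()) / total
        score = new
    return score

# Toy graph: a positive and a negative seed fill, plus two candidates.
edges = {("seed+", "a"): 1.0, ("a", "b"): 1.0, ("seed-", "b"): 1.0}
seeds = {"seed+": 1.0, "seed-": 0.0}
scores = propagate(edges, seeds, ["seed+", "seed-", "a", "b"])
print(round(scores["a"], 2), round(scores["b"], 2))  # 0.67 0.33
```

The analysis in the text, that errors close to training data are a major concern, corresponds here to the fact that a candidate's score is dominated by whichever seed it sits nearest to.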

    Learning Instance Weights in Multi-Instance Learning

    Multi-instance (MI) learning is a variant of supervised machine learning in which each learning example contains a bag of instances instead of a single feature vector. MI learning has applications in areas such as drug activity prediction, fruit disease management, and image classification. This thesis investigates the case where each instance has a weight determining the level of influence it has on its bag's class label. This is a more general assumption than most existing approaches use, and thus is more widely applicable. The challenge is to accurately estimate these weights in order to make predictions at the bag level. An existing approach known as MILES is retroactively identified as an algorithm that uses instance weights for MI learning, and is evaluated using a variety of base learners on benchmark problems. New algorithms for learning instance weights in MI learning are also proposed and rigorously evaluated on both artificial and real-world datasets. The new algorithms are shown to achieve lower root mean squared error than existing approaches on artificial data generated according to the algorithms' underlying assumptions. Experimental results also demonstrate that the new algorithms are competitive with existing approaches on real-world problems.
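The weighted-instance assumption above can be made concrete with a toy sketch (the instance scorer, features, and weights are invented; this is not one of the thesis's algorithms): the bag-level prediction is the weight-normalised combination of per-instance predictions, so a highly weighted instance dominates the bag label.

```python
def bag_prediction(instances, weights, instance_score):
    """Weighted average of instance-level scores -> bag-level score."""
    total = sum(weights)
    return sum(w * instance_score(x) for x, w in zip(instances, weights)) / total

# Toy instance scorer: score is the first feature, clipped to [0, 1].
score = lambda x: min(max(x[0], 0.0), 1.0)

# One bag of three instances; the second instance carries most weight,
# so it largely determines the bag's prediction.
bag = [(0.1, 3.0), (0.9, 1.0), (0.2, 2.0)]
weights = [0.1, 0.8, 0.1]
print(round(bag_prediction(bag, weights, score), 2))  # 0.75
```

Estimating the `weights` vector from bag-level labels alone is the learning problem the thesis addresses.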

    Web Relation Extraction with Distant Supervision

    Being able to find relevant information about prominent entities quickly is the main reason to use a search engine. However, with large quantities of information on the World Wide Web, real-time search over billions of Web pages can waste resources and the end user's time. One solution is to store the answers to frequently asked general-knowledge queries, such as the albums released by a musical artist, in a more accessible format: a knowledge base. Knowledge bases can be created and maintained automatically by using information extraction methods, particularly methods that extract relations between proper names (named entities). A group of approaches that has become popular in recent years is distant supervision, which allows relation extractors to be trained without text-bound annotation by instead taking known relations from a knowledge base and heuristically aligning them with a large textual corpus from an appropriate domain. This thesis focuses on researching distant supervision for the Web domain. A new setting for creating training and testing data for distant supervision from the Web with entity-specific search queries is introduced, and the resulting corpus is published. Methods to recognise noisy training examples, as well as methods to combine extractions based on statistics derived from the background knowledge base, are researched. Using coreference resolution methods to extract relations from sentences which do not contain a direct mention of the subject of the relation is also investigated. One bottleneck for distant supervision on Web data is identified as named entity recognition and classification (NERC), since relation extraction methods rely on it to identify relation arguments. Typically, existing pre-trained tools are used, which fail in diverse genres with non-standard language, such as the Web genre. 
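The heuristic alignment at the heart of distant supervision can be sketched as follows (the KB triples and sentences are invented for illustration): every sentence that mentions both arguments of a known triple is taken as a noisy positive training example for that relation.

```python
def distant_align(kb_triples, sentences):
    """Yield (sentence, relation) pairs where both KB arguments occur."""
    examples = []
    for subj, rel, obj in kb_triples:
        for sent in sentences:
            # Naive string containment; real systems match entity mentions.
            if subj in sent and obj in sent:
                examples.append((sent, rel))
    return examples

kb = [("The Beatles", "album", "Abbey Road")]
sents = [
    "The Beatles recorded Abbey Road in 1969.",
    "Abbey Road is a street in London.",     # no subject mention: skipped
    "The Beatles formed in Liverpool.",      # no object mention: skipped
]
examples = distant_align(kb, sents)
for sent, rel in examples:
    print(rel, "<-", sent)
```

Note that a sentence can mention both entities without expressing the relation; such spurious matches are exactly the noisy training examples the thesis develops methods to recognise.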
The thesis explores what can cause NERC methods to fail in diverse genres and quantifies the different reasons for NERC failure. Finally, a novel method for NERC for relation extraction is proposed, based on the idea of jointly training the named entity classifier and the relation extractor with imitation learning to reduce the reliance on external NERC tools. This thesis improves the state of the art in distant supervision for knowledge base population, and sheds light on, and proposes solutions for, issues that arise when applying information extraction to domains not traditionally studied.

    AN APPROACH TO MACHINE DEVELOPMENT OF MUSICAL ONTOGENY

    This Thesis pursues three main objectives: (i) to use computational modelling to explore how music is perceived, cognitively processed and created by human beings; (ii) to explore interactive musical systems as a method to model and achieve the transmission of musical influence in artificial worlds and between humans and machines; and (iii) to experiment with artificial and alternative developmental musical routes in order to observe the evolution of musical styles. To achieve these objectives, this Thesis introduces a new paradigm for the design of computer interactive musical systems called the Ontomemetical Model of Music Evolution (OMME), which includes the fields of musical ontogenesis and memetics. OMME-based systems are designed to artificially explore the evolution of music centred on human perceptive and cognitive faculties. The potential of the OMME is illustrated with two interactive musical systems, the Rhythmic Meme Generator (RGeme) and the Interactive Musical Environments (iMe), which have been tested in a series of laboratory experiments and live performances. The introduction to the OMME is preceded by an extensive and critical overview of state-of-the-art computer models that explore musical creativity and interactivity, in addition to a systematic exposition of the major issues involved in the design and implementation of these systems. This Thesis also proposes innovative solutions for (i) the representation of musical streams based on perceptive features, (ii) music segmentation, (iii) a memory-based music model, (iv) the measure of distance between musical styles, and (v) an improvisation-based creative model.

    Neural information extraction from natural language text

    Natural language processing (NLP) deals with building computational techniques that allow computers to automatically analyze and meaningfully represent human language. With the exponential growth of data in this digital era, the advent of NLP-based systems has enabled us to easily access relevant information via a wide range of applications, such as web search engines, voice assistants, etc. To achieve this, decades of research have focused on techniques at the intersection of NLP and machine learning. In recent years, deep learning techniques have exploited the expressive power of Artificial Neural Networks (ANNs) and achieved state-of-the-art performance in a wide range of NLP tasks. One vital property of Deep Neural Networks (DNNs) is that they can automatically extract complex features from the input data, providing an alternative to the manual process of handcrafted feature engineering. Besides ANNs, Probabilistic Graphical Models (PGMs), a coupling of graph theory and probabilistic methods, can describe the causal structure between the random variables of a system and capture a principled notion of uncertainty. Given the characteristics of DNNs and PGMs, the two are advantageously combined to build powerful neural models that capture the underlying complexity of data. Traditional machine-learning-based NLP systems employed shallow computational methods (e.g., SVM or logistic regression) and relied on handcrafted features, a process that is time-consuming, complex and often incomplete. However, deep learning and neural network based methods have recently shown superior results on various NLP tasks, such as machine translation, text classification, named-entity recognition, relation extraction, textual similarity, etc. These neural models can automatically extract effective feature representations from training data. This dissertation focuses on two NLP tasks: relation extraction and topic modeling. 
The former aims at identifying semantic relationships between entities or nominals within a sentence or document. Successfully extracting these semantic relationships contributes greatly to building structured knowledge bases, useful in downstream NLP application areas such as web search, question answering, and recommendation engines. On the other hand, topic modeling aims at understanding the thematic structures underlying a collection of documents. Topic modeling is a popular text-mining tool for automatically analyzing a large collection of documents and understanding their topical semantics without actually reading them. In doing so, it generates word clusters (i.e., topics) and document representations, useful in document understanding and information retrieval, respectively. Essentially, the tasks of relation extraction and topic modeling are built upon the quality of representations learned from text. In this dissertation, we have developed task-specific neural models for learning representations, coupled with relation extraction and topic modeling tasks in the realms of supervised and unsupervised machine learning, respectively. More specifically, we make the following contributions in developing neural models for NLP tasks: 1. Neural Relation Extraction: Firstly, we have proposed a novel recurrent neural network based architecture for table-filling in order to jointly perform entity and relation extraction within sentences. Then, we have further extended our scope to extracting relationships between entities across sentence boundaries, and presented a novel dependency-based neural network architecture. These two contributions lie in the supervised paradigm of machine learning. Moreover, we have contributed to building a robust relation extractor under the constraint of scarce labeled data, proposing a novel weakly-supervised bootstrapping technique. 
Given these contributions, we have further explored the interpretability of recurrent neural networks to explain their predictions for the relation extraction task. 2. Neural Topic Modeling: Besides the supervised neural architectures, we have also developed unsupervised neural models to learn meaningful document representations within topic modeling frameworks. Firstly, we have proposed a novel dynamic topic model that captures topics over time. Next, we have contributed to building static topic models without temporal dependencies, presenting neural topic modeling architectures that also exploit external knowledge, i.e., word embeddings, to address data sparsity. Moreover, we have developed neural topic models that incorporate knowledge transfer using both word embeddings and latent topics from many sources. Finally, we have shown how to improve neural topic modeling by introducing language structure (e.g., word ordering, local syntactic and semantic information) that addresses the bag-of-words limitations of traditional topic models. The class of neural NLP models proposed in this section is based on techniques at the intersection of PGMs, deep learning and ANNs. Here, neural relation extraction employs neural networks to learn representations typically at the sentence level, without access to the broader document context, whereas topic models have access to statistical information across documents. Therefore, we advantageously combine the two complementary learning paradigms in a neural composite model, consisting of a neural topic model and a neural language model, which enables us to jointly learn thematic structures in a document collection via the topic model and word relations within a sentence via the language model. Overall, our research contributions in this dissertation extend NLP-based systems for relation extraction and topic modeling with state-of-the-art performance.
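The table-filling formulation for joint entity and relation extraction mentioned above can be illustrated with a hand-filled toy table (the sentence, tags, and relation label are invented; the dissertation fills such a table with a recurrent neural network rather than by hand): a square table over the sentence's tokens, where diagonal cells hold entity tags and off-diagonal cells hold the relation, if any, between the corresponding token pair.

```python
tokens = ["Marie", "Curie", "born", "in", "Warsaw"]
n = len(tokens)

# Empty table: "O" = no entity on the diagonal, no relation elsewhere.
table = [["O"] * n for _ in range(n)]

# Diagonal cells: BIO-style entity tags per token.
table[0][0], table[1][1] = "B-PER", "I-PER"
table[4][4] = "B-LOC"

# Off-diagonal cell: relation between the entity head tokens
# ("Curie", "Warsaw") in this toy labelling scheme.
table[1][4] = "born_in"

for row in table:
    print(row)
```

Filling entity and relation cells in one table is what lets a single model perform both tasks jointly, rather than piping an entity tagger's output into a separate relation classifier.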