    A hierarchical taxonomy for classifying hardness of inference tasks

    Exhibiting inferential capabilities is one of the major goals of many modern Natural Language Processing systems. However, although attempts have been made to define what textual inferences are, few seek to classify inference phenomena by difficulty. In this paper we propose a hierarchical taxonomy for inferences, relative to their hardness, with corpus annotation and system design and evaluation in mind. Indeed, a fine-grained assessment of the difficulty of a task allows us to design more appropriate systems and to evaluate them only on what they are designed to handle. Each of seven classes is described and provided with examples from different tasks such as question answering, textual entailment and coreference resolution. We then test the classes of our hierarchy on the specific task of question answering. Our annotation of the test data from the QA4MRE 2013 evaluation campaign reveals that it is possible to quantify the contrasts in types of difficulty on datasets of the same task.

    Automatic document classification of biological literature

    Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, a text-mining system for biological literature, which marks up full text according to a shallow ontology that includes terms of biological interest. This project investigates document classification in the context of biological literature, making use of the Textpresso markup of a corpus of Caenorhabditis elegans literature. Results: We present a two-step text categorization algorithm to classify a corpus of C. elegans papers. Our classification method first uses a support vector machine-trained classifier, followed by a novel, phrase-based clustering algorithm. This clustering step autonomously creates cluster labels that are descriptive and understandable by humans. This clustering engine performed better on a standard test set (Reuters 21578) than previously published results (F-value of 0.55 vs. 0.49), while producing cluster descriptions that appear more useful. A web interface allows researchers to quickly navigate through the hierarchy and look for documents that belong to a specific concept. Conclusions: We have demonstrated a simple method to classify biological documents that embodies an improvement over current methods. While the classification results are currently optimized for Caenorhabditis elegans papers by human-created rules, the classification engine can be adapted to different types of documents. We have demonstrated this by presenting a web interface that allows researchers to quickly navigate through the hierarchy and look for documents that belong to a specific concept.

    An approach to knowledge assessment in an Intelligent Tutoring System

    In this paper, we present an approach to student evaluation in a well-defined domain based on a semantic network. A similarity matrix based on the semantic memory structure of humans is used to build a semantic distance model in order to describe an assessment technique to evaluate the student's state of knowledge. Our aim is to facilitate a deeper conceptual understanding of domain principles. We are developing a new student model including an assessment module with the DistSem model. Workshop de Tecnología Informática Aplicada en Educación (WTIAE). Red de Universidades con Carreras en Informática (RedUNCI).

    Reasoning about river basins: WaWO+ revisited

    This manuscript version is made available under the CC-BY-NC-ND 4.0 license (http://creativecommons.org/licenses/by-nc-nd/4.0/). This paper characterizes part of an interdisciplinary research effort on Artificial Intelligence (AI) techniques and tools applied to Environmental Decision-Support Systems (EDSS). WaWO+, the ontology we present here, provides a set of concepts that are queried, advertised and used to support reasoning about, and the management of, urban water resources in complex scenarios such as a river basin. The goal of this research is to increase efficiency in data and knowledge interoperability and data integration among heterogeneous environmental data sources (e.g., software agents) using an explicit, machine-understandable ontology to facilitate urban water resources management within a river basin. Peer reviewed. Postprint (author's final draft).

    Automated ontology framework for service robots

    This paper presents an automated ontology framework for service robots. The framework is designed to automatically create an ontology and instances of concepts in a dynamic environment. Ontology learning from text is applied to build a concept hierarchy using WordNet, which provides rich semantic processing for physical objects. The Automated Ontology is composed of four modules: Concept Creation, Property Creation, Relationship Creation and Instance of Concept Creation. The automated ontology algorithm was implemented in order to create the concept hierarchy in the Robot Ontology. The Semantic Knowledge Acquisition represents knowledge of physical objects in dynamic environments. In simulation experiments, the list of object names and property names was identified. The result shows the concept hierarchy, which represents explicit terms and the semantic knowledge of physical objects for performing everyday manipulation tasks.

    Part grouping for efficient process planning

    A framework to provide automated part grouping has been investigated in order to overcome the limitations found in existing part grouping techniques. The work is targeted at: exploration of criteria for feature-based part grouping to make the process planning activity efficient; determination of the optimal number of part families in the part grouping process; development of an experimental hybrid process planning system (HYCAPP); investigation of the effects of improved part grouping on manufacturing cell design. The research work has explored the creation of a feature-based component data model and manufacturing system capability data model, and checked the limitations inherent in existing part grouping techniques, i.e. part grouping: around methods; based on part geometry; based on machining processes; and based on machines. [Continues.]

    Meta-Generalization for Multiparty Privacy Learning to Identify Anomaly Multimedia Traffic in Graynet

    Identifying anomaly multimedia traffic in cyberspace is a big challenge in distributed service systems, multiple generation networks and the future internet of everything. This letter explores meta-generalization for a multiparty privacy learning model in graynet to improve the performance of anomaly multimedia traffic identification. The multiparty privacy learning model in graynet is a globally shared model that is partitioned, distributed and trained by exchanging multiparty parameter updates while preserving private data. Meta-generalization refers to discovering the inherent attributes of a learning model to reduce its generalization error. In experiments, three meta-generalization principles are tested as follows. First, the generalization error of the multiparty privacy learning model in graynet is reduced by changing the dimension of the byte-level embedding. Following that, the error is reduced by adapting the depth for extracting packet-level features. Finally, the error is reduced by adjusting the size of the support set for preprocessing traffic-level data. Experimental results demonstrate that the proposal outperforms state-of-the-art learning models for identifying anomaly multimedia traffic.

    A Survey of Imbalanced Learning on Graphs: Problems, Techniques, and Future Directions

    Graphs represent interconnected structures prevalent in a myriad of real-world scenarios. Effective graph analytics, such as graph learning methods, enables users to gain profound insights from graph data, underpinning various tasks including node classification and link prediction. However, these methods often suffer from data imbalance, a common issue in graph data where certain segments possess abundant data while others are scarce, thereby leading to biased learning outcomes. This necessitates the emerging field of imbalanced learning on graphs, which aims to correct these data distribution skews for more accurate and representative learning outcomes. In this survey, we embark on a comprehensive review of the literature on imbalanced learning on graphs. We begin by providing a definitive understanding of the concept and related terminologies, establishing a strong foundational understanding for readers. Following this, we propose two comprehensive taxonomies: (1) the problem taxonomy, which describes the forms of imbalance we consider, the associated tasks, and potential solutions; (2) the technique taxonomy, which details key strategies for addressing these imbalances, and aids readers in their method selection process. Finally, we suggest prospective future directions for both problems and techniques within the sphere of imbalanced learning on graphs, fostering further innovation in this critical area. Comment: The collection of awesome literature on imbalanced learning on graphs: https://github.com/Xtra-Computing/Awesome-Literature-ILoG
