94 research outputs found

    AI explainability: a bridge between machine vision and natural language processing

    No full text
    Abstract This paper presents an appraisal review of explainable Artificial Intelligence research, with a focus on building a bridge between the image processing community and the natural language processing (NLP) community. The paper highlights the implicit link between the two disciplines, as exemplified by the emergence of automatic image annotation systems, visual question-answer systems, text-to-image generation and multimedia analytics. Next, the paper identifies a set of natural language processing fields where visual-based explainability can boost the local NLP task. These include sentiment analysis, automatic text summarization, system argumentation and topical analysis, among others, which are expected to fuel prominent future research in the field.

    Knowledge-based sentence semantic similarity: algebraical properties

    No full text
    Abstract Determining the extent to which two text snippets are semantically equivalent is a well-researched topic in natural language processing, information retrieval and text summarization. Sentence-to-sentence similarity scoring is extensively used in both generic and query-based summarization of documents as a significance or similarity indicator. Nevertheless, most of these applications use the semantic similarity measure only as a tool, without paying attention to the inherent properties of such tools, which ultimately restrict the scope and technical soundness of the underlying applications. This paper aims to help fill this gap. It investigates three popular WordNet hierarchical semantic similarity measures, namely path-length, Wu and Palmer, and Leacock and Chodorow, in terms of both algebraical and intuitive properties, highlighting their inherent limitations and theoretical constraints. We have especially examined properties related to the range and scope of the semantic similarity score, incremental monotonicity evolution, monotonicity with respect to the hyponymy/hypernymy relationship, as well as a set of interactive properties. The extension from word semantic similarity to sentence similarity has also been investigated using a pairwise canonical extension. Properties of the underlying sentence-to-sentence similarity are examined and scrutinized. Next, to overcome the inherent limitations of WordNet semantic similarity in accounting for the various part-of-speech word categories, a WordNet “All word-To-Noun conversion” that makes use of the Categorial Variation Database (CatVar) is put forward and evaluated on a publicly available dataset, with a comparison against some state-of-the-art methods. The findings demonstrate the feasibility of the proposal and open up new opportunities in information retrieval and natural language processing tasks.
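    The three hierarchical measures named in this abstract can be illustrated on a small hand-made taxonomy. The sketch below is not the paper's code: the taxonomy `PARENT` and all function names are hypothetical, and the formulas follow the commonly used definitions (path: 1 / (1 + edges); Wu and Palmer: 2 * depth(lcs) / (depth(a) + depth(b)) with the root at depth 1; Leacock and Chodorow: -log(nodes on path / (2 * maximum depth))).

```python
import math

# Hypothetical toy is-a taxonomy (child -> parent); not taken from WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "puppy": "dog",
}


def ancestors(node):
    """Return the chain from node up to the root, node first."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(node))


def lcs(a, b):
    """Lowest common subsumer: the deepest ancestor shared by a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("concepts do not share a root")


def path_edges(a, b):
    """Number of edges on the shortest is-a path between a and b."""
    common = lcs(a, b)
    return (depth(a) - depth(common)) + (depth(b) - depth(common))


def path_similarity(a, b):
    return 1.0 / (1 + path_edges(a, b))


def wu_palmer(a, b):
    return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b))


def leacock_chodorow(a, b):
    max_depth = max(depth(n) for n in PARENT)
    # Path length is counted in nodes (edges + 1), so identical concepts give 1.
    return -math.log((path_edges(a, b) + 1) / (2.0 * max_depth))
```

    On this toy taxonomy, path-length and Wu and Palmer scores stay within (0, 1], while the Leacock and Chodorow score of a concept with itself is log(2 * maximum depth), which exceeds 1 here. This is the kind of range property the abstract refers to.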

    Gene selection for cancer diagnosis via iterative graph clustering-based approach

    No full text
    Abstract The development of microarray devices has led to the accumulation of DNA microarray datasets. Through this technological advance, physicians are able to examine various aspects of gene expression for cancer diagnosis. As data accumulation rapidly increases, machine learning faces considerable challenges in classifying high-dimensional DNA microarray data. Gene selection is a popular and powerful approach for dealing with such high-dimensional cancer data. In this paper, a novel graph clustering-based gene selection approach is developed. The developed approach has two main objectives: relevance maximization and redundancy minimization of the selected genes. In each iteration of the method, one subgraph is extracted, and appropriate genes are then selected from among the genes in this cluster using a filter-based measure. The reported results on five cancer datasets indicate that the developed gene selection approach can improve the accuracy of cancer diagnosis.

    Self-supervised face presentation attack detection with dynamic grayscale snippets

    No full text
    Abstract Face presentation attack detection (PAD) plays an important role in defending face recognition systems against presentation attacks. The success of PAD largely relies on supervised learning, which requires a huge amount of labeled data; collecting it is especially challenging for videos and often requires expert knowledge. To avoid the costly collection of labeled data, this paper presents a novel method for self-supervised video representation learning via motion prediction. To achieve this, we exploit temporal consistency based on three RGB frames acquired at three different times in the video sequence. The obtained frames are then transformed into grayscale images, and each image is assigned to one of three channels, R (red), G (green), and B (blue), to form a dynamic grayscale snippet (DGS). Motivated by this, the labels are automatically generated to increase the temporal diversity based on DGS by using the different temporal lengths of the videos, which proves to be very helpful for the downstream task. Benefiting from the self-supervised nature of our method, we report results that outperform existing methods on four public benchmarks, namely Replay-Attack, MSU-MFSD, CASIA-FASD, and OULU-NPU. Explainability analysis has been carried out through LIME and Grad-CAM techniques to visualize the most important features used in the DGS.
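    The construction of a dynamic grayscale snippet described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: frames are plain nested lists of (r, g, b) tuples, the grayscale conversion uses the ITU-R BT.601 luma weights (the abstract does not say which conversion is used), and the names `to_gray`, `dynamic_grayscale_snippet`, `t0` and `gap` are hypothetical. The automatic label generation is not shown.

```python
def to_gray(frame):
    """Convert an H x W frame of (r, g, b) tuples to H x W luma values.

    Assumption: ITU-R BT.601 weights; the abstract does not specify them.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in frame
    ]


def dynamic_grayscale_snippet(video, t0, gap):
    """Stack grayscale versions of frames t0, t0 + gap, t0 + 2 * gap.

    The three grayscale images fill the R, G and B channels of one output
    image, so motion between the frames shows up as colour fringes.
    """
    grays = [to_gray(video[t0 + k * gap]) for k in range(3)]
    height, width = len(grays[0]), len(grays[0][0])
    return [
        [(grays[0][y][x], grays[1][y][x], grays[2][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical 5-frame video, 1 x 2 pixels, brightness rising over time.
VIDEO = [[[(10 * t, 10 * t, 10 * t)] * 2] for t in range(5)]
SNIPPET = dynamic_grayscale_snippet(VIDEO, t0=0, gap=2)
```

    In this toy video the brightness changes between the sampled frames, so the three channels of each output pixel differ; a static scene would instead produce a pixel with three equal channel values.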

    A novel attributed community detection by integration of feature weighting and node centrality

    No full text
    Abstract Community detection is one of the primary problems in social network analysis, and it is more challenging in attributed social networks. The purpose of community detection in attributed social networks is to discover communities that have not only homogeneous node properties but also cohesive structures. Although community detection has been extensively studied, attributed community detection in large social networks with a large number of attributes remains a vital challenge. To address this challenge, this paper develops a novel attributed community detection method that integrates feature weighting with node centrality techniques. The developed method includes two main phases: (1) Weight Matrix Calculation, (2) Label Propagation Algorithm-based Attributed Community Detection. The aim of the first phase is to calculate the weight between two linked nodes using structural and attribute similarities, while the second phase proposes an improved label propagation algorithm-based community detection method for the attributed social network. The purpose of the second phase is to detect different communities by employing the calculated weight matrix and node popularity. The performance of the proposed method is compared with several other state-of-the-art methods using benchmark real-world datasets. The results indicate that the developed method outperforms several other state-of-the-art methods and confirm its effectiveness for attributed community detection.
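    The two phases listed in this abstract can be sketched in a simplified form. Everything below is an assumption made for illustration rather than the paper's method: structural similarity is taken as the Jaccard index of closed neighbourhoods, attribute similarity as cosine similarity, the two are mixed with a hypothetical parameter `alpha`, and the label propagation step is the plain weighted variant without the node popularity term the abstract mentions.

```python
import math


def jaccard(a, b):
    """Jaccard index of two non-empty sets."""
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity of two attribute vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def edge_weights(adj, attrs, alpha=0.5):
    """Phase 1 (assumed form): weight each edge by mixed similarity."""
    weights = {}
    for u in adj:
        for v in adj[u]:
            structural = jaccard(adj[u] | {u}, adj[v] | {v})
            attribute = cosine(attrs[u], attrs[v])
            weights[(u, v)] = alpha * structural + (1 - alpha) * attribute
    return weights


def label_propagation(adj, weights, max_rounds=100):
    """Phase 2 (simplified): each node adopts the heaviest neighbour label."""
    labels = {n: n for n in adj}
    for _ in range(max_rounds):
        changed = False
        for n in sorted(adj):  # fixed order keeps the sketch deterministic
            scores = {}
            for m in adj[n]:
                scores[labels[m]] = scores.get(labels[m], 0.0) + weights[(n, m)]
            if not scores:
                continue
            # Ties are broken in favour of the smallest label.
            best = max(scores, key=lambda lab: (scores[lab], -lab))
            if best != labels[n]:
                labels[n] = best
                changed = True
        if not changed:
            break
    return labels


# Hypothetical network: two triangles joined by the bridge 2-3.
ADJ = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
ATTRS = {0: (1, 0), 1: (1, 0), 2: (1, 0), 3: (0, 1), 4: (0, 1), 5: (0, 1)}
WEIGHTS = edge_weights(ADJ, ATTRS)
LABELS = label_propagation(ADJ, WEIGHTS)
```

    In this example the bridge edge receives a low weight because its endpoints share few neighbours and have orthogonal attributes, so the two triangles end up with different labels.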

    A novel explainable COVID-19 diagnosis method by integration of feature selection with random forest

    No full text
    Abstract Several Artificial Intelligence-based models have been developed for COVID-19 disease diagnosis. In spite of the promise of artificial intelligence, very few models bridge the gap between traditional human-centered diagnosis and the potential future of machine-centered disease diagnosis. Under the concept of human-computer interaction design, this study proposes a new explainable artificial intelligence method that exploits graph analysis for feature visualization and optimization for the purpose of COVID-19 diagnosis from blood test samples. In the developed model, an explainable decision forest classifier is employed for COVID-19 classification based on routinely available patient blood test data. The approach enables the clinician to use the decision tree and feature visualization to guide the explainability and interpretability of the prediction model. By utilizing this novel feature selection phase, the proposed diagnosis model will not only improve diagnosis accuracy but also decrease the execution time.

    Exploration of quantitative factors affecting the popularity of users in an online community

    No full text
    Abstract Online communities for discussing different topics have been around since the dawn of the Internet. Whether an individual user is popular in the community is highly subjective and difficult to measure with a quantitative approach. In this paper we explore factors and usage patterns associated with receiving votes in the yearly Forum User of the Year vote on FutisForum2, a Finnish football-related online community. Several quantitative factors are identified in order to calculate correlations between each factor and the yearly voting results. These factors include yearly message network centrality figures, number of messages, user quote amounts, number of characters in messages, etc. Although message amounts clearly correlate with the voting results, the strongest correlation was observed when comparing eigenvector centrality to the voting results. Another main outcome of the project was the generation of a database of 11 million messages for further research.
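    The comparison of eigenvector centrality with voting results mentioned in this abstract can be sketched as below. The data is invented for illustration (the network `ADJ` and vote counts `VOTES` are hypothetical, not taken from FutisForum2), Pearson correlation is assumed because the abstract does not name the coefficient, and the centrality is computed by power iteration on A + I, a shifted form that has the same leading eigenvector as the adjacency matrix A and also converges on bipartite graphs.

```python
import math


def eigenvector_centrality(adj, iters=200):
    """Power iteration on A + I for an undirected graph given as adjacency sets."""
    nodes = sorted(adj)
    x = {n: 1.0 for n in nodes}
    for _ in range(iters):
        # One multiplication by (A + I): own score plus the neighbours' scores.
        nxt = {n: x[n] + sum(x[m] for m in adj[n]) for n in nodes}
        norm = math.sqrt(sum(v * v for v in nxt.values()))
        x = {n: v / norm for n, v in nxt.items()}
    return x


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical message network (who interacts with whom) and vote counts.
ADJ = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
VOTES = {"a": 10, "b": 5, "c": 5, "d": 1}

CENTRALITY = eigenvector_centrality(ADJ)
USERS = sorted(ADJ)
CORRELATION = pearson([CENTRALITY[u] for u in USERS], [VOTES[u] for u in USERS])
```

    The same `pearson` helper could be applied to any other per-user factor, such as message counts, to compare the strength of the correlations.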

    On web based sentence similarity for paraphrasing detection

    No full text
    Abstract Semantic similarity measures play vital roles in information retrieval, natural language processing and paraphrasing detection. With the growing number of plagiarism cases in both the commercial and research communities, designing efficient tools and approaches for paraphrasing detection becomes crucial. This paper contrasts a web-based approach, based on the analysis of search engine snippets, with a WordNet-based measure. Several refinements of the web-based approach will be investigated and compared. Evaluations of the approaches on the Microsoft paraphrasing dataset will be performed and discussed.