164,781 research outputs found

    Task-specific Word Identification from Short Texts Using a Convolutional Neural Network

    Full text link
    Task-specific word identification aims to choose the task-related words that best describe a short text. Existing approaches require well-defined seed words or lexical dictionaries (e.g., WordNet), which are often unavailable for many applications such as social discrimination detection and fake review detection. However, we often have a set of labeled short texts where each short text has a task-related class label, e.g., discriminatory or non-discriminatory, specified by users or learned by classification algorithms. In this paper, we focus on identifying task-specific words and phrases from short texts by exploiting their class labels rather than using seed words or lexical dictionaries. We consider the task-specific word and phrase identification as feature learning. We train a convolutional neural network over a set of labeled texts and use score vectors to localize the task-specific words and phrases. Experimental results on sentiment word identification show that our approach significantly outperforms existing methods. We further conduct two case studies to show the effectiveness of our approach. One case study on a crawled tweets dataset demonstrates that our approach can successfully capture the discrimination-related words/phrases. The other case study on fake review detection shows that our approach can identify the fake-review words/phrases.Comment: accepted by Intelligent Data Analysis, an International Journa

    A Recurrent Deep Neural Network Model to measure Sentence Complexity for the Italian Language

    Get PDF
    Text simplification (TS) is a natural language processing task devoted to the modification of a text in such a way that the grammar and structure of the phrases is greatly simplified, preserving the underlying meaning and information contents. In this paper we give a contribution to the TS field presenting a deep neural network model able to detect the complexity of italian sentences. In particular, the system gives a score to an input text that identifies the confidence level during the decision making process and that could be interpreted as a measure of the sentence complexity. Experiments have been carried out on one public corpus of Italian texts created specifically for the task of TS. We have also provided a comparison of our model with a state of the art method used for the same purpos

    Toward a multilevel representation of protein molecules: comparative approaches to the aggregation/folding propensity problem

    Full text link
    This paper builds upon the fundamental work of Niwa et al. [34], which provides the unique possibility to analyze the relative aggregation/folding propensity of the elements of the entire Escherichia coli (E. coli) proteome in a cell-free standardized microenvironment. The hardness of the problem comes from the superposition between the driving forces of intra- and inter-molecule interactions and it is mirrored by the evidences of shift from folding to aggregation phenotypes by single-point mutations [10]. Here we apply several state-of-the-art classification methods coming from the field of structural pattern recognition, with the aim to compare different representations of the same proteins gathered from the Niwa et al. data base; such representations include sequences and labeled (contact) graphs enriched with chemico-physical attributes. By this comparison, we are able to identify also some interesting general properties of proteins. Notably, (i) we suggest a threshold around 250 residues discriminating "easily foldable" from "hardly foldable" molecules consistent with other independent experiments, and (ii) we highlight the relevance of contact graph spectra for folding behavior discrimination and characterization of the E. coli solubility data. The soundness of the experimental results presented in this paper is proved by the statistically relevant relationships discovered among the chemico-physical description of proteins and the developed cost matrix of substitution used in the various discrimination systems.Comment: 17 pages, 3 figures, 46 reference

    Assessing the Quality of Democracy: A Practical Guide

    Get PDF

    A Neural Model for Self Organizing Feature Detectors and Classifiers in a Network Hierarchy

    Full text link
    Many models of early cortical processing have shown how local learning rules can produce efficient, sparse-distributed codes in which nodes have responses that are statistically independent and low probability. However, it is not known how to develop a useful hierarchical representation, containing sparse-distributed codes at each level of the hierarchy, that incorporates predictive feedback from the environment. We take a step in that direction by proposing a biologically plausible neural network model that develops receptive fields, and learns to make class predictions, with or without the help of environmental feedback. The model is a new type of predictive adaptive resonance theory network called Receptive Field ARTMAP, or RAM. RAM self organizes internal category nodes that are tuned to activity distributions in topographic input maps. Each receptive field is composed of multiple weight fields that are adapted via local, on-line learning, to form smooth receptive ftelds that reflect; the statistics of the activity distributions in the input maps. When RAM generates incorrect predictions, its vigilance is raised, amplifying subtractive inhibition and sharpening receptive fields until the error is corrected. Evaluation on several classification benchmarks shows that RAM outperforms a related (but neurally implausible) model called Gaussian ARTMAP, as well as several standard neural network and statistical classifters. A topographic version of RAM is proposed, which is capable of self organizing hierarchical representations. Topographic RAM is a model for receptive field development at any level of the cortical hierarchy, and provides explanations for a variety of perceptual learning data.Defense Advanced Research Projects Agency and Office of Naval Research (N00014-95-1-0409

    Deep Learning for the Radiographic Detection of periodontal Bone Loss

    Get PDF
    We applied deep convolutional neural networks (CNNs) to detect periodontal bone loss (PBL) on panoramic dental radiographs. We synthesized a set of 2001 image segments from panoramic radiographs. Our reference test was the measured % of PBL. A deep feed-forward CNN was trained and validated via 10-times repeated group shuffling. Model architectures and hyperparameters were tuned using grid search. The final model was a seven-layer deep neural network, parameterized by a total number of 4,299,651 weights. For comparison, six dentists assessed the image segments for PBL. Averaged over 10 validation folds the mean (SD) classification accuracy of the CNN was 0.81 (0.02). Mean (SD) sensitivity and specificity were 0.81 (0.04), 0.81 (0.05), respectively. The mean (SD) accuracy of the dentists was 0.76 (0.06), but the CNN was not statistically significant superior compared to the examiners (p = 0.067/t-test). Mean sensitivity and specificity of the dentists was 0.92 (0.02) and 0.63 (0.14), respectively. A CNN trained on a limited amount of radiographic image segments showed at least similar discrimination ability as dentists for assessing PBL on panoramic radiographs. Dentists’ diagnostic efforts when using radiographs may be reduced by applying machine-learning based technologies
    • …
    corecore