20 research outputs found

    MEASUREMENT OF SEMANTIC SIMILARITY BETWEEN WORDS: A SURVEY

    ABSTRACT Semantic similarity measures between words play an important role in community minin

    Monte Carlo Video Text Segmentation

    This paper presents a probabilistic algorithm for segmenting and recognizing text embedded in video sequences based on adaptive thresholding using a Bayes filtering method. The algorithm approximates the posterior distribution of segmentation thresholds of video text by a set of weighted samples. The set of samples is initialized by applying a classical segmentation algorithm on the first video frame and further refined by random sampling under a temporal Bayesian framework. This framework allows us to evaluate a text image segmentor on the basis of the recognition result instead of the visual segmentation result, which is directly relevant to our character recognition task. Results on a database of 6944 images demonstrate the validity of the algorithm.

    Semantic Sort: A Supervised Approach to Personalized Semantic Relatedness

    We propose and study a novel supervised approach to learning statistical semantic relatedness models from subjectively annotated training examples. The proposed semantic model consists of parameterized co-occurrence statistics associated with textual units of a large background knowledge corpus. We present an efficient algorithm for learning such semantic models from a training sample of relatedness preferences. Our method is corpus independent and can essentially rely on any sufficiently large (unstructured) collection of coherent texts. Moreover, the approach facilitates the fitting of semantic models for specific users or groups of users. We present the results of an extensive range of experiments from small to large scale, indicating that the proposed method is effective and competitive with the state of the art. Comment: 37 pages, 8 figures. A short version of this paper was already published at ECML/PKDD 201

    A service concept recommendation system for enhancing the dependability of semantic service matchmakers in the service ecosystem environment

    A Service Ecosystem is a biological view of the business and software environment, which is comprised of a Service Use Ecosystem and a Service Supply Ecosystem. Service matchmakers play an important role in ensuring the connectivity between the two ecosystems. Current matchmakers attempt to employ ontologies to disambiguate service consumers’ service queries by semantically classifying service entities and providing a series of human-computer interactions to service consumers. However, a lack of relevant service domain knowledge and incorrect service queries could prevent semantic service matchmakers from finding the service concepts that correctly represent service requests. To resolve this issue, we propose in this paper the framework of a service concept recommendation system, which is built upon a semantic similarity model. This system can be employed to seek the concepts that correctly represent service consumers’ requests when a semantic service matchmaker finds that the service concepts eventually retrieved cannot match the service requests. Whilst many semantic similarity models have been developed to date, most of them focus on distance-based measures for the semantic network environment and ignore content-based measures for the ontology environment. For the ontology environment, in which concepts are defined with sufficient datatype properties, object properties, restrictions, etc., the content of concepts should be regarded as an important factor in concept similarity measures. Hence, we present a novel semantic similarity model for the service ontology environment. The technical details and evaluation details of the framework are discussed in this paper.

    Investigating the document structure as a source of evidence for multimedia fragment retrieval

    Multimedia objects can be retrieved using their context, which can be, for instance, the text surrounding them in documents. This text may be either near or far from the searched objects. Our goal in this paper is to study the impact, in terms of effectiveness, of text position relative to searched objects. The multimedia objects we consider are described in structured documents such as XML ones. The document structure is therefore exploited to provide this text position in documents. Although structural information has been shown to be an effective source of evidence in textual information retrieval, only a few works have investigated its interest in multimedia retrieval. More precisely, the task we address in this paper is to retrieve multimedia fragments (i.e. XML elements having at least one multimedia object). Our general approach is built on two steps: we first retrieve XML elements containing multimedia objects, and we then explore the surrounding information to retrieve relevant multimedia fragments. In both cases, we study the impact of the surrounding information using the document structure. Our work is carried out on images, but it can be extended to any other media, since the physical content of multimedia objects is not used. We conducted several experiments in the context of the Multimedia track of the INEX evaluation campaign. Results showed that structural evidence is of high interest for tuning the importance of textual context for multimedia retrieval. Moreover, the proposed approach outperforms state-of-the-art approaches.

    A Localization/Verification Scheme for Finding Text in Images and Video Frames Based on Contrast Independent Features and Machine Learning Methods

    Automatic character detection in video sequences is a complex task, due to the variety of sizes and colors as well as to the complexity of the background. In this paper we address this problem by proposing a localization/verification scheme. Candidate text regions are first localized by using a fast algorithm with a very low rejection rate, which enables the character size normalization. Contrast independent features are then proposed for training machine learning tools in order to verify the text regions. Two kinds of machine learning tools, multilayer perceptrons and support vector machines, are compared based on four different features in the verification task. This scheme provides fast text detection in images and videos with a low computation cost, compared with traditional methods.