
    Just an Update on PMING Distance for Web-based Semantic Similarity in Artificial Intelligence and Data Mining

    One of the main problems in the classic approach to semantics is the difficulty of acquiring and maintaining ontologies and semantic annotations. On the other hand, the Internet explosion and the massive diffusion of mobile smart devices have led to the creation of a worldwide system whose information is checked and fueled daily by the contributions of millions of users who interact collaboratively. Search engines, continually exploring the Web, are a natural source of information on which to base a modern approach to semantic annotation. A promising idea is to generalize semantic similarity, under the assumption that semantically similar terms behave similarly, and to define collaborative proximity measures based on the indexing information returned by search engines. The PMING Distance is a proximity measure used in data mining and information retrieval whose collaborative information expresses the degree of relationship between two terms, using only the number of documents returned as the result of a query on a search engine. In this work, the PMING Distance is updated, providing a novel formal algebraic definition that corrects previous works. The new point of view highlights that PMING is a locally normalized linear combination of the Pointwise Mutual Information and the Normalized Google Distance. The analyzed measure dynamically reflects the collaborative changes made to web resources.
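    The abstract describes PMING only as a locally normalized linear combination of Pointwise Mutual Information (PMI) and Normalized Google Distance (NGD), computed from search-engine document counts. The sketch below illustrates that general idea and is not the paper's formal definition: the PMI and NGD formulas are the standard ones, while the function name `blended_distance`, the weight `rho`, the choice of normalizing by the maximum value within the set of pairs under comparison, and all the counts are assumptions made for illustration.

```python
import math


def pmi(fx, fy, fxy, n):
    """Pointwise Mutual Information from document counts (a similarity).

    fx, fy: documents containing each term; fxy: documents containing both;
    n: total number of indexed documents. All counts are assumed positive.
    """
    return math.log2((fxy * n) / (fx * fy))


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from document counts (a distance)."""
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def blended_distance(counts, pair_counts, n, rho=0.5):
    """Illustrative blend of PMI and NGD, normalized within one set of pairs.

    ASSUMPTION: both measures are rescaled by their largest value among the
    pairs being compared ("local" normalization), and PMI is turned into a
    distance as 1 - PMI / max. This is not the paper's exact formula.
    """
    pmis = {p: pmi(counts[p[0]], counts[p[1]], f, n) for p, f in pair_counts.items()}
    ngds = {p: ngd(counts[p[0]], counts[p[1]], f, n) for p, f in pair_counts.items()}
    mu_pmi = max(abs(v) for v in pmis.values()) or 1.0
    mu_ngd = max(ngds.values()) or 1.0
    return {
        p: rho * (1 - pmis[p] / mu_pmi) + (1 - rho) * (ngds[p] / mu_ngd)
        for p in pair_counts
    }


# Hypothetical hit counts; a real system would obtain them from a search engine.
total_docs = 1_000_000
counts = {"cat": 1000, "dog": 1000, "car": 1000}
pair_counts = {("cat", "dog"): 500, ("cat", "car"): 10}
distances = blended_distance(counts, pair_counts, total_docs)
```

    With these made-up counts, the pair that co-occurs more often ("cat", "dog") receives the smaller distance. Because the normalization constants depend on the set of pairs passed in, distances from different sets are not directly comparable.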

    Utilising semantic technologies for intelligent indexing and retrieval of digital images

    The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches because they rely, in principle, on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this paper we present a semantically-enabled image annotation and retrieval engine that is designed to satisfy the requirements of the commercial image collections market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matching the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries as well as the exploitation of lexical databases for explicit semantic-based query expansion.

    Soft behaviour modelling of user communities

    A soft modelling approach for describing behaviour in on-line user communities is introduced in this work. Behaviour models of individual users in dynamic virtual environments have been described in the literature in terms of timed transition automata; they have various drawbacks. Soft multi-agent behaviour automata are defined and proposed to describe multiple user behaviours and to recognise larger classes of user group histories, such as group histories which contain unexpected behaviours. The notion of deviation from the user community model allows defining a soft parsing process which assesses and evaluates the dynamic behaviour of a group of users interacting in virtual environments, such as e-learning and e-business platforms. The soft automaton model can describe virtually infinite sequences of actions due to multiple users and subject to temporal constraints. Soft measures assess a form of distance of observed behaviours by evaluating the amount of temporal deviation and the additional or omitted actions contained in an observed history, as well as actions performed by unexpected users. The proposed model allows the soft recognition of user group histories even when the observed actions only partially meet the given behaviour model constraints. Compared with standard Boolean model recognition, this approach is more realistic for real-time user community support systems when more than one user model is potentially available, and the extent of deviation from community behaviour models can be used as a guide to generate the system support by anticipation, projection and other known techniques. Experiments based on logs from an e-learning platform and plan compilation of the soft multi-agent behaviour automaton show the expressiveness of the proposed model.

    Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples

    Machine Learning has been a big success story during the AI resurgence. One particular standout success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques.
    Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). arXiv admin note: substantial text overlap with arXiv:1610.0770