16,025 research outputs found

    A text categorisation tool for open source communities based on semantic analysis

    Open source software (OSS) projects are supported by communities interacting through software repositories and mailing lists. Thousands of contributors participate in the development of the projects although they rarely meet each other. The result is a huge archived repository with thousands of questions, answers and contributions that is usually difficult to explore. We propose a tool based on semantic analysis that performs both automatic knowledge discovery and categorisation of the content of mailing list repositories. Semantic analysis is a practical method for extracting and inferring relations of words in passages of discourse, producing measures of relations among words or passages that are well correlated with semantic similarity. The objective of this article is two-fold: (1) to develop a text categorisation tool based on indexing terms and semantic annotation, and (2) to apply the developed tool to extract the main dimensions related to knowledge sharing activities in virtual communities. Debian Linux ports to embedded processors are used as a case study to accomplish the proposed double objective

    Social influence analysis in microblogging platforms - a topic-sensitive based approach

    The use of Social Media, particularly microblogging platforms such as Twitter, has proven to be an effective channel for promoting ideas to online audiences. In a world where information can bias public opinion it is essential to analyse the propagation and influence of information in large-scale networks. Recent research studying social media data to rank users by topical relevance has largely focused on the “retweet”, “following” and “mention” relations. In this paper we propose the use of semantic profiles for deriving influential users based on the retweet subgraph of the Twitter graph. We introduce a variation of the PageRank algorithm for analysing users’ topical and entity influence based on the topical/entity relevance of a retweet relation. Experimental results show that our approach outperforms related algorithms including HITS, InDegree and Topic-Sensitive PageRank. We also introduce VisInfluence, a visualisation platform for presenting top influential users based on a topical query need
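
The idea of ranking users over a retweet graph whose edges carry a topical relevance score can be illustrated with a generic weighted PageRank. The abstract does not give the authors' formulation, so this is only a minimal sketch under stated assumptions: the function name `topical_pagerank`, the edge direction (retweeter to original author), the positive relevance weights and the example users are all hypothetical.

```python
def topical_pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank over (retweeter, author, relevance) edges.

    Assumption: relevance is a positive number expressing how relevant
    the retweet is to the topic of interest. This is a generic weighted
    PageRank, not the algorithm described in the paper.
    """
    nodes = sorted({u for u, _, _ in edges} | {v for _, v, _ in edges})
    n = len(nodes)

    # Total outgoing relevance per user, used to normalise edge weights.
    out_weight = {node: 0.0 for node in nodes}
    for u, _, w in edges:
        out_weight[u] += w

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Users who never retweet have no outgoing edges; their rank
        # mass is redistributed uniformly so the ranks keep summing to 1.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for u, v, w in edges:
            # A retweeter passes rank to the author in proportion to
            # the topical relevance of the retweet.
            new_rank[v] += damping * rank[u] * w / out_weight[u]
        rank = new_rank
    return rank


# Hypothetical retweet edges: (retweeter, original author, relevance).
example_edges = [
    ("alice", "carol", 0.9),
    ("alice", "dave", 0.1),
    ("bob", "carol", 0.8),
    ("bob", "dave", 0.2),
]
ranks = topical_pagerank(example_edges)
top_user = max(ranks, key=ranks.get)
```

In this toy graph the user whose retweets are most relevant to the topic ends up with the highest score, which is the behaviour a relevance-weighted ranking is meant to capture.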

    Past, present and future of information and knowledge sharing in the construction industry: Towards semantic service-based e-construction

    The paper reviews product data technology initiatives in the construction sector and provides a synthesis of related ICT industry needs. A comparison between (a) the data-centric characteristics of Product Data Technology (PDT) and (b) ontology with a focus on semantics, is given, highlighting the pros and cons of each approach. The paper advocates the migration from data-centric application integration to ontology-based business process support, and proposes inter-enterprise collaboration architectures and frameworks based on semantic services, underpinned by ontology-based knowledge structures. The paper discusses the main reasons behind the low industry take-up of product data technology, and proposes a preliminary roadmap for the wide industry diffusion of the proposed approach. In this respect, the paper stresses the value of adopting alliance-based modes of operation

    The hunt for submarines in classical art: mappings between scientific invention and artistic interpretation

    This is a report to the AHRC's ICT in Arts and Humanities Research Programme. This report stems from a project which aimed to produce a series of mappings between advanced imaging information and communications technologies (ICT) and needs within visual arts research. A secondary aim was to demonstrate the feasibility of a structured approach to establishing such mappings. The project was carried out over 2006, from January to December, by the visual arts centre of the Arts and Humanities Data Service (AHDS Visual Arts). It was funded by the Arts and Humanities Research Council (AHRC) as one of the Strategy Projects run under the aegis of its ICT in Arts and Humanities Research programme. The programme, which runs from October 2003 until September 2008, aims ‘to develop, promote and monitor the AHRC’s ICT strategy, and to build capacity nation-wide in the use of ICT for arts and humanities research’. As part of this, the Strategy Projects were intended to contribute to the programme in two ways: knowledge-gathering projects would inform the programme’s Fundamental Strategic Review of ICT, conducted for the AHRC in the second half of 2006, focusing ‘on critical strategic issues such as e-science and peer-review of digital resources’. Resource-development projects would ‘build tools and resources of broad relevance across the range of the AHRC’s academic subject disciplines’. This project fell into the knowledge-gathering strand. The project ran under the leadership of Dr Mike Pringle, Director, AHDS Visual Arts, and the day-to-day management of Polly Christie, Projects Manager, AHDS Visual Arts. The research was carried out by Dr Rupert Shepherd

    Multi modal multi-semantic image retrieval

    PhD. The rapid growth in the volume of visual information, e.g. image and video, can overwhelm users’ ability to find and access the specific visual information of interest to them. In recent years, ontology knowledge-based (KB) image information retrieval techniques have been adopted to extract knowledge from these images and so enhance retrieval performance. A KB framework is presented to promote semi-automatic annotation and semantic image retrieval using multimodal cues (visual features and text captions). In addition, a hierarchical structure for the KB allows metadata to be shared that supports multi-semantics (polysemy) for concepts. The framework builds up an effective knowledge base pertaining to a domain-specific image collection, e.g. sports, and is able to disambiguate and assign high-level semantics to ‘unannotated’ images. Local feature analysis of visual content, namely using Scale Invariant Feature Transform (SIFT) descriptors, has been deployed in the ‘Bag of Visual Words’ model (BVW) as an effective method to represent visual content information and to enhance its classification and retrieval. Local features are more useful than global features, e.g. colour, shape or texture, as they are invariant to image scale, orientation and camera angle. An innovative approach is proposed for the representation, annotation and retrieval of visual content using a hybrid technique based upon the use of an unstructured visual word and upon a (structured) hierarchical ontology KB model. The structural model facilitates the disambiguation of unstructured visual words and a more effective classification of visual content, compared to a vector space model, through exploiting local conceptual structures and their relationships.
The key contributions of this framework in using local features for image representation include: first, a method to generate visual words using the semantic local adaptive clustering (SLAC) algorithm, which takes term weight and spatial locations of keypoints into account. Consequently, the semantic information is preserved. Second, a technique is used to detect the domain-specific ‘non-informative visual words’ which are ineffective at representing the content of visual data and degrade its categorisation ability. Third, a method to combine an ontology model with a visual word model to resolve synonymy (visual heterogeneity) and polysemy problems is proposed. The experimental results show that this approach can discover semantically meaningful visual content descriptions and recognise specific events, e.g. sports events, depicted in images efficiently. Since discovering the semantics of an image is an extremely challenging problem, one promising approach to enhance visual content interpretation is to use any associated textual information that accompanies an image as a cue to predict the meaning of the image, by transforming this textual information into a structured annotation for the image, e.g. using XML, RDF, OWL or MPEG-7. Although text and image are distinct types of information representation and modality, there are strong, invariant, implicit connections between images and any accompanying text information. Semantic analysis of image captions can be used by image retrieval systems to retrieve selected images more precisely. To do this, Natural Language Processing (NLP) is first used to extract concepts from image captions. Next, an ontology-based knowledge model is deployed in order to resolve natural language ambiguities. To deal with the accompanying text information, two methods to extract knowledge from textual information have been proposed.
First, metadata can be extracted automatically from text captions and restructured with respect to a semantic model. Second, the use of LSI in relation to a domain-specific ontology-based knowledge model enables the combined framework to tolerate ambiguities and variations (incompleteness) of metadata. The use of the ontology-based knowledge model allows the system to find indirectly relevant concepts in image captions and thus leverage these to represent the semantics of images at a higher level. Experimental results show that the proposed framework significantly enhances image retrieval and leads to narrowing of the semantic gap between lower-level machine-derived and higher-level human-understandable conceptualisation

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art related to multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective we take inventory of the impact and legal consequences of these technical advances and point out future directions of research

    Content analysis of open innovation communities using latent semantic indexing

    Open innovation (OI) represents an emergent paradigm by which customers and users are involved as part of the innovation processes of organisations. One of its most popular implementation schemes is OI communities, which have been popularised by the use of social software. Through these communities, users are free to post, share, comment and evaluate other users’ ideas, and they can interact with other users as well as with the innovation department and experts of the organisation. One of the challenges of OI communities is distinguishing the most innovative ideas, as they receive hundreds or even thousands of ideas. This paper proposes a novel approach for this task consisting of analysing the content of shared ideas. Through this analysis, several conclusions about the decision processes of the organisation can be inferred. The obtained results can help OI managers to improve idea evaluation processes. Ministerio de Economía y Competitividad ECO2013-43856-R. Junta de Andalucia. Consejería de Economía, Innovación, Ciencia y Empleo P12-SEJ-32

    Mapping an ancient historian in a digital age: the Herodotus Encoded Space-Text-Image Archive (HESTIA)

    HESTIA (the Herodotus Encoded Space-Text-Imaging Archive) employs the latest digital technology to develop an innovative methodology for the study of spatial data in Herodotus' Histories. Using a digital text of Herodotus, freely available from the Perseus on-line library, to capture all the place-names mentioned in the narrative, we construct a database to house that information and represent it in a series of mapping applications, such as GIS, GoogleEarth and GoogleMap Timeline. As a collaboration of academics from the disciplines of Classics, Geography, and Archaeological Computing, HESTIA has the twin aims of investigating the ways geography is represented in the Histories and of bringing Herodotus' world into people's homes

    TEI and LMF crosswalks

    The present paper explores various arguments in favour of making the Text Encoding Initiative (TEI) guidelines an appropriate serialisation for ISO standard 24613:2008 (LMF, Lexical Mark-up Framework). It also identifies the issues that would have to be resolved in order to reach an appropriate implementation of these ideas, in particular in terms of informational coverage. We show how the customisation facilities offered by the TEI guidelines can provide an adequate background, not only to cover missing components within the current Dictionary chapter of the TEI guidelines, but also to allow specific lexical projects to deal with local constraints. We expect this proposal to be a basis for a future ISO project in the context of the ongoing revision of LMF