9 research outputs found
Context-based multimedia semantics modelling and representation
The evolution of the World Wide Web, increases in processing power, and greater network bandwidth have contributed to the proliferation of digital multimedia data. Since multimedia data has become a critical resource in many organisations, there is an increasing need to gain efficient access to data in order to share it, extract knowledge, and ultimately use that knowledge to inform business decisions. Existing methods for multimedia semantic understanding are limited to computable low-level features, which raises the question of how to identify and represent the high-level semantic knowledge in multimedia resources. In order to bridge the semantic gap between multimedia low-level features and high-level human perception, this thesis seeks to identify the possible contextual dimensions in multimedia resources to help in semantic understanding and organisation. This thesis investigates the use of contextual knowledge to organise and represent the semantics of multimedia data, aimed at efficient and effective multimedia content-based semantic retrieval. A mixed methods research approach incorporating both Design Science Research and Formal Methods for investigation and evaluation was adopted. A critical review of current approaches for multimedia semantic retrieval was undertaken and various shortcomings were identified. The objectives for a solution were defined, which led to the design, development, and formalisation of a context-based model for multimedia semantic understanding and organisation. The model relies on the identification of different contextual dimensions in multimedia resources to aggregate meaning and facilitate semantic representation, knowledge sharing and reuse.
A prototype system for multimedia annotation, CONMAN, was built to demonstrate aspects of the model and validate the research hypothesis, H₁. Towards providing richer and clearer semantic representation of multimedia content, the original contributions of this thesis to Information Science include: (a) a novel framework and formalised model for organising and representing the semantics of heterogeneous visual data; and (b) a novel S-Space model that is aimed at visual information semantic organisation and discovery, and forms the foundations for automatic video semantic understanding.
Moving towards the semantic web: enabling new technologies through the semantic annotation of social contents.
Social Web technologies have caused an exponential growth of the documents available through the Web, making enormous amounts of textual electronic resources available. Users may be overwhelmed by such an amount of content and, therefore, the automatic analysis and exploitation of all this information is of interest to the data mining community. Data mining algorithms exploit features of the entities in order to characterise, group or classify them according to their resemblance. Data by itself does not carry any meaning; it needs to be interpreted to convey information. Classical data analysis methods did not aim to “understand” the content: the data were treated as meaningless numbers, and statistics were calculated on them to build models that were interpreted manually by human domain experts. Nowadays, motivated by the Semantic Web, many researchers have proposed semantically grounded data classification and clustering methods that are able to exploit textual data at a conceptual level. However, they usually rely on pre-annotated inputs to be able to semantically interpret textual data such as the content of Web pages. The usability of all these methods depends on the linkage between data and its meaning.
This work focuses on the development of a general methodology able to detect the most relevant features of a particular textual resource, finding out their semantics (associating them with concepts modelled in ontologies) and detecting its main topics. The proposed methods are unsupervised (avoiding the manual annotation bottleneck), domain-independent (applicable to any area of knowledge) and flexible (being able to deal with heterogeneous resources: raw text documents, semi-structured user-generated documents such as Wikipedia articles, or short and noisy tweets). The methods have been evaluated in different fields (Tourism, Oncology).
This work is a first step towards the automatic semantic annotation of documents, needed to pave the way towards the Semantic Web vision.
Enhancing student learning journeys with semantically annotated content
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. There is an increasing interest in developing existing Special Educational Needs (SEN) teaching methods due to recent concerns regarding the number of SEN pupils in schools. Communication is difficult for students when they have little or no clear speech. Consequently, a range of communication systems are used as an alternative to speech, including symbols, pictures or gestures. Importantly, helping students to better communicate also improves their education, friendships and independence. However, it is acknowledged that creating these educational resources is time-consuming and expensive, and the learning results are not recognised as being as effective as required. Semantic Web technology has had an impact in the educational field and offers the required linkages for more engagement with Web content. There is, however, a considerable gap in Semantic Web research between the contributions in the mainstream educational field and research undertaken into SEN students. This thesis presents an augmented World Wide Web (WWW) vision utilising annotation to more effectively support diverse SEN students. Students are supported in part by a SEN Teaching Platform (SENTP), one artefact from this design science research. Poetry is used as the website teaching material because of its significant impact on special needs students, as it is a difficult topic to understand. The first stage of the research is to select the appropriate tools for testing annotation techniques in a real SEN environment. Later, a design of the proposed SEN teaching platform is built based on a Semantic Web annotation tool (Amaya) coordinated with a web application. The design is evaluated by conducting a pilot study in schools caring for SEN students.
Evaluations were carried out at two schools in the UK, interviewing nine participants (teachers and teaching assistants). SENTP is tested to determine whether Semantic Web technology, through Semantic Web annotation tools, benefits the education of SEN students. This research further improves the SENTP with additional support for cognitive load using specific annotation formats within the Amaya annotation tool. Field testing is carried out at six UK schools, with twenty-two participants being interviewed. Cognitive load principles are shown to improve both learning and class behaviour, while also supporting teachers in the production of educational content. The pilot study and field testing results reveal that the proposed approach is effective. Following this, the designed artefacts are synthesised within a wider design blueprint that articulates how this new world of annotated digital media is designed, deployed and consumed. Finally, a SENTP ontology is created using the OWL language and Protégé 5. The main goal of this ontology is to produce a wider design SENTP ontology that can be adapted to wider teaching purposes.