67 research outputs found

    Creation and extension of ontologies for describing communications in the context of organizations

    Thesis submitted to Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa, in partial fulfillment of the requirements for the degree of Master in Computer Science. The use of ontologies is nowadays a sufficiently mature and solid field of work to be considered an efficient alternative in knowledge representation. With the continued growth of the Semantic Web, this alternative can be expected to become even more prominent in the near future. In the context of a collaboration established between FCT-UNL and the R&D department of a national software company, a new solution entitled ECC (Enterprise Communications Center) was developed. This application manages the communications that enter, leave or are made within an organization, and includes intelligent classification of communications and conceptual search techniques in a communications repository. As specificity may be the key to obtaining acceptable results with these processes, the use of ontologies becomes crucial to represent the existing knowledge about the specific domain of an organization. This work produced a core set of ontologies capable of expressing the general context of the communications made in an organization, together with a methodology based on a series of concrete steps that makes it possible to extend the ontologies to any business domain. Applying these steps minimizes the conceptualization and setup effort in new organizations and business domains. The adequacy of the chosen core set of ontologies and of the specified methodology is demonstrated in this thesis by its application to a real case study, which allowed us to work with the different types of sources considered in the methodology and the activities that support its construction and evolution.

    Applications of Natural Language Processing in Biodiversity Science

    Centuries of biological knowledge are contained in the massive body of scientific literature, written for human readability but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none has been applied life-wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript briefly discusses the key steps in applying information extraction tools to enhance biodiversity science.

    Community-driven & Work-integrated Creation, Use and Evolution of Ontological Knowledge Structures


    Sentiment Classification of Online Customer Reviews and Blogs Using Sentence-level Lexical Based Semantic Orientation Method

    Sentiment analysis is the process of extracting knowledge from people's opinions, appraisals and emotions toward entities, events and their attributes. These opinions strongly influence customers' choices regarding online shopping, events, products and entities. With the rapid growth of online resources, a vast amount of new data in the form of customer reviews and opinions is being generated continuously. Hence, sentiment analysis methods are desirable for efficient and effective analysis and classification of customer reviews, blogs and comments. The main motivation for this thesis is to develop a high-performance, domain-independent sentiment classification method. This study focuses on sentiment analysis at the sentence level using a lexical-based method for different types of data such as reviews and blogs. The proposed method is based on general lexicons, i.e. WordNet, SentiWordNet and user-defined lexical dictionaries, for sentiment orientation. The relations and glosses of these dictionaries provide a solution to the domain portability problem. The experiments are performed on various data sets such as customer reviews and blog comments. The results show that the proposed method with sentence contextual information is effective for sentiment classification. The proposed method performs better than word-level and text-level corpus-based machine learning methods for semantic orientation. The results highlight that the proposed method achieves an average accuracy of 86% at sentence level and 97% at feedback level for customer reviews. Similarly, it achieves an average accuracy of 83% at sentence level and 86% at feedback level for blog comments.
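The abstract above describes scoring sentences with a lexicon and then judging the whole feedback item. The following is only a minimal sketch of that general idea, not the thesis's method: the tiny `LEXICON`, the `NEGATORS` set, the negation rule and the sum-based aggregation are all assumptions made for illustration, and a real system would draw scores from resources such as SentiWordNet.

```python
import re

# Toy polarity lexicon (hypothetical values, standing in for a real resource).
LEXICON = {"good": 1.0, "great": 1.0, "excellent": 1.0, "love": 1.0,
           "bad": -1.0, "poor": -1.0, "terrible": -1.0, "slow": -0.5}
# Assumed negation cues; a negator flips the next scored word.
NEGATORS = {"not", "no", "never"}


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def sentence_score(sentence):
    """Sum lexicon scores; a negator inverts the sign of the next scored word."""
    score, negate = 0.0, False
    for token in re.findall(r"[a-z']+", sentence.lower()):
        if token in NEGATORS:
            negate = True
            continue
        value = LEXICON.get(token, 0.0)
        if value != 0.0:
            score += -value if negate else value
            negate = False
    return score


def label(score):
    """Map a numeric score to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_feedback(text):
    """Return per-sentence labels and a feedback-level label from the summed scores."""
    scores = [sentence_score(s) for s in split_sentences(text)]
    return [label(s) for s in scores], label(sum(scores))
```

For example, `classify_feedback("The screen is great. The battery is not good. I love it!")` labels the three sentences positive, negative and positive, and the feedback as a whole positive.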

    Towards Designing and Generating User Interfaces by Using Expert Knowledge

    The research reported in this PhD dissertation is conducted through the design science methodology, which focuses on creating and evaluating artifacts. In this thesis, the main artifact is a novel approach to designing and generating user interfaces using expert knowledge. To enable the use of expert knowledge, the proposed approach reuses design patterns that incorporate expert knowledge of interface design and provide reusable solutions to various design problems. The main goal of the proposed approach is to address the use of design patterns in order to ensure that expert knowledge is integrated into the design and generation of user interfaces for mobile and Web applications. The specific contributions of this thesis are summarized below. The first contribution is the AUIDP framework, which is defined to support the design and generation of adaptive interfaces for Web and mobile applications using HCI design patterns. The proposed framework spans design time and run time. At design time, models of design patterns, along with the user interface and user profile, are defined following a specific development methodology. At run time, the created models are used to allow the selection of HCI design patterns and to enable the generation of user interfaces from the design solutions provided by the relevant design patterns. The second contribution is a specification method to establish an ontology model that turns the traditional text-based representation into a formal HCI design pattern representation. This method adopts the Neon methodology to achieve the transition from informal to formal representations. The created ontology model is named MIDEP, a modular ontology that captures knowledge about design patterns as well as the user interface and the user's profile. The third contribution is IDEPAR, the first system within the global AUIDP framework. This system aims to automatically recommend the most relevant design patterns for a given design problem. It is based on a hybrid approach that combines text-based and ontology-based recommendation techniques to produce design pattern recommendations that provide appropriate design solutions. The fourth contribution is an interface generator system called ICGDEP, which is proposed to automatically generate the user interface source code for Web and mobile applications. ICGDEP is the second system within the global AUIDP framework and relies on the HCI design patterns recommended by the IDEPAR system. Its main aim is to automatically generate the user interface source code from the design solutions provided by design patterns. To achieve this, the ICGDEP system is based on a generation method that produces user interface source code for the target application. The contributions of this thesis have been validated from different perspectives. First, the developed MIDEP ontology is evaluated using competency questions and technology-based and application-based evaluation approaches. Second, the IDEPAR system is evaluated through an expert-based gold standard and a user-centric evaluation study. Then, the ICGDEP system is evaluated in terms of being effectively used by developers, considering the productivity factor. Finally, the global AUIDP framework is evaluated through case studies and usability studies.
    Braham, A. (2022). Towards Designing and Generating User Interfaces by Using Expert Knowledge [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/19092

    A semantic-driven framework for IT support of clinical laboratory standards

    The clinical laboratory plays a critical role in the delivery of care within the healthcare system by providing services that support accurate and timely diagnosis of diseases. The clinical laboratory relies on standard operating procedures (SOPs) to provide information and guidance on the laboratory procedures. To ensure an excellent standard of clinical laboratory services, SOPs need to be of high quality, and practitioners need to have easy access to information contained within the SOPs. However, we argue in this thesis that there is a lack of standardization within clinical laboratory SOPs, and machines and human practitioners have difficulties accessing or using the content of SOPs. This thesis proposes a solution to challenges regarding the representation and use of SOPs in clinical laboratories (see Chapter 1). The research work in this thesis is based on the most up-to-date technological, theoretical, and empirical approaches (see Chapter 2). Additionally, external researchers have already utilized the outcome of this research for various purposes (see Chapter 5). In this thesis, we present the SmartSOP framework, a semantic-driven framework that supports the representation of clinical laboratory procedure concepts in a standardised format for use within software applications. The SmartSOP framework consists of three main components: the Ontology for Clinical Laboratory SOP (OCL-SOP), the translation engine that converts free-text SOPs to a standardised format, and a mobile application to provide lab practitioners with easy access to SOPs (see Chapters 3 and 4). We used the design science approach for the execution of this research work.

    SENTIMENT CLASSIFICATION OF ONLINE CUSTOMER REVIEWS AND BLOGS USING SENTENCE-LEVEL LEXICAL BASED SEMANTIC ORIENTATION METHOD

    Sentiment analysis is the process of extracting knowledge from people's opinions, appraisals and emotions toward entities, events and their attributes. These opinions strongly influence customers' choices regarding online shopping, events, products and entities. With the rapid growth of online resources, a vast amount of new data in the form of customer reviews and opinions is being generated continuously. Hence, sentiment analysis methods are desirable for efficient and effective analysis and classification of customer reviews, blogs and comments. The main motivation for this thesis is to develop a high-performance, domain-independent sentiment classification method. This study focuses on sentiment analysis at the sentence level using a lexical-based method for different types of data such as reviews and blogs. The proposed method is based on general lexicons, i.e. WordNet, SentiWordNet and user-defined lexical dictionaries, for sentiment orientation. The relations and glosses of these dictionaries provide a solution to the domain portability problem. The experiments are performed on various datasets such as customer reviews and blog comments. The results show that the proposed method with sentence contextual information is effective for sentiment classification. The proposed method performs better than word-level and text-level corpus-based machine learning methods for semantic orientation. The results highlight that the proposed method achieves an average accuracy of 86% at sentence level and 97% at feedback level for customer reviews. Similarly, it achieves an average accuracy of 83% at sentence level and 86% at feedback level for blog comments.

    Knowledge Organization and Terminology: application to Cork

    This PhD thesis aims to prove the relevance of texts within the conceptual strand of terminological work. Our methodology serves to demonstrate how linguists can infer knowledge from texts and subsequently systematise it, through either semi-formal or formal representations. We mainly focus on the terminological analysis of specialised corpora, resorting to semi-automatic tools for text analysis to systematise lexical-semantic relationships observed in specialised discourse contexts and to subsequently model the underlying conceptual system. The ultimate goal of this methodology is to propose a typology that can help lexicographers to write definitions. Based on the double dimension of Terminology, we hypothesise that text and logic modelling do not go hand in hand, since the latter does not directly relate to the former. We highlight that knowledge and language are crucial for knowledge systematisation, while keeping in mind that they pertain to different levels of analysis, for they are not isomorphic. To meet our goals, we resorted to specialised texts produced within the cork industry. These texts provide us with a test bed of knowledge-rich data which enables us to demonstrate our deductive mechanisms, employing the Aristotelian formula X = Y + DC, through the linguistic and conceptual analysis of the semi-automatically extracted textual data. To explore the corpus, we resorted to text mining strategies in which regular expressions play a central role. The final goal of this study is to create a terminological resource for the cork industry, where two types of resources interlink, namely the CorkCorpus and the OntoCork. TermCork is a project that stems from the organisation of knowledge in the specialised field of cork. For that purpose, a terminological knowledge database is being developed to feed an e-dictionary. This e-dictionary is designed as a multilingual and multimodal product, where several resources, namely linguistic and conceptual ones, are paired. OntoCork is a micro domain ontology in which the concepts are enriched with natural language definitions and complemented with images, either annotated with meta-information or enriched with hyperlinks to additional information, such as a lexicographic resource. This type of e-dictionary embodies what we consider a useful terminological tool in the current digital information society: it accounts for the main features of such a tool and uses an electronic format that can be integrated into the Semantic Web thanks to its interoperable data format. This aspect emphasises its contribution to reducing ambiguity as much as possible and to increasing effective communication between domain experts, future experts, and language professionals.
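The abstract above mentions regular expressions and the formula X = Y + DC as the basis for analysing definitions. The sketch below shows one way such a pattern could be matched, under loud assumptions: it reads X as the defined term, Y as the superordinate concept (genus) and DC as the distinguishing characteristics, and the `DEFINITION` pattern, the function name and the example sentence are all invented for illustration rather than taken from the thesis.

```python
import re

# Hypothetical surface pattern: "<X> is a/an/the <Y> that/which <DC>."
DEFINITION = re.compile(
    r"^(?P<term>[A-Z][\w\s-]*?)\s+is\s+(?:a|an|the)\s+"
    r"(?P<genus>[\w\s-]+?)\s+(?:that|which)\s+"
    r"(?P<differentia>.+?)\.?$"
)


def parse_definition(sentence):
    """Split a definitional sentence into term (X), genus (Y) and differentia (DC).

    Returns None when the sentence does not follow the assumed pattern.
    """
    match = DEFINITION.match(sentence.strip())
    if match is None:
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}
```

Given the invented sentence "Agglomerated cork is a material that is made from cork granules.", the function returns the term "Agglomerated cork", the genus "material" and the differentia "is made from cork granules". Real corpus sentences vary far more than this single pattern allows, so a practical extractor would need a larger set of patterns and manual validation.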