6,154 research outputs found

    Review essay: new writings on love, sex and kisses


    Knowledge-Driven Implicit Information Extraction

    Natural language is a powerful tool developed by humans over hundreds of thousands of years. Extensive usage, the flexibility of the language, human creativity, and social, cultural, and economic changes in daily life have added new constructs, styles, and features to the language. One such feature is the ability to express ideas, opinions, and facts implicitly. This feature is used extensively in day-to-day communication, in situations such as: 1) expressing sarcasm, 2) trying to recall forgotten things, 3) conveying descriptive information, 4) emphasizing the features of an entity, and 5) communicating a common understanding. Consider the tweet "New Sandra Bullock astronaut lost in space movie looks absolutely terrifying" and the text snippet extracted from a clinical narrative, "He is suffering from nausea and severe headaches. Dolasteron was prescribed." The tweet has an implicit mention of the entity Gravity, and the clinical text snippet has an implicit mention of the relationship between the medication Dolasteron and the clinical condition nausea. Such implicit references to entities and relationships are common in daily communication, and they add value to conversations. However, extracting implicit constructs has not received enough attention in the information extraction literature. This dissertation focuses on extracting implicit entities and relationships from clinical narratives and extracting implicit entities from tweets. When people use implicit constructs in their daily communication, they assume that they share knowledge with the audience about the subject being discussed. This shared knowledge helps to decode implicitly conveyed information. For example, the Twitter user above assumed that his or her audience knows that the actress Sandra Bullock starred in the movie Gravity and that it is a movie about space exploration. 
The clinical professional who wrote the clinical narrative above assumed that the reader knows that Dolasteron is an anti-nausea drug. The audience without such domain knowledge may not have correctly decoded the information conveyed in the above examples. This dissertation demonstrates manifestations of implicit constructs in text, studies their characteristics, and develops a software solution that is capable of extracting implicit information from text. The developed solution starts by acquiring relevant knowledge to solve the implicit information extraction problem. The relevant knowledge includes domain knowledge, contextual knowledge, and linguistic knowledge. The acquired knowledge can take different syntactic forms such as a text snippet, structured knowledge represented in standard knowledge representation languages such as the Resource Description Framework (RDF) or other custom formats. Hence, the acquired knowledge is pre-processed to create models that can be processed by machines. Such models provide the infrastructure to perform implicit information extraction. This dissertation focuses on three different use cases of implicit information and demonstrates the applicability of the developed solution in these use cases. They are: 1) implicit entity linking in clinical narratives, 2) implicit entity linking in Twitter, and 3) implicit relationship extraction from clinical narratives. The evaluations are conducted on relevant annotated datasets for implicit information and they demonstrate the effectiveness of the developed solution in extracting implicit information from text
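The idea of decoding an implicit mention with shared knowledge can be illustrated with a toy sketch. This is not the dissertation's method: the knowledge base below, its entries, and the overlap-based scoring are assumptions made purely for illustration.

```python
import re

# Hypothetical toy knowledge base: entity -> terms associated with it.
# These entries are illustrative assumptions, not data from the dissertation.
KNOWLEDGE = {
    "Gravity": {"sandra", "bullock", "astronaut", "space", "movie"},
    "Speed": {"sandra", "bullock", "keanu", "reeves", "bus", "movie"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def link_implicit_entity(text, knowledge=KNOWLEDGE):
    """Return (entity, score) for the entity whose terms overlap the text most."""
    tokens = tokenize(text)
    scores = {entity: len(terms & tokens) for entity, terms in knowledge.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


tweet = "New Sandra Bullock astronaut lost in space movie looks absolutely terrifying"
entity, score = link_implicit_entity(tweet)
print(entity, score)
```

The tweet never names the film, yet the overlap with the background terms singles out one candidate. A realistic system would need far richer domain, contextual, and linguistic knowledge than a bag of terms.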

    INNOVATION AND KNOWLEDGE TRANSFER MECHANISMS IN AN “ENGAGED” UNIVERSITY. THE CASE OF THE “FEDERICO II" SAN GIOVANNI HUB (SGH)

    What happens when a former industrial area (disused for nearly 20 years) is replaced by a knowledge-intensive Hub hosting a university campus, research centres and laboratories, firms, and a hybrid form of advanced education programmes in partnership with global-scale companies? The present research aims to define the scope of this emerging phenomenon, which is occurring in a peripheral suburb in the east of the city of Naples (Italy) and is characterised by the settlement of a knowledge-intensive Hub involving innovation, technology and knowledge transfer processes. The main subject of the study is the San Giovanni a Teduccio “Federico II” University Hub, a university campus and research centre herein named the San Giovanni Hub (“SGH”) or simply the “Hub”.

    Text mining for social sciences: new approaches

    The rise of the Internet has brought an important change in the way we look at the world, and therefore in the way we measure it. In June 2018, more than 55% of the world’s population had Internet access. It follows that every day we are able to quantify what more than four billion people do, and how and when they do it. This means data. The availability of all these data raises more than one question: How to manage them? How to treat them? How to extract information from them? Now, more than ever before, we need to think about new rules, new methods and new procedures for handling this huge amount of data, which are unstructured, raw and messy. One of the most interesting challenges in this field concerns the implementation of processes for deriving information from textual sources, a process also known as Text Mining. Born in the mid-90s, Text Mining is a prolific field that has evolved, thanks to advances in technology, from Automatic Text Analysis, a set of methods for the description and analysis of documents. Textual data, even when transformed into a structured format, present several critical issues, as they are characterized by high dimensionality and noise. Moreover, online texts, such as social media posts or blog comments, are most of the time very short, and this makes the matrices sparser when the data are encoded. All these issues call for new and advanced solutions for treating Web data that are able to overcome these problems and, at the same time, return the information contained in these texts. The objective is to propose a fast and scalable method that can deal with the characteristics of online texts, and therefore with big and sparse matrices. To do that, we propose a procedure that runs from the collection of texts to the interpretation of the results. 
The innovative parts of this procedure are the choice of the weighting scheme for the term-document matrix and the co-clustering approach for data classification. To verify the validity of the procedure, we test it on two real applications: one concerning safety and health at work and another concerning the Brexit vote. It will be shown how the technique works on different types of texts, allowing us to obtain meaningful results. For the reasons described above, in this research work we implement and test on real datasets a new procedure for content analysis of textual data, using a two-way approach in the Text Clustering field. As will be shown in the following pages, Text Clustering is a process of unsupervised classification that reproduces the internal structure of the data by dividing the texts into different groups on the basis of lexical similarities. Text Clustering is mostly used for content analysis, and it can be applied to the classification of words, documents or both. In the latter case we refer to two-way clustering, which is the specific approach we implemented in this research work for the treatment of the texts. To better organize the research work, we divided it into two parts: a first part of theory and a second one of application. The first part contains a preliminary chapter reviewing the literature on Automatic Text Analysis in the context of the data revolution, and a second chapter where the new procedure for text co-clustering is proposed. The second part concerns the application of the proposed techniques to two different sets of texts, one composed of news and another composed of tweets. The idea is to test the same procedure on different types of texts, in order to verify the validity and robustness of the method.
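The abstract does not say which weighting scheme the authors chose. As a hedged illustration of what weighting a term-document matrix involves, and of why short online texts give sparse matrices, the sketch below uses plain TF-IDF as a stand-in. The function name, the whitespace tokenizer, and the three sample documents are assumptions.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Build a TF-IDF weighted term-document matrix as {term: [weight per doc]}.

    TF-IDF is used here only as a common stand-in; the weighting scheme
    actually proposed in the thesis is not specified in the abstract.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for tokens in tokenized for t in set(tokens))
    matrix = {}
    for term in vocab:
        idf = math.log(n_docs / df[term])
        matrix[term] = [
            (tokens.count(term) / len(tokens)) * idf if tokens else 0.0
            for tokens in tokenized
        ]
    return matrix


# Hypothetical short texts, loosely echoing the two application topics.
docs = ["brexit vote result", "safety at work", "brexit and work"]
m = term_document_matrix(docs)

# Share of zero cells: short texts leave most of the matrix empty.
cells = [w for row in m.values() for w in row]
sparsity = sum(1 for w in cells if w == 0.0) / len(cells)
print(round(sparsity, 3))
```

A co-clustering step would then group the rows (terms) and columns (documents) of such a matrix at the same time. That step is omitted here.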

    The narrative interview for the assessment of the assisted person: structure, method and data analysis

    Background and aim: If it is true that the impact of the symptoms of a disease is perceived differently by each person and that experiences of suffering are incommunicable, it is equally true that narration provides an understandable representation, which derives from the network of representations that are part of a personal history. The aim of this study was to offer an in-depth analysis of the “narrative interview” collected during the assessment of a 74-year-old diabetic woman. Methods: A case study was conducted by a nurse with advanced expertise in conducting narrative interviews. Content analysis and meaning analysis were performed using a Grounded Theory approach and in accordance with Gee’s Poetic Method. Results: After the diagnosis, the patient felt disbelief, anger and confusion. The illness forced her to change her life, habits and social role, causing great suffering. However, she adjusted to this new condition and, thanks to her strong and positive attitude and the social support she received, she succeeded in activating her “post-traumatic growth”. Conclusions: A good narrative interview starts long before the interview itself and requires: specific training in the use of the instrument; the strengthening of specific skills (e.g. active listening); the choice of the optimal setting and timing for the patient; and the ability to encourage the expression of subjective experience and to analyse the patient’s words through a subjective lens, reflecting the uniqueness of each illness experience.

    System Learning of User Interactions

    The case presented in this paper describes an early prototype and next steps for developing a user-adaptive recommender system using semantic analysis and matching of user profiles and content. Machine learning methods optimize semantic analysis and matching based on implicit and explicit user feedback. The constant interaction with users provides a valuable data source that is used to improve human-computer interaction and to adapt to specific user preferences. This can lead to, among other things, higher accuracy and relevance in content matching, more intuitive graphical user interfaces, improved system performance, and better prioritization of tasks.
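One simple way to picture profile-to-content matching with feedback is cosine similarity over weighted concept vectors. The sketch below is only an assumption about how such matching could look, not the prototype described in the paper. The concept names, the `update_profile` helper, and the learning rate are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def update_profile(profile, item, liked, rate=0.5):
    """Return a new profile nudged toward (liked) or away from an item."""
    sign = 1.0 if liked else -1.0
    updated = dict(profile)
    for concept, weight in item.items():
        # Weights are clamped at zero so a profile never goes negative.
        updated[concept] = max(0.0, updated.get(concept, 0.0) + sign * rate * weight)
    return updated


# Hypothetical user profile and content items.
profile = {"nlp": 1.0, "search": 0.5}
items = {
    "paper_a": {"nlp": 1.0, "parsing": 1.0},
    "paper_b": {"cooking": 1.0},
}

ranked = sorted(items, key=lambda name: cosine(profile, items[name]), reverse=True)
print(ranked)

# Explicit positive feedback on paper_b shifts the profile toward it.
profile_after = update_profile(profile, items["paper_b"], liked=True)
print(cosine(profile_after, items["paper_b"]))
```

Implicit feedback such as clicks or reading time could feed the same update with a smaller rate.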

    Enabling entity retrieval by exploiting Wikipedia as a semantic knowledge source

    This dissertation research, PanAnthropon FilmWorld, aims to demonstrate direct retrieval of entities and related facts by exploiting Wikipedia as a semantic knowledge source, with the film domain as its proof-of-concept domain of application. To this end, a semantic knowledge base concerning the film domain has been constructed with the data extracted/derived from 10,640 Wikipedia pages on films and additional pages on film awards. The knowledge base currently contains 209,266 entities and 2,345,931 entity-centric facts. Both the knowledge base and the corresponding semantic search interface are based on the coherent classification of entities. Entity-centric facts are also consistently represented as tuples. The semantic search interface (http://dlib.ischool.drexel.edu:8080/sofia/PA/) supports multiple types of semantic search functions, which go beyond the traditional keyword-based search function, including the main General Entity Retrieval Query (GERQ) function, which is concerned with retrieving all entities that match the specified entity type, subtype, and semantic conditions and thus corresponds to the main research problem. Two types of evaluation have been performed in order to evaluate (1) the quality of information extraction and (2) the effectiveness of information retrieval using the semantic interface. The first type of evaluation has been performed by inspecting 11,495 film-centric facts concerning 100 films. The results have confirmed high data quality with 99.96% average precision and 99.84% average recall. The second type of evaluation has been performed by conducting an experiment with human subjects. The experiment involved having the subjects perform a retrieval task by using both the PanAnthropon interface and the Internet Movie Database (IMDb) interface and comparing their task performance between the two interfaces. The results have confirmed higher effectiveness of the PanAnthropon interface vs. the IMDb interface (83.11% vs. 40.78% average precision; 83.55% vs. 40.26% average recall). Moreover, the subjects’ responses to the post-task questionnaire indicate that the subjects found the PanAnthropon interface to be highly usable and easily understandable as well as highly effective. The main contribution from this research therefore consists in achieving the set research goal, namely, demonstrating the utility and feasibility of semantics-based direct entity retrieval. Ph.D., Information Studies -- Drexel University, 201
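For readers unfamiliar with the reported metrics, the sketch below shows how precision and recall are commonly computed for one retrieval task and then averaged. It assumes that "average" means a simple mean over tasks, which the abstract does not state. The entity names and function names are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one retrieval task."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def mean_scores(pairs):
    """Average a non-empty list of (precision, recall) pairs over tasks."""
    n = len(pairs)
    return sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n


# Hypothetical task: four entities returned, three actually relevant.
p, r = precision_recall(
    retrieved=["Gravity", "Speed", "Crash", "Alien"],
    relevant=["Gravity", "Speed", "Miss Congeniality"],
)
avg_p, avg_r = mean_scores([(p, r), (1.0, 1.0)])
print(p, r, avg_p, avg_r)
```

Precision penalizes irrelevant results in the answer set, while recall penalizes relevant entities that were missed, so the two are reported together.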

    The discourse of tourism and national heritage: a contrastive study from a cultural perspective

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Facultad de Filosofía y Letras, Departamento de Filología Inglesa. Date of defence: 20-11-2014. This thesis presents a research study in the field of online tourism promotion. It focuses on the national online promotion of UNESCO World Heritage Sites on two different types of websites, institutional and commercial, from three countries: Great Britain, Spain and Romania. The study analyses the way each country presents its national landmarks and combines various modes to create a virtual brochure with a promotional message from both institutional and commercial positions. For this, it studies the organization of the websites and their webpages, as well as the lexico-grammatical and visual features of the promotional messages. Results of the different analyses are interpreted from a cultural perspective. The theoretical framework for the analysis is Systemic Functional Linguistics. The linguistic text is analysed following Halliday’s theory of the metafunctions (1985, 1994; Halliday and Matthiessen 2004). Thus, the analysis focuses on the ideational, interpersonal and textual meanings of the verbal message. Analysis of the visual text applies Kress and van Leeuwen’s model (1996, 2006), studying the same types of meanings realised visually. The results of the different analyses are compared from two perspectives: in relation to the types of websites and to the countries in which they were produced. Comparison between institutional and commercial websites reveals a pattern in which the similarities seem to be related to characteristics typical of web organization and layout, tourist promotion and specific topic, while differences reflect the types of websites and their functions. However, when the websites are compared from the point of view of the different countries, a number of national characteristics of web promotion, common to the two functions of websites, are revealed. These are further interpreted from a cultural point of view, showing that the findings can be accounted for by the context dimension of cultural variability (Hall 1976, 2000; Hall and Hall 1990). The British and Spanish sets of websites are, in general, consistent with the literature on intercultural communication consulted (Hall 2000; Würtz 2005; Neuliep 2006; Şerbănescu 2007), whereas the Romanian sets do not follow the pattern expected from Romania's usual classification as a high-context culture, but combine features of both low- and high-context cultures. The consistencies seem to indicate the stability of British and Spanish cultures. At the same time, departure from the cultural contextual patterns exists in all the cases analysed. These inconsistencies can be explained by cultural changes and influences due to globalization, and by internal changes in terms of politics, economy and society. They also indicate that cultural patterns can be affected by the medium of communication (Internet) and the context of communication (types of promotion). Findings from the thesis emphasize the need for an understanding of multimodality and interculturality in online tourism promotion, especially as applied to creating an image or brand for a country's successful international promotion. They show that Systemic Functional Linguistics offers a useful tool, from both a theoretical and a practical perspective, which can be applied to areas like the composition of promotional messages, online promotion, tourism discourse and its strategies, or intercultural communication.

    Automatic Document Summarization Using Knowledge Based System

    This dissertation describes a knowledge-based system to create abstractive summaries of documents by generalizing new concepts, detecting main topics and creating new sentences. The proposed system is built on the Cyc development platform that consists of the world’s largest knowledge base and one of the most powerful inference engines. The system is unsupervised and domain independent. Its domain knowledge is provided by the comprehensive ontology of common sense knowledge contained in the Cyc knowledge base. The system described in this dissertation generates coherent and topically related new sentences as a summary for a given document. It uses syntactic structure and semantic features of the given documents to fuse information. It makes use of the knowledge base as a source of domain knowledge. Furthermore, it uses the reasoning engine to generalize novel information. The proposed system consists of three main parts: knowledge acquisition, knowledge discovery, and knowledge representation. Knowledge acquisition derives syntactic structure of each sentence in the document and maps words and their syntactic relationships into Cyc knowledge base. Knowledge discovery abstracts novel concepts, not explicitly mentioned in the document by exploring the ontology of mapped concepts and derives main topics described in the document by clustering the concepts. Knowledge representation creates new English sentences to summarize main concepts and their relationships. The syntactic structure of the newly created sentences is extended beyond simple subject-predicate-object triplets by incorporating adjective and adverb modifiers. This structure allows the system to create sentences that are more complex. The proposed system was implemented and tested. 
Test results show that the system is capable of creating new sentences that include abstracted concepts not mentioned in the original document and is capable of combining information from different parts of the document text to compose a summary