12 research outputs found

    Cross-language document retrieval by using non-linear semantic mapping

    A non-linear semantic mapping procedure is proposed for cross-language document retrieval. The method relies on a non-linear space reduction technique to construct semantic embeddings of multilingual document collections. An independent embedding is constructed for each language in the multilingual collection, and the similarities among the resulting semantic representations are used for cross-language document retrieval. Two variants of the proposed method are implemented and compared with a state-of-the-art cross-language information retrieval technique. For some specific tasks, the proposed method outperforms the conventional one.

    Review of a proposed methodology for bibliometric and visualization analyses for organizations: application to the collaboration economy

    This paper presents a bibliometric and visualization method applied to a dataset of 729 documents published in the collaborative economy research field. Four steps are described in detail: (1) the delimitation of the field of study; (2) the selection of databases, keywords, and search criteria; (3) the extraction, cleaning, and formatting of the data; and (4) the co-citation analysis and visualization. The method validation section shows the results obtained by applying the procedure to an author network analysis and to a source title network analysis. The study is distinctive in that it couples a co-citation analysis with a network visualization for the rapidly growing collaborative economy research area as a whole, rather than only for collaborative tourism and hospitality research, as in previous work. The originality of the method lies, first, in the fact that the data were extracted from two databases (Scopus and Web of Science) instead of one, as is common in analytic studies. Second, VOSviewer was the main analytical tool, used to perform both the co-citation analysis and the network visualizations.
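As a rough illustration of the co-citation step, the sketch below counts how often two cited works appear together in the same reference list, which is the standard definition of a co-citation. It is not the authors' VOSviewer workflow; the function name `cocitation_counts` and the placeholder labels "A", "B", "C" are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each unordered pair of cited works, the number of
    citing documents whose reference list contains both works."""
    counts = Counter()
    for refs in reference_lists:
        # Deduplicate and sort so each unordered pair has one canonical key
        # and a work listed twice in one document is counted once.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy input: each inner list is one citing document's references.
citing_docs = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C", "C"],
]

counts = cocitation_counts(citing_docs)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

The resulting pair counts are the kind of edge weights a network visualization tool would draw between cited works; real data would first need the cleaning and formatting described in step (3).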

    IT for a better future: how to integrate ethics, politics and innovation

    Summary of the ETICA project from a perspective of responsible research and innovation.
Purpose: The paper explores future and emerging information and communication technologies (ICTs). It gives a general overview of the social consequences and ethical issues arising from technologies that can currently be reasonably expected. This overview is used to present recommendations and integrate them into a framework of responsible innovation.
Design / methodology / approach: The identification of emerging ICTs and their ethical consequences is based on the review and analysis of several different bodies of literature. The individual features of the ICTs and the ethical issues identified in this way are then aggregated and analysed.
Findings: The paper outlines the 11 ICTs identified. Shared features that are likely to have social relevance include an increase in natural interaction, the invisibility of technology, direct links between humans and technology, detailed models and data of humans, and an increasing autonomy of technology that may lead to power over the user. Ethical issues include several current topics such as privacy, data protection, intellectual property and digital divides. New problems may include changes to the way humans are perceived and to the role of humans and technology in society. This includes changing power structures and different ways of treating humans.
Research limitations / implications: The paper presents a piece of foresight research, which cannot claim exact knowledge of the future. However, by developing a detailed understanding of possible futures, it provides an important basis for current decisions relating to future technology development and governance.
Practical implications: The paper spells out a range of recommendations for both policy makers and researchers / industry. These refer to the framework within which technology is developed and to how such a framework could be designed to allow the development of ethical reflexivity.
Social implications: The work described here is likely to influence EU policy on ICT research and technology research and innovation more broadly. This may have implications for the type of technologies funded and broad implications for the social use of emerging technologies.
Originality / value: The paper presents a novel and important broad view of the future of ICTs that is required in order to inform current policy decisions.
The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 230318.

    An Anthological Review of Research Utilizing MontyLingua: a Python-Based End-to-End Text Processor

    MontyLingua, an integral part of ConceptNet, which is currently the largest commonsense knowledge base, is an English text processor developed in the Python programming language at the MIT Media Lab. Its main feature is coverage of all aspects of English text processing, from raw input text to semantic meanings and summary generation. At the same time, its components are loosely coupled at the architectural and code level, so individual components can be used independently or substituted. However, no review has yet explored the role of MontyLingua in recent research that uses it. This paper reviews the use of, and the roles played by, MontyLingua and its components in research published in 19 articles between October 2004 and August 2006. We observed diverse uses of MontyLingua in many different areas, both generic and domain-specific. Although use of the text summarizing component was not observed, we are optimistic that it will play a crucial role in future research on managing the current trend of information overload.

    Parts-of-Speech Tagger Errors Do Not Necessarily Degrade Accuracy in Extracting Information from Biomedical Text

    Background: Ongoing assessment of the literature is difficult given the rapidly increasing volume of research publications and the limited number of effective information extraction tools that identify entity relationships from text. A recent study reported the development of Muscorian, a generic text processing tool for extracting protein-protein interactions from text, which achieved performance comparable to that of biomedical-specific text processing tools. This result was unexpected, since errors from a series of text analysis processes are likely to adversely affect the outcome of the entire process. Most biomedical entity relationship extraction tools have used a biomedical-specific parts-of-speech (POS) tagger, as errors in POS tagging are likely to affect subsequent semantic analysis of the text, such as shallow parsing. This study evaluates POS tagging accuracy and explores whether comparable performance is obtained when a generic POS tagger, MontyTagger, is used in place of MedPost, a tagger trained on biomedical text. Results: Our results demonstrated that MontyTagger, Muscorian's POS tagger, has a POS tagging accuracy of 83.1% when tested on biomedical text. Replacing MontyTagger with MedPost did not result in a significant improvement in entity relationship extraction from text: precision was 55.6% with MontyTagger versus 56.8% with MedPost on directional relationships, and 86.1% with MontyTagger versus 81.8% with MedPost on nondirectional relationships. This is unexpected, as poor POS tagging by MontyTagger would be likely to affect the outcome of the information extraction. An analysis of POS tagging errors demonstrated that 78.5% of tagging errors are compensated for by shallow parsing. Thus, despite 83.1% tagging accuracy, MontyTagger has a functional tagging accuracy of 94.6%.
Conclusions: POS tagging errors do not adversely affect the information extraction task if the errors are resolved in shallow parsing through alternative POS tag use.

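    Opérationnaliser les compétences transversales en analyse bibliométrique et en visualisation des réseaux au thème de l’économie collaborative [Operationalizing transversal skills in bibliometric analysis and network visualization on the theme of the collaborative economy]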

    A bibliometric analysis using network visualization to represent the theme of the collaborative economy (CE) within scientific research was carried out for this master's thesis. To this end, 729 documents were identified in two databases (Scopus and Web of Science). These documents were disambiguated, cleaned, and standardized, then compiled and analyzed with BibExcel and VOSviewer. Various bibliometric statistics and non-evaluative analyses (co-occurrence, co-citation, co-authorship) were run and visualized to better understand the CE field. The results are multiple. First, the prominence of the United States is notable at the level of authors, organizations, and collaborations alike. Next, different thematic clusters emerged depending on the variable studied. The importance of technology, tourism, sustainable development, the managerial aspect, and the theory/conceptualization of the CE is recurrent. Several authors are pivotal in the literature, but the most influential are Russell Belk and Rachel Botsman. The book What's Mine Is Yours (Botsman and Rogers, 2010) is the most cited document, even though its authors do not come from academia. The relationships among the publications studied show cohesion among the different ideas and themes conveyed in the field, despite the definitional and conceptual problems surrounding the CE. Finally, the number of publications has grown very strongly since 2016 and points to an emerging research field: sustainable development. Taken together, the results offer a new perspective on the CE.
They introduce new researchers to the characteristics of the field and help experts identify the themes, journals, and authors to consider in their own analyses. The methodology and results of this research were published in the Journal of Cleaner Production (Ertz & Leblanc-Proulx, 2018), the Journal of Marketing Analytics (Ertz & Leblanc-Proulx, 2019a), and the book SAGE Research Methods Cases (Ertz & Leblanc-Proulx, 2019b).

    A Novel Algorithm for Visualizing Concept Associations
