
    Econometrics meets sentiment : an overview of methodology and applications

    The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, which is a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software.

    Knowledge-Based Techniques for Scholarly Data Access: Towards Automatic Curation

    Accessing up-to-date, high-quality scientific literature is a critical preliminary step in any research activity. Identifying relevant scholarly literature for a given task or application is, however, a complex and time-consuming activity. Despite the large number of tools developed over the years to support scholars in surveying the literature, such as Google Scholar, Microsoft Academic Search, and others, the best way to access quality papers remains asking a domain expert who is actively involved in the field and knows its research trends and directions. State-of-the-art systems, in fact, either do not allow exploratory search, such as identifying the active research directions within a given topic, or do not offer proactive features, such as content recommendation, both of which are critical to researchers. To overcome these limitations, we strongly advocate a paradigm shift in the development of scholarly data access tools: moving from traditional information retrieval and filtering tools towards automated agents able to make sense of the textual content of published papers and therefore monitor the state of the art. Building such a system is, however, a complex task that implies tackling non-trivial problems in the fields of Natural Language Processing, Big Data Analysis, User Modelling, and Information Filtering. In this work, we introduce the concept of an Automatic Curator System and present its fundamental components. Dottorato di ricerca in Informatica (doctoral programme in Computer Science). De Nart, Dari

    The state of research on folksonomies in the field of Library and Information Science : a Systematic Literature Review

    Purpose – The purpose of this thesis is to provide an overview of all relevant peer-reviewed articles on folksonomies, social tagging and social bookmarking as knowledge organisation systems within the field of Library and Information Science by reviewing the current state of research on these systems of managing knowledge. Method – I use the systematic literature review method to review and synthesise, systematically and transparently, data extracted from 39 articles found through the discovery system LUBsearch. The aim is to find out which methods, theories and systems are represented and to what degree, which subfields can be distinguished, how well represented research within these subfields is, and which larger conclusions can be drawn from research on folksonomies conducted between 2003 and 2013. Findings – Many of the studies are exploratory or review discussions in the literature; other frequently used methods are questionnaires or surveys, although often in conjunction with other methods. Furthermore, out of the 39 studies, 22 were quantitative, 15 were qualitative and 2 used mixed methods. I also found that an underwhelming number of theories were explicitly used: merely 11 articles explicitly used theories, and only one theory was used twice. No key authors on the topic were identified, though Knowledge Organization, Information Processing & Management and Journal of the American Society for Information Science and Technology were recognised as key journals for research on folksonomies. There have been plenty of studies on how tags and folksonomies have affected other knowledge organisation systems, or on how pre-existing systems have been used to create new ones. Other well-represented subfields include studies on the quality or characteristics of tags or text, and studies aiming to improve folksonomies, search methods or tags. Value – I provide an overview of what has been researched and where the focus of that research has been during the last decade, present suggestions for future research, and identify possible dangers to be wary of, which I argue will benefit folksonomies and knowledge organisation as a whole.

    Human-competitive automatic topic indexing

    Topic indexing is the task of identifying the main topics covered by a document. These are useful for many purposes: as subject headings in libraries, as keywords in academic publications and as tags on the web. Knowing a document's topics helps people judge its relevance quickly. However, assigning topics manually is labor intensive. This thesis shows how to generate them automatically in a way that competes with human performance. Three kinds of indexing are investigated: term assignment, a task commonly performed by librarians, who select topics from a controlled vocabulary; tagging, a popular activity of web users, who choose topics freely; and a new method of keyphrase extraction, where topics are equated to Wikipedia article names. A general two-stage algorithm is introduced that first selects candidate topics and then ranks them by significance based on their properties. These properties draw on statistical, semantic, domain-specific and encyclopedic knowledge. They are combined using a machine learning algorithm that models human indexing behavior from examples. This approach is evaluated by comparing automatically generated topics to those assigned by professional indexers, and by amateurs. We claim that the algorithm is human-competitive because it chooses topics that are as consistent with those assigned by humans as their topics are with each other. The approach is generalizable, requires little training data and applies across different domains and languages
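
The two-stage idea in this abstract (select candidate topics, then rank them by their properties) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the stopword list, the function names `select_candidates` and `rank_candidates`, and the hand-set scoring formula are all assumptions made for illustration, and the hand-set score stands in for the machine-learned model the abstract describes.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the thesis).
STOPWORDS = {"a", "an", "and", "are", "as", "by", "for", "in", "is",
             "of", "on", "the", "to", "with"}


def select_candidates(text, max_len=3):
    """Stage 1: collect word n-grams that neither start nor end with a stopword."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    first_pos = {}
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            phrase = " ".join(gram)
            counts[phrase] += 1
            # Remember where the phrase first appears (word offset).
            first_pos.setdefault(phrase, i)
    return counts, first_pos, len(words)


def rank_candidates(text, top_k=3):
    """Stage 2: rank candidates by a hand-set score.

    The score (frequency times phrase length, plus a bonus for early first
    occurrence) is a hypothetical stand-in for a learned ranking model.
    """
    counts, first_pos, total = select_candidates(text)
    scored = []
    for phrase, freq in counts.items():
        earliness = 1.0 - first_pos[phrase] / max(total, 1)
        score = freq * len(phrase.split()) + earliness
        scored.append((score, phrase))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [phrase for _, phrase in scored[:top_k]]
```

Calling `rank_candidates(document_text)` returns the top-scoring phrases. In a learned variant, the per-candidate properties (frequency, position, length) would be fed to a classifier trained on human-assigned topics instead of being combined by a fixed formula.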

    Identifying collaboration dynamics of bipartite author-topic networks with the influences of interest changes

    Knowing the driving factors of collaboration and understanding researcher behavior from the dynamics of collaborations over time offers useful insights; for example, it can help funding agencies design research grant policies. We present a longitudinal network analysis of collaborations observed through co-authorship over 15 years. Since co-authors may influence researchers to change their interests, we focus on researchers who could become influencers and propose a stochastic actor-oriented model of bipartite (two-mode) author-topic networks built from article metadata. Information on the scientific fields or topics of articles, which could represent the interests of researchers, is often unavailable in the metadata. This topic-absence issue differentiates this work from other studies on collaboration dynamics that rely on article metadata such as titles, abstracts, and author properties. Our work therefore also includes procedures to extract and map clustered keywords as topics that substitute for research interests. The next step is to generate panel waves of co-author networks and bipartite author-topic networks for the longitudinal analysis. The proposed model is used to find the driving factors of co-authoring collaboration, with a focus on how researchers change their interests. This paper investigates the dynamics in an academic social network setting using selected metadata of publicly available crawled articles in the interrelated domains of "natural language processing" and "information extraction". Based on the evidence of network evolution, researchers tend to conform to co-author behaviors in publishing articles and exploring topics. Our results indicate that the processes of selection and influence in forming co-author ties place some level of social pressure on researchers. We also discuss how this co-author pressure accelerates changes in the interests and behaviors of researchers.
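
The data-preparation step mentioned in this abstract, generating panel waves of co-author networks and bipartite author-topic networks, might look roughly like the sketch below. The record layout `(year, authors, topic)`, the function name `build_waves`, and the wave boundaries are assumptions for illustration only. The stochastic actor-oriented model itself is not implemented here.

```python
from itertools import combinations


def build_waves(records, wave_bounds):
    """Split article records into panel waves and build two edge sets per wave.

    records:     iterable of (year, authors, topic) tuples (hypothetical layout).
    wave_bounds: list of inclusive (start_year, end_year) pairs, one per wave.
    """
    waves = []
    for start, end in wave_bounds:
        coauthor = set()       # one-mode network: unordered author pairs
        author_topic = set()   # two-mode network: (author, topic) edges
        for year, authors, topic in records:
            if start <= year <= end:
                unique_authors = sorted(set(authors))
                coauthor.update(
                    frozenset(pair) for pair in combinations(unique_authors, 2)
                )
                author_topic.update((author, topic) for author in unique_authors)
        waves.append(
            {"period": (start, end), "coauthor": coauthor, "author_topic": author_topic}
        )
    return waves
```

Each wave then holds one snapshot of both networks, which is the form a longitudinal network model typically expects as input.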

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions: technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: first, a concerted vision of the functional breakdown of a generic multimedia search engine, and second, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    Land Use Identification of the Metropolitan Area of Guadalajara Using Bicycle Data: An Unsupervised Classification Approach

    This work proposes several ways of addressing a current problem: conducting research on land use, mapping, and human behaviour by evaluating people's movement through data sources that contain georeferenced information. It also shares the goal of classifying different areas and the relationships between them. The data source was MiBici, a bicycle-sharing platform in the city of Guadalajara, Jalisco, which publishes a consolidated file of each month's trips; access to this information is completely free. The methodologies used were agile for project planning, and KNN, Decision Trees, and KMeans for clustering the zones. The programming language was Python. An implementation proposal using Amazon Web Services (AWS) is also included, with the aim of offering a solution that is "simpler" to implement but delivers the same value as one built purely on free resources. The process was divided into three main parts. In the first, the data were cleaned and explored, and the machine learning algorithms Decision Tree and KNN were applied. In the second, after evaluating the results of the previous stage, the data were modified by adding new fields to improve the results, and KMeans was applied to create groups. In the last, a workflow was built that starts with cleaning the raw data using AWS tools and ends with the interpretation of the final results. The results were very encouraging: the groups obtained were clearly differentiated, and checking them against the zones associated with the nodes revealed a strong relationship. A great deal of work undoubtedly remains to be done in this line of research.
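
The clustering step named in this abstract (KMeans applied to station-level data) can be sketched as follows. This is a plain implementation of Lloyd's algorithm in standard-library Python, not the authors' code. The feature choice in the usage note (share of morning versus evening departures per station) and all names are hypothetical.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Cluster equal-length numeric tuples into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct positions in the input list.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids
```

For example, with made-up station features such as `[(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]` and `k=2`, stations dominated by morning departures and stations dominated by evening departures end up in separate groups. These groups could then be compared with known land-use zones, as the abstract describes.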

    Community-developed checklists for publishing images and image analysis

    Images document scientific discoveries and are prevalent in modern biomedical research. Microscopy imaging in particular is currently undergoing rapid technological advancements. However, for scientists wishing to publish the obtained images and image analysis results, there are to date no unified guidelines. Consequently, microscopy images and image data in publications may be unclear or difficult to interpret. Here we present community-developed checklists for preparing light microscopy images and image analysis for publications. These checklists offer authors, readers, and publishers key recommendations for image formatting and annotation, color selection, data availability, and for reporting image analysis workflows. The goal of our guidelines is to increase the clarity and reproducibility of image figures and thereby heighten the quality of microscopy data in publications. Comment: 28 pages, 8 Figures, 3 Supplementary Figures, Manuscript, Essential recommendations for publication of microscopy image data