778 research outputs found

    Preferred Level of Weird: A Librarian's Guide to Fanfiction

    This instruction guide aims to provide librarians with an understanding of the basics of fanfiction, including a glossary of terms, an introduction to the information-seeking behaviours of fanfiction readers, and search tips for a popular general fanfiction archive, to help both librarian and patron find the reading experience they are looking for.

    Linked Research on the Decentralised Web

    This thesis is about research communication in the context of the Web. I analyse literature which reveals how researchers are making use of Web technologies for knowledge dissemination, as well as how individuals are disempowered by the centralisation of certain systems, such as academic publishing platforms and social media. I share my findings on the feasibility of a decentralised and interoperable information space where researchers can control their identifiers whilst fulfilling the core functions of scientific communication: registration, awareness, certification, and archiving. The contemporary research communication paradigm operates under a diverse set of sociotechnical constraints, which influence how units of research information and personal data are created and exchanged. Economic forces and non-interoperable system designs mean that researcher identifiers and research contributions are largely shaped and controlled by third-party entities; participation requires the use of proprietary systems. From a technical standpoint, this thesis takes a deep look at the semantic structure of research artifacts, and how they can be stored, linked and shared in a way that is controlled by individual researchers, or delegated to trusted parties. Further, I find that the ecosystem was lacking a technical Web standard able to fulfil the awareness function of research communication. Thus, I contribute a new communication protocol, Linked Data Notifications (published as a W3C Recommendation), which enables decentralised notifications on the Web, and provide implementations pertinent to the academic publishing use case. So far we have seen decentralised notifications applied in research dissemination and collaboration scenarios, as well as for archival activities and scientific experiments. Another core contribution of this work is a Web standards-based implementation of a client-side tool, dokieli, for decentralised article publishing, annotations and social interactions. dokieli can be used to fulfil the scholarly functions of registration, awareness, certification, and archiving, all in a decentralised manner, returning control of research contributions and discourse to individual researchers. The overarching conclusion of the thesis is that Web technologies can be used to create a fully functioning ecosystem for research communication. Using the framework of Web architecture, and loosely coupling the four functions, an accessible and inclusive ecosystem can be realised whereby users are able to use and switch between interoperable applications without interfering with existing data. Technical solutions alone do not suffice, of course, so this thesis also takes into account the need for a change in the traditional mode of thinking amongst scholars, and presents the Linked Research initiative as an ongoing effort toward researcher autonomy in a social system, and universal access to human- and machine-readable information. Outcomes of this outreach work so far include an increase in the number of individuals self-hosting their research artifacts, workshops publishing accessible proceedings on the Web, in-the-wild experiments with open and public peer review, and semantic graphs of contributions to conference proceedings and journals (the Linked Open Research Cloud). Some of the future challenges include: addressing the social implications of decentralised Web publishing, as well as the design of ethically grounded interoperable mechanisms; cultivating privacy-aware information spaces; personal or community-controlled on-demand archiving services; and further design of decentralised applications that are aware of the core functions of scientific communication.
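The Linked Data Notifications flow mentioned in the abstract can be sketched briefly: a receiver advertises an inbox (for example via an HTTP Link header with relation `http://www.w3.org/ns/ldp#inbox`), and a sender POSTs a JSON-LD notification to it. The inbox URL, actor, and payload fields below are illustrative assumptions, not taken from the thesis itself.

```python
import json

def discover_inbox(link_header):
    """Parse an HTTP Link header and return the advertised LDN inbox URL,
    or None if no inbox relation is present."""
    for part in link_header.split(","):
        if "http://www.w3.org/ns/ldp#inbox" in part:
            start = part.index("<") + 1
            end = part.index(">")
            return part[start:end]
    return None

def make_notification(actor, target):
    """Build a minimal JSON-LD notification body announcing a resource.
    The ActivityStreams vocabulary here is one common choice; LDN itself
    does not mandate a particular payload vocabulary."""
    return json.dumps({
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "actor": actor,
        "object": target,
    })
```

A sender would then POST the body to the discovered inbox with `Content-Type: application/ld+json`; the network step is omitted here to keep the sketch self-contained.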

    “There’s A Tag for That”: An Exploratory Study of Tag Functions in the Archive of Our Own

    Although there have been many studies on the effectiveness of tagging systems for information organization and retrieval, there have been far fewer studies addressing other tag functions and their impact on user experience and the evaluation of information. There was a particular lack of research into how tags function for users who did not add them to a resource. This study used a diary protocol followed by interviews to investigate the functions tags played for users of the Archive of Our Own and their impact on the user experience of the site. Results suggested that tags frequently influenced a user's decision to consume a fanwork and could also affect their perception of the fanwork or its creator. Participants generally had a positive user experience of the AO3 and found it easier to retrieve fanworks on it than on other repositories. Some suggestions for future research are made in the conclusion.
    Master of Science in Information Science

    Information of social media platforms: the case of Last.fm

    Social media has become a global phenomenon. Currently, there are 2 billion active users on Facebook. However, much of the research on social media concerns the consumption side rather than the production or operational aspects of social media. Although research on the production side is still relatively small, it is growing, indicating that it is a fruitful area to study. This thesis attempts to contribute to this area of research and unravel the inner operations of social media with one key research question: how do social media platforms organize information? The theory of digital objects of Kallinikos et al. (2013) is used to investigate this question. The information display that users of a social media platform interact with is a digital object, constructed from two key components: a database and algorithms. The database and the algorithms shape how information is organized on information displays, and these influence user behaviors, which are then captured as social data in the database. This thesis also critically examines the technology of recommender systems by importing engineering literature on information filtering and retrieval. While newsfeed algorithms such as Facebook's EdgeRank have already been critically examined, information systems and media scholars have yet to investigate recommendation algorithms, despite the fact that they have been widely deployed all over the Internet. It is found that the key weakness of recommendation algorithms is their inability to recommend novel items, because the main tenet of any recommender system is to "recommend similar items to those that users already like". Fortunately, this problem can be alleviated when a recommender system is deployed in the digital information environment of a social media platform. In turn, seven theoretical conjectures can be postulated. These are: (1) navigation of the information display as assembled by social media is highly interactive; (2) information organization of social media is highly unstable, which would also render user behaviors unstable; (3) the quality of data aggregation has significant implications for user behaviors; (4) the amount of data captured by social media platforms limits the usefulness of their information displays; (5) the output of the recommendation algorithm (the recommendation list) has real implications for user behaviors; (6) the circle of friends on a social network can influence user behaviors; and (7) the metadata attached to displayed items influences user behaviors. Data from Last.fm, a social media platform for music discovery, is used to evaluate these conjectures. The analysis supported most of the conjectures, except the instability of the information display and the importance of metadata attached to displayed items. Some kinds of information organization are more stable than initially expected, and some kinds of user-generated content are less important for user behaviors.
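The tenet quoted above, and the resulting inability to recommend novel items, can be illustrated with a toy item-based collaborative filter. This is a minimal sketch for illustration only; the user names, items, and ratings are invented, and this is not Last.fm's actual recommender.

```python
from math import sqrt

# Toy rating matrix: ratings[user][item] = score (illustrative data).
ratings = {
    "u1": {"jazz": 5, "blues": 4},
    "u2": {"jazz": 4, "blues": 5, "rock": 2},
    "u3": {"blues": 4, "rock": 5},
}

def item_vector(item):
    """Represent an item by the ratings it has received, keyed by user."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    num = sum(a[u] * b[u] for u in common)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def recommend(user):
    """Score unseen items by their similarity to items the user already
    rated, weighted by those ratings -- 'similar items to those you like'."""
    seen = ratings[user]
    items = {i for r in ratings.values() for i in r}
    scores = {}
    for cand in items - set(seen):
        scores[cand] = sum(cosine(item_vector(cand), item_vector(s)) * w
                           for s, w in seen.items())
    return sorted(scores, key=scores.get, reverse=True)
```

The cold-start weakness falls out directly: a brand-new item rated by no one the target user overlaps with has cosine similarity 0 to everything in their history, so it can never rank above familiar-looking items.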

    The New Legal Landscape for Text Mining and Machine Learning

    Now that the dust has settled on the Authors Guild cases, this Article takes stock of the legal context for TDM research in the United States. This reappraisal begins in Part I with an assessment of exactly what the Authors Guild cases did and did not establish with respect to the fair use status of text mining. Those cases held unambiguously that reproducing copyrighted works as one step in the process of knowledge discovery through text data mining was transformative, and thus ultimately a fair use of those works. Part I explains why those rulings followed inexorably from copyright's most fundamental principles. It also explains why the precedent set in the Authors Guild cases is likely to remain settled law in the United States. Parts II and III address legal considerations for would-be text miners and their supporting institutions beyond the core holding of the Authors Guild cases. The Google Books and HathiTrust cases held, in effect, that copying expressive works for non-expressive purposes was justified as fair use. This addresses the most significant issue for the legality of text data mining research in the United States; however, the legality of non-expressive use is far from the only legal issue that researchers and their supporting institutions must confront if they are to realize the full potential of these technologies. Neither case addressed issues arising under contract law, laws prohibiting computer hacking, laws prohibiting the circumvention of technological protection measures (i.e., encryption and other digital locks), or cross-border copyright issues. Furthermore, although Google Books addressed the display of snippets of text as part of the communication of search results, and both Authors Guild cases addressed security issues that might bear upon the fair use claim, those holdings were a product of the particular factual circumstances of those cases and can only be extended cautiously to other contexts. 
Specifically, Part II surveys the legal status of TDM research in other important jurisdictions and explains some of the key differences between the law in the United States and the law in the European Union. It also explains how researchers can predict which law will apply in different situations. Part III sets out a four-stage model of the lifecycle of text data mining research and uses this model to identify and explain the relevant legal issues beyond the core holdings of the Authors Guild cases in relation to TDM as a non-expressive use.

    Government Transparency: Six Strategies for More Open and Participatory Government

    Offers strategies for realizing Knight's 2009 call for e-government and openness using Web 2.0 and 3.0 technologies, including public-private partnerships to develop applications, flexible procurement procedures, and better community broadband access.

    Enriching information extraction pipelines in clinical decision support systems

    Official Doctoral Programme in Information and Communication Technologies. 5032V01
    Multicentre health studies are important to increase the impact of medical research findings due to the number of subjects that they are able to engage. To simplify the execution of these studies, the data-sharing process should be effortless, for instance, through the use of interoperable databases. However, achieving this interoperability is still an ongoing research topic, namely due to data governance and privacy issues. In the first stage of this work, we propose several methodologies to optimise the harmonisation pipelines of health databases. This work was focused on harmonising heterogeneous data sources into a standard data schema, namely the OMOP CDM, which has been developed and promoted by the OHDSI community. We validated our proposal using data sets of Alzheimer's disease patients from distinct institutions. In the following stage, aiming to enrich the information stored in OMOP CDM databases, we investigated solutions to extract clinical concepts from unstructured narratives, using information retrieval and natural language processing techniques. The validation was performed through datasets provided in scientific challenges, namely the National NLP Clinical Challenges (n2c2). In the final stage, we aimed to simplify the protocol execution of multicentre studies by proposing novel solutions for profiling, publishing and facilitating the discovery of databases. Some of the developed solutions are currently being used in three European projects aiming to create federated networks of health databases across Europe.
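The harmonisation step described in the abstract — mapping heterogeneous source records into the OMOP CDM — can be sketched as a small ETL transform. The source field names and the concept-ID mapping below are hypothetical examples; a real pipeline would resolve codes through the OHDSI standardized vocabulary tables rather than a hand-written dictionary.

```python
# Illustrative mapping from hypothetical source codes to OMOP concept IDs.
# A production pipeline would look these up in OHDSI vocabulary tables.
CONCEPT_MAP = {
    "alzheimer": 378419,   # example condition concept id (illustrative)
    "M": 8507, "F": 8532,  # example gender concept ids
}

def to_omop(source):
    """Map one heterogeneous patient record (hypothetical schema) to
    OMOP CDM 'person' and 'condition_occurrence' rows."""
    return {
        "person": {
            "person_id": source["patient_ref"],
            "gender_concept_id": CONCEPT_MAP[source["sex"]],
            "year_of_birth": int(source["dob"][:4]),
        },
        "condition_occurrence": [{
            "person_id": source["patient_ref"],
            "condition_concept_id": CONCEPT_MAP[dx.lower()],
            "condition_start_date": date,
        } for dx, date in source["diagnoses"]],
    }
```

Once every institution's records pass through such a transform into the common schema, the same analysis code can run unchanged against each site's database, which is the point of the federated-network approach the abstract describes.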

    Measuring metadata quality


    Information management and social networks in organizational innovation networks

    Master's thesis. Information Science. Faculdade de Engenharia, Universidade do Porto. 201