    Knowledge Management: A Discovery Process

    Getting strategic about how you organize and redistribute knowledge can help just about anyone achieve their goals more efficiently. We at The McKnight Foundation often find ourselves at the center of meaty, data-rich, analytic conversations. This case study summarizes our yearlong exploration and planning to consume, organize, and share knowledge better.

    Interactive Search and Exploration in Online Discussion Forums Using Multimodal Embeddings

    In this paper we present a novel interactive multimodal learning system, which facilitates search and exploration in large networks of social multimedia users. It allows the analyst to identify and select users of interest, and to find similar users in an interactive learning setting. Our approach is based on novel multimodal representations of users, words and concepts, which we simultaneously learn by deploying a general-purpose neural embedding model. We show these representations to be useful not only for categorizing users, but also for automatically generating user and community profiles. Inspired by traditional summarization approaches, we create the profiles by selecting diverse and representative content from all available modalities, i.e., the text, image and user modality. The usefulness of the approach is evaluated using artificial actors, which simulate user behavior in a relevance feedback scenario. Multiple experiments were conducted in order to evaluate the quality of our multimodal representations, to compare different embedding strategies, and to determine the importance of different modalities. We demonstrate the capabilities of the proposed approach on two different multimedia collections originating from the violent online extremism forum Stormfront and the microblogging platform Twitter, which are particularly interesting due to the high semantic level of the discussions they feature.

    Identity in research infrastructure and scientific communication: Report from the 1st IRISC workshop, Helsinki Sep 12-13, 2011

    Motivation for the IRISC workshop came from the observation that identity and digital identification are increasingly important factors in modern scientific research, especially with the now near-ubiquitous use of the Internet as a global medium for dissemination and debate of scientific knowledge and data, and as a platform for scientific collaborations and large-scale e-science activities.

The 1 1/2 day IRISC2011 workshop sought to explore a series of interrelated topics under two main themes: i) unambiguously identifying authors/creators & attributing their scholarly works, and ii) individual identification and access management in the context of identity federations. Specific aims of the workshop included:

• Raising overall awareness of key technical and non-technical challenges, opportunities and developments.
• Facilitating a dialogue, cross-pollination of ideas, collaboration and coordination between diverse – and largely unconnected – communities.
• Identifying & discussing existing/emerging technologies, best practices and requirements for researcher identification.

This report provides background information on key identification-related concepts & projects, describes workshop proceedings and summarizes key workshop findings.

    Big data: the potential role of research data management and research data registries

    Universities generate and hold increasingly vast quantities of research data – both in the form of large, well-structured datasets and, more often, in the form of a long tail of small, distributed datasets which collectively amount to ‘Big Data’ and offer significant potential for reuse. However, unlike big data, these collections of small data are often less well curated and are usually very difficult to find, thereby reducing their potential reuse value. The Digital Curation Centre (DCC) works to support UK universities to better manage and expose their research data so that its full value may be realised. With a focus on tapping into this long tail of small data, this presentation will cover two main DCC services: DMPonline, which helps researchers to identify potentially valuable research data and to plan for its longer-term retention and reuse; and the UK pilot research data registry and discovery service (RDRDS), which will help to ensure that research data produced in UK HEIs can be found, understood, and reused. Initially we will introduce participants to the role of data management planning to open up dialogue between researchers and library services to ensure potentially valuable research data are managed appropriately and made available for reuse where feasible. DMPs provide institutions with valuable insights into the scale of their data holdings, highlight any ethical and legal requirements that need to be met, and enable planning for dissemination and reuse. We will also introduce the DCC’s DMPonline, a tool to help researchers write DMPs, which can be customised by institutions and integrated with other systems to simplify and enhance the management and reuse of data. In the second part of the presentation we will focus on making selected research data more visible for reuse and explore the potential value of local and national research data registries.
In particular we will highlight the Jisc-funded RDRDS pilot to establish a UK national service that aggregates metadata relating to data collections held in research institutions and subject data centres. The session will conclude by exploring some of the opportunities we may collaboratively explore in facilitating the management, aggregation and reuse of research data.

    A golden age for working with public proteomics data

    Data sharing in mass spectrometry (MS)-based proteomics is becoming a common scientific practice, as is already the case in other, more mature 'omics' disciplines like genomics and transcriptomics. We want to highlight that this situation, unprecedented in the field, opens a plethora of opportunities for data scientists. First, we explain in some detail some of the work already achieved, such as systematic reanalysis efforts. We also explain existing applications of public proteomics data, such as proteogenomics and the creation of spectral libraries and spectral archives. Finally, we discuss the main existing challenges and mention the first attempts to combine public proteomics data with other types of omics data sets.

    Analysis and Detection of Information Types of Open Source Software Issue Discussions

    Most modern Issue Tracking Systems (ITSs) for open source software (OSS) projects allow users to add comments to issues. Over time, these comments accumulate into discussion threads embedded with rich information about the software project, which can potentially satisfy the diverse needs of OSS stakeholders. However, discovering and retrieving relevant information from the discussion threads is a challenging task, especially when the discussions are lengthy and the number of issues in ITSs is vast. In this paper, we address this challenge by identifying the information types presented in OSS issue discussions. Through qualitative content analysis of 15 complex issue threads across three projects hosted on GitHub, we uncovered 16 information types and created a labeled corpus containing 4656 sentences. Our investigation of supervised, automated classification techniques indicated that, when prior knowledge about the issue is available, Random Forest can effectively detect most sentence types using conversational features such as the sentence length and its position. When classifying sentences from new issues, Logistic Regression can yield satisfactory performance using textual features for certain information types, while falling short on others. Our work represents a nontrivial first step towards tools and techniques for identifying and obtaining the rich information recorded in the ITSs to support various software engineering activities and to satisfy the diverse needs of OSS stakeholders. Comment: 41st ACM/IEEE International Conference on Software Engineering (ICSE2019)

    Report of the user requirements and web based access for eResearch workshops

    The User Requirements and Web Based Access for eResearch Workshop, organized jointly by NeSC and NCeSS, was held on 19 May 2006. The aim was to identify lessons learned from e-Science projects that would contribute to our capacity to make Grid infrastructures and tools usable and accessible for diverse user communities. Its focus was on providing an opportunity for a pragmatic discussion between e-Science end users and tool builders in order to understand usability challenges, technological options, community-specific content and needs, and methodologies for design and development. We invited members of six UK e-Science projects and one US project, trying as far as possible to pair a user and developer from each project in order to discuss their contrasting perspectives and experiences. Three breakout group sessions covered the topics of user-developer relations, commodification, and functionality. There was also extensive post-meeting discussion, summarized here. Additional information on the workshop, including the agenda, participant list, and talk slides, can be found online at http://www.nesc.ac.uk/esi/events/685/ Reference: NeSC report UKeS-2006-07 available from http://www.nesc.ac.uk/technical_papers/UKeS-2006-07.pd

    SPEIR: Scottish Portals for Education, Information and Research. Final Project Report: Elements and Future Development Requirements of a Common Information Environment for Scotland

    The SPEIR (Scottish Portals for Education, Information and Research) project was funded by the Scottish Library and Information Council (SLIC). It ran from February 2003 to September 2004, slightly longer than the 18 months originally scheduled, and was managed by the Centre for Digital Library Research (CDLR). With SLIC's agreement, community stakeholders were represented in the project by the Confederation of Scottish Mini-Cooperatives (CoSMiC), an organisation whose members include SLIC, the National Library of Scotland (NLS), the Scottish Further Education Unit (SFEU), the Scottish Confederation of University and Research Libraries (SCURL), regional cooperatives such as the Ayrshire Libraries Forum (ALF), and representatives from the Museums and Archives communities in Scotland.

Aims: A Common Information Environment for Scotland

The aims of the project were to:

• Conduct basic research into the distributed information infrastructure requirements of the Scottish Cultural Portal pilot and the public library CAIRNS integration proposal;
• Develop associated pilot facilities by enhancing existing facilities or developing new ones;
• Ensure that both infrastructure proposals and pilot facilities were sufficiently generic to be utilised in support of other portals developed by the Scottish information community;
• Ensure the interoperability of infrastructural elements beyond Scotland through adherence to established or developing national and international standards.

Since the Scottish information landscape is taken by CoSMiC members to encompass relevant activities in Archives, Libraries, Museums, and related domains, the project was, in essence, concerned with identifying, researching, and developing the elements of an internationally interoperable common information environment for Scotland, and with determining the best path for future progress.

    User Review-Based Change File Localization for Mobile Applications

    In current mobile app development, novel and emerging DevOps practices (e.g., Continuous Delivery, Integration, and user feedback analysis) and tools are becoming more widespread. For instance, the integration of user feedback (provided in the form of user reviews) in the software release cycle represents a valuable asset for the maintenance and evolution of mobile apps. To fully make use of these assets, it is highly desirable for developers to establish semantic links between the user reviews and the software artefacts to be changed (e.g., source code and documentation), and thus to localize the potential files to change for addressing the user feedback. In this paper, we propose RISING (Review Integration via claSsification, clusterIng, and linkiNG), an automated approach to support the continuous integration of user feedback via classification, clustering, and linking of user reviews. RISING leverages domain-specific constraint information and semi-supervised learning to group user reviews into multiple fine-grained clusters concerning similar users' requests. Then, by combining the textual information from both commit messages and source code, it automatically localizes potential change files to accommodate the users' requests. Our empirical studies demonstrate that the proposed approach outperforms the state-of-the-art baseline work in terms of clustering and localization accuracy, and thus produces more reliable results. Comment: 15 pages, 3 figures, 8 tables