Knowledge Management: A Discovery Process
Getting strategic about how you organize and redistribute knowledge can help just about anyone achieve their goals more efficiently. We at The McKnight Foundation often find ourselves at the center of meaty, data-rich, analytic conversations. This case study summarizes our yearlong exploration and planning to consume, organize, and share knowledge better.
Interactive Search and Exploration in Online Discussion Forums Using Multimodal Embeddings
In this paper we present a novel interactive multimodal learning system,
which facilitates search and exploration in large networks of social multimedia
users. It allows the analyst to identify and select users of interest, and to
find similar users in an interactive learning setting. Our approach is based on
novel multimodal representations of users, words and concepts, which we
simultaneously learn by deploying a general-purpose neural embedding model. We
show these representations to be useful not only for categorizing users, but
also for automatically generating user and community profiles. Inspired by
traditional summarization approaches, we create the profiles by selecting
diverse and representative content from all available modalities, i.e. the
text, image and user modality. The usefulness of the approach is evaluated
using artificial actors, which simulate user behavior in a relevance feedback
scenario. Multiple experiments were conducted in order to evaluate the quality
of our multimodal representations, to compare different embedding strategies,
and to determine the importance of different modalities. We demonstrate the
capabilities of the proposed approach on two different multimedia collections
originating from the violent online extremism forum Stormfront and the
microblogging platform Twitter, which are particularly interesting due to the
high semantic level of the discussions they feature.
Identity in research infrastructure and scientific communication: Report from the 1st IRISC workshop, Helsinki Sep 12-13, 2011
Motivation for the IRISC workshop came from the observation that identity and digital identification are increasingly important factors in modern scientific research, especially with the now near-ubiquitous use of the Internet as a global medium for dissemination and debate of scientific knowledge and data, and as a platform for scientific collaborations and large-scale e-science activities.

The 1 1/2 day IRISC2011 workshop sought to explore a series of interrelated topics under two main themes: i) unambiguously identifying authors/creators & attributing their scholarly works, and ii) individual identification and access management in the context of identity federations. Specific aims of the workshop included:

• Raising overall awareness of key technical and non-technical challenges, opportunities and developments.
• Facilitating a dialogue, cross-pollination of ideas, collaboration and coordination between diverse – and largely unconnected – communities.
• Identifying & discussing existing/emerging technologies, best practices and requirements for researcher identification.

This report provides background information on key identification-related concepts & projects, describes workshop proceedings and summarizes key workshop findings.
Big data: the potential role of research data management and research data registries
Universities generate and hold increasingly vast quantities of research data – both in the form of large, well-structured datasets but more often in the form of a long tail of small, distributed datasets which collectively amount to ‘Big Data’ and offer significant potential for reuse. However, unlike big data, these collections of small data are often less well curated and are usually very difficult to find, thereby reducing their potential reuse value. The Digital Curation Centre (DCC) works to support UK universities to better manage and expose their research data so that its full value may be realised. With a focus on tapping into this long tail of small data, this presentation will cover two main DCC services: DMPonline, which helps researchers to identify potentially valuable research data and to plan for its longer-term retention and reuse; and the UK pilot research data registry and discovery service (RDRDS), which will help to ensure that research data produced in UK HEIs can be found, understood, and reused.
Initially we will introduce participants to the role of data management planning to open up dialogue between researchers and library services to ensure potentially valuable research data are managed appropriately and made available for reuse where feasible. DMPs provide institutions with valuable insights into the scale of their data holdings, highlight any ethical and legal requirements that need to be met, and enable planning for dissemination and reuse. We will also introduce the DCC’s DMPonline, a tool to help researchers write DMPs, which can be customised by institutions and integrated with other systems to simplify and enhance the management and reuse of data.
In the second part of the presentation we will focus on making selected research data more visible for reuse and explore the potential value of local and national research data registries. In particular we will highlight the Jisc-funded RDRDS pilot to establish a UK national service that aggregates metadata relating to data collections held in research institutions and subject data centres. The session will conclude by considering some of the opportunities we may collaboratively explore in facilitating the management, aggregation and reuse of research data.
A golden age for working with public proteomics data
Data sharing in mass spectrometry (MS)-based proteomics is becoming a common scientific practice, as is already the case in other, more mature 'omics' disciplines like genomics and transcriptomics. We want to highlight that this situation, unprecedented in the field, opens a plethora of opportunities for data scientists. First, we explain in some detail some of the work already achieved, such as systematic reanalysis efforts. We also explain existing applications of public proteomics data, such as proteogenomics and the creation of spectral libraries and spectral archives. Finally, we discuss the main existing challenges and mention the first attempts to combine public proteomics data with other types of omics data sets.
Benefits and challenges of applying Semantic Web Services in the e-Government domain
Joining up services in e-Government usually implies governmental agencies acting in concert without a central control regime. This requires the sharing of scattered and heterogeneous data. Semantic Web Service (SWS) technology can help to integrate, mediate and reason between these datasets. However, since few real-world applications have been developed, it is still unclear what the actual benefits and issues of adopting such a technology in the e-Government domain are. In this paper, we contribute to raising awareness of the potential benefits in the e-Government community by analyzing motivations, requirements, and expected results, before proposing a reusable SWS-based framework. We demonstrate the application of this framework through a compelling use case: a GIS-based emergency planning system. We illustrate the obtained benefits and the key challenges which remain to be addressed.
Analysis and Detection of Information Types of Open Source Software Issue Discussions
Most modern Issue Tracking Systems (ITSs) for open source software (OSS)
projects allow users to add comments to issues. Over time, these comments
accumulate into discussion threads embedded with rich information about the
software project, which can potentially satisfy the diverse needs of OSS
stakeholders. However, discovering and retrieving relevant information from the
discussion threads is a challenging task, especially when the discussions are
lengthy and the number of issues in ITSs is vast. In this paper, we address
this challenge by identifying the information types presented in OSS issue
discussions. Through qualitative content analysis of 15 complex issue threads
across three projects hosted on GitHub, we uncovered 16 information types and
created a labeled corpus containing 4656 sentences. Our investigation of
supervised, automated classification techniques indicated that, when prior
knowledge about the issue is available, Random Forest can effectively detect
most sentence types using conversational features such as the sentence length
and its position. When classifying sentences from new issues, Logistic
Regression can yield satisfactory performance using textual features for
certain information types, while falling short on others. Our work represents a
nontrivial first step towards tools and techniques for identifying and
obtaining the rich information recorded in the ITSs to support various software
engineering activities and to satisfy the diverse needs of OSS stakeholders.
Comment: 41st ACM/IEEE International Conference on Software Engineering
(ICSE2019)
Report of the user requirements and web based access for eResearch workshops
The User Requirements and Web Based Access for eResearch Workshop, organized jointly by NeSC and NCeSS, was held on 19 May 2006. The aim was to identify lessons learned from e-Science projects that would contribute to our capacity to make Grid infrastructures and tools usable and accessible for diverse user communities. Its focus was on providing an opportunity for a pragmatic discussion between e-Science end users and tool builders in order to understand usability challenges, technological options, community-specific content and needs, and methodologies for design and development. We invited members of six UK e-Science projects and one US project, trying as far as possible to pair a user and developer from each project in order to discuss their contrasting perspectives and experiences. Three breakout group sessions covered the topics of user-developer relations, commodification, and functionality. There was also extensive post-meeting discussion, summarized here.
Additional information on the workshop, including the agenda, participant list, and talk slides, can be found online at http://www.nesc.ac.uk/esi/events/685/
Reference: NeSC report UKeS-2006-07 available from http://www.nesc.ac.uk/technical_papers/UKeS-2006-07.pd
SPEIR: Scottish Portals for Education, Information and Research. Final Project Report: Elements and Future Development Requirements of a Common Information Environment for Scotland
The SPEIR (Scottish Portals for Education, Information and Research) project was funded by the Scottish Library and Information Council (SLIC). It ran from February 2003 to September 2004, slightly longer than the 18 months originally scheduled, and was managed by the Centre for Digital Library Research (CDLR). With SLIC's agreement, community stakeholders were represented in the project by the Confederation of Scottish Mini-Cooperatives (CoSMiC), an organisation whose members include SLIC, the National Library of Scotland (NLS), the Scottish Further Education Unit (SFEU), the Scottish Confederation of University and Research Libraries (SCURL), regional cooperatives such as the Ayrshire Libraries Forum (ALF), and representatives from the Museums and Archives communities in Scotland.

Aims: A Common Information Environment for Scotland

The aims of the project were to:

• Conduct basic research into the distributed information infrastructure requirements of the Scottish Cultural Portal pilot and the public library CAIRNS integration proposal;
• Develop associated pilot facilities by enhancing existing facilities or developing new ones;
• Ensure that both infrastructure proposals and pilot facilities were sufficiently generic to be utilised in support of other portals developed by the Scottish information community;
• Ensure the interoperability of infrastructural elements beyond Scotland through adherence to established or developing national and international standards.

Since the Scottish information landscape is taken by CoSMiC members to encompass relevant activities in Archives, Libraries, Museums, and related domains, the project was, in essence, concerned with identifying, researching, and developing the elements of an internationally interoperable common information environment for Scotland, and with determining the best path for future progress.
User Review-Based Change File Localization for Mobile Applications
In the current mobile app development, novel and emerging DevOps practices
(e.g., Continuous Delivery, Integration, and user feedback analysis) and tools
are becoming more widespread. For instance, the integration of user feedback
(provided in the form of user reviews) in the software release cycle represents
a valuable asset for the maintenance and evolution of mobile apps. To fully
make use of these assets, it is highly desirable for developers to establish
semantic links between the user reviews and the software artefacts to be
changed (e.g., source code and documentation), and thus to localize the
potential files to change for addressing the user feedback. In this paper, we
propose RISING (Review Integration via claSsification, clusterIng, and
linkiNG), an automated approach to support the continuous integration of user
feedback via classification, clustering, and linking of user reviews. RISING
leverages domain-specific constraint information and semi-supervised learning
to group user reviews into multiple fine-grained clusters concerning similar
users' requests. Then, by combining the textual information from both commit
messages and source code, it automatically localizes potential change files to
accommodate the users' requests. Our empirical studies demonstrate that the
proposed approach outperforms the state-of-the-art baseline work in terms of
clustering and localization accuracy, and thus produces more reliable results.
Comment: 15 pages, 3 figures, 8 tables