12,355 research outputs found

    Development of Distributed Research Center for analysis of regional climatic and environmental changes

    We present the approach and first results of a collaborative project carried out by a joint team of researchers from the Institute of Monitoring of Climatic and Ecological Systems, Russia, and the Earth Systems Research Center, UNH, USA. Its main objective is the development of a prototype hardware and software platform for a Distributed Research Center (DRC) for monitoring and projecting regional climatic and environmental changes in Northern extratropical areas. The DRC should provide specialists working in climate-related sciences, as well as decision-makers, with accurate and detailed climatic characteristics for a selected area, together with reliable and affordable tools for in-depth statistical analysis and studies of the effects of climate change. Within the framework of the project, new approaches to cloud processing and analysis of the large geospatial datasets (big geospatial data) inherent to climate change studies are being developed and deployed on the technical platforms of both institutions. We discuss the state of the art in this domain, describe the web-based information-computational systems developed by the partners, justify the methods chosen to reach the project goal, and briefly list the results obtained so far.
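
    The kind of regional statistical analysis such a DRC is meant to support can be illustrated with a short sketch. The file name, variable name ("tas"), and coordinate bounds below are hypothetical placeholders, not details of the project above; the sketch only shows a typical workflow for extracting a regional climate characteristic from a gridded dataset with the xarray library.

        # Hypothetical sketch: annual mean temperature for a region,
        # computed from a gridded NetCDF dataset.
        import xarray as xr

        # File name and variable name ("tas") are assumptions for illustration.
        ds = xr.open_dataset("regional_climate.nc")

        # Select a Northern extratropical box (bounds are placeholders).
        region = ds["tas"].sel(lat=slice(50, 70), lon=slice(60, 90))

        # Spatial mean over the box, then annual means over the time axis.
        annual_mean = region.mean(dim=["lat", "lon"]).groupby("time.year").mean()
        print(annual_mean.to_series())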

    An Optimized Data Structure for High Throughput 3D Proteomics Data: mzRTree

    As an emerging field, MS-based proteomics still requires software tools for efficiently storing and accessing experimental data. In this work, we focus on the management of LC-MS data, which are typically made available in standard XML-based portable formats. The structures currently employed to manage these data can be highly inefficient, especially when dealing with high-throughput profile data. LC-MS datasets are usually accessed through 2D range queries, so optimizing this type of operation can dramatically reduce the complexity of data analysis. We propose a novel data structure for LC-MS datasets, called mzRTree, which embodies a scalable index based on the R-tree data structure. mzRTree can be efficiently created from the XML-based data formats and is suitable for handling very large datasets. We show experimentally that, on all range queries, mzRTree outperforms other known structures used for LC-MS data, even on the queries those structures are optimized for. Moreover, mzRTree is also more space-efficient. As a result, mzRTree reduces the computational cost of data analysis for very large profile datasets.
    Comment: 10 pages, 7 figures, 2 tables. To be published in the Journal of Proteomics. Source code available at http://www.dei.unipd.it/mzrtre
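
    To make the access pattern concrete, here is a minimal sketch of a 2D range query over (retention time, m/z) points using the generic Python rtree package. It illustrates the R-tree indexing idea behind mzRTree, not mzRTree's own API; the synthetic data and value ranges are assumptions for illustration.

        import random
        from rtree import index

        # Synthetic (retention_time, m/z) points standing in for LC-MS peaks.
        peaks = [(random.uniform(0, 100), random.uniform(200, 2000))
                 for _ in range(10000)]

        idx = index.Index()
        for i, (rt, mz) in enumerate(peaks):
            # Points are inserted as degenerate bounding boxes (rt, mz, rt, mz).
            idx.insert(i, (rt, mz, rt, mz))

        # 2D range query: all peaks with rt in [10, 20] and m/z in [400, 500].
        hits = list(idx.intersection((10, 400, 20, 500)))
        print(len(hits), "peaks in range")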

    Social media analytics: a survey of techniques, tools and platforms

    This paper is written for (social science) researchers seeking to analyze the wealth of social media now available. It presents a comprehensive review of software tools for social networking media, wikis, really simple syndication feeds, blogs, newsgroups, chat and news feeds. For completeness, it also includes introductions to social media scraping, storage, data cleaning and sentiment analysis. Although principally a review, the paper also provides a methodology and a critique of social media tools. Analyzing social media, in particular Twitter feeds for sentiment analysis, has become a major research and business activity due to the availability of web-based application programming interfaces (APIs) provided by Twitter, Facebook and news services. This has led to an ‘explosion’ of data services, software tools for scraping and analysis, and social media analytics platforms. It is also a research area undergoing rapid change and evolution due to commercial pressures and the potential for using social media data for computational (social science) research. Using a simple taxonomy, this paper provides a review of leading software tools and how to use them to scrape, cleanse and analyze the spectrum of social media. In addition, it discusses the requirements of an experimental computational environment for social media research and presents, as an illustration, the system architecture of a social media (analytics) platform built by University College London. The principal contribution of this paper is to provide an overview (including code fragments) for scientists seeking to utilize social media scraping and analytics either in their research or business. The data retrieval techniques presented in this paper are valid at the time of writing (June 2014), but are subject to change, since social media scraping APIs evolve rapidly.
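
    As a flavour of the code fragments the survey refers to, the sketch below shows a generic pattern for retrieving posts from a REST-style social media API and applying a naive lexicon-based sentiment score. The endpoint URL, query parameters, and JSON field names are hypothetical; real APIs (Twitter, Facebook, news services) each have their own authentication and schemas.

        import requests

        # Hypothetical endpoint and parameters; real social media APIs differ
        # and typically require authentication tokens.
        resp = requests.get(
            "https://api.example.com/v1/posts",
            params={"q": "climate", "limit": 100},
            timeout=10,
        )
        posts = resp.json().get("results", [])  # assumed response schema

        POSITIVE = {"good", "great", "excellent", "happy"}
        NEGATIVE = {"bad", "poor", "terrible", "sad"}

        def naive_sentiment(text):
            # Crude lexicon score: +1 per positive word, -1 per negative word.
            words = text.lower().split()
            return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

        for post in posts:
            print(naive_sentiment(post.get("text", "")), post.get("text", "")[:60])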

    BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models

    Background: Quantitative models of biochemical and cellular systems are used to answer a variety of questions in the biological sciences. The number of published quantitative models is growing steadily, thanks to increasing interest in the use of models as well as the development of improved software systems and the availability of better, cheaper computer hardware. To maximise the benefits of this growing body of models, the field needs centralised model repositories that will encourage, facilitate and promote model dissemination and reuse. Ideally, the models stored in these repositories should be extensively tested and encoded in community-supported, standardised formats. In addition, the models and their components should be cross-referenced with other resources to allow their unambiguous identification.
    Description: BioModels Database (http://www.ebi.ac.uk/biomodels/) is aimed at addressing exactly these needs. It is a freely accessible online resource for storing, viewing, retrieving, and analysing published, peer-reviewed quantitative models of biochemical and cellular systems. The structure and behaviour of each simulation model distributed by BioModels Database are thoroughly checked; in addition, model elements are annotated with terms from controlled vocabularies and linked to relevant data resources. Models can be examined online or downloaded in various formats. Reaction network diagrams generated from the models are also available in several formats. BioModels Database also provides features such as online simulation and the extraction of components from large-scale models into smaller submodels. Finally, the system provides a range of web services that external software systems can use to access up-to-date data from the database.
    Conclusions: BioModels Database has become a recognised reference resource for systems biology. It is used by the community in a variety of ways; for example, to benchmark different simulation systems and to study the clustering of models based upon their annotations. Deposition of models in the database is now recommended by several scientific journal publishers. The models in BioModels Database are freely distributed and reusable; the underlying software infrastructure is also available from SourceForge (https://sourceforge.net/projects/biomodels/) under the GNU General Public License.
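
    The web services mentioned above can be exercised with a short HTTP client sketch. The endpoint path, query parameter, and response fields below are assumptions for illustration only; consult the BioModels documentation for the actual service interface.

        import requests

        # Hypothetical REST-style call; the actual BioModels web-service
        # endpoints and response schema may differ from this sketch.
        model_id = "BIOMD0000000001"  # example identifier format
        resp = requests.get(
            f"https://www.ebi.ac.uk/biomodels/{model_id}",
            params={"format": "json"},
            timeout=10,
        )
        resp.raise_for_status()
        model = resp.json()
        print(model.get("name"))  # assumed response field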

    A software definable MIMO testbed: architecture and functionality

    Following intensive theoretical studies of the recently emerged MIMO technology, a variety of performance measures have become important for investigating the challenges and trade-offs that arise at various levels throughout the MIMO system design process. This paper presents a review of the MIMO testbed recently set up at King’s College London. The architecture that distinguishes the testbed as a flexible and reconfigurable system is first presented, covering both the hardware and software aspects; this is followed by a discussion of implementation methods and an evaluation of the system's research capabilities.

    CLOUD-BASED SOLUTIONS IMPROVING TRANSPARENCY, OPENNESS AND EFFICIENCY OF OPEN GOVERNMENT DATA

    A central pillar of open government programs is the disclosure of data held by public agencies using Information and Communication Technologies (ICT). This disclosure relies on the creation of open data portals (e.g. Data.gov) and has come to be associated with the expression Open Government Data (OGD). The overall goal of these governmental initiatives is not limited to enhancing the transparency of public sectors; it also aims to raise awareness of how released data can be put to use, enabling the creation of new products and services by the private sector. Despite the use of technological platforms to facilitate access to government data, open data portals continue to be organized to serve the goals of public agencies, without opening the doors to public accountability, information transparency, and public scrutiny.
    This thesis considers the basic aspects of OGD, including the definition of technical models for organizing such complex contexts, the identification of techniques for combining data from several portals, and the proposal of user interfaces that focus on citizen-centred usability. To deal with these issues, the thesis presents a holistic approach to OGD that aims to go beyond the problems inherent in its simple disclosure by providing a tentative answer to the following questions: 1) To what extent do OGD-based applications contribute to the creation of innovative, value-added services? 2) What technical solutions could increase the strength of this contribution? 3) Can Web 2.0 and Cloud technologies favour the development of OGD apps? 4) How should a common framework be designed for developing OGD apps that rely on multiple OGD portals and external web resources? In particular, the thesis focuses on devising computational environments that leverage the content of OGD portals (supporting the initial phase of data disclosure) for the creation of new services that add value to the original data.
    The thesis is organized as follows. To offer a general view of OGD, some important aspects of open data initiatives are presented, including their state of the art, the existing approaches for publishing and consuming OGD across web resources, and the factors shaping the value generated through government data portals. Then, an architectural framework is proposed that gathers OGD from multiple sites and supports the development of cloud-based apps that leverage these data along potentially different exploitation routes, ranging from traditional business to specialized support for citizens. The proposed framework is validated by two cloud-based apps, namely ODMap (Open Data Mapping) and NESSIE (A Network-based Environment Supporting Spatial Information Exploration). ODMap supports citizens in searching for and accessing OGD from several web sites. NESSIE organizes data captured from real estate agencies and public agencies (i.e. municipalities, cadastral offices and chambers of commerce) to provide citizens with a geographic representation of real estate offers and relevant statistics about price trends.
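
    Frameworks that gather OGD from multiple portals typically start from each portal's catalogue API. As a minimal sketch, the snippet below queries the CKAN package_search action exposed by Data.gov; the search term and the handling of the response are illustrative assumptions, not part of the ODMap or NESSIE implementations described above.

        import requests

        # Data.gov runs CKAN, whose catalogue is exposed via the action API.
        # The search term and printed fields here are illustrative only.
        resp = requests.get(
            "https://catalog.data.gov/api/3/action/package_search",
            params={"q": "real estate", "rows": 5},
            timeout=10,
        )
        resp.raise_for_status()
        for dataset in resp.json()["result"]["results"]:
            print(dataset["title"])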

    NiftyNet: a deep-learning platform for medical imaging

    Medical image analysis and computer-assisted intervention problems are increasingly being addressed with deep-learning-based solutions. Established deep-learning platforms are flexible but do not provide specific functionality for medical image analysis, and adapting them for this application requires substantial implementation effort. As a result, there has been substantial duplication of effort and incompatible infrastructure developed across many research groups. This work presents the open-source NiftyNet platform for deep learning in medical imaging. The ambition of NiftyNet is to accelerate and simplify the development of these solutions, and to provide a common mechanism for disseminating research outputs for the community to use, adapt and build upon. NiftyNet provides a modular deep-learning pipeline for a range of medical imaging applications, including segmentation, regression, image generation and representation learning. Components of the NiftyNet pipeline, including data loading, data augmentation, network architectures, loss functions and evaluation metrics, are tailored to, and take advantage of, the idiosyncrasies of medical image analysis and computer-assisted intervention. NiftyNet is built on TensorFlow and supports TensorBoard visualization of 2D and 3D images and computational graphs by default. We present three illustrative medical image analysis applications built using NiftyNet: (1) segmentation of multiple abdominal organs from computed tomography; (2) image regression to predict computed tomography attenuation maps from brain magnetic resonance images; and (3) generation of simulated ultrasound images for specified anatomical poses. NiftyNet enables researchers to rapidly develop and distribute deep learning solutions for segmentation, regression, image generation and representation learning applications, or to extend the platform to new applications.
    Comment: Wenqi Li and Eli Gibson contributed equally to this work. M. Jorge Cardoso and Tom Vercauteren contributed equally to this work. 26 pages, 6 figures; update includes additional applications, updated author list and formatting for journal submission
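
    Loss functions tailored to segmentation are one example of the medical-imaging-specific components the platform modularizes. Below is a generic TensorFlow sketch of a soft Dice loss of the kind commonly used for medical image segmentation; it is an illustrative stand-in, not NiftyNet's own implementation, and the toy tensor shapes are placeholders.

        import tensorflow as tf

        def soft_dice_loss(y_true, y_pred, eps=1e-6):
            # Soft Dice loss over a batch of binary segmentation maps.
            # y_true, y_pred: float tensors of shape (batch, ..., channels),
            # with y_pred containing per-voxel foreground probabilities.
            axes = tuple(range(1, len(y_pred.shape)))  # all but the batch axis
            intersection = tf.reduce_sum(y_true * y_pred, axis=axes)
            denom = tf.reduce_sum(y_true, axis=axes) + tf.reduce_sum(y_pred, axis=axes)
            dice = (2.0 * intersection + eps) / (denom + eps)
            return 1.0 - tf.reduce_mean(dice)  # 0 when prediction is exact

        # Toy usage on a random 3D volume (shapes and values are placeholders).
        y_true = tf.cast(tf.random.uniform((2, 16, 16, 16, 1)) > 0.5, tf.float32)
        y_pred = tf.random.uniform((2, 16, 16, 16, 1))
        print(float(soft_dice_loss(y_true, y_pred)))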