8 research outputs found
diArk – a resource for eukaryotic genome research
Background: The number of completed eukaryotic genome sequences and cDNA projects has increased exponentially in the past few years although most of them have not been published yet. In addition, many microarray analyses yielded thousands of sequenced EST and cDNA clones. For the researcher interested in single gene analyses (from a phylogenetic, a structural biology or other perspective) it is therefore important to have up-to-date knowledge about the various resources providing primary data. Description: The database is built around 3 central tables: species, sequencing projects and publications. The species table contains commonly and alternatively used scientific names, common names and the complete taxonomic information. For projects the sequence type and links to species project web-sites and species homepages are stored. All publications are linked to projects. The web-interface provides comprehensive search modules with detailed options and three different views of the selected data. We have especially focused on developing an elaborate taxonomic tree search tool that allows the user to instantaneously identify e.g. the closest relative to the organism of interest. Conclusion: We have developed a database, called diArk, to store, organize, and present the most relevant information about completed genome projects and EST/cDNA data from eukaryotes. Currently, diArk provides information about 415 eukaryotes, 823 sequencing projects, and 248 publications.
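The three-table layout described in this abstract can be sketched roughly as follows. This is a minimal illustration only, not diArk's actual schema: the table and column names, the `build_demo_db` and `projects_in_taxon` helpers, the semicolon-separated taxonomy string, the one-project-per-publication link, and the sample rows (including the example.org URLs and placeholder titles) are all assumptions made for the sketch.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical species/projects/publications tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE species (
            id INTEGER PRIMARY KEY,
            scientific_name TEXT NOT NULL,
            common_name TEXT,
            taxonomy TEXT NOT NULL          -- assumed format: 'Eukaryota;Fungi;...'
        );
        CREATE TABLE projects (
            id INTEGER PRIMARY KEY,
            species_id INTEGER NOT NULL REFERENCES species(id),
            sequence_type TEXT NOT NULL,    -- e.g. 'genome' or 'EST/cDNA'
            project_url TEXT
        );
        CREATE TABLE publications (
            id INTEGER PRIMARY KEY,
            project_id INTEGER NOT NULL REFERENCES projects(id),
            title TEXT NOT NULL
        );
        """
    )
    # Illustrative rows only; they are not taken from diArk.
    conn.executemany(
        "INSERT INTO species VALUES (?, ?, ?, ?)",
        [
            (1, "Saccharomyces cerevisiae", "baker's yeast", "Eukaryota;Fungi;Ascomycota"),
            (2, "Homo sapiens", "human", "Eukaryota;Metazoa;Chordata"),
        ],
    )
    conn.executemany(
        "INSERT INTO projects VALUES (?, ?, ?, ?)",
        [
            (1, 1, "genome", "https://example.org/yeast-genome"),
            (2, 1, "EST/cDNA", "https://example.org/yeast-est"),
            (3, 2, "genome", "https://example.org/human-genome"),
        ],
    )
    conn.executemany(
        "INSERT INTO publications VALUES (?, ?, ?)",
        [(1, 1, "Placeholder publication A"), (2, 3, "Placeholder publication B")],
    )
    return conn


def projects_in_taxon(conn, taxon):
    """Return (species, sequence type, publication count) for projects under a taxon."""
    query = """
        SELECT s.scientific_name, p.sequence_type, COUNT(pub.id)
        FROM species AS s
        JOIN projects AS p ON p.species_id = s.id
        LEFT JOIN publications AS pub ON pub.project_id = p.id
        WHERE ';' || s.taxonomy || ';' LIKE '%;' || ? || ';%'
        GROUP BY p.id
        ORDER BY s.scientific_name, p.id
    """
    return conn.execute(query, (taxon,)).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in projects_in_taxon(demo, "Fungi"):
        print(row)
```

Storing the lineage as a delimited string keeps the sketch short; a real taxonomic tree search would more plausibly use a parent-child table or a nested-set encoding.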
menoci: Lightweight Extensible Web Portal enabling FAIR Data Management for Biomedical Research Projects
Background: Biomedical research projects deal with data management
requirements from multiple sources like funding agencies' guidelines, publisher
policies, discipline best practices, and their own users' needs. We describe
functional and quality requirements based on many years of experience
implementing data management for the CRC 1002 and CRC 1190. A fully equipped
data management software should improve documentation of experiments and
materials, enable data storage and sharing according to the FAIR Guiding
Principles while maximizing usability, information security, as well as
software sustainability and reusability. Results: We introduce the modular web
portal software menoci for data collection, experiment documentation, data
publication, sharing, and preservation in biomedical research projects. Menoci
modules are based on the Drupal content management system which enables
lightweight deployment and setup, and creates the possibility to combine
research data management with a customisable project home page or collaboration
platform. Conclusions: Management of research data and digital research
artefacts is transforming from individual researchers' or groups' best practices
towards project- or organisation-wide service infrastructures. To enable and
support this structural transformation process, a vital ecosystem of open
source software tools is needed. Menoci is a contribution to this ecosystem of
research data management tools that is specifically designed to support
biomedical research projects. Comment: Preprint. 19 pages, 2 figures
diArk 2.0 provides detailed analyses of the ever increasing eukaryotic genome sequencing data
Background: With current next-generation sequencing methods, the sequencing of even the largest mammalian genomes has become a matter of days. It comes as no surprise that dozens of genome assemblies are now released per month. Since the number of next-generation sequencing machines is increasing worldwide and new major sequencing plans are being announced, a further increase in the speed of releasing genome assemblies is expected. Thus it becomes increasingly important to get an overview as well as detailed information about available sequenced genomes. The different sequencing and assembly methods have specific characteristics that need to be known to evaluate the various genome assemblies before performing subsequent analyses. Results: diArk has been developed to provide fast and easy access to all sequenced eukaryotic genomes worldwide. Currently, diArk 2.0 contains information about more than 880 species and more than 2350 genome assembly files. Many meta-data like sequencing and read-assembly methods, sequencing coverage, GC-content, extended lists of alternatively used scientific names and common species names, and various kinds of statistics are provided. To make the data intuitively approachable, the web interface makes extensive use of modern web techniques. A number of search modules and result views facilitate finding and judging the data of interest. Subscribing to the RSS feed is the easiest way to stay up-to-date with the latest genome data. Conclusions: diArk 2.0 is the most up-to-date database of sequenced eukaryotic genomes compared to databases like GOLD, NCBI Genome, NHGRI, and ISC. It is different in that only those projects are stored for which genome assembly data or considerable amounts of cDNA data are available. Projects in planning stage or in the process of being sequenced are not included.
The user can easily search through the provided data and directly access the genome assembly files of the sequenced genome of interest. diArk 2.0 is available at http://www.diark.org.
Peakr: simulating solid-state NMR spectra of proteins.
When analyzing solid-state nuclear magnetic resonance (NMR) spectra of proteins, assignment of resonances to nuclei and derivation of restraints for 3D structure calculations are challenging and time-consuming processes. Simulated spectra that have been calculated based on, for example, chemical shift predictions and structural models can be of considerable help. Existing solutions are typically limited in the type of experiment they can consider and difficult to adapt to different settings. Here, we present Peakr, software for simulating solid-state NMR spectra of proteins. It can generate simulated spectra based on numerous common types of internuclear correlations relevant for assignment and structure elucidation, can compare simulated and experimental spectra, and produces lists and visualizations useful for analyzing measured spectra. Compared with other solutions, it is fast, versatile and user friendly. Peakr is maintained under the GPL license and can be accessed at http://www.peakr.org. The source code can be obtained on request from the authors.
PERICLES Deliverable 6.4:Final version of integration framework and API implementation
The PERICLES integration framework is designed for the flexible execution of varied and varying processing and control components in typical preservation workflows, while itself being controllable by abstract models of the overall preservation system. It is the project’s focal point for connecting tools, models and application use-cases to demonstrate the potential of model-driven digital preservation. This final design for the integration framework has changed slightly from the initial version presented in PERICLES deliverable D6.1 [10]. We describe the changes and the reasons for them in the early chapters of this report. The integration framework is built from standard encapsulation technologies (Docker containers and RESTful web services) and controlled by a standard workflow environment (jBPM controlled by the Jenkins continuous integration system). On this execution layer, arbitrary workflows representing digital preservation activities can be deployed, run and evaluated. Standard tools (mediainfo, bagit, fido and so on) can be encapsulated and deployed, as can new preservation tools developed within the project. Two new subsystems have been designed to couple the workflow execution layer to the abstract models developed through the research activities of the project: the Process Compiler (PC) and the Entity Registry-Model Repository (ERMR). The ERMR also provides the key link to the Linked Resource Model Service, an external semantic reasoning service under development by partner XEROX Research. These two subsystems provide the means to couple powerful semantic reasoning and policy-driven models to a “live” digital preservation system. The API designs and technology choices for the test bed are now settled and implementation of the underlying (standard) test bed infrastructure is complete.
The APIs and communication patterns are based on RESTful web services and JSON payloads and are described in detail in Section 5 and in the appendices. Implementation of the new ERMR and PC subsystems is well underway. The focus for the integrated test bed over the final stages of the project will be on demonstrating the full end-to-end power of the model-driven preservation approach through the implementation of key application scenarios using models, tools and components drawn from across the PERICLES project. Examples of such scenarios are given in the appendices.
