
19 research outputs found

An annotation infrastructure for the analysis and interpretation of Affymetrix exon array data

Author: Dibben Siân
Miller Crispin J
Okoniewski Michał J
Yates Tim
Publication venue: BioMed Central
Publication date: 01/01/2007

An annotation database (X:MAP) and a BioConductor/R package (exonmap) have been developed to support fine-grained analysis of exon array data.

Crossref

Springer - Publisher Connector

PubMed Central

Modulator structures for radio-on-fiber applications, Journal of Telecommunications and Information Technology, 2003, nr 1

Author: Davies Robert J.
Hum Sean V.
Okoniewski Michał
Publication venue: Instytut Łączności - Państwowy Instytut Badawczy, Warszawa

Traditional electro-optic modulators are not optimized for bandpass applications such as radio-on-fiber delivery systems due to their limited electro-optic response. This paper presents a resonant electrode structure that can be employed in optical modulator designs to enhance the electro-optic response of the modulator over a narrow frequency band and improve the performance of optical radio systems. A simple model of the structure is developed, and experimental results validating the model and illustrating the effectiveness of the structure are presented.

Biblioteka Cyfrowa Instytutu Łączności / National Institute of Telecommunications: Digital Library

SparkSeq: fast, scalable and cloud-ready tool for the interactive genomic data analysis with nucleotide precision

Author: Gawrysiak Piotr
Maffioletti Sergio
Messina Antonio
Okoniewski Michał J.
Pacholewska Alicja
Wiewiórka Marek S.
Publication date: 02/08/2017

Many time-consuming analyses of next-generation sequencing data can be addressed with modern cloud computing. Apache Hadoop-based solutions have become popular in genomics because of their scalability in a cloud infrastructure. So far, most of these tools have been used for batch data processing rather than interactive data querying. The SparkSeq software has been created to take advantage of a new MapReduce framework, Apache Spark, for next-generation sequencing data. SparkSeq is a general-purpose, flexible and easily extendable library for genomic cloud computing. It can be used to build genomic analysis pipelines in Scala and run them interactively. SparkSeq opens up the possibility of customized ad hoc secondary analyses and iterative machine learning algorithms. This article demonstrates its scalability and overall fast performance by running analyses of sequencing datasets. Tests of SparkSeq also prove that the use of cache and the HDFS block size can be tuned for optimal performance on multiple worker nodes.

Repository for Publications and Research Data

RERO DOC Digital Library

rnaSeqMap: a Bioconductor package for RNA sequencing data exploration

Author: A Goncalves
A Mortazavi
Anna Leśniewska
B Langmead
C Trapnell
J Bradford
J Li
JH Bullard
K Wang
M Fiume
M Guttman
M Morgan
M Okoniewski
MD Robinson
Michał J Okoniewski
MJ Okoniewski
P Gardina
PC Ng
S Anders
T Lu
T Yates
WS Cleveland
Y Aumann
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: The throughput of commercially available sequencers has recently increased significantly. It has reached the point where measuring RNA expression by the depth of coverage has become feasible even for the largest genomes. The development of software tools is constantly following the progress of biological hardware. In particular, genome browsers, exon junction tools and statistical tools operating on counts of reads in predefined regions can all be regarded as RNA sequencing software. The library rnaSeqMap, freely available via Bioconductor, is RNA sequencing software that is independent of any biological hardware platform. It is based upon standard Bioconductor infrastructure for sequencing data and includes several novel features focused on deeper understanding of coverage expression profiles and discovery of novel transcription regions. RESULTS: rnaSeqMap is a toolbox for analyses that may be performed with the use of gene annotations or, alternatively, in an unsupervised mode on any genomic region to find novel or non-standard transcripts. The data back-end may be a MySQL database or a set of files in standard BAM format. The processing in R can be run on a machine without any particular hardware requirements, and scales linearly with the number of genomic loci and number of samples analyzed. The main features of rnaSeqMap include coverage operations, discovering irreducible regions of high expression, significance search and splicing analyses with nucleotide granularity. CONCLUSIONS: This software may be used for a range of applications related to RNA sequencing by building customized analysis pipelines. Its applicability and precision are expected to increase in parallel with the progress of genome coverage in sequencers.
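The "coverage operations" with nucleotide granularity mentioned in the abstract above rest on a per-base read depth. The following is a minimal Python sketch of that idea only; it is not the rnaSeqMap API (which is an R package), and the function name, the closed 1-based coordinate convention and the toy reads are assumptions made for illustration.

```python
def coverage(reads, region_start, region_end):
    """Per-nucleotide read depth over the closed interval [region_start, region_end].

    `reads` is an iterable of (start, end) pairs in closed, 1-based
    coordinates (an assumed convention, chosen for this sketch).
    """
    length = region_end - region_start + 1
    # Difference array: +1 where a read begins, -1 just past where it ends.
    diff = [0] * (length + 1)
    for start, end in reads:
        lo = max(start, region_start)
        hi = min(end, region_end)
        if lo > hi:
            continue  # read lies entirely outside the region
        diff[lo - region_start] += 1
        diff[hi - region_start + 1] -= 1

    # A running sum of the difference array gives the depth at each base.
    depth = []
    running = 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth


toy_reads = [(1, 4), (3, 6), (5, 5)]
print(coverage(toy_reads, 1, 6))  # [1, 1, 2, 2, 2, 1]
```

The cost of the difference array grows with the number of reads plus the region length, which is in keeping with the linear scaling the abstract reports, although the package's actual implementation may differ.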

CiteSeerX

Repository for Publications and Research Data

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ZORA

Preferred analysis methods for single genomic regions in RNA sequencing revealed by processing the shape of coverage

Author: Leśniewska Anna
Morzy Tadeusz
Okoniewski Michał J.
Ryan Martin
Schlapbach Ralph
Schäfer Beat
Szabelska Alicja
Wachtel Marco
Zyprych-Walczak Joanna
Publication date: 02/08/2017

The informational content of RNA sequencing is currently far from being completely explored. Most analyses focus on processing tables of counts or on isoform deconvolution via exon junctions. This article presents a comparison of several techniques that can be used to estimate differential expression of exons or small genomic regions of expression, based on their coverage function shapes. The problem is defined as finding the differentially expressed exons between two samples using local expression profile normalization and statistical measures to spot the differences between two profile shapes. Initial experiments have been done using synthetic data, and real data modified with synthetically created differential patterns. Then, 160 pipelines (5 types of generator × 4 normalizations × 8 difference measures) are compared. As a result, the best analysis pipelines are selected based on linearity of the differential expression estimation and the area under the ROC curve. These platform-independent techniques have been implemented in the Bioconductor package rnaSeqMap. They point out the exons with differential expression or internal splicing, even if the counts of reads may not show this. The areas of application include significant difference searches, splicing identification algorithms and finding suitable regions for QPCR primers.
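The pipeline count in the abstract above follows from a full factorial design: every generator is combined with every normalization and every difference measure. A minimal Python sketch of that enumeration follows; the component labels are placeholders invented here, because the abstract gives only the counts and not the names of the individual methods.

```python
from itertools import product

# Placeholder labels (hypothetical): the abstract states only the counts
# 5, 4 and 8, not which generators, normalizations or measures were used.
generators = [f"generator_{i}" for i in range(1, 6)]
normalizations = [f"normalization_{i}" for i in range(1, 5)]
measures = [f"measure_{i}" for i in range(1, 9)]

# One pipeline per (generator, normalization, measure) combination.
pipelines = list(product(generators, normalizations, measures))

print(len(pipelines))  # 5 * 4 * 8 = 160
```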

RERO DOC Digital Library

Comprehensive Analysis of Affymetrix Exon Arrays Using BioConductor

Author: B Modrek
Crispin J Miller
Dalma-Weiszhausz
Fran Lewitter
J Lu
JM Johnson
JP Venables
K Kapur
M Dai
Michał J Okoniewski
MJ Okoniewski
NA Faustino
P Gardina
R Bender
R Gentleman
RA Irizarry
RC Gentleman
SD Pepper
T Clark
TJ Hubbard
WM Liu
WN Venables
Publication venue: Public Library of Science
Publication date: 01/02/2008

ISSN: 1553-734X
ISSN: 1553-735

Repository for Publications and Research Data

Crossref

Directory of Open Access Journals

PubMed Central

Preferred analysis methods for single genomic regions in RNA sequencing revealed by processing the shape of coverage

Author: Alicja Szabelska
Anders
Anna Leśniewska
Beat Schäfer
Bohnert
Choe
Dabney
Garber
Gardina
Guttman
Hower
Jiang
Joanna Zyprych-Walczak
Langmead
Leśniewska
Li
Marco Wachtel
Martin Ryan
Michał J. Okoniewski
Ralph Schlapbach
Roberts
Robertson
Robinson
Tadeusz Morzy
Tarazona
Trapnell
Wang
Wu
Publication venue: Oxford University Press
Publication date: 01/01/2012

Repository for Publications and Research Data

Crossref

PubMed Central

ZORA

Hybridization interactions between probesets in short oligo microarrays lead to spurious correlations

Author: A Nimgaonkar
Affymetrix
AI Su
AJ Butte
AT Adai
BH Mecham
C Wu
CL Wilson
Crispin J Miller
E Birney
G Liu
G Sherlock
H Wang
HS Leong
J Harbig
J Stuart
KD Pruitt
L Gautier
M Dai
Michał J Okoniewski
O Teuffel
R Gentleman
R Irizarry
S Carter
S Zakharkin
T Attwood
W Shannon
Z Wu
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Microarrays measure the binding of nucleotide sequences to a set of sequence-specific probes. This information is combined with annotation specifying the relationship between probes and targets and used to make inferences about transcript expression and, ultimately, gene expression. In some situations, a probe is capable of hybridizing to more than one transcript; in others, multiple probes can target a single sequence. These 'multiply targeted' probes can result in non-independence between measured expression levels. RESULTS: An analysis of these relationships for Affymetrix arrays considered both the extent and influence of exact matches between probe and transcript sequences. For the popular HGU133A array, approximately half of the probesets were found to interact in this way. Both real and simulated expression datasets were used to examine how these effects influenced the expression signal. Multiple targeting was found not only to increase signal strength for the affected probesets but, as its major effect, to significantly increase their correlation, even in situations when only a single probe from a probeset was involved. By building a network of probe-probeset-transcript relationships, it is possible to identify families of interacting probesets. More than 10% of the families contain members annotated to different genes or even different Unigene clusters. Within a family, a mixture of genuine biological and artefactual correlations can occur. CONCLUSION: Multiple targeting is not only prevalent, but also significant. The ability of probesets to hybridize to more than one gene product can lead to false positives when analysing gene expression. Comprehensive annotation describing multiple targeting is required when interpreting array data.
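The family-building step in the abstract above can be read as finding connected components in a graph whose edges link each probeset to the transcripts its probes match exactly. The Python sketch below assumes that reading; the function name, the union-find approach, the input format and the toy identifiers are illustrative and are not taken from the paper.

```python
from collections import defaultdict


def probeset_families(matches):
    """Group probesets that share a transcript, directly or transitively.

    `matches` is an iterable of (probeset_id, transcript_id) pairs, one per
    exact probe-to-transcript match (a hypothetical input format).
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    probesets = set()
    for probeset, transcript in matches:
        probesets.add(probeset)
        # Tag node kinds so a probeset and a transcript with the same ID
        # are never confused with each other.
        union(("probeset", probeset), ("transcript", transcript))

    families = defaultdict(set)
    for probeset in probesets:
        families[find(("probeset", probeset))].add(probeset)

    # Only groups with more than one probeset count as interacting families.
    return sorted(sorted(members) for members in families.values() if len(members) > 1)


toy_matches = [
    ("ps1", "tA"), ("ps2", "tA"),  # ps1 and ps2 share transcript tA
    ("ps2", "tB"), ("ps3", "tB"),  # ps3 joins the family through tB
    ("ps4", "tC"),                 # ps4 interacts with nothing
]
print(probeset_families(toy_matches))  # [['ps1', 'ps2', 'ps3']]
```

In this toy input, ps1 and ps3 never match the same transcript yet land in one family through ps2, which illustrates how members of a family could end up annotated to different genes.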

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Exon level integration of proteomics and microarray data

Author: Bitton Danny A
Connolly Yvonne
Miller Crispin J
Okoniewski Michał J
Publication venue: Springer Science and Business Media LLC
Publication date: 01/02/2008

BACKGROUND: Previous studies comparing quantitative proteomics and microarray data have generally found poor correspondence between the two. We hypothesised that this might in part be because the different assays were targeting different parts of the expressed genome and might therefore be subjected to confounding effects from processes such as alternative splicing. RESULTS: Using a genome database as a platform for integration, we combined quantitative protein mass spectrometry with Affymetrix Exon array data at the level of individual exons. We found significantly higher degrees of correlation than have been previously observed (r = 0.808). The study was performed using cell lines in equilibrium in order to reduce a major potential source of biological variation, thus allowing the analysis to focus on the data integration methods in order to establish their performance. CONCLUSION: We conclude that part of the variation observed when integrating microarray and proteomics data may occur as a consequence both of the data analysis and of the high granularity to which studies have until recently been limited. The approach opens up the possibility for the first time of considering combined microarray and proteomics datasets at the level of individual exons and isoforms, important given the high proportion of alternative splicing observed in the human genome.

Directory of Open Access Journals