2,951 research outputs found
Robust parent-identifying codes and combinatorial arrays
An $n$-word $y$ over a finite alphabet of cardinality $q$ is called a descendant of a set of $t$ words $x^1,\dots,x^t$ if $y_i\in\{x^1_i,\dots,x^t_i\}$ for all $i=1,\dots,n$. A code $\mathcal{C}=\{x^1,\dots,x^M\}$ is said to have the $t$-IPP property if for any $n$-word $y$ that is a descendant of at most $t$ parents belonging to the code it is possible to identify at least one of them. From earlier works it is known that $t$-IPP codes of positive rate exist if and only if $t\le q-1$.
We introduce a robust version of IPP codes which allows unconditional identification of parents even if some of the coordinates in $y$ can break away from the descent rule, i.e., can take arbitrary values from the alphabet, or become completely unreadable. We show existence of robust $t$-IPP codes for all $t\le q-1$ and some positive proportion of such coordinates.
The proofs involve relations between IPP codes and combinatorial arrays with separating properties such as perfect hash functions and hash codes, partially hashing families and separating codes.
For $t=2$ we find the exact proportion of mutant coordinates (for several error scenarios) that permits unconditional identification of parents.
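The descendant rule and the IPP property defined in this abstract can be checked by brute force on toy codes. The sketch below is only an illustration under that definition, not the construction or proof technique of the paper; the function names (`is_descendant`, `has_ipp`) and the two small example codes are assumptions made here. It treats a code as $t$-IPP when, for every descendant of at most $t$ codewords, all coalitions of size at most $t$ that could have produced it share at least one codeword.

```python
from itertools import combinations, product


def is_descendant(y, parents):
    """True if each coordinate of y agrees with some parent at that coordinate."""
    return all(any(y[i] == p[i] for p in parents) for i in range(len(y)))


def descendants(parents):
    """All words that can be formed coordinate-wise from the given parents."""
    return set(product(*(set(column) for column in zip(*parents))))


def candidate_coalitions(code, y, t):
    """All sets of at most t codewords that could have produced y."""
    return [set(c) for k in range(1, t + 1)
            for c in combinations(code, k) if is_descendant(y, c)]


def has_ipp(code, t):
    """Brute-force t-IPP test (exponential cost, toy sizes only)."""
    for k in range(1, t + 1):
        for coalition in combinations(code, k):
            for y in descendants(coalition):
                candidates = candidate_coalitions(code, y, t)
                # A parent is identifiable only if some codeword lies
                # in every coalition that could have produced y.
                if not set.intersection(*candidates):
                    return False
    return True


# Hypothetical toy codes: a ternary code that is 2-IPP, and a binary one that is not.
print(has_ipp(["000", "111", "222"], 2))
print(has_ipp(["00", "11", "01"], 2))
```

In the binary example the word 01 is itself a codeword but is also a descendant of the pair {00, 11}, so no parent can be named with certainty, in line with the condition $t\le q-1$ failing for $q=2$, $t=2$.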
Recent Developments in Nonregular Fractional Factorial Designs
Nonregular fractional factorial designs such as Plackett-Burman designs and
other orthogonal arrays are widely used in various screening experiments for
their run size economy and flexibility. The traditional analysis focuses on
main effects only. Hamada and Wu (1992) went beyond the traditional approach
and proposed an analysis strategy to demonstrate that some interactions could
be entertained and estimated beyond a few significant main effects. Their
groundbreaking work stimulated much of the recent developments in design
criterion creation, construction and analysis of nonregular designs. This paper
reviews important developments in optimality criteria and comparison, including
projection properties, generalized resolution, various generalized minimum
aberration criteria, optimality results, construction methods and analysis
strategies for nonregular designs.
Comment: Submitted to the Statistics Surveys (http://www.i-journals.org/ss/)
by the Institute of Mathematical Statistics (http://www.imstat.org)
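As a small illustration of why interactions can be entertained in nonregular designs, the sketch below builds the 12-run Plackett-Burman design from its standard cyclic generator row and computes J-characteristics (sums over runs of the product of the chosen columns). The helper names are assumptions made for this example and do not come from the reviewed paper. Main effects are mutually orthogonal, while every two-factor interaction is only partially aliased with the remaining main effects (correlation of magnitude 4/12 = 1/3), which is what gives the design a generalized resolution of about 3.67.

```python
from itertools import combinations
from math import prod


def plackett_burman_12():
    """12-run, 11-factor Plackett-Burman design with entries +1/-1."""
    generator = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]
    # Eleven cyclic shifts of the generator row, then a row of all -1.
    rows = [generator[-s:] + generator[:-s] for s in range(11)]
    rows.append([-1] * 11)
    return rows


def j_characteristic(design, cols):
    """Sum over runs of the product of the selected columns."""
    return sum(prod(row[c] for c in cols) for row in design)


design = plackett_burman_12()

# Pairs of columns: J = 0 means main effects are orthogonal.
pair_values = {j_characteristic(design, c) for c in combinations(range(11), 2)}

# Triples of columns: |J| / 12 is the correlation between one main effect
# and the interaction of the other two factors.
triple_values = {abs(j_characteristic(design, c)) for c in combinations(range(11), 3)}

print(pair_values, triple_values)
```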
Long-range-enhanced surface codes
The surface code is a quantum error-correcting code for one logical qubit,
protected by spatially localized parity checks in two dimensions. Due to
fundamental constraints from spatial locality, storing more logical qubits
requires either sacrificing the robustness of the surface code against errors
or increasing the number of physical qubits. We bound the minimal number of
spatially non-local parity checks necessary to add logical qubits to a surface
code while maintaining, or improving, robustness to errors. We asymptotically
saturate this bound using a family of hypergraph product codes, interpolating
between the surface code and constant-rate low-density parity-check codes.
Fault-tolerant protocols for logical operations generalize naturally to these
longer-range codes, based on those from ordinary surface codes. We provide
near-term practical implementations of this code for hardware based on trapped
ions or neutral atoms in mobile optical tweezers. Long-range-enhanced surface
codes outperform conventional surface codes using hundreds of physical qubits,
and represent a practical strategy to enhance the robustness of logical qubits
to errors in near-term devices.
Comment: 16 pages, 12 figures; v2 changes: fixed typos and added citation
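The hypergraph product mentioned in this abstract can be sketched in a few lines. The example below is a minimal illustration, not the specific code family of the paper: it uses the standard hypergraph product of two classical parity-check matrices and takes both inputs to be the repetition code, which is known to reproduce an (unrotated) surface code. All function names are assumptions made for this example, and matrices are plain lists of 0/1 rows.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def kron(a, b):
    """Kronecker product of two 0/1 matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def hstack(a, b):
    return [row_a + row_b for row_a, row_b in zip(a, b)]


def gf2_rank(rows):
    """Rank over GF(2), with each row packed into an integer bitmask."""
    vecs = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank


def hypergraph_product(h1, h2):
    """X- and Z-type check matrices of the hypergraph product of h1 and h2."""
    r1, n1 = len(h1), len(h1[0])
    r2, n2 = len(h2), len(h2[0])
    hx = hstack(kron(h1, identity(n2)), kron(identity(r1), transpose(h2)))
    hz = hstack(kron(identity(n1), h2), kron(transpose(h1), identity(r2)))
    return hx, hz


def repetition_checks(d):
    """Parity checks of the length-d repetition code (open chain)."""
    return [[1 if j in (i, i + 1) else 0 for j in range(d)] for i in range(d - 1)]


rep = repetition_checks(3)
hx, hz = hypergraph_product(rep, rep)
n = len(hx[0])
# X and Z checks commute when every pair of rows overlaps on an even number of qubits.
commute = all(sum(x * z for x, z in zip(rx, rz)) % 2 == 0 for rx in hx for rz in hz)
k = n - gf2_rank(hx) - gf2_rank(hz)
print(n, k, commute)
```

Replacing one or both inputs with the parity checks of a better classical code is the general route by which hypergraph products trade locality for additional logical qubits.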
Compressed Text Indexes: From Theory to Practice!
A compressed full-text self-index represents a text in a compressed form and
still answers queries efficiently. This technology represents a breakthrough
over the text indexing techniques of the previous decade, whose indexes
required several times the size of the text. Although it is relatively new,
this technology has matured up to a point where theoretical research is giving
way to practical developments. Nonetheless this requires significant
programming skills, a deep engineering effort, and a strong algorithmic
background to dig into the research results. To date only isolated
implementations and focused comparisons of compressed indexes have been
reported, and they missed a common API, which prevented their re-use or
deployment within other applications.
The goal of this paper is to fill this gap. First, we present the existing
implementations of compressed indexes from a practitioner's point of view.
Second, we introduce the Pizza&Chili site, which offers tuned implementations
and a standardized API for the most successful compressed full-text
self-indexes, together with effective testbeds and scripts for their automatic
validation and test. Third, we show the results of our extensive experiments on
these codes with the aim of demonstrating the practical relevance of this novel
and exciting technology.
Identifying Alternative Hyper-Splicing Signatures in MG-Thymoma by Exon Arrays
BACKGROUND: The vast majority of human genes (>70%) are alternatively spliced. Although alternative pre-mRNA processing is modified in multiple tumors, alternative hyper-splicing signatures specific to particular tumor types are still lacking. Here, we report the use of Affymetrix Human Exon Arrays to spot hyper-splicing events characteristic of myasthenia gravis (MG)-thymoma, thymic tumors which develop in patients with MG, and to discriminate them from colon cancer changes. METHODOLOGY/PRINCIPAL FINDINGS: We combined GO term-to-parent threshold-based and threshold-independent ad-hoc functional statistics with in-depth analysis of key modified transcripts to highlight various exon-specific changes. These denote alternative splicing in MG-thymoma tumors compared to healthy human thymus and to in-house and Affymetrix datasets from colon cancer and healthy tissues. By using both global and specific, term-to-parent Gene Ontology (GO) statistical comparisons, our functional integrative ad-hoc method allowed the detection of disease-relevant splicing events. CONCLUSIONS/SIGNIFICANCE: Hyper-spliced transcripts spanned several categories, including the tumorigenic ERBB4 tyrosine kinase receptor and the connective tissue growth factor CTGF, as well as the immune function-related histocompatibility gene HLA-DRB1 and interleukin (IL)19, two muscle-specific collagens and one myosin heavy chain gene; intriguingly, a putative new exon was discovered in the MG-involved acetylcholinesterase ACHE gene. Corresponding changes in spliceosome composition were indicated by co-decreases in the splicing factors ASF/SF(2) and SC35. Parallel tumor-associated changes occurred in colon cancer as well, but the majority of the apparent hyper-splicing events were particular to MG-thymoma and could be validated by Fluorescent In-Situ Hybridization (FISH), Reverse Transcription-Polymerase Chain Reaction (RT-PCR) and mass spectrometry (MS) followed by peptide sequencing.
Our findings demonstrate a particular alternative hyper-splicing signature for transcripts over-expressed in MG-thymoma, supporting the hypothesis that alternative hyper-splicing contributes to shaping the biological functions of these and other specialized tumors and opening new avenues for the development of diagnosis and treatment approaches.
Modeling genetic inheritance of copy number variations
Copy number variations (CNVs) are being used as genetic markers or functional candidates in gene-mapping studies. However, unlike single nucleotide polymorphism or microsatellite genotyping techniques, most CNV detection methods are limited to detecting total copy numbers, rather than copy number in each of the two homologous chromosomes. To address this issue, we developed a statistical framework for intensity-based CNV detection platforms using family data. Our algorithm identifies CNVs for a family simultaneously, thus avoiding the generation of calls with Mendelian inconsistency while maintaining the ability to detect de novo CNVs. Applications to simulated data and real data indicate that our method significantly improves both call rates and accuracy of boundary inference, compared to existing approaches. We further illustrate the use of Mendelian inheritance to infer SNP allele compositions in each of the two homologous chromosomes in CNV regions using real data. Finally, we applied our method to a set of families genotyped using both the Illumina HumanHap550 and Affymetrix genome-wide 5.0 arrays to demonstrate its performance on both inherited and de novo CNVs. In conclusion, our method produces accurate CNV calls, gives probabilistic estimates of CNV transmission and builds a solid foundation for the development of linkage and association tests utilizing CNVs.
Decoding Complex Chemical Mixtures with a Physical Model of a Sensor Array
Combinatorial sensor arrays, such as the olfactory system, can detect a large number of analytes using a relatively small number of receptors. However, the complex pattern of receptor responses to even a single analyte, coupled with the non-linearity of responses to mixtures of analytes, makes quantitative prediction of compound concentrations in a mixture a challenging task. Here we develop a physical model that explicitly takes receptor-ligand interactions into account, and apply it to infer concentrations of highly related sugar nucleotides from the output of four engineered G-protein-coupled receptors. We also derive design principles that enable accurate mixture discrimination with cross-specific sensor arrays. The optimal sensor parameters exhibit relatively weak dependence on component concentrations, making a single designed array useful for analyzing a sizable range of mixtures. The maximum number of mixture components that can be successfully discriminated is twice the number of sensors in the array. Finally, antagonistic receptor responses, well known to play an important role in natural olfactory systems, prove to be essential for the accurate prediction of component concentrations.
Advances in Evolutionary Algorithms
With the recent trends towards massive data sets and significant computational power, combined with evolutionary algorithmic advances, evolutionary computation is becoming much more relevant to practice. The aim of the book is to present recent improvements, innovative ideas and concepts in one part of the huge EA field.