Information-bit error rate and false positives in an MDS code
In this paper, a refinement of the weight distribution in an MDS code is
computed. Concretely, the number of codewords with a fixed amount of nonzero
bits in both information and redundancy parts is obtained. This refinement
improves the theoretical approximation of the information-bit and -symbol error
rate, in terms of the channel bit-error rate, in a block transmission through a
discrete memoryless channel. Since a bounded-distance reproducing decoder is
assumed, the computation of the so-called false positive (a decoding failure
with no information-symbol error) is provided. As a consequence, a new
performance analysis of an MDS code is proposed.
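As background for the refinement above, the classical (unrefined) weight distribution of an [n, k] MDS code over GF(q) is given by a closed formula; a minimal Python sketch follows. This is not the paper's refined count, which additionally splits weights between information and redundancy positions:

```python
from math import comb

def mds_weight_distribution(n, k, q):
    """Classical weight distribution A_w of an [n, k] MDS code over GF(q):
    A_0 = 1, A_w = 0 for 0 < w < d, and, for w >= d = n - k + 1,
      A_w = C(n, w) * (q - 1) * sum_{j=0}^{w-d} (-1)^j * C(w-1, j) * q^(w-d-j).
    """
    d = n - k + 1
    A = [0] * (n + 1)
    A[0] = 1
    for w in range(d, n + 1):
        s = sum((-1) ** j * comb(w - 1, j) * q ** (w - d - j)
                for j in range(w - d + 1))
        A[w] = comb(n, w) * (q - 1) * s
    return A

# Sanity check on a [4, 2] Reed-Solomon code over GF(5):
A = mds_weight_distribution(4, 2, 5)
assert sum(A) == 5 ** 2          # all q^k codewords accounted for
print(A)                          # [1, 0, 0, 16, 8]
```

The check confirms that the distribution sums to q^k, the total number of codewords.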
Noise-Resilient Group Testing: Limitations and Constructions
We study combinatorial group testing schemes for learning d-sparse Boolean
vectors using highly unreliable disjunctive measurements. We consider an
adversarial noise model that only limits the number of false observations, and
show that any noise-resilient scheme in this model can only approximately
reconstruct the sparse vector. On the positive side, we take this barrier to
our advantage and show that approximate reconstruction (within a satisfactory
degree of approximation) allows us to break the information-theoretic lower
bound known for exact reconstruction of d-sparse vectors of length n via
non-adaptive measurements by a multiplicative factor.
Specifically, we give simple randomized constructions of non-adaptive
measurement schemes, with m measurements, that allow efficient reconstruction
of d-sparse vectors up to a bounded number of false positives even in the
presence of a constant fraction of false positives and false negatives within
the measurement outcomes. We show that, information-theoretically, none of
these parameters can be substantially improved without dramatically affecting
the others. Furthermore, we obtain several explicit constructions, in
particular one matching the randomized trade-off. We also obtain explicit
constructions that allow fast reconstruction in time poly(m), which would be
sublinear in n for sufficiently sparse vectors. The main tool used in our
construction is the list-decoding view of randomness condensers and
extractors. Comment: Full version. A preliminary summary of this work appears
(under the same title) in proceedings of the 17th International Symposium on
Fundamentals of Computation Theory (FCT 2009).
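To illustrate the measurement model, here is a minimal Python sketch of disjunctive group testing with a plain Bernoulli random matrix and the standard COMP decoder. This is the noiseless baseline rather than the paper's noise-resilient condenser-based construction, and the parameter choices (n, d, m and the inclusion probability 1/d) are illustrative assumptions:

```python
import random

def measure(matrix, x):
    """Disjunctive (OR) measurements: outcome i is 1 iff test i contains
    at least one defective item."""
    return [int(any(row[j] and x[j] for j in range(len(x)))) for row in matrix]

def comp_decode(matrix, outcomes, n):
    """COMP decoder: rule out every item that occurs in a negative test.
    The survivors are the true defectives plus possible false positives,
    which is exactly the 'approximate reconstruction' regime."""
    alive = [True] * n
    for row, y in zip(matrix, outcomes):
        if y == 0:
            for j in range(n):
                if row[j]:
                    alive[j] = False
    return [j for j in range(n) if alive[j]]

random.seed(1)
n, d, m = 200, 4, 60
x = [0] * n
for j in random.sample(range(n), d):
    x[j] = 1
# Bernoulli random measurement matrix with inclusion probability ~ 1/d
M = [[int(random.random() < 1.0 / d) for _ in range(n)] for _ in range(m)]
est = comp_decode(M, measure(M, x), n)
# Every true defective survives; any extras are false positives.
assert set(j for j in range(n) if x[j]) <= set(est)
```

Note that under OR measurements a noiseless COMP pass can never miss a true defective, which is why false positives, not false negatives, are the natural error mode.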
AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments
This report considers the application of Artificial Intelligence (AI) techniques to
the problem of misuse detection and misuse localisation within telecommunications
environments. A broad survey of techniques is provided, covering inter alia
rule-based systems, model-based systems, case-based reasoning, pattern matching,
clustering and feature extraction, artificial neural networks, genetic algorithms,
artificial immune systems, agent-based systems, data mining, and a variety of hybrid
approaches. The report then considers the central issue of event correlation, which
is at the heart of many misuse detection and localisation systems. The notion of
being able to infer misuse by the correlation of individual temporally distributed
events within a multiple-data-stream environment is explored, and a range of techniques
is surveyed, covering model-based approaches, `programmed' AI, and machine learning
paradigms. It is found that, in general, correlation is best achieved via rule-based approaches,
but that these suffer from a number of drawbacks, such as the difficulty of
developing and maintaining an appropriate knowledge base, and the lack of ability
to generalise from known misuses to new unseen misuses. Two distinct approaches
are evident. One attempts to encode knowledge of known misuses, typically within
rules, and use this to screen events. This approach cannot generally detect misuses
for which it has not been programmed, i.e. it is prone to issuing false negatives.
The other attempts to `learn' the features of event patterns that constitute normal
behaviour, and, by observing patterns that do not match expected behaviour, detect
when a misuse has occurred. This approach is prone to issuing false positives,
i.e. inferring misuse from innocent patterns of behaviour that the system was not
trained to recognise. Contemporary approaches are seen to favour hybridisation,
often combining detection or localisation mechanisms for both abnormal and normal
behaviour, the former to capture known cases of misuse, the latter to capture
unknown cases. In some systems, these mechanisms even work together to update
each other to increase detection rates and lower false positive rates. It is concluded
that hybridisation offers the most promising future direction, but that a rule or state
based component is likely to remain, being the most natural approach to the correlation
of complex events. The challenge, then, is to mitigate the weaknesses of
canonical programmed systems such that learning, generalisation and adaptation
are more readily facilitated.
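To make the rule-based correlation idea concrete, here is a toy Python sketch of matching temporally distributed events against a misuse signature within a time window. The event names, window length, and rule format are hypothetical illustrations, not drawn from the report:

```python
from collections import deque

def correlate(events, rule, window):
    """Toy rule-based correlator: flag a misuse when every event type named
    in `rule` occurs (in any order) within `window` seconds.
    Events are (timestamp, type) pairs, assumed sorted by time."""
    recent = deque()
    alerts = []
    for t, etype in events:
        recent.append((t, etype))
        # Drop events that have fallen out of the correlation window.
        while recent and t - recent[0][0] > window:
            recent.popleft()
        if rule <= {e for _, e in recent}:
            alerts.append(t)
            recent.clear()  # avoid re-firing on the same evidence
    return alerts

# Hypothetical signature: repeated auth failure followed by a config change.
events = [(0, "auth_fail"), (2, "auth_fail"), (3, "config_change"),
          (50, "auth_fail"), (120, "config_change")]
print(correlate(events, {"auth_fail", "config_change"}, window=10))  # [3]
```

The sketch also shows the weakness the report identifies: the rule only fires on the misuse pattern it encodes, so unseen misuses pass silently (false negatives).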
MalwareLab: Experimentation with Cybercrime Attack Tools
Cybercrime attack tools (i.e. Exploit Kits) are reportedly
responsible for the majority of attacks affecting home
users. Exploit kits are traded in the black markets at
different prices and advertising different capabilities and
functionalities. In this paper we present our experimental
approach in testing 10 exploit kits leaked from the markets
that we deployed in an isolated environment, our
MalwareLab. The purpose of this experiment is to test
these tools in terms of resiliency against changing software
configurations in time. We present our experiment
design and implementation, discuss challenges, lessons
learned, and open problems, and present a preliminary
analysis of the results.
On the Design of Future Communication Systems with Coded Transport, Storage, and Computing
Communication systems are experiencing a fundamental change. Novel applications require increased performance from these systems, not only in throughput but also in latency, reliability, security, and heterogeneity support. To fulfil these requirements, future systems understand communication not only as the transport of bits but also as their storage, processing, and relation. In these systems, every network node has transport, storage, and computing resources that the network operator and its users can exploit through virtualisation and softwarisation of the resources. It is within this context that this work presents its results. We propose distributed coded approaches to improve communication systems. Our results improve the reliability and latency performance of the transport of information. They also increase the reliability, flexibility, and throughput of storage applications. Furthermore, based on the lesson that coded approaches improve the transport and storage performance of communication systems, we propose a distributed coded approach for the computing of novel in-network applications such as the steering and control of cyber-physical systems. Our proposed approach can increase the reliability and latency performance of distributed in-network computing in the presence of errors, erasures, and attackers.
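As a minimal illustration of how coding adds reliability to storage, consider the simplest possible coded scheme: a single XOR parity block spread across nodes. The thesis's distributed coded approaches are far more general; this sketch only shows the recover-from-one-erasure idea:

```python
def xor_encode(blocks):
    """Single-parity erasure code: store k equal-length data blocks plus one
    XOR parity block. Any one lost block can be rebuilt from the survivors."""
    parity = bytes(len(blocks[0]))
    for b in blocks:
        parity = bytes(x ^ y for x, y in zip(parity, b))
    return blocks + [parity]

def xor_recover(stored, lost_index):
    """Rebuild the block at `lost_index` by XOR-ing all surviving blocks
    (XOR is its own inverse, so the missing block drops out of the sum)."""
    length = len(stored[(lost_index + 1) % len(stored)])
    out = bytes(length)
    for i, b in enumerate(stored):
        if i != lost_index:
            out = bytes(x ^ y for x, y in zip(out, b))
    return out

# Three hypothetical 8-byte blocks held on three storage nodes:
data = [b"node-one", b"node-two", b"node-thr"]
stored = xor_encode(data)
rebuilt = xor_recover(stored, 1)   # node holding block 1 fails
assert rebuilt == b"node-two"
```

More losses require stronger codes (e.g., Reed-Solomon or random linear codes), which trade extra redundancy for tolerance of multiple erasures.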
Generative deep learning for biomedical data analysis
In this article, the authors selected two types of breast cancer samples: ductal carcinoma in situ (DCIS) and lobular carcinoma. They selected 35 DCIS samples and 35 lobular carcinoma samples from the TCGA database. After non-specific filtering of the RNA-seq data for low-expression genes, only 636 genes were left in each group. The retained genes were used for differential expression analysis. The authors used heat maps and principal component analysis to visualize the actual impact of each gene. Afterward, a variational auto-encoder (VAE) model was built to simulate the generation of new gene sequences. The specific model trained in this study consists of gene expression input (the 636 most variably expressed genes by median absolute deviation) compressed into two vectors of length 100 (the mean and variance of the coding space), which are made deterministic by a re-parameterization technique that draws ε-vectors from the uniform distribution. The coding layer is then decoded back to the original 636 dimensions by a single reconstruction layer. The encoder uses ReLU activation, while the decoder uses sigmoid activation. All weights are initialized with Glorot uniform initialization. Finally, the values of the generated sequences are relatively close to the original sequences.
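A minimal forward-pass sketch of the described architecture (636 inputs, latent size 100, single sigmoid reconstruction layer, Glorot-uniform weights). The hidden-layer width and the Gaussian ε draw are assumptions on my part: standard VAEs sample ε from a standard normal, whereas the text above mentions a uniform draw:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glorot_uniform(fan_in, fan_out):
    # Glorot (uniform) initialisation, as stated in the text above
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

D, Z, H = 636, 100, 256   # 636 input genes, latent size 100; H is assumed

W_h = glorot_uniform(D, H)
W_mu = glorot_uniform(H, Z)
W_logvar = glorot_uniform(H, Z)
W_dec = glorot_uniform(Z, D)   # single reconstruction layer back to 636 dims

def vae_forward(x):
    h = relu(x @ W_h)                      # encoder with ReLU activation
    mu, logvar = h @ W_mu, h @ W_logvar    # mean / log-variance vectors (len 100)
    eps = rng.standard_normal(Z)           # re-parameterization trick:
    z = mu + np.exp(0.5 * logvar) * eps    #   z = mu + sigma * eps
    return sigmoid(z @ W_dec)              # decoder with sigmoid activation

x = rng.random(D)          # stand-in for a normalised expression profile
x_hat = vae_forward(x)
assert x_hat.shape == (D,)
```

In training, the reconstruction loss plus a KL-divergence penalty on (mu, logvar) would be backpropagated; only the generative forward pass is sketched here.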
Applications of Derandomization Theory in Coding
Randomized techniques play a fundamental role in theoretical computer science
and discrete mathematics, in particular for the design of efficient algorithms
and construction of combinatorial objects. The basic goal in derandomization
theory is to eliminate or reduce the need for randomness in such randomized
constructions. In this thesis, we explore some applications of the fundamental
notions in derandomization theory to problems outside the core of theoretical
computer science, and in particular, certain problems related to coding theory.
First, we consider the wiretap channel problem which involves a communication
system in which an intruder can eavesdrop on a limited portion of the
transmissions, and construct efficient and information-theoretically optimal
communication protocols for this model. Then we consider the combinatorial
group testing problem. In this classical problem, one aims to determine a set
of defective items within a large population by asking a number of queries,
where each query reveals whether a defective item is present within a specified
group of items. We use randomness condensers to explicitly construct optimal,
or nearly optimal, group testing schemes for a setting where the query outcomes
can be highly unreliable, as well as the threshold model where a query returns
positive if the number of defectives passes a certain threshold. Finally, we
design ensembles of error-correcting codes that achieve the
information-theoretic capacity of a large class of communication channels, and
then use the obtained ensembles for construction of explicit capacity achieving
codes.
[This is a shortened version of the actual abstract in the thesis.] Comment: EPFL PhD Thesis.
QRS detection based on medical knowledge and cascades of moving average filters
Heartbeat detection is the first step in automatic analysis of the electrocardiogram (ECG). For mobile and wearable devices, the detection process should be both accurate and computationally efficient. In this paper, we present a QRS detection algorithm based on moving average filters, which afford a simple yet robust signal processing technique. The decision logic considers the rhythmic and morphological features of the QRS complex. QRS enhancement is performed with channel-specific moving average cascades selected from a pool of derivative systems we designed. We measured the effectiveness of our algorithm on well-known benchmark databases, reporting F1 scores, sensitivity on abnormal beats, and processing time. We also evaluated other available detectors under the same criteria for a direct comparison. The algorithm we propose achieved satisfactory performance, on par with or higher than the other QRS detectors. Although the performance we report is not the highest that has been published so far, our approach to QRS detection enhances computational efficiency while maintaining high accuracy.
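The moving-average-based enhancement can be sketched with a cascade of two windows: a short one matched to the QRS width (~120 ms) compared against a long one matched to a beat (~600 ms) on the squared derivative. This follows the general two-moving-average idea rather than the paper's specific pool of derivative systems, and the synthetic impulse train stands in for a real ECG:

```python
import numpy as np

def moving_average(x, w):
    """Moving average via convolution, keeping the input length."""
    return np.convolve(x, np.ones(w) / w, mode="same")

def detect_qrs(ecg, fs):
    """Two-moving-average QRS enhancer: square the derivative, then mark
    regions where the short (~QRS-width) average exceeds the long
    (~beat-length) average; emit one fiducial point per region."""
    sq = np.diff(ecg, prepend=ecg[0]) ** 2          # derivative + squaring
    ma_qrs = moving_average(sq, int(0.12 * fs))     # ~120 ms window
    ma_beat = moving_average(sq, int(0.60 * fs))    # ~600 ms window
    active = ma_qrs > ma_beat                       # candidate QRS regions
    peaks, i = [], 0
    while i < len(active):
        if active[i]:
            j = i
            while j < len(active) and active[j]:
                j += 1
            peaks.append(i + int(np.argmax(sq[i:j])))  # local max of block
            i = j
        else:
            i += 1
    return peaks

# Synthetic test: impulses ("R peaks") every second at 250 Hz on a flat baseline.
fs = 250
ecg = np.zeros(5 * fs)
ecg[fs::fs] = 1.0
print(detect_qrs(ecg, fs))   # [250, 500, 750, 1000]
```

A real detector would add the rhythmic decision logic (e.g., refractory periods and adaptive thresholds) on top of this enhancement stage.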