Search CORE

989 research outputs found

Impossibility of independence amplification in Kolmogorov complexity theory

Author: A. Rao
A. Zvonkin
B. Chor
J. Bourgain
L. Bienvenu
L. Fortnow
M. Santha
M. Zimand
M. Zimand
N.K. Vereshchagin
R. Downey
R. Raz
Y. Dodis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

The paper studies randomness extraction from sources with bounded independence and the issue of independence amplification of sources, using the framework of Kolmogorov complexity. The dependency of strings

x

and

y

{\rm dep}(x,y) = \max\{C(x) - C(x \mid y), C(y) - C(y\mid x)\}

, where

C(\cdot)

denotes the Kolmogorov complexity. It is shown that there exists a computable Kolmogorov extractor

f

such that, for any two

n

-bit strings with complexity

s(n)

and dependency

\alpha(n)

, it outputs a string of length

s(n)

with complexity

s(n)- \alpha(n)

conditioned by any one of the input strings. It is proven that the above are the optimal parameters a Kolmogorov extractor can achieve. It is shown that independence amplification cannot be effectively realized. Specifically, if (after excluding a trivial case) there exist computable functions

f_1

and

f_2

such that

{\rm dep}(f_1(x,y), f_2(x,y)) \leq \beta(n)

for all

n

-bit strings

x

and

y

with

{\rm dep}(x,y) \leq \alpha(n)

, then

\beta(n) \geq \alpha(n) - O(\log n)

arXiv.org e-Print Archive

Crossref

Linear Transformations for Randomness Extraction

Author: Bruck Jehoshua
Zhou Hongchao
Publication venue
Publication date: 04/09/2012
Field of study

Information-efficient approaches for extracting randomness from imperfect sources have been extensively studied, but simpler and faster ones are required in the high-speed applications of random number generation. In this paper, we focus on linear constructions, namely, applying linear transformation for randomness extraction. We show that linear transformations based on sparse random matrices are asymptotically optimal to extract randomness from independent sources and bit-fixing sources, and they are efficient (may not be optimal) to extract randomness from hidden Markov sources. Further study demonstrates the flexibility of such constructions on source models as well as their excellent information-preserving capabilities. Since linear transformations based on sparse random matrices are computationally fast and can be easy to implement using hardware like FPGAs, they are very attractive in the high-speed applications. In addition, we explore explicit constructions of transformation matrices. We show that the generator matrices of primitive BCH codes are good choices, but linear transformations based on such matrices require more computational time due to their high densities.Comment: 2 columns, 14 page

arXiv.org e-Print Archive

CiteSeerX

Caltech Authors

Linear extractors for extracting randomness from noisy sources

Author: Bruck Jehoshua
Zhou Hongchao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2011
Field of study

Linear transformations have many applications in information theory, like data compression and error-correcting codes design. In this paper, we study the power of linear transformations in randomness extraction, namely linear extractors, as another important application. Comparing to most existing methods for randomness extraction, linear extractors (especially those constructed with sparse matrices) are computationally fast and can be simply implemented with hardware like FPGAs, which makes them very attractive in practical use. We mainly focus on simple, efficient and sparse constructions of linear extractors. Specifically, we demonstrate that random matrices can generate random bits very efficiently from a variety of noisy sources, including noisy coin sources, bit-fixing sources, noisy (hidden) Markov sources, as well as their mixtures. It shows that low-density random matrices have almost the same efficiency as high-density random matrices when the input sequence is long, which provides a way to simplify hardware/software implementation. Note that although we constructed matrices with randomness, they are deterministic (seedless) extractors - once we constructed them, the same construction can be used for any number of times without using any seeds. Another way to construct linear extractors is based on generator matrices of primitive BCH codes. This method is more explicit, but less practical due to its computational complexity and dimensional constraints

CiteSeerX

Crossref

Caltech Authors

Spatial and Temporal Modeling of Community Non-Timber Forest Extraction

Author: Elizabeth J. Z. Robinson
Heidi J. Albers
Jeffrey C. Williams
Publication venue
Publication date
Field of study

This paper examines the interaction of spatial and dynamic aspects of resource extraction from forests by local people. Highly cyclical and varied across both space and time, the patterns of resource extraction resulting from the spatial-temporal model bear little resemblance to the patterns drawn from focusing either on spatial or temporal aspects of extraction, as is typical in both the modeling and empirical literature to date. Combining the spatial-temporal model with a measure of success in community forest management.the ability to avoid open-access resource degradation.characterizes the impact of incomplete property rights on patterns of resource extraction and stocks. Key words: Spatial and temporal modeling; renewable resources; non-timber forest products; common property resources

Research Papers in Economics

Face Recognition Based on PCA Algorithm

Author: Ruprah Taranpreet Singh
Publication venue: Institute for Project Management Pvt. Ltd
Publication date: 11/09/2020
Field of study

This paper is proposed the face recognition method using PCA with neural network back error propagation learning algorithm .In this paper a feature is extracted using principal component analysis and then classification by creation of back propagation neural network. We run our algorithm for face recognition application using principal component analysis, neural network and also calculate its performance by using the photometric normalization technique: Histogram Equalization and comparing with Euclidean Distance, and Normalized correlation classifiers. The system produces promising results for face verification and face recognition. Demonstrate the recognition accuracy for given number of input pattern

Interscience Research Network

Recommended from our members

Distributed computing and cryptography with general weak random sources

Author: Li Xin, Ph. D.
Publication venue
Publication date: 14/08/2015
Field of study

The use of randomness in computer science is ubiquitous. Randomized protocols have turned out to be much more efficient than their deterministic counterparts. In addition, many problems in distributed computing and cryptography are impossible to solve without randomness. However, these applications typically require uniform random bits, while in practice almost all natural random phenomena are biased. Moreover, even originally uniform random bits can be damaged if an adversary learns some partial information about these bits. In this thesis, we study how to run randomized protocols in distributed computing and cryptography with imperfect randomness. We use the most general model for imperfect randomness where the weak random source is only required to have a certain amount of min-entropy. One important tool here is the randomness extractor. A randomness extractor is a function that takes as input one or more weak random sources, and outputs a distribution that is close to uniform in statistical distance. Randomness extractors are interesting in their own right and are closely related to many other problems in computer science. Giving efficient constructions of randomness extractors with optimal parameters is one of the major open problems in the area of pseudorandomness. We construct network extractor protocols that extract private random bits for parties in a communication network, assuming that they each start with an independent weak random source, and some parties are corrupted by an adversary who sees all communications in the network. These protocols imply fault-tolerant distributed computing protocols and secure multi-party computation protocols where only imperfect randomness is available. The probabilistic method shows that there exists an extractor for two independent sources with logarithmic min-entropy, while known constructions are far from achieving these parameters. In this thesis we construct extractors for two independent sources with any linear min-entropy, based on a computational assumption. We also construct the best known extractors for three independent sources and affine sources. Finally we study the problem of privacy amplification. In this model, two parties share a private weak random source and they wish to agree on a private uniform random string through communications in a channel controlled by an adversary, who has unlimited computational power and can change the messages in arbitrary ways. All previous results assume that the two parties have local uniform random bits. We show that this problem can be solved even if the two parties only have local weak random sources. We also improve previous results in various aspects by constructing the first explicit non-malleable extractor and giving protocols based on this extractor.Computer Science

Texas ScholarWorks

Knowledge extraction from unstructured data and classification through distributed ontologies

Author: Rizzo Giuseppe
Publication venue
Publication date: 01/01/2012
Field of study

The World Wide Web has changed the way humans use and share any kind of information. The Web removed several access barriers to the information published and has became an enormous space where users can easily navigate through heterogeneous resources (such as linked documents) and can easily edit, modify, or produce them. Documents implicitly enclose information and relationships among them which become only accessible to human beings. Indeed, the Web of documents evolved towards a space of data silos, linked each other only through untyped references (such as hypertext references) where only humans were able to understand. A growing desire to programmatically access to pieces of data implicitly enclosed in documents has characterized the last efforts of the Web research community. Direct access means structured data, thus enabling computing machinery to easily exploit the linking of different data sources. It has became crucial for the Web community to provide a technology stack for easing data integration at large scale, first structuring the data using standard ontologies and afterwards linking them to external data. Ontologies became the best practices to define axioms and relationships among classes and the Resource Description Framework (RDF) became the basic data model chosen to represent the ontology instances (i.e. an instance is a value of an axiom, class or attribute). Data becomes the new oil, in particular, extracting information from semi-structured textual documents on the Web is key to realize the Linked Data vision. In the literature these problems have been addressed with several proposals and standards, that mainly focus on technologies to access the data and on formats to represent the semantics of the data and their relationships. With the increasing of the volume of interconnected and serialized RDF data, RDF repositories may suffer from data overloading and may become a single point of failure for the overall Linked Data vision. One of the goals of this dissertation is to propose a thorough approach to manage the large scale RDF repositories, and to distribute them in a redundant and reliable peer-to-peer RDF architecture. The architecture consists of a logic to distribute and mine the knowledge and of a set of physical peer nodes organized in a ring topology based on a Distributed Hash Table (DHT). Each node shares the same logic and provides an entry point that enables clients to query the knowledge base using atomic, disjunctive and conjunctive SPARQL queries. The consistency of the results is increased using data redundancy algorithm that replicates each RDF triple in multiple nodes so that, in the case of peer failure, other peers can retrieve the data needed to resolve the queries. Additionally, a distributed load balancing algorithm is used to maintain a uniform distribution of the data among the participating peers by dynamically changing the key space assigned to each node in the DHT. Recently, the process of data structuring has gained more and more attention when applied to the large volume of text information spread on the Web, such as legacy data, news papers, scientific papers or (micro-)blog posts. This process mainly consists in three steps: \emph{i)} the extraction from the text of atomic pieces of information, called named entities; \emph{ii)} the classification of these pieces of information through ontologies; \emph{iii)} the disambigation of them through Uniform Resource Identifiers (URIs) identifying real world objects. As a step towards interconnecting the web to real world objects via named entities, different techniques have been proposed. The second objective of this work is to propose a comparison of these approaches in order to highlight strengths and weaknesses in different scenarios such as scientific and news papers, or user generated contents. We created the Named Entity Recognition and Disambiguation (NERD) web framework, publicly accessible on the Web (through REST API and web User Interface), which unifies several named entity extraction technologies. Moreover, we proposed the NERD ontology, a reference ontology for comparing the results of these technologies. Recently, the NERD ontology has been included in the NIF (Natural language processing Interchange Format) specification, part of the Creating Knowledge out of Interlinked Data (LOD2) project. Summarizing, this dissertation defines a framework for the extraction of knowledge from unstructured data and its classification via distributed ontologies. A detailed study of the Semantic Web and knowledge extraction fields is proposed to define the issues taken under investigation in this work. Then, it proposes an architecture to tackle the single point of failure issue introduced by the RDF repositories spread within the Web. Although the use of ontologies enables a Web where data is structured and comprehensible by computing machinery, human users may take advantage of it especially for the annotation task. Hence, this work describes an annotation tool for web editing, audio and video annotation in a web front end User Interface powered on the top of a distributed ontology. Furthermore, this dissertation details a thorough comparison of the state of the art of named entity technologies. The NERD framework is presented as technology to encompass existing solutions in the named entity extraction field and the NERD ontology is presented as reference ontology in the field. Finally, this work highlights three use cases with the purpose to reduce the amount of data silos spread within the Web: a Linked Data approach to augment the automatic classification task in a Systematic Literature Review, an application to lift educational data stored in Sharable Content Object Reference Model (SCORM) data silos to the Web of data and a scientific conference venue enhancer plug on the top of several data live collectors. Significant research efforts have been devoted to combine the efficiency of a reliable data structure and the importance of data extraction techniques. This dissertation opens different research doors which mainly join two different research communities: the Semantic Web and the Natural Language Processing community. The Web provides a considerable amount of data where NLP techniques may shed the light within it. The use of the URI as a unique identifier may provide one milestone for the materialization of entities lifted from a raw text to real world object

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino