COVAR: Computer Program for Multifactor Relative Risks and Tests of Hypotheses Using a Variance-Covariance Matrix from Linear and Log-Linear Regression
A computer program is described in detail for computing multifactor relative risks, confidence limits, and tests of hypotheses from the regression coefficients and variance-covariance matrix of a previous additive or multiplicative regression analysis. Input data can be read from an external disk file or entered via the keyboard. The output contains a list of the input data, point estimates of single or joint effects, confidence intervals, and tests of hypotheses based on a minimum modified chi-square statistic. Availability of the program is also discussed.
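As a rough illustration of the kind of calculation the abstract describes, the sketch below computes a joint relative risk and its confidence limits from coefficients and a variance-covariance matrix. It assumes a multiplicative (log-linear) model and uses a standard Wald interval, not the minimum modified chi-square statistic the program itself is said to use. The function name `joint_relative_risk` and its parameters are hypothetical and are not taken from COVAR.

```python
import math

def joint_relative_risk(beta, cov, contrast, z=1.96):
    """Wald estimate of exp(contrast . beta) with confidence limits.

    beta     -- regression coefficients from a log-linear fit (assumed input)
    cov      -- their variance-covariance matrix, as a list of rows
    contrast -- exposure differences defining the single or joint effect
    z        -- normal quantile (1.96 gives an approximate 95% interval)
    """
    n = len(beta)
    # Log relative risk is the linear combination c'b.
    log_rr = sum(contrast[i] * beta[i] for i in range(n))
    # Its variance is the quadratic form c'Vc, which includes the covariances.
    var = sum(contrast[i] * cov[i][j] * contrast[j]
              for i in range(n) for j in range(n))
    se = math.sqrt(var)
    return (math.exp(log_rr),
            math.exp(log_rr - z * se),
            math.exp(log_rr + z * se))

# Joint effect of two factors, each raised by one unit (illustrative numbers).
rr, lower, upper = joint_relative_risk(
    beta=[0.5, 0.3],
    cov=[[0.04, 0.01], [0.01, 0.09]],
    contrast=[1, 1],
)
```

The off-diagonal covariance term enters the variance of a joint effect, which is why the full matrix is needed and not only the individual standard errors.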
Generalized Singular Value Decomposition with Additive Components
The singular value decomposition (SVD) technique is extended to incorporate additive components when approximating a rectangular matrix by outer products of vectors. While each dual vector of the regular SVD can be expressed as a linear transformation of the other, the modified SVD corresponds to a general linear transformation with an additive part. The resulting method can be related to the family of principal component and correspondence analyses, and can be reduced to an eigenproblem for a specific transformation of the data matrix. The technique is applied to constructing dual eigenvectors for visualizing data in a two-dimensional space.
CONTEXT AWARE PRIVACY PRESERVING CLUSTERING AND CLASSIFICATION
Data are valuable assets to any organization or individual. They are a source of useful information, which plays a large part in decision making. All sectors can benefit from such information; commerce, health, and research are among the fields that already have. On the other hand, the availability of data makes it easy for anyone to exploit them, and in many cases the data are private and confidential. It is therefore necessary to preserve their confidentiality. We study two categories of privacy: Data Value Hiding and Data Pattern Hiding. Privacy is a major concern, but data utility is equally important: data should resist privacy breaches yet remain usable. Although these two objectives conflict and achieving both at once is challenging, knowing the purpose for which the data will be used, and the manner of use, helps. In this research, we focus on particular situations in clustering and classification problems and strive to balance the utility and privacy of the data.
In the first part of this dissertation, we propose Nonnegative Matrix Factorization (NMF) based techniques that build explicitly defined constraints into the update rules. These constraints determine how the factorization takes place, leading to favorable results. The methods alter the matrices so that user-specified cluster properties are introduced, and they can be used to preserve data value as well as data pattern. As NMF and K-means are proven to be equivalent, NMF is an ideal choice for pattern hiding in clustering problems. In addition to the NMF based methods, we propose methods for classification problems that take into account the data structures and the attribute properties. We separate this work into two parts, linear classifiers and nonlinear classifiers, and propose two different solutions based on the classifiers. We also study the effect of distortion on the utility of the data.
We propose three distortion measurement metrics which demonstrate better characteristics than the traditional metrics. The effectiveness of the measures is examined on different benchmark datasets. The results show that the methods have desirable properties such as invariance to translation, rotation, and scaling.
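For readers unfamiliar with NMF update rules, the following is a minimal sketch of the standard, unconstrained multiplicative updates of Lee and Seung for the squared-error objective. It is not the constrained method proposed in the dissertation, whose update rules are not given in the abstract. The helper names (`matmul`, `transpose`, `frob_error`, `nmf`) and the small example matrix are assumptions made for illustration only.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so no entry is stuck at zero.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W'V) / (W'WH), elementwise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (VH') / (WHH'), elementwise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

# Illustrative data with two obvious row groups.
v = [[5, 4, 0, 0], [4, 5, 0, 0], [0, 0, 3, 4], [0, 0, 4, 3]]
w, h = nmf(v, rank=2)
# A common clustering readout: assign each row to its largest column of W.
labels = [row.index(max(row)) for row in w]
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative throughout. A constrained variant would modify these update rules, which is where the dissertation's contribution lies.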
Social interaction, noise and antibiotic-mediated switches in the intestinal microbiota
The intestinal microbiota plays important roles in digestion and resistance
against entero-pathogens. As with other ecosystems, its species composition is
resilient against small disturbances but strong perturbations such as
antibiotics can affect the consortium dramatically. Antibiotic cessation does
not necessarily restore pre-treatment conditions and disturbed microbiota are
often susceptible to pathogen invasion. Here we propose a mathematical model to
explain how antibiotic-mediated switches in the microbiota composition can
result from simple social interactions between antibiotic-tolerant and
antibiotic-sensitive bacterial groups. We build a two-species (e.g. two
functional-groups) model and identify regions of domination by
antibiotic-sensitive or antibiotic-tolerant bacteria, as well as a region of
multistability where domination by either group is possible. Using a new
framework that we derived from statistical physics, we calculate the duration
of each microbiota composition state. This is shown to depend on the balance
between random fluctuations in the bacterial densities and the strength of
microbial interactions. The singular value decomposition of recent metagenomic
data confirms our assumption of grouping microbes as antibiotic-tolerant or
antibiotic-sensitive in response to a single antibiotic. Our methodology can be
extended to multiple bacterial groups and thus it provides an ecological
formalism to help interpret the present surge in microbiome data. Comment: 20 pages, 5 figures accepted for publication in Plos Comp Bio.
Supplementary video and information availabl