Bandwidth Allocation Mechanism based on Users' Web Usage Patterns for Campus Networks
Managing bandwidth in campus networks has become a challenge in recent years. Limited bandwidth and a continuously growing user base force IT managers to think carefully about bandwidth allocation strategies. This paper introduces a mechanism for allocating bandwidth based on users’ web usage patterns. The main purpose is to assign higher bandwidth to users who tend to browse educational websites than to those who do not. The proposed technique proceeds in stages: preprocessing of the weblogs, class labeling of the dataset, computation of the feature subspaces, training for the development of the ANN for LDA/GSVD algorithm, visualization, and bandwidth allocation. The proposed method was applied to real weblogs from a university’s proxy servers. The results indicate that the method is useful for distinguishing users who use the internet for educational purposes from those who do not. The developed ANN for LDA/GSVD algorithm outperformed the existing algorithm by up to 50%, which indicates that the approach is efficient. The results also show that few users browsed educational content. Through this mechanism, users will be encouraged to use the internet for educational purposes, and IT managers can make better plans to optimize the distribution of bandwidth.
A primer on correlation-based dimension reduction methods for multi-omics analysis
The continuing advances of omic technologies mean that it is now more
feasible to measure the numerous features collectively reflecting the molecular
properties of a sample. When multiple omic methods are used, statistical and
computational approaches can exploit these large, connected profiles.
Multi-omics is the integration of different omic data sources from the same
biological sample. In this review, we focus on correlation-based dimension
reduction approaches for single omic datasets, followed by methods for pairs of
omics datasets, before detailing further techniques for three or more omic
datasets. We also briefly detail network methods that apply when three or more omic
datasets are available and that complement correlation-oriented tools. To aid
readers new to this area, these are all linked to relevant R packages that can
implement these procedures. Finally, we discuss scenarios of experimental
design and present road maps that simplify the selection of appropriate
analysis methods. This review will help researchers navigate the emerging
methods for multi-omics, integrate diverse omic datasets appropriately, and
embrace the opportunity of population multi-omics.
Comment: 30 pages, 2 figures, 6 table
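As a hedged illustration of the correlation-based methods for pairs of omic datasets mentioned in the abstract above, the following Python sketch computes canonical correlation analysis (CCA) with NumPy. It is not taken from the review, which points readers to R packages; the function name `cca`, the small ridge term `reg`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def cca(X, Y, reg=1e-8):
    """Hypothetical helper: CCA via Cholesky whitening and an SVD.

    X is n x p, Y is n x q (same samples in rows). Returns weight
    matrices A (p x k), B (q x k) and canonical correlations s (k,).
    `reg` is a small ridge term added for numerical stability (an assumption).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                      # center each feature
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Lx = np.linalg.cholesky(Sxx)                 # Sxx = Lx Lx^T
    Ly = np.linalg.cholesky(Syy)                 # Syy = Ly Ly^T
    K = np.linalg.solve(Lx, Sxy)                 # Lx^{-1} Sxy
    K = np.linalg.solve(Ly, K.T).T               # Lx^{-1} Sxy Ly^{-T}
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    A = np.linalg.solve(Lx.T, U)                 # map back to original X features
    B = np.linalg.solve(Ly.T, Vt.T)              # map back to original Y features
    return A, B, s

# Simulated stand-ins for two omic blocks sharing one latent factor z.
rng = np.random.default_rng(0)
z = rng.normal(size=(300, 1))
X = z @ rng.normal(size=(1, 4)) + 0.5 * rng.normal(size=(300, 4))
Y = z @ rng.normal(size=(1, 3)) + 0.5 * rng.normal(size=(300, 3))

A, B, s = cca(X, Y)
u = (X - X.mean(axis=0)) @ A[:, 0]               # first canonical variate of X
v = (Y - Y.mean(axis=0)) @ B[:, 0]               # first canonical variate of Y
first_corr = np.corrcoef(u, v)[0, 1]             # should match s[0]
```

The empirical correlation between the first pair of canonical variates should agree with the leading singular value `s[0]`, up to the effect of the ridge term. Real omic data typically have far more features than samples, where a regularized or sparse variant would be needed instead of this plain version.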
Linear dimensionality reduction: Survey, insights, and generalizations
Linear dimensionality reduction methods are a cornerstone of analyzing high
dimensional data, due to their simple geometric interpretations and typically
attractive computational properties. These methods capture many data features
of interest, such as covariance, dynamical structure, correlation between data
sets, input-output relationships, and margin between data classes. Methods have
been developed with a variety of names and motivations in many fields, and
perhaps as a result the connections between all these methods have not been
highlighted. Here we survey methods from this disparate literature as
optimization programs over matrix manifolds. We discuss principal component
analysis, factor analysis, linear multidimensional scaling, Fisher's linear
discriminant analysis, canonical correlations analysis, maximum autocorrelation
factors, slow feature analysis, sufficient dimensionality reduction,
undercomplete independent component analysis, linear regression, distance
metric learning, and more. This optimization framework gives insight into some
rarely discussed shortcomings of well-known methods, such as the suboptimality
of certain eigenvector solutions. Modern techniques for optimization over
matrix manifolds enable a generic linear dimensionality reduction solver, which
accepts as input data and an objective to be optimized, and returns, as output,
an optimal low-dimensional projection of the data. This simple optimization
framework further allows straightforward generalizations and novel variants of
classical methods, which we demonstrate here by creating an
orthogonal-projection canonical correlations analysis. More broadly, this
survey and generic solver suggest that linear dimensionality reduction can move
toward becoming a blackbox, objective-agnostic numerical technology.
JPC and ZG received funding from the UK Engineering and Physical Sciences Research Council (EPSRC EP/H019472/1). JPC received funding from a Sloan Research Fellowship, the Simons Foundation (SCGB#325171 and SCGB#325233), the Grossman Center at Columbia University, and the Gatsby Charitable Trust.
This is the author accepted manuscript. The final version is available from MIT Press via http://jmlr.org/papers/v16/cunningham15a.htm
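To make the "optimization program" view in the abstract above concrete, PCA can be written as maximizing tr(MᵀCM) over d x r matrices M with orthonormal columns, where C is the sample covariance. The Python sketch below is only an illustration under that assumption and is not the generic solver described in the paper: it uses the closed-form eigenvector solution, and the names `pca_projection` and `captured_variance` are hypothetical.

```python
import numpy as np

def pca_projection(X, r):
    """Return (M, C): M is d x r with orthonormal columns maximizing
    tr(M^T C M), and C is the d x d sample covariance of X (n x d)."""
    Xc = X - X.mean(axis=0, keepdims=True)       # center each feature
    C = Xc.T @ Xc / (X.shape[0] - 1)             # sample covariance
    eigvals, eigvecs = np.linalg.eigh(C)         # eigenvalues in ascending order
    M = eigvecs[:, ::-1][:, :r]                  # top-r eigenvectors
    return M, C

def captured_variance(M, C):
    """Objective value tr(M^T C M) for an orthonormal projection M."""
    return float(np.trace(M.T @ C @ M))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))   # synthetic data

M, C = pca_projection(X, 2)

# A random orthonormal projection for comparison (QR of a Gaussian matrix).
Q, _ = np.linalg.qr(rng.normal(size=(5, 2)))

pca_value = captured_variance(M, C)
random_value = captured_variance(Q, C)           # never exceeds pca_value
```

For this objective the eigenvector solution is optimal, so any other orthonormal projection, such as the random `Q`, captures no more variance. For other objectives there may be no such closed form, in which case an iterative optimizer over the same constraint set would take the place of the `eigh` call.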
Multi-Label Dimensionality Reduction
Multi-label learning, which deals with data associated with multiple labels simultaneously, is ubiquitous in real-world applications. To overcome the curse of dimensionality in multi-label learning, in this thesis I study multi-label dimensionality reduction, which extracts a small number of features by removing irrelevant, redundant, and noisy information while considering the correlation among different labels. Specifically, I propose Hypergraph Spectral Learning (HSL) to perform dimensionality reduction for multi-label data by exploiting correlations among different labels using a hypergraph. The effect of regularization on the classical dimensionality reduction algorithm known as Canonical Correlation Analysis (CCA) is elucidated in this thesis. The relationship between CCA and Orthonormalized Partial Least Squares (OPLS) is also investigated. To perform dimensionality reduction efficiently for large-scale problems, two efficient implementations are proposed for a class of dimensionality reduction algorithms, including canonical correlation analysis, orthonormalized partial least squares, linear discriminant analysis, and hypergraph spectral learning. The first is a direct least squares approach, which allows the use of different regularization penalties but is applicable only under a certain assumption; the second is a two-stage approach, which can be applied in the regularization setting without any assumption. Furthermore, an online implementation for the same class of dimensionality reduction algorithms is proposed for data that arrive sequentially. A Matlab toolbox for multi-label dimensionality reduction has been developed and released. The proposed algorithms have been applied successfully to Drosophila gene expression pattern image annotation. The experimental results on some benchmark data sets in multi-label learning also demonstrate the effectiveness and efficiency of the proposed algorithms.
Dissertation/Thesis
Ph.D. Computer Science 201