Pycobra: A Python Toolbox for Ensemble Learning and Visualisation
We introduce \texttt{pycobra}, a Python library devoted to ensemble learning
(regression and classification) and visualisation. Its main assets are the
implementation of several ensemble learning algorithms, a flexible and generic
interface to compare and blend any existing machine learning algorithm
available in Python libraries (as long as a \texttt{predict} method is given),
and visualisation tools such as Voronoi tessellations. \texttt{pycobra} is
fully \texttt{scikit-learn} compatible and is released under the MIT
open-source license. \texttt{pycobra} can be downloaded from the Python Package
Index (PyPI) and Machine Learning Open Source Software (MLOSS). The current
version (along with Jupyter notebooks, extensive documentation, and continuous
integration tests) is available at
\href{https://github.com/bhargavvader/pycobra}{https://github.com/bhargavvader/pycobra}
and the official documentation website is
\href{https://modal.lille.inria.fr/pycobra}{https://modal.lille.inria.fr/pycobra}.
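To illustrate the kind of interface the abstract describes, where any object exposing a \texttt{predict} method can be blended, here is a minimal sketch. It does not use \texttt{pycobra} and is not its API: the classes \texttt{ConstantModel} and \texttt{LinearModel} and the function \texttt{blend\_predictions} are hypothetical names, and plain averaging stands in for the library's aggregation strategies.

```python
from statistics import mean


class ConstantModel:
    """Hypothetical predictor that always returns the same value."""

    def __init__(self, value):
        self.value = value

    def predict(self, X):
        return [self.value for _ in X]


class LinearModel:
    """Hypothetical predictor computing slope * x + intercept on scalar inputs."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept

    def predict(self, X):
        return [self.slope * x + self.intercept for x in X]


def blend_predictions(models, X):
    """Average the outputs of any objects that expose a predict(X) method."""
    per_model = [model.predict(X) for model in models]
    # zip(*per_model) regroups the predictions point by point.
    return [mean(point_preds) for point_preds in zip(*per_model)]
```

Because the function relies only on duck typing, estimators from different libraries could in principle be mixed in the same list, provided their \texttt{predict} outputs have the same length.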
Eigenvalue Interlacing of Bipartite Graphs and Construction of Expander Code using Vertex-split of a Bipartite Graph
The second largest eigenvalue of a graph is an important algebraic parameter
which is related to the expansion, connectivity and randomness properties of
a graph. Expanders are highly connected sparse graphs. In coding theory,
expander codes are error-correcting codes made up of bipartite expander graphs.
In this paper, first we prove the interlacing of the eigenvalues of the
adjacency matrix of the bipartite graph with the eigenvalues of the bipartite
quotient matrices of the corresponding graph matrices. Then we obtain bounds
for the second largest and second smallest eigenvalues. Since the graph is
bipartite, the results for the Laplacian will also hold for the signless Laplacian
matrix. We then introduce a new method called vertex-split of a bipartite graph
to construct asymptotically good expander codes with expansion factor
and and prove a condition for
the vertex-split of a bipartite graph to be connected with respect to
Further, we prove that the vertex-split of is a bipartite
expander. Finally, we construct an asymptotically good expander code whose
factor graph is a graph obtained by the vertex-split of a bipartite graph.
Comment: 17 pages, 2 figures
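For orientation, the classical interlacing result for quotient matrices can be stated as follows. This is the standard textbook statement, given here as background; the paper's own theorem concerns bipartite quotient matrices and may differ in its hypotheses. Let $A$ be a real symmetric $n \times n$ matrix, let its index set be partitioned into $m$ parts, and let $B$ be the $m \times m$ quotient matrix whose entry $b_{ij}$ is the average row sum of the block of $A$ indexed by parts $i$ and $j$. With eigenvalues in non-increasing order,
\[
\lambda_i(A) \;\ge\; \lambda_i(B) \;\ge\; \lambda_{n-m+i}(A), \qquad i = 1, \dots, m .
\]
As a small worked example (chosen for illustration, not taken from the paper), take the complete bipartite graph $K_{2,3}$ with the partition into its two sides. Then
\[
B = \begin{pmatrix} 0 & 3 \\ 2 & 0 \end{pmatrix}, \qquad
\operatorname{spec}(B) = \{\sqrt{6},\, -\sqrt{6}\}, \qquad
\operatorname{spec}(A) = \{\sqrt{6},\, 0,\, 0,\, 0,\, -\sqrt{6}\},
\]
and the inequalities read $\sqrt{6} \ge \sqrt{6} \ge 0$ and $0 \ge -\sqrt{6} \ge -\sqrt{6}$.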
Kernel-Based Ensemble Learning in Python
We propose a new supervised learning algorithm for classification and
regression problems where two or more preliminary predictors are available. We
introduce \texttt{KernelCobra}, a non-linear learning strategy for combining an
arbitrary number of initial predictors. \texttt{KernelCobra} builds on the
COBRA algorithm introduced by \citet{biau2016cobra}, which combined estimators
based on a notion of proximity of predictions on the training data. While the
COBRA algorithm used a binary threshold to declare which training data were
close and to be used, we generalize this idea by using a kernel to better
encapsulate the proximity information. Such a smoothing kernel provides more
representative weights to each of the training points which are used to build
the aggregate and final predictor, and \texttt{KernelCobra} systematically
outperforms the COBRA algorithm. While COBRA is intended for regression,
\texttt{KernelCobra} deals with classification and regression.
\texttt{KernelCobra} is included as part of the open source Python package
\texttt{Pycobra} (0.2.4 and onward), introduced by \citet{guedj2018pycobra}.
Numerical experiments assess the performance (in terms of pure prediction and
computational complexity) of \texttt{KernelCobra} on real-life and synthetic
datasets.
Comment: 11 pages
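The contrast between the binary threshold and the kernel weighting described above can be sketched as follows. This is an illustrative sketch, not the \texttt{pycobra} implementation: the function names, the requirement that all initial predictors agree within \texttt{epsilon}, and the choice of a Gaussian kernel with parameter \texttt{bandwidth} are assumptions made for the example.

```python
import math


def cobra_predict(train_preds, train_y, query_preds, epsilon):
    """Threshold-style aggregate: average the responses of the training points
    on which every initial predictor is within epsilon of its prediction
    at the query point."""
    selected = [
        y
        for preds, y in zip(train_preds, train_y)
        if all(abs(p - q) <= epsilon for p, q in zip(preds, query_preds))
    ]
    # If no training point reaches consensus, this sketch returns None.
    return sum(selected) / len(selected) if selected else None


def kernel_cobra_predict(train_preds, train_y, query_preds, bandwidth):
    """Kernel-style aggregate: weight every training response by a Gaussian
    kernel (an assumed choice) of the distance between prediction vectors."""
    weights = []
    for preds in train_preds:
        sq_dist = sum((p - q) ** 2 for p, q in zip(preds, query_preds))
        weights.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed to zero: no usable proximity information.
        return None
    return sum(w * y for w, y in zip(weights, train_y)) / total
```

Here \texttt{train\_preds[i]} holds the initial predictors' outputs on training point $i$ and \texttt{query\_preds} holds their outputs on the new point. The threshold version gives each training point a weight of 0 or 1, whereas the kernel version lets the weight decay smoothly with distance.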
Reachability Based Web Page Ranking Using Wavelets
A naïve approach to web page ranking is presented that applies the concept of reachability and uses multiresolution analysis with the Haar wavelet to order the web pages. Page ranking is done by developing a structured signal from the in-links, out-links and reachability values of the web pages of network graphs. The average and detail coefficients of the input signal, together with the downsampling process, provide the necessary page ranking of web pages. This approach does not involve any iterative technique, damping factor or initialization of the page ranks. A comparison between the original page rank, category-based page rank and the proposed approach is made. The results reflect the role of paths between the pages in page rankings.
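The Haar decomposition that the abstract relies on can be sketched as follows. This shows only the generic averaging and differencing step with downsampling by two; how the signal is built from in-links, out-links and reachability values, and how ranks are read off the coefficients, is not specified in the abstract and is not reproduced here. The function names and the unnormalized $1/2$ scaling are assumptions (an orthonormal Haar transform would use $1/\sqrt{2}$ instead).

```python
def haar_step(signal):
    """One Haar level: pairwise averages and details, each half as long
    as the input (downsampling by two)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = range(0, len(signal), 2)
    averages = [(signal[i] + signal[i + 1]) / 2.0 for i in pairs]
    details = [(signal[i] - signal[i + 1]) / 2.0 for i in pairs]
    return averages, details


def haar_decompose(signal):
    """Apply haar_step repeatedly to the averages until one value remains.
    Returns the final average and the detail coefficients of each level,
    finest level first. The length must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    levels = []
    current = list(signal)
    while len(current) > 1:
        current, details = haar_step(current)
        levels.append(details)
    return current[0], levels
```

No iteration to convergence, damping factor or initial rank vector appears anywhere in the transform, which is consistent with the non-iterative character the abstract claims for the approach.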