
    Jerarca: Efficient Analysis of Complex Networks Using Hierarchical Clustering

    Background: How to extract useful information from complex biological networks is a major goal in many fields, especially in genomics and proteomics. We have shown in several works that iterative hierarchical clustering, as implemented in the UVCluster program, is a powerful tool for analyzing many of those networks. However, the computation time required to perform UVCluster analyses imposed significant limitations on its use. Methodology/Principal Findings: We describe the suite Jerarca, designed to efficiently convert networks of interacting units into dendrograms by means of iterative hierarchical clustering. Jerarca is divided into three main sections. First, weighted distances among units are computed using up to three different approaches: a more efficient version of UVCluster and two new, related algorithms called RCluster and SCluster. Second, Jerarca builds dendrograms based on those distances, using well-known phylogenetic algorithms such as UPGMA or Neighbor-Joining. Finally, Jerarca provides optimal partitions of the trees using statistical criteria based on the distribution of intra- and intercluster connections. Outputs compatible with the phylogenetic software MEGA and the Cytoscape package are generated, allowing the results to be easily visualized. Conclusions/Significance: The four main advantages of Jerarca with respect to UVCluster are: 1) improved speed of a novel UVCluster algorithm; 2) additional, alternative strategies to perform iterative hierarchical clustering; 3) automatic evaluation of the optimal tree partitions; and 4) generation of outputs compatible with MEGA and Cytoscape.
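    The dendrogram-building stage can be sketched with standard tools. The snippet below is a minimal illustration, not Jerarca itself: it assumes plain shortest-path distances as a stand-in for the UVCluster/RCluster/SCluster weighted distances, then applies UPGMA (average linkage) and cuts the tree into a fixed number of clusters.

```python
# Minimal sketch: network -> distance matrix -> UPGMA dendrogram -> partition.
# Shortest-path distances stand in for UVCluster-style weighted distances.
import networkx as nx
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

G = nx.karate_club_graph()                 # example interaction network
D = nx.floyd_warshall_numpy(G)             # all-pairs shortest-path distances
Z = linkage(squareform(D, checks=False), method="average")  # UPGMA
labels = fcluster(Z, t=4, criterion="maxclust")             # cut into 4 clusters
print(labels)
```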

    Bayesian stochastic blockmodeling

    This chapter provides a self-contained introduction to the use of Bayesian inference to extract large-scale modular structures from network data, based on the stochastic blockmodel (SBM), as well as its degree-corrected and overlapping generalizations. We focus on nonparametric formulations that allow their inference in a manner that prevents overfitting, and enables model selection. We discuss aspects of the choice of priors, in particular how to avoid underfitting via increased Bayesian hierarchies, and we contrast the task of sampling network partitions from the posterior distribution with finding the single point estimate that maximizes it, while describing efficient algorithms to perform either one. We also show how inferring the SBM can be used to predict missing and spurious links, and shed light on the fundamental limitations of the detectability of modular structures in networks. Comment: 44 pages, 16 figures. Code is freely available as part of graph-tool at https://graph-tool.skewed.de . See also the HOWTO at https://graph-tool.skewed.de/static/doc/demos/inference/inference.htm
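    Since the abstract points to graph-tool, here is a minimal sketch of obtaining the SBM point estimate it describes; the example dataset and the drawing call are illustrative assumptions, not prescribed by the chapter.

```python
# Minimal sketch: fit an SBM point estimate with graph-tool.
import graph_tool.all as gt

g = gt.collection.data["football"]      # example network shipped with graph-tool
state = gt.minimize_blockmodel_dl(g)    # point estimate minimizing description length
print(state.entropy())                  # description length of the fitted model
b = state.get_blocks()                  # inferred group membership per node
state.draw(output="sbm-fit.png")        # visualize the partition
```

    Sampling partitions from the posterior, rather than taking this single point estimate, is the contrast the chapter draws; graph-tool also exposes MCMC routines for that purpose.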

    Survey of data mining approaches to user modeling for adaptive hypermedia

    The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of the information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of that data, the noise within it, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make them suitable for the automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the application.
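    As a concrete illustration of one commonly surveyed technique, the sketch below clusters user sessions with k-means to derive coarse user stereotypes; the features and data are hypothetical, not taken from the paper.

```python
# Illustrative sketch: k-means clustering of user sessions into stereotypes.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# One row per session: [pages visited, mean dwell time (s), help requests]
sessions = np.array([
    [12,  35.0, 0],
    [ 3, 210.0, 4],
    [15,  28.0, 1],
    [ 2, 180.0, 5],
])
X = StandardScaler().fit_transform(sessions)   # put features on a common scale
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(model.labels_)   # cluster ids usable as coarse stereotypes by the adaptation engine
```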

    Computational strategies for dissecting the high-dimensional complexity of adaptive immune repertoires

    The adaptive immune system recognizes antigens via an immense array of antigen-binding antibodies and T-cell receptors, the immune repertoire. The interrogation of immune repertoires is of high relevance for understanding the adaptive immune response in disease and infection (e.g., autoimmunity, cancer, HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the quantitative and molecular-level profiling of immune repertoires, thereby revealing the high-dimensional complexity of the immune receptor sequence landscape. Several methods for the computational and statistical analysis of large-scale AIRR-seq data have been developed to resolve immune repertoire complexity in order to understand the dynamics of adaptive immunity. Here, we review the current research on (i) diversity, (ii) clustering and network, (iii) phylogenetic, and (iv) machine learning methods applied to dissect, quantify, and compare the architecture, evolution, and specificity of immune repertoires. We summarize outstanding questions in computational immunology and propose future directions for systems immunology towards coupling AIRR-seq with the computational discovery of immunotherapeutics, vaccines, and immunodiagnostics. Comment: 27 pages, 2 figures
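    Of the four method families, diversity analysis is the most formula-driven; the sketch below computes a repertoire diversity profile via Hill numbers, a framework commonly used in this literature. The clone counts are made up for illustration.

```python
# Minimal sketch: Hill-number diversity profile of a clonal repertoire.
import numpy as np

def hill_diversity(counts, q):
    """Hill number of order q for a vector of clone counts."""
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()
    if np.isclose(q, 1.0):                  # q = 1: exponential of Shannon entropy
        return np.exp(-np.sum(p * np.log(p)))
    return np.sum(p ** q) ** (1.0 / (1.0 - q))

clone_counts = [500, 120, 60, 10, 5, 5, 1, 1]   # hypothetical clonal abundances
for q in (0, 1, 2):   # q = 0 richness; q = 1 Shannon; q = 2 Simpson-like
    print(q, hill_diversity(clone_counts, q))
```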

    System of Terrain Analysis, Energy Estimation and Path Planning for Planetary Exploration by Robot Teams

    NASA’s long-term plans involve a return to manned Moon missions, and eventually sending humans to Mars. The focus of this project is the use of autonomous mobile robotics to enhance these endeavors. This research details the creation of a system of terrain classification, energy-of-traversal estimation, and low-cost path planning for teams of inexpensive and potentially expendable robots. The first stage of this project was the creation of a model which estimates the energy requirements of traversing varying terrain types for a six-wheel rocker-bogie rover. The wheel/soil interaction model uses Shibly’s modified Bekker equations and incorporates a new simplified rocker-bogie model for estimating wheel loads. In all but a single trial, the relative energy requirements for each soil type were correctly predicted by the model. A path planner for complete coverage intended to minimize energy consumption was designed and tested. It accepts as input terrain maps detailing the energy consumption required to move to each adjacent location. Exploration is performed via a cost function which determines the robot’s next move. This system was successfully tested for multiple robots by means of a shared exploration map. At peak efficiency, the energy consumed by our path planner was only 56% of that used by the best-case back-and-forth coverage pattern. After performing a sensitivity analysis of Shibly’s equations to determine which soil parameters most affected energy consumption, a neural network terrain classifier was designed and tested. The terrain classifier defines all traversable terrain as one of three soil types and then assigns an assumed set of soil parameters. The classifier performed well overall, but had some difficulty distinguishing large rocks from sand. This work presents a system which successfully classifies terrain imagery into one of three soil types, assesses the energy requirements of terrain traversal for these soil types, and plans efficient paths of complete coverage for the imaged area. While there are further efforts that can be made in all areas, the work achieves its stated goals.
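    The cost-function exploration step described above can be sketched as a greedy choice over neighboring cells of an energy map; the grid values and the greedy policy here are illustrative assumptions, not the thesis's actual planner.

```python
# Minimal sketch: greedy coverage over an energy grid via a cost function.
import numpy as np

energy = np.array([[1, 4, 2],
                   [2, 9, 1],
                   [3, 1, 1]], dtype=float)   # energy to enter each cell

visited = np.zeros_like(energy, dtype=bool)
pos = (0, 0)
visited[pos] = True
path = [pos]

while not visited.all():
    r, c = pos
    neighbors = [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= r + dr < energy.shape[0] and 0 <= c + dc < energy.shape[1]]
    unvisited = [n for n in neighbors if not visited[n]]
    if not unvisited:       # dead end; a real planner would re-plan a route
        break
    pos = min(unvisited, key=lambda n: energy[n])   # cheapest next move
    visited[pos] = True
    path.append(pos)

print(path)
```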

    Computational fluids domain reduction to a simplified fluid network

    The primary goal of this project is to demonstrate the practical use of data mining algorithms to cluster a solved steady-state computational fluid dynamics (CFD) flow domain into a simplified lumped-parameter network. A commercial-quality code, “cfdMine”, was created using a volume-weighted k-means clustering that can cluster a 20-million-cell CFD domain on a single CPU in several hours or less. Additionally, agglomeration and k-means with the Mahalanobis distance were added as optional post-processing steps to further enhance the separation of the clusters. The resultant nodal network is considered a reduced-order model and can be solved transiently at minimal computational cost. The reduced-order network is then instantiated in the commercial thermal solver MuSES to perform transient conjugate heat transfer, with convection predicted by the lumped network (based on the steady-state CFD). When inserting the lumped nodal network into a MuSES model, the potential for developing a “localized heat transfer coefficient” is shown to be an improvement over existing techniques. The clustering was also found to provide a new flow visualization technique. Finally, fixing clusters near equipment demonstrates a new capability to track temperatures near specific objects (such as equipment in vehicles).
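    The core volume-weighted k-means step maps directly onto standard tooling; the sketch below uses scikit-learn's per-sample weights as cell volumes. The feature layout and the synthetic data are illustrative assumptions, not the cfdMine implementation.

```python
# Minimal sketch: volume-weighted k-means over CFD cell data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_cells = 10_000
features = rng.normal(size=(n_cells, 6))       # e.g., (x, y, z, u, v, w) per cell
cell_volume = rng.uniform(0.1, 1.0, n_cells)   # finite-volume cell sizes

# sample_weight makes large cells count proportionally more, so the
# clusters approximate volume-weighted flow regions (lumped nodes).
km = KMeans(n_clusters=50, n_init=10, random_state=0)
labels = km.fit_predict(features, sample_weight=cell_volume)
print(np.bincount(labels)[:10])                # cell counts of the first lumped nodes
```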

    Modelling communities and populations: An introduction to computational social science

    In sociology, interest in modelling has not yet become widespread. However, the methodology has been gaining increased attention, in parallel with its growing popularity in economics and other social sciences, notably psychology and political science, and with the growing volume of social data being measured and collected. In this paper, we present representative computational methodologies from both data-driven (“black box”) and rule-based (“per analogy”) approaches. We show how to build simple models, and discuss both the greatest successes and the major limitations of modelling societies. We claim that the end goal of computational tools in sociology is to provide meaningful analyses and calculations that allow causal statements in sociological explanation and support decisions of great importance for society.
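    As a flavor of the rule-based (“per analogy”) approach, the sketch below runs a tiny Schelling-style segregation model, a classic introductory agent-based simulation; it is illustrative only and not a model from the paper.

```python
# Minimal sketch: Schelling-style segregation on a small grid.
import numpy as np

rng = np.random.default_rng(0)
grid = rng.choice([0, 1, 2], size=(20, 20), p=[0.2, 0.4, 0.4])  # 0 = empty cell

def unhappy(grid, threshold=0.5):
    """Agents with fewer than `threshold` like neighbors want to move."""
    out = []
    for r, c in zip(*np.nonzero(grid)):
        block = grid[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
        same = np.count_nonzero(block == grid[r, c]) - 1   # exclude self
        total = np.count_nonzero(block) - 1
        if total and same / total < threshold:
            out.append((r, c))
    return out

for _ in range(50):                          # relocation rounds
    movers = unhappy(grid)
    if not movers:
        break
    for r, c in movers:
        empties = np.argwhere(grid == 0)
        er, ec = empties[rng.integers(len(empties))]
        grid[er, ec], grid[r, c] = grid[r, c], 0
print(len(unhappy(grid)), "unhappy agents remain")
```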