Search CORE

989 research outputs found

Forecasting bus passenger flows by using a clustering-based support vector regression approach

Author: Bai Yun
Cheng Zhiwei
Li Chuan
Wang Xiaodan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

As a significant component of the intelligent transportation system, forecasting bus passenger flows plays a key role in resource allocation, network planning, and frequency setting. However, it remains challenging to capture the high fluctuations, nonlinearity, and periodicity of bus passenger flows due to varied destinations and departure times. For this reason, a novel forecasting model named affinity propagation-based support vector regression (AP-SVR) is proposed, based on clustering and nonlinear simulation. In this approach, a clustering algorithm is first used to generate clustering-based intervals. A support vector regression (SVR) is then used to forecast the passenger flow for each cluster, with particle swarm optimization (PSO) used to obtain the optimized parameters. Finally, the prediction results of the SVR are rearranged into chronological order. The proposed model is tested using real bus passenger data from a bus line over four months. Experimental results demonstrate that the proposed model performs better than other peer models in terms of absolute percentage error and mean absolute percentage error. The results suggest that a deterministic clustering technique with stable cluster results (AP) can significantly improve forecasting performance.
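The final two steps of the pipeline described above (per-cluster forecasting, then chronological rearrangement) and the mean absolute percentage error metric can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the record layout, the `forecast_by_cluster` and `cluster_mean` names, and the callback interface are all assumptions, and the cluster-mean predictor is a deliberately trivial placeholder standing in for the PSO-tuned SVR.

```python
from collections import defaultdict


def forecast_by_cluster(records, fit_predict):
    """Forecast each cluster separately, then restore chronological order.

    records: list of (timestamp, cluster_id, actual_flow) tuples (assumed layout).
    fit_predict: hypothetical callback that takes one cluster's records and
    returns one forecast per record. The paper uses a PSO-tuned SVR here.
    """
    by_cluster = defaultdict(list)
    for rec in records:
        by_cluster[rec[1]].append(rec)

    forecasts = []
    for cluster_records in by_cluster.values():
        preds = fit_predict(cluster_records)
        forecasts.extend((rec[0], pred) for rec, pred in zip(cluster_records, preds))

    # Chronological rearrangement: sort the pooled forecasts by timestamp.
    forecasts.sort(key=lambda pair: pair[0])
    return forecasts


def cluster_mean(cluster_records):
    """Placeholder model (NOT an SVR): predict the cluster's mean flow."""
    mean = sum(rec[2] for rec in cluster_records) / len(cluster_records)
    return [mean] * len(cluster_records)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


records = [(0, "peak", 100.0), (1, "off", 40.0), (2, "peak", 120.0), (3, "off", 60.0)]
forecasts = forecast_by_cluster(records, cluster_mean)
error = mape([rec[2] for rec in records], [pred for _, pred in forecasts])
```

Any regressor with the same callback shape could replace `cluster_mean`; the point of the sketch is only that forecasts produced cluster by cluster must be merged back into time order before they are scored.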

Sapientia

A Preference Model on Adaptive Affinity Propagation

Author: Juarna Asep
Mutiara Achmad Benny
Refianti Rina
Suhendra Adang
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/06/2018

In recent years, two new data clustering algorithms have been proposed. One of them is Affinity Propagation (AP). AP is a data clustering technique that uses iterative message passing and considers all data points as potential exemplars. Two important inputs of AP are a similarity matrix (SM) of the data and the parameter "preference" p. Although the original AP algorithm has shown much success in data clustering, it still suffers from one limitation: it is not easy to determine the value of the parameter "preference" p that results in an optimal clustering solution. To resolve this limitation, we propose a new model of the parameter "preference" p, i.e., it is modeled based on the similarity distribution. Given the SM and p, the Modified Adaptive AP (MAAP) procedure is run. The MAAP procedure omits the adaptive p-scanning algorithm used in the original Adaptive-AP (AAP) procedure. Experimental results on random non-partition and partition data sets show that (i) the proposed algorithm, MAAP-DDP, is slower than the original AP for the random non-partition dataset, and (ii) for the random 4-partition dataset and real datasets, the proposed algorithm succeeds in identifying clusters according to the number of the dataset's true labels, with execution times comparable to those of the original AP. In addition, the MAAP-DDP algorithm proves more feasible and effective than the original AAP procedure.
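To make the roles of the similarity matrix and the preference p concrete, here is a minimal pure-Python sketch of the original AP message passing (responsibilities and availabilities with damping). It is not the MAAP-DDP procedure from the abstract. The 1-D input, the negative squared distance as similarity, the median-of-similarities default for p, and the function name are all assumptions chosen for illustration.

```python
from statistics import median


def affinity_propagation(points, preference=None, damping=0.5, iterations=200):
    """Cluster 1-D points with basic AP; returns (exemplar_indices, labels)."""
    n = len(points)
    # Similarity matrix: negative squared distance (an assumed, common choice).
    s = [[-(points[i] - points[k]) ** 2 for k in range(n)] for i in range(n)]
    if preference is None:
        # Assumed default: median of the off-diagonal similarities.
        preference = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        s[k][k] = preference  # the preference p sits on the diagonal

    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            for k in range(n):
                best_other = max(a[i][j] + s[i][j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best_other)
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k), i' != k
        for k in range(n):
            positive = [max(0.0, r[i][k]) for i in range(n)]
            support = sum(positive) - positive[k]
            for i in range(n):
                if i == k:
                    new_value = support
                else:
                    new_value = min(0.0, r[k][k] + support - positive[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new_value

    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive.
    exemplars = [k for k in range(n) if r[k][k] + a[k][k] > 0]
    if not exemplars:
        return [], [-1] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda k: s[i][k])
              for i in range(n)]
    return exemplars, labels


points = [1.0, 1.1, 0.9, 5.0, 5.2, 4.9]
exemplars, labels = affinity_propagation(points)
```

Lowering `preference` makes every point less willing to serve as an exemplar and so tends to produce fewer clusters, which is why the choice of p matters as much as the abstract says.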

IAES journal

Crossref

Institute of Advanced Engineering and Science

Markov clustering versus affinity propagation for the partitioning of protein interaction graphs

Author: AC Gavin
AK Jain
B Alberts
BJ Frey
C Stark
E Pieroni
GD Bader
H Chipman
H Yu
J MacQueen
J Vlasblom
James Vlasblom
M Blatt
ME Cusick
MJ Brusco
N Johnsson
NJ Krogan
P Shannon
R Sharan
S Bader
S Brohee
S Charbonnier
S Fields
S Lloyd
S Pu
S van Dongen
SH Yook
Shoshana J Wodak
SR Collins
T Formosa
T Hastie
TE Ideker
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Genome-scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. This task is commonly executed using clustering procedures, which aim at detecting densely connected regions within the interaction graphs. There exists a wealth of clustering algorithms, some of which have been applied to this problem. One of the most successful clustering procedures in this context has been the Markov Cluster algorithm (MCL), which was recently shown to outperform a number of other procedures, some of which were specifically designed for partitioning protein interaction graphs. A novel promising clustering procedure termed Affinity Propagation (AP) was recently shown to be particularly effective, and much faster than other methods for a variety of problems, but has not yet been applied to partition protein interaction graphs. Results: In this work we compare the performance of the Affinity Propagation (AP) and Markov Clustering (MCL) procedures. To this end we derive an unweighted network of protein-protein interactions from a set of 408 protein complexes from S. cerevisiae, hand-curated in-house, and evaluate the performance of the two clustering algorithms in recalling the annotated complexes. In doing so, the parameter space of each algorithm is sampled in order to select optimal values for these parameters, and the robustness of the algorithms is assessed by quantifying the level of complex recall as interactions are randomly added to or removed from the network to simulate noise. To evaluate the performance on a weighted protein interaction graph, we also apply the two algorithms to the consolidated protein interaction network of S. cerevisiae, derived from genome-scale purification experiments, and to versions of this network in which varying proportions of the links have been randomly shuffled. Conclusion: Our analysis shows that the MCL procedure is significantly more tolerant to noise and behaves more robustly than the AP algorithm. The advantage of MCL over AP is dramatic for unweighted protein interaction graphs, as AP displays severe convergence problems on the majority of the unweighted graph versions that we tested, whereas MCL continues to identify meaningful clusters, albeit fewer of them, as the level of noise in the graph increases. MCL thus remains the method of choice for identifying protein complexes from binary interaction networks.

University of Toronto Research Repository

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

An integrative approach to inferring biologically meaningful gene modules

Author: Cho Ji-Hoon
Galas David J
Wang Kai
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method which directly incorporates gene ontology (GO) annotation in the construction of gene modules in order to gain better functional association. Results: We have devised a method, Semantic Similarity-Integrated approach for Modularization (SSIM), that integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: (1) the osmotic shock response in Saccharomyces cerevisiae, and (2) the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. Conclusions: The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high-throughput experimental data at the gene network level.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Automatic Detection and Classification of Breast Tumors in Ultrasonic Images Using Texture and Morphological Features

Author: Baraldi A
Brendan JF
Chang RF
Chen DR
Cheng HD
Dietterich T
Garra BS
Horsch K
Huang YL
Jing Jiao
Joo SY
Li CM
Liu B
Madabhushi A
Mu TT
Ning JF
Noble JA
Qiu P
Shi JB
Stavros AT
Su YN
Tsantis S
Wu Z
Yang XS
Yanni Su
Yi Guo
Yu YJ
Yuanyuan Wang
Zhou Z
Publication venue: Bentham Open

Due to the severe presence of speckle noise, poor image contrast and irregular lesion shape, it is challenging to build a fully automatic detection and classification system for breast ultrasonic images. In this paper, a novel and effective computer-aided method, including generation of a region of interest (ROI), segmentation and classification of breast tumors, is proposed without any manual intervention. By incorporating local features of texture and position, an ROI is first detected using a self-organizing map neural network. Then a modified Normalized Cut approach considering the weighted neighborhood gray values is proposed to partition the ROI into clusters and obtain the initial boundary. In addition, a regional-fitting active contour model is used to adjust the few inaccurate initial boundaries for the final segmentation. Finally, three texture features and five morphological features are extracted from each breast tumor, and a highly efficient Affinity Propagation clustering is used to classify tumors as malignant or benign for an existing database without any training process. The proposed system is validated on 132 cases (67 benign and 65 malignant), with its performance compared to traditional methods such as level set segmentation, artificial neural network classifiers, and so forth. Experimental results show that the proposed system, which needs no training procedure or manual intervention, performs best in detection and classification of ultrasonic breast tumors, while having the lowest computational complexity.

Crossref

PubMed Central

Structural learning for large scale image classification

Author: NC DOCKS at The University of North Carolina at Charlotte
Shen Yi
Publication date: 01/01/2013

To leverage large-scale collaboratively-tagged (loosely-tagged) images for training a large number of classifiers to support large-scale image classification, we need to develop new frameworks to deal with the following issues: (1) spam tags, i.e., tags that are not relevant to the semantics of the images; (2) loose object tags, i.e., multiple object tags are loosely given at the image level without their locations in the images; (3) missing object tags, i.e., some object tags are missed due to incomplete tagging; (4) inter-related object classes, i.e., some object classes are visually correlated and their classifiers need to be trained jointly instead of independently; (5) large-scale object classes, which require limiting the computational time complexity of classifier training algorithms as well as the storage space for intermediate results. To deal with these issues, we propose a structural learning framework which consists of the following key components: (1) cluster-based junk image filtering to address the issue of spam tags; (2) automatic tag-instance alignment to address the issue of loose object tags; (3) automatic missing object tag prediction; (4) object correlation network for inter-class visual correlation characterization to address the issue of missing tags; (5) large-scale structural learning with object correlation network for enhancing the discrimination power of object classifiers. To obtain a sufficient number of labeled training images, our proposed framework leverages the abundant web images and their social tags. To make those web images usable, tag cleansing has to be done to neutralize the noise from user tagging preferences, in particular junk tags, loose tags and missing tags. Then a discriminative learning algorithm is developed to train a large number of inter-related classifiers for achieving large-scale image classification, e.g., learning a large number of classifiers for categorizing large-scale images into a large number of inter-related object classes and image concepts. A visual concept network is first constructed for organizing enormous numbers of object classes and image concepts according to their inter-concept visual correlations. The visual concept network is further used to: (a) identify inter-related learning tasks for classifier training; (b) determine groups of visually-similar object classes and image concepts; and (c) estimate the learning complexity for classifier training. A large-scale discriminative learning algorithm is developed for supporting multi-class classifier training and achieving accurate inter-group discrimination and effective intra-group separation. Our discriminative learning algorithm can significantly enhance the discrimination power of the classifiers and dramatically reduce the computational cost for large-scale classifier training.

The University of North Carolina at Greensboro