
96 research outputs found

Optimasi Conjugate Gradient pada Backpropagation Neural Network untuk Deteksi Kualitas Daun Tembakau

Author: Ansari R. (Rudy)
Izzana M. (Meila)
Lareno B. (Bambang)
Marleny F. D. (Finki)
P R. A. (Ricardus)
Sari Y. (Yuslena)
Publication venue: STMIK STIKOM Bali
Publication date: 01/10/2015

Tobacco is a plantation commodity with high economic value, primarily as the main raw material for cigarettes. Cigarette production affects the economies of several countries. Before cigarettes are produced, tobacco leaves must be classified by quality to obtain the right composition of raw materials. Tobacco leaf quality assessment rests on two factors, human sensory and human vision, both carried out by a grader. Advances in information technology now make image processing possible, so the human vision factor can be maximized, which is expected to save time and cost. In this study, tobacco leaf quality detection is based on two extracted leaf features: shape and texture. Both features are then classified using Conjugate Gradient optimization on a Backpropagation Neural Network. The results show that the method improves the accuracy of tobacco leaf quality detection. For tobacco leaf grade classification with the backpropagation neural network method, accuracy reaches up to 77.50%

Neliti

Learning multi-linear representations of distributions for efficient inference

Author: A. Darwiche
D. Burdick
D. Gunopulos
D. Heckerman
D. Roth
Dan Roth
E. Castillo
F. V. Jensen
J. Pearl
J. S. Yedidia
M. J. Wainwright
M. Jaeger
M. Meila
N. L. Zhang
N. Srebro
Rajhans Samdani
W. R. Gilks
Publication venue: Springer Science and Business Media LLC

Crossref

Kernel Spectral Clustering and applications

In this chapter we review the main literature related to kernel spectral clustering (KSC), an approach to clustering cast within a kernel-based optimization setting. KSC represents a least-squares support vector machine based formulation of spectral clustering described by a weighted kernel PCA objective. Just as in the classifier case, the binary clustering model is expressed by a hyperplane in a high dimensional space induced by a kernel. In addition, the multi-way clustering can be obtained by combining a set of binary decision functions via an Error Correcting Output Codes (ECOC) encoding scheme. Because of its model-based nature, the KSC method encompasses three main steps: training, validation, testing. In the validation stage model selection is performed to obtain tuning parameters, like the number of clusters present in the data. This is a major advantage compared to classical spectral clustering where the determination of the clustering parameters is unclear and relies on heuristics. Once a KSC model is trained on a small subset of the entire data, it is able to generalize well to unseen test points. Beyond the basic formulation, sparse KSC algorithms based on the Incomplete Cholesky Decomposition (ICD) and $L_0$, $L_1$, $L_0 + L_1$, Group Lasso regularization are reviewed. In that respect, we show how it is possible to handle large scale data. Also, two possible ways to perform hierarchical clustering and a soft clustering method are presented. Finally, real-world applications such as image segmentation, power load time-series clustering, document clustering and big data learning are considered. Comment: chapter contribution to the book "Unsupervised Learning Algorithms

arXiv.org e-Print Archive

Crossref

A Confidence Interval for the Wallace Coefficient of Concordance and Its Application to Microbial Typing Methods

Author: A Friaes
A Vainio
D Steinley
DL Wallace
E Simpson
Enrico Scalas
ER Martins
FR Pinto
Francisco R. Pinto
H Grundmann
JA Carrico
José Melo-Cristino
M Erlandsson
M Meila
M Miragaia
Mário Ramirez
NA Faria
S Camiz
Publication venue: Public Library of Science
Publication date: 01/01/2008

Very diverse research fields frequently deal with the analysis of multiple clustering results, which should imply an objective detection of overlaps and divergences between the formed groupings. The congruence between these multiple results can be quantified by clustering comparison measures such as the Wallace coefficient (W). Since the measured congruence is dependent on the particular sample taken from the population, there is variability in the estimated values relative to those of the true population. In the present work we propose the use of a confidence interval (CI) to account for this variability when W is used. The CI analytical formula is derived assuming a Gaussian sampling distribution and drawing on the algebraic relationship between W and Simpson's index of diversity. This relationship also allows the estimation of the expected Wallace value under the assumption of independence of classifications. We evaluated the CI performance using simulated and published microbial typing data sets. The simulations showed that the CI has the desired 95% coverage when W is greater than 0.5. This behaviour is robust to changes in cluster number, cluster size distributions and sample size. The analysis of the published data sets demonstrated the usefulness of the new CI by objectively validating some of the previous interpretations, while showing that other conclusions lacked statistical support
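For readers unfamiliar with the quantities named in this abstract, the following is a minimal sketch of the Wallace coefficient and Simpson's index of diversity computed from two label vectors. It is not the paper's code and does not implement the proposed confidence interval. The function names are hypothetical, and the independence relation used in the last helper (expected W(A→B) = 1 − SID of B) is stated here as an assumption about the relationship the abstract alludes to.

```python
from collections import Counter
from math import comb


def pairs_together(labels):
    """Count unordered pairs of items that share the same label."""
    return sum(comb(n, 2) for n in Counter(labels).values())


def wallace(a, b):
    """Wallace coefficient W(A->B): of the pairs grouped together by A,
    the fraction that are also grouped together by B."""
    if len(a) != len(b):
        raise ValueError("label vectors must have the same length")
    same_a = pairs_together(a)
    if same_a == 0:
        raise ValueError("A has no co-clustered pairs; W(A->B) is undefined")
    # A pair is together in both partitions iff it shares the joint label.
    same_both = pairs_together(list(zip(a, b)))
    return same_both / same_a


def simpson_diversity(labels):
    """Simpson's index of diversity: probability that two distinct items
    drawn at random carry different labels."""
    return 1 - pairs_together(labels) / comb(len(labels), 2)


def expected_wallace_independent(b):
    """Assumed relation: under independence, W(A->B) is expected to equal
    the chance that a random pair is together in B, i.e. 1 - SID(B)."""
    return 1 - simpson_diversity(b)


if __name__ == "__main__":
    typing_a = [0, 0, 0, 1, 1, 1]   # illustrative labels, not real data
    typing_b = [0, 0, 1, 1, 2, 2]
    print(wallace(typing_a, typing_b))
    print(wallace(typing_b, typing_a))
    print(expected_wallace_independent(typing_b))
```

Note that W is directional: W(A→B) and W(B→A) generally differ, which is why both calls appear in the usage lines.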

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Influence of wiring cost on the large-scale architecture of human cortical connectivity

Author: A Fornito
A Zalesky
AK Seth
AL Traud
Anil K. Seth
B Karrer
BJ Baars
C Echtermeyer
C Koch
CC Hilgetag
CJ Honey
CJ Stam
D Bhowmik
D Holten
D Meunier
D Tomasi
David Samu
DB Chklovskii
DC Van Essen
DJ Watts
DS Bassett
DS Modha
E Bullmore
E Ravasz
FD Rossa
G Tononi
G Zamora-López
H Johansen-Berg
H Pan
J Cabral
J Gómez-Gardeñes
JP Onnela
L Cammoun
L Danon
M Arthuis
M Kaiser
M Meila
M Rubinov
M Shanahan
M Steen
M Valencia
MD Greicius
MD Humphries
ME Newman
ME Raichle
MEJ Newman
MG Kitzbichler
MP Van den Heuvel
MP van den Heuvel
O Sporns
Olaf Sporns
P Hagmann
R Milo
S Achard
S Boccaletti
S Jbabdi
S Maslov
S Mori
S Zhou
SB Seidman
SL Bressler
SP Borgatti
T Opsahl
T Sørensen
TE Conturo
Thomas Nowotny
V Colizza
V Latora
Y Chen
Y He
YY Ahn
ZJ Chen
Publication venue: Public Library of Science (PLoS)
Publication date: 01/04/2014

In the past two decades some fundamental properties of cortical connectivity have been discovered: small-world structure, pronounced hierarchical and modular organisation, and strong core and rich-club structures. A common assumption when interpreting results of this kind is that the observed structural properties are present to enable the brain's function. However, the brain is also embedded into the limited space of the skull and its wiring has associated developmental and metabolic costs. These basic physical and economic aspects place separate, often conflicting, constraints on the brain's connectivity, which must be characterized in order to understand the true relationship between brain structure and function. To address this challenge, here we ask which, and to what extent, aspects of the structural organisation of the brain are conserved if we preserve specific spatial and topological properties of the brain but otherwise randomise its connectivity. We perform a comparative analysis of a connectivity map of the cortical connectome both on high- and low-resolutions utilising three different types of surrogate networks: spatially unconstrained (‘random’), connection length preserving (‘spatial’), and connection length optimised (‘reduced’) surrogates. We find that unconstrained randomisation markedly diminishes all investigated architectural properties of cortical connectivity. By contrast, spatial and reduced surrogates largely preserve most properties and, interestingly, often more so in the reduced surrogates. Specifically, our results suggest that the cortical network is less tightly integrated than its spatial constraints would allow, but more strongly segregated than its spatial constraints would necessitate. We additionally find that hierarchical organisation and rich-club structure of the cortical connectivity are largely preserved in spatial and reduced surrogates and hence may be partially attributable to cortical wiring constraints. 
In contrast, the high modularity and strong s-core of the high-resolution cortical network are significantly stronger than in the surrogates, underlining their potential functional relevance in the brain

Crossref

Directory of Open Access Journals

PubMed Central

Sussex Research Online

Three Modern Roles for Logic in AI

Author: Amarilli Antoine
Chan Hei
Choi Arthur
Darwiche Adnan
Latour Anna L. D.
Manhaeve Robin
McCarthy John
Meila Marina
Muise Christian J.
Murphy Kevin Patrick
Narodytska Nina
Oztok Umut
Ribeiro Marco Tú
Roth Dan
Sharma Shubham
Shen Yujia
Shih Andy
Slivovsky Friedrich
Thurley Marc
Xu Jingyi
Publication venue: Association for Computing Machinery (ACM)
Publication date: 18/04/2020

We consider three modern roles for logic in artificial intelligence, which are based on the theory of tractable Boolean circuits: (1) logic as a basis for computation, (2) logic for learning from a combination of data and knowledge, and (3) logic for reasoning about the behavior of machine learning systems. Comment: To be published in PODS 202

arXiv.org e-Print Archive

Crossref

Ranked Adjusted Rand: integrating distance and partition information in a measure of clustering agreement

Author: A Thalamuthu
B Larsen
C Silva-Costa
D Steinley
DL Wallace
EB Fowlkes
FJ Rohlf
Francisco R Pinto
FX Wu
GW Milligan
H Chipman
H Li
HL Kundel
I Serrano
JA Carrico
Jonas S Almeida
João A Carriço
L Hubert
M Meila
Mário Ramirez
PH Sneath
S van Dongen
WM Rand
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Biological information is commonly used to cluster or classify entities of interest such as genes, conditions, species or samples. However, different sources of data can be used to classify the same set of entities and methods allowing the comparison of the performance of two data sources or the determination of how well a given classification agrees with another are frequently needed, especially in the absence of a universally accepted "gold standard" classification. RESULTS: Here, we describe a novel measure – the Ranked Adjusted Rand (RAR) index. RAR differs from existing methods by evaluating the extent of agreement between any two groupings, taking into account the intercluster distances. This characteristic is relevant to evaluate cases of pairs of entities grouped in the same cluster by one method and separated by another. The latter method may assign them to close neighbour clusters or, on the contrary, to clusters that are far apart from each other. RAR is applicable even when intercluster distance information is absent for both or one of the groupings. In the first case, RAR is equal to its predecessor, Adjusted Rand (HA) index. Artificially designed clusterings were used to demonstrate situations in which only RAR was able to detect differences in the grouping patterns. A study with larger simulated clusterings ensured that in realistic conditions, RAR is effectively integrating distance and partition information. The new method was applied to biological examples to compare 1) two microbial typing methods, 2) two gene regulatory network distances and 3) microarray gene expression data with pathway information. In the first application, one of the methods does not provide intercluster distances while the other originated a hierarchical clustering. RAR proved to be more sensitive than HA in the choice of a threshold for defining clusters in the hierarchical method that maximizes agreement between the results of both methods. 
CONCLUSION: RAR has its major advantage in combining cluster distance and partition information, while the previously available methods used only the latter. RAR should be used in the research problems where HA was previously used, because in the absence of intercluster distance effects it is an equally effective measure, and in the presence of distance effects it is a more complete one
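As a point of reference for the baseline this abstract builds on, here is a minimal sketch of the standard Adjusted Rand index (called HA above), computed from pair counts. It does not implement RAR and uses no intercluster distances. The function name and the handling of the degenerate case are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand(a, b):
    """Adjusted Rand index between two partitions given as label vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two label vectors of equal length >= 2")

    def pairs_together(labels):
        # Unordered pairs of items sharing a label.
        return sum(comb(n, 2) for n in Counter(labels).values())

    same_a = pairs_together(a)
    same_b = pairs_together(b)
    same_both = pairs_together(list(zip(a, b)))  # together in A and in B
    total = comb(len(a), 2)

    expected = same_a * same_b / total   # agreement expected by chance
    maximum = (same_a + same_b) / 2      # upper bound on the index
    if maximum == expected:
        # Assumed convention: both partitions are all-singletons or both are
        # a single cluster, so they are identical and the index is taken as 1.
        return 1.0
    return (same_both - expected) / (maximum - expected)


if __name__ == "__main__":
    method_1 = [0, 0, 0, 1, 1, 1]   # illustrative labels, not real data
    method_2 = [0, 0, 1, 1, 2, 2]
    print(adjusted_rand(method_1, method_2))
```

The index is symmetric in its two arguments and equals 1 for identical partitions; values near 0 indicate agreement no better than chance.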

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Delineating Geographical Regions with Networks of Human Interactions in an Extensive Set of Countries

Author: A Clauset
BC Csáji
C Ratti
C Thiemann
Carlo Ratti
D Lazer
E Fowlkes
G Miritello
M González
M Mata
M Meila
M Newman
M Szell
Michael Szell
O Attanasio
R Leonardi
Riccardo Campari
S Becker
S Fortunato
S Phithakkitnukoon
S Rinzivillo
S Sobolevsky
Stanislav Sobolevsky
T Gallagher
Thomas Couronné
V Belik
V Blondel
VD Blondel
W Christaller
W Rand
Yamir Moreno
Zbigniew Smoreda
Publication venue: Public Library of Science (PLoS)
Publication date: 01/08/2013

Large-scale networks of human interaction, in particular country-wide telephone call networks, can be used to redraw geographical maps by applying algorithms of topological community detection. The geographic projections of the emerging areas in a few recent studies on single regions have been suggested to share two distinct properties: first, they are cohesive, and second, they tend to closely follow socio-economic boundaries and are similar to existing political regions in size and number. Here we use an extended set of countries and clustering indices to quantify overlaps, providing ample additional evidence for these observations using phone data from countries of various scales across Europe, Asia, and Africa: France, the UK, Italy, Belgium, Portugal, Saudi Arabia, and Ivory Coast. In our analysis we use the known approach of partitioning country-wide networks, and an additional iterative partitioning of each of the first level communities into sub-communities, revealing that cohesiveness and matching of official regions can also be observed on a second level if spatial resolution of the data is high enough. The method has possible policy implications on the definition of the borderlines and sizes of administrative regions. National Science Foundation (U.S.); Singapore-MIT Alliance for Research and Technolog

arXiv.org e-Print Archive

Public Library of Science (PLOS)

DSpace@MIT

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Two case reports of bilateral vertebral artery tortuosity and spiral twisting in vascular vertigo

Author: A Dodevski
CC Lee
D Meila
DP Zhang
DW Giang
E Jellici
GC Cloud
HC Han
J Gutierrez
JH Park
JJ Chen
K Ikeda
M Baldy-Moulinier
M Canyigit
M Cosar
M Karatas
ME Drake Jr
S Giannopoulos
SA Morris
SH Lee
SI Savitz
V Ranganatha Sastry
Zhang Dao-pei
Zhang Hong-tao
Zhang Shu-ling
Publication venue: Springer Science and Business Media LLC

Crossref

Apples and oranges: avoiding different priors in Bayesian DNA sequence analysis

Author: A Bernal
A Culotta
A Feelders
AE Kel
AL Berger
AY Ng
C Burge
CM Bishop
D Cai
D Grossman
D Heckerman
D Klein
E Redhead
E Segal
F Pernkopf
G Yeo
GD Stormo
H Wallach
H Wettig
HE Peckham
I Ben-Gal
Ivo Grosse
J Cerquides
J Davis
J Goodman
J Grau
J Keilwagen
Jan Grau
Jens Keilwagen
L Narlikar
M Arita
M Meila-Predoviciu
M Tompa
M Zhang
MI Jordan
NK Kim
O Schulte
O Yakhnenko
P Grünwald
R Castelo
R Greiner
R Staden
S Chen
S Sonnenburg
SL Salzberg
Stefan Posch
T Fawcett
TH Kim
TM Chen
WL Buntine
Y Barash
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: One of the challenges of bioinformatics remains the recognition of short signal sequences in genomic DNA such as donor or acceptor splice sites, splicing enhancers or silencers, translation initiation sites, transcription start sites, transcription factor binding sites, nucleosome binding sites, miRNA binding sites, or insulator binding sites. During the last decade, a wealth of algorithms for the recognition of such DNA sequences has been developed and compared with the goal of improving their performance and deepening our understanding of the underlying cellular processes. Most of these algorithms are based on statistical models belonging to the family of Markov random fields such as position weight matrix models, weight array matrix models, Markov models of higher order, or moral Bayesian networks. While in many comparative studies different learning principles or different statistical models have been compared, the influence of choosing different prior distributions for the model parameters when using different learning principles has been overlooked, which has possibly led to questionable conclusions.
Results: With the goal of allowing direct comparisons of different learning principles for models from the family of Markov random fields based on the same a-priori information, we derive a generalization of the commonly-used product-Dirichlet prior. We find that the derived prior behaves like a Gaussian prior close to the maximum and like a Laplace prior in the far tails. In two case studies, we illustrate the utility of the derived prior for a direct comparison of different learning principles with different models for the recognition of binding sites of the transcription factor Sp1 and human donor splice sites.
Conclusions: We find that comparisons of different learning principles using the same a-priori information can lead to conclusions different from those of previous studies in which the effect resulting from different priors has been neglected. The derived prior is implemented in the open-source library Jstacs to enable an easy application to comparative studies of different learning principles in the field of sequence analysis.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central