
    Robust Clustering based on Winner-Population Markov Chain

    In this paper, we propose an unsupervised genetic clustering algorithm that produces each new chromosome without conventional genetic operators, instead using gene-reproduction probabilities determined by Markov chain modeling. Selecting cluster centers from the dataset makes it possible to build a look-up table that stores the distances between all pairs of data points. The experimental results show that the proposed algorithm not only solves the premature convergence problem, giving more stable clustering performance in terms of the number of clusters and the clustering results, but also improves time efficiency.
    Sponsorship: IAPR. Notice: correction completed. Citation index: EI. Conference type: international. Conference date: 20060820~20060824. Book type: print. Call for papers: Y. Conference location: Hong Kong, Chin

    Population-Markov-Chain-Based Clustering Technique

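    Project number: NSC94-2213-E032-027. Research period: 200508~200607. Research funding: 333,000.
    This study proposes a new clustering technique that is based on the genetic algorithm (GA) but does not need to execute GA operations. By analyzing population Markov chains and modifying some genetic algorithm operations, the proposed technique far outperforms other existing GA clustering methods. The proposed strategy uses a modified version of the Markov chain proposed by Yong Gao et al. to compute the evolutionary process. During evolution, offspring are generated according to the probabilities provided by Markov chain modeling, so conventional genetic operators such as reproduction, crossover, and mutation are not needed. This saves the large amount of computation that a genetic algorithm would otherwise require. During clustering, the center of each cluster is selected from the dataset and is represented with a binary encoding, so the distance between every pair of points in the dataset can be computed in advance and stored in a look-up table, which avoids repeated computation when evaluating the fitness function. In this project we will analyze different distance metrics and study how to preserve cluster characteristics such as shape and size. Finally, the DB index is used to measure cluster validity. The experimental results indicate that the proposed method is superior to other conventional genetic algorithms in both clustering results and execution efficiency.
    Sponsorship: 行政院國家科學委員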
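The look-up-table idea described in the two abstracts above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the one-bit-per-data-point chromosome layout, the Euclidean distance, and the use of selected data points (rather than centroids) inside the DB index are all assumptions made for illustration. Neither the Markov chain step nor a DB-index formulation is specified in enough detail here to reproduce, so the sketch covers only the precomputed table and a table-only fitness evaluation.

```python
import math
from itertools import combinations


def build_distance_table(points):
    """Precompute Euclidean distances between all pairs of points once."""
    n = len(points)
    table = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = math.dist(points[i], points[j])
        table[i][j] = table[j][i] = d
    return table


def decode_centers(chromosome):
    """Assumed encoding: bit i == 1 means data point i is a cluster center."""
    return [i for i, bit in enumerate(chromosome) if bit == 1]


def assign_points(table, centers):
    """Assign every point to its nearest center using table look-ups only."""
    clusters = {c: [] for c in centers}
    for p in range(len(table)):
        nearest = min(centers, key=lambda c: table[p][c])
        clusters[nearest].append(p)
    return clusters


def davies_bouldin(table, chromosome):
    """DB index with data points as centers; lower means better clustering."""
    centers = decode_centers(chromosome)
    if len(centers) < 2:
        return math.inf
    # Coincident centers would give a zero separation; treat as invalid.
    if any(table[a][b] == 0.0 for a, b in combinations(centers, 2)):
        return math.inf
    clusters = assign_points(table, centers)
    # Mean distance from each cluster's members to its center.
    scatter = {c: sum(table[p][c] for p in members) / len(members)
               for c, members in clusters.items()}
    worst = [max((scatter[a] + scatter[b]) / table[a][b]
                 for b in centers if b != a)
             for a in centers]
    return sum(worst) / len(centers)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
table = build_distance_table(points)
good = davies_bouldin(table, [1, 0, 1, 0])  # one center per natural group
bad = davies_bouldin(table, [1, 1, 0, 0])   # both centers in the same group
```

Because every candidate chromosome only indexes into `table`, no distance is recomputed during fitness evaluation, which is the saving the abstracts attribute to choosing centers from the dataset.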

    Recovering complete and draft population genomes from metagenome datasets.

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost of applying them to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., bins containing sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution

    EEG source-space synchrostate transitions and Markov modeling in the math-gifted brain during a long-chain reasoning task

    To reveal transition dynamics of global neuronal networks of math‐gifted adolescents in handling long‐chain reasoning, this study explores momentary phase‐synchronized patterns, that is, electroencephalogram (EEG) synchrostates, of intracerebral sources sustained in successive 50 ms time windows during a reasoning task and non‐task idle process. Through agglomerative hierarchical clustering for functional connectivity graphs and nested iterative cosine similarity tests, this study identifies seven general and one reasoning‐specific prototypical functional connectivity patterns from all synchrostates. Markov modeling is performed for the time‐sequential synchrostates of each trial to characterize the interstate transitions. The analysis reveals that default mode network, central executive network (CEN), dorsal attention network, cingulo‐opercular network, left/right ventral frontoparietal network, and ventral visual network aperiodically recur over non‐task or reasoning process, exhibiting high predictability in interactively reachable transitions. Compared to non‐gifted subjects, math‐gifted adolescents show higher fractional occupancy and mean duration in CEN and reasoning‐triggered transient right frontotemporal network (rFTN) in the time course of the reasoning process. Statistical modeling of Markov chains reveals that there are more self‐loops in CEN and rFTN of the math‐gifted brain, suggesting robust state durability in temporally maintaining the topological structures. Besides, math‐gifted subjects show higher probabilities in switching from the other types of synchrostates to CEN and rFTN, which represents more adaptive reconfiguration of connectivity pattern in the large‐scale cortical network for focused task‐related information processing, which underlies superior executive functions in controlling goal‐directed persistence and high predictability of implementing imagination and creative thinking during long‐chain reasoning
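The quantities named in this abstract (transition probabilities, self-loops, fractional occupancy, mean duration) can be illustrated with a minimal sketch of a first-order Markov model estimated from one labeled state sequence. This is not the study's pipeline: the function names and the example sequence are hypothetical, and only the 50 ms window length is taken from the abstract.

```python
from collections import Counter
from itertools import groupby


def transition_matrix(sequence, states):
    """Estimate first-order transition probabilities from consecutive pairs."""
    counts = {s: Counter() for s in states}
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    matrix = {}
    for s in states:
        total = sum(counts[s].values())
        matrix[s] = {t: (counts[s][t] / total if total else 0.0)
                     for t in states}
    return matrix


def fractional_occupancy(sequence, states):
    """Share of time windows spent in each state."""
    seen = Counter(sequence)
    return {s: seen[s] / len(sequence) for s in states}


def mean_duration(sequence, states, window_ms=50):
    """Average length, in milliseconds, of uninterrupted runs of each state."""
    runs = {s: [] for s in states}
    for state, run in groupby(sequence):
        runs[state].append(len(list(run)))
    return {s: (window_ms * sum(r) / len(r) if r else 0.0)
            for s, r in runs.items()}


# Hypothetical sequence of per-window state labels for a single trial.
states = ["CEN", "DMN"]
sequence = ["CEN", "CEN", "DMN", "CEN", "CEN", "CEN", "DMN", "DMN"]

P = transition_matrix(sequence, states)
occupancy = fractional_occupancy(sequence, states)
duration = mean_duration(sequence, states)
```

In this representation a self-loop is the diagonal entry `P[s][s]`, so a state that tends to persist across windows shows both a larger diagonal probability and a longer mean duration.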

    ToyArchitecture: Unsupervised Learning of Interpretable Models of the World

    Research in Artificial Intelligence (AI) has focused mostly on two extremes: either on small improvements in narrow AI domains, or on universal theoretical frameworks which are usually uncomputable, incompatible with theories of biological intelligence, or lack practical implementations. The goal of this work is to combine the main advantages of the two: to follow a big picture view, while providing a particular theory and its implementation. In contrast with purely theoretical approaches, the resulting architecture should be usable in realistic settings, but also form the core of a framework containing all the basic mechanisms, into which it should be easier to integrate additional required functionality. In this paper, we present a novel, purposely simple, and interpretable hierarchical architecture which combines multiple different mechanisms into one system: unsupervised learning of a model of the world, learning the influence of one's own actions on the world, model-based reinforcement learning, hierarchical planning and plan execution, and symbolic/sub-symbolic integration in general. The learned model is stored in the form of hierarchical representations with the following properties: 1) they are increasingly more abstract, but can retain details when needed, and 2) they are easy to manipulate in their local and symbolic-like form, thus also allowing one to observe the learning process at each level of abstraction. On all levels of the system, the representation of the data can be interpreted in both a symbolic and a sub-symbolic manner. This enables the architecture to learn efficiently using sub-symbolic methods and to employ symbolic inference.

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues