150 research outputs found

    Incremental Genetic Algorithms for Graph Optimization Problems

    Doctoral thesis, Seoul National University Graduate School, Department of Electrical and Computer Engineering, August 2016. Byung-Ro Moon (문병로).
    A combinatorial optimization problem is an optimization problem with a discrete solution space. Many graph problems belong to this category because graphs are discrete objects. Graphs are widely used in various fields, and many real-world combinatorial optimization problems take graphs as input. For some of these problems, the size of the solution space is exponential in the size of the problem, so efficient search algorithms are required. Genetic algorithms are widely used to solve combinatorial optimization problems, and incremental genetic algorithms can be used to solve graph optimization problems efficiently. We define subproblems and solve them step by step instead of tackling the problems directly. A subproblem solved by an incremental genetic algorithm deals with a restriction of the original graph structure. The subproblems are solved in intermediate steps, and the size of the subproblem is gradually increased. We apply the same genetic algorithm to each subproblem, initializing it with the evolved population of the previous step. We propose incremental genetic algorithms for two different combinatorial optimization problems: the subgraph isomorphism problem and the graph cut optimization problem. We devise an optimal substructure on the subproblem sequence and explain how it relates to the optimality of the process, along with other related factors. We present graph expansion methodologies and vertex reordering schemes to define an appropriate sequence of subproblems. We combine the proposed incremental approach with a hybrid genetic algorithm for the subgraph isomorphism problem, and we developed the algorithm further to obtain nearly perfect results. Based on our analysis, we also propose an incremental genetic algorithm to solve graph cut optimization problems. We tested the implementation of the algorithm on benchmark graph instances for the graph partitioning problem and the maximum cut problem. Through experiments, we investigate and analyze how the sequence of subproblems affects the search space landscape. The performance of a genetic algorithm improves when the incremental approach is applied with an appropriate sequence of subproblems.
    Contents: Chapter I. Introduction. Chapter II. Incremental Genetic Algorithm: 2.1 Overview and Traditional Applications; 2.2 Application on Graph Optimization Problems (2.2.1 Formalization of the Incremental Process; 2.2.2 Theoretical Background; 2.2.3 Sequence of Subproblems). Chapter III. Subgraph Isomorphism Problem: 3.1 Introduction; 3.2 The Proposed Algorithm (3.2.1 The Structure of the Incremental Genetic Algorithm; 3.2.2 Design Issues; 3.2.3 Genetic Framework); 3.3 Experimental Results (3.3.1 Dataset and Evaluation; 3.3.2 Results and Discussions; 3.3.3 Overall Results); 3.4 Further Improvement (3.4.1 New Operators; 3.4.2 Improvements by New Operators; 3.4.3 Overall Result). Chapter IV. Graph Cut Optimization Problems: 4.1 Introduction; 4.2 The Proposed Algorithm (4.2.1 Subproblem Structure; 4.2.2 Reordering Schemes; 4.2.3 Genetic Framework); 4.3 Experimental Results (4.3.1 Dataset and Evaluation; 4.3.2 Results on Graph Partitioning Problem; 4.3.3 Results on Maximum Cut Problem; 4.3.4 Results on Problem Variants). Chapter V. Related Applications: 5.1 Measuring Source Code Similarity with an Incremental Genetic Algorithm (5.1.1 Introduction; 5.1.2 The Proposed System; 5.1.3 Experimental Results; 5.1.4 Discussion); 5.2 Linear Ordering Problem and an Approximate Fitness Evaluation (5.2.1 Introduction; 5.2.2 The Proposed Method; 5.2.3 Experimental Results). Chapter VI. Conclusions. Bibliography. Korean abstract (국문 초록).
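
The incremental scheme in this abstract (solve a sequence of growing subproblems, seeding each step with the population evolved in the previous one) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm. It assumes maximum cut as the target problem, takes the subproblem at step k to be the subgraph induced by the first k vertices in their given order, and uses a deliberately plain genetic algorithm. All function names and parameter values here are hypothetical.

```python
import random


def cut_value(edges, assignment):
    """Count edges inside the current subproblem whose endpoints are on opposite sides."""
    k = len(assignment)  # only vertices 0..k-1 exist in this subproblem
    return sum(1 for u, v in edges
               if u < k and v < k and assignment[u] != assignment[v])


def evolve(edges, population, rng, generations=30, mutation_rate=0.2):
    """Plain generational GA: elitism, binary tournaments, uniform crossover, bit flip."""
    k = len(population[0])

    def fitness(individual):
        return cut_value(edges, individual)

    for _ in range(generations):
        next_population = [max(population, key=fitness)]  # keep the current best
        while len(next_population) < len(population):
            parent1 = max(rng.sample(population, 2), key=fitness)
            parent2 = max(rng.sample(population, 2), key=fitness)
            child = [rng.choice(pair) for pair in zip(parent1, parent2)]
            if rng.random() < mutation_rate:
                position = rng.randrange(k)
                child[position] = 1 - child[position]
            next_population.append(child)
        population = next_population
    return population


def incremental_ga(num_vertices, edges, pop_size=20, seed=0):
    """Grow the subproblem one vertex at a time, reusing the evolved population."""
    rng = random.Random(seed)
    # Step 1: subproblem containing vertex 0 only.
    population = [[rng.randint(0, 1)] for _ in range(pop_size)]
    for _ in range(2, num_vertices + 1):
        # Expansion: every individual gets a random side for the newly added vertex,
        # then the same GA continues from this inherited population.
        population = [individual + [rng.randint(0, 1)] for individual in population]
        population = evolve(edges, population, rng)
    best = max(population, key=lambda individual: cut_value(edges, individual))
    return best, cut_value(edges, best)


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
    partition, value = incremental_ga(4, cycle)
    print(partition, value)
```

The vertex order fixes the subproblem sequence in this sketch, which is why the abstract's vertex reordering schemes matter: a different order yields a different sequence of intermediate search spaces.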

    Online Spectral Clustering on Network Streams

    Graphs are an extremely useful representation of a wide variety of practical systems in data analysis. Recently, with the rapid accumulation of stream data from various types of networks, significant research interest has arisen in spectral clustering for network streams (or evolving networks). Compared with the general spectral clustering problem, data analysis for this new type of problem may have additional requirements, such as short processing time, scalability in distributed computing environments, and tracking of temporal variation. Designing a spectral clustering method that satisfies these requirements is non-trivial, and the new algorithm design faces three major challenges. The first is online clustering computation: most existing spectral methods for evolving networks are offline methods that use standard eigensystem solvers such as the Lanczos method and must recompute solutions from scratch at each time point. The second is parallelization: standard eigensolvers are iterative algorithms whose number of iterations cannot be predetermined, which makes them hard to parallelize. The third is that existing work is very limited. In addition, the existing method has multiple limitations, such as computational inefficiency under large similarity changes, the lack of a sound theoretical basis, and the lack of an effective way to handle accumulated approximation errors and large data variations over time. In this thesis, we proposed a new online spectral graph clustering approach with a family of three novel spectrum approximation algorithms. Our algorithms incrementally update the eigenpairs in an online manner to improve computational performance. Our approaches outperformed the existing method in computational efficiency and scalability while retaining competitive or even better clustering accuracy. We derived our spectrum approximation techniques, GEPT and EEPT, through formal theoretical analysis. Well-established matrix perturbation theory forms a solid theoretical foundation for our online clustering method. We equipped our clustering method with a new metric that tracks accumulated approximation errors and measures short-term temporal variation. The metric not only balances computational efficiency against clustering accuracy but also offers a useful tool for adapting the online algorithm to unexpected drastic noise. In addition, we discussed our preliminary work on approximate graph mining with an evolutionary process, non-stationary Bayesian network structure learning from non-stationary time series data, and Bayesian network structure learning with text priors imposed by non-parametric hierarchical topic modeling.
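
The abstract does not spell out GEPT or EEPT, so the sketch below shows only the generic textbook idea that incremental eigenpair updates build on: for a symmetric matrix A with a simple eigenvalue λ and eigenvector v, a small symmetric change ΔA shifts the eigenvalue to first order by vᵀΔAv / vᵀv. The function names are hypothetical, and this is not the thesis's method.

```python
def matvec(matrix, vector):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def first_order_eigenvalue_update(eigenvalue, eigenvector, delta):
    """Estimate the eigenvalue of A + delta from an eigenpair of A.

    Assumes A and delta are symmetric and the eigenvalue is simple.
    The estimate is eigenvalue + (v^T delta v) / (v^T v); no eigensolver is rerun.
    """
    shift = dot(eigenvector, matvec(delta, eigenvector)) / dot(eigenvector, eigenvector)
    return eigenvalue + shift


if __name__ == "__main__":
    # A = [[2, 1], [1, 2]] has the eigenpair (3, [1, 1]).
    delta = [[0.02, 0.0], [0.0, 0.0]]  # small change to one similarity entry
    estimate = first_order_eigenvalue_update(3.0, [1.0, 1.0], delta)
    # Exact largest eigenvalue of A + delta from the 2x2 characteristic polynomial.
    exact = (4.02 + (4.02 ** 2 - 4 * 3.04) ** 0.5) / 2
    print(estimate, exact)
```

Because the estimate drops second-order terms, its error grows with the size of ΔA and accumulates over successive updates. This is the kind of drift that the accumulated-error metric in the abstract is meant to track.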

    ELRUNA: Elimination Rule-based Network Alignment

    Networks model a variety of complex phenomena across different domains. In many applications, one of the most essential tasks is to align two or more networks to infer the similarities between cross-network vertices and discover potential node-level correspondence. In this thesis, we propose ELRUNA (Elimination rule-based network alignment), a novel network alignment algorithm that relies exclusively on the underlying graph structure. Under the guidance of the elimination rules that we define, ELRUNA computes the similarity between a pair of cross-network vertices iteratively by accumulating the similarities between their selected neighbors. The resulting cross-network similarity matrix is then used to infer a permutation matrix that encodes the final alignment of cross-network vertices. In addition to the novel alignment algorithm, we also improve the performance of local search, a commonly used post-processing step for solving the network alignment problem, by introducing a novel selection method, RAWSEM (Random-walk based selection method), based on the propagation of the levels of mismatching (defined in the thesis) of vertices across the networks. The key idea is to pass on the initial levels of mismatching of vertices throughout the entire network in a random-walk fashion. Through extensive numerical experiments on real networks, we demonstrate that ELRUNA significantly outperforms the state-of-the-art alignment methods in terms of alignment accuracy with lower or comparable running time. Moreover, ELRUNA is robust to network perturbations: it maintains a close-to-optimal objective value under a high level of noise added to the original networks. Finally, the proposed RAWSEM further improves the alignment quality with fewer iterations than the naive local search method.

    Doctor of Philosophy

    Dissertation. Network emulation has become an indispensable tool for the conduct of research in networking and distributed systems. It offers more realism than simulation and more control and repeatability than experimentation on a live network. However, emulation testbeds face a number of challenges, most prominently realism and scale. Because emulation allows the creation of arbitrary networks exhibiting a wide range of conditions, there is no guarantee that emulated topologies reflect real networks; the burden of selecting parameters to create a realistic environment is on the experimenter. While there are a number of techniques for measuring the end-to-end properties of real networks, directly importing such properties into an emulation has been a challenge. Similarly, while there exist numerous models for creating realistic network topologies, the lack of addresses on these generated topologies has been a barrier to using them in emulators. Once an experimenter obtains a suitable topology, that topology must be mapped onto the physical resources of the testbed so that it can be instantiated. A number of restrictions make this an interesting problem: testbeds typically have heterogeneous hardware, scarce resources that must be conserved, and bottlenecks that must not be overused. User requests for particular types of nodes or links must also be met. In light of these constraints, the network testbed mapping problem is NP-hard. Though the complexity of the problem increases rapidly with the size of the experimenter's topology and the size of the physical network, the runtime of the mapper must not; long mapping times can hinder the usability of the testbed. This dissertation makes three contributions towards improving realism and scale in emulation testbeds. First, it meets the need for realistic network conditions by creating Flexlab, a hybrid environment that couples an emulation testbed with a live-network testbed, inheriting strengths from each. Second, it attends to the need for realistic topologies by presenting a set of algorithms for automatically annotating generated topologies with realistic IP addresses. Third, it presents a mapper, assign, that is capable of assigning experimenters' requested topologies to testbeds' physical resources in a manner that scales well enough to handle large environments.

    SANA: simulated annealing far outperforms many other search algorithms for biological network alignment

    Summary: Every alignment algorithm consists of two orthogonal components: an objective function M measuring the quality of an alignment, and a search algorithm that explores the space of alignments looking for ones scoring well according to M. We introduce a new search algorithm called SANA (Simulated Annealing Network Aligner) and apply it to protein-protein interaction networks using S³ as the topological measure. Compared against 12 recent algorithms, SANA produces 5-10 times as many correct node pairings as the others when the correct answer is known. We expose an anti-correlation in many existing aligners between their ability to produce good topological vs. functional similarity scores, whereas SANA usually outscores other methods in both measures. If given the perfect objective function encoding the identity mapping, SANA quickly converges to the perfect solution while many other algorithms falter. We observe that when aligning networks with a known mapping and optimizing only S³, SANA creates alignments that are not perfect and yet whose S³ scores match that of the perfect alignment. We call this phenomenon saturation of the topological score. Saturation implies that a measure's correlation with alignment correctness falters before the perfect alignment is reached. This, combined with SANA's ability to produce the perfect alignment if given the perfect objective function, suggests that better objective functions may lead to dramatically better alignments. We conclude that future work should focus on finding better objective functions, and offer SANA as the search algorithm of choice.
    Availability and implementation: Software available at http://sana.ics.uci.edu. Contact: [email protected]
    Supplementary information: Supplementary data are available at Bioinformatics online.
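
The split this abstract describes, between an objective function and a search algorithm, can be sketched in Python. This is not SANA itself. It assumes two networks with the same number of nodes, swaps S³ for a simpler objective (the number of edges of the first network mapped onto edges of the second), and uses a geometric cooling schedule. All names and parameter values are hypothetical.

```python
import math
import random


def conserved_edges(edges1, edge_set2, mapping):
    """Objective: how many edges of network 1 land on edges of network 2."""
    return sum(1 for u, v in edges1
               if frozenset((mapping[u], mapping[v])) in edge_set2)


def anneal_alignment(n, edges1, edges2, steps=5000, t_start=1.0, t_end=0.01, seed=0):
    """Simulated annealing over one-to-one alignments of two n-node networks."""
    rng = random.Random(seed)
    edge_set2 = {frozenset(edge) for edge in edges2}  # undirected, no self-loops assumed
    mapping = list(range(n))  # mapping[u] is the node of network 2 aligned to u
    rng.shuffle(mapping)
    score = conserved_edges(edges1, edge_set2, mapping)
    best, best_score = mapping[:], score
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temperature = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(n), 2)
        mapping[i], mapping[j] = mapping[j], mapping[i]  # propose swapping two images
        new_score = conserved_edges(edges1, edge_set2, mapping)
        accept = (new_score >= score
                  or rng.random() < math.exp((new_score - score) / temperature))
        if accept:
            score = new_score
            if score > best_score:
                best, best_score = mapping[:], score
        else:
            mapping[i], mapping[j] = mapping[j], mapping[i]  # reject: undo the swap
    return best, best_score


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3), (3, 4)]
    relabeled = [(4, 2), (2, 0), (0, 3), (3, 1)]  # the same path with shuffled labels
    alignment, score = anneal_alignment(5, path, relabeled)
    print(alignment, score)
```

Only `conserved_edges` knows what a good alignment is, so a different objective can be substituted without touching the search loop. That separation is what lets the abstract evaluate the search algorithm and the objective function independently.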