
11 research outputs found

Replicable parallel branch and bound search

Author: Abu-Khzam, Alba, Aldinucci, Archibald, Bernard Gendron, Blair Archibald, Blumofe, Bomze, Butenko, Chandra, Chu, Ciaran McCreesh, Cole, Dean, Depolli, de Bruin, Eblen, Everitt, Fukagawa, Hall, Harvey, Jones, Konc, Lai, Laporte, Li, Maier, Martello, Matoušek, McCreesh, Moisan, Morrison, Nikolaev, Okubo, Olivier, Patrick Maier, Phil Trinder, Pisinger, Poldner, Prim, Prosser, Regula, Reinders, Reinelt, Robert Stewart, Salkin, San Segundo, Segundo, Tomita, Trienekens, Vogels, Walsh, Wu, Xiang, Yan
Publication venue: 'Elsevier BV'
Publication date: 12/07/2017

Combinatorial branch and bound searches are a common technique for solving global optimisation and decision problems. Their performance often depends on good search order heuristics, refined over decades of algorithms research. Parallel search necessarily deviates from the sequential search order, sometimes dramatically and unpredictably, e.g. by distributing work at random. This can disrupt effective search order heuristics and lead to unexpected and highly variable parallel performance. The variability makes it hard to reason about the parallel performance of combinatorial searches. This paper presents a generic parallel branch and bound skeleton, implemented in Haskell, with replicable parallel performance. The skeleton aims to preserve the search order heuristic by distributing work in an ordered fashion, closely following the sequential search order. We demonstrate the generality of the approach by applying the skeleton to 40 instances of three combinatorial problems: Maximum Clique, 0/1 Knapsack and Travelling Salesperson. The overheads of our Haskell skeleton are reasonable: giving slowdown factors of between 1.9 and 6.2 compared with a class-leading, dedicated, and highly optimised C++ Maximum Clique solver. We demonstrate scaling up to 200 cores of a Beowulf cluster, achieving speedups of 100x for several Maximum Clique instances. We demonstrate low variance of parallel performance across all instances of the three combinatorial problems and at all scales up to 200 cores, with median Relative Standard Deviation (RSD) below 2%. Parallel solvers that do not follow the sequential search order exhibit far higher variance, with median RSD exceeding 85% for Knapsack
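To make the role of the search order heuristic concrete, here is a minimal sequential branch and bound sketch for 0/1 Knapsack, one of the three problems named in the abstract. It is written in Python for illustration only and is not the paper's Haskell skeleton; the function name `knapsack_bnb`, the value-density ordering, and the fractional upper bound are assumptions chosen for this example.

```python
from typing import List, Tuple


def knapsack_bnb(items: List[Tuple[int, int]], capacity: int) -> int:
    """Best total value for 0/1 Knapsack via depth-first branch and bound.

    items is a list of (value, weight) pairs with positive weights.
    """
    # Search order heuristic (assumed here): decreasing value density.
    order = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0  # incumbent: best value found so far

    def upper_bound(i: int, value: int, room: int) -> float:
        # Fractional relaxation: fill the remaining room greedily and take
        # a fraction of the first item that does not fit.
        bound = float(value)
        for v, w in order[i:]:
            if w <= room:
                bound += v
                room -= w
            else:
                bound += v * room / w
                break
        return bound

    def search(i: int, value: int, room: int) -> None:
        nonlocal best
        best = max(best, value)
        if i == len(order) or upper_bound(i, value, room) <= best:
            return  # prune: this subtree cannot beat the incumbent
        v, w = order[i]
        if w <= room:
            search(i + 1, value + v, room - w)  # heuristic branch first: take the item
        search(i + 1, value, room)  # then the alternative: skip the item

    search(0, 0, capacity)
    return best
```

Because the incumbent found in early branches controls how much of the tree is pruned later, a parallel version that explores branches in a different order can do much more or much less work than this sequential one, which is the variability the abstract describes.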

arXiv.org e-Print Archive

Enlighten: Research Data (University of Glasgow)

Crossref

Heriot Watt Pure

Stirling Online Research Repository (RIOXX)

Sheffield Hallam University Research Archive

Enlighten

Stirling Online Research Repository

Multiple graph matching and applications

Author: Solé Ribalta Albert
Publication venue: 'Universitat Rovira I Virgili'
Publication date: 01/01/2012

In pattern recognition, the use of graphs is, to a great extent, appropriate and advantageous. Usually, the vertices of a graph represent local parts of an object, while the edges represent relations between these local parts. However, these advantages come with a severe drawback: the distance between two graphs cannot be computed optimally in polynomial time. Given this characteristic, the use of graph prototypes becomes ubiquitous. Graph prototypes have many applications, the most common being clustering, classification, object recognition, object characterization, and graph databases. Across all of them, however, the objective of a graph prototype is the same: to represent a set of graphs. To synthesize a prototype, all elements of the set must be mutually labeled. This mutual labeling consists of identifying which nodes of which graphs represent the same information in the training set. Once the mutual labeling is done, the set can be characterized and its local attributes combined to create a graph prototype. We call this initial labeling a common labeling. Up to now, all state-of-the-art algorithms for computing a common labeling have lacked either performance or a theoretical basis. In this thesis, we formally describe the common labeling problem and give a clear taxonomy of the types of algorithms. We describe six new algorithms, relying on different techniques, that compute a suboptimal solution to the common labeling problem. The performance of the proposed algorithms is evaluated on one artificial dataset and several real datasets. In addition, the algorithms have been evaluated in several real applications, including graph databases and group-wise image registration. In most of the tests and applications evaluated, the presented algorithms show a great improvement over the state of the art.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Tesis Doctorals en Xarxa

Repositori Institucional URV

A study on the enumeration of acyclic substructures contained in graphs and hypergraphs

Author: 和佐州洋
Publication venue
Publication date: 24/03/2016

Hokkaido University Collection of Scholarly and Academic Papers

Schema decision trees for heterogeneous JSON arrays

Author
Publication venue: University of Northern British Columbia
Publication date: 20/08/2020

Due to the popularity of the JavaScript Object Notation (JSON), a need has arisen for the creation of schema documents for the purpose of validating the content of other JSON documents. Existing automatic schema generation tools, however, have not adequately considered the scenario of an array of JSON objects with different types of structures. These tools work off the assumption that all objects have the same structure, and thus, only generate a single schema combining them together. To address this problem, this thesis looks to improve upon schema generation for heterogeneous JSON arrays. We develop an algorithm to determine a set of keys that identifies what type of structure each element has. These keys are then used as the basis for a schema decision tree. The objective of this tree is to help in the validation process by allowing each element to be compared against a single, more tailored, schema
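The idea of telling structure types apart by their keys can be sketched as follows. This is a simplified illustration and not the thesis's algorithm: it assumes that objects with the same set of top-level keys share a structure type, and the helper names `group_by_keys` and `discriminating_keys` are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, FrozenSet, List

Shape = FrozenSet[str]


def group_by_keys(array: List[Dict[str, Any]]) -> Dict[Shape, List[Dict[str, Any]]]:
    """Group the objects of a JSON array by their set of top-level keys."""
    groups: Dict[Shape, List[Dict[str, Any]]] = defaultdict(list)
    for obj in array:
        groups[frozenset(obj)].append(obj)
    return dict(groups)


def discriminating_keys(groups: Dict[Shape, List[Dict[str, Any]]]) -> Dict[Shape, Shape]:
    """For each key set, return the keys that occur in no other key set."""
    result: Dict[Shape, Shape] = {}
    for shape in groups:
        # Union of the keys used by every other structure type.
        others = set().union(*(s for s in groups if s != shape))
        result[shape] = frozenset(shape - others)
    return result
```

A validator could then test an element for one of these distinguishing keys and pick a tailored schema for it, rather than comparing every element against a single merged schema.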

Arca British Columbia's network of post-secondary digital repositories

A study on the polynomial-time solvability of the maximum clique problem

Author: Hiroaki Nakanishi
中西裕陽
Publication venue
Publication date: 19/12/2016

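The so-called "maximum clique problem" is a typical NP-complete problem, and it is strongly conjectured that solving it in polynomial time is almost impossible. It is therefore an important task to clarify, at the least, under what conditions this NP-complete problem can be solved in polynomial time. Polynomial-time solvability has been shown for several special classes of graphs, such as planar graphs and chordal graphs. For general graphs, however, no meaningful quantitative results had been published on the conditions under which the maximum clique problem becomes polynomial-time solvable. In this study, we first established a basic depth-first search algorithm for extracting a maximum clique, based on CLIQUES, an algorithm for enumerating all maximal cliques (E. Tomita, A. Tanaka, H. Takahashi: Theoretical Computer Science, 2006). By strengthening the search-space pruning operations of this basic algorithm and carrying out an analysis with correspondingly more detailed case distinctions, we successively relaxed the conditions under which the algorithm terminates in polynomial time and obtained the following quantitative conditions for polynomial-time solvability. First, for general graphs, we showed the following polynomial-time solvability result for the maximum clique problem, conditioned only on the maximum degree Δ of the graph: "If the maximum degree Δ of a graph G = (V, E) with n vertices satisfies Δ […] (d ≥ 0: a constant), then the maximum clique problem is solvable in polynomial time O(n^{1+d})." Furthermore, we gave the following extended result, which relaxes the above condition on all vertices: "If, for every connected induced subgraph G(C) (C ⊆ V) of size n_0 ≥ 2, the minimum-degree vertex v in C satisfies deg(v) […] (d ≥ 0: a constant), then the maximum clique problem is solvable in polynomial time O(n^{max(2, 1+d)})." This leaves everything except the minimum-degree vertex of each connected induced subgraph of size n_0 completely unconstrained, and is a substantial relaxation of the restriction. In summary, this thesis provides a new framework for the polynomial-time solvability of the maximum clique problem. The University of Electro-Communications 201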
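A basic depth-first maximum clique search with a simple size bound can be sketched as below. This is a generic illustration in Python, not the CLIQUES algorithm and not the thesis's algorithm or its stronger pruning rules; the function name `max_clique` and the degree-based vertex ordering are assumptions made for this example.

```python
from typing import Dict, List, Set


def max_clique(adj: Dict[int, Set[int]]) -> List[int]:
    """Return one maximum clique of an undirected graph given as adjacency sets."""
    best: List[int] = []

    def expand(clique: List[int], candidates: Set[int]) -> None:
        nonlocal best
        if len(clique) > len(best):
            best = clique[:]  # new incumbent
        # Visit higher-degree candidates first (a simple ordering heuristic).
        for v in sorted(candidates, key=lambda u: len(adj[u]), reverse=True):
            # Bound: even taking every remaining candidate cannot beat the incumbent.
            if len(clique) + len(candidates) <= len(best):
                return
            clique.append(v)
            expand(clique, candidates & adj[v])  # keep only neighbours of v
            clique.pop()
            candidates = candidates - {v}  # cliques containing v are now fully explored

    expand([], set(adj))
    return sorted(best)
```

The bound used here only counts remaining candidates; practical solvers replace it with tighter bounds such as greedy colouring, which is where most of the pruning power comes from.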

Creative Repository of Electro-Communications

29th International Symposium on Algorithms and Computation: ISAAC 2018, December 16-19, 2018, Jiaoxi, Yilan, Taiwan

Author: ISAAC <29. 2018, Jiaoxi, Yilan>
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/12/2018

Digitale Bibliothek Thüringen

Solving hard subgraph problems in parallel

Author: McCreesh Ciaran
Publication venue
Publication date: 01/01/2017

This thesis improves the state of the art in exact, practical algorithms for finding subgraphs. We study maximum clique, subgraph isomorphism, and maximum common subgraph problems. These are widely applicable: within computing science, subgraph problems arise in document clustering, computer vision, the design of communication protocols, model checking, compiler code generation, malware detection, cryptography, and robotics; beyond, applications occur in biochemistry, electrical engineering, mathematics, law enforcement, fraud detection, fault diagnosis, manufacturing, and sociology. We therefore consider both the "pure" forms of these problems, and variants with labels and other domain-specific constraints. Although subgraph-finding should theoretically be hard, the constraint-based search algorithms we discuss can easily solve real-world instances involving graphs with thousands of vertices, and millions of edges. We therefore ask: is it possible to generate "really hard" instances for these problems, and if so, what can we learn? By extending research into combinatorial phase transition phenomena, we develop a better understanding of branching heuristics, as well as highlighting a serious flaw in the design of graph database systems. This thesis also demonstrates how to exploit two of the kinds of parallelism offered by current computer hardware. Bit parallelism allows us to carry out operations on whole sets of vertices in a single instruction; this is largely routine. Thread parallelism, to make use of the multiple cores offered by all modern processors, is more complex. We suggest three desirable performance characteristics that we would like when introducing thread parallelism: lack of risk (parallel cannot be exponentially slower than sequential), scalability (adding more processing cores cannot make runtimes worse), and reproducibility (the same instance on the same hardware will take roughly the same time every time it is run). 
We then detail the difficulties in guaranteeing these characteristics when using modern algorithmic techniques. Besides ensuring that parallelism cannot make things worse, we also increase the likelihood of it making things better. We compare randomised work stealing to new tailored strategies, and perform experiments to identify the factors contributing to good speedups. We show that whilst load balancing is difficult, the primary factor influencing the results is the interaction between branching heuristics and parallelism. By using parallelism to explicitly offset the commitment made to weak early branching choices, we obtain parallel subgraph solvers which are substantially and consistently better than the best sequential algorithms

Glasgow Theses Service

A study on hierarchies based on the declarative semantics of tree edit distance and their computation

Author: 芳野拓也
Publication venue: 平田, 耕一
Publication date: 13/06/2018

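Comparing tree-structured data represented as rooted labeled trees (hereafter simply trees), such as HTML and XML data on the Web or RNA and glycan data in bioinformatics, is an important research topic in data mining and machine learning from structured data. One well-known distance between such trees is the tree edit distance. The tree edit distance is formulated as the minimum cost of a sequence of edit operations (node deletion, insertion, and substitution) needed to transform one rooted tree into the other. Because there are innumerable edit sequences between two trees, computing all of them to obtain the tree edit distance is impractical. Tai therefore introduced the Tai mapping (hereafter also simply a mapping), which gives the tree edit distance a declarative semantics, as a guide for computing it. A Tai mapping is a one-to-one correspondence between tree nodes that preserves the ancestor-descendant relation (and, for ordered trees, the sibling relation), and the minimum cost of a Tai mapping coincides with the tree edit distance. Computing the tree edit distance takes O(n^3) time in the number of nodes n for ordered trees, but is MAX SNP-hard for unordered trees. On the other hand, in glycan data the connections between nodes are meaningful, so constraints that do not break those connections are required; in XML data, a certain set of nodes from the root may be common to every tree, so a distance that places more weight on leaf nodes is required. Because the tree edit distance is thus overly general for some targets, and also with the aim of improving computational efficiency, various variants of the tree edit distance have been studied by restricting the mapping that forms its declarative semantics. In particular, the tree alignment distance, which is used in RNA analysis and is also the tree edit distance in which insertions are performed before deletions, can be computed in O(n^4) time in the number of nodes n for ordered trees; for unordered trees it is MAX SNP-hard in general, but it can be computed in polynomial time for trees of bounded degree. The alignment distance can be formulated as the minimum cost of an alignment tree, which is a supertree of the two trees, and it coincides with the minimum cost of a less-constrained mapping, a restriction of the Tai mapping. This thesis first treats restrictions on mappings as a hierarchy of Tai mappings and re-examines this hierarchy in terms of common subforests, in particular the connectivity of nodes and the arrangement of subtrees within a common subforest, in order to study what is essential in computing variants of the tree edit distance. It also analyzes the time complexity of the edit distance variants given by the minimum cost of the mappings newly introduced from these viewpoints. For the tree alignment distance, the anchored alignment problem has been proposed with the aim of speeding up the construction of forest alignments. This problem takes as input a mapping called an anchoring and constructs an alignment tree that preserves the correspondences in the anchoring; however, the anchoring is a Tai mapping, and if a mapping that is not less-constrained is given as the anchoring, no alignment tree can be constructed. This thesis therefore gives an alternative, constructive proof that the declarative semantics of the tree alignment distance is the less-constrained mapping and, using that construction, reformulates the anchored alignment problem so that it returns "no" when no alignment tree can be constructed. It also formulates an anchored alignment distance on this basis and compares the anchored alignment distance with the alignment distance on real data. Furthermore, it proposes cyclically ordered trees, which are more general than ordered trees and more restricted than unordered trees, and designs an algorithm that computes the alignment distance between cyclically ordered trees. Finally, as further topics on the tree edit distance, the thesis designs a dynamic A* algorithm for computing the unordered tree edit distance, extends Tai mappings to unrooted trees, and designs mapping kernels for cyclically ordered trees and degree-bounded unordered trees. An A* algorithm by Higuchi et al. that uses several lower-bound functions has already been introduced for computing the unordered tree edit distance, but it performs redundant computations and so leaves room for improvement. This thesis introduces a dynamic A* algorithm that eliminates those redundant computations by dynamic programming, and confirms the efficiency of the lower-bound functions experimentally. The Tai mapping for rooted trees is an important concept corresponding to the tree edit distance, but extending it to unrooted trees requires, in addition to injectivity, a condition that replaces the ancestor-descendant relation. Focusing on the center that Zhang et al. used when extending the LCA-preserving mapping to unrooted trees, the thesis introduces mappings for unrooted trees. In particular, it adapts the four-point and three-point conditions, which characterize evolutionary phylogenetic trees (often represented as unrooted trees), into conditions that characterize tree topology, and introduces mappings that preserve each condition. Moreover, tree kernels, one of the basic methods for classifying trees with support vector machines, have been studied extensively for ordered trees, and most of them fall within the framework of mapping kernels, which count mappings between ordered trees. Kernels for unordered trees, by contrast, have hardly been studied because they are difficult to compute. The thesis therefore designs mapping kernels for cyclically ordered trees and for unordered trees whose degree is bounded by a constant D, and discusses their computation time. Kyushu Institute of Technology doctoral dissertation. Degree number: 情工博甲第332号. Date of conferral: 23 March 2018 (Heisei 30). Chapter 1: Introduction | Chapter 2: Tree edit distance and tree alignment distance | Chapter 3: Tai mapping hierarchy based on common subforests | Chapter 4: Computing the tree alignment distance | Chapter 5: Various extensions | Chapter 6: Conclusion and future work. Kyushu Institute of Technology, Heisei 29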

Kyutacar : Kyushu Institute of Technology Academic Repository

Evolutionary genomics : statistical and computational methods

Author: Anisimova Maria
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

This open access book addresses the challenge of analyzing and understanding the evolutionary dynamics of complex biological systems at the genomic level, and elaborates on some promising strategies that would bring us closer to uncovering the vital relationships between genotype and phenotype. After a few educational primers, the book continues with sections on sequence homology and alignment, phylogenetic methods to study genome evolution, methodologies for evaluating selective pressures on genomic sequences as well as genomic evolution in light of protein domain architecture and transposable elements, population genomics and other omics, and discussions of current bottlenecks in handling and analyzing genomic data. Written for the highly successful Methods in Molecular Biology series, chapters include the kind of detail and expert implementation advice that lead to the best results. Authoritative and comprehensive, Evolutionary Genomics: Statistical and Computational Methods, Second Edition aims to serve both novices in biology with strong statistics and computational skills, and molecular biologists with a good grasp of standard mathematical concepts, in moving this important field of study forward.

ZHAW digitalcollection

Directory of Open Access Books (DOAB)