Search CORE

Data Structure Lower Bounds for Document Indexing Problems

Author: Afshani Peyman
Nielsen Jesper Sindahl
Publication date: 01/01/2016

We study data structure problems related to document indexing and pattern matching queries and our main contribution is to show that the pointer machine model of computation can be extremely useful in proving high and unconditional lower bounds that cannot be obtained in any other known model of computation with the current techniques. Often our lower bounds match the known space-query time trade-off curve and in fact for all the problems considered, there is a very good and reasonable match between our lower bounds and the known upper bounds, at least for some choice of input parameters. The problems that we consider are set intersection queries (both the reporting variant and the semi-group counting variant), indexing a set of documents for two-pattern queries, or forbidden-pattern queries, or queries with wild-cards, and indexing an input set of gapped-patterns (or two-patterns) to find those matching a document given at the query time. Comment: Full version of the conference version that appeared at ICALP 2016, 25 pages

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Makespan Scheduling of Unit Jobs with Precedence Constraints in $O(1.995^n)$ time

Author: Nederlof J.
Swennenhuis C.
Węgrzycki K.
Publication date: 01/01/2022

In a classical scheduling problem, we are given a set of $n$ jobs of unit length along with precedence constraints and the goal is to find a schedule of these jobs on $m$ identical machines that minimizes the makespan. This problem is well-known to be NP-hard for an unbounded number of machines. Using standard 3-field notation, it is known as $P|\text{prec}, p_j=1|C_{\max}$. We present an algorithm for this problem that runs in $O(1.995^n)$ time. Before our work, even for $m=3$ machines the best known algorithms ran in $O^\ast(2^n)$ time. In contrast, our algorithm works when the number of machines $m$ is unbounded. A crucial ingredient of our approach is an algorithm with a runtime that is only single-exponential in the vertex cover of the comparability graph of the precedence constraint graph. This heavily relies on insights from a classical result by Dolev and Warmuth (Journal of Algorithms 1984) for precedence graphs without long chains.

MPG.PuRe

Makespan Scheduling of Unit Jobs with Precedence Constraints in $O(1.995^n)$ time

Author: Nederlof Jesper
Swennenhuis Céline M. F.
Węgrzycki Karol
Publication date: 04/08/2022

In a classical scheduling problem, we are given a set of $n$ jobs of unit length along with precedence constraints and the goal is to find a schedule of these jobs on $m$ identical machines that minimizes the makespan. This problem is well-known to be NP-hard for an unbounded number of machines. Using standard 3-field notation, it is known as $P|\text{prec}, p_j=1|C_{\max}$. We present an algorithm for this problem that runs in $O(1.995^n)$ time. Before our work, even for $m=3$ machines the best known algorithms ran in $O^\ast(2^n)$ time. In contrast, our algorithm works when the number of machines $m$ is unbounded. A crucial ingredient of our approach is an algorithm with a runtime that is only single-exponential in the vertex cover of the comparability graph of the precedence constraint graph. This heavily relies on insights from a classical result by Dolev and Warmuth (Journal of Algorithms 1984) for precedence graphs without long chains. Comment: 26 pages, 7 figures

arXiv.org e-Print Archive

Formalization of block pruning: reducing the number of cells computed in exact biological sequence comparison algorithms

Author: Ayguadé Parra Eduard
De Sandes Edans
Martorell Bofill Xavier
Melo Alba
Teodoro George
Walter Maria Emilia
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2018

This is a pre-copyedited, author-produced version of an article accepted for publication in The Computer Journal following peer review. The version of record Edans F O Sandes, George L M Teodoro, Maria Emilia M T Walter, Xavier Martorell, Eduard Ayguade, Alba C M A Melo; Formalization of Block Pruning: Reducing the Number of Cells Computed in Exact Biological Sequence Comparison Algorithms, The Computer Journal, Volume 61, Issue 5, 1 May 2018, Pages 687–713 is available online at: The Computer Journal https://academic.oup.com/comjnl/article-abstract/61/5/687/4539903 and https://doi.org/10.1093/comjnl/bxx090. Biological sequence comparison algorithms that compute the optimal local and global alignments calculate a dynamic programming (DP) matrix with quadratic time complexity. The DP matrix $H$ is calculated with a recurrence relation in which the value of each cell $H_{i,j}$ is the result of a maximum operation on the values of the cells $H_{i-1,j-1}$, $H_{i-1,j}$ and $H_{i,j-1}$, each increased or decreased by a constant value. Therefore, the difference between the value of the cell $H_{i,j}$ being calculated and the values of its previously computed direct neighbor cells respects well-defined upper and lower bounds. Using these bounds, we can show that it is possible to determine the maximum and the minimum value of every cell in $H$, for a given reference cell. We use this result to define a generic pruning method which determines the cells that can be pruned (i.e. that need not be computed since they will not contribute to the final solution), accelerating the computation but keeping the guarantee that the optimal result will be produced. The goal of this paper is thus to investigate and formalize properties of the DP matrix in order to estimate and increase the pruning method efficiency.
We also show that the pruning efficiency depends mainly on three characteristics: (a) the order in which the cells of $H$ are calculated, (b) the values of the parameters used in the recurrence relation and (c) the contents of the sequences compared. Peer Reviewed. Postprint (author's final draft)
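The recurrence and the bounding idea described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a local (Smith-Waterman style) alignment with a linear gap penalty, and the function name, the scoring parameters `match`, `mismatch` and `gap`, and the cell-level test are all hypothetical choices made for illustration. The sketch still computes every cell and only counts the cells that a pruning rule could have skipped, using the fact that each remaining diagonal step can raise a score by at most `match`.

```python
def smith_waterman_with_pruning_count(a, b, match=1, mismatch=-1, gap=-2):
    """Return (best local alignment score, number of cells meeting the pruning test).

    Illustrative only: every cell is still computed; we merely count the cells
    whose optimistic upper bound cannot exceed the best score found so far.
    """
    m, n = len(a), len(b)
    prev = [0] * (n + 1)  # row i-1 of the DP matrix H
    best = 0
    prunable = 0
    for i in range(1, m + 1):
        curr = [0] * (n + 1)  # row i of H
        for j in range(1, n + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            # H[i][j] is a maximum over the three neighbours, each shifted by a constant.
            curr[j] = max(0, prev[j - 1] + step, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
            # Optimistic bound: at most min(m - i, n - j) further matches are possible.
            if curr[j] + match * min(m - i, n - j) <= best:
                prunable += 1
        prev = curr
    return best, prunable


score, skipped = smith_waterman_with_pruning_count("ACGT", "ACGT")
print(score, skipped)
```

How many cells pass the test depends on the traversal order, the scoring parameters, and the sequences themselves, which is consistent with the three characteristics the abstract lists.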

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Almost Every Simply Typed Lambda-Term Has a Long Beta-Reduction Sequence

Author: Asada Kazuyuki
Kobayashi Naoki
Sin'ya Ryoma
Tsukada Takeshi
Publication date: 01/02/2019

It is well known that the length of a beta-reduction sequence of a simply typed lambda-term of order k can be huge; it is as large as k-fold exponential in the size of the lambda-term in the worst case. We consider the following relevant question about quantitative properties, instead of the worst case: how many simply typed lambda-terms have very long reduction sequences? We provide a partial answer to this question, by showing that asymptotically almost every simply typed lambda-term of order k has a reduction sequence as long as (k-1)-fold exponential in the term size, under the assumption that the arity of functions and the number of variables that may occur in every subterm are bounded above by a constant. To prove it, we have extended the infinite monkey theorem for strings to a parametrized one for regular tree languages, which may be of independent interest. The work has been motivated by quantitative analysis of the complexity of higher-order model checking.

arXiv.org e-Print Archive

Episciences.org

Directory of Open Access Journals

Statistical properties of lambda terms

Author: Bendkowski Maciej
Bodini Olivier
Dovgal Sergey
Publication date: 14/08/2018

We present a quantitative, statistical analysis of random lambda terms in the de Bruijn notation. Following an analytic approach using multivariate generating functions, we investigate the distribution of various combinatorial parameters of random open and closed lambda terms, including the number of redexes, head abstractions, free variables or the de Bruijn index value profile. Moreover, we conduct an average-case complexity analysis of finding the leftmost-outermost redex in random lambda terms showing that it is on average constant. The main technical ingredient of our analysis is a novel method of dealing with combinatorial parameters inside certain infinite, algebraic systems of multivariate generating functions. Finally, we briefly discuss the random generation of lambda terms following a given skewed parameter distribution and provide empirical results regarding a series of more involved combinatorial parameters such as the number of open subterms and binding abstractions in closed lambda terms. Comment: Major revision of section 5. In particular, proofs of Lemma 5.7 and Theorem 5.
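To make the notion of a leftmost-outermost redex in de Bruijn notation concrete, here is a small Python sketch. The term encoding (nested tuples tagged `"var"`, `"lam"` and `"app"`), the function name, and the path labels are assumptions chosen for illustration and do not come from the paper. A redex is an application whose function part is an abstraction, and a preorder walk that checks each application before descending finds the leftmost-outermost one first.

```python
def leftmost_outermost_redex(term, path=()):
    """Return the path to the leftmost-outermost redex, or None if the term is in normal form.

    Terms are tuples: ("var", index), ("lam", body), ("app", function, argument).
    """
    kind = term[0]
    if kind == "var":
        return None
    if kind == "lam":
        return leftmost_outermost_redex(term[1], path + ("body",))
    # Application: it is itself a redex when its function part is an abstraction.
    _, fun, arg = term
    if fun[0] == "lam":
        return path
    found = leftmost_outermost_redex(fun, path + ("fun",))
    if found is not None:
        return found
    return leftmost_outermost_redex(arg, path + ("arg",))


# \. 0 ((\. 0) 1): the only redex sits in the argument of the application under the binder.
example = ("lam", ("app", ("var", 0), ("app", ("lam", ("var", 0)), ("var", 1))))
print(leftmost_outermost_redex(example))
```

The empty path `()` means the whole term is the redex, which is why the code compares against `None` rather than relying on truthiness.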

arXiv.org e-Print Archive

Jagiellonian University Repository

Statistical properties of lambda terms

Author: Bendkowski Maciej
Bodini Olivier
Dovgal Sergey
Publication date: 01/01/2019

We present a quantitative, statistical analysis of random lambda terms in the De Bruijn notation. Following an analytic approach using multivariate generating functions, we investigate the distribution of various combinatorial parameters of random open and closed lambda terms, including the number of redexes, head abstractions, free variables or the De Bruijn index value profile. Moreover, we conduct an average-case complexity analysis of finding the leftmost-outermost redex in random lambda terms showing that it is on average constant. The main technical ingredient of our analysis is a novel method of dealing with combinatorial parameters inside certain infinite, algebraic systems of multivariate generating functions. Finally, we briefly discuss the random generation of lambda terms following a given skewed parameter distribution and provide empirical results regarding a series of more involved combinatorial parameters such as the number of open subterms and binding abstractions in closed lambda terms.

Jagiellonian University Repository

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment

Author: Cheng Hao
Guo Ruocheng
Klochkov Yegor
Li Hang
Liu Yang
Taufiq Muhammad Faaiz
Ton Jean-Francois
Yao Yuanshun
Zhang Xiaoying
Publication date: 10/08/2023

Ensuring alignment, which refers to making models behave in accordance with human intentions [1,2], has become a critical task before deploying large language models (LLMs) in real-world applications. For instance, OpenAI devoted six months to iteratively aligning GPT-4 before its release [3]. However, a major challenge faced by practitioners is the lack of clear guidance on evaluating whether LLM outputs align with social norms, values, and regulations. This obstacle hinders systematic iteration and deployment of LLMs. To address this issue, this paper presents a comprehensive survey of key dimensions that are crucial to consider when assessing LLM trustworthiness. The survey covers seven major categories of LLM trustworthiness: reliability, safety, fairness, resistance to misuse, explainability and reasoning, adherence to social norms, and robustness. Each major category is further divided into several sub-categories, resulting in a total of 29 sub-categories. Additionally, a subset of 8 sub-categories is selected for further investigation, where corresponding measurement studies are designed and conducted on several widely-used LLMs. The measurement results indicate that, in general, more aligned models tend to perform better in terms of overall trustworthiness. However, the effectiveness of alignment varies across the different trustworthiness categories considered. This highlights the importance of conducting more fine-grained analyses, testing, and making continuous improvements on LLM alignment. By shedding light on these key dimensions of LLM trustworthiness, this paper aims to provide valuable insights and guidance to practitioners in the field. Understanding and addressing these concerns will be crucial in achieving reliable and ethically sound deployment of LLMs in various applications

arXiv.org e-Print Archive

Distributed Systems and Mobile Computing

Publication venue: 'MDPI AG'
Publication date: 24/02/2022

The book is about Distributed Systems and Mobile Computing. This is a branch of Computer Science devoted to the study of systems whose components are in different physical locations and have limited communication capabilities. Such components may be static, often organized in a network, or may be able to move in a discrete or continuous environment. The theoretical study of such systems has applications ranging from swarms of mobile robots (e.g., drones) to sensor networks, autonomous intelligent vehicles, the Internet of Things, and crawlers on the Web. The book includes five articles. Two of them are about networks: the first one studies the formation of networks by agents that interact randomly and have the ability to form connections; the second one is a study of clustering models and algorithms. The three remaining articles are concerned with autonomous mobile robots operating in continuous space. One article studies the classical gathering problem, where all robots have to reach a common location, and proposes a fast algorithm for robots that are endowed with a compass but have limited visibility. The last two articles deal with the evacuation problem, where two robots have to locate an exit point and evacuate a region in the shortest possible time.

Directory of Open Access Books (DOAB)