296 research outputs found

Direct laser printing of thin-film polyaniline devices

Author: Chatzandroulis S.
Kandyla M.
Pandis C.
Pissis P.
Zergioti I.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012
Field of study

We report the fabrication of electrically functional polyaniline thin-film microdevices. Polyaniline films were printed in the solid phase by Laser Induced Forward Transfer directly between Au electrodes on a Si/SiO2 substrate. To apply solid-phase deposition, aniline was in situ polymerized on quartz substrates. Laser deposition preserves the morphology of the films and delivers sharp features with controllable dimensions. The electrical characteristics of printed polyaniline present ohmic behavior, allowing for electroactive applications. Results on gas sensing of ammonia are presented. Comment: In Pres

arXiv.org e-Print Archive

UCL Discovery

DSpace at NTUA

Text Indexing for Long Patterns: Anchors are All you Need

Author: Ayad L
Loukides G
Pissis S
Publication venue: Association for Computing Machinery
Publication date: 10/07/2023

PVLDB Artifact Availability: The source code, data, and/or other artifacts have been made available at https://github.com/lorrainea/BDA-index. Copyright © 2023 the owner/author(s).

In many real-world database systems, a large fraction of the data is represented by strings: sequences of letters over some alphabet. This is because strings can easily encode data arising from different sources. It is often crucial to represent such string datasets in a compact form but also to simultaneously enable fast pattern matching queries. This is the classic text indexing problem. The four absolute measures anyone should pay attention to when designing or implementing a text index are: (i) index space; (ii) query time; (iii) construction space; and (iv) construction time. Unfortunately, however, most (if not all) widely-used indexes (e.g., suffix tree, suffix array, or their compressed counterparts) are not optimized for all four measures simultaneously, as it is difficult to have the best of all four worlds. Here, we take an important step in this direction by showing that text indexing with locally consistent anchors (lc-anchors) offers remarkably good performance in all four measures, when we have at hand a lower bound l on the length of the queried patterns, which is arguably a quite reasonable assumption in practical applications. Specifically, we improve on the construction of the index proposed by Loukides and Pissis, which is based on bidirectional string anchors (bd-anchors), a new type of lc-anchors, by: (i) designing an average-case linear-time algorithm to compute bd-anchors; and (ii) developing a semi-external-memory implementation to construct the index in small space using near-optimal work. We then present an extensive experimental evaluation, based on the four measures, using real benchmark datasets. The results show that, for long patterns, the index constructed using our improved algorithms compares favorably to all classic indexes: (compressed) suffix tree; (compressed) suffix array; and the FM-index.

Funding: European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreements No 872539 and 956229, respectively; and by UKRI through REPHRAIN (EP/V011189/1)

Brunel University Research Archive

Linear-Time Superbubble Identification Algorithm for Genome Assembly

Author: Brankovic Ljiljana
Iliopoulos Costas S.
Kundu Ritu
Mohamed Manal
Pissis Solon P.
Vayani Fatima
Publication date: 17/09/2015

DNA sequencing is the process of determining the exact order of the nucleotide bases of an individual's genome in order to catalogue sequence variation and understand its biological implications. Whole-genome sequencing techniques produce masses of data in the form of short sequences known as reads. Assembling these reads into a whole genome constitutes a major algorithmic challenge. Most assembly algorithms utilize de Bruijn graphs constructed from reads for this purpose. A critical step of these algorithms is to detect typical motif structures in the graph caused by sequencing errors and genome repeats, and filter them out; one such complex subgraph class is a so-called superbubble. In this paper, we propose an O(n+m)-time algorithm to detect all superbubbles in a directed acyclic graph with n nodes and m (directed) edges, improving the best-known O(m log m)-time algorithm by Sung et al

arXiv.org e-Print Archive

University of Newcastle's Digital Repository

Elsevier - Publisher Connector

Crossref

King's Research Portal

Efficient Computation of Sequence Mappability

Author: Alzamel Mai
Charalampopoulos Panagiotis
Iliopoulos Costas S.
Kociumaka Tomasz
Pissis Solon P.
Radoszewski Jakub
Straszyński Juliusz
Publication date: 31/07/2018

Sequence mappability is an important task in genome re-sequencing. In the $(k,m)$-mappability problem, for a given sequence $T$ of length $n$, our goal is to compute a table whose $i$th entry is the number of indices $j \ne i$ such that length-$m$ substrings of $T$ starting at positions $i$ and $j$ have at most $k$ mismatches. Previous works on this problem focused on heuristic approaches to compute a rough approximation of the result or on the case of $k=1$. We present several efficient algorithms for the general case of the problem. Our main result is an algorithm that works in $\mathcal{O}(n \min\{m^k,\log^{k+1} n\})$ time and $\mathcal{O}(n)$ space for $k=\mathcal{O}(1)$. It requires a careful adaptation of the technique of Cole et al. [STOC 2004] to avoid multiple counting of pairs of substrings. We also show $\mathcal{O}(n^2)$-time algorithms to compute all results for a fixed $m$ and all $k=0,\ldots,m$ or a fixed $k$ and all $m=k,\ldots,n-1$. Finally we show that the $(k,m)$-mappability problem cannot be solved in strongly subquadratic time for $k,m = \Theta(\log n)$ unless the Strong Exponential Time Hypothesis fails. Comment: Accepted to SPIRE 201
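To make the problem statement concrete, the following is a minimal brute-force sketch of the (k,m)-mappability table as defined in the abstract. It is not one of the paper's algorithms: it compares every pair of length-m substrings directly and takes roughly O(n^2 m) time. The function name `mappability_table` is hypothetical, and positions are 0-based here.

```python
from typing import List


def mappability_table(text: str, m: int, k: int) -> List[int]:
    """Brute-force (k, m)-mappability (illustrative only, not the paper's method).

    Entry i counts the start positions j != i such that the length-m
    substrings of `text` starting at i and j differ in at most k positions.
    """
    n = len(text)
    starts = range(n - m + 1)  # every position where a length-m substring fits
    table = []
    for i in starts:
        count = 0
        for j in starts:
            if j == i:
                continue
            # Hamming distance between the two length-m substrings.
            mismatches = sum(a != b for a, b in zip(text[i:i + m], text[j:j + m]))
            if mismatches <= k:
                count += 1
        table.append(count)
    return table


if __name__ == "__main__":
    print(mappability_table("abcab", m=2, k=1))
```

For "abcab" with m=2 and k=1, only the two occurrences of "ab" are within one mismatch of each other, so the first and last entries are 1 and the others are 0.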

arXiv.org e-Print Archive

VU Research Portal

CWI's Institutional Repository

INRIA a CCSD electronic archive server

Range Shortest Unique Substring queries

Author: Abedin P. (Paniz)
Ganguly A. (Arnab)
Pissis S. (Solon)
Thankachan S.V. (Sharma)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/10/2019

Let T be a string of length n and T[i..j] be the substring of T starting at position i and ending at position j. A substring of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and in information retrieval. Given string T as input, the Shortest Unique Substring problem is to find a shortest substring of T that does not occur elsewhere in T. In this paper, we introduce the range variant of this problem, which we call the Range Shortest Unique Substring problem. The task is to construct a data structure over T answering the following type of online queries efficiently. Given a range, return a shortest substring of T with exactly one occurrence in that range. We present a data structure whose size is measured in machine words and whose query time depends on the word size. Our construction is based on a non-trivial reduction allowing us to apply a recently introduced optimal geometric data structure [Chan et al. ICALP 2018]
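As a reference point for the problem definition, here is a minimal brute-force sketch of the Shortest Unique Substring problem. It is not the data structure from the paper. The names `shortest_unique_substring` and `range_sus` are hypothetical, and the range variant below assumes that occurrences must lie entirely inside the queried range (0-based, inclusive), which is an interpretation rather than something stated in the abstract.

```python
from collections import Counter
from typing import Optional


def shortest_unique_substring(text: str) -> Optional[str]:
    """Return a shortest substring occurring exactly once in `text`.

    Brute force: tries lengths 1, 2, ... and counts all (possibly
    overlapping) occurrences of every substring of that length.
    Returns None for the empty string.
    """
    n = len(text)
    for length in range(1, n + 1):
        counts = Counter(text[i:i + length] for i in range(n - length + 1))
        for i in range(n - length + 1):
            if counts[text[i:i + length]] == 1:
                return text[i:i + length]  # leftmost among the shortest
    return None


def range_sus(text: str, lo: int, hi: int) -> Optional[str]:
    """Naive range query: solve the problem on text[lo..hi] from scratch.

    Assumption: an occurrence counts only if it lies fully inside the range.
    """
    return shortest_unique_substring(text[lo:hi + 1])


if __name__ == "__main__":
    print(shortest_unique_substring("abab"))
    print(range_sus("abab", 0, 2))
```

Each naive query rescans the whole range, which is the cost a preprocessed data structure is meant to avoid.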

CWI's Institutional Repository

Longest common substring made fully dynamic

Author: Amir A. (Amihood)
Charalampopoulos P. (Panagiotis)
Pissis S. (Solon)
Radoszewski J. (Jakub)
Publication date: 16/07/2018

Given two strings S and T, each of length at most n, the longest common substring (LCS) problem is to find a longest substring common to S and T. This is a classical problem in computer science with an O(n)-time solution. In the fully dynamic setting, edit operations are allowed in either of the two strings, and the problem is to find an LCS after each edit. We present the first solution to this problem requiring sublinear time in n per edit operation. In particular, we show how to find an LCS after each edit operation in Õ(n^{2/3}) time, after Õ(n)-time and space preprocessing. This line of research has been recently initiated in a somewhat restricted dynamic variant by Amir et al. [SPIRE 2017]. More specifically, they presented an Õ(n)-sized data structure that returns an LCS of the two strings after a single edit operation (that is reverted afterwards) in Õ(1) time. At CPM 2018, three papers (Abedin et al., Funakoshi et al., and Urabe et al.) studied analogously restricted dynamic variants of problems on strings. We show that the techniques we develop can be applied to obtain fully dynamic algorithms for all of these variants. The only previously known sublinear-time dynamic algorithms for problems on strings were for maintaining a dynamic collection of strings for comparison queries and for pattern matching, with the most recent advances made by Gawrychowski et al. [SODA 2018] and by Clifford et al. [STACS 2018]. As an intermediate problem we consider computing the solution for a string with a given set of k edits, which leads us, in particular, to answering internal queries on a string. The input to such a query is specified by a substring (or substrings) of a given string. Data structures for answering internal string queries that were proposed by Kociumaka et al. [SODA 2015] and by Gagie et al. [CCCG 2013] are used, along with new ones, based on ingredients such as the suffix tree, heavy-path decomposition, orthogonal range queries, difference covers, and string periodicity.
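For orientation, the static LCS problem can be sketched with the textbook dynamic program below. This is neither the O(n)-time suffix-tree solution mentioned in the abstract nor the paper's dynamic algorithm: it runs in O(|S|·|T|) time, and recomputing it from scratch after every edit is the naive baseline that a sublinear-time update improves on. The function name is hypothetical.

```python
def longest_common_substring(s: str, t: str) -> str:
    """Return a longest substring common to s and t (quadratic-time DP).

    curr[j] holds the length of the longest common suffix of s[:i] and t[:j].
    Only two rows of the table are kept in memory.
    """
    best_len = 0
    best_end = 0  # exclusive end index in s of the best match found so far
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return s[best_end - best_len:best_end]


if __name__ == "__main__":
    s, t = "banana", "ananas"
    print(longest_common_substring(s, t))
    # Naive handling of an edit: substitute one letter, then recompute everything.
    s = s[:2] + "x" + s[3:]
    print(longest_common_substring(s, t))
```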

arXiv.org e-Print Archive

CWI's Institutional Repository

Dagstuhl Research Online Publication Server

Bidirectional string anchors: A new string sampling mechanism

Author: Loukides G. (Grigorios)
Pissis S. (Solon)
Publication date: 01/01/2021

The minimizers sampling mechanism is a popular mechanism for string sampling introduced independently by Schleimer et al. [SIGMOD 2003] and by Roberts et al. [Bioinf. 2004]. Given two positive integers w and k, it selects the lexicographically smallest length-k substring in every fragment of w consecutive length-k substrings (in every sliding window of length w+k-1). Minimizers samples are approximately uniform, locally consistent, and computable in linear time. Although they do not have good worst-case guarantees on their size, they are often small in practice. They thus have been successfully employed in several string processing applications. Two main disadvantages of minimizers sampling mechanisms are: first, they also do not have good guarantees on the expected size of their samples for every combination of w and k; and, second, indexes that are constructed over their samples do not have good worst-case guarantees for on-line pattern searches. To alleviate these disadvantages, we introduce bidirectional string anchors (bd-anchors), a new string sampling mechanism. Given a positive integer ℓ, our mechanism selects the lexicographically smallest rotation in every length-ℓ fragment (in every sliding window of length ℓ). We show that bd-anchors samples are also approximately uniform, locally consistent, and computable in linear time. In addition, our experimen
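The minimizers definition given in the abstract can be illustrated with a short sketch. It is a direct, quadratic-style implementation rather than the linear-time computation the abstract refers to. The function name `minimizer_positions` is hypothetical, positions are 0-based, and breaking ties by taking the leftmost smallest k-mer is an assumption that the abstract does not specify.

```python
from typing import List


def minimizer_positions(text: str, w: int, k: int) -> List[int]:
    """Return the sorted start positions selected by (w, k)-minimizers.

    For every window of length w + k - 1, i.e. every run of w consecutive
    length-k substrings, the lexicographically smallest one is selected.
    Assumption: ties are broken by choosing the leftmost position.
    """
    n = len(text)
    window_len = w + k - 1
    selected = set()
    for start in range(n - window_len + 1):
        # Start positions of the w consecutive k-mers inside this window.
        candidates = range(start, start + w)
        best = min(candidates, key=lambda p: (text[p:p + k], p))
        selected.add(best)
    return sorted(selected)


if __name__ == "__main__":
    print(minimizer_positions("abracadabra", w=3, k=2))
    # A highly repetitive string shows the weak worst-case sample size:
    print(minimizer_positions("aaaa", w=2, k=1))
```

On "aaaa" every window selects a new position under the leftmost tie-break, which illustrates why the sample size has no good worst-case guarantee.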

VU Research Portal

CWI's Institutional Repository

INRIA a CCSD electronic archive server

Dagstuhl Research Online Publication Server

Symbolic regression is NP-hard

Author: Pissis S. (Solon)
Virgolin M. (Marco)
Publication date: 25/10/2022

Symbolic regression (SR) is the task of learning a model of data in the form of a mathematical expression. By their nature, SR models have the potential to be accurate and human-interpretable at the same time. Unfortunately, finding such models, i.e., performing SR, appears to be a computationally intensive task. Historically, SR has been tackled with heuristics such as greedy or genetic algorithms and, while some works have hinted at the possible hardness of SR, no proof has yet been given that SR is, in fact, NP-hard. This begs the question: Is there an exact polynomial-time algorithm to compute SR models? We provide evidence suggesting that the answer is probably negative by showing that SR is NP-hard

CWI's Institutional Repository

All-pairs suffix/prefix in optimal time using Aho-Corasick space

Author: Loukides G. (Grigorios)
Pissis S. (Solon)
Publication venue: 'Elsevier BV'
Publication date: 28/04/2022

The all-pairs suffix/prefix (APSP) problem is a classic problem in computer science with many applications in bioinformatics. Given a set {S_1,…,S_k} of k strings of total length n, we are asked to find, for each string S_i, i∈[1,k], its longest suffix that is a prefix of string S_j, for all j≠i, j∈[1,k]. Several algorithms running in the optimal O(n+k^2) time for solving APSP are known. All of these algorithms are based on suffix sorting and thus require space Ω(n) in any case. We consider the parameterized version of the APSP problem, denoted by ℓ-APSP, in which we are asked to output only the pairs whose suffix/prefix overlap is of length at least ℓ. We give an algorithm for solving ℓ-APSP that runs in the optimal O(n+|OUTPUT_ℓ|) time using O(n) space, where OUTPUT_ℓ is the set of output pairs. Our algorithm is thus optimal for the APSP problem as well by setting ℓ=0. Notably, our algorithm is fundamentally different from all optimal algorithms solving the APSP problem: it does not rely on sorting the suffixes of all input strings but on a novel traversal of the Aho-Corasick machine, and it thus requires space linear in the size of the machine.
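The input and output of ℓ-APSP can be illustrated with a brute-force sketch. This is not the Aho-Corasick traversal described in the abstract and it is far from the optimal time bound: it simply tries every overlap length for every ordered pair. The function name and the `min_overlap` parameter (playing the role of ℓ) are hypothetical, indices are 0-based, and allowing an overlap to span an entire string is an assumption.

```python
from typing import Dict, List, Tuple


def all_pairs_suffix_prefix(strings: List[str], min_overlap: int = 0) -> Dict[Tuple[int, int], int]:
    """Brute-force l-APSP (illustrative only).

    For every ordered pair (i, j) with i != j, compute the length of the
    longest suffix of strings[i] that is a prefix of strings[j], and keep
    the pair only if that length is at least `min_overlap`.
    """
    result = {}
    for i, s in enumerate(strings):
        for j, t in enumerate(strings):
            if i == j:
                continue
            # Try the longest possible overlap first; length 0 always matches.
            for length in range(min(len(s), len(t)), -1, -1):
                if s.endswith(t[:length]):
                    break
            if length >= min_overlap:
                result[(i, j)] = length
    return result


if __name__ == "__main__":
    reads = ["abcd", "cdef", "efab"]
    print(all_pairs_suffix_prefix(reads))                  # all ordered pairs
    print(all_pairs_suffix_prefix(reads, min_overlap=1))   # only real overlaps
```

With `min_overlap=0` every ordered pair is reported, matching the remark that ℓ=0 recovers the plain APSP problem.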

VU Research Portal

CWI's Institutional Repository

INRIA a CCSD electronic archive server

King's Research Portal