Designing a parallel suffix sort
Suffix sorting plays a critical role in various computational applications,
including genomics, as well as in frequently used day-to-day software.
Sorting becomes tricky when the string contains many repeated characters
for a given radix. Various innovative implementations are available in this
area, e.g., the Manber-Myers algorithm. We present here an analysis that uses
a concept based on generalized polynomial factorization to sort these
suffixes. The initial generation of these substring-specific polynomials can be
done efficiently using parallel threads and shared memory. The set of distinct
factors and their order are known beforehand, and this helps us sort the
polynomials (the equivalent of strings) accordingly.
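For orientation, the suffix-sorting task itself can be sketched with prefix doubling, the idea behind the Manber-Myers approach mentioned in the abstract. This is not the polynomial-factorization method the abstract proposes, and it is sequential rather than parallel; the function name `suffix_array` is a hypothetical choice for this illustration.

```python
def suffix_array(text: str) -> list[int]:
    """Return the start indices of the suffixes of `text` in sorted order.

    Prefix-doubling sketch: suffixes are ranked by their first k characters,
    and k doubles each round until all ranks are distinct.
    """
    n = len(text)
    if n == 0:
        return []
    # Round 0: rank each suffix by its first character.
    rank = [ord(c) for c in text]
    order = list(range(n))
    k = 1
    while True:
        # Key of suffix i: rank of its first k chars, then rank of the next k.
        # A suffix that ends early gets -1 so that it sorts first.
        def key(i: int) -> tuple[int, int]:
            return (rank[i], rank[i + k] if i + k < n else -1)

        order.sort(key=key)
        # Re-rank: suffixes with equal keys share a rank.
        new_rank = [0] * n
        for prev, cur in zip(order, order[1:]):
            new_rank[cur] = new_rank[prev] + (key(prev) != key(cur))
        rank = new_rank
        if rank[order[-1]] == n - 1:  # all ranks distinct: fully sorted
            return order
        k *= 2
```

Strings with long runs of repeated characters, the hard case the abstract points to, need the most doubling rounds here, since many suffixes keep sharing a rank.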
On the readability of machine checkable formal proofs
It is possible to implement mathematical proofs in a machine-readable language. Indeed, certain proofs, especially those deriving properties of safety-critical systems, are often required to be checked by machine in order to avoid human errors. However, machine checkable proofs are very hard to follow by a human reader. Because of their unreadability, such proofs are hard to implement, and more difficult still to maintain and modify. In this thesis we study the possibility of implementing machine checkable proofs in a more readable format. We design a declarative proof language, SPL, which is based on the Mizar language. We also implement a proof checker for SPL which derives theorems in the HOL system from SPL proof scripts. The language and its proof checker are extensible, in the sense that the user can modify and extend the syntax of the language and the deductive power of the proof checker during the mechanisation of a theory. A deductive database of trivial knowledge is used by the proof checker to derive facts which are considered trivial by the developer of mechanised theories so that the proofs of such facts can be omitted. We also introduce the notion of structured straightforward justifications, in which simple facts, or conclusions, are justified by a number of premises together with a number of inferences which are used in deriving the conclusion from the given premises. A tableau prover for first-order logic with equality is implemented as a HOL derived rule and used during the proof checking of SPL scripts. The work presented in this thesis also includes a case study involving the mechanisation of a number of results in group theory in SPL, in which the deductive power of the SPL proof checker is extended throughout the development of the theory
On injectivity of quantum finite automata
We consider notions of freeness and ambiguity for the acceptance probability of Moore-Crutchfield Measure Once Quantum Finite Automata (MO-QFA). We study the injectivity problem of determining if the acceptance probability function of a MO-QFA is injective over all input words, i.e., giving a distinct probability for each input word. We show that the injectivity problem is undecidable for 8 state MO-QFA, even when all unitary matrices and the projection matrix are rational and the initial state vector is real algebraic. We also show undecidability of this problem when the initial vector is rational, although with a huge increase in the number of states. We utilize properties of quaternions, free rotation groups, representations of tuples of rationals as linear sums of radicals and a reduction of the mixed modification of Post's correspondence problem, as well as a new result on rational polynomial packing functions which may be of independent interest.
In Memoriam, Solomon Marcus
This book commemorates Solomon Marcus’s fifth death anniversary with a selection of articles in mathematics, theoretical computer science, and physics written by authors who work in Marcus’s research fields, some of whom have been influenced by his results and/or have collaborated with him
NNMap: A method to construct a good embedding for nearest neighbor classification
This paper aims to deal with the practical shortcomings of the nearest neighbor classifier. We define a quantitative criterion of embedding quality assessment for nearest neighbor classification, and present a method called NNMap to construct a good embedding. Furthermore, an efficient distance is obtained in the embedded vector space, which could speed up nearest neighbor classification. The quantitative quality criterion is proposed as a local structure descriptor of sample data distribution. Embedding quality corresponds to the quality of the local structure. In the framework of NNMap, one-dimensional embeddings act as weak classifiers with pseudo-losses defined on the amount of the local structure preserved by the embedding. Based on this property, the NNMap method reduces the problem of embedding construction to the classical boosting problem. An important property of NNMap is that the embedding optimization criterion is appropriate for both vector and non-vector data, and equally valid in both metric and non-metric spaces. The effectiveness of the new method is demonstrated by experiments conducted on the MNIST handwritten digit dataset, the CMU PIE face images dataset and datasets from the UCI Machine Learning Repository
On Compensation Loops in Genomic Duplications
Electronic version of an article published as International Journal of Foundations of Computer Science 2020 31:01, 133-142, DOI: 10.1142/S0129054120400092 © World Scientific Publishing Company https://www.worldscientific.com/worldscinet/ijfcs
In this paper, we investigate compensation loops, a DNA rearrangement in chromosomes due to unequal crossing over. We study the effect of compensation loops on gene duplication, and we formalize it as a restricted case of gene duplication in general. We study this biological process from the point of view of formal languages, and we provide some results about the languages defined in this way.
Two-Way Parikh Automata
Parikh automata extend automata with counters whose values can only be tested at the end of the computation, with respect to membership in a semi-linear set. Parikh automata have found several applications, for instance in transducer theory, as they enjoy a decidable emptiness problem.
In this paper, we study two-way Parikh automata. We show that emptiness becomes undecidable in the non-deterministic case. However, it is PSPACE-complete when the number of visits to any input position is bounded and the semi-linear set is given as an existential Presburger formula. We also give tight complexity bounds for the inclusion, equivalence and universality problems. Finally, we characterise precisely the complexity of those problems when the semi-linear constraint is given by an arbitrary Presburger formula
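The basic model described at the start of this abstract, counters that are only incremented during the run and tested once at the end against a semi-linear set, can be sketched as follows. This is a minimal illustration of an ordinary one-way, deterministic Parikh automaton, not the two-way model the paper studies; all names (`LinearSet`, `in_linear_set`, `accepts`) and the example automaton are assumptions made for this sketch, and period vectors are assumed to have non-negative entries.

```python
from typing import NamedTuple

Vec = tuple[int, ...]


class LinearSet(NamedTuple):
    """All vectors base + k1*p1 + ... + km*pm with natural numbers ki."""
    base: Vec
    periods: tuple[Vec, ...]


def in_linear_set(v: Vec, ls: LinearSet) -> bool:
    """Brute-force membership test; assumes non-negative period vectors."""
    target = tuple(x - b for x, b in zip(v, ls.base))
    periods = [p for p in ls.periods if any(p)]  # skip all-zero periods
    seen: set[Vec] = set()
    stack = [target]
    while stack:
        t = stack.pop()
        if any(x < 0 for x in t) or t in seen:
            continue
        if not any(t):  # reached the zero vector: v is in the set
            return True
        seen.add(t)
        stack.extend(tuple(x - y for x, y in zip(t, p)) for p in periods)
    return False


def accepts(word, transitions, start, accepting, dim, semilinear) -> bool:
    """Run the automaton, then test the counters once, at the end."""
    state, counters = start, (0,) * dim
    for symbol in word:
        step = transitions.get((state, symbol))
        if step is None:  # no transition defined: reject
            return False
        state, inc = step
        counters = tuple(c + d for c, d in zip(counters, inc))
    return state in accepting and any(
        in_linear_set(counters, ls) for ls in semilinear
    )


# Example: a^n b^n. Counter 0 counts a's, counter 1 counts b's, and the
# semi-linear constraint requires the two counts to be equal.
TRANSITIONS = {
    (0, "a"): (0, (1, 0)),
    (0, "b"): (1, (0, 1)),
    (1, "b"): (1, (0, 1)),
}
EQUAL_COUNTS = [LinearSet(base=(0, 0), periods=((1, 1),))]
```

With these definitions, `accepts("aabb", TRANSITIONS, 0, {0, 1}, 2, EQUAL_COUNTS)` holds, while `"aab"` fails the counter test and `"abab"` fails in the finite-state part.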