Detecting palindromes, patterns, and borders in regular languages
Given a language L and a nondeterministic finite automaton M, we consider
whether we can determine efficiently (in the size of M) if M accepts at least
one word in L, or infinitely many words. Given that M accepts at least one word
in L, we consider how long a shortest word can be. The languages L that we
examine include the palindromes, the non-palindromes, the k-powers, the
non-k-powers, the powers, the non-powers (also called primitive words), the
words matching a general pattern, the bordered words, and the unbordered words.
Comment: Full version of a paper submitted to LATA 2008. This is a new version
with John Loftus added as a co-author and containing new results on
unbordered words.
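As an illustration of the first question in this abstract (does M accept at least one palindrome?), here is a minimal sketch, not taken from the paper. It assumes an NFA without epsilon moves, given as a hypothetical dictionary `delta` mapping `(state, symbol)` to a set of states. The function name `accepts_palindrome` and the input encoding are assumptions made for this example. The idea is a search over pairs of states: a pair (p, q) is reachable if some word x leads from the start state to p while the reverse of x leads from q to a final state.

```python
from collections import deque

def accepts_palindrome(delta, start, finals):
    """Return True if the NFA accepts at least one palindrome.

    delta: dict mapping (state, symbol) -> set of successor states
           (hypothetical encoding; no epsilon moves assumed).
    The empty word is counted as a palindrome.
    """
    fwd = {}  # state -> list of (symbol, successor)
    rev = {}  # state -> list of (symbol, predecessor)
    for (p, a), targets in delta.items():
        for q in targets:
            fwd.setdefault(p, []).append((a, q))
            rev.setdefault(q, []).append((a, p))

    # Pair (p, q): some x takes start to p, and reverse(x) takes q to a final state.
    queue = deque((start, f) for f in finals)
    seen = set(queue)
    while queue:
        p, q = queue.popleft()
        if p == q:
            return True  # even-length palindrome: x followed by reverse(x)
        if any(t == q for _, t in fwd.get(p, [])):
            return True  # odd-length palindrome: x, one middle symbol, reverse(x)
        # Extend x by one symbol at the front half and the matching symbol at the back half.
        for a, p2 in fwd.get(p, []):
            for b, q2 in rev.get(q, []):
                if a == b and (p2, q2) not in seen:
                    seen.add((p2, q2))
                    queue.append((p2, q2))
    return False
```

The search visits at most one pair per ordered pair of states, so under these assumptions it runs in time polynomial in the size of M.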
Regular realizability problems and context-free languages
We investigate regular realizability (RR) problems, which are the problems of
verifying whether the intersection of a regular language (the input of the
problem) and a fixed language, called the filter, is non-empty. In this paper we
focus on the case of context-free filters. The algorithmic complexity of the RR
problem is a very coarse measure of the complexity of context-free languages. This
characteristic is compatible with rational dominance. We present examples of
P-complete RR problems as well as examples of RR problems in the class NL. We
also discuss RR problems with context-free filters that might have intermediate
complexity. Possible candidates are the languages with polynomially bounded
rational indices.
Comment: conference DCFS 201
Shortest Repetition-Free Words Accepted by Automata
We consider the following problem: given that a finite automaton of N
states accepts at least one k-power-free (resp., overlap-free) word, what is
the length of the shortest such word accepted? We give upper and lower bounds
which, unfortunately, are widely separated.
Comment: 12 pages, conference paper
Decision Problems on Copying and Shuffling
We study decision problems of the form: given a regular or linear
context-free language L, is there a word of a given fixed form in L, where
the fixed forms are based on the word operations copy, marked copy, shuffle, and
their combinations.
Automata Equipped with Auxiliary Data Structures and Regular Realizability Problems
We consider general computational models: one-way and two-way finite
automata, and logarithmic space Turing machines, all equipped with an auxiliary
data structure (ADS). The definition of an ADS is based on the language of
protocols for working with the ADS. We describe the connection of automata-based
models with "balloon automata", another general formalization of
automata equipped with an ADS, presented by Hopcroft and Ullman in 1967.
This definition establishes the connection between the non-emptiness problem
for one-way automata with ADS, languages recognizable by nondeterministic
log-space Turing machines equipped with the same ADS, and a regular
realizability problem (NRR) for the language of ADS protocols. The NRR problem
is to verify whether the regular language on the input has a non-empty
intersection with the language of protocols. The computational complexity of
these problems (and languages) is the same up to log-space reductions.
Comment: 25 pages. An extended version of the conference paper (DCFS 2021),
submitted to International Journal of Foundations of Computer Science
A view of the human Y chromosome: phylogeny, population dynamics and founder events
The electronic version of the dissertation does not include the publications.
Demographic events have left their mark on every human genome. Today we can 'read' them from the genetic material of both living and long-dead people. The Y chromosome is a special part of the genome that is inherited only along the paternal line; the relatedness of all the world's paternal lineages is shown by their 'family tree'. Today we can also obtain large numbers of DNA reads from the Y chromosome, which make it possible to estimate far more precisely the diversity of human paternal lineages and the divergence times of branches on the paternal tree. The doctoral thesis studied past demographic events mainly by analysing human Y-chromosome data.
The results showed that the last common ancestor of all known human paternal lineages lived in Africa about 250 thousand years ago, whereas the growth in numbers of many lineages took place within the last 15 thousand years. Surprisingly, we also found that 4–8 thousand years ago the relative number of men who had offspring dropped sharply, while in women this number did not change. Because the number of reproducing men fell at the same time as lifestyles changed, with the shift from hunting and gathering to farming, these cultural changes may have influenced the reproductive behaviour of men.
In addition, we showed that the maternal and paternal lineages of a 24,000-year-old representative of the Upper Palaeolithic Malta culture from near Lake Baikal in southern Siberia are not the typical East Eurasian lineages found in that region today, indicating a substantial change in genetic heritage over time.
Founder events, in which a new population arises from a small subset of an original group, leave a characteristic trace in the genetic heritage of the new group. We analysed them among European Roma and Ashkenazi Levite men. The paternal lineage H1a1-M82, of South Asian origin, is also found among European Roma, pointing to their original homeland. Most similar to the Roma lineages are the variants common among men from northwestern and northern India, pointing to a possible region of origin of the Roma. The typical paternal lineage of Ashkenazi Levites, R1a-Y2619, most probably originates from the Near East. We showed that it was among the founding lineages of the Ashkenazi Levites, but that its spread was linked rather to the general expansion of the Ashkenazi Jewish population.
Demographic processes have left their traces in every human genome. Today we can 'read' them from the genetic material of people living now and those who passed away long ago. Mitochondrial DNA and the Y chromosome (chrY) are parts of the genome that are passed on through the maternal and paternal lines, respectively. The relationships of all these lineages in the world are captured in a global 'family tree' of maternal or paternal lineages. Only recently has it become possible to obtain high numbers of sequencing reads from chrY as well. This makes it possible to assess the variation of human paternal lineages and date their splits on the tree with unmatched precision. This thesis investigates past demographic events mainly by analysing sequencing datasets of human chrY.
We showed that the most recent common ancestor of all known paternal lineages lived in Africa about 250 thousand years ago (kya) and that many of the now widespread lineages started to expand 15 kya. Then, 4–8 kya, the relative number of males who had offspring (Nm) decreased drastically, while in females it did not change. Since the decrease of Nm coincided with the change of lifestyle from hunting and gathering to farming, the decrease in the number of breeding males could have been caused by cultural forces that influence the reproductive behaviour of men.
The maternal and paternal lineages of a southern Siberian 24,000-year-old Upper Palaeolithic individual from near Lake Baikal are not typical East Eurasian lineages found in the area today. This testifies to population changes affecting the genetic make-up of the people living in that region.
Founder events, during which a new population forms as a small subset of an initial group, leave distinct traces in the genomic legacy of the newly formed group. We analysed these traces in the paternal gene pool of European Roma and Ashkenazi Levites. H1a1a-M82 is a paternal lineage carried by 12% of South Asian men. The same lineage is spread among European Roma, whose variants are most closely related to those of men from northwestern and northern India, pointing to their potential place of origin. The main lineage among Ashkenazi Levites, R1a-Y2619, originates in the Near East and was probably carried by the first founders of the Ashkenazi Levites. The increase in the number of carriers of this lineage was not an event specific to Levites, but part of the general Ashkenazi Jewish expansion.
Repetition in Words
The main topic of this thesis is combinatorics on words. The field of combinatorics on words dates back at least to the beginning of the 20th century when Axel Thue constructed an infinite squarefree sequence over a ternary alphabet. From this celebrated result also emerged the subfield of repetition in words which is the main focus of this thesis.
One basic tool in the study of repetition in words is the iteration of morphisms. In Chapter 1, we introduce this tool among other basic notions. In Chapter 2, we see applications of iterated morphisms in several examples. The second half of the chapter contains a survey of results concerning Dejean's conjecture. In Chapter 3, we generalize Dejean's conjecture to circular factors. We see several applications of iterated morphisms in this chapter. We continue our study of repetition in words in Chapter 4, where we study the length of the shortest repetition-free word in regular languages. Finally, in Chapter 5, we conclude by presenting a number of open problems.