59 research outputs found

Višestruko poravnavanje i HMM (Multiple Alignment and HMM)

Author: Stojković Mihaela
Publication venue: University of Zagreb. Faculty of Science. Department of Mathematics.
Publication date: 13/07/2015

Multiple sequence alignment is an important object in bioinformatics because it provides a great deal of information about protein families. In this thesis we show how to build a multiple sequence alignment using a hidden Markov model (HMM). The results turn out to be very sensitive to the choice of parameters and dependent on the sample. The analysis consists of estimating a model from an initial alignment, simulating from that model in several ways, and realigning. The distribution of scores is approximately Gumbel, as expected. Because the parametrisation of a family profile is very sensitive, the fitted model does not support further analysis. We therefore build the model gradually, starting from the best alignments, which are also the least variable. This approach avoids some of the obstacles we encountered.
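For reference, the Gumbel (type I extreme value) distribution named in the abstract has the standard form below. The symbols are notation assumed here, not taken from the thesis: x is an alignment score, \mu is a location parameter and \beta > 0 is a scale parameter.

```latex
F(x) = \exp\!\left(-\exp\!\left(-\frac{x-\mu}{\beta}\right)\right),
\qquad
f(x) = \frac{1}{\beta}\,
       \exp\!\left(-\frac{x-\mu}{\beta} - \exp\!\left(-\frac{x-\mu}{\beta}\right)\right)
```

Its mean is \mu + \gamma\beta, where \gamma \approx 0.5772 is the Euler-Mascheroni constant, so a fitted \mu and \beta can be checked against the sample mean of simulated scores.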

Repository of Faculty of Science, University of Zagreb

University of Zagreb Repository

Croatian Digital Thesis Repository

Analiza kompleksnosti skrivenih Markovljevih modela (Complexity Analysis of Hidden Markov Models)

Author: Valčić Irma
Publication venue: University of Zagreb. Faculty of Science. Department of Mathematics.
Publication date: 13/07/2015

This thesis explores hidden Markov models, a statistical tool with a growing range of applications in various fields. We give the formal definition of a hidden Markov model, describe several algorithms for working with such models, and implement them in Python. We also construct an example of an occasionally dishonest casino as a basis for more complicated applications in bioinformatics (e.g. genome modelling), simulate data, and attempt to find the model that best describes the simulated data. Several statistical model-selection methods are used (likelihood maximization, the likelihood ratio, AIC and BIC), but none of them yields a satisfactory result. This indicates how complex hidden Markov models are, even though they do not intuitively appear particularly complicated.
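The workflow described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis code: the function names (simulate, log_likelihood, n_free_params, aic, bic) are hypothetical, and the transition and emission probabilities are illustrative values assumed here, not taken from the thesis. To stay short, the criteria are evaluated at the generating parameters rather than at a fitted maximum-likelihood estimate, which a real model comparison would require.

```python
import math
import random

# Illustrative parameters (assumed for this sketch, not from the thesis).
START = {"fair": 0.5, "loaded": 0.5}
TRANS = {
    "fair": {"fair": 0.95, "loaded": 0.05},
    "loaded": {"fair": 0.10, "loaded": 0.90},
}
EMIT = {
    "fair": {face: 1 / 6 for face in range(1, 7)},
    "loaded": {**{face: 0.1 for face in range(1, 6)}, 6: 0.5},
}


def draw(dist, rng):
    """Sample one key from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes])[0]


def simulate(n, rng):
    """Generate n die rolls from the casino HMM; hidden states are discarded."""
    state = draw(START, rng)
    rolls = []
    for _ in range(n):
        rolls.append(draw(EMIT[state], rng))
        state = draw(TRANS[state], rng)
    return rolls


def log_likelihood(rolls, start, trans, emit):
    """Forward algorithm with per-step scaling; returns log P(rolls | model)."""
    states = list(start)
    alpha = {s: start[s] * emit[s][rolls[0]] for s in states}
    total = sum(alpha.values())
    loglik = math.log(total)
    alpha = {s: a / total for s, a in alpha.items()}
    for x in rolls[1:]:
        alpha = {
            t: emit[t][x] * sum(alpha[s] * trans[s][t] for s in states)
            for t in states
        }
        total = sum(alpha.values())
        loglik += math.log(total)
        alpha = {s: a / total for s, a in alpha.items()}
    return loglik


def n_free_params(n_states, n_symbols):
    """Free parameters of a discrete HMM: start, transition and emission rows."""
    return (n_states - 1) + n_states * (n_states - 1) + n_states * (n_symbols - 1)


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * loglik


# Usage: score simulated rolls under the two-state model.
demo_rolls = simulate(300, random.Random(0))
demo_ll = log_likelihood(demo_rolls, START, TRANS, EMIT)
demo_k = n_free_params(2, 6)
print(aic(demo_ll, demo_k), bic(demo_ll, demo_k, len(demo_rolls)))
```

Lower AIC or BIC indicates the preferred model. Comparing candidates with different numbers of hidden states would mean fitting each one (for example with Baum-Welch) and passing its maximized log-likelihood and parameter count to aic and bic.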

Repository of Faculty of Science, University of Zagreb

University of Zagreb Repository

Croatian Digital Thesis Repository

Poravnanje proteinskih struktura (Protein Structure Alignment)

Author: Kurolt Ivana
Publication venue: University of Zagreb. Faculty of Science. Department of Mathematics.
Publication date: 23/09/2014

This thesis describes an algorithm for aligning protein structures of different lengths. We first define a matrix of pairwise distances between points for each of the two protein structures separately. From these two matrices we build a new matrix that compares their vectors of a fixed length (set by the parameter wind) and indicates how close they are. To decide what counts as relatively close, we introduce a second parameter (trash): every value below it is set to 1 and all others to 0. We then search for runs of consecutive positions with value 1 in this matrix; these correspond to points from the two original point sets. The positions of the longest runs are stored in an array, from which we later discard those that fit worst with the rest. This procedure yields the positions in the two proteins that have the smallest mutual distances, from which we conclude which parts of the proteins coincide, i.e. which parts are aligned.
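The first steps of this procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis implementation: the abstract does not say how the vectors are formed or how they are compared, so the sketch assumes that the vector at position i holds the distances from point i to the next wind points along the chain, that two vectors are compared by Euclidean distance, and that "consecutive positions" means runs along a diagonal of the binary matrix. The function names are hypothetical, the parameter names wind and trash follow the abstract, and the final step of discarding the worst runs is omitted.

```python
import math


def distance_matrix(points):
    """Pairwise Euclidean distances between the 3D points of one structure."""
    return [[math.dist(p, q) for q in points] for p in points]


def similarity_matrix(dist_a, dist_b, wind, trash):
    """Binary matrix with 1 where local distance vectors differ by less than trash.

    Assumption: the vector at position i is the list of distances from
    point i to the next `wind` points along the chain.
    """
    rows = len(dist_a) - wind
    cols = len(dist_b) - wind
    binary = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        vec_a = dist_a[i][i + 1 : i + 1 + wind]
        for j in range(cols):
            vec_b = dist_b[j][j + 1 : j + 1 + wind]
            if math.dist(vec_a, vec_b) < trash:
                binary[i][j] = 1
    return binary


def diagonal_runs(binary):
    """Maximal runs of 1s along diagonals as (length, start_i, start_j), longest first."""
    runs = []
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    for i in range(rows):
        for j in range(cols):
            if not binary[i][j]:
                continue
            # A run starts only where the previous diagonal cell is not 1.
            if i > 0 and j > 0 and binary[i - 1][j - 1]:
                continue
            length = 0
            while (
                i + length < rows
                and j + length < cols
                and binary[i + length][j + length]
            ):
                length += 1
            runs.append((length, i, j))
    return sorted(runs, reverse=True)


# Usage: structure B is structure A without its first two points, shifted in space.
chain_a = [
    (5 * math.sin(1.3 * i), 5 * math.cos(0.7 * i * i), 0.9 * i + math.sin(2.1 * i))
    for i in range(12)
]
chain_b = [(x + 5.0, y - 3.0, z + 1.0) for x, y, z in chain_a[2:]]
matches = similarity_matrix(
    distance_matrix(chain_a), distance_matrix(chain_b), wind=3, trash=1e-6
)
print(diagonal_runs(matches)[:3])
```

Because pairwise distances do not change under translation, the longest run in this usage example starts at position 2 of chain_a and position 0 of chain_b, which is the offset built into the test data.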

Repository of Faculty of Science, University of Zagreb

University of Zagreb Repository

Croatian Digital Thesis Repository

Neural networks in speech recognition

Author: Kresnik Iva
Publication venue: University of Zagreb. Faculty of Mechanical Engineering and Naval Architecture.
Publication date: 26/09/2019

This final thesis considers the use of artificial neural networks in speech recognition. It begins by defining speech, language and script in linguistic terms. The basics of artificial neural networks and hidden Markov models are then explained. Before the speech recognition problem is developed in detail, a brief historical overview of the development of speech recognition systems is given. This is followed by a description of how a conventional speech recognition system works and of the areas within it where artificial neural networks can be applied. Systems that rely exclusively on artificial neural networks, which are still under development, are also mentioned. Finally, current problems and possible future solutions are outlined.

Repository of Faculty of Mechanical Engineering and Naval Architecture University of Zagreb

University of Zagreb Repository

Croatian Digital Thesis Repository

Karakterizacija likova u dječjim pričama (Characterization of Characters in Children's Stories)

Author: Levačić Gorana
Publication venue: University of Zagreb. Faculty of Science. Department of Mathematics.
Publication date: 29/11/2016

In this thesis we address the problem of automatically characterizing the characters in children's stories. The problem consists of recognizing the characters in the text and deciding, for each recognized character, whether he, she or it is good or bad. In the first part of the thesis we cover the theoretical background of the problem. We begin by describing the field of natural language processing, which comprises many problems. We then identify some of these as separate components of the character characterization problem: named entity recognition, coreference resolution and sentiment analysis. For each of these three problems we describe several machine learning models used to solve it.
In the second part of the thesis we implement a software solution for the character characterization problem, built on the Stanford CoreNLP toolkit developed at Stanford University. The thesis gives comprehensive documentation of the solution, as well as a detailed analysis of the results obtained for several children's stories. Based on these results we draw conclusions about the advantages and disadvantages of the implemented solution. Finally, we give several suggestions for improving the current solution and for implementing a new one.

Repository of Faculty of Science, University of Zagreb

Croatian Digital Thesis Repository

University of Zagreb Repository

Kompleksnost skrivenih Markovljevih modela (Complexity of Hidden Markov Models)

Author: Horvatek Tea
Publication venue: University of Zagreb. Faculty of Science. Department of Mathematics.
Publication date: 01/02/2018

This thesis is concerned with the analysis of hidden Markov models. We give a formal definition of an HMM and describe several algorithms used in their analysis. We implement these algorithms in Python and apply them to an occasionally dishonest casino model with 32-sided "dice". We present several well-known statistical methods for choosing the best model; however, none gives satisfactory results when used to assess the complexity of a hidden Markov model. We therefore propose a new criterion, which gives better results.

Repository of Faculty of Science, University of Zagreb

University of Zagreb Repository

Croatian Digital Thesis Repository

Poravnanje više nizova (Multiple Sequence Alignment)

Author: Grubelić Neven
Publication venue: University of Zagreb. Faculty of Science. Department of Mathematics.
Publication date: 02/12/2015

Multiple sequence alignment is a problem encountered in everyday practice, so it is of great interest to have sufficiently good methods for solving it. Because the problem is quite complex and solving it exactly (for example by dynamic programming) is not feasible in practice, this thesis gives an overview of known solution methods, with particular emphasis on solving the problem with a genetic algorithm. The quality of the solutions obtained is analysed with respect to how certain variables in the algorithm are set. Finally, a direction for possible further research on the problem is described.

Repository of Faculty of Science, University of Zagreb

University of Zagreb Repository

Croatian Digital Thesis Repository
