Search CORE

3 research outputs found

Pattern based fact extraction from Estonian free-texts

Author: Petmanson Timo
Publication venue: Tartu Ülikool
Publication date: 01/01/2012

Processing free text is one of the hardest problems in computer science. Precise analysis of texts is often difficult or impossible for computers because of ambiguity. Nevertheless, certain facts can be extracted. In this work we study pattern-based methods for deriving facts from Estonian-language texts. We apply our methodology to real texts and analyze the results. We briefly describe an active learning methodology that makes it possible to annotate large corpora faster. In addition, we have implemented a prototype solution for annotating corpora and for performing pattern-based fact extraction.

Natural language processing is one of the most difficult problems, since words and language constructions often have ambiguous meanings that cannot be resolved without extensive cultural background. However, some facts are easier to deduce than others. In this work, we consider unary, binary and ternary relations between words that can be deduced from a single sentence. The relations, represented by sets of patterns, are combined with basic machine learning methods that are used to train and deploy patterns for fact extraction. We also describe the process of active learning, which helps to speed up the annotation of relations in large corpora. Other contributions include a prototype implementation with a plain-text preprocessor, corpus annotator, pattern miner and fact extractor. Additionally, we provide an empirical study of the efficiency of the prototype implementation on several relations and corpora.

DSpace at Tartu University Library

Mining Motifs in DNA Regulatory Area

Author: Petmanson Timo
Publication venue: Tartu Ülikool
Publication date: 01/01/2010

This thesis studies algorithms that make it possible to investigate problems of gene regulation in organisms on the basis of experimental data. It focuses on searching DNA regulatory regions for important motifs and fragments that may play a critical role in regulating and coordinating an organism's vital functions. Using the mathematical formalization set out in the theoretical part, several properties are studied and proved that lay the foundation for possible search algorithms and their analysis. The practical part addresses the time efficiency of the developed algorithms and their ability to work with biological data.

In this work, we introduced and developed a novel mathematical formalization, algorithms and data structures needed to describe data mining methods that use multiple input promoters and several layers of data. We reformulated standard sequence mining techniques and studied different properties of our new formalization. We benchmarked and analyzed the running time of the algorithms. We also tested how our methods work on real biological data.

DSpace at Tartu University Library

Academic Plagiarism Detection

Author: Abnar Samira
Alberts Houda
Alfikri Zakiy Firdaus
Alvi Faisal
Alzahrani Salha
An Vo Ngoc Phuoc
Asghari Habibollah
Bagnall Douglas
Bartoli Alberto
Bela Gipp
Bensalem Imene
Billah Nagoudi El Moatez
Bobicev Victoria
Buscaldi Davide
Castillo Esteban
Castro Daniel
Ceska Zdenek
Chudá Daniela
Dan Avishek
Dawn Arnav Kumar
Dharani T.
Diego
Ehsan Nava
Elizalde Victoria
Esteki Fezeh
Fagan Jody Condit
Feng Vanessa Wei
Fishman Teddi
Franco-Salvador Marc
Fréry Jordan
Gabrilovich Evgeniy
García-Mondeja Yasmany
Garg Urvashi
Ghaeini M. R.
Gharavi Erfaneh
Gillam Lee
Gipp Bela
Glinos Demetrios G.
Goutte Cyril
Gross Philipp
Gupta Deepa
Gutierrez Josue
Gómez-Adorno Helena
Hagen Matthias
Haggag Osama
Halvani Oren
Harvey Sarah
Hussain
Hussein Ashraf S.
Hürlimann Manuela
Ibnu Subroto Imam Much
Jankowska Magdalena
Jayapal Arun
Jiffriya M. A. C.
Juola Patrick
Kanjirangat Vani
Karaś Daniel
Kern Roman
Khan Imtiaz H.
Khan Jamal Ahmad
Khonji Mahmoud
Khoshnavataher Khadijeh
Kocher Mirco
Kong Leilei
Kuznetsov Mikhail
Layton Robert
Ledesma Paola
Lee Taemin
Magooda Ahmed
Mahgoub Ashraf Y.
Maitra Promita
Mayor Cristhian
Modaresi Pashutan
Mohebbi Majid
Momtaz Mozhgan
Moreau Erwan
Norman Meuschke
Pacheco María Leonor
Palkovskii Yurii
Pertile Solange
Petmanson Timo
Pilehvar Mohammad Taher
Posadas-Durán Juan-Pablo
Potthast Martin
Prakash Amit
Rafiei Javad
Rakian Shima
Ravi N. Riya
Rexha Andi
Riya Ravi N
Rodríguez Torrejón Diego Antonio
Safin Kamil
Saini Anuj
Sanchez-Perez Miguel A
Sanchez-Perez Miguel A.
Sari Yunita
Schmidt Andreas
Seidman Shachar
Shrestha Prasha
Siddiqui Muazzam Ahmed
Sittar Abdul
Soori Hussein
Stamatatos Efstathios
Suchomel Šimon
Sánchez-Vega Fernando
Tomáš Foltýnek
Tschuggnall Michael
van Dam Michiel
Vartapetiance Anna
Veselý Ondřej
Vilariño Darnes
Wang Shuai
Wibowo Agung Toto
Williams Kyle
Yao Xuchen
Zmiycharov Valentin
Zubarev Denis
Publication venue: Association for Computing Machinery (ACM)

Crossref