104 research outputs found

Semi-automated map generation for concept gaming

Author: Lahti Lauri
Tarhio Jorma
Publication venue: International Association for Development of the Information Society. IADIS Press
Publication date: 01/01/2008

Conventional learning games often have limited flexibility to address the individual needs of a learner. The concept gaming approach provides a frame for handling conceptual structures that are defined by a concept map. A single concept map can be used to create many alternative games, and these can be chosen so that personal learning goals are taken well into account. However, the workload of creating new concept maps and sharing them effectively seems to easily hinder adoption of concept gaming. We now propose a new semi-automated map generation method for concept gaming. Due to the fast increase in the open access knowledge available on the Web, the articles of the Wikipedia encyclopedia were chosen to serve as a source for concept map generation. Based on a given entry name, the proposed method produces hierarchical concept maps that can be freely explored and modified. Variants of this approach could be successfully implemented in a wide range of educational tasks. In addition, ideas for further development of concept gaming are proposed. Peer reviewed
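As a rough illustration of producing a hierarchical concept map from a given entry name, the sketch below expands a root entry breadth-first over a link table. It is not the authors' method: the `links` dictionary (entry name to linked entry names), the function name `build_concept_map`, and the `max_depth` cut-off are all assumptions made for this example, and no real Wikipedia API is called.

```python
from collections import deque


def build_concept_map(links, root, max_depth):
    """Build a hierarchy (parent -> list of children) by breadth-first expansion.

    `links` is a hypothetical, pre-extracted mapping from an entry name to the
    entry names it links to. Each entry appears once, under the first parent
    that reaches it, so the result is a tree rooted at `root`.
    """
    tree = {root: []}
    queue = deque([(root, 0)])
    while queue:
        entry, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand entries at the depth limit
        for target in links.get(entry, []):
            if target not in tree:
                tree[target] = []
                tree[entry].append(target)
                queue.append((target, depth + 1))
    return tree


# Toy link table standing in for links extracted from encyclopedia articles.
example_links = {
    "Algorithm": ["String matching", "Graph"],
    "String matching": ["Algorithm", "Automaton"],
    "Graph": ["Automaton"],
}
concept_map = build_concept_map(example_links, "Algorithm", 2)
```

A learner or teacher could then prune or rearrange the resulting dictionary by hand, which is the "semi-automated" part the abstract refers to.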

Aaltodoc Publication Archive

Transposition invariant pattern matching for multi-track strings

Author: Lemstrom Kjell
Tarhio Jorma
Publication date: 01/01/2003

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

The Hierarchie of LT-Attributed Grammars

Author: Melichar Bořivoj
op den Akker Rieks
Tarhio Jorma
Publication venue: Springer
Publication date: 01/01/1988

University of Twente Research Information

Attribute evaluation and parsing

Author: Melichar Bořivoj
op den Akker Rieks
Tarhio Jorma
Publication venue: Springer
Publication date: 01/01/1991

Crossref

University of Twente Research Information

Using reads to annotate the genome: influence of length, background distribution, and sequence errors on prediction capacity

Author: Boureux Anthony
Bréhélin Laurent
Commes Thérèse
Philippe Nicolas
Rivals Éric
Tarhio Jorma
Publication venue: Oxford University Press
Publication date: 16/06/2009

Ultra high-throughput sequencing is used to analyse the transcriptome or interactome at unprecedented depth on a genome-wide scale. These techniques yield short sequence reads that are then mapped on a genome sequence to predict putatively transcribed or protein-interacting regions. We argue that factors such as background distribution, sequence errors, and read length impact on the prediction capacity of sequence census experiments. Here we suggest a computational approach to measure these factors and analyse their influence on both transcriptomic and epigenomic assays. This investigation provides new clues on both methodological and biological issues. For instance, by analysing chromatin immunoprecipitation read sets, we estimate that 4.6% of reads are affected by SNPs. We show that, although the nucleotide error probability is low, it significantly increases with the position in the sequence. Choosing a read length above 19 bp practically eliminates the risk of finding irrelevant positions, while above 20 bp the number of uniquely mapped reads decreases. With our procedure, we obtain 0.6% false positives among genomic locations. Hence, even rare signatures should identify biologically relevant regions, if they are mapped on the genome. This indicates that digital transcriptomics may help to characterize the wealth of yet undiscovered, low-abundance transcripts
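To make the link between read length and unambiguous mapping concrete, here is a minimal sketch that measures, for a toy sequence, what fraction of positions carry a substring of length k occurring exactly once. It is a simplification and not the procedure of the paper: the function name `unique_fraction` is hypothetical, only one strand is considered, and sequence errors and SNPs (which the abstract identifies as the reason longer reads map less often) are not modelled.

```python
from collections import Counter


def unique_fraction(genome: str, k: int) -> float:
    """Fraction of length-k windows of `genome` that occur exactly once in it."""
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    if not kmers:
        return 0.0  # k is longer than the sequence
    counts = Counter(kmers)
    unique = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique / len(kmers)


toy_genome = "ACGTACGA"  # illustrative data only
short_reads = unique_fraction(toy_genome, 2)  # many repeated 2-mers
longer_reads = unique_fraction(toy_genome, 4)  # every 4-mer is distinct here
```

In this error-free toy model, uniqueness can only improve as k grows; the trade-off reported in the abstract appears once per-base error rates are taken into account.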

PubMed Central

HAL Descartes

Parallel and sequential approximation of shortest superstrings

Author: J-S. Turner
J. Gallant
J. Tarhio
M. R. Garey
V. Chvatal
Publication venue: Springer Science and Business Media LLC

Crossref

A framework for research on technology-enhanced special education

Author: Jormanainen Ilkka
Kärnä-Lin Eija
Lahti Lauri
Pihlainen-Bednarik Kaisa
Sutinen E.
Tarhio J.
Virnes M.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2007

Based on results from the Technologies for Children with Individual Needs Project and two case projects, we propose a new multidisciplinary framework for research between computer science, educational technology, and special education. The framework presents a way to conduct research that aims at developing new methods for technology-enhanced special education and for developing adaptable software and hardware tools for individual needs in educational settings. Peer reviewed

Crossref

Aaltodoc Publication Archive

Fast Searching in Packed Strings

Author: A. Amir
D.E. Knuth
E.W. Myers
G. Navarro
J. Tarhio
K. Fredriksson
R. Baeza-Yates
R.A. Baeza-Yates
R.M. Karp
R.S. Boyer
S. Wu
S.T. Klein
T.A. Welch
V.L. Arlazarov
W. Masek
W. Rytter
Publication date: 01/01/2009

Given strings $P$ and $Q$ the (exact) string matching problem is to find all positions of substrings in $Q$ matching $P$. The classical Knuth-Morris-Pratt algorithm [SIAM J. Comput., 1977] solves the string matching problem in linear time which is optimal if we can only read one character at the time. However, most strings are stored in a computer in a packed representation with several characters in a single word, giving us the opportunity to read multiple characters simultaneously. In this paper we study the worst-case complexity of string matching on strings given in packed representation. Let $m \leq n$ be the lengths of $P$ and $Q$, respectively, and let $\sigma$ denote the size of the alphabet. On a standard unit-cost word-RAM with logarithmic word size we present an algorithm using time $O\left(\frac{n}{\log_\sigma n} + m + \mathrm{occ}\right)$. Here $\mathrm{occ}$ is the number of occurrences of $P$ in $Q$. For $m = o(n)$ this improves the $O(n)$ bound of the Knuth-Morris-Pratt algorithm. Furthermore, if $m = O(n/\log_\sigma n)$ our algorithm is optimal since any algorithm must spend at least $\Omega\left(\frac{(n+m)\log \sigma}{\log n} + \mathrm{occ}\right) = \Omega\left(\frac{n}{\log_\sigma n} + \mathrm{occ}\right)$ time to read the input and report all occurrences. The result is obtained by a novel automaton construction based on the Knuth-Morris-Pratt algorithm combined with a new compact representation of subautomata allowing an optimal tabulation-based simulation. Comment: To appear in Journal of Discrete Algorithms. Special Issue on CPM 200
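For reference, the classical Knuth-Morris-Pratt baseline mentioned in the abstract can be sketched as follows. This is the textbook character-at-a-time algorithm, not the packed-string algorithm of the paper; the function name `kmp_find_all` is chosen for this example.

```python
def kmp_find_all(pattern: str, text: str) -> list[int]:
    """Return the start index of every occurrence of `pattern` in `text`."""
    if not pattern:
        return []
    # failure[i] = length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it.
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k

    matches = []
    k = 0  # number of pattern characters currently matched
    for j, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(j - k + 1)
            k = failure[k - 1]  # keep going to find overlapping occurrences
    return matches


positions = kmp_find_all("aba", "ababa")
```

Each text character is read once and the failure links bound the total backtracking, which gives the linear running time that the packed-string result improves on when several characters fit in one machine word.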

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Crossref

Online Research Database In Technology

Transcriptome annotation using tandem SAGE tags

Author: Anthony Boureux
Bertone
Brenner
Carninci
Chen
Cheng
Claverie
Cummins
ENCODE
Eric Rivals
Fabien Pierrat
Florence Ottones
Florence Ruffle
Ge
Horspool
Huttenhofer
Jacques Marti
Johnson
Jorma Tarhio
Jurka
Margulies
Mireille Lejeune
Mockler
Ng
Nielsen
Oscar Pecharromàn Pérez
Piquemal
Quéré
Rinn
Saha
Semon
Shendure
Silva
Tarhio
Thérèse Commes
Velculescu
Virlon
Wheeler
Woelk
Publication venue: Oxford University Press
Publication date: 01/01/2007

Analysis of several million expressed gene signatures (tags) revealed an increasing number of different sequences, largely exceeding that of annotated genes in mammalian genomes. Serial analysis of gene expression (SAGE) can reveal new Poly(A) RNAs transcribed from previously unrecognized chromosomal regions. However, conventional SAGE tags are too short to unambiguously identify unique sites in large genomes. Here, we design a novel strategy with tags anchored on two different restriction sites of cDNAs. New transcripts are then tentatively defined by the two SAGE tags in tandem and by the spanning sequence read on the genome between these tagged sites. Having developed a new algorithm to locate these tag-delimited genomic sequences (TDGS), we first validated its capacity to recognize known genes and its ability to reveal new transcripts with two SAGE libraries built in parallel from a single RNA sample. Our algorithm proves fast enough to apply this strategy at a large scale. We then collected and processed the complete sets of human SAGE tags to predict yet unknown transcripts. A cross-validation with tiling array data shows that 47% of these TDGS overlap transcriptionally active regions. Our method provides a new and complementary approach for complex transcriptome annotation.
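The idea of a tag-delimited genomic sequence can be illustrated with a naive search for a left tag followed by a right tag within a bounded span. This is only a sketch under stated assumptions, not the algorithm developed in the paper: the function name `find_tag_delimited`, the `max_span` parameter, and the toy sequence are hypothetical, and only exact matches on one strand are considered.

```python
def find_tag_delimited(genome: str, left_tag: str, right_tag: str, max_span: int):
    """Return (start, end) pairs with genome[start:end] beginning with
    `left_tag`, ending with `right_tag`, the two tags not overlapping,
    and end - start <= max_span."""

    def occurrences(tag: str) -> list[int]:
        hits, pos = [], genome.find(tag)
        while pos != -1:
            hits.append(pos)
            pos = genome.find(tag, pos + 1)
        return hits

    left_hits = occurrences(left_tag)
    right_hits = occurrences(right_tag)
    regions = []
    for start in left_hits:
        for right_start in right_hits:
            end = right_start + len(right_tag)
            if right_start >= start + len(left_tag) and end - start <= max_span:
                regions.append((start, end))
    return regions


toy_genome = "AACATGTTTTGGCCAA"  # illustrative data only
regions = find_tag_delimited(toy_genome, "CATG", "GGCC", 20)
```

The nested loop is quadratic in the number of tag hits; a genome-scale tool would need an index or a multi-pattern matcher, which is presumably where the speed claimed in the abstract comes from.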