Markov Aggregation for Speeding Up Agent-Based Movement Simulations
...In this work, we investigate Markov aggregation for agent-based models (ABMs). Specifically, if the ABM models agent movements on a graph, if its ruleset satisfies certain assumptions, and if the aim is to simulate aggregate statistics such as vertex populations, then the ABM can be replaced by a Markov chain on a comparably small state space. This equivalence between a function of the ABM and a smaller Markov chain reduces the computational complexity of the agent-based simulation from linear in the number of agents to constant in the number of agents and polynomial in the number of locations. We instantiate our theory for a recent ABM for forced migration (Flee). We show that, even though the rulesets of Flee violate some of our necessary assumptions, the aggregated Markov chain-based model, Markov Flee, achieves comparable accuracy at substantially reduced computational cost. Thus, Markov Flee can help NGOs and policy makers forecast forced migration in certain conflict scenarios in a cost-effective manner, contributing to fast and efficient delivery of humanitarian relief. This work has been supported by the HiDALGO, ITFLOWS, SEAVEA
ExCALIBUR, and BrAIN projects. The projects HiDALGO (Grant No.
824115) and ITFLOWS (Grant No. 882986) have been funded by the
European Commission’s H2020 Programme. The project SEAVEA
ExCALIBUR (Grant No. EP/W007711/1) has received funding from
EPSRC. The project BrAIN – Brownfield Artificial Intelligence Network
for Forging of High Quality Aerospace Components (Grant
No. 881039) is funded in the framework of the program “TAKE
OFF”, which is a research and technology program of the Austrian
Federal Ministry of Transport, Innovation and Technology.
The Know-Center is funded within the Austrian COMET Program
- Competence Centers for Excellent Technologies - under
the auspices of the Austrian Federal Ministry of Climate Action,
Environment, Energy, Mobility, Innovation and Technology, the
Austrian Federal Ministry of Digital and Economic Affairs, and by
the State of Styria. COMET is managed by the Austrian Research
Promotion Agency (FFG).
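The aggregation idea in the abstract above can be illustrated with a toy sketch: instead of moving every agent individually on a graph, one propagates the vector of vertex populations through the transition matrix, so the per-step cost no longer depends on the number of agents. The graph, transition matrix, and step counts below are illustrative assumptions, not the Flee model itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy movement model: L locations with a row-stochastic transition matrix P.
L = 5
P = rng.random((L, L))
P /= P.sum(axis=1, keepdims=True)

# Agent-based simulation: cost per step grows with the number of agents.
n_agents = 2_000
loc = rng.integers(0, L, size=n_agents)          # each agent's current vertex
for _ in range(30):
    # every agent samples its next vertex independently
    loc = np.array([rng.choice(L, p=P[v]) for v in loc])
abm_pop = np.bincount(loc, minlength=L) / n_agents

# Aggregated Markov chain: propagate the vertex-population distribution
# directly; cost per step is O(L^2), independent of the number of agents.
pi = np.full(L, 1.0 / L)
for _ in range(30):
    pi = pi @ P

print(abm_pop)   # empirical vertex populations from the ABM
print(pi)        # aggregate prediction from the small Markov chain
```

Both runs converge toward the chain's stationary distribution, so the aggregate statistics agree up to sampling noise while the Markov version does no per-agent work.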
MolabIS: A Labs Backbone for Storing, Managing and Evaluating Molecular Genetics Data
Using paper lab books and spreadsheets to store and manage growing datasets in a file system is inefficient, time-consuming and error-prone. The overall purpose of this study is therefore to develop an integrated information system for small laboratories conducting Sanger sequencing and microsatellite genotyping projects. To address this, the thesis investigated the following three issues. First, we proposed a uniform solution using a workflow approach to efficiently collect and store data items in different labs. The outcome is the design of a formalized data framework, which is the basis for creating a general data model for biodiversity studies. Second, we designed and implemented a web-based information system (MolabIS) allowing lab staff to store all original data at each step of their workflow. MolabIS provides essential tools to import, store, organize, search, modify, report and export relevant data. Finally, we conducted a case study to evaluate the performance of MolabIS with typical operations in a production mode. Based on this, we propose the use of a virtual appliance as an efficient solution for deploying complex open-source information systems like MolabIS. The major result of this study, along with the publications, is the MolabIS software, which is freely released under the GPL license at http://www.molabis.org. With its general data model, easy installation process and additional tools for data migration, MolabIS can be used in a wide range of molecular genetics labs.
The COCOMO Models in the Light of Agile Software Development
Effort estimations are important in order to make economic and strategic decisions in software development. Various publications propagate the Constructive Cost Model (COCOMO) as an algorithmic cost model based on formulas with objective variables for estimations in classical software development (KS). Papers from agile software development (AS) refer to the use of experience-based estimation methods and subjective variables. Due to the weak operationalization in an agile context, statements about concrete cause-and-effect relationships are difficult to make. In addition, classical and agile investigations focus one-sidedly on their own research field, with the consequence that the use of variables from COCOMO in AS is unclear. If details were available, operationalized variables from COCOMO could also be used in AS. This would make it possible to conceptualize concrete causal dependencies in a scientific investigation; these findings would in turn allow an optimization of the development process. To identify variables, a qualitative and descriptive study with a literature review and an evaluation of the sources is carried out. First results between the two worlds show both differences and similarities. A large number of variables from COCOMO can be used in AS; the extent to which this is possible depends on the objective and subjective proportions of the variables. Variables with an experience-based background, such as Analyst Capability (ACAP) and Programmer Capability (PCAP), transfer well to AS because they match person-related characteristics. At the same time, variables from the process and tool environment are less easily transferable, because AS specifically rejects a focus on such project features. A re-use of variables is thus possible in principle, taking the given conditions into account.
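The algorithmic model discussed above follows the well-known COCOMO form effort = a · KLOC^b · Π EM_i, where the EM_i are effort multipliers for cost drivers such as ACAP and PCAP. A minimal sketch, using the commonly cited organic-mode constants and multiplier values as illustrative assumptions (not calibrated figures from this thesis):

```python
# Commonly cited constants for an "organic" (small, in-house) project.
ORGANIC_A, ORGANIC_B = 3.2, 1.05

# Two of the experience-based cost drivers named in the abstract.
# Ratings map to effort multipliers (< 1.0 means effort is reduced).
ACAP = {"very_low": 1.46, "nominal": 1.00, "very_high": 0.71}
PCAP = {"very_low": 1.42, "nominal": 1.00, "very_high": 0.70}

def cocomo_effort(kloc: float, acap: str = "nominal", pcap: str = "nominal") -> float:
    """Estimated effort in person-months: a * KLOC^b * product of multipliers."""
    return ORGANIC_A * kloc ** ORGANIC_B * ACAP[acap] * PCAP[pcap]

# A highly capable team roughly halves the estimate for a 32 KLOC project:
print(round(cocomo_effort(32.0), 1))
print(round(cocomo_effort(32.0, acap="very_high", pcap="very_high"), 1))
```

The subjective judgment enters through the ratings ("nominal", "very_high", ...), which is precisely the experience-based component the agile literature emphasizes.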
Named Entity Recognition and Text Compression
In recent years, social networks have become very popular. It is easy for users
to share their data using online social networks. Since data on social networks is
idiomatic, irregular, brief, and includes acronyms and spelling errors, dealing with
such data is more challenging than that of news or formal texts. With the huge
volume of posts each day, effective extraction and processing of these data will bring
great benefit to information extraction applications.
This thesis proposes a method to normalize Vietnamese informal text in social
networks. This method has the ability to identify and normalize informal text
based on the structure of Vietnamese words, Vietnamese syllable rules, and a trigram
model. After normalization, the data will be processed by a named entity
recognition (NER) model to identify and classify the named entities in these data.
In our NER model, we use six different types of features to recognize named entities
categorized in three predefined classes: Person (PER), Location (LOC), and
Organization (ORG).
When examining social network data, we found that these data are very
large and grow daily. This raises the challenge of how to reduce their size. Since
the data to be normalized require a trigram dictionary that is quite
large, we also need to reduce its size. To deal with this challenge, in this
thesis, we propose three methods to compress text files, especially in Vietnamese
text. The first method is a syllable-based method relying on the structure of
Vietnamese morphosyllables, consonants, syllables and vowels. The second method
is trigram-based Vietnamese text compression based on a trigram dictionary. The
last method is based on an n-gram sliding window, in which we use five dictionaries
for unigrams, bigrams, trigrams, four-grams and five-grams. This method achieves
a promising compression ratio of around 90% and can be used for any size of text file.
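The dictionary-based n-gram compression described above can be sketched as a toy substitution scheme: frequent word n-grams from a reference corpus are replaced by short indices into a shared dictionary. The corpus, dictionary handling, and token encoding below are illustrative assumptions; a real implementation would encode the indices at the bit level, which is where the reported ratios come from.

```python
from collections import Counter

def build_ngram_dict(corpus_words, n=3):
    """Collect the n-grams of a reference corpus, most frequent first."""
    grams = Counter(tuple(corpus_words[i:i + n])
                    for i in range(len(corpus_words) - n + 1))
    return [g for g, _ in grams.most_common()]

def compress(words, ngram_list, n=3):
    """Greedy left-to-right replacement of known n-grams by dictionary indices."""
    index = {g: i for i, g in enumerate(ngram_list)}
    out, i = [], 0
    while i < len(words):
        g = tuple(words[i:i + n])
        if g in index:
            out.append(("ref", index[g]))   # one token instead of n words
            i += n
        else:
            out.append(("lit", words[i]))   # pass unknown words through
            i += 1
    return out

def decompress(tokens, ngram_list):
    words = []
    for kind, v in tokens:
        words.extend(ngram_list[v] if kind == "ref" else [v])
    return words

text = "xin chao cac ban xin chao cac ban than men".split()
d = build_ngram_dict(text)
enc = compress(text, d)
assert decompress(enc, d) == text           # lossless round trip
print(len(enc), "tokens for", len(text), "words")
```

The trigram dictionary does double duty in the thesis: it drives the normalization of informal text and serves as the shared codebook for compression, which is why its own size also has to be reduced.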
Fine spatial scale modelling of Trentino past forest landscape and future change scenarios to study ecosystem services through the years
Landscape in Europe has dramatically changed in the last decades. This has been especially
true for Alpine regions, where the progressive urbanization of the valleys has been accom-
panied by the abandonment of smaller villages and areas at higher elevation. This trend
has been clearly observable in the Provincia Autonoma di Trento (PAT) region in the Italian
Alps. The impact has been substantial for many rural areas, with the progressive shrinking
of meadows and pastures due to the forest natural recolonization. These modifications of the
landscape affect biodiversity, social and cultural dynamics, including landscape perception
and some ecosystem services. A literature review showed that this topic has been addressed
by several authors across the Alps, but their studies are limited in spatial coverage, spatial
resolution and time span. This thesis aims to create a comprehensive dataset of historical
maps and multitemporal orthophotos in the area of PAT, to perform data analysis to identify
the changes in forest and open areas, to evaluate how these changes affected landscape
structure and ecosystems, to create a future change scenario for a test area, and to highlight
some major changes in ecosystem services through time.
In this study a high resolution dataset of maps covering the whole PAT area for over
a century was developed. The earliest representation of the PAT territory containing
reliable data about forest coverage was considered to be the Historic Cadastral maps of 1859.
These maps systematically and accurately represented the land use of each parcel in
the Habsburg Empire, including the PAT. The Italian Kingdom Forest Maps were the
next important source of information about forest coverage after World War I, before
the most recent datasets: the greyscale images of 1954 and 1994 and the multiband
images of 2006 and 2015.
The purpose of the dataset development is twofold: to create a series of maps describing
the forest and open areas coverage in the last 160 years for the whole PAT on one hand and
to setup and test procedures to extract the relevant information from imagery and historical
maps on the other. The datasets were archived, processed and analysed using the Free and
Open Source Software (FOSS) GIS GRASS, QGIS and R.
The goal set by this work was achieved by a remote sensed analysis of said maps and
aerial imagery. A series of procedures were applied to extract a land use map, with the forest
categories reaching a level of detail rarely achieved for a study area of such an extension
(6200 km²). The resolution of the original maps is in fact at a meter level, whereas the coarser
resampling adopted is 10 m × 10 m pixels.
The great variety and size of the input data required the development, alongside the main part
of the research, of a series of new tools for automating the analysis of the aerial imagery
and reducing user intervention. New tools for historic map classification were developed as well,
to eliminate cartographic symbols (e.g., signs) from the resulting land use maps, thus
enhancing the results.
Once the multitemporal forest maps were obtained, the second phase of the current work
was a qualitative and quantitative assessment of the forest coverage and how it changed.
This was performed by the evaluation of a number of landscape metrics, indexes used to
quantify the compaction or the rarefaction of the forest areas.
While analysing these metrics, a recurring issue in the current literature on landscape
metrics was identified and extensively studied. This highlighted the
importance of specifying some parameters in the most widely used landscape fragmentation
analysis software to make the results of different studies properly comparable.
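The kind of compaction metrics referred to above can be illustrated on a toy binary forest/non-forest raster: forest cover fraction and the number of forest patches (4-connected components). The actual analyses would use GRASS GIS or FRAGSTATS-style tools; the raster below is an invented example.

```python
def forest_metrics(grid):
    """Return (forest cover fraction, number of 4-connected forest patches)."""
    rows, cols = len(grid), len(grid[0])
    forest = sum(v for row in grid for v in row)
    cover = forest / (rows * cols)

    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not seen[r][c]:
                patches += 1                  # start of a new patch
                stack = [(r, c)]
                while stack:                  # flood-fill the whole patch
                    y, x = stack.pop()
                    if 0 <= y < rows and 0 <= x < cols and grid[y][x] and not seen[y][x]:
                        seen[y][x] = True
                        stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
    return cover, patches

raster = [[1, 1, 0, 0],
          [1, 0, 0, 1],
          [0, 0, 1, 1],
          [0, 0, 1, 0]]
print(forest_metrics(raster))  # → (0.4375, 2)
```

Tracking such values over the multitemporal maps is what reveals compaction: cover rises while the patch count falls as formerly separate forest fragments merge.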
Within this analysis, data from other maps were used to characterize the process of afforestation in PAT: the potential forest maps, used to quantify
the area of potential forest actually afforested through the years; the Digital Elevation
Model, used to quantify the changes in forest area at different ranges of
altitude; and finally the forest class map, used to estimate how afforestation
affected each single forest type.
The output forest maps were used to analyse and estimate some ecosystem services, in par-
ticular the protection from soil erosion, the changes in biodiversity and the landscape of the
forests.
Finally, a procedure for the analysis of future change scenarios was set up to study how
afforestation will proceed in the absence of external factors in a protected area of PAT. The
procedure was developed using Agent Based Models, which treat trees as agents
able to choose where to expand the forest area.
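The agent-based expansion step described above can be sketched as a toy grid model: each forested cell acts as an agent that may colonize a randomly chosen empty neighbour with a fixed probability per step. Grid size, spread probability, and the neighbourhood rule are illustrative assumptions, not the thesis's calibrated model.

```python
import random

rng = random.Random(42)

def expand_forest(grid, p_spread=0.3):
    """One simulation step. Grid codes: 1 = forest, 0 = open area."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1 and rng.random() < p_spread:
                empty = [(r + dr, c + dc)
                         for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                         if 0 <= r + dr < rows and 0 <= c + dc < cols
                         and grid[r + dr][c + dc] == 0]
                if empty:
                    y, x = rng.choice(empty)   # the agent "chooses" a target cell
                    new[y][x] = 1
    return new

grid = [[0] * 6 for _ in range(6)]
grid[2][2] = 1                                 # a single seed tree
for _ in range(10):
    grid = expand_forest(grid)
print(sum(map(sum, grid)), "forested cells after 10 steps")
```

Running such a step repeatedly from the current forest map yields candidate future scenarios, which can then be compared against simulations from other models, as the abstract goes on to describe.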
The first part of the results achieved consists in a temporal series of maps representing the
situation of the forest in each year of the considered dataset. The analysis of these maps
suggests a trend of afforestation across the PAT territory. The forest maps were then reclassi-
fied by altitude ranges and forest types to show how the afforestation proceeded at different
altitudes and forest types. The results showed that forest expansion acted homogeneously
across different altitudes and forest types. The analysis of a selected set of landscape
metrics showed a progressive compaction of the forests at the expense of the open areas, in each
altitude range and for each forest type. On the one hand this benefited all those
ecosystem services linked to a high forest cover, while on the other it reduced ecotonal habitats and affected
biodiversity distribution and quality. Finally, the ABM procedure resulted in a set of maps
representing a possible evolution of the forest in an area of PAT, a situation similar
to that of other simulations developed using different models in the same area. A
second part of the results achieved in the current work consists of new open source tools
for image analysis, developed to achieve the results shown but with a potentially wider
field of application, along with new procedures for the evaluation of the image classification.
The current work fulfilled its aims while providing new tools and enhancements
of existing tools for remote sensing, and leaving as a legacy a large dataset that will be
used to deepen the knowledge of the territory of PAT and, more widely, to study emerging
patterns in afforestation in an alpine environment.
Image Processing Using FPGAs
This book presents a selection of papers representing current research on using field programmable gate arrays (FPGAs) for realising image processing algorithms. These papers are reprints of papers selected for a Special Issue of the Journal of Imaging on image processing using FPGAs. A diverse range of topics is covered, including parallel soft processors, memory management, image filters, segmentation, clustering, image analysis, and image compression. Applications include traffic sign recognition for autonomous driving, cell detection for histopathology, and video compression. Collectively, they represent the current state-of-the-art on image processing using FPGAs
Simulations and Modelling for Biological Invasions
Biological invasions are characterized by the movement of organisms from their native geographic region to new, distinct regions in which they may have significant impacts. Biological invasions pose one of the most serious threats to global biodiversity, and hence significant resources are invested in predicting, preventing, and managing them. Biological systems and processes are typically large, complex, and inherently difficult to study naturally because of their immense scale and complexity. Hence, computational modelling and simulation approaches can be taken to study them. In this dissertation, I applied computer simulations to address two important problems in invasion biology. First, in invasion biology, the impact of genetic diversity of introduced populations on their establishment success is unknown. We took an individual-based modelling approach to explore this, leveraging an ecosystem simulation called EcoSim to simulate biological invasions. We conducted reciprocal transplants of prey individuals across two simulated environments, over a gradient of genetic diversity. Our simulation results demonstrated that a harsh environment with low and spatially-varying resource abundance mediated a relationship between genetic diversity and short-term establishment success of introduced populations rather than the degree of difference between native and introduced ranges. We also found that reducing Allee effects by maintaining compactness, a measure of spatial density, was key to the establishment success of prey individuals in EcoSim, which were sexually reproducing. Further, we found evidence of a more complex relationship between genetic diversity and long-term establishment success, assuming multiple introductions were occurring. Low-diversity populations seemed to benefit more strongly from multiple introductions than high-diversity populations. 
Our results also corroborated the evolutionary imbalance hypothesis: the environment that yielded greater diversity produced better invaders and was itself less invasible. Finally, our study corroborated a mechanistic explanation for the evolutionary imbalance hypothesis: the populations that evolved in a more intense competitive environment produced better invaders. Secondly, an important advancement in invasion biology is the use of genetic barcoding or metabarcoding, in conjunction with next-generation sequencing, as a potential means of early detection of aquatic introduced species. Barcoding and metabarcoding invariably require some amount of computational DNA sequence processing. Unfortunately, optimal processing parameters are not known in advance, and the consequences of suboptimal parameter selection are poorly understood. We aimed to determine the optimal parameterization of a common sequence processing pipeline for both early detection of aquatic nonindigenous species and conducting species richness assessments. We then assessed the performance of the optimized pipelines in a simulated inoculation of sequences into community samples. We found that early detection requires relatively lenient processing parameters. Further, optimality depended on the research goal: what was optimal for early detection was suboptimal for estimating species richness, and vice versa. Finally, with optimal parameter selection, fewer than 11 target sequences were required in order to detect 90% of nonindigenous species.
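The lenient-versus-strict trade-off described above can be illustrated with a toy inoculation experiment: a handful of "nonindigenous" reads are added to a community sample, and a minimum-abundance filter (a common denoising step in such pipelines) is applied at two cutoffs. The species labels, read counts, and cutoffs are invented for illustration only.

```python
import random

rng = random.Random(1)

# A community sample of 995 reads from 20 native species,
# inoculated with 5 reads of a rare nonindigenous target.
community = ["native_%d" % rng.randrange(20) for _ in range(995)]
sample = community + ["invader"] * 5

def detected_species(reads, min_abundance):
    """Species whose read count survives a minimum-abundance filter."""
    counts = {}
    for r in reads:
        counts[r] = counts.get(r, 0) + 1
    return {sp for sp, n in counts.items() if n >= min_abundance}

strict = detected_species(sample, min_abundance=10)
lenient = detected_species(sample, min_abundance=2)
print("invader" in strict, "invader" in lenient)  # → False True
```

A strict cutoff discards the rare target along with the noise, which is why early detection favours lenient parameters even though leniency inflates richness estimates.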
Deep Learning in Medical Image Analysis
The accelerating power of deep learning in diagnosing diseases will empower physicians and speed up decision making in clinical environments. Applications of modern medical instruments and digitalization of medical care have generated enormous amounts of medical images in recent years. In this big data arena, new deep learning methods and computational models for efficient data processing, analysis, and modeling of the generated data are crucially important for clinical applications and understanding the underlying biological process. This book presents and highlights novel algorithms, architectures, techniques, and applications of deep learning for medical image analysis