Linking early geospatial documents, one place at a time: annotation of geographic documents with Recogito
Recogito is an open source tool for the semi-automatic annotation of place references in maps and texts. It was developed as part of the Pelagios 3 research project, which aims to build up a comprehensive directory of places referred to in early maps and geographic writing predating the year 1492. Pelagios 3 focuses specifically on sources from the Classical Latin, Greek and Byzantine periods; on Mappae Mundi and narrative texts from the European Medieval period; on Late Medieval Portolans; and on maps and texts from the early Islamic and early Chinese traditions. Since the start of the project in September 2013, the team has harvested more than 120,000 toponyms, manually verifying almost 60,000 of them. Furthermore, the team held two public annotation workshops supported through the Open Humanities Awards 2014. In these workshops, a mixed audience of students and academics of different backgrounds used Recogito to add several thousand contributions on each workshop day.
A number of benefits arise out of this work: on the one hand, the digital identification of places – and the names used for them – makes the documents' contents amenable to information retrieval technology, i.e. documents become more easily searchable and discoverable for users than through conventional metadata-based search alone. On the other hand, the documents are opened up to new forms of re-use. For example, it becomes possible to “map” and compare the narrative of texts and the contents of maps with modern-day tools like Web maps and GIS, or to analyze and contrast documents’ geographic properties, toponymy and spatial relationships. Seen in a wider context, we argue that initiatives such as ours contribute to the growing ecosystem of the “Graph of Humanities Data” that is gathering pace in the Digital Humanities (linking data about people, places, events, canonical references, etc.), which has the potential to open up new avenues for computational and quantitative research in a variety of fields including History, Geography, Archaeology, Classics, Genealogy and Modern Languages.
Detecting Urban Road Changes using Segmentation and Vector Analysis
The rapid growth of urbanization is driving increased road infrastructure development, and detecting and monitoring changes in urban road areas is challenging for city planners. This research proposes using semantic segmentation and vector analysis on high-resolution images to identify road network changes. A U-Net model, pre-trained on a Massachusetts road dataset, performs the semantic segmentation and predicts labels for a specific area from multi-temporal imagery that is co-registered to reduce distortions. Satellite images from Google Earth archives are used to demonstrate the change detection process. The predicted labels are converted to shapefiles, and vector analysis of these shapefiles is used to pinpoint and characterize the changes.
Automatic tree detection and attribute characterization using portable terrestrial lidar
The use of portable laser scanners (PLS) in forest inventories is currently being studied, since they significantly reduce field-work time and costs compared to traditional inventory methods and other LiDAR systems. However, it has been shown that their operability and efficiency depend on the species assessed, so more research is needed on different types of stands and species. In addition, few studies have been conducted in Eucalyptus stands, even though Eucalyptus is one of the most commonly planted tree genera in the world. In this study, a PLS system was tested in a Eucalyptus globulus stand to obtain different metrics of individual trees. An automatic methodology to obtain inventory data (individual tree positions, DBH, diameter at different heights, and height of individual trees) was developed using public domain software. The results were compared to those obtained with a static terrestrial laser scanner (TLS). The methodology identified 100% of the trees present in the stand in both the PLS and TLS point clouds. For the PLS point cloud, the RMSE of the DBH obtained was 0.0716, and for the TLS point cloud, it was 0.176. The RMSE for height was 3.415 m for the PLS point cloud and 10.712 m for the TLS point cloud. This study demonstrates the applicability of PLS systems for estimating the metrics of individual trees in adult Eucalyptus globulus stands.
Agencia Estatal de Investigación | Ref. PID2019-111581RB-I00; Ministerio de Ciencia, Innovación y Universidades | Ref. FPU19/02054; Universidade de Vigo/CISU
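The RMSE figures quoted above summarize how far scanner-derived tree metrics fall from reference measurements. The following is a minimal sketch of that calculation, not the authors' processing chain: the function name `rmse` and the DBH values are hypothetical and chosen only for illustration.

```python
import math


def rmse(estimated, reference):
    """Root-mean-square error between paired estimates and reference values."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical DBH values in metres (illustrative only, not data from the study).
dbh_from_point_cloud = [0.21, 0.34, 0.28, 0.40]
dbh_reference = [0.20, 0.36, 0.25, 0.41]

print(f"DBH RMSE: {rmse(dbh_from_point_cloud, dbh_reference):.4f} m")
```

Because the errors are squared before averaging, a few badly estimated trees raise the RMSE more than many small deviations do, which is worth keeping in mind when comparing the height errors of the two scanners.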
Contribution to the knowledge of cultural heritage via a Heritage Information System (HIS): the case of “La Cultura del Agua” in Valverde de Burguillos, Badajoz (Spain)
Modern science is going through a period of important reflection on the role of different agents and multiple disciplines in the management and safeguarding of architectural heritage. This new focus generates a greater amount and diversity of information, so the implementation of a unifying tool in the framework of digital information models would mean a better knowledge of cultural heritage as well as aiding its safeguarding and protection. In addition, it must be taken into account that, for the correct management of information in its broadest dimension, this tool must make it possible to relate alphanumeric data about an item of heritage to its spatial location. In this sense, this article proposes a Heritage Information System (HIS), understood as a digital knowledge tool, that consists of a relational database and a map manager with Geographic Information System (GIS) technology (a geodatabase). The methodology suggested here sets out the steps that make up the HIS, so that the system can be applied to other geographical elements or realities. For this reason, a study was made of “La Cultura del Agua” in Valverde de Burguillos (Spain), a heritage ensemble that consists of rural architecture and dispersed preindustrial elements, which are currently at risk. The HIS seeks to develop a more complete identification of these elements (individually and as a system) and a justified argument for their being given value and great visibility. This new approach encourages sustainable development in terms of efficiency and effectiveness for the analysis, diagnosis, and reactivation of cultural heritage, always placing importance on the balance of social participation with the territory in which the system is applied, and with global societ
Biclustering fMRI time series
Master's thesis, Data Science (Ciência de Dados), Universidade de Lisboa, Faculdade de Ciências, 2020. The effectiveness of biclustering, the simultaneous clustering of both rows and columns in a data matrix, has been shown primarily in gene expression data analysis. Several researchers also recognize its potential in other research areas. Nevertheless, the last two decades have seen many biclustering algorithms targeting gene expression data analysis and a lack of consistent studies exploring the capacities of biclustering outside this traditional application domain. Following hints that suggest potential for biclustering on spatiotemporal data, particularly in neurosciences, this thesis explores biclustering’s capacity to extract knowledge from fMRI time series.
This thesis proposes a methodology to evaluate the feasibility of using biclustering algorithms to study the fMRI signal, considering both synthetic and real-world fMRI datasets. In the absence of ground truth against which to compare bicluster solutions, we used internal evaluation metrics. Results on the use of different search strategies showed the superiority of exhaustive approaches, which obtained the most homogeneous biclusters. However, their high computational cost is still a challenge, and further work is needed for the efficient use of biclustering in fMRI data analysis. We propose a new methodology for analyzing biclusters based on pattern mining algorithms that determine the most frequent patterns present in the generated biclustering solutions. A bicluster is nothing more than a hyperlink in a hypergraph. Extracting frequent patterns in a biclustering solution implies extracting the most significant hyperlinks. In a first approach, this makes it possible to understand relationships between regions of the brain and to draw temporal profiles that traditional correlation studies cannot detect. Additionally, the process of generating biclusters filters out uninteresting links, potentially allowing hypergraphs to be generated efficiently. The final question is what we can do with this knowledge. Knowing the relationship between brain regions is the central objective of neurosciences. Understanding the connections between regions of the brain for various subjects allows one to draw profiles. In this case, we propose a methodology to extrapolate biclusters to three-dimensional data and perform triclustering. Additionally, understanding the link between brain zones allows identifying diseases like schizophrenia, dementia, or Alzheimer’s. This work pinpoints avenues for the use of biclustering in spatiotemporal data analysis, in particular in neuroscience applications.
The proposed evaluation methodology showed evidence of biclustering’s effectiveness in finding local patterns in fMRI data, although further work is needed regarding scalability to promote its application in real scenarios.
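The abstract does not say which internal evaluation metrics were used to judge bicluster homogeneity. As one hedged illustration, the sketch below computes the mean squared residue, a widely used homogeneity score for biclusters in which lower values mean a more coherent submatrix. The choice of metric, the function name, and the toy matrix (rows read as brain regions, columns as time points) are all assumptions made for this example.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster given by row and column indices.

    A value of 0 means every selected row follows the same additive pattern
    across the selected columns (a perfectly coherent bicluster).
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy matrix: hypothetical regions (rows) by time points (columns).
signal = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [5.0, 6.0, 7.0, 4.0],
    [0.0, 8.0, 2.0, 6.0],
]

coherent = mean_squared_residue(signal, rows=[0, 1, 2], cols=[0, 1, 2])
whole = mean_squared_residue(signal, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
print(f"coherent bicluster: {coherent:.6f}, whole matrix: {whole:.6f}")
```

A score of this kind needs no ground truth, which is why internal metrics are the natural choice when a reference bicluster solution is unavailable.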
Monitoring land use changes using geo-information : possibilities, methods and adapted techniques
Monitoring land use with geographical databases is widely used in decision-making. This report presents the possibilities, methods and adapted techniques for using geo-information to monitor land use changes. The municipality of Soest was chosen as the study area, and three national land use databases, viz. Top10Vector, CBS land use statistics and LGN, were used. The restrictions of geo-information for monitoring land use changes are indicated. New methods and adapted techniques improve the monitoring result considerably. Providers of geo-information, however, should coordinate on update frequencies, semantic content and spatial resolution so that land use can be monitored better by combining data sets.
Management of Scientific Images: An approach to the extraction, annotation and retrieval of figures in the field of High Energy Physics
The information environment of the first decade of the 21st century is unprecedented. The physical barriers that have limited access to knowledge are disappearing as traditional methods of accessing information are replaced or improved by computer-based systems. Digital systems can manage much larger collections of documents, confronting information users with an avalanche of documents associated with their topic of interest. This new situation has created an incentive to develop data mining techniques and to build more efficient search engines capable of limiting search results to a small subset of the most relevant ones. However, most search engines today work with textual descriptions. These descriptions can be extracted either from the content or from external sources. Retrieval based on the non-textual content of documents is a subject of ongoing research. In particular, image retrieval and unraveling the information contained in images are attracting great interest in the scientific community. Digital libraries occupy a special position among the systems that facilitate access to knowledge. They act as repositories of documents that share some common characteristics (for example, belonging to the same area of knowledge or being published by the same institution) and as such contain documents considered of interest to a particular group of users. In addition, they provide retrieval functionality over the collections they manage. Normally, scientific publications are the smallest units managed by scientific digital libraries. However, the process of scientific creation involves different types of artifacts, among others figures and datasets.
Figures play a particularly important role in the scientific publication process. They represent data in a graphical form that makes it possible to show patterns across large datasets and to convey complex ideas in an easily understandable way. Existing digital library systems provide access to figures, but only as part of the files into which the entire publication is serialized. The aim of this thesis is to propose a set of methods and techniques that turn figures into first-class products within the scientific publication process, allowing researchers to obtain the maximum benefit when searching and reviewing existing literature. The proposed methods and techniques are aimed at facilitating the acquisition, semantic annotation and search of figures contained in scientific publications. To demonstrate the completeness of the research, the proposed theories have been illustrated with examples from the field of Particle Physics (also known as High Energy Physics). Where more detailed proposals were needed, the work focuses on the figures that appear most frequently in Particle Physics publications: the scientific graphs known as plots. The prototypes developed for this thesis have been partially integrated into the Invenio software (1) for digital libraries, as well as into INSPIRE, one of the largest digital libraries in Particle Physics, maintained through the collaboration of major laboratories and research centers such as CERN, SLAC, DESY and Fermilab. (1) http://invenio-software.org
An Evolutionary Approach to Adaptive Image Analysis for Retrieving and Long-term Monitoring Historical Land Use from Spatiotemporally Heterogeneous Map Sources
Land use changes have become a major contributor to anthropogenic global change. The ongoing dispersion and concentration of the human species, unprecedented in their magnitude, have indisputably altered Earth’s surface and atmosphere. The effects are so salient and irreversible that a new geological epoch, following the interglacial Holocene, has been announced: the Anthropocene. While some scholars date its onset back to the Neolithic revolution, it is commonly placed in the late 18th century. The rapid development since the industrial revolution and its implications gave rise to an increasing awareness of the extensive anthropogenic land change and led to an urgent need for sustainable strategies for land use and land management. By preserving landscape and settlement patterns at discrete points in time, archival geospatial data sources such as remote sensing imagery and, in particular, historical geotopographic maps could give evidence of the dynamic land use change during this crucial period.
In this context, this thesis set out to explore the potential of retrospective geoinformation for monitoring, communicating, modeling and eventually understanding the complex and gradually evolving processes of land cover and land use change. Currently, large numbers of geospatial data sources such as archival maps are being made accessible online worldwide by libraries and national mapping agencies. Despite their abundance and relevance, the use of historical land use and land cover information in research is still often hindered by laborious visual interpretation, which limits the temporal and spatial coverage of studies. Thus, the core of the thesis is dedicated to the computational acquisition of geoinformation from archival map sources by means of digital image analysis. Based on a comprehensive review of the literature as well as of the data and proposed algorithms, two major challenges for long-term retrospective information acquisition and change detection were identified: first, the diversity of geographical entity representations over space and time, and second, the uncertainty inherent to both the data source itself and its utilization for land change detection.
To address the former challenge, image segmentation is treated as a global non-linear optimization problem. The segmentation methods and parameters are adjusted using a metaheuristic, evolutionary approach. To preserve adaptability in high-level image analysis, a hybrid model- and data-driven strategy that combines a knowledge-based classifier and a neural net classifier is recommended. To address the second challenge, a probabilistic object- and field-based change detection approach is developed for modeling the positional, thematic, and temporal uncertainty inherent in both data and processing. Experimental results indicate the suitability of the methodology in support of land change monitoring. In conclusion, potential applications and directions for further research are given.
Discovering ship navigation patterns towards environment impact modeling
Ship positioning and maneuvering information is highly relevant to understanding pollution levels in coastal cities and sea-life quality. It contains latent patterns of vessel behavior that are useful in earth sciences and environmental research.
Using Automatic Identification System (AIS) data enables air quality models to produce finer-grained estimates. However, the raw data carries uncertainty and errors, so a methodology is needed to filter and clean it and to extract patterns. Ship navigation traces can be understood as time series.
Here, we present a methodology for characterizing ships by their navigation traces, using Conditional Restricted Boltzmann Machines (CRBMs) plus classic clustering techniques like k-Means.
From the inputs received from ships via AIS, containing ship positions, speed, and characteristics, we produce a processed cruising trace that a CRBM can encode while preserving the time factor and reducing the dimensionality of the data. This encoding can then be clustered or pattern-mined, and used not only for ship classification but also to cross such behavior patterns with environmental information. In this paper we detail the methodology and validate it using data from the Spanish Ports Authority records of national and international fishing vessels and passenger and cargo ships.
Alongside the pattern mining methodology, we propose how to use Apache Spark for the data cleaning process that precedes the Conditional Restricted Boltzmann Machine (CRBM). Finally, we develop a visualization tool for data exploration and pattern evaluation.
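The abstract notes that raw AIS data carries errors and must be cleaned before encoding, but it does not describe the cleaning rules. The sketch below shows one plausible rule in plain Python rather than Apache Spark: discard position fixes that would require an implausible speed since the last accepted fix. The record layout `(timestamp_s, lat, lon)`, the function names, the 50-knot threshold, and the sample trace are assumptions for illustration and are not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0
KM_PER_NAUTICAL_MILE = 1.852


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def drop_implausible_fixes(trace, max_knots=50.0):
    """Keep fixes reachable from the previous kept fix at <= max_knots.

    trace: list of (timestamp_s, lat, lon) tuples sorted by time.
    The first fix is trusted as-is, which is a simplification.
    """
    kept = []
    for fix in trace:
        if not kept:
            kept.append(fix)
            continue
        t0, lat0, lon0 = kept[-1]
        t1, lat1, lon1 = fix
        hours = (t1 - t0) / 3600.0
        if hours <= 0:
            continue  # duplicate or out-of-order timestamp
        distance_nm = haversine_km(lat0, lon0, lat1, lon1) / KM_PER_NAUTICAL_MILE
        if distance_nm / hours <= max_knots:
            kept.append(fix)
    return kept


# Hypothetical trace with one position jump (third record).
trace = [
    (0, 41.35, 2.17),
    (600, 41.36, 2.19),
    (1200, 43.00, 5.00),
    (1800, 41.38, 2.23),
]
cleaned = drop_implausible_fixes(trace)
print(len(trace), "fixes in,", len(cleaned), "fixes kept")
```

A rule like this is applied per vessel, so in a distributed setting it would run after grouping records by ship identifier and sorting each group by time.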