1,724 research outputs found
Visual Analysis of High-Dimensional Point Clouds using Topological Abstraction
This thesis is about visualizing a kind of data that is trivial to process by computers but difficult to imagine by humans because nature does not allow for intuition with this type of information: high-dimensional data. Such data often result from representing observations of objects under various aspects or with different properties. In many applications, a typical, laborious task is to find related objects or to group those that are similar to each other. One classic solution for this task is to imagine the data as vectors in a Euclidean space with object variables as dimensions. Utilizing Euclidean distance as a measure of similarity, objects with similar properties and values accumulate to groups, so-called clusters, that are exposed by cluster analysis on the high-dimensional point cloud. Because similar vectors can be thought of as objects that are alike in terms of their attributes, the point cloud's structure and individual cluster properties, like their size or compactness, summarize data categories and their relative importance. The contribution of this thesis is a novel analysis approach for visual exploration of high-dimensional point clouds without suffering from structural occlusion. The work is based on implementing two key concepts: The first idea is to discard those geometric properties that cannot be preserved and, thus, lead to the typical artifacts. Topological concepts are used instead to shift away the focus from a point-centered view on the data to a more structure-centered perspective. The advantage is that topology-driven clustering information can be extracted in the data's original domain and be preserved without loss in low dimensions. The second idea is to split the analysis into a topology-based global overview and a subsequent geometric local refinement.
The occlusion-free overview enables the analyst to identify features and to link them to other visualizations that permit analysis of those properties not captured by the topological abstraction, e.g. cluster shape or value distributions in particular dimensions or subspaces. The advantage of separating structure from data point analysis is that restricting local analysis only to data subsets significantly reduces artifacts and the visual complexity of standard techniques. That is, the additional topological layer enables the analyst to identify structure that was hidden before and to focus on particular features by suppressing irrelevant points during local feature analysis. This thesis addresses the topology-based visual analysis of high-dimensional point clouds for both the time-invariant and the time-varying case. Time-invariant means that the points do not change in their number or positions. That is, the analyst explores the clustering of a fixed and constant set of points. The extension to the time-varying case implies the analysis of a varying clustering, where clusters newly appear, merge or split, or vanish. Especially for high-dimensional data, both tracking (that is, relating features over time) and visualizing changing structure are difficult problems to solve.
Spatial ontologies for architectural heritage
Informatics and artificial intelligence have generated new requirements for digital archiving, information, and documentation. Semantic interoperability has become fundamental for the management and sharing of information. Constraints on data interpretation enable both database interoperability, for sharing and reusing data and schemas, and information retrieval in large datasets. Another challenging issue is the exploitation of automated reasoning possibilities. The solution is the use of domain ontologies as a reference for data modelling in information systems. The architectural heritage (AH) domain is considered in this thesis. Documentation in this field, which is particularly complex and multifaceted, is well known to be critical for the preservation, knowledge, and promotion of monuments. For these reasons, digital inventories that also exploit standards and new semantic technologies are being developed by international organisations (Getty Institute, UN, European Union). Geometric and geographic information is an essential part of a monument. It is composed of a number of aspects (spatial, topological, and mereological relations; accuracy; multi-scale representation; time; etc.). Currently, geomatics makes it possible to obtain very accurate and dense 3D models (possibly enriched with textures) and derived products, in both raster and vector format. Many standards have been published for the geographic field or for the cultural heritage domain. However, the former are limited in the representation scales they foresee (the maximum is achieved by OGC CityGML), and their semantic values do not cover the full semantic richness of AH. The latter (especially the core ontology CIDOC CRM, the Conceptual Reference Model of the Documentation Committee of the International Council of Museums) were employed to document museum objects.
Even though CIDOC CRM was recently extended to standing buildings and a spatial extension was included, the integration of complex 3D models has not yet been achieved. In this thesis, the aspects (especially spatial issues) to consider in the documentation of monuments are analysed. In light of these, OGC CityGML is extended to manage the complexity of AH. An approach ‘from the landscape to the detail’ is used in order to consider the monument within a wider system, which is essential for analysis and reasoning about such complex objects. An implementation test is conducted on a case study, with a preference for open source applications.
Detecting sociosemantic communities by applying social network analysis in tweets
Virtual social networks have led to a new way of communication that is different from the oral one, where the restriction of time and space generates new linguistic practices. Twitter, a medium for political and social discussion, can be analyzed to understand new ways of communication and to explore the sociosemiotic aspects that come with the use of hashtags and their relationship with other elements. This paper presents a quantitative study of tweets around a fixed hashtag, in relation to the other content that users connect with it. By calculating the frequency of terms, a table of nodes and edges is created to visualize tweets as graphs. Our study applies social network analysis that, going beyond mere topology, reveals relevant sociosemantic communities, providing insights for the comparison of social and political movements.
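The node-and-edge table mentioned in this abstract could be built roughly as follows. This is a minimal sketch, not the paper's procedure: the function name `build_graph_tables`, the tokenization rule, and the choice to link terms that co-occur in the same tweet are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
import re


def build_graph_tables(tweets, min_count=1):
    """Return (nodes, edges) for a term co-occurrence graph.

    nodes maps each term to the number of tweets containing it;
    edges maps each sorted term pair to the number of tweets
    in which both terms appear (an assumed linking rule).
    """
    node_counts = Counter()
    edge_counts = Counter()
    for text in tweets:
        # Hashtags and plain words, lowercased; each term counts once per tweet.
        terms = sorted(set(re.findall(r"#?\w+", text.lower())))
        node_counts.update(terms)
        edge_counts.update(combinations(terms, 2))
    # Drop rare terms, then keep only edges whose endpoints both survive.
    nodes = {t: c for t, c in node_counts.items() if c >= min_count}
    edges = {
        pair: c
        for pair, c in edge_counts.items()
        if pair[0] in nodes and pair[1] in nodes
    }
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_graph_tables(["#vote now", "#vote today now"])
    print(nodes)
    print(edges)
```

The two dictionaries can be exported as node and edge tables for a graph tool, where community detection would then be applied.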
Aggregation-based information retrieval system for geospatial data catalogs
Geospatial data catalogs enable users to discover and access geographical information. Prevailing solutions are document oriented and fragment the spatial continuum of the geospatial data into independent and disconnected resources described through metadata. Due to this, the complete answer for a query may be scattered across multiple resources, making its discovery and access more difficult. This paper proposes an improved information retrieval process for geospatial data catalogs that aggregates the search results by identifying the implicit spatial/thematic relations between the metadata records of the resources. These aggregations are constructed in such a way that they match better the user query than each resource individually
Toxicity in Evolving Twitter Topics - Employing a novel Dynamic Topic Evolution Model (DyTEM) on Twitter data
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science.
This thesis presents an extensive investigation into the evolution of topics and their association with
speech toxicity on Twitter, based on a large corpus of tweets, providing crucial insights for monitoring
online discourse and potentially informing interventions to combat toxic behavior in digital
communities. A Dynamic Topic Evolution Model (DyTEM) is introduced, constructed by combining
static Topic Modelling techniques and sentence embeddings through the state-of-the-art sentence
transformer, sBERT. The DyTEM, tested and validated on a substantial sample of tweets, is represented
as a directed graph, encapsulating the inherent dynamism of Twitter discussions. For validating the
consistency of DyTEM and providing guidance for hyperparameter selection, a novel, hashtag-based
validation method is proposed. The analysis identifies and scrutinizes five distinct Topic Transition
Types: Topic Stagnation, Topic Merge, Topic Split, Topic Disappearance, and Topic Emergence. A
speech toxicity classification model is employed to delve into the toxicity dynamics within topic
evolution. A standout finding of this study is the positive correlation between topic popularity and its
toxicity, implying that trending or viral topics tend to contain more inflammatory speech. This insight,
along with the methodologies introduced in this study, contributes significantly to the broader
understanding of digital discourse dynamics and could guide future strategies aimed at fostering
healthier and more constructive online spaces
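The five transition types named in this abstract can be read off a directed graph that links topics in one time slice to topics in the next. The sketch below is an illustration under stated assumptions, not the DyTEM implementation: the function `classify_transitions`, the degree-based rules, and the requirement that topic identifiers be unique across slices are all hypothetical.

```python
from collections import defaultdict


def classify_transitions(prev_topics, next_topics, edges):
    """Label topics by transition type using in- and out-degree.

    edges is a list of (earlier_topic, later_topic) pairs.
    Assumed rules: no successor means disappearance, several
    successors mean split, a shared successor means merge, a
    one-to-one link means stagnation, and a later topic with
    no predecessor means emergence.
    """
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for src, dst in edges:
        successors[src].add(dst)
        predecessors[dst].add(src)

    labels = {}
    for topic in prev_topics:
        succ = successors[topic]
        if not succ:
            labels[topic] = "disappearance"
        elif len(succ) > 1:
            labels[topic] = "split"
        elif len(predecessors[next(iter(succ))]) > 1:
            labels[topic] = "merge"
        else:
            labels[topic] = "stagnation"
    for topic in next_topics:
        if not predecessors[topic]:
            labels[topic] = "emergence"
    return labels


if __name__ == "__main__":
    links = [("a", "x"), ("b", "y"), ("b", "z"), ("c", "w"), ("d", "w")]
    print(classify_transitions("abcde", "xyzwv", links))
```

In this simplified rule set a topic that both splits and feeds a merge is reported as a split; a fuller treatment would label edges instead of nodes.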
Culture boundaries in semantic web
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Culture, created by each and every one of us, is the form in which society expresses itself. We
use the term freely in everyday life, but defining culture provokes much discussion among
scientists. The most common understanding of culture comes from anthropologists
(Harris & Johnson, 2006; Tylor, 1871), who associate culture with the complex, commonly developed
pattern of social life expressed through knowledge, beliefs, art, morality, laws, traditions and
other features. For extinct cultures, all of this can be found and interpreted only from
archaeological artefacts. Despite the many definitions of culture, its spatio-temporal aspect is
addressed mostly by archaeologists. All in all, the understanding of culture and of cultural areas
remains very fuzzy, although a culture area is always formalized as a crisp one. Because of this
fuzziness, the author surmises, there was no hurry to digitize cultural areas or boundaries, as
happened with other cultural data in Europe within the last decades. The question of cultural
boundaries has also remained 'taboo' in the semantic web, which has recently been developing for
cultural data in order to help represent meaning in a restricted sense. This thesis therefore
analyzes the representation of culture boundaries in the semantic web.
Supporting Methodology Transfer in Visualization Research with Literature-Based Discovery and Visual Text Analytics
The growing specialization of science is driving the rapid fragmentation
of well-established disciplines into interdisciplinary communities. This
decomposition can be observed in a type of visualization research known
as problem-driven visualization research. In it, teams of experts in
visualization and in a particular domain collaborate in a specific area
of knowledge, such as the digital humanities, bioinformatics, computer
security, or sports science. This thesis proposes a series of methods
inspired by recent advances in automatic text analysis and knowledge
representation to promote adequate communication and knowledge transfer
between these communities. The resulting methods were combined in a
visual text analytics interface oriented toward scientific discovery,
GlassViz, which was designed with these goals in mind. The tool was first
tested in the digital humanities domain to explore a massive corpus of
general-purpose visualization articles. GlassViz was adapted in a later
study to support different data sources representative of these
communities, showing evidence that the proposed approach is also a valid
alternative for addressing the problem of fragmentation in visualization
research.
- …