Geographical information retrieval with ontologies of place
Geographical context is required by many information retrieval tasks in which the target of the search may be documents, images or records that are referenced to geographical space only by means of place names. Often there may be an imprecise match between the query name and the names associated with candidate sources of information. There is therefore a need for geographical information retrieval facilities that can rank the relevance of candidate information with respect to geographical closeness of place as well as semantic closeness with respect to the information of interest. Here we present an ontology of place that combines limited coordinate data with semantic and qualitative spatial relationships between places. This parsimonious model of geographical place supports maintenance of knowledge of place names that relate to extensive regions of the Earth at multiple levels of granularity. The ontology has been implemented with a semantic modelling system linking non-spatial conceptual hierarchies with the place ontology. A hierarchical spatial distance measure is combined with Euclidean distance between place centroids to create a hybrid spatial distance measure. This is integrated with thematic distance, based on classification semantics, to create an integrated semantic closeness measure that can be used for a relevance ranking of retrieved objects.
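The hybrid distance described above can be sketched as follows. The weighting parameter, the encoding of each place as a path through the place hierarchy plus a centroid, and the example places are illustrative assumptions, not the paper's actual formulation.

```python
import math

def hierarchical_distance(path_a, path_b):
    """Edges separating two places in the part-of hierarchy,
    e.g. ["Earth", "Europe", "UK", "Cardiff"]."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)

def hybrid_spatial_distance(place_a, place_b, alpha=0.5):
    """Weighted mix of hierarchical distance and Euclidean distance
    between place centroids (alpha is an illustrative weight)."""
    d_hier = hierarchical_distance(place_a["path"], place_b["path"])
    d_eucl = math.dist(place_a["centroid"], place_b["centroid"])
    return alpha * d_hier + (1 - alpha) * d_eucl

cardiff = {"path": ["Earth", "Europe", "UK", "Cardiff"],
           "centroid": (-3.18, 51.48)}
swansea = {"path": ["Earth", "Europe", "UK", "Swansea"],
           "centroid": (-3.94, 51.62)}
d = hybrid_spatial_distance(cardiff, swansea)
```

In the same spirit, the integrated closeness measure would mix this hybrid spatial distance with a thematic distance derived from the classification hierarchy.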
Open City Data Pipeline
Statistical data about cities, regions and countries is collected for various purposes and from various institutions. Yet, while access to high-quality and recent such data is crucial both for decision makers and for the public, all too often such collections of data remain isolated and not re-usable, let alone properly integrated. In this paper we present the Open City Data Pipeline, a focused attempt to collect, integrate, and enrich statistical data collected at city level worldwide, and republish this data in a reusable manner as Linked Data. The main features of the Open City Data Pipeline are: (i) we integrate and cleanse data from several sources in a modular, extensible, always up-to-date fashion; (ii) we use both Machine Learning techniques and ontological reasoning over equational background knowledge to enrich the data by imputing missing values; (iii) we assess the estimated accuracy of such imputations per indicator. Additionally, (iv) we make the integrated and enriched data available both in a web browser interface and as machine-readable Linked Data, using standard vocabularies such as QB and PROV, and linking to e.g. DBpedia.
Lastly, in an exhaustive evaluation of our approach, we compare our enrichment and cleansing techniques to a preliminary version of the Open City Data Pipeline presented at ISWC2015: firstly, we demonstrate that the combination of equational knowledge and standard machine learning techniques significantly helps to improve the quality of our missing value imputations; secondly, we arguably show that the more data we integrate, the more reliable our predictions become. Hence, over time, the Open City Data Pipeline shall provide a sustainable effort to serve Linked Data about cities in increasing quality.
Series: Working Papers on Information Systems, Information Business and Operation
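The interplay of equational background knowledge and statistical imputation sketched in (ii) can be illustrated on toy data. The indicator names, the toy values and the regression fallback noted in the comments are assumptions for illustration, not the pipeline's actual code.

```python
import numpy as np

# Toy city indicators with missing values (np.nan); the background
# equation is: density = population / area_km2.
population = np.array([500_000.0, 1_200_000.0, np.nan, 300_000.0])
area_km2   = np.array([200.0, 400.0, 150.0, np.nan])
density    = np.array([2_500.0, 3_000.0, 2_000.0, 1_500.0])

# Step 1: equational reasoning fills values that the equation
# determines exactly, before any statistical model runs.
population = np.where(np.isnan(population), density * area_km2, population)
area_km2   = np.where(np.isnan(area_km2), population / density, area_km2)

# Step 2 (not shown): values still missing after the equational pass
# would be imputed with a trained regression model, with the estimated
# per-indicator accuracy of that model reported alongside.
```

The point of ordering the steps this way is that exact derivations never degrade with model error; only genuinely underdetermined cells fall through to learned imputation.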
Seeking ‘the New Normal’? Troubled spaces of encountering visible differences in Warsaw
In times of globalisation and super-mobility, ideas of normality are in turmoil. In different societies in, across and beyond Europe, we face the challenge of undoing specific notions of normality and creating more inclusive societies with an open culture of learning to live with differences. The scope of the paper is to introduce some findings on encounters with difference and negotiations of social values in relation to a growing visibility of difference after 1989 in Poland, against the background of a critique of normality/normalisation and normalcy. On the basis of interviews conducted in Warsaw, we investigate how normality/normalisation discourses of visible homosexuality and physical disability are incorporated into individual self-reflections and justifications of prejudices (homophobia and disablism). More specifically, we argue that there are moments of ‘cultural transgressions’ present in everyday practices towards ‘visible’ sexual and (dis)ability difference.
An integrated approach to deliver OLAP for multidimensional Semantic Web Databases
The Semantic Web (SW) and web data have become increasingly important sources to support Business Intelligence (BI), but they are difficult to manage due to the exponential increase in their volumes, inconsistency in semantics and complexity in representations. On-Line Analytical Processing (OLAP) is an important tool for analysing large and complex BI data, but it lacks the capability to process dispersed SW data due to the nature of its design. A new concept with a richer vocabulary than the existing ones for OLAP is needed to model distributed multidimensional semantic web databases.
A new OLAP framework is developed, with multiple layers including additional vocabulary, extended OLAP operators, and usage of SPARQL to model heterogeneous semantic web data, unify multidimensional structures, and provide new enabling functions for interoperability. The framework is presented with examples to demonstrate its capability to unify existing vocabularies with additional vocabulary elements to handle both informational and topological data in Graph OLAP. The vocabularies used in this work are: the RDF Cube Vocabulary (QB) – proposed by the W3C to allow multi-dimensional, mostly statistical, data to be published in RDF; and the QB4OLAP – a QB extension introducing standard OLAP operators. The framework enables the composition of multiple databases (e.g. energy consumptions and property market values etc.) to generate observations through semantic pipe-like operators.
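As a minimal illustration of an OLAP-style roll-up over QB observations, the pure-Python sketch below aggregates a handful of toy triples. The predicate names and values are invented; a real deployment would hold the data in an RDF store and express the operation in SPARQL (the equivalent query is shown in a comment).

```python
# Toy triple store with two QB-style observations.
triples = [
    ("ex:obs1", "rdf:type", "qb:Observation"),
    ("ex:obs1", "ex:city", "ex:Vienna"),
    ("ex:obs1", "ex:energyUse", 42.0),
    ("ex:obs2", "rdf:type", "qb:Observation"),
    ("ex:obs2", "ex:city", "ex:London"),
    ("ex:obs2", "ex:energyUse", 55.5),
]

def objects(subject, predicate):
    """All objects of triples matching (subject, predicate, ?o)."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]

# Roll-up: SUM of ex:energyUse over every qb:Observation.
observations = [s for (s, p, o) in triples
                if p == "rdf:type" and o == "qb:Observation"]
total = sum(objects(s, "ex:energyUse")[0] for s in observations)

# Equivalent SPARQL aggregate:
#   SELECT (SUM(?v) AS ?total)
#   WHERE { ?o a qb:Observation ; ex:energyUse ?v . }
```

The framework's pipe-like operators can be thought of as composing such aggregations across multiple datasets (e.g. energy consumption joined with property values) before the final observation set is produced.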
This approach is demonstrated through Use Cases containing highly valuable data collected from a real-life environment. Its usability is proved through the development and usage of semantic pipe-like operators able to deliver OLAP specific functionalities.
To the best of my knowledge, there is no available data modelling approach handling both informational and topological Semantic Web data which is designed either to provide OLAP capabilities over Semantic Web databases or to provide a means to connect such databases for further OLAP analysis.
The thesis proposes that the presented work provides a wider understanding of: ways to access Semantic Web data; ways to build specialised Semantic Web databases; and how to enrich them with powerful capabilities for further Business Intelligence.
Discovering new kinds of patient safety incidents
Every year, large numbers of patients in National Health Service (NHS) care suffer because
of a patient safety incident. The National Patient Safety Agency (NPSA) collects large
amounts of data describing individual incidents. As well as being described by categorical
and numerical variables, each incident is described using free text.
The aim of the work was to find quite small groups of similar incidents, which were of
types that were previously unknown to the NPSA. A model of the text was produced, such
that the position of each incident reflected its meaning to the greatest extent possible.
The basic model was the vector space model. Dimensionality reduction was carried
out in two stages: unsupervised dimensionality reduction was carried out using principal
component analysis, and supervised dimensionality reduction using linear discriminant
analysis. It was then possible to look for groups of incidents that were more tightly packed
than would be expected given the overall distribution of the incidents.
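The unsupervised stage above can be sketched with a plain SVD-based principal component analysis on toy document vectors. The dimensions and random data are invented stand-ins for the incident term vectors; the supervised linear discriminant analysis stage that follows it is only indicated in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for term-frequency vectors: 100 incidents x 50 terms.
X = rng.normal(size=(100, 50))

# Unsupervised step: PCA via SVD of the mean-centred matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 10
X_pca = Xc @ Vt[:k].T   # incidents projected onto the top-k components

# A supervised LDA step would then project X_pca onto directions that
# best separate the known incident categories, before searching the
# resulting space for unexpectedly tight groups.
```

Running the search for dense groups in the reduced space, rather than the raw term space, is what makes the "more tightly packed than expected" test tractable.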
The process for assessing these groups had three stages. Firstly, a quantitative measure
was used, allowing a large number of parameter combinations to be examined. The groups
found for an ‘optimum’ parameter combination were then divided into categories using a
qualitative filtering method. Finally, clinical experts assessed the groups qualitatively.
The transition probabilities model was also examined: this model was based on the empirical probabilities with which two-word sequences were seen in the text.
An alternative method for dimensionality reduction was to use information about the subjective meaning of a small sample of incidents elicited from experts, producing a mapping between high- and low-dimensional models of the text.
The analysis also included the direct use of the categorical variables to model the incidents,
and empirical analysis of the behaviour of high-dimensional spaces.
Changes in protein levels as markers of severe disease: an investigation of severe malaria
Compounds directly involved in the pathogenesis of cerebral malaria (CM) remain unclear due to a lack of robust methods for identifying and quantifying proteins expressed in low abundance. New developments in proteomics have now made it possible to identify low-abundance proteins and have provided new tools for studying host-parasite interactions. With these new tools, it may be possible to identify proteomic signatures for patients with various complications associated with severe malaria.
A global proteomic strategy was used to identify differentially expressed proteins in archived plasma and CSF drawn from children diagnosed with cerebral malaria (CM) compared to those with acute bacterial meningitis (ABM) and slide-negative encephalopathy (EN). Samples were first separated using two-dimensional gel electrophoresis (2-DE) or two-dimensional liquid chromatography (2D-LC) and analysed using mass spectrometry. The data collected were analysed using various bioinformatics tools. Finally, a CM mass profile was created using MALDI-ToF mass spectrometry.
An average of about 150 spots per gel was resolved from CSF from CM and EN patients, and 80 spots from ABM patients. In the gels from the CM and EN groups, 45 human proteins were found, whilst 20 human proteins were unique to ABM compared to CM. For CSF, a total of 202 human proteins were identified using the 2D-LC system. Of these, 13 were unique to CM, 124 to ABM and 32 to EN. Six proteins were found in both CM and ABM, and 18 were found in both EN and ABM. Nine proteins were common to all 3 disease groups. A total of 66 P. falciparum proteins were identified, but of these 48 were hypothetical proteins. Of the non-hypothetical proteins, 2 were found in both CM and ABM and the rest were found only in ABM.
The results show that proteomics can be used to create protein profiles of different disease groups. The majority of the human proteins identified by 2-DE were high-abundance proteins found in CSF and plasma. The use of 2D-LC enabled the identification of more low-abundance proteins, but some of the P. falciparum proteins identified by 2-DE were not seen with the 2D-LC method. The majority of the human proteins found were acute phase response plasma proteins, including common circulating proteins such as albumin and apolipoproteins, blood transporters and binding proteins, protease inhibitors, enzymes, cytokines and hormones, and channel- and receptor-derived proteins. There appears to be a correlation between the number of proteins found in the CSF and the level of blood–brain barrier breakdown.
Enhancing Deep Learning Models through Tensorization: A Comprehensive Survey and Framework
The burgeoning growth of public domain data and the increasing complexity of
deep learning model architectures have underscored the need for more efficient
data representation and analysis techniques. This paper is motivated by the
work of Helal (2023) and aims to present a comprehensive overview of
tensorization. This transformative approach bridges the gap between the
inherently multidimensional nature of data and the simplified 2-dimensional
matrices commonly used in linear algebra-based machine learning algorithms.
This paper explores the steps involved in tensorization, multidimensional data
sources, various multiway analysis methods employed, and the benefits of these
approaches. A small example of Blind Source Separation (BSS) is presented
comparing 2-dimensional algorithms and a multiway algorithm in Python. Results
indicate that multiway analysis is more expressive. Contrary to the intuition of the curse of dimensionality, utilising multidimensional datasets in their native form and applying multiway analysis methods grounded in multilinear algebra reveal a profound capacity to capture intricate interrelationships among various dimensions while, surprisingly, reducing the number of model parameters and accelerating processing. A survey of the multiway analysis methods and their integration with various Deep Neural Network models is presented using case studies in different application domains.
Comment: 34 pages, 8 figures, 4 tables
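The parameter-count argument above can be made concrete with a tiny NumPy sketch, using invented dimensions: a 3-way tensor is matricised for a classical 2-D algorithm, while a rank-R CP factorisation keeps the multiway structure with far fewer numbers to store.

```python
import numpy as np

# A 3-way data tensor, e.g. sensors x time x trials (toy dimensions).
I, J, K = 8, 10, 6
T = np.arange(I * J * K, dtype=float).reshape(I, J, K)

# Mode-1 unfolding: the "flattened" 2-D matrix a classical linear-
# algebra algorithm would consume (one row per sensor).
T1 = T.reshape(I, J * K)

# Parameter counts: a rank-R CP model stores only three factor
# matrices of sizes I x R, J x R and K x R, versus the dense
# entries of the unfolded matrix.
R = 3
cp_params = R * (I + J + K)   # parameters in the multiway model
dense_params = I * J * K      # entries in the flattened view
```

Even at these toy sizes the CP representation is far smaller than the dense data, and the gap widens rapidly with the number of modes, which is the source of the parameter reduction and speed-up the survey reports.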