Biodiversity studies in the Ningaloo Reef lagoon
As part of the CSIRO Wealth from Oceans Flagship’s Ningaloo Collaboration Cluster program currently underway in Western Australia, this study aims to examine the habitats and biodiversity of lagoonal areas within Ningaloo Reef. Key habitat types were identified using information from hyperspectral remote sensing and were used to develop a stratified sampling approach. Two focal areas were selected, based on sanctuary zones within Ningaloo Marine Park: Osprey Bay in the north and Coral Bay in the central section; an additional site has recently been added at Gnaraloo in the south. A nested sampling programme was initiated within each location, consisting of surveying transects at different spatial scales: cross-reef transects (shore to back-reef) to identify major habitat types and boundaries between habitats; and finer-scale habitat surveys of biodiversity and abundance of different major groups of organisms, focussing on non-scleractinian cnidarians, macroalgae, sponges, echinoderms and molluscs. Three geomorphological categories have been sampled at each location: back-reef, lagoon and inner reef flat. Ground-truthing was carried out on the extent of habitats along defined transects selected to maximize the diversity of each site. A nested quadrat sampling regime was used to validate remotely-sensed data with field-collected data.
Storing and Querying Probabilistic XML Using a Probabilistic Relational DBMS
This work explores the feasibility of storing and querying probabilistic XML in a probabilistic relational database. Our approach is to adapt known techniques for mapping XML to relational data such that the possible worlds are preserved. We show that this approach can work for any XML-to-relational technique by adapting a representative schema-based technique (inlining) as well as a representative schemaless technique (XPath Accelerator). We investigate the maturity of probabilistic relational databases for this task through experiments with one of the state-of-the-art systems, called Trio.
Qualitative Effects of Knowledge Rules in Probabilistic Data Integration
One of the problems in data integration is data overlap: the fact that different data sources have data on the same real world entities. Much development time in data integration projects is devoted to entity resolution. Often advanced similarity measurement techniques are used to remove semantic duplicates from the integration result or solve other semantic conflicts, but it proves impossible to get rid of all semantic problems in data integration. An often-used rule of thumb states that about 90% of the development effort is devoted to solving the remaining 10% hard cases. In an attempt to significantly decrease human effort at data integration time, we have proposed an approach that stores any remaining semantic uncertainty and conflicts in a probabilistic database, so that the integration result can already be used meaningfully. The main development effort in our approach is devoted to defining and tuning knowledge rules and thresholds. Rules and thresholds directly impact the size and quality of the integration result. We measure integration quality indirectly by measuring the quality of answers to queries on the integrated data set in an information retrieval-like way. The main contribution of this report is an experimental investigation of the effects and sensitivity of rule definition and threshold tuning on the integration quality. The investigation indicates that our approach indeed reduces development effort, rather than merely shifting the effort to rule definition and threshold tuning, by showing that setting rough safe thresholds and defining only a few rules suffices to produce a ‘good enough’ integration that can be meaningfully used.
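The abstract above does not spell out how answer quality is measured "in an information retrieval-like way". As a minimal sketch only, one plausible reading weights each query answer by its probability and computes precision and recall from those weights. The function name, the answer labels and the probability-weighting itself are assumptions made for illustration, not the measure defined in the report.

```python
from typing import Dict, Set, Tuple


def expected_precision_recall(
    answers: Dict[str, float], truth: Set[str]
) -> Tuple[float, float]:
    """Probability-weighted precision and recall (illustrative assumption).

    answers maps each returned answer to the probability that it holds;
    truth is the set of answers that are correct in the real world.
    """
    total_mass = sum(answers.values())
    correct_mass = sum(p for a, p in answers.items() if a in truth)
    precision = correct_mass / total_mass if total_mass > 0 else 0.0
    recall = correct_mass / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical query result over an integrated, uncertain data set.
example_answers = {"a": 0.9, "b": 0.5, "c": 0.1}
example_truth = {"a", "b"}
example_precision, example_recall = expected_precision_recall(
    example_answers, example_truth
)
```

Under this reading, a stricter threshold that discards low-probability duplicates would raise precision at the possible cost of recall, which is the kind of sensitivity the experiments investigate.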
Quality Measures in Uncertain Data Management
Many applications deal with data that is uncertain. Some examples are applications dealing with sensor information, data integration applications and healthcare applications. Instead of these applications having to deal with the uncertainty, it should be the responsibility of the DBMS to manage all data including uncertain data. Several projects do research on this topic. In this paper, we introduce four measures to be used to assess and compare important characteristics of data and systems
User Feedback in Probabilistic XML
Data integration is a challenging problem in many application areas. Approaches mostly attempt to resolve semantic uncertainty and conflicts between information sources as part of the data integration process. In some application areas, this is impractical or even prohibitive, for example, in an ambient environment where devices on an ad hoc basis have to exchange information autonomously. We have proposed a probabilistic XML approach that allows data integration without user involvement by storing semantic uncertainty and conflicts in the integrated XML data. As a consequence, the integrated information source represents all possible appearances of objects in the real world, the so-called possible worlds. In this paper, we show how user feedback on query results can resolve semantic uncertainty and conflicts in the integrated data. Hence, user involvement is effectively postponed to query time, when a user is already interacting actively with the system. The technique relates positive and negative statements on query answers to the possible worlds of the information source, thereby either reinforcing, penalizing, or eliminating possible worlds. We show that after repeated user feedback, an integrated information source better resembles the real world and may converge towards a non-probabilistic information source.
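The effect of feedback on possible worlds can be illustrated with a small sketch. It assumes, purely for illustration, that the possible worlds are enumerated explicitly as sets of query answers with a probability each; the paper works on probabilistic XML, and the names `apply_feedback` and `penalty` as well as the example data are hypothetical.

```python
from typing import Dict, FrozenSet

World = FrozenSet[str]


def apply_feedback(
    worlds: Dict[World, float],
    answer: str,
    positive: bool,
    penalty: float = 0.0,
) -> Dict[World, float]:
    """Re-weight possible worlds after one user statement on a query answer.

    A world is consistent with positive feedback if it contains the answer,
    and with negative feedback if it does not. Inconsistent worlds are
    multiplied by `penalty`: 0.0 eliminates them, a value between 0 and 1
    only penalizes them. Renormalizing reinforces the consistent worlds.
    """
    updated: Dict[World, float] = {}
    for world, prob in worlds.items():
        consistent = (answer in world) == positive
        weight = prob * (1.0 if consistent else penalty)
        if weight > 0:
            updated[world] = weight
    total = sum(updated.values())
    if total == 0:
        raise ValueError("feedback contradicts every possible world")
    return {world: weight / total for world, weight in updated.items()}


# Hypothetical integration result: two sources disagree on a phone number.
worlds = {
    frozenset({"John:1111"}): 0.5,
    frozenset({"John:2222"}): 0.3,
    frozenset({"John:1111", "John:2222"}): 0.2,
}
after_positive = apply_feedback(worlds, "John:1111", positive=True)
after_negative = apply_feedback(after_positive, "John:2222", positive=False)
```

In this toy case, two statements leave a single world with probability 1, which mirrors the convergence towards a non-probabilistic source mentioned in the abstract. Explicit enumeration does not scale; it is used here only to keep the sketch short.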
An architecture and methodology for the design and development of Technical Information Systems
In order to meet demands in the context of Technical Information Systems (TIS) pertaining to reliability, extensibility, maintainability, etc., we have developed an architectural framework with accompanying methodological guidelines for designing such systems. With the framework, we aim at complex multi-application information systems using a repository to share data among applications. The framework proposes to keep a strict separation between Man-Machine-Interface and Model data, and provides design and implementation support to do this effectively. The framework and methodological guidelines have been developed in the context of the ESPRIT project IMPRESS. The project also provided for “testing grounds” in the form of a TIS for the Spanish electricity company Iberdrola. This work has been conducted within the ESPRIT project IMPRESS (Integrated, Multi-Paradigm, Reliable and Extensible Storage System), ESPRIT No. 635
The Changing role of agriculture in Dutch society
Dutch agriculture has undergone significant changes in the past century, similar to many countries in the European Union. Due to economies of scale and in order to remain economically profitable, it became necessary for farmers to increase farm size, efficiency and external inputs, while minimizing labour use per hectare. The latter has resulted in fewer people working in the agricultural sector. Consequently, Dutch society gradually lost its connection to agricultural production. This divergence resulted in a poor image for the agricultural sector, because of environmental pollution, homogenization of the landscape, outbreaks of contagious animal diseases and reduced animal welfare. Although the general attitude towards agriculture seems to have improved slightly in recent years, there is still a long way to go in regaining this trust. In order to keep the Dutch countryside viable, farmers are considered indispensable. However, their methods of production should match the demands of society in terms of sustainability. This applies both to farming systems that are used in a monofunctional way (production only) and to multifunctional farming systems. For researchers involved in development of these farming systems, this requires new capabilities; contrary to the situation in the past, citizens and stakeholder groups now demand involvement in the design of farming systems. In the current paper, it is suggested that, besides traditional mainstream agriculture, other alternative farming systems should be developed and implemented. Hence, Dutch agricultural research should remain focused on the cutting edge of economy and society. Despite all efforts, not all of these newly developed systems will acquire a position within the agricultural spectrum. However, some of the successful ones may prove extremely valuable
A generic open world named entity disambiguation approach for tweets
Social media is a rich source of information. To make use of this information, it is sometimes required to extract and disambiguate named entities. In this paper we focus on named entity disambiguation (NED) in Twitter messages. NED in tweets is challenging in two ways. First, the limited length of a tweet makes it hard to obtain enough context, while many disambiguation techniques depend on it. Second, many named entities in tweets do not exist in a knowledge base (KB). In this paper we combine ideas from information retrieval (IR) and NED to propose solutions for both challenges. For the first problem, we make use of the gregarious nature of tweets to get the context needed for disambiguation. For the second problem, we look for an alternative home page if no Wikipedia page represents the entity. Given a mention, we obtain a list of Wikipedia candidates from the YAGO KB in addition to top-ranked pages from the Google search engine. We use a Support Vector Machine (SVM) to rank the candidate pages to find the best representative entities. Experiments conducted on two data sets show better disambiguation results compared with the baselines and a competitor.