Identifying the time profile of everyday activities in the home using smart meter data
Activities are a descriptive term for the common ways households spend their time. Examples include cooking, doing laundry, or socialising. Smart meter data can be used to generate time profiles of activities that are meaningful to households' own lived experience. Activities are therefore a lens through which energy feedback to households can be made salient and understandable. This paper demonstrates a multi-step methodology for inferring hourly time profiles of ten household activities using smart meter data, supplemented by individual appliance plug monitors and environmental sensors. First, household interviews, video ethnography, and technology surveys are used to identify appliances and devices in the home, and their roles in specific activities. Second, 'ontologies' are developed to map out the relationships between activities and technologies in the home. One or more technologies may indicate the occurrence of certain activities. Third, data from smart meters, plug monitors and sensors are collected. Smart meter data measuring aggregate electricity use are disaggregated and processed together with the plug monitor and sensor data to identify when and for how long different activities are occurring. Sensor data are particularly useful for activities that are not always associated with an energy-using device. Fourth, the ontologies are applied to the disaggregated data to make inferences on hourly time profiles of ten everyday activities. These include washing, doing laundry, watching TV (reliably inferred), and cleaning, socialising, working (inferred with uncertainties). Fifth, activity time diaries and structured interviews are used to validate both the ontologies and the inferred activity time profiles. Two case study homes are used to illustrate the methodology using data collected as part of a UK trial of smart home technologies.
The methodology is demonstrated to produce reliable time profiles of a range of domestic activities that are meaningful to households. The methodology also emphasises the value of integrating coded interview and video ethnography data into both the development of the activity inference process
Enhanced Integrated Scoring for Cleaning Dirty Texts
An increasing number of approaches for ontology engineering from text are
gearing towards the use of online sources such as company intranets and the
World Wide Web. Despite this rise, little work can be found on
preprocessing and cleaning dirty texts from online sources. This paper presents
an enhancement of an Integrated Scoring for Spelling error correction,
Abbreviation expansion and Case restoration (ISSAC). ISSAC is implemented as
part of a text preprocessing phase in an ontology engineering system. New
evaluations performed on the enhanced ISSAC using 700 chat records reveal an
improved accuracy of 98% as compared to 96.5% and 71% based on the use of only
basic ISSAC and of Aspell, respectively.
Comment: More information is available at
http://explorer.csse.uwa.edu.au/reference
Efficient Discovery of Ontology Functional Dependencies
Poor data quality has become a pervasive issue due to the increasing
complexity and size of modern datasets. Constraint based data cleaning
techniques rely on integrity constraints as a benchmark to identify and correct
errors. Data values that do not satisfy the given set of constraints are
flagged as dirty, and data updates are made to re-align the data and the
constraints. However, many errors often require user input to resolve due to
domain expertise defining specific terminology and relationships. For example,
in pharmaceuticals, 'Advil' is-a brand name for 'ibuprofen' that can be
captured in a pharmaceutical ontology. While functional dependencies (FDs) have
traditionally been used in existing data cleaning solutions to model syntactic
equivalence, they are not able to model broader relationships (e.g., is-a)
defined by an ontology. In this paper, we take a first step towards extending
the set of data quality constraints used in data cleaning by defining and
discovering Ontology Functional Dependencies (OFDs). We lay out
theoretical and practical foundations for OFDs, including a set of sound and
complete axioms, and a linear inference procedure. We then develop effective
algorithms for discovering OFDs, and a set of optimizations that efficiently
prune the search space. Our experimental evaluation using real data shows the
scalability and accuracy of our algorithms.
Comment: 12 pages
Bridging the Semantic Gap in Multimedia Information Retrieval: Top-down and Bottom-up approaches
Semantic representation of multimedia information is vital for enabling the kind of multimedia search capabilities that professional searchers require. Manual annotation is often not possible because of the sheer scale of the multimedia information that needs indexing. This paper explores the ways in which we are using both top-down, ontologically driven approaches and bottom-up, automatic-annotation approaches to provide retrieval facilities to users. We also discuss many of the current techniques that we are investigating to combine these top-down and bottom-up approaches.
Automatic annotation of bioinformatics workflows with biomedical ontologies
Legacy scientific workflows, and the services within them, often present
scarce and unstructured (i.e. textual) descriptions. This makes it difficult to
find, share and reuse them, thus dramatically reducing their value to the
community. This paper presents an approach to annotating workflows and their
subcomponents with ontology terms, in an attempt to describe these artifacts in
a structured way. Despite a dearth of even textual descriptions, we
automatically annotated 530 myExperiment bioinformatics-related workflows,
including more than 2600 workflow-associated services, with relevant
ontological terms. Quantitative evaluation of the Information Content of these
terms suggests that, in cases where annotation was possible at all, the
annotation quality was comparable to manually curated bioinformatics resources.
Comment: 6th International Symposium on Leveraging Applications (ISoLA 2014
conference), 15 pages, 4 figures