5,235 research outputs found
An Expressive Language and Efficient Execution System for Software Agents
Software agents can be used to automate many of the tedious, time-consuming
information processing tasks that humans currently have to complete manually.
However, to do so, agent plans must be capable of representing the myriad of
actions and control flows required to perform those tasks. In addition, since
these tasks can require integrating multiple sources of remote information ?
typically, a slow, I/O-bound process ? it is desirable to make execution as
efficient as possible. To address both of these needs, we present a flexible
software agent plan language and a highly parallel execution system that enable
the efficient execution of expressive agent plans. The plan language allows
complex tasks to be more easily expressed by providing a variety of operators
for flexibly processing the data as well as supporting subplans (for
modularity) and recursion (for indeterminate looping). The executor is based on
a streaming dataflow model of execution to maximize the amount of operator and
data parallelism possible at runtime. We have implemented both the language and
executor in a system called THESEUS. Our results from testing THESEUS show that
streaming dataflow execution can yield significant speedups over both
traditional serial (von Neumann) as well as non-streaming dataflow-style
execution that existing software and robot agent execution systems currently
support. In addition, we show how plans written in the language we present can
represent certain types of subtasks that cannot be accomplished using the
languages supported by network query engines. Finally, we demonstrate that the
increased expressivity of our plan language does not hamper performance;
specifically, we show how data can be integrated from multiple remote sources
just as efficiently using our architecture as is possible with a
state-of-the-art streaming-dataflow network query engine
Exploring Terms and Taxonomies Relating to the Cyber International Relations Research Field: or are "Cyberspace" and "Cyber Space" the same?
This project has at least two facets to it: (1) advancing the algorithms in the sub-field of bibliometrics often referred to as "text mining" whereby hundreds of thousands of documents (such as journal articles) are scanned and relationships amongst words and phrases are established and (2) applying these tools in support of the Explorations in Cyber International Relations (ECIR) research effort. In international relations, it is important that all the parties understand each other. Although dictionaries, glossaries, and other sources tell you what words/phrases are supposed to mean (somewhat complicated by the fact that they often contradict each other), they do not tell you how people are actually using them.
As an example, when we started, we assumed that "cyberspace" and "cyber space" were essentially the same word with just a minor variation in punctuation (i.e., the space, or lack thereof, between "cyber" and "space") and that the choice of the punctuation was a rather random occurrence. With that assumption in mind, we would expect that the taxonomies that would be constructed by our algorithms using "cyberspace" and "cyber space" as seed terms would be basically the same. As it turned out, they were quite different, both in overall shape and groupings within the taxonomy.
Since the overall field of cyber international relations is so new, understanding the field and how people think about (as evidenced by their actual usage of terminology, and how usage changes over time) is an important goal as part of the overall ECIR project
Ubiquitous intelligence for smart cities: a public safety approach
Citizen-centered safety enhancement is an integral component of public safety and a top priority for decision makers in a smart city development. However, public safety agencies are constantly faced with the challenge of deterring crime. While most smart city initiatives have placed emphasis on the use of modern technology for fighting crime, this may not be sufficient to achieve a sustainable safe and smart city in a resource constrained environment, such as in Africa. In particular, crime series which is a set of crimes considered to have been committed by the same offender is currently less explored in developing nations and has great potential in helping to fight against crime and promoting safety in smart cities. This research focuses on detecting the situation of crime through data mining approaches that can be used to promote citizens' safety, and assist security agencies in knowledge-driven decision support, such as crime series identification. While much research has been conducted on crime hotspots, not enough has been done in the area of identifying crime series. This thesis presents a novel crime clustering model, CriClust, for crime series pattern (CSP) detection and mapping to derive useful knowledge from a crime dataset, drawing on sound scientific and mathematical principles, as well as assumptions from theories of environmental criminology. The analysis is augmented using a dual-threshold model, and pattern prevalence information is encoded in similarity graphs. Clusters are identified by finding highly-connected subgraphs using adaptive graph size and Monte-Carlo heuristics in the Karger-Stein mincut algorithm. We introduce two new interest measures: (i) Proportion Difference Evaluation (PDE), which reveals the propagation effect of a series and dominant series; and (ii) Pattern Space Enumeration (PSE), which reveals underlying strong correlations and defining features for a series. Our findings on experimental quasi-real data set, generated based on expert knowledge recommendation, reveal that identifying CSP and statistically interpretable patterns could contribute significantly to strengthening public safety service delivery in a smart city development. Evaluation was conducted to investigate: (i) the reliability of the model in identifying all inherent series in a crime dataset; (ii) the scalability of the model with varying crime records volume; and (iii) unique features of the model compared to competing baseline algorithms and related research. It was found that Monte Carlo technique and adaptive graph size mechanism for crime similarity clustering yield substantial improvement. The study also found that proportion estimation (PDE) and PSE of series clusters can provide valuable insight into crime deterrence strategies. Furthermore, visual enhancement of clusters using graphical approaches to organising information and presenting a unified viable view promotes a prompt identification of important areas demanding attention. Our model particularly attempts to preserve desirable and robust statistical properties. This research presents considerable empirical evidence that the proposed crime cluster (CriClust) model is promising and can assist in deriving useful crime pattern knowledge, contributing knowledge services for public safety authorities and intelligence gathering organisations in developing nations, thereby promoting a sustainable "safe and smart" city
When Things Matter: A Data-Centric View of the Internet of Things
With the recent advances in radio-frequency identification (RFID), low-cost
wireless sensor devices, and Web technologies, the Internet of Things (IoT)
approach has gained momentum in connecting everyday objects to the Internet and
facilitating machine-to-human and machine-to-machine communication with the
physical world. While IoT offers the capability to connect and integrate both
digital and physical entities, enabling a whole new class of applications and
services, several significant challenges need to be addressed before these
applications and services can be fully realized. A fundamental challenge
centers around managing IoT data, typically produced in dynamic and volatile
environments, which is not only extremely large in scale and volume, but also
noisy, and continuous. This article surveys the main techniques and
state-of-the-art research efforts in IoT from data-centric perspectives,
including data stream processing, data storage models, complex event
processing, and searching in IoT. Open research issues for IoT data management
are also discussed
Traffic event detection framework using social media
This is an accepted manuscript of an article published by IEEE in 2017 IEEE International Conference on Smart Grid and Smart Cities (ICSGSC) on 18/09/2017, available online: https://ieeexplore.ieee.org/document/8038595
The accepted version of the publication may differ from the final published version.© 2017 IEEE. Traffic incidents are one of the leading causes of non-recurrent traffic congestions. By detecting these incidents on time, traffic management agencies can activate strategies to ease congestion and travelers can plan their trip by taking into consideration these factors. In recent years, there has been an increasing interest in Twitter because of the real-time nature of its data. Twitter has been used as a way of predicting revenues, accidents, natural disasters, and traffic. This paper proposes a framework for the real-time detection of traffic events using Twitter data. The methodology consists of a text classification algorithm to identify traffic related tweets. These traffic messages are then geolocated and further classified into positive, negative, or neutral class using sentiment analysis. In addition, stress and relaxation strength detection is performed, with the purpose of further analyzing user emotions within the tweet. Future work will be carried out to implement the proposed framework in the West Midlands area, United Kingdom.Published versio
Visualization of Big Spatial Data using Coresets for Kernel Density Estimates
The size of large, geo-located datasets has reached scales where
visualization of all data points is inefficient. Random sampling is a method to
reduce the size of a dataset, yet it can introduce unwanted errors. We describe
a method for subsampling of spatial data suitable for creating kernel density
estimates from very large data and demonstrate that it results in less error
than random sampling. We also introduce a method to ensure that thresholding of
low values based on sampled data does not omit any regions above the desired
threshold when working with sampled data. We demonstrate the effectiveness of
our approach using both, artificial and real-world large geospatial datasets
Work in progress: Data explorer - Assessment data integration, analytics, and visualization for STEM education research
Citation: Weese, J. L., & Hsu, W. H. (2016). Work in progress: Data explorer - Assessment data integration, analytics, and visualization for STEM education research.We describe a comprehensive system for comparative evaluation of uploaded and preprocessed data in physics education research with applicability to standardized assessments for discipline-based education research, especially in science, technology, mathematics, and engineering. Views are provided for inspection of aggregate statistics about student scores, comparison over time within one course, or comparison across multiple years. The design of this system includes a search facility for retrieving anonymized data from classes similar to the uploader's own. These visualizations include tracking of student performance on a range of standardized assessments. These assessments can be viewed as pre- and post-tests with comparative statistics (e.g., normalized gain), decomposed by answer in the case of multiple-choice questions, and manipulated using pre-specified data transformations such as aggregation and refinement (drill down and roll up). Furthermore, the system is designed to incorporate a scalable framework for machine learning-based analytics, including clustering and similarity-based retrieval, time series prediction, and probabilistic reasoning. © American Society for Engineering Education, 2016
- …