21st century learning infrastructure project
This Research Project was a practical experience for gaining knowledge of web design, streaming content delivery and the creation of online educational objects and environments. While working for the Information Technology Department for the State of Iowa, this author was involved with the 21st Century Learning Infrastructure Pilot Project, an attempt to provide learners in Iowa with educational content on demand. This content includes streaming video and audio, computer-based training objects, web pages, pictures, interactions between learners, etc. This author's responsibilities in the initial Pilot Project were those of lead web designer (creating five web sites in all), streaming media producer and learning management system assistant.
A methodology for full-system power modeling in heterogeneous data centers
The need for energy-awareness in current data centers has encouraged the use of power modeling to estimate their power consumption. However, existing models present noticeable limitations, which make them application-dependent, platform-dependent, inaccurate, or computationally complex. In this paper, we propose a platform- and application-agnostic methodology for full-system power modeling in heterogeneous data centers that overcomes those limitations. It derives a single model per platform, which works with high accuracy for heterogeneous applications with different patterns of resource usage and energy consumption, by systematically selecting a minimum set of resource usage indicators and extracting complex relations among them that capture the impact on energy consumption of all the resources in the system. We demonstrate our methodology by generating power models for heterogeneous platforms with very different power consumption profiles. Our validation experiments with real Cloud applications show that such models provide high accuracy (an average estimation error of around 5%).
This work is supported by the Spanish Ministry of Economy and Competitiveness under contract TIN2015-65316-P, by the Generalitat de Catalunya under contract 2014-SGR-1051, and by the European Commission under FP7-SMARTCITIES-2013 contract 608679 (RenewIT) and FP7-ICT-2013-10 contracts 610874 (ASCETiC) and 610456 (EuroServer). Peer Reviewed. Postprint (author's final draft
Deep Learning in the Automotive Industry: Applications and Tools
Deep Learning refers to a set of machine learning techniques that utilize
neural networks with many hidden layers for tasks such as image
classification, speech recognition, and language understanding. Deep learning has
been proven to be very effective in these domains and is pervasively used by
many Internet services. In this paper, we describe different automotive use
cases for deep learning, in particular in the domain of computer vision. We
survey the current state of the art in libraries, tools and infrastructures
(e.g., GPUs and clouds) for implementing, training and deploying deep neural
networks. We particularly focus on convolutional neural networks and computer
vision use cases, such as the visual inspection process in manufacturing plants
and the analysis of social media data. To train neural networks, curated and
labeled datasets are essential. However, both the availability and scope
of such datasets are typically very limited. A main contribution of this paper
is the creation of an automotive dataset that allows us to learn and
automatically recognize different vehicle properties. We describe an end-to-end
deep learning application utilizing a mobile app for data collection and
process support, and an Amazon-based cloud backend for storage and training.
For training we evaluate the use of cloud and on-premises infrastructures
(including multiple GPUs) in conjunction with different neural network
architectures and frameworks. We assess both the training times and the
accuracy of the classifier. Finally, we demonstrate the effectiveness of the
trained classifier in a real-world setting during the manufacturing process. Comment: 10 page
Predicting Session Length in Media Streaming
Session length is a very important aspect in determining a user's
satisfaction with a media streaming service. Being able to predict how long a
session will last can be of great use for various downstream tasks, such as
recommendations and ad scheduling. Most of the related literature on user
interaction duration has focused on dwell time for websites, usually in the
context of approximating post-click satisfaction either in search results, or
display ads. In this work we present the first analysis of session length in a
mobile-focused online service, using a real-world dataset from a major music
streaming service. We use survival analysis techniques to show that the
characteristics of the length distributions can differ significantly between
users, and use gradient boosted trees with appropriate objectives to predict
the length of a session using only information available at its beginning. Our
evaluation on real-world data illustrates that our proposed technique
outperforms the considered baseline. Comment: 4 pages, 3 figure
Information Extraction in Illicit Domains
Extracting useful entities and attribute values from illicit domains such as
human trafficking is a challenging problem with the potential for widespread
social impact. Such domains employ atypical language models, have `long tails'
and suffer from the problem of concept drift. In this paper, we propose a
lightweight, feature-agnostic Information Extraction (IE) paradigm specifically
designed for such domains. Our approach uses raw, unlabeled text from an
initial corpus, and a few (12-120) seed annotations per domain-specific
attribute, to learn robust IE models for unobserved pages and websites.
Empirically, we demonstrate that our approach can outperform feature-centric
Conditional Random Field baselines by over 18% F-Measure on five annotated
sets of real-world human trafficking datasets in both low-supervision and
high-supervision settings. We also show that our approach is demonstrably
robust to concept drift, and can be efficiently bootstrapped even in a serial
computing environment. Comment: 10 pages, ACM WWW 201
Traffic event detection framework using social media
This is an accepted manuscript of an article published by IEEE in 2017 IEEE International Conference on Smart Grid and Smart Cities (ICSGSC) on 18/09/2017, available online: https://ieeexplore.ieee.org/document/8038595
The accepted version of the publication may differ from the final published version. © 2017 IEEE. Traffic incidents are one of the leading causes of non-recurrent traffic congestion. By detecting these incidents in time, traffic management agencies can activate strategies to ease congestion, and travelers can plan their trips with these factors in mind. In recent years, there has been an increasing interest in Twitter because of the real-time nature of its data. Twitter has been used as a way of predicting revenues, accidents, natural disasters, and traffic. This paper proposes a framework for the real-time detection of traffic events using Twitter data. The methodology consists of a text classification algorithm to identify traffic-related tweets. These traffic messages are then geolocated and further classified into positive, negative, or neutral classes using sentiment analysis. In addition, stress and relaxation strength detection is performed, with the purpose of further analyzing user emotions within the tweet. Future work will be carried out to implement the proposed framework in the West Midlands area, United Kingdom. Published versio
Crowdbreaks: Tracking Health Trends using Public Social Media Data and Crowdsourcing
In the past decade, tracking health trends using social media data has shown
great promise, due to a powerful combination of massive adoption of social
media around the world, and increasingly potent hardware and software that
enables us to work with these new big data streams. At the same time, many
challenging problems have been identified. First, there is often a mismatch
between how rapidly online data can change, and how rapidly algorithms are
updated, which means that there is limited reusability for algorithms trained
on past data as their performance decreases over time. Second, much of the work
focuses on specific issues during a specific past period in time, even
though public health institutions would need flexible tools to assess multiple
evolving situations in real time. Third, most tools providing such capabilities
are proprietary systems with little algorithmic or data transparency, and thus
little buy-in from the global public health and research community. Here, we
introduce Crowdbreaks, an open platform which allows tracking of health trends
by making use of continuous crowdsourced labelling of public social media
content. The system is built in a way that automates the typical workflow
of data collection, filtering, labelling and training of machine learning
classifiers and therefore can greatly accelerate the research process in the
public health domain. This work introduces the technical aspects of the
platform and explores its future use cases.