The Serums Tool-Chain: Ensuring Security and Privacy of Medical Data in Smart Patient-Centric Healthcare Systems
Digital technology is permeating all aspects of human society and life. As a result, humans are becoming highly dependent on digital devices, including digital assistance, intelligence, and decisions. A major concern with this dependence is the lack of human oversight or intervention in many of the ways humans use this technology. Reliance on digital technology raises concerns about how humans trust such systems and how to ensure that digital technology behaves appropriately. This work considers recent developments and projects that combine digital technology and artificial intelligence with human society. The focus is on critical scenarios where failure of digital technology can lead to significant harm or even death. We explore how to build trust for users of digital technology in such scenarios and consider the many different challenges for digital technology. The approaches applied and proposed here address user trust along many dimensions and aim to build collaborative and empowering use of digital technologies in critical aspects of human society.
Empirical Study of Deep Learning for Text Classification in Legal Document Review
Predictive coding has been widely used in legal matters to find relevant or
privileged documents in large sets of electronically stored information. It
saves significant time and cost. Logistic Regression (LR) and Support
Vector Machines (SVM) are two popular machine learning algorithms used in
predictive coding. Recently, deep learning has received a lot of attention in many
industries. This paper reports our preliminary studies in using deep learning
in legal document review. Specifically, we conducted experiments to compare
deep learning results with results obtained using an SVM algorithm on four
datasets from real legal matters. Our results showed that a CNN performed better
with a larger volume of training data and should be a suitable method for text
classification in the legal industry.
Comment: 2018 IEEE International Conference on Big Data (Big Data)
On the Global Convergence of Continuous-Time Stochastic Heavy-Ball Method for Nonconvex Optimization
We study the convergence behavior of the stochastic heavy-ball method with a
small stepsize. Under a change of time scale, we approximate the discrete
method by a stochastic differential equation that models small random
perturbations of a coupled system of nonlinear oscillators. We rigorously show
that the perturbed system converges to a local minimum in logarithmic time.
This indicates that the diffusion process approximating the stochastic
heavy-ball method escapes from all saddle points in time that is (up to a
logarithmic factor) linear in the square root of the inverse stepsize. This
result may suggest fast convergence of its discrete-time counterpart. Our
theoretical results are validated by numerical experiments.
Comment: accepted at IEEE International Conference on Big Data in 201
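As an illustration of the discrete method that the abstract approximates by a diffusion, here is a minimal sketch of a stochastic heavy-ball iteration started exactly at a saddle point. It is not the authors' code or experiment: the test function f(x, y) = (x^2 - 1)^2 / 4 + y^2 / 2, the Gaussian gradient noise, and all parameter values (`stepsize`, `momentum`, `noise`, `steps`, `seed`) are assumptions chosen for the example.

```python
import random


def grad_f(x, y):
    """Gradient of the toy function f(x, y) = (x**2 - 1)**2 / 4 + y**2 / 2.

    f has a saddle at (0, 0) and local minima at (+1, 0) and (-1, 0).
    The function is an assumption made for this illustration.
    """
    return x**3 - x, y


def stochastic_heavy_ball(x0, y0, stepsize=0.01, momentum=0.9,
                          noise=0.1, steps=5000, seed=0):
    """Run heavy-ball (momentum) updates with noisy gradients.

    Update rule:  v <- momentum * v - stepsize * (grad + noise * N(0, 1))
                  x <- x + v
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x, y = x0, y0
    vx, vy = 0.0, 0.0
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        gx += noise * rng.gauss(0.0, 1.0)  # stochastic gradient perturbation
        gy += noise * rng.gauss(0.0, 1.0)
        vx = momentum * vx - stepsize * gx
        vy = momentum * vy - stepsize * gy
        x += vx
        y += vy
    return x, y


# Start exactly at the saddle point; the noise pushes the iterate off it.
final_x, final_y = stochastic_heavy_ball(0.0, 0.0)
```

With zero noise the iterate would stay at the saddle forever, since the gradient there is zero; with small noise it drifts along the unstable x direction and settles near one of the two minima.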
Experiments of posture estimation on vehicles using wearable acceleration sensors
In this paper, we study methods to estimate drivers' posture in vehicles
using acceleration data from a wearable sensor, and we conduct a field test. Recently,
sensor technologies have progressed. Safety management solutions that
analyze vital data acquired from wearable sensors and judge work status have been
proposed. Demand for safety management of buses and taxis is high in order to
prevent major accidents. However, inside a vehicle the vehicle's own acceleration is
added to the wearable sensor's readings, so there is no guarantee that the driver's
posture can be estimated accurately.
Therefore, in this paper, we study methods to estimate driving posture using
acceleration data acquired from hitoe, a T-shirt-type wearable sensor, conduct
field tests, and implement a sample application.
Comment: 4 pages, 4 figures, The 3rd IEEE International Conference on Big Data
Security on Cloud (BigDataSecurity 2017), pp.14-17, Beijing, May 201
Performance Characterization of In-Memory Data Analytics on a Modern Cloud Server
In the last decade, data analytics have rapidly progressed from traditional
disk-based processing to modern in-memory processing. However, little effort
has been devoted to enhancing performance at the micro-architecture level. This
paper characterizes the performance of in-memory data analytics using the Apache
Spark framework. We use a single-node NUMA machine and identify the bottlenecks
hampering the scalability of workloads. We also quantify the inefficiencies at
the micro-architecture level for various data analysis workloads. Through empirical
evaluation, we show that Spark workloads do not scale linearly beyond twelve
threads, due to work time inflation and thread-level load imbalance. Further,
at the micro-architecture level, we observe memory-bound latency to be the
major cause of work time inflation.
Comment: Accepted to The 5th IEEE International Conference on Big Data and
Cloud Computing (BDCloud 2015)
Topic Similarity Networks: Visual Analytics for Large Document Sets
We investigate ways in which to improve the interpretability of LDA topic
models by better analyzing and visualizing their outputs. We focus on examining
what we refer to as topic similarity networks: graphs in which nodes represent
latent topics in text collections and links represent similarity among topics.
We describe efficient and effective approaches to both building and labeling
such networks. Visualizations of topic models based on these networks are shown
to be a powerful means of exploring, characterizing, and summarizing large
collections of unstructured text documents. They help to "tease out"
non-obvious connections among different sets of documents and provide insights
into how topics form larger themes. We demonstrate the efficacy and
practicality of these approaches through two case studies: 1) NSF grants for
basic research spanning a 14-year period and 2) the entire English portion of
Wikipedia.
Comment: 9 pages; 2014 IEEE International Conference on Big Data (IEEE BigData
2014)
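The idea of a topic similarity network (nodes are topics, links are similarities between topics) can be sketched in a few lines. The abstract does not say which similarity measure or linking rule the authors use, so cosine similarity over topic-word weights and a fixed `threshold` are assumptions here, and the three toy topics are invented for illustration.

```python
import math
from itertools import combinations


def cosine(p, q):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(weight * q.get(word, 0.0) for word, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def build_topic_network(topics, threshold=0.3):
    """Return {(topic_a, topic_b): similarity} for pairs at or above threshold.

    `topics` maps a topic name to its word-weight dictionary, for example
    the top words of an LDA topic with their probabilities.
    """
    edges = {}
    for a, b in combinations(sorted(topics), 2):
        sim = cosine(topics[a], topics[b])
        if sim >= threshold:
            edges[(a, b)] = sim
    return edges


# Hypothetical topics; a real model would supply these distributions.
topics = {
    "t_genes": {"gene": 0.5, "protein": 0.3, "cell": 0.2},
    "t_cells": {"cell": 0.5, "protein": 0.3, "tissue": 0.2},
    "t_stars": {"star": 0.6, "galaxy": 0.4},
}
network = build_topic_network(topics)
```

In this toy input only the two biology topics share vocabulary, so they are the only pair linked; the astronomy topic stays an isolated node.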
Unleashing the Power of Hashtags in Tweet Analytics with Distributed Framework on Apache Storm
Twitter is a popular social network platform where users can interact and
post texts of up to 280 characters called tweets. Hashtags, hyperlinked words
in tweets, have increasingly become crucial for tweet retrieval and search.
Using hashtags for tweet topic classification is a challenging problem because
of context dependence among words, slang, abbreviations, and emoticons in a short
tweet, along with the evolving use of hashtags. Since Twitter generates millions of
tweets daily, tweet analytics is a fundamental big data stream problem that
often requires real-time distributed processing. This paper proposes a
distributed online approach to tweet topic classification with hashtags.
Implemented on Apache Storm, a distributed real-time framework, our approach
incrementally identifies and updates a set of strong predictors in the Naïve
Bayes model for classifying each incoming tweet instance. Preliminary
experiments show promising results with up to 97% accuracy and a 37% increase in
throughput on eight processors.
Comment: IEEE International Conference on Big Data 201
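The incremental flavor of the classifier described above can be illustrated with a single-process online multinomial Naïve Bayes that updates its counts one tweet at a time. This is only a sketch under stated assumptions: it does not reproduce the paper's strong-predictor selection or its Apache Storm topology, and the class name `OnlineNaiveBayes`, the Laplace smoothing parameter `alpha`, and the toy stream are hypothetical.

```python
import math
from collections import Counter, defaultdict


class OnlineNaiveBayes:
    """Multinomial Naive Bayes whose counts are updated per instance."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing constant
        self.class_counts = Counter()           # tweets seen per label
        self.word_counts = defaultdict(Counter)  # word counts per label
        self.vocab = set()

    def update(self, words, label):
        """Fold one labeled tweet into the model."""
        self.class_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)

    def predict(self, words):
        """Return the most probable label, or None before any update."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            # One extra vocabulary slot is reserved for unseen words.
            denom = sum(counts.values()) + self.alpha * (len(self.vocab) + 1)
            score = math.log(n / total)
            for w in words:
                score += math.log((counts[w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical stream of (tokens, hashtag-derived topic) pairs.
stream = [
    (["goal", "match", "team"], "#sports"),
    (["election", "vote", "poll"], "#politics"),
    (["team", "win", "league"], "#sports"),
    (["vote", "senate", "bill"], "#politics"),
]
model = OnlineNaiveBayes()
for tokens, topic in stream:
    model.update(tokens, topic)

sports_guess = model.predict(["team", "match"])
politics_guess = model.predict(["vote", "bill"])
```

Because the model keeps only counts, each update costs time proportional to the tweet length, which is what makes this family of classifiers a natural fit for stream processing.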
Proactive Preservation of World Heritage by Crowdsourcing and 3D Reconstruction Technology
Since over one million tourists visit the Angkor ruins annually, the effect on the buildings of the vibrations caused by these tourists is a major problem for maintaining them. Organisms such as bryophytes, which adhere to the surface of the stones of the ruins, are another factor that damages them. Using crowdsourcing and 3D reconstruction technology, we are organizing a proactive preservation project for the Angkor Thom Bayon Temple, which is a world cultural heritage site. We evaluated its damaged parts and visualized the damaged state.
Published in: 2017 IEEE International Conference on Big Data (Big Data) Date of Conference: 11-14 Dec. 2017 Conference Location: Boston, MA, US