Security, Privacy, and Transparency Guarantees for Machine Learning Systems
Machine learning (ML) is transforming a wide range of applications, promising to bring immense economic and social benefits. However, it also raises substantial security, privacy, and transparency challenges. ML workloads push companies toward aggressive data collection and loose data access policies, placing troves of sensitive user information at risk if the company is hacked. ML also introduces new attack vectors, such as adversarial example attacks, which can completely nullify models’ accuracy under attack. Finally, ML models make complex data-driven decisions, which are opaque to end-users and difficult for programmers to inspect. In this dissertation we describe three systems we developed. Each system addresses one dimension of these challenges by combining new practical systems techniques with rigorous theory to achieve a guaranteed level of protection and to make systems easier to understand. First, we present Sage, a differentially private ML platform that enforces a meaningful protection semantic for the troves of personal information amassed by today’s companies. Second, we describe PixelDP, a defense against adversarial examples that leverages differential privacy theory to provide a guaranteed level of accuracy under attack. Third, we introduce Sunlight, a tool to enhance the transparency of opaque targeting services, using rigorous causal inference theory to explain targeting decisions to end-users.
Big Data Now, 2015 Edition
Now in its fifth year, O’Reilly’s annual Big Data Now report recaps the trends, tools, applications, and forecasts we’ve talked about over the past year. For 2015, we’ve included a collection of blog posts, authored by leading thinkers and experts in the field, that reflect a unique set of themes we’ve identified as gaining significant attention and traction.
Our list of 2015 topics includes:
Data-driven cultures
Data science
Data pipelines
Big data architecture and infrastructure
The Internet of Things and real time
Applications of big data
Security, ethics, and governance
Is your organization on the right track? Get a hold of this free report now and stay in tune with the latest significant developments in big data.
Space Station Human Factors Research Review. Volume 1: EVA Research and Development
An overview is presented of extravehicular activity (EVA) research and development activities at Ames. The majority of the program was devoted to presentations by the three contractors working in parallel on the EVA System Phase A Study, focusing on Implications for Man-Systems Design. Overhead visuals are included for a mission results summary, space station EVA requirements and interface accommodations summary, human productivity study cross-task coordination, and advanced EVAS Phase A study implications for man-systems design. Articles are also included on a subsea approach to work systems development and on advanced EVA system design requirements.
TOWARD ASSURANCE AND TRUST FOR THE INTERNET OF THINGS
Kevin Ashton first used the term Internet of Things (IoT) in 1999 to describe a system in which objects in the physical world could be connected to the Internet by sensors. Since the inception of the term, the total number of Internet-connected devices has skyrocketed, resulting in their integration into every sector of society. Along with the convenience and functionality IoT devices introduce, there is serious concern regarding security, and the IoT security market has been slow to address fundamental security gaps. This dissertation explores some of these challenges in detail and proposes solutions that could make the IoT more secure. Because the challenges in IoT are broad, this work takes a broad view of securing the IoT.
Each chapter in this dissertation explores particular aspects of security and privacy of the IoT, and introduces approaches to address them. We outline security threats related to IoT. We outline trends in the IoT market and explore opportunities to apply machine learning to protect IoT. We developed an IoT testbed to support IoT machine learning research. We propose a Connected Home Automated Security Monitor (CHASM) system that prevents devices from becoming invisible and uses machine learning to improve the security of the connected home and other connected domains. We extend the machine learning algorithms in CHASM to the network perimeter via a novel IoT edge sensor device. We assess the ways in which cybersecurity analytics will need to evolve and identify the potential role of government in promoting the changes needed as a result of IoT adoption. We applied supervised learning and deep learning classifiers to an IoT network connection log dataset to effectively identify varied botnet activity. We proposed a methodology, based on trust metrics and Delphic and Analytic Hierarchical Processes, to identify vulnerabilities in a supply chain and better quantify risk. We built a voice assistant for cyber in response to the increased rigor and associated cognitive load needed to maintain and protect IoT networks.
Data management and Data Pipelines: An empirical investigation in the embedded systems domain
Context: Companies are increasingly collecting data from all possible sources to extract insights that help in data-driven decision-making. Increased data volume, variety, and velocity, together with the impact of poor-quality data on the development of data products, are leading companies to look for an improved data management approach that can accelerate the development of high-quality data products. Further, AI is being applied in a growing number of fields, and thus it is evolving into a horizontal technology. Consequently, AI components are increasingly being integrated into embedded systems along with electronics and software. We refer to these systems as AI-enhanced embedded systems. Given the strong dependence of AI on data, this expansion also creates a new space for applying data management techniques. Objective: The overall goal of this thesis is to empirically identify the data management challenges encountered during the development and maintenance of AI-enhanced embedded systems, propose an improved data management approach, and empirically validate the proposed approach. Method: To achieve the goal, we conducted this research in close collaboration with Software Center companies using a combination of different empirical research methods: case studies, literature reviews, and action research. Results and conclusions: This research provides five main results. First, it identifies key data management challenges specific to Deep Learning models developed at embedded system companies. Second, it examines practices such as DataOps and data pipelines that help to address data management challenges. We observed that DataOps is the best data management practice for improving data quality and reducing the time to develop data products. The data pipeline is the critical component of DataOps that manages the data life cycle activities. The study also identifies the potential faults at each step of the data pipeline and the corresponding mitigation strategies.
Finally, the data pipeline model is realized in a small piece of a data pipeline, and the percentage of data dumps saved through the implementation is calculated. Future work: As future work, we plan to realize the conceptual data pipeline model so that companies can build customized, robust data pipelines. We also plan to analyze the impact and value of data pipelines in cross-domain AI systems and data applications, and to develop an AI-based fault detection and mitigation system suitable for data pipelines.