Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization
Protecting vast quantities of data poses a daunting challenge for the growing
number of organizations that collect, stockpile, and monetize it. The ability
to distinguish data that is actually needed from data collected "just in case"
would help these organizations to limit the latter's exposure to attack. A
natural approach might be to monitor data use and retain only the working-set
of in-use data in accessible storage; unused data can be evicted to a highly
protected store. However, many of today's big data applications rely on machine
learning (ML) workloads that are periodically retrained by accessing, and thus
exposing to attack, the entire data store. Training set minimization methods,
such as count featurization, are often used to limit the data needed to train
ML workloads to improve performance or scalability. We present Pyramid, a
limited-exposure data management system that builds upon count featurization to
enhance data protection. As such, Pyramid uniquely introduces both the idea and
proof-of-concept for leveraging training set minimization methods to instill
rigor and selectivity into big data management. We integrated Pyramid into
Spark Velox, a framework for ML-based targeting and personalization. We
evaluate it on three applications and show that Pyramid approaches the accuracy of
state-of-the-art models while training on less than 1% of the raw data.
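The count featurization technique that Pyramid builds on can be illustrated with a minimal sketch (a simplified illustration of the general method, not Pyramid's implementation; the class and method names are assumptions): high-cardinality feature values are replaced by compact per-label count statistics, so models can train without touching the raw records.

```python
from collections import defaultdict

class CountFeaturizer:
    """Replace a categorical value with label-count statistics."""

    def __init__(self):
        # counts[value][label] -> times `value` co-occurred with `label`
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, value, label):
        self.counts[value][label] += 1

    def featurize(self, value, labels=(0, 1)):
        # Emit per-label counts plus the empirical positive rate.
        c = [self.counts[value][lbl] for lbl in labels]
        total = sum(c)
        rate = c[1] / total if total else 0.0
        return c + [rate]

cf = CountFeaturizer()
for v, y in [("ad42", 1), ("ad42", 0), ("ad42", 1), ("ad7", 0)]:
    cf.update(v, y)
features = cf.featurize("ad42")  # per-label counts plus positive rate
```

In this toy form, a model would consume the three-element feature vector instead of the raw identifier, which is what lets training proceed on far less exposed data.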
Cloud-based digital twinning for structural health monitoring using deep learning
Digital Twin technology has recently gathered pace in the engineering communities, as it allows for the convergence of the real structure and its digital counterpart throughout their entire life-cycle. With the rapid development of supporting technologies, including machine learning, 5G/6G, cloud computing, and the Internet of Things, Digital Twin has been moving progressively from concept to practice. In this paper, a Digital Twin framework based on cloud computing and deep learning for structural health monitoring is proposed to efficiently perform real-time monitoring and proactive maintenance. The framework consists of structural components, device measurements, and digital models formed by combining different sub-models, including mathematical, finite element, and machine learning ones. The data interaction among the physical structure, the digital model, and human interventions is enhanced by using cloud computing infrastructure and a user-friendly web application. The feasibility of the proposed framework is demonstrated via case studies of damage detection on model and real bridge structures using deep learning algorithms, achieving a high accuracy of 92%.
Industry 4.0 for SME
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics.
Industry 4.0 has been growing within companies and impacting the economy and society, but adoption has proven a more complex challenge for some types of companies. Due to the costs and complexity associated with Industry 4.0 technologies, small and medium enterprises face difficulties in adopting them.
This thesis proposes a model that gives guidance on, and simplifies, how to implement Industry 4.0 in SMEs from a low-cost perspective. The model is intended to serve as a blueprint for designing and implementing an Industry 4.0 project within a manufacturing SME.
To create the model, a literature review of the different fields regarding Industry 4.0 was conducted to identify the technologies most suitable for the manufacturing industry and the use cases where they would be applicable. After the model was built, expert interviews were conducted, and based on the received feedback, the model was tweaked, improved, and validated.
Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding
Learning long-term dependencies in extended temporal sequences requires
credit assignment to events far back in the past. The most common method for
training recurrent neural networks, back-propagation through time (BPTT),
requires credit information to be propagated backwards through every single
step of the forward computation, potentially over thousands or millions of time
steps. This becomes computationally expensive or even infeasible when used with
long sequences. Importantly, biological brains are unlikely to perform such
detailed reverse replay over very long sequences of internal states (consider
days, months, or years). However, humans are often reminded of past memories or
mental states which are associated with the current mental state. We consider
the hypothesis that such memory associations between past and present could be
used for credit assignment through arbitrarily long sequences, propagating the
credit assigned to the current state to the associated past state. Based on
this principle, we study a novel algorithm which only back-propagates through a
few of these temporal skip connections, realized by a learned attention
mechanism that associates current states with relevant past states. We
demonstrate in experiments that our method matches or outperforms regular BPTT
and truncated BPTT in tasks involving particularly long-term dependencies, but
without requiring the biologically implausible backward replay through the
whole history of states. Additionally, we demonstrate that the proposed method
transfers to longer sequences significantly better than LSTMs trained with BPTT
and LSTMs trained with full self-attention.
Comment: To appear as a Spotlight presentation at NIPS 201
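The core mechanism, attending over stored past states and keeping only the strongest matches, can be sketched in a few lines (an assumption-laden simplification, not the paper's implementation; gradients would flow back only through the selected skip connections rather than through every intermediate step):

```python
import numpy as np

def topk_attention(current, memory, k=2):
    """Select the k past states most associated with the current state."""
    scores = memory @ current                # similarity to each stored state
    idx = np.argsort(scores)[-k:]            # indices of the top-k past states
    weights = np.exp(scores[idx])
    weights /= weights.sum()                 # softmax over the sparse set only
    summary = weights @ memory[idx]          # readout from the selected states
    return idx, summary

rng = np.random.default_rng(0)
memory = rng.standard_normal((10, 4))        # 10 stored past hidden states
current = rng.standard_normal(4)             # current hidden state
idx, summary = topk_attention(current, memory, k=2)
```

Because only `memory[idx]` participates in the readout, backpropagation from `summary` would touch just those k states, which is the sparse credit-assignment path the abstract describes.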
Visual Tracking by Sampling in Part Space
In this paper, we present a novel part-based visual tracking method from the perspective of probability sampling. Specifically, we represent the target by a part space with two online learned probabilities to capture the structure of the target. The proposal distribution memorizes the historical performance of different parts, and it is used for the first round of part selection. The acceptance probability validates the specific tracking stability of each part in a frame, and it determines whether to accept its vote or to reject it. By doing this, we transform the complex online part selection problem into a probability learning one, which is easier to tackle. The observation model of each part is constructed by an improved supervised descent method and is learned in an incremental manner. Experimental results on two benchmarks demonstrate the competitive performance of our tracker against state-of-the-art methods.
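The two-round selection described above can be sketched as follows (a hedged illustration under assumed names and toy probabilities, not the paper's actual model): parts are first sampled from a proposal distribution reflecting historical reliability, and each sampled part's vote is then accepted or rejected by its per-frame stability.

```python
import random

def select_parts(parts, proposal, acceptance, n_samples=3, rng=None):
    """Two-round part selection: proposal sampling, then acceptance test."""
    rng = rng or random.Random(0)
    # Round 1: sample candidates proportionally to their proposal weights.
    candidates = rng.choices(parts, weights=[proposal[p] for p in parts],
                             k=n_samples)
    # Round 2: accept each candidate's vote with its acceptance probability.
    return [p for p in candidates if rng.random() < acceptance[p]]

parts = ["head", "torso", "left_arm"]
proposal = {"head": 0.5, "torso": 0.4, "left_arm": 0.1}     # historical reliability
acceptance = {"head": 0.9, "torso": 0.8, "left_arm": 0.3}   # per-frame stability
accepted = select_parts(parts, proposal, acceptance)
```

Only the accepted parts would contribute votes to the target-state estimate in a given frame, which is how the sampling view replaces an explicit combinatorial part-selection step.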
CILP: Co-simulation based imitation learner for dynamic resource provisioning in cloud computing environments
Intelligent Virtual Machine (VM) provisioning is central to cost- and resource-efficient computation in cloud computing environments. As bootstrapping VMs is time-consuming, a key challenge for latency-critical tasks is to predict future workload demands and provision VMs proactively. However, existing AI-based solutions tend not to holistically consider all crucial aspects, such as provisioning overheads, heterogeneous VM costs, and the Quality of Service (QoS) of the cloud system. To address this, we propose a novel method, called CILP, that formulates the VM provisioning problem as two sub-problems of prediction and optimization, where the provisioning plan is optimized based on predicted workload demands. CILP leverages a neural network as a surrogate model to predict future workload demands with a co-simulated digital twin of the infrastructure to compute QoS scores. We extend the neural network to also act as an imitation learner that dynamically decides the optimal VM provisioning plan. A transformer-based neural model reduces training and inference overheads, while our novel two-phase decision-making loop facilitates informed provisioning decisions. Crucially, we address limitations of prior work by including resource utilization, deployment costs, and provisioning overheads to inform the provisioning decisions in our imitation learning framework. Experiments with three public benchmarks demonstrate that CILP gives up to 22% higher resource utilization, 14% higher QoS scores, and 44% lower execution costs compared to the current online and offline optimization-based state-of-the-art methods.
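The predict-then-optimize structure can be sketched as a small loop (all names and the toy simulator here are illustrative assumptions, not CILP's actual components): predict demand, co-simulate candidate provisioning plans to score QoS and cost, then commit the best-scoring plan.

```python
def choose_plan(predicted_demand, candidate_plans, simulate):
    """Pick the plan whose simulated QoS-minus-cost score is highest."""
    def score(plan):
        qos, cost = simulate(plan, predicted_demand)
        return qos - cost                  # reward QoS, penalize cost
    return max(candidate_plans, key=score)

def toy_simulate(plan, demand):
    # Each entry of `plan` is the capacity of one provisioned VM.
    capacity = sum(plan)
    qos = min(capacity, demand) / demand   # fraction of demand served
    cost = 0.02 * capacity                 # linear provisioning cost
    return qos, cost

best = choose_plan(predicted_demand=10,
                   candidate_plans=[[4, 4], [8, 4], [8, 8]],
                   simulate=toy_simulate)
```

In CILP the predictor is a neural surrogate and the simulator is a co-simulated digital twin; the sketch only conveys how the two sub-problems compose into one provisioning decision.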
The AlfaCrux CubeSat mission description and early results
On 1 April 2022, the AlfaCrux CubeSat was launched by the Falcon 9 Transporter-4 mission, the fourth SpaceX dedicated smallsat rideshare program mission, from Space Launch Complex 40 at Cape Canaveral Space Force Station in Florida into a Sun-synchronous orbit at 500 km. AlfaCrux is an amateur radio and educational mission to provide learning and scientific benefits in the context of small satellite missions. It is an opportunity for theoretical and practical learning about the technical management, systems design, communication, orbital mechanics, development, integration, and operation of small satellites. The AlfaCrux payload, a software-defined radio hardware, is responsible for two main services, which are a digital packet repeater and a store-and-forward system. In the ground segment, a cloud-computing-based command and control station has been developed, together with an open access online platform to access and visualize the main information of the AlfaCrux telemetry and user data and experiments. It also becomes an in-orbit database reference to be used for different studies concerned with, for instance, radio propagation, attitude reconstruction, and data-driven calibration algorithms for satellite sensors, among others. In this context, this paper describes the AlfaCrux mission, its main subsystems, and the achievements obtained in the early orbit phase. Scientific and engineering assessments conducted with the spacecraft operations to tackle unexpected behaviors in the ground station and also to better understand the space environment are also presented and discussed.
Fundação de Apoio à Pesquisa do Distrito Federal (FAPDF), Brasil | Ref. N/