779 research outputs found

    How to Price Shared Optimizations in the Cloud

    Full text link
    Data-management-as-a-service systems are increasingly being used in collaborative settings, where multiple users access common datasets. Cloud providers have the choice to implement various optimizations, such as indexing or materialized views, to accelerate queries over these datasets. Each optimization carries a cost and may benefit multiple users. This creates a major challenge: how to select which optimizations to perform and how to share their cost among users. The problem is especially challenging when users are selfish and will only report their true values for different optimizations if doing so maximizes their utility. In this paper, we present a new approach for selecting and pricing shared optimizations by using Mechanism Design. We first show how to apply the Shapley Value Mechanism to the simple case of selecting and pricing additive optimizations, assuming an offline game where all users access the service for the same time-period. Second, we extend the approach to online scenarios where users come and go. Finally, we consider the case of substitutive optimizations. We show analytically that our mechanisms induce truthfulness and recover the optimization costs. We also show experimentally that our mechanisms yield higher utility than the state-of-the-art approach based on regret accumulation. Comment: VLDB201
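
    A minimal sketch of the Shapley-value cost-sharing idea the paper builds on (the generic Shapley computation, not the paper's actual mechanism; the user names and cost function below are made up):

```python
import math
from itertools import permutations

def shapley_cost_shares(users, coalition_cost):
    """Exact Shapley cost shares: average, over all orderings of the users,
    of the marginal cost each user adds when joining. Exponential in the
    number of users, so only meant for tiny illustrative instances."""
    shares = {u: 0.0 for u in users}
    for order in permutations(users):
        current = set()
        for u in order:
            before = coalition_cost(frozenset(current))
            current.add(u)
            shares[u] += coalition_cost(frozenset(current)) - before
    n_orderings = math.factorial(len(users))
    return {u: s / n_orderings for u, s in shares.items()}

# Example: a single shared index costs 90 and benefits any non-empty set of users.
users = ["u1", "u2", "u3"]
index_cost = lambda coalition: 90.0 if coalition else 0.0
print(shapley_cost_shares(users, index_cost))  # {'u1': 30.0, 'u2': 30.0, 'u3': 30.0}
```

    For symmetric users sharing one fixed-cost optimization this reduces to an equal split; the paper's mechanism additionally handles reported values, truthfulness and cost recovery, which this sketch does not attempt.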

    Creation and web integration of a machine learning tool for estimating the price of second-hand devices

    Get PDF
    Due to the rapid evolution of hardware, individuals and organizations renew their devices frequently, and most replaced devices are prematurely recycled. However, those devices, despite being obsolete, still hold value and can have a second owner, which saves more resources than recycling does. Refurbishers receive hundreds or thousands of devices from organizations and must decide, for each one, whether to recycle it or store it so it can be used by another organization. To make a wise choice, the value of the device and the cost of storing it should be taken into account. Moreover, since devices are configurable, the decision could also involve taking components from one device to upgrade or repair another, which makes the problem more complex. This project addresses the problem by building a machine learning tool for predicting the price of second-hand devices, so refurbishers can estimate the price a buyer would pay for a device. The project also develops a web application that integrates the tool and eases its use
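
    As an illustration of the kind of price-estimation tool the project describes, a hedged sketch using scikit-learn (the file name, column names and the choice of a random-forest regressor are assumptions, not details taken from the project):

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hypothetical dataset of refurbished devices; all column names are illustrative.
df = pd.read_csv("devices.csv")        # e.g. cpu_score, ram_gb, storage_gb, age_years, price
X = pd.get_dummies(df.drop(columns=["price"]))   # one-hot encode categorical specs
y = df["price"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```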

    D3.2 Cost Concept Model and Gateway Specification

    Get PDF
    This document introduces a Framework supporting the implementation of a cost concept model against which current and future cost models for curating digital assets can be benchmarked. The value built into this cost concept model leverages the comprehensive engagement by the 4C project with various user communities and builds upon our understanding of the requirements, drivers, obstacles and objectives that various stakeholder groups have relating to digital curation. Ultimately, this concept model should provide a critical input to the development and refinement of cost models as well as helping to ensure that the curation and preservation solutions and services that will inevitably arise from the commercial sector as ‘supply’ respond to a much better understood ‘demand’ for cost-effective and relevant tools. To meet acknowledged gaps in current provision, a nested model of curation which addresses both costs and benefits is provided. The goal of this task was not to create a single, functionally implementable cost modelling application, but rather to design a model based on common concepts and to develop a generic gateway specification that can be used by future model developers, service and solution providers, and by researchers in follow-up research and development projects.
    The Framework includes:
    • A Cost Concept Model, which defines the core concepts that should be included in curation cost models;
    • An Implementation Guide for the cost concept model, which provides guidance and proposes questions that should be considered when developing new cost models and refining existing cost models;
    • A Gateway Specification Template, which provides standard metadata for each of the core cost concepts and is intended for use by future model developers, model users, and service and solution providers to promote interoperability;
    • A Nested Model for Digital Curation, which visualises the core concepts, demonstrates how they interact and places them into context visually by linking them to A Cost and Benefit Model for Curation.
    This Framework provides guidance for data collection and associated calculations in an operational context but will also provide a critical foundation for more strategic thinking around curation such as the Economic Sustainability Reference Model (ESRM). Where appropriate, definitions of terms are provided, recommendations are made, and examples from existing models are used to illustrate the principles of the framework
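
    To make the gateway-specification idea more concrete, a small illustrative metadata record for one cost concept (the field names and example values are assumptions for illustration, not the 4C Gateway Specification Template itself):

```python
from dataclasses import dataclass, field

@dataclass
class CostConcept:
    """Illustrative metadata record for a core cost concept; field names are
    assumed, not taken from the Gateway Specification Template."""
    identifier: str                 # stable ID used to map concepts across models
    name: str
    definition: str
    unit: str                       # e.g. "EUR/TB/year" or "staff hours"
    lifecycle_stage: str            # e.g. "ingest", "storage", "access"
    related_benefits: list = field(default_factory=list)

storage = CostConcept(
    identifier="cc-001",
    name="Storage cost",
    definition="Recurring cost of keeping digital assets on managed storage.",
    unit="EUR/TB/year",
    lifecycle_stage="storage",
)
```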

    The Predicted-Deletion Dynamic Model: Taking Advantage of ML Predictions, for Free

    Full text link
    The main bottleneck in designing efficient dynamic algorithms is the unknown nature of the update sequence. In particular, there are some problems, like 3-vertex connectivity, planar digraph all pairs shortest paths, and others, where the separation in runtime between the best partially dynamic solutions and the best fully dynamic solutions is polynomial, sometimes even exponential. In this paper, we formulate the predicted-deletion dynamic model, motivated by a recent line of empirical work about predicting edge updates in dynamic graphs. In this model, edges are inserted and deleted online, and when an edge is inserted, it is accompanied by a "prediction" of its deletion time. This models real-world settings where services may have access to historical data or other information about an input and can subsequently use such information to make predictions about user behavior. The model is also of theoretical interest, as it interpolates between the partially dynamic and fully dynamic settings, and provides a natural extension of the algorithms with predictions paradigm to the dynamic setting. We give a novel framework for this model that "lifts" partially dynamic algorithms into the fully dynamic setting with little overhead. We use our framework to obtain improved efficiency bounds over the state-of-the-art dynamic algorithms for a variety of problems. In particular, we design algorithms that have amortized update time that scales with a partially dynamic algorithm, with high probability, when the predictions are of high quality. On the flip side, our algorithms do no worse than existing fully-dynamic algorithms when the predictions are of low quality. Furthermore, our algorithms exhibit a graceful trade-off between the two cases. Thus, we are able to take advantage of ML predictions asymptotically "for free."
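
    A toy sketch of the update interface the model describes, in which every inserted edge carries a predicted deletion time (this only illustrates how predictions might be checked against actual deletions; it is not the paper's lifting framework, and the class and parameter names are made up):

```python
class PredictedDeletionGraph:
    """Toy illustration of the predicted-deletion dynamic model: edges arrive
    with a predicted deletion time, and updates whose predictions turn out to
    be inaccurate are counted so a fallback (fully dynamic) path could be used."""

    def __init__(self, slack=0.0):
        self.slack = slack            # tolerated error in the predicted deletion time
        self.edges = {}               # edge -> predicted deletion time
        self.mispredictions = 0

    def insert(self, edge, predicted_deletion):
        self.edges[edge] = predicted_deletion

    def delete(self, edge, now):
        predicted = self.edges.pop(edge)
        if abs(now - predicted) > self.slack:
            self.mispredictions += 1  # low-quality prediction: would fall back
                                      # to an ordinary fully dynamic update

g = PredictedDeletionGraph(slack=2)
g.insert(("a", "b"), predicted_deletion=10)
g.delete(("a", "b"), now=11)          # within slack: the prediction was good enough
print(g.mispredictions)               # 0
```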

    An Essay on How Data Science Can Strengthen Business

    Get PDF
    Data science combines several disciplines, including statistics, scientific methods, artificial intelligence (AI) and data analysis, to extract value from raw data. Analytical applications and data scientists can then validate the results and uncover patterns and trends, allowing business leaders to gain informed knowledge about the market. Companies hold a wealth of data: as modern technology made it possible to create and store ever-increasing amounts of information, data volumes exploded. The data collected and stored by these technologies can bring transformative benefits to organizations and societies around the world, but only if they can interpret it; that is where data science comes in. Applied economics refers to the application of economic theory and analysis. In this article we present several software packages available for economic analysis. Analysis can be performed on any type of data and is a way of examining raw data to find useful information. Several technologies are available for economic analysis, with varying feature sets; some are not intended for this single purpose alone and cover a wider spectrum of functionality. The technologies we use include RStudio, SPSS, Statis and SAS/Stata, all common choices for economic or business analysis. The intention is to demonstrate how each of these packages analyses the data and what interpretations we can draw from that scrutiny. Organizations are using data science teams to turn data into a competitive advantage by refining products and services and delivering cost-effective solutions. We will use several algorithms to examine how they are handled by the different technologies, using metrics such as maximum, minimum, covariance, standard deviation, mean, multicollinearity and variance, as well as several types of regression models
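
    The descriptive metrics listed above can be sketched in Python for comparison with the RStudio, SPSS and SAS/Stata workflows the article covers (the dataset, column names and the statsmodels-based regression are assumptions):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical dataset; column names are illustrative.
df = pd.read_csv("sales.csv")                 # columns: price, advertising, revenue

print(df.describe())                          # min, max, mean, std per column
print(df.cov())                               # covariance matrix

# Multicollinearity check via variance inflation factors, then a simple OLS fit.
X = sm.add_constant(df[["price", "advertising"]])
print("VIF:", [variance_inflation_factor(X.values, i) for i in range(X.shape[1])])

model = sm.OLS(df["revenue"], X).fit()
print(model.summary())
```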

    Fine Tuning Transformer Models for Domain Specific Feature Extraction

    Get PDF
    The nature of Natural Language Processing has changed drastically in recent years. Large Language Models pre-trained on vast amounts of unlabelled data have opened the door to a new level of text comprehension, and research in the area has shifted towards exploiting these large models to obtain better results on smaller tasks, making fine-tuning increasingly important. By fine-tuning large language models with context- and task-specific data, the models quickly learn to follow patterns and generalize to new concepts; they understand natural language to a great extent and can capture relationships across words, phrases, and paragraphs. Fine-tuning has thus become an increasingly important way to make machine learning solutions usable with few resources. At the same time, the growing number of pre-trained transformer models for Natural Language Processing has complicated model selection and experimentation, increasing research and experimentation time. This study reviews the current state of the art of transformer models and examines their scope and applicability. Building on this initial work, the paper produces a comprehensive model fine-tuning pipeline that allows the user to easily obtain a ready-to-use model for a natural language task. The pipeline is then tested and evaluated on the automatic extraction of features (i.e. functionalities) from mobile applications using available natural language documents, such as descriptions
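
    A minimal sketch of the kind of fine-tuning step such a pipeline automates, using the Hugging Face transformers and datasets libraries (the checkpoint, the two-sentence toy dataset and the binary feature/non-feature labels are assumptions, not the study's actual setup):

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Toy dataset: app-description sentences labelled 1 if they describe a feature.
data = Dataset.from_dict({
    "text": ["Scan documents with your camera", "Read our privacy policy"],
    "label": [1, 0],
})

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

tokenized = data.map(tokenize, batched=True)

args = TrainingArguments(output_dir="out", num_train_epochs=1,
                         per_device_train_batch_size=2, logging_steps=1)
Trainer(model=model, args=args, train_dataset=tokenized).train()
```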

    Caching-based Multicast Message Authentication in Time-critical Industrial Control Systems

    Full text link
    Attacks against industrial control systems (ICSs) often exploit the insufficiency of authentication mechanisms. Verifying whether the received messages are intact and issued by legitimate sources can prevent malicious data/command injection by illegitimate or compromised devices. However, the key challenge is to introduce message authentication for various ICS communication models, including multicast or broadcast, with a messaging rate that can be as high as thousands of messages per second, within very stringent latency constraints. For example, certain commands for protection in smart grids must be delivered within 2 milliseconds, ruling out public-key cryptography. This paper proposes two lightweight message authentication schemes, named CMA and its multicast variant CMMA, that perform precomputation and caching to authenticate future messages. With minimal precomputation and communication overhead, C(M)MA eliminates all cryptographic operations for the source after the message is given, and all expensive cryptographic operations for the destinations after the message is received. C(M)MA considers the urgency profile (or likelihood) of a set of future messages for even faster verification of the most time-critical (or likely) messages. We demonstrate the feasibility of C(M)MA in an ICS setting based on a substation automation system in smart grids. Comment: For viewing INFOCOM proceedings in IEEE Xplore see https://ieeexplore.ieee.org/abstract/document/979676
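
    A toy illustration of the precompute-and-cache intuition behind C(M)MA, using a plain HMAC so that no cryptographic work is left on the hot path once a message must actually be sent or verified (this is not the C(M)MA construction itself; the key, command strings and function names are made up):

```python
import hashlib
import hmac

KEY = b"shared-secret"          # illustrative key shared by source and destinations

def tag(msg: bytes) -> bytes:
    return hmac.new(KEY, msg, hashlib.sha256).digest()

# Source side: precompute tags for anticipated commands (most urgent first),
# so sending a command later requires no cryptographic operation at all.
anticipated = [b"TRIP breaker-07", b"CLOSE breaker-07", b"STATUS breaker-07"]
cache = {m: tag(m) for m in anticipated}

def send(msg: bytes):
    return msg, cache[msg]                    # cache hit: no crypto at send time

# Destination side: the same precomputed cache reduces verification of a
# time-critical message to a constant-time comparison.
def verify(msg: bytes, received_tag: bytes) -> bool:
    return hmac.compare_digest(cache.get(msg, b""), received_tag)

msg, t = send(b"TRIP breaker-07")
print(verify(msg, t))                         # True
```

    A single shared HMAC key would not by itself authenticate the source to many multicast receivers, so this sketch only conveys the caching idea, not the scheme's security properties.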

    Graph neural networks for seizure discrimination based on electroencephalogram analysis

    Get PDF
    This study investigates the classification of Psychogenic Non-Epileptic Seizures (PNES) and Epileptic Seizures (ES) using EEG data and Graph Neural Networks (GNN). The proposed model demonstrates outstanding performance, surpassing previous state-of-the-art results and achieving remarkable accuracy in ternary classification. By utilizing a GNN architecture, the model effectively distinguishes between PNES and ES with an accuracy of 92.9%. Moreover, when employing Leave One Group Out cross-validation, the model achieves an even higher accuracy of 97.58%, outperforming the highest reported state-of-the-art accuracy of 94.4%. Furthermore, by extending the classification to include healthy patients, the model achieves an accuracy of 91.12%, surpassing the best-known state-of-the-art accuracy of 85.7%. These findings highlight the potential of the model to accurately classify and differentiate these medical conditions using EEG data. Future work includes the exploration of biomarkers for binary classification using the model's explainability capabilities, contributing to the development of objective diagnostic tools and personalized treatment strategies. Additionally, this study compares the performance, methodologies, and datasets of similar state-of-the-art studies, providing a comprehensive overview of seizure classification research. In conclusion, this study demonstrates the success of the proposed model in classifying PNES and ES, paving the way for further advances in the field and benefiting patients and healthcare practitioners in diagnosis and treatment
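
    A minimal sketch of a graph classifier of the kind the study describes, using PyTorch Geometric (the electrode count, node-feature dimension, layer sizes and the three output classes are assumptions, not the study's architecture):

```python
import torch
import torch.nn.functional as F
from torch.nn import Linear
from torch_geometric.nn import GCNConv, global_mean_pool

class EEGGraphClassifier(torch.nn.Module):
    """Toy GNN: two graph-convolution layers, mean pooling over electrodes,
    and a linear head for the three assumed classes (ES / PNES / healthy)."""
    def __init__(self, num_node_features, num_classes=3, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(num_node_features, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.out = Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        x = F.relu(self.conv1(x, edge_index))
        x = F.relu(self.conv2(x, edge_index))
        x = global_mean_pool(x, batch)        # one embedding per EEG recording
        return self.out(x)

# Toy example: 19 electrodes as nodes, 8 spectral features per node.
x = torch.randn(19, 8)
edge_index = torch.tensor([[0, 1], [1, 0]])   # placeholder connectivity (2 edges)
model = EEGGraphClassifier(num_node_features=8)
logits = model(x, edge_index, batch=torch.zeros(19, dtype=torch.long))
print(logits.shape)                           # torch.Size([1, 3])
```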