74 research outputs found

    SQL Query Completion for Data Exploration

    Full text link
    Within the big data tsunami, relational databases and SQL are still there and remain mandatory in most cases for accessing data. On the one hand, SQL is easy to use by non-specialists and allows pertinent initial data to be identified at the very beginning of the data exploration process. On the other hand, it is not always easy to formulate SQL queries: it is increasingly common to have several databases available for one application domain, some of them with hundreds of tables and/or attributes. Identifying the pertinent conditions to select the desired data, or even identifying the relevant attributes, is far from trivial. To make it easier to write SQL queries, we propose the notion of SQL query completion: given a query, it suggests additional conditions to be added to its WHERE clause. This completion is semantic, as it relies on the data in the database, unlike current completion tools that are mostly syntactic. Since the process can be repeated over and over again -- until the data analyst reaches her data of interest -- SQL query completion facilitates the exploration of databases. SQL query completion has been implemented in a SQL editor on top of a database management system. For the evaluation, two questions need to be studied: first, does the completion speed up the writing of SQL queries? Second, is the completion easily adopted by users? A thorough experiment has been conducted on a group of 70 computer science students divided into two groups (one with the completion and the other without) to answer those questions. The results are positive and very promising.
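
    For illustration only: the abstract does not detail the completion algorithm, but a data-driven (rather than purely syntactic) suggestion of WHERE conditions could look like the minimal Python sketch below. The function name, the sqlite3 toy table and the frequency heuristic are all assumptions made for the example, not the paper's method.

```python
import sqlite3
from collections import Counter

def suggest_conditions(conn, base_query, max_suggestions=3):
    """Suggest candidate WHERE conditions by inspecting the data returned by
    the current query: a data-driven completion rather than a syntactic one."""
    cur = conn.execute(base_query)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    scored = []
    for i, col in enumerate(columns):
        # Frequent values of a column in the current result set become
        # candidate equality predicates for the WHERE clause.
        counts = Counter(row[i] for row in rows if row[i] is not None)
        for value, freq in counts.most_common(1):
            if 0 < freq < len(rows):  # skip conditions that would keep everything
                scored.append((f"{col} = {value!r}", freq))
    scored.sort(key=lambda s: -s[1])
    return [cond for cond, _ in scored[:max_suggestions]]

# The analyst starts from a broad query and narrows it with a suggestion.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits(id INTEGER, unit TEXT, age INTEGER);
    INSERT INTO visits VALUES (1, 'ICU', 64), (2, 'ICU', 71), (3, 'ward', 35);
""")
print(suggest_conditions(conn, "SELECT * FROM visits"))
# e.g. ["unit = 'ICU'", ...] -- the analyst adds one condition and re-runs the query
```

    Repeating this loop (run query, pick a suggested condition, run again) is what the abstract describes as iterative exploration until the data of interest is reached.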

    Arrowhead compliant virtual market of energy

    Get PDF
    © 2014 IEEE. Industrial processes use energy to transform raw materials and intermediate goods into final products. Many efforts have been made to minimize energy costs in industrial plants. Apart from working on 'how' an industrial process is implemented, it is possible to reduce energy costs by focusing on 'when' it is performed. Although some manufacturing plants (e.g. refining or petrochemical plants) can be inflexible with respect to time, due to interdependencies between processes that must be respected for performance and safety reasons, other industrial segments, such as alumina plants or discrete manufacturing, have more degrees of flexibility. These manufacturing plants can consider a more flexible scheduling of the most energy-intensive processes in response to dynamic prices and the overall condition of the electricity market. In this scenario, requests for energy can be encoded by means of a formal structure called a flex-offer, then aggregated (joining several flex-offers into a bigger one) and sent to the market, scheduled, disaggregated and transformed into consumption plans and, eventually, into production schedules for a given industrial plant. In this paper, we describe the flex-offer concept and how it can be applied to industrial and home automation scenarios. The architecture proposed in this paper aims to be adaptable to multiple scenarios (industrial, home and building automation, etc.), thus providing the foundations for different concept implementations using multiple technologies or supporting various kinds of devices.
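
    As a rough illustration of the flex-offer idea described above, here is a minimal Python sketch of an energy request with a time window and an energy profile, plus a naive aggregation that joins several flex-offers into a bigger one. The field names, the intersection-based aggregation rule and the example values are assumptions made for the sketch, not the Arrowhead data model.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FlexOffer:
    """Illustrative flex-offer: an energy request that may be shifted in time."""
    earliest_start: int       # earliest hour the load may start
    latest_start: int         # latest hour the load may start
    profile_kwh: List[float]  # per-hour energy amounts once started

def aggregate(offers: List[FlexOffer]) -> FlexOffer:
    """Naive aggregation: the joint time flexibility is the intersection of
    the individual start windows, and the profiles are summed hour by hour."""
    earliest = max(o.earliest_start for o in offers)
    latest = min(o.latest_start for o in offers)
    horizon = max(len(o.profile_kwh) for o in offers)
    profile = [sum(o.profile_kwh[h] if h < len(o.profile_kwh) else 0.0
                   for o in offers) for h in range(horizon)]
    return FlexOffer(earliest, latest, profile)

# Two shiftable loads aggregated before being sent to the market.
oven = FlexOffer(earliest_start=8, latest_start=12, profile_kwh=[2.0, 2.0])
pump = FlexOffer(earliest_start=9, latest_start=14, profile_kwh=[1.5])
print(aggregate([oven, pump]))  # FlexOffer(earliest_start=9, latest_start=12, ...)
```

    The aggregated offer is what would be scheduled by the market and later disaggregated back into individual consumption plans.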

    An Energy Flexibility Framework on The Internet of Things

    Get PDF
    This paper presents a framework for the management of flexible energy loads in the context of the Internet of Things and the Smart Grid. The framework was developed within the European Arrowhead project and aims at taking advantage of the flexibility (in time and power) of energy production and consumption offered by sets of devices, appliances or buildings, to help solve the issue of fluctuating energy production from renewables. The underlying concepts are explained; the actors involved in the framework, their incentives and their interactions are detailed; and a technical overview is provided. An implementation of the framework is presented, as well as the expected results of the pilots.

    ENCOURAGEing Results on ICT for Energy Efficient Buildings

    Get PDF
    21st IEEE International Conference on Emerging Technologies & Factory Automation (ETFA 2016), 6-9 September 2016, Berlin, Germany. This paper presents how the ICT infrastructure developed in the European ENCOURAGE project, centered around a message-oriented middleware, enabled energy savings in buildings and households. The components of the middleware, as well as the supervisory control strategy, are overviewed to support the presentation of the results and how they were achieved. The main results are presented for three of the project's pilots: the first consisting of a single household, the second of a residential neighborhood, and the third of a university campus.

    Alterations in Gut Microbiome in Cirrhosis as Assessed by Quantitative Metagenomics: Relationship With Acute-on-Chronic Liver Failure and Prognosis

    Get PDF
    Background and Aims: Cirrhosis is associated with changes in gut microbiome composition. Although acute-on-chronic liver failure (ACLF) is the most severe clinical stage of cirrhosis, there is a lack of information about gut microbiome alterations in ACLF using quantitative metagenomics. We investigated the gut microbiome in patients with cirrhosis encompassing the whole spectrum of disease (compensated, acutely decompensated without ACLF, and ACLF). A group of healthy subjects was used as control subjects. Methods: Stool samples were collected prospectively in 182 patients with cirrhosis. DNA library construction and sequencing were performed using the Ion Proton Sequencer (ThermoFisher Scientific, Waltham, MA). Microbial genes were grouped into clusters, denoted as metagenomic species. Results: Cirrhosis was associated with a remarkable reduction in gene and metagenomic species richness compared with healthy subjects. This loss of richness correlated with disease stage, was particularly marked in patients with ACLF, and persisted after adjustment for antibiotic therapy. ACLF was associated with a significant increase of Enterococcus and Peptostreptococcus sp and a reduction of some autochthonous bacteria. Gut microbiome alterations correlated with model for end-stage liver disease and Child-Pugh scores and organ failure, and were associated with some complications, particularly hepatic encephalopathy and infections. Interestingly, the gut microbiome predicted 3-month survival with good, stable predictors. Functional analysis showed that patients with cirrhosis had enriched pathways related to ethanol production, γ-aminobutyric acid metabolism, and endotoxin biosynthesis, among others. Conclusions: Cirrhosis is characterized by marked alterations in gut microbiome that parallel disease stages, with maximal changes in ACLF. The altered gut microbiome was associated with complications of cirrhosis and with survival. The gut microbiome may contribute to disease progression and poor prognosis. These results should be confirmed in future studies.

    Sélection de données guidée pour les modèles prédictifs

    No full text
    Databases and machine learning (ML) have historically evolved as two separate domains: while databases are used to store and query data, ML is devoted to inferring predictive models, clustering, etc. Despite its apparent simplicity, the "data preparation" step of ML applications turns out to be the most time-consuming step in practice. Interestingly, this step is the bridge between databases and ML. In this setting, we raise and address three main problems related to data selection for building predictive models. First, the database usually contains more than the data of interest: how can the data the analyst wants be separated from the data she does not want? We propose to see this problem as imbalanced classification between the tuples of interest and the rest of the database, and we develop an undersampling method based on the functional dependencies of the database. Second, we discuss the writing of the query returning the tuples of interest. We propose a SQL query completion solution based on data semantics that starts from a very general query and helps an analyst refine it until she selects her data of interest. This process aims at helping the analyst design the query that will eventually select the data she requires. Third, assuming the data has successfully been extracted from the database, the next natural question follows: is the selected data suited to answering the considered ML problem? Since obtaining a predictive model from the features to the class to predict amounts to providing a function, we point out that it makes sense to first assess the existence of that function in the data. This existence can be studied through the prism of functional dependencies, and we show how they can be used to understand a model's limitations and to refine the initial data selection if necessary.
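
    On the third problem, a minimal Python sketch of what assessing the existence of that function via functional dependencies can mean in practice: detect rows that agree on the features but disagree on the class. The function name fd_violations and the toy data are assumptions made for the example; the thesis's actual method may differ.

```python
from typing import Iterable, List, Sequence, Tuple

def fd_violations(rows: Iterable[Sequence], feature_idx: Tuple[int, ...],
                  class_idx: int) -> List[Tuple]:
    """Check whether the selected features functionally determine the class:
    if two rows agree on the features but disagree on the class, no function
    from features to class exists on this data (illustrative sketch only)."""
    seen = {}
    violations = []
    for row in rows:
        key = tuple(row[i] for i in feature_idx)
        label = row[class_idx]
        if key in seen and seen[key] != label:
            violations.append(key)
        seen.setdefault(key, label)
    return violations

data = [
    ("sunny", 30, "play"),
    ("sunny", 30, "stay"),   # same features, different class -> violation
    ("rainy", 18, "stay"),
]
print(fd_violations(data, feature_idx=(0, 1), class_idx=2))  # [('sunny', 30)]
```

    If such violations exist, no predictive model can fit the data perfectly on these features, which is one way to understand a model's limitations and to refine the initial data selection.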

    Langages de requêtes interactifs pour l'exploration de données

    No full text
    National audience. In the current context of a data deluge, finding ways to navigate databases efficiently is a key challenge. By proposing solutions based on well-known query languages such as SQL, and by borrowing methods from machine learning, this thesis aims to tackle this challenge by bringing databases and machine learning closer together.