
    Méthode de découverte de sources de données tenant compte de la sémantique en environnement de grille de données

    Data grid applications today share a huge number of data sources in an unstable environment where a data source may join or leave the system at any time. These data sources are heterogeneous, autonomous, and distributed at large scale: they are independently developed and managed, and geographically scattered. In this environment, efficiently discovering the data sources relevant to query execution is a challenge because of source heterogeneity, the scale of the environment, and system instability. Early work on data source discovery relied on keyword search. These solutions are unsatisfactory because they ignore the semantic heterogeneity of data sources. Other solutions therefore propose a global schema or a global ontology. However, designing such a schema or ontology is a complex task because of the number of data sources. Still other solutions use correspondences between data source schemas, or rely on domain ontologies and establish mapping relations between them. All of these solutions impose a fixed topology, either on the schema correspondences or on the mapping relations. However, defining mapping relations between domain ontologies is a difficult task, and imposing a fixed topology is a major drawback. In this thesis, we propose a data source discovery method that takes semantic heterogeneity into account in an unstable, large-scale environment. To that end, we associate a Virtual Organisation (VO) and a domain ontology with each domain, and we rely on the mapping relations that exist between these ontologies. We make no assumption about the topology of the mapping relations, except that the graph they form is connected. We define an addressing system that provides permanent access from any VOi to any other VOj (with i ≠ j) despite the dynamicity of peers. We also present a 'lazy' maintenance method that limits the number of messages needed to maintain the addressing system when peers connect or disconnect. To study the feasibility and viability of our proposals, we carry out a performance evaluation.

    Quality analyses and improvement for fuzzy clustering and web personalization

    Web mining researchers and practitioners keep innovating and creating new technologies to help web site managers efficiently improve their web-based services and to facilitate information retrieval by web site users. The increasing amount of information and services offered through the Web, coupled with the increase in web-based transactions, calls for systems that can handle gigantic amounts of usage information efficiently while providing good predictions, recommendations, and personalization of web sites. In this thesis, we first focus on clustering to obtain a usage model from weblog data and investigate ways to improve the clustering quality. We also consider applications and focus on generating predictions through collaborative filtering, which matches the behavior of a current user with that of past like-minded users. To provide dependable performance analysis and improve clustering quality, we study four fuzzy clustering algorithms and compare their effectiveness and efficiency in web prediction. Dependability aspects led us further to investigate the objectivity of validity indices and to choose a more objective index for assessing the relative performance of the clustering techniques. We also use appropriate statistical testing methods in our experiments to distinguish real differences from those that may be due to sampling or other errors. Our results reconfirm some of the claims made previously about these clustering and prediction techniques, while also suggesting the need to assess both cluster validation and prediction quality for a sound comparison of the clustering techniques. To assess the quality of aggregate usage profiles (UPs), we devised a set of criteria that reflect the semantic characterization of UPs and help avoid resorting to subjective human judgment in assessing UPs and clustering quality. We formulate each of these criteria as a computable measure for individual UPs as well as for groups of UPs. We applied these criteria in the final phase of fuzzy clustering. The soundness and usability of the criteria have been confirmed through a user survey.

    Genetic Studies of Agronomic Characters in Winter Wheat (Triticum Aestivum L.)

    Crop Science

    La gestion du fonds de roulement des PME en période de récession économique


    Possibilities for the analysis of fruit and vegetable consumption based on a transtheoretical dynamic COM-B model

    The objective of the transtheoretical dynamic COM-B (Capability, Opportunity, Motivation-Behavior) model is to understand why people take risks with their health and why they do not follow instructions meant to protect it. The model was developed as the central part of a larger behavioral system called the Behavior Change Wheel (BCW), whose goal is to give the designers of an intervention factual data throughout the process that leads from the behavioral analysis of the problem to the planning of the intervention. The COM-B model has been successfully applied in many cases. For increasing the consumption of fruits and vegetables, an essential condition for behavior change is that people have the capability, opportunity, and motivation to change. Behavior was measured by annual per capita spending on vegetables, potatoes, and fruits, based on the HKF (Household Budget Surveys) published in the STADAT system of the Hungarian Central Statistical Office. It was assumed that capability can be approximated by expenditure on “Higher education”, opportunity by expenditure on “Gardens, plants and flowers”, and motivation by expenditure on “Sport, camping goods”, “Indoor sports equipment”, and “Sports equipment, camping equipment”. A correlation was demonstrated between expenditure on fruits, vegetables, and potatoes and expenditure on flowers, gardening, and sports; however, no correlation was found with money spent on higher education.

    Performance Prediction Upon Toolchain Migration in Model-Based Software

    Changing the development environment can have severe impacts on system behavior, such as execution-time performance. Since migrating a software application can be costly, engineers would like to predict the performance parameters of the application under the new environment with as little effort as possible. In this work, we concentrate on model-driven development and provide a methodology to estimate the execution-time performance of application models under different toolchains. Our approach has a low cost compared to the effort of migrating an entire application. As part of the approach, we provide methods for characterizing model-driven applications, an algorithm for generating application-specific microbenchmarks, and results on using different methods for estimating the performance. We focus on SCADE as the development toolchain and use a Cruise Control and a Water Level application as case studies to confirm the technical feasibility and viability of our technique.

    Uncertainty Estimation for Molecules: Desiderata and Methods

    Graph Neural Networks (GNNs) are promising surrogates for quantum mechanical calculations, as they achieve unprecedentedly low errors on collections of molecular dynamics (MD) trajectories. Thanks to their fast inference times, they promise to accelerate computational chemistry applications. Unfortunately, despite low in-distribution (ID) errors, such GNNs might be horribly wrong for out-of-distribution (OOD) samples. Uncertainty estimation (UE) may aid in such situations by communicating the model's certainty about its prediction. Here, we take a closer look at the problem and identify six key desiderata for UE in molecular force fields: three 'physics-informed' and three 'application-focused' ones. To give an overview of the field, we survey existing UE methods and analyze how well they fit these desiderata. From our analysis, we conclude that none of the previous works satisfies all criteria. To fill this gap, we propose Localized Neural Kernel (LNK), a Gaussian Process (GP)-based extension to existing GNNs that satisfies the desiderata. In our extensive experimental evaluation, we test four different UE methods with three different backbones and two datasets. In out-of-equilibrium detection, we find that LNK yields up to 2.5 and 2.1 times lower errors in terms of AUC-ROC score than dropout or evidential regression-based methods, while maintaining high predictive performance.
    Comment: Published as conference paper at ICML 202

    Primary Plasmacytoma of The Testis with no Evidence of Multiple Myeloma: a New Case Report and Literature Review

    Plasmacytomas of the testis are extremely rare tumours, especially when they occur in the absence of a previous or concurrent diagnosis of multiple myeloma. We report a new case of solitary testicular plasmacytoma, with immunohistochemical studies showing monoclonal cytoplasmic production of IgG lambda light chains, in a 51-year-old man who had no evidence of multiple myeloma 3 years after the orchiectomy.
    Key Words: Testis, plasmacytoma, multiple myeloma