39 research outputs found
C-SPARQL Extension for Sampling RDF Graphs Streams
Our daily use of the Internet and related technologies continuously generates large amounts of heterogeneous data flows. Several RDF Stream Processing (RSP) systems have been proposed. Existing RSP systems benefit from the advantages of semantic web technologies and traditional data flow management systems. C-SPARQL, CQELS, SPARQLstream, EP-SPARQL, and Sparkwave extend the semantic query language SPARQL and are examples of those systems. Because storing and processing all of these streams becomes expensive, we propose a solution that reduces the load while preserving data semantics and optimizing processing. In this paper, we propose to extend C-SPARQL to continuously generate samples over RDF graphs. We add three sampling operators (UNIFORM, RESERVOIR and CHAIN) to the C-SPARQL query syntax. These operators have been implemented in Esper, C-SPARQL's data flow management module. The experiments show the performance of our extension in terms of execution time and preservation of data semantics.
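To make the RESERVOIR operator named in this abstract more concrete, here is a minimal sketch of classic reservoir sampling (Algorithm R) over a stream of RDF triples. It is not the paper's Esper implementation, which the abstract does not show; the `Triple` alias, the function name `reservoir_sample` and the `seed` parameter are assumptions made for illustration.

```python
import random
from typing import Iterable, List, Tuple

# Hypothetical representation of an RDF triple: (subject, predicate, object).
Triple = Tuple[str, str, str]


def reservoir_sample(stream: Iterable[Triple], k: int, seed: int = 0) -> List[Triple]:
    """Keep a uniform random sample of at most k triples from a stream.

    Generic Algorithm R, not the C-SPARQL operator itself.
    """
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    reservoir: List[Triple] = []
    for i, triple in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(triple)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = triple
    return reservoir
```

The sample stays at size `k` regardless of stream length, which is what lets a sampling operator bound memory while the stream keeps arriving.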
Changes in the Molecular Epidemiology of Pediatric Bacterial Meningitis in Senegal After Pneumococcal Conjugate Vaccine Introduction.
BACKGROUND: Bacterial meningitis is a major cause of mortality among children under 5 years of age. Senegal is part of World Health Organization-coordinated sentinel site surveillance for pediatric bacterial meningitis. We conducted this analysis to describe the epidemiology and etiology of bacterial meningitis among children younger than 5 years in Senegal from 2010 to 2016. METHODS: Children who met the inclusion criteria for suspected meningitis at the Centre Hospitalier National d'Enfants Albert Royer, Senegal, from 2010 to 2016 were included. Cerebrospinal fluid specimens were collected from suspected cases and examined by routine bacteriology and molecular assays. Serotyping, antimicrobial susceptibility testing, and whole-genome sequencing were performed. RESULTS: A total of 1013 children were admitted with suspected meningitis during the surveillance period. Streptococcus pneumoniae, Neisseria meningitidis, and Haemophilus accounted for 66% (76/115), 25% (29/115), and 9% (10/115) of all confirmed cases, respectively. Most of the suspected (63%; 639/1013) and laboratory-confirmed (57%; 66/115) cases occurred during the first year of life. The case fatality rate of pneumococcal meningitis was 6-fold higher than that of meningococcal meningitis (28% vs 5%). The predominant pneumococcal lineage causing meningitis was sequence type 618 (n = 7), commonly found among serotype 1 isolates. An ST 2174 lineage that included serotypes 19A and 23F was resistant to trimethoprim-sulfamethoxazole. CONCLUSIONS: Pneumococcal meningitis has declined since the introduction of the pneumococcal conjugate vaccine in Senegal. However, disease caused by pathogens covered by vaccines in widespread use still persists. Continued effective monitoring of vaccine-preventable meningitis is needed.
Long-term cellular immunity of vaccines for Zaire Ebola Virus Diseases
Recent Ebola outbreaks underscore the importance of continuous prevention and disease control efforts. Authorized vaccines include Merck’s Ervebo (rVSV-ZEBOV) and Johnson & Johnson’s two-dose combination (Ad26.ZEBOV/MVA-BN-Filo). Here, in a five-year follow-up of the PREVAC randomized trial (NCT02876328), we report the results of the immunology ancillary study of the trial. The primary endpoint is the evaluation of long-term memory T-cell responses induced by three vaccine regimens: Ad26–MVA, rVSV, and rVSV–booster. Polyfunctional EBOV-specific CD4+ T-cell responses increase after Ad26 priming and are further boosted by MVA, whereas minimal responses are observed in the rVSV groups, declining after one year. In vitro expansion for eight days shows sustained EBOV-specific T-cell responses for up to 60 months post-prime vaccination with both Ad26–MVA and rVSV, with no decline. Cytokine production analysis identifies shared biomarkers between the Ad26–MVA and rVSV groups. As a secondary endpoint, we observed an elevation of pro-inflammatory cytokines at Day 7 in the rVSV group. Finally, we establish a correlation between EBOV-specific T-cell responses and anti-EBOV IgG responses. Our findings can guide booster vaccination recommendations and help identify populations likely to benefit from revaccination.
Forgetting Functions and Summaries in Data Warehouses (Fonctions d'oubli et résumés dans les entrepôts de données)
The amount of data stored in data warehouses grows so quickly that they become saturated. The usual solution is to archive older data when new data arrive and no space is left. This solution is not satisfactory because data mining analyses based on long-term historical data become impossible. Indeed, data mining analysis cannot be performed on archived data without reloading them into the data warehouse, and the cost of loading back a large archived dataset is too high to be justified for a single analysis. Archived data must therefore be considered lost for data mining applications. In this thesis, we propose a solution to this problem: a language is defined to specify forgetting functions on older data. The specifications determine which data should be present in the data warehouse at each point in time and include the definition of summaries of the deleted data. These summaries are aggregates and samples of the deleted data and are kept in the data warehouse. The goal of the forgetting functions is to control the size of the data warehouse, and this control is provided for both the aggregate summaries and the samples. The specification language for forgetting functions is defined in the context of relational databases. Once forgetting functions have been specified, the data warehouse is automatically updated to follow the specifications. This thesis presents the specification language, the structure of the summaries, the algorithms that update the data warehouse, and mechanisms for querying and analyzing the summaries of historical data.
Télécom ParisTech, Paris, France (Sudoc)
Fast SPARQL join processing between distributed streams and stored RDF graphs using bloom filters
The growth of real-time data generation and of stored data keeps the three V's of big data (volume, velocity and variety) constantly in view. Existing RDF Stream Processing (RSP) systems have addressed the variety challenge by defining a common model for producing, transmitting and continuously querying data in the RDF model. On the volume and velocity side, the performance of RSP systems needs to improve, particularly for joins between stored and streaming RDF graphs. Stored RDF data are very important in a streaming context (related ontologies, summarized RDF data, RDF data that do not evolve or evolve very slowly over time, etc.), but existing RSP systems such as C-SPARQL, CQELS, SPARQLstream, EP-SPARQL and Sparkwave use non-optimized and non-scalable approaches for joining stored and dynamic RDF data. Indeed, these systems need to read the entire local or remote stored RDF data sets while RDF data streams continuously arrive and need to be processed in near real time. This latency may degrade continuous processing performance and often causes multiple bottlenecks within the network in a distributed environment. It also makes it impractical to refresh or update the stored contents. This paper proposes an approach for distributed real-time joins between stored and streaming RDF graphs using Bloom filters. The join procedure speeds up processing by greatly reducing intermediate results, storing indices in memory, and precomputing query partitions according to the SPARQL query variable(s) selected between the two kinds of RDF data. Experimental results confirm the performance gains of our approach, which significantly speeds up query processing compared to current RSP techniques.
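The core idea in this abstract, using a Bloom filter to discard streaming triples that cannot join with stored data, can be sketched as follows. This is a single-process illustration under stated assumptions, not the paper's distributed implementation: the `BloomFilter` class, the function `join_stream_with_stored`, the filter sizes, and the choice of the triple subject as the join variable are all hypothetical.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical representation of an RDF triple: (subject, predicate, object).
Triple = Tuple[str, str, str]


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def join_stream_with_stored(
    stream: Iterable[Triple], stored: Iterable[Triple]
) -> Iterator[Tuple[Triple, Triple]]:
    """Join streaming and stored triples on their subject (assumed join variable)."""
    index: Dict[str, List[Triple]] = {}
    bloom = BloomFilter()
    for triple in stored:
        index.setdefault(triple[0], []).append(triple)
        bloom.add(triple[0])
    for triple in stream:
        if triple[0] not in bloom:
            continue  # cheap rejection: this subject is definitely not stored
        # The exact index lookup removes any Bloom false positives.
        for match in index.get(triple[0], []):
            yield triple, match
```

In a distributed setting the appeal is that only the compact filter needs to travel to the stream side, so most non-matching triples are dropped before any lookup against the stored data.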
Efficient Graph-Oriented Summary for Optimized Resource Description Framework Streams Processing Using Extended Centrality Measures
Existing RDF Stream Processing (RSP) systems allow continuous processing of RDF data from different application domains such as weather stations measuring phenomena, geolocation, IoT applications, drinking water distribution management and so on. However, the processing window often expires before the entire session is finished, and RSP systems delete data streams immediately after each processed window. Such a mechanism does not allow optimized exploitation of the RDF data streams, as the most relevant and pertinent information in the data is often not used in due time and is almost impossible to exploit in further analyses. It would be better to keep the most informative part of the data within streams while minimizing memory storage space. In this work, we propose an RDF graph summarization system based on explicitly and implicitly expressed needs, through three main approaches: (1) an approach that analyzes user queries (SPARQL) in order to extract their needs and group them into a more global query, (2) an extension of the closeness centrality measure from Social Network Analysis (SNA) to determine the most informative parts of the graph, and (3) an RDF graph summarization technique combining the extracted user query needs and the extended centrality measure. Experiments and evaluations show efficient results in terms of memory storage space and approximate query results on the summarized graphs that are close to those obtained on the source graphs.
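As background for approach (2), the sketch below computes the standard closeness centrality of each node in an RDF graph, treating every triple as an undirected edge between subject and object. The paper's extended measure is not described in the abstract and is not reproduced here; the function name and the undirected, unweighted treatment are assumptions.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

# Hypothetical representation of an RDF triple: (subject, predicate, object).
Triple = Tuple[str, str, str]


def closeness_centrality(triples: Iterable[Triple]) -> Dict[str, float]:
    """Standard closeness: reachable nodes divided by the sum of their distances."""
    adjacency: Dict[str, Set[str]] = {}
    for subject, _predicate, obj in triples:
        adjacency.setdefault(subject, set()).add(obj)
        adjacency.setdefault(obj, set()).add(subject)

    scores: Dict[str, float] = {}
    for source in adjacency:
        # Breadth-first search gives shortest hop counts from the source.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total = sum(distance.values())
        scores[source] = (len(distance) - 1) / total if total > 0 else 0.0
    return scores
```

Nodes with the highest scores are, on average, the fewest hops from everything else, which is one plausible reading of "most informative parts of the graph" when choosing what a summary should retain.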
Shifting Patterns of Influenza Circulation during the COVID-19 Pandemic, Senegal
Historically low levels of seasonal influenza circulation were reported during the first years of the COVID-19 pandemic and were mainly attributed to implementation of nonpharmaceutical interventions. In tropical regions, influenza’s seasonality differs largely, and data on this topic are scarce. We analyzed data from Senegal’s sentinel syndromic surveillance network before and after the start of the COVID-19 pandemic to assess changes in influenza circulation. We found that influenza shows year-round circulation in Senegal and has 2 distinct epidemic peaks: during January–March and during the rainy season in August–October. During 2021–2022, the expected January–March influenza peak completely disappeared, corresponding to periods of active SARS-CoV-2 circulation. We noted an unexpected influenza epidemic peak during May–July 2022. The observed reciprocal circulation of SARS-CoV-2 and influenza suggests that factors such as viral interference might be at play and should be further investigated in tropical settings
Analysis of a Dengue Virus Outbreak in Rosso, Senegal 2021
Senegal is hyperendemic for dengue. Since 2017, outbreaks have been reported annually in many regions of the country, marked by the co-circulation of DENV1-3. On 8 October 2021, a dengue virus outbreak at the Rosso health post (a sentinel site of the syndromic surveillance network), located in the north of the country, was notified to the WHO Collaborating Center for arboviruses and hemorrhagic fever viruses at Institut Pasteur de Dakar. A multidisciplinary team was then sent for epidemiological and virologic investigations. This study describes the results of investigations during an outbreak in Senegal using a rapid diagnostic test (RDT) for the combined detection of dengue virus non-structural protein 1 (NS1) and IgM/IgG. For confirmation, samples were also tested by real-time RT-PCR and IgM ELISA at the reference lab in Dakar. qRT-PCR-positive samples were subjected to whole genome sequencing using nanopore technology. Virologic analysis identified 102 positive cases (RT-PCR, NS1 antigen detection and/or IgM) out of 173 enrolled patients. Virus serotyping showed that the outbreak was caused by DENV-1, a serotype different from the DENV-2 involved in the outbreak in Rosso three years earlier, indicating a serotype replacement. Nearly all field-tested NS1-positive samples were confirmed by qRT-PCR, with a concordance of 92.3%. Whole genome sequencing and phylogenetic analysis of strains suggested a re-introduction in Rosso of a DENV-1 strain different from the one responsible for the outbreak in the Louga area five years before. These findings call for improved dengue virus surveillance in Senegal, with wide deployment of DENV antigenic tests, which allow easy on-site diagnosis of suspected cases and early detection of outbreaks. This work highlights the need for continuous monitoring of circulating serotypes, which is crucial for a better understanding of viral epidemiology around the country.