14 research outputs found

    Identification and mapping of potential recharge in the Middle Seybouse sub-catchment of the Guelma region (North East of Algeria): contribution of remote sensing, multi-criteria analysis, ROC-Curve and GIS

    Rapid population growth, intensive agricultural practices, and industrial development in the Middle Seybouse sub-catchment of North-East Algeria have placed its water resources under significant pressure, in terms of both quantity and quality. The aim of this study is to identify the most suitable areas for groundwater recharge in the Middle Seybouse sub-catchment, which covers about 770.91 km², using remote sensing data and Geographical Information Systems (GIS). Six factors are recognized as positively affecting groundwater recharge: rainfall, land cover, topography, drainage density, lineament density, and lithology. These parameters were reclassified according to their level of involvement in the recharge process and then evaluated using the multi-criteria analysis known as the “Analytical Hierarchy Process” (AHP). The resulting potential recharge map shows that 60% of the study area, located in the southern and central parts of the catchment, has a high to very high recharge potential. A receiver operating characteristic (ROC) curve, built from the existing wells in the study area, was used to validate the map.
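The AHP weighting step mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the pairwise comparison matrix `PAIRWISE` and the factor ordering are hypothetical (the abstract does not report them), and the weights are approximated with the row geometric-mean method rather than an exact eigenvector computation.

```python
import math

# Saaty's random consistency index (RI), keyed by matrix size.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI; CR below 0.1 is usually deemed acceptable."""
    n = len(matrix)
    if n < 3:
        return 0.0  # matrices of size 1 or 2 are always consistent
    weighted = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    lambda_max = sum(v / w for v, w in zip(weighted, weights)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical factor order and judgments, for illustration only.
FACTORS = ["rainfall", "lithology", "lineament_density",
           "drainage_density", "topography", "land_cover"]
PAIRWISE = [
    [1,     2,     3,     4,     5,   6],
    [1 / 2, 1,     2,     3,     4,   5],
    [1 / 3, 1 / 2, 1,     2,     3,   4],
    [1 / 4, 1 / 3, 1 / 2, 1,     2,   3],
    [1 / 5, 1 / 4, 1 / 3, 1 / 2, 1,   2],
    [1 / 6, 1 / 5, 1 / 4, 1 / 3, 1 / 2, 1],
]

weights = ahp_weights(PAIRWISE)
cr = consistency_ratio(PAIRWISE, weights)
for name, w in zip(FACTORS, weights):
    print(f"{name}: {w:.3f}")
print(f"consistency ratio: {cr:.3f}")
```

In a GIS workflow, weights of this kind would typically multiply the reclassified factor rasters before they are summed into a single recharge-potential score; that overlay step is not shown here.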

    What is the right sequencing approach? Solo VS extended family analysis in consanguineous populations.

    Choosing a testing strategy is crucial for genetics clinics and testing laboratories. In this study, we compared the hit rate between solo, trio, and trio-plus testing, and between trio and sibship testing. We also studied the impact of extended family analysis, mainly in complex and unsolved cases. Three cohorts were used for this analysis: one to assess the hit rate of solo, trio, and trio-plus testing; another to compare sibship genome analysis with trio-based analysis; and a third to test whether an extended family analysis of up to eight family members lowers the number of candidate variants. The hit rates in solo, trio, and trio-plus testing were 39%, 40%, and 41%, respectively. The sibship testing strategy yielded 117 candidate variants in total, compared with 59 in the trio-based analysis. The trio-based analysis yielded on average 1192 coding and 26,454 noncoding candidate variants; adding family members lowered this number by 50-75%, to at most two coding and 66 noncoding homozygous variants in families with eight members. There was no difference in hit rate between solo testing and extended family testing. Trio-based analysis was a better approach than sibship testing, even in a consanguineous population. Finally, each additional family member helped to narrow down the number of variants by 50-75%. Our findings could help clinicians, researchers, and testing laboratories select the most cost-effective and appropriate sequencing approach for their patients. Furthermore, extended family analysis is a very useful tool for complex cases with novel genes.

    An Efficient Spark-Based Hybrid Frequent Itemset Mining Algorithm for Big Data

    Frequent itemset mining (FIM) is a common approach for discovering hidden frequent patterns in transactional databases, with uses in prediction, association rules, classification, etc. Apriori is an elementary, iterative FIM algorithm that scans the dataset multiple times to generate frequent itemsets of different cardinalities. Its performance degrades as the data grows because of these repeated scans. Eclat is a scalable version of the Apriori algorithm that uses a vertical layout. The vertical layout has many advantages: it avoids scanning the dataset multiple times and carries the information needed to compute each itemset's support. In a vertical layout, itemset support is obtained by intersecting transaction id sets (tidsets/tids) and pruning irrelevant itemsets. However, when the tidsets become too big for memory, the algorithm's efficiency suffers. In this paper, we introduce SHFIM (Spark-based hybrid frequent itemset mining), a three-phase algorithm that uses both horizontal and vertical layouts and replaces tidsets with diffsets, which keep track of the differences between transaction ids rather than the intersections. Moreover, some improvements are developed to decrease the number of candidate itemsets. SHFIM is implemented and tested on the Spark framework, which uses the RDD (resilient distributed dataset) concept and in-memory processing to address the limitations of the MapReduce framework. We compared the performance of SHFIM with Spark-based Eclat and dEclat on four benchmark datasets. Experimental results showed that SHFIM outperforms Spark-based Eclat and dEclat on both dense and sparse datasets in terms of execution time.
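The difference between tidsets and diffsets described in the abstract above can be illustrated with a small single-machine sketch. It shows only the general diffset idea used by dEclat-style algorithms, under the usual definition that the diffset of an itemset holds the transactions of its prefix that do not contain the itemset. It is not SHFIM itself, it does not use Spark, and the toy `TRANSACTIONS` data and all function names are hypothetical.

```python
# Toy horizontal layout: transaction id -> items (hypothetical data).
TRANSACTIONS = {
    1: {"a", "b", "c"},
    2: {"a", "c"},
    3: {"a", "b"},
    4: {"b", "c"},
    5: {"a", "b", "c"},
}


def vertical_tidsets(transactions):
    """Convert the horizontal layout to a vertical one: item -> set of tids."""
    tidsets = {}
    for tid, items in transactions.items():
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def item_diffsets(tidsets, all_tids):
    """Diffset of a single item: transactions that do NOT contain it."""
    return {item: all_tids - tids for item, tids in tidsets.items()}


def extend(diff_px, support_px, diff_py):
    """Combine PX and PY (same prefix P) into PXY using diffsets only.

    d(PXY) = d(PY) - d(PX) and support(PXY) = support(PX) - |d(PXY)|.
    """
    diff_pxy = diff_py - diff_px
    return diff_pxy, support_px - len(diff_pxy)


all_tids = set(TRANSACTIONS)
tidsets = vertical_tidsets(TRANSACTIONS)
diffsets = item_diffsets(tidsets, all_tids)
supports = {item: len(all_tids) - len(d) for item, d in diffsets.items()}

# Level 2: itemsets {a, b} and {a, c}, sharing the prefix "a".
diff_ab, sup_ab = extend(diffsets["a"], supports["a"], diffsets["b"])
diff_ac, sup_ac = extend(diffsets["a"], supports["a"], diffsets["c"])

# Level 3: itemset {a, b, c}, built from {a, b} and {a, c} (prefix "a").
diff_abc, sup_abc = extend(diff_ab, sup_ab, diff_ac)

print("support(ab) =", sup_ab)
print("support(abc) =", sup_abc)
```

On dense data the diffsets tend to be much smaller than the corresponding tidsets, which is the memory argument for using them; on sparse data the opposite can hold.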

    Adjunctive Subantimicrobial Dose Doxycycline in the Treatment of Chronic Periodontitis in Type 2 Diabetic Patients: A Unique Combination Therapy

    Background/Aim: To evaluate the effectiveness of combination therapy including subantimicrobial dose doxycycline (SDD) and locally delivered doxycycline (LD) as adjuncts to scaling and root planing (SRP) in the treatment of chronic periodontitis in patients with type 2 diabetes mellitus (T2DM). Material and Methods: Forty patients with controlled T2DM (HbA1c ≤7%) and chronic periodontitis were selected and randomly divided into two groups of twenty patients each. Test group (TG, n=20) patients were treated with a combination therapy of full-mouth SRP, 10% LD gel, and SDD 20 mg twice daily for 6 months. Control group (CG, n=20) patients were treated with full-mouth SRP only. The periodontal parameters, recorded at baseline and at 3, 6, and 9 months, were periodontal probing depth (PPD), clinical attachment level (CAL), and bleeding on probing (BOP). Gingival crevicular fluid (GCF) samples were collected, and matrix metalloproteinase-8 (MMP-8) was quantified by enzyme-linked immunosorbent assay (ELISA) at baseline and at 3, 6, and 9 months. Results: A statistically significant reduction in all clinical parameters (PPD, CAL, and BOP) was observed in TG compared with CG at 3, 6, and 9 months (p<0.05). Moreover, combination therapy significantly reduced the amount of GCF MMP-8 in TG compared with CG at 3, 6, and 9 months (p<0.05). Conclusions: Combination therapy including SRP, SDD, and LD provided significantly greater clinical benefits than SRP alone in the treatment of chronic periodontitis in patients with controlled T2DM.

    Starvar: symptom-based tool for automatic ranking of variants using evidence from literature and genomes

    Background: Identifying variants associated with diseases is a challenging task in medical genetics research. Current studies that prioritize variants within individual genomes generally rely on known variants, evidence from literature and genomes, and patient symptoms and clinical signs. The functionalities of the existing tools, which rank variants based on given patient symptoms and clinical signs, are restricted to the coverage of ontologies such as the Human Phenotype Ontology (HPO). However, most clinicians do not limit themselves to HPO while describing patient symptoms/signs and their associated variants/genes. There is thus a need for an automated tool that can prioritize variants based on freely expressed patient symptoms and clinical signs. Results: STARVar is a Symptom-based Tool for Automatic Ranking of Variants using evidence from literature and genomes. STARVar uses patient symptoms and clinical signs, either linked to HPO or expressed in free text format. It returns a ranked list of variants based on a combined score from two classifiers utilizing evidence from genomics and literature. STARVar improves over related tools on a set of synthetic patients. In addition, we demonstrated its distinct contribution to the domain on another synthetic dataset covering publicly available clinical genotype–phenotype associations by using symptoms and clinical signs expressed in free text format. Conclusions: STARVar stands as a unique and efficient tool that has the advantage of ranking variants with flexibly expressed patient symptoms in free-form text. Therefore, STARVar can be easily integrated into bioinformatics workflows designed to analyze disease-associated genomes. Availability: STARVar is freely available from https://github.com/bio-ontology-research-group/STARVar