    VLSP Shared Task: Sentiment Analysis

    Sentiment analysis is a natural language processing (NLP) task of identifying or extracting the sentiment content of a text unit. This task has been an active research topic since the early 2000s. During the last two editions of the VLSP workshop series, a shared task on Sentiment Analysis (SA) for Vietnamese was organized to provide an objective evaluation of the performance (quality) of sentiment analysis tools, to encourage the development of Vietnamese sentiment analysis systems, and to provide benchmark datasets for this task. The first campaign, in 2016, focused only on sentiment polarity classification, with a dataset containing reviews of electronic products. The second campaign, in 2018, addressed the problem of Aspect Based Sentiment Analysis (ABSA) for Vietnamese by providing two datasets containing reviews in the restaurant and hotel domains. These data are accessible for research purposes via the VLSP website vlsp.org.vn/resources. This paper describes the datasets built as well as the evaluation results of the systems participating in these campaigns.

    VLSP Shared Task: Named Entity Recognition

    Named entities (NE) are phrases that contain the names of persons, organizations, locations, times and quantities, monetary values, percentages, etc. Named Entity Recognition (NER) is the task of recognizing named entities in documents. NER is an important subtask of Information Extraction, which has attracted researchers all over the world since the 1990s. For the Vietnamese language, although some research projects and publications on the NER task existed before 2016, no systematic comparison of the performance of NER systems had been done. In 2016, the organizing committee of the VLSP workshop decided to launch the first NER shared task, in order to get an objective evaluation of Vietnamese NER systems and to promote the development of high-quality systems. As a result, the first dataset with morpho-syntactic and NE annotations was released for benchmarking NER systems. At VLSP 2018, the NER shared task was organized for the second time, providing a bigger dataset containing texts from various domains, but without morpho-syntactic annotation. These resources are available for research purposes via the VLSP website vlsp.org.vn/resources. In this paper, we describe the datasets as well as the evaluation results obtained from these two campaigns.

    Environmental contamination with Clostridioides (Clostridium) difficile in Vietnam

    AIMS: To investigate the prevalence, molecular type, and antimicrobial susceptibility of Clostridioides difficile in the environment in Vietnam, where little is known about C. difficile. METHODS AND RESULTS: Samples of pig faeces, soils from pig farms, potatoes, and the hospital environment were cultured for C. difficile. Isolates were identified and typed by polymerase chain reaction (PCR) ribotyping. The overall prevalence of C. difficile contamination was 24.5% (68/278). Clostridioides difficile was detected mainly in soils from pig farms and hospital soils, with 70%-100% prevalence. Clostridioides difficile was isolated from 3.4% of pig faecal samples and 5% of potato surfaces. The four most prevalent ribotypes (RTs) were RTs 001, 009, 038, and QX574. All isolates were susceptible to metronidazole, fidaxomicin, vancomycin, and amoxicillin/clavulanate, while resistance to erythromycin, tetracycline, and moxifloxacin was common in toxigenic strains. Clostridioides difficile RTs 001A+B+CDT- and 038A-B-CDT- were predominantly multidrug resistant. CONCLUSIONS: Environmental sources of C. difficile are important to consider in the epidemiology of C. difficile infection in Vietnam; however, contaminated soils are likely to be the most important source of C. difficile. This poses additional challenges to controlling infections in healthcare settings.

    Cryptic Lineages and a Population Damned to Incipient Extinction? Insights into the Genetic Structure of a Mekong River Catfish

    An understanding of the genetic composition of populations across management boundaries is vital to developing successful strategies for sustaining biodiversity and food resources. This is especially important in ecosystems where habitat fragmentation has altered baseline patterns of gene flow, dividing natural populations into smaller sub-populations and increasing potential loss of genetic variation through genetic drift. River systems can be highly fragmented by dams built for flow regulation and hydropower. We used reduced-representation sequencing to examine genomic patterns in an exploited catfish, Hemibagrus spilopterus, in a hotspot of biodiversity and hydropower development: the Mekong River basin. Our results revealed the presence of two highly divergent coexisting genetic lineages which may be cryptic species. Within the lineage with the greatest sample sizes, pairwise FST values, principal components analysis, and a STRUCTURE analysis all suggest that long-distance migration is not common across the Lower Mekong Basin, even in areas where flood-pulse hydrology has limited genetic divergence. In tributaries, effective population size estimates were at least an order of magnitude lower than in the Mekong mainstream, indicating these populations may be more vulnerable to perturbations such as human-induced fragmentation. Fish isolated upstream of several dams in one tributary exhibited particularly low genetic diversity, high amounts of relatedness, and a level of inbreeding (GIS = 0.51) that has been associated with inbreeding depression in other outcrossing species. Our results highlight the importance of assessing genetic structure and diversity in riverine fisheries populations across proposed dam development sites for the preservation of these critically important resources.

    Human Gamma Oscillations during Slow Wave Sleep

    Neocortical local field potentials have shown that gamma oscillations occur spontaneously during slow-wave sleep (SWS). At the macroscopic EEG level in the human brain, no evidence has been reported so far. In this study, by using simultaneous scalp and intracranial EEG recordings in 20 epileptic subjects, we examined gamma oscillations in the cerebral cortex during SWS. We report that gamma oscillations in low (30–50 Hz) and high (60–120 Hz) frequency bands recurrently emerged in all investigated regions and their amplitudes coincided with specific phases of the cortical slow wave. In most cases, multiple oscillatory bursts in different frequency bands from 30 to 120 Hz were correlated with positive peaks of scalp slow waves (“IN-phase” pattern), confirming previous animal findings. In addition, we report another gamma pattern that appears preferentially during the negative phase of the slow wave (“ANTI-phase” pattern). This new pattern presented dominant peaks in the high gamma range and was preferentially expressed in the temporal cortex. Finally, we found that the spatial coherence between cortical sites exhibiting gamma activities was local and fell off quickly when computed between distant sites. Overall, these results provide the first human evidence that gamma oscillations can be observed in macroscopic EEG recordings during sleep. They support the concept that these high-frequency activities might be associated with phasic increases of neural activity during slow oscillations. Such patterned activity in the sleeping brain could play a role in off-line processing of cortical networks.

    Different Patterns of Evolution in the Centromeric and Telomeric Regions of Group A and B Haplotypes of the Human Killer Cell Ig-Like Receptor Locus

    The fast-evolving human KIR gene family encodes variable lymphocyte receptors specific for polymorphic HLA class I determinants. Nucleotide sequences for 24 representative human KIR haplotypes were determined. With three previously defined haplotypes, this gave a set of 12 group A and 15 group B haplotypes for assessment of KIR variation. The seven gene-content haplotypes are all combinations of four centromeric and two telomeric motifs. 2DL5, 2DS5 and 2DS3 can be present in centromeric and telomeric locations. With one exception, haplotypes having identical gene content differed in their combinations of KIR alleles. Sequence diversity varied between haplotype groups and between centromeric and telomeric halves of the KIR locus. The most variable A haplotype genes are in the telomeric half, whereas the most variable genes characterizing B haplotypes are in the centromeric half. Of the highly polymorphic genes, only the 3DL3 framework gene exhibits a similar diversity when carried by A and B haplotypes. Phylogenetic analysis and divergence time estimates point to the centromeric gene-content motifs that distinguish A and B haplotypes having emerged ∼6 million years ago, contemporaneously with the separation of human and chimpanzee ancestors. In contrast, the telomeric motifs that distinguish A and B haplotypes emerged more recently, ∼1.7 million years ago, before the emergence of Homo sapiens. Thus the centromeric and telomeric motifs that typify A and B haplotypes have likely been present throughout human evolution. The results suggest the common ancestor of A and B haplotypes combined a B-like centromeric region with an A-like telomeric region.

    Mapping 123 million neonatal, infant and child deaths between 2000 and 2017

    Since 2000, many countries have achieved considerable success in improving child survival, but localized progress remains unclear. To inform efforts towards United Nations Sustainable Development Goal 3.2—to end preventable child deaths by 2030—we need consistently estimated data at the subnational level regarding child mortality rates and trends. Here we quantified, for the period 2000–2017, the subnational variation in mortality rates and number of deaths of neonates, infants and children under 5 years of age within 99 low- and middle-income countries using a geostatistical survival model. We estimated that 32% of children under 5 in these countries lived in districts that had attained rates of 25 or fewer child deaths per 1,000 live births by 2017, and that 58% of child deaths between 2000 and 2017 in these countries could have been averted in the absence of geographical inequality. This study enables the identification of high-mortality clusters, patterns of progress and geographical inequalities to inform appropriate investments and implementations that will help to improve the health of all populations

    Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks for 195 countries and territories, 1990-2017: a systematic analysis for the Global Burden of Disease Study 2017

    Background: The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2017 comparative risk assessment (CRA) is a comprehensive approach to risk factor quantification that offers a useful tool for synthesising evidence on risks and risk–outcome associations. With each annual GBD study, we update the GBD CRA to incorporate improved methods, new risks and risk–outcome pairs, and new data on risk exposure levels and risk–outcome associations. Methods: We used the CRA framework developed for previous iterations of GBD to estimate levels and trends in exposure, attributable deaths, and attributable disability-adjusted life-years (DALYs), by age group, sex, year, and location for 84 behavioural, environmental and occupational, and metabolic risks or groups of risks from 1990 to 2017. This study included 476 risk–outcome pairs that met the GBD study criteria for convincing or probable evidence of causation. We extracted relative risk and exposure estimates from 46 749 randomised controlled trials, cohort studies, household surveys, census data, satellite data, and other sources. We used statistical models to pool data, adjust for bias, and incorporate covariates. Using the counterfactual scenario of theoretical minimum risk exposure level (TMREL), we estimated the portion of deaths and DALYs that could be attributed to a given risk. We explored the relationship between development and risk exposure by modelling the relationship between the Socio-demographic Index (SDI) and risk-weighted exposure prevalence and estimated expected levels of exposure and risk-attributable burden by SDI.
    Finally, we explored temporal changes in risk-attributable DALYs by decomposing those changes into six main component drivers of change as follows: (1) population growth; (2) changes in population age structures; (3) changes in exposure to environmental and occupational risks; (4) changes in exposure to behavioural risks; (5) changes in exposure to metabolic risks; and (6) changes due to all other factors, approximated as the risk-deleted death and DALY rates, where the risk-deleted rate is the rate that would be observed had we reduced the exposure levels to the TMREL for all risk factors included in GBD 2017. Findings: In 2017, 34·1 million (95% uncertainty interval [UI] 33·3–35·0) deaths and 1·21 billion (1·14–1·28) DALYs were attributable to GBD risk factors. Globally, 61·0% (59·6–62·4) of deaths and 48·3% (46·3–50·2) of DALYs were attributed to the GBD 2017 risk factors. When ranked by risk-attributable DALYs, high systolic blood pressure (SBP) was the leading risk factor, accounting for 10·4 million (9·39–11·5) deaths and 218 million (198–237) DALYs, followed by smoking (7·10 million [6·83–7·37] deaths and 182 million [173–193] DALYs), high fasting plasma glucose (6·53 million [5·23–8·23] deaths and 171 million [144–201] DALYs), high body-mass index (BMI; 4·72 million [2·99–6·70] deaths and 148 million [98·6–202] DALYs), and short gestation for birthweight (1·43 million [1·36–1·51] deaths and 139 million [131–147] DALYs). In total, risk-attributable DALYs declined by 4·9% (3·3–6·5) between 2007 and 2017. In the absence of demographic changes (ie, population growth and ageing), changes in risk exposure and risk-deleted DALYs would have led to a 23·5% decline in DALYs during that period. Conversely, in the absence of changes in risk exposure and risk-deleted DALYs, demographic changes would have led to an 18·6% increase in DALYs during that period.
    The ratios of observed risk exposure levels to exposure levels expected based on SDI (O/E ratios) increased globally for unsafe drinking water and household air pollution between 1990 and 2017. This result suggests that development is occurring more rapidly than are changes in the underlying risk structure in a population. Conversely, nearly universal declines in O/E ratios for smoking and alcohol use indicate that, for a given SDI, exposure to these risks is declining. In 2017, the leading Level 4 risk factor for age-standardised DALY rates was high SBP in four super-regions: central Europe, eastern Europe, and central Asia; north Africa and Middle East; south Asia; and southeast Asia, east Asia, and Oceania. The leading risk factor in the high-income super-region was smoking, in Latin America and Caribbean was high BMI, and in sub-Saharan Africa was unsafe sex. O/E ratios for unsafe sex in sub-Saharan Africa were notably high, and those for alcohol use in north Africa and the Middle East were notably low. Interpretation: By quantifying levels and trends in exposures to risk factors and the resulting disease burden, this assessment offers insight into where past policy and programme efforts might have been successful and highlights current priorities for public health action. Decreases in behavioural, environmental, and occupational risks have largely offset the effects of population growth and ageing, in relation to trends in absolute burden. Conversely, the combination of increasing metabolic risks and population ageing will probably continue to drive the increasing trends in non-communicable diseases at the global level, which presents both a public health challenge and opportunity. We see considerable spatiotemporal heterogeneity in levels of risk exposure and risk-attributable burden.
    Although levels of development underlie some of this heterogeneity, O/E ratios show risks for which countries are overperforming or underperforming relative to their level of development. As such, these ratios provide a benchmarking tool to help to focus local decision making. Our findings reinforce the importance of both risk exposure monitoring and epidemiological research to assess causal connections between risks and health outcomes, and they highlight the usefulness of the GBD study in synthesising data to draw comprehensive and robust conclusions that help to inform good policy and strategic health planning.