95 research outputs found

    Context Modeling for Ranking and Tagging Bursty Features in Text Streams

    Bursty features in text streams are very useful in many text mining applications. Most existing studies detect bursty features based purely on term frequency changes without taking into account the semantic contexts of terms, and as a result the detected bursty features may not always be interesting or easy to interpret. In this paper we propose to model the contexts of bursty features using a language modeling approach. We then propose a novel topic diversity-based metric using the context models to find newsworthy bursty features. We also propose to use the context models to automatically assign meaningful tags to bursty features. Using a large corpus of a stream of news articles, we quantitatively show that the proposed context language models for bursty features can effectively help rank bursty features based on their newsworthiness and assign meaningful tags to annotate them. © 2010 ACM.

    A General SIMD-based Approach to Accelerating Compression Algorithms

    Compression algorithms are important for data-oriented tasks, especially in the era of Big Data. Modern processors equipped with powerful SIMD instruction sets provide an opportunity to achieve better compression performance. Previous research has shown that SIMD-based optimizations can multiply decoding speeds. Following these pioneering studies, we propose a general approach to accelerate compression algorithms. By instantiating the approach, we have developed several novel integer compression algorithms, called Group-Simple, Group-Scheme, Group-AFOR, and Group-PFD, and implemented their corresponding vectorized versions. We evaluate the proposed algorithms on two public TREC datasets, a Wikipedia dataset and a Twitter dataset. With competitive compression ratios and encoding speeds, our SIMD-based algorithms outperform state-of-the-art non-vectorized algorithms with respect to decoding speeds.

    Erythropoietin reduces neuronal cell death and hyperalgesia induced by peripheral inflammatory pain in neonatal rats

    Painful stimuli during the neonatal stage may affect brain development and contribute to abnormal behaviors in adulthood. Very few specific therapies are available for this developmental disorder. A better understanding of the mechanisms and consequences of painful stimuli during the neonatal period is essential for the development of effective therapies. In this study, we examined brain reactions in a neonatal rat model of peripheral inflammatory pain. We focused on the inflammatory insult-induced brain responses and delayed changes in behavior and pain sensation. Postnatal day 3 pups received formalin injections into the paws once a day for 3 days. The insult induced dysregulation of several inflammatory factors in the brain and caused selective neuronal cell death in the cortex, hippocampus and hypothalamus. On postnatal day 21, rats that received the inflammatory nociceptive insult exhibited increased local cerebral blood flow in the somatosensory cortex, hyperalgesia, and decreased exploratory behaviors. Based on these observations, we tested recombinant human erythropoietin (rhEPO) as a potential treatment to prevent the inflammatory pain-induced changes. rhEPO treatment (5,000 U/kg/day, i.p.), coupled to formalin injections, ameliorated neuronal cell death and normalized the inflammatory response. Rats that received formalin plus rhEPO exhibited normal levels of cerebral blood flow, pain sensitivity and exploratory behavior. Treatment with rhEPO also restored normal brain and body weights that were reduced in the formalin group. These data suggest that severe inflammatory pain has adverse effects on brain development and that rhEPO may be a possible therapy for the prevention and treatment of this developmental disorder.

    Genomic prediction of drought tolerance during seedling stage in maize using low-cost molecular markers

    Drought tolerance in maize is a complex and polygenic trait, especially in the seedling stage. In plant breeding, complex genetic traits can be improved by genomic selection (GS), which has become a practical and effective breeding tool. In the present study, a natural maize population named the Northeast China core population (NCCP), consisting of 379 inbred lines, was genotyped with diversity arrays technology (DArT) and genotyping-by-sequencing (GBS) platforms. Target traits of seedling emergence rate (ER), seedling plant height (SPH), and grain yield (GY) were evaluated under two natural drought stress environments in northeast China. Adequate genetic variations were observed for all the target traits, but they were divergent across environments. Similarly, the heritability of the target traits varied across years and environments: the heritabilities in 2019 (0.88, 0.82, and 0.85 for ER, SPH, and GY) were higher than those in 2020 (0.65, 0.53, 0.33) and across the two years (0.32, 0.26, 0.33). In total, three marker datasets were used for GS analysis after quality control: 11,865 SilicoDArT markers obtained from the DArT-seq platform, 7837 SNPs obtained from the DArT-seq platform, and 91,003 SNPs obtained from the GBS platform. Phylogenetic trees showed that broad genetic diversity existed in the NCCP population. Genomic prediction results showed that the average prediction accuracies estimated using the DArT SNP dataset under the two-fold cross-validation scheme were 0.27, 0.19, and 0.33 for ER, SPH, and GY, respectively. The SilicoDArT results were close to those of the DArT-seq SNPs: 0.26, 0.22, and 0.33. For the trait with lower heritability, the prediction accuracy can be improved using the dataset filtered by linkage disequilibrium. For the same trait, the prediction accuracies estimated with the two DArT marker datasets were consistently higher than those estimated with the GBS SNP dataset under the same genotyping cost. The prediction accuracy was improved by controlling population structure and marker quality, even though the marker density was reduced. The prediction accuracies were improved by more than 30% using the significantly associated SNPs. Due to the complexity of drought tolerance under natural stress environments, multiple years of data need to be accumulated to improve prediction accuracy by reducing genotype-by-environment interaction. Modeling genotype-by-environment interaction in genomic prediction needs to be further developed for improving drought tolerance in maize. The results obtained from the present study provide a valuable pathway for improving drought tolerance in maize using GS.

    Sex Differences in Abnormal Intrinsic Functional Connectivity After Acute Mild Traumatic Brain Injury

    Mild traumatic brain injury (TBI) is considered to induce abnormal intrinsic functional connectivity within resting-state networks (RSNs). The objective of this study was to estimate the role of sex in intrinsic functional connectivity after acute mild TBI. We recruited a cohort of 54 patients (27 males and 27 females with mild TBI within 7 days post-injury) from the emergency department (ED) and 34 age- and education-matched healthy controls (HCs; 17 males and 17 females). On the clinical scales, there were no statistically significant differences between males and females in either the control group or the mild TBI group. To detect whether there was an abnormal sex difference in functional connectivity in RSNs, we performed independent component analysis (ICA) and a dual regression approach to investigate the between-subject voxel-wise comparisons of functional connectivity within seven selected RSNs. Compared to female patients, male patients showed increased intrinsic functional connectivity in the motor network, ventral stream network, executive function network, and cerebellum network, and decreased connectivity in the visual network. Further analysis demonstrated a positive correlation between the functional connectivity in the executive function network and insomnia severity index (ISI) scores in male patients (r = 0.515, P = 0.006). The abnormality of the functional connectivity of RSNs in acute mild TBI showed the possibility of brain recombination after trauma, mainly concerning male-specific.

    The United States COVID-19 Forecast Hub dataset

    Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
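
    For readers unfamiliar with the Term Frequency-Inverse Document Frequency baseline named in the abstract above, the following is a minimal sketch of ranking candidate articles against a seed article by TF-IDF cosine similarity. It is not the consortium's implementation: the whitespace tokenizer, the smoothed IDF variant, and the function names (tfidf_vectors, cosine, rank_candidates) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Naive lowercase whitespace tokenization (an assumption for this sketch).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Relative term frequency times a smoothed IDF (one of many variants).
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(seed, candidates):
    """Return candidate indices ordered from most to least similar to the seed."""
    vectors = tfidf_vectors([seed] + candidates)
    seed_vec, cand_vecs = vectors[0], vectors[1:]
    scores = [(cosine(seed_vec, vec), i) for i, vec in enumerate(cand_vecs)]
    # Sort by descending score; ties fall back to the original order.
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]
```

    Calling rank_candidates with a seed title or abstract and a list of candidate texts returns the candidate positions in descending similarity order. A candidate that shares no terms with the seed scores zero and sorts last.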