Determining the Veracity of Rumours on Twitter
While social networks can provide an ideal platform for up-to-date information from individuals across the world, they have also proved to be a place where rumours fester and accidental or deliberate misinformation often emerges. In this article, we aim to support the task of making sense of social media data and, specifically, seek to build an autonomous message classifier that filters relevant and trustworthy information from Twitter. For our work, we collected about 100 million public tweets, including users' past tweets, from which we identified 72 rumours (41 true, 31 false). We considered over 80 trustworthiness measures, including the authors' profile and past behaviour, the social network connections (graphs), and the content of the tweets themselves. We ran modern machine-learning classifiers over those measures to produce trustworthiness scores at various time windows from the outbreak of the rumour. Such time windows were key, as they allowed useful insight into the progression of the rumours. From our findings, we identified that our model was significantly more accurate than similar studies in the literature. We also identified critical attributes of the data that give rise to the trustworthiness scores assigned. Finally, we developed a software demonstration that provides a visual user interface to allow the user to examine the analysis.
Personalized medicine in psoriasis: developing a genomic classifier to predict histological response to Alefacept
Abstract
Background
Alefacept treatment is highly effective in a select group of patients with moderate-to-severe psoriasis, and is an ideal candidate to develop systems to predict who will respond to therapy. A clinical trial of 22 patients with moderate-to-severe psoriasis treated with alefacept was conducted in 2002-2003 as a mechanism-of-action study. Patients were classified as responders or non-responders to alefacept based on histological criteria. Results of the original mechanism-of-action study have been published. Peripheral blood was collected at the start of this clinical trial, and a prior analysis demonstrated that gene expression in PBMCs differed between responders and non-responders; however, the analysis performed could not be used to predict response.
Methods
Microarray data from PBMCs of 16 of these patients were analyzed to generate a treatment response classifier. We used a discriminant analysis method that performs sample classification from gene expression data via the "nearest shrunken centroid" method. Centroids are the average gene expression for each gene in each class divided by the within-class standard deviation for that gene.
Results
A disease response classifier using 23 genes was created to accurately predict response to alefacept (12.3% error rate). While the genes in this classifier should be considered as a group, some of the individual genes are of great interest, for example, cAMP response element modulator (CREM), v-MAF avian musculoaponeurotic fibrosarcoma oncogene family (MAFF), chloride intracellular channel protein 1 (CLIC1, also called NCC27), NLR family, pyrin domain-containing 1 (NLRP1), and CCL5 (chemokine, cc motif, ligand 5, also called regulated upon activation, normally T expressed, and presumably secreted/RANTES).
Conclusions
Although this study is small, and based on analysis of existing microarray data, we demonstrate that a treatment response classifier for alefacept can be created using gene expression of PBMCs in psoriasis. This preliminary study may provide a useful tool to predict response of psoriatic patients to alefacept.
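The nearest shrunken centroid idea named in the Methods can be illustrated with a minimal sketch. This is not the study's code or data: the toy expression values, the class labels, the threshold `delta`, and every function name below are assumptions for illustration, and the scaling factor `m_k` follows a common textbook form of the method rather than anything stated in the abstract.

```python
import math

def soft_threshold(value, delta):
    """Shrink value toward zero by delta; anything within delta becomes 0."""
    magnitude = abs(value) - delta
    return math.copysign(magnitude, value) if magnitude > 0 else 0.0

def fit_shrunken_centroids(samples, labels, delta):
    """Return per-class shrunken centroids and pooled within-class std per gene."""
    classes = sorted(set(labels))
    n, n_genes = len(samples), len(samples[0])
    overall = [sum(row[g] for row in samples) / n for g in range(n_genes)]
    rows_by_class = {k: [r for r, lab in zip(samples, labels) if lab == k]
                     for k in classes}
    centroids = {k: [sum(r[g] for r in rows) / len(rows) for g in range(n_genes)]
                 for k, rows in rows_by_class.items()}
    # Pooled within-class standard deviation for each gene
    # (assumes every gene varies within at least one class).
    pooled = []
    for g in range(n_genes):
        ss = sum((row[g] - centroids[lab][g]) ** 2
                 for row, lab in zip(samples, labels))
        pooled.append(math.sqrt(ss / (n - len(classes))))
    shrunken = {}
    for k in classes:
        m_k = math.sqrt(1 / len(rows_by_class[k]) - 1 / n)
        shrunken[k] = []
        for g in range(n_genes):
            # Standardized distance of the class centroid from the overall centroid.
            d = (centroids[k][g] - overall[g]) / (m_k * pooled[g])
            shrunken[k].append(overall[g] + m_k * pooled[g] * soft_threshold(d, delta))
    return shrunken, pooled

def predict(x, shrunken, pooled):
    """Assign x to the class whose shrunken centroid is nearest (std-scaled)."""
    def distance(k):
        return sum(((x[g] - shrunken[k][g]) / pooled[g]) ** 2 for g in range(len(x)))
    return min(shrunken, key=distance)

# Hypothetical toy data: 3 genes, only gene 0 separates the two classes.
samples = [[5.0, 1.0, 2.0], [5.4, 1.2, 2.1], [4.8, 0.9, 1.9],
           [1.0, 1.1, 2.0], [1.3, 0.8, 2.2], [0.9, 1.0, 1.8]]
labels = ["responder"] * 3 + ["non_responder"] * 3
shrunken, pooled = fit_shrunken_centroids(samples, labels, delta=2.0)
prediction = predict([4.9, 1.0, 2.0], shrunken, pooled)
```

Genes whose shrunken centroids coincide across classes no longer influence the prediction, which is how a threshold of this kind can reduce a classifier to a small gene set.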
Reexamining the effects of gestational age, fetal growth, and maternal smoking on neonatal mortality
BACKGROUND: Low birth weight (<2,500 g) is a strong predictor of infant mortality. Yet low birth weight, in isolation, is uninformative since it is comprised of two intertwined components: preterm delivery and reduced fetal growth. Through nonparametric logistic regression models, we examine the effects of gestational age, fetal growth, and maternal smoking on neonatal mortality. METHODS: We derived data on over 10 million singleton live births delivered at ≥ 24 weeks from the 1998–2000 U.S. natality data files. Nonparametric multivariable logistic regression based on generalized additive models was used to examine neonatal mortality (deaths within the first 28 days) in relation to fetal growth (gestational age-specific standardized birth weight), gestational age, and number of cigarettes smoked per day. All analyses were further adjusted for the confounding effects due to maternal age and gravidity. RESULTS: The relationship between standardized birth weight and neonatal mortality is nonlinear; mortality is high at low z-score birth weights, drops precipitously with increasing z-score birth weight, and begins to flatten for heavier infants. Gestational age is also strongly associated with mortality, with patterns similar to those of z-score birth weight. Although the direct effect of smoking on neonatal mortality is weak, its effects (on mortality) appear to be largely mediated through reduced fetal growth and, to a lesser extent, through shortened gestation. In fact, the association between smoking and reduced fetal growth gets stronger as pregnancies approach term. CONCLUSIONS: Our study provides important insights regarding the combined effects of fetal growth, gestational age, and smoking on neonatal mortality. The findings suggest that the effect of maternal smoking on neonatal mortality is largely mediated through reduced fetal growth
A genome-wide linkage study of mammographic density, a risk factor for breast cancer
Abstract
Introduction
Mammographic breast density is a highly heritable (h² > 0.6) and strong risk factor for breast cancer. We conducted a genome-wide linkage study to identify loci influencing mammographic breast density (MD).
Methods
Epidemiological data were assembled on 1,415 families from the Australia, Northern California and Ontario sites of the Breast Cancer Family Registry, and additional families recruited in Australia and Ontario. Families consisted of sister pairs with age-matched mammograms and data on factors known to influence MD. Single nucleotide polymorphism (SNP) genotyping was performed on 3,952 individuals using the Illumina Infinium 6K linkage panel.
Results
Using a variance components method, genome-wide linkage analysis was performed using quantitative traits obtained by adjusting MD measurements for known covariates. Our primary trait was formed by fitting a linear model to the square root of the percentage of the breast area that was dense (PMD), adjusting for age at mammogram, number of live births, menopausal status, weight, height, weight squared, and menopausal hormone therapy. The maximum logarithm of odds (LOD) score from the genome-wide scan was on chromosome 7p14.1-p13 (LOD = 2.69; 63.5 cM) for covariate-adjusted PMD, with a 1-LOD interval spanning 8.6 cM. A similar signal was seen for the covariate adjusted area of the breast that was dense (DA) phenotype. Simulations showed that the complete sample had adequate power to detect LOD scores of 3 or 3.5 for a locus accounting for 20% of phenotypic variance. A modest peak initially seen on chromosome 7q32.3-q34 increased in strength when only the 513 families with at least two sisters below 50 years of age were included in the analysis (LOD 3.2; 140.7 cM, 1-LOD interval spanning 9.6 cM). In a subgroup analysis, we also found a LOD score of 3.3 for DA phenotype on chromosome 12.11.22-q13.11 (60.8 cM, 1-LOD interval spanning 9.3 cM), overlapping a region identified in a previous study.
Conclusions
The suggestive peaks and the larger linkage signal seen in the subset of pedigrees with younger participants highlight regions of interest for further study to identify genes that determine MD, with the goal of understanding mammographic density and its involvement in susceptibility to breast cancer
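For readers unfamiliar with the LOD scale used in the Results, the relation below is a general sketch of how a LOD score relates to a likelihood-ratio statistic in a variance components setting. The symbols are assumptions for illustration (they do not appear in the abstract): $\sigma^2_q$ denotes the variance attributed to the tested locus, $L$ the maximized likelihood, and $\Lambda$ the likelihood-ratio statistic.

```latex
\mathrm{LOD}
  = \log_{10}\frac{L(\hat{\sigma}^2_q)}{L(\sigma^2_q = 0)}
  = \frac{\Lambda}{2\ln 10},
\qquad
\Lambda = 2\ln\frac{L(\hat{\sigma}^2_q)}{L(\sigma^2_q = 0)}

% 1-LOD interval around the peak position x_{\max}:
\{\, x : \mathrm{LOD}(x) \ge \mathrm{LOD}(x_{\max}) - 1 \,\}
```

Under this relation, the reported peak of LOD = 2.69 would correspond to $\Lambda = 2\ln(10) \times 2.69 \approx 12.4$.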
A review of estimation of distribution algorithms in bioinformatics
Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics tasks. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.
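The core EDA loop (sample from a probabilistic model, select promising solutions, re-estimate the model) can be sketched with one of the simplest variants, a univariate model with independent bits. This is an illustrative sketch only: the OneMax objective (count of 1-bits), the function name `umda_onemax`, and all parameter values are assumptions, not taken from the review.

```python
import random

def umda_onemax(n_bits=20, pop_size=60, n_selected=20, generations=40, seed=0):
    """Univariate EDA sketch: maximize the number of 1-bits in a bit string."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # model: independent probability of a 1 at each position
    best = None
    for _ in range(generations):
        # 1. Sample a population from the current probabilistic model.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        # 2. Rank by fitness (sum of bits) and keep the promising solutions.
        population.sort(key=sum, reverse=True)
        if best is None or sum(population[0]) > sum(best):
            best = population[0]
        selected = population[:n_selected]
        # 3. Re-estimate the model from the selected solutions.
        probs = [sum(ind[i] for ind in selected) / n_selected
                 for i in range(n_bits)]
        # Keep probabilities away from 0 and 1 so no bit value is lost for good.
        probs = [min(max(p, 1 / n_bits), 1 - 1 / n_bits) for p in probs]
    return best, probs

best, probs = umda_onemax()
```

Unlike a genetic algorithm, there is no crossover or mutation operator here: new candidates come only from sampling the learnt model. More complex EDA variants replace the independent-bit model with one that captures dependencies between variables.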
Low linkage disequilibrium in wild Anopheles gambiae s.l. populations
Abstract
Background
In the malaria vector Anopheles gambiae, understanding diversity in natural populations and genetic components of important phenotypes such as resistance to malaria infection is crucial for developing new malaria transmission blocking strategies. The design and interpretation of many studies here depend critically on linkage disequilibrium (LD). For example, in association studies, LD determines the density of single nucleotide polymorphisms (SNPs) to be genotyped to represent the majority of the genomic information. Here, we aim to determine LD in wild An. gambiae s.l. populations in 4 genes potentially involved in mosquito immune responses against pathogens (Gambicin, NOS, REL2 and FBN9) using previously published and newly generated sequences.
Results
The level of LD between SNP pairs in cloned sequences of each gene was determined for 7 species (or incipient species) of the An. gambiae complex. In all tested genes and species, LD between SNPs was low: even at short distances (< 200 bp), most SNP pairs gave an r² < 0.3. Mean r² ranged from 0.073 to 0.766. In most genes and species, LD decayed very rapidly with increasing inter-marker distance.
Conclusions
These results are of great interest for the development of large-scale polymorphism studies, as LD generally falls below any useful limit. This indicates that very fine-scale SNP detection will be required to give an overall view of genome-wide polymorphism. Perhaps a more feasible approach to genome-wide association studies is to use targeted approaches using candidate gene selection to detect association to phenotypes of interest.
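The r² statistic reported in the Results can be computed for a pair of biallelic SNPs as sketched below. This is a minimal illustration and not the authors' pipeline: it assumes phased haplotypes (as cloned sequences would provide), alleles coded 0/1, and a hypothetical function name `ld_r_squared`.

```python
def ld_r_squared(haplotypes):
    """Compute r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: list of (allele_at_snp_a, allele_at_snp_b) pairs, coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom

# Hypothetical examples: complete association vs. no association.
complete_ld = ld_r_squared([(0, 0), (1, 1), (0, 0), (1, 1)])
no_ld = ld_r_squared([(0, 0), (0, 1), (1, 0), (1, 1)])
```

Values near 1 mean one SNP predicts the other almost perfectly, so genotyping one is enough. Values below about 0.3, as reported for most SNP pairs here, mean each SNP carries largely independent information.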
Gender, Obesity and Repeated Elevation of C-Reactive Protein: Data from the CARDIA Cohort
C-reactive protein (CRP) measurements above 10 mg/L have been conventionally treated as acute inflammation and excluded from epidemiologic studies of chronic inflammation. However, recent evidence suggests that such CRP elevations can be seen even with chronic inflammation. The authors assessed 3,300 participants in the Coronary Artery Risk Development in Young Adults study who had two or more CRP measurements between 1992/3 and 2005/6 to a) investigate characteristics associated with repeated CRP elevation above 10 mg/L; b) identify subgroups at high risk of repeated elevation; and c) investigate the effect of different CRP thresholds on the probability of an elevation being one-time rather than repeated. A total of 225 participants (6.8%) had one-time and 103 (3.1%) had repeated CRP elevation above 10 mg/L. Repeated elevation was associated with obesity, female gender, low income, and sex hormone use. The probability of an elevation above 10 mg/L being one-time rather than repeated was lowest (51%) in women with body mass index above 31 kg/m², compared to 82% in others. These findings suggest that CRP elevations above 10 mg/L in obese women are likely to be from chronic rather than acute inflammation, and that CRP thresholds above 10 mg/L may be warranted to distinguish acute from chronic inflammation in obese women.
Global Pyrogeography: the Current and Future Distribution of Wildfire
Climate change is expected to alter the geographic distribution of wildfire, a complex abiotic process that responds to a variety of spatial and environmental gradients. How future climate change may alter global wildfire activity, however, is still largely unknown. As a first step to quantifying potential change in global wildfire, we present a multivariate quantification of environmental drivers for the observed, current distribution of vegetation fires using statistical models of the relationship between fire activity and resources to burn, climate conditions, human influence, and lightning flash rates at a coarse spatiotemporal resolution (100 km, over one decade). We then demonstrate how these statistical models can be used to project future changes in global fire patterns, highlighting regional hotspots of change in fire probabilities under future climate conditions as simulated by a global climate model. Based on current conditions, our results illustrate how the availability of resources to burn and climate conditions conducive to combustion jointly determine why some parts of the world are fire-prone and others are fire-free. In contrast to any expectation that global warming should necessarily result in more fire, we find that regional increases in fire probabilities may be counter-balanced by decreases at other locations, due to the interplay of temperature and precipitation variables. Despite this net balance, our models predict substantial invasion and retreat of fire across large portions of the globe. These changes could have important effects on terrestrial ecosystems since alteration in fire activity may occur quite rapidly, generating ever more complex environmental challenges for species dispersing and adjusting to new climate conditions. 
Our findings highlight the potential for widespread impacts of climate change on wildfire, suggesting severely altered fire regimes and the need for more explicit inclusion of fire in research on global vegetation-climate change dynamics and conservation planning
Expression and regulation of type 2A protein phosphatases and alpha4 signalling in cardiac health and hypertrophy
Cardiac physiology and hypertrophy are regulated by the phosphorylation status of many proteins, which is partly controlled by a poorly defined type 2A protein phosphatase-alpha4 intracellular signalling axis. Quantitative PCR analysis revealed that mRNA levels of the type 2A catalytic subunits were differentially expressed in H9c2 cardiomyocytes (PP2ACβ > PP2ACα > PP4C > PP6C), NRVM (PP2ACβ > PP2ACα = PP4C = PP6C), and adult rat ventricular myocytes (PP2ACα > PP2ACβ > PP6C > PP4C). Western analysis confirmed that all type 2A catalytic subunits were expressed in H9c2 cardiomyocytes; however, PP4C protein was absent in adult myocytes and only detectable following 26S proteasome inhibition. Short-term knockdown of alpha4 protein expression attenuated expression of all type 2A catalytic subunits. Pressure overload-induced left ventricular (LV) hypertrophy was associated with an increase in both PP2AC and alpha4 protein expression. Although PP6C expression was unchanged, expression of PP6C regulatory subunits (1) Sit4-associated protein 1 (SAP1) and (2) ankyrin repeat domain (ANKRD) 28 and 44 proteins was elevated, whereas SAP2 expression was reduced in hypertrophied LV tissue. Co-immunoprecipitation studies demonstrated that the interaction between alpha4 and PP2AC or PP6C subunits was either unchanged or reduced in hypertrophied LV tissue, respectively. Phosphorylation status of phospholemman (Ser63 and Ser68) was significantly increased by knockdown of PP2ACα, PP2ACβ, or PP4C protein expression. DNA damage assessed by histone H2A.X phosphorylation (γH2A.X) in hypertrophied tissue remained unchanged. However, exposure of cardiomyocytes to H2O2 increased levels of γH2A.X which was unaffected by knockdown of PP6C expression, but was abolished by the short-term knockdown of alpha4 expression. This study illustrates the significance and altered activity of the type 2A protein phosphatase-alpha4 complex in healthy and hypertrophied myocardium.