175 research outputs found
Annotating patient clinical records with syntactic chunks and named entities: the Harvey corpus
The free text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult to process by existing natural language analysis tools since they are highly telegraphic (omitting many words), and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning.
DNA damage among wood workers assessed with the comet assay
Exposure to wood dust, a human carcinogen, is common in wood-related industries, and millions of workers are occupationally exposed to wood dust worldwide. The comet assay is a rapid, simple, and sensitive method for determining DNA damage. The objective of this study was to investigate the DNA damage associated with occupational exposure to wood dust using the comet assay (peripheral blood samples) among nonsmoking wood workers (n = 31, furniture and construction workers) and controls (n = 19). DNA damage was greater in the group exposed to composite wood products compared to the group exposed to natural woods and controls (P < 0.001). No difference in DNA damage was observed between workers exposed to natural woods and controls (P = 0.13). Duration of exposure and current dust concentrations had no effect on DNA damage. In future studies, workers' exposures should include cumulative dust concentrations and exposures originating from the binders used in composite wood products.
Profiling allele-specific gene expression in brains from individuals with autism spectrum disorder reveals preferential minor allele usage.
One fundamental but understudied mechanism of gene regulation in disease is allele-specific expression (ASE), the preferential expression of one allele. We leveraged RNA-sequencing data from human brain to assess ASE in autism spectrum disorder (ASD). When ASE is observed in ASD, the allele with lower population frequency (minor allele) is preferentially more highly expressed than the major allele, opposite to the canonical pattern. Importantly, genes showing ASE in ASD are enriched in those downregulated in ASD postmortem brains and in genes harboring de novo mutations in ASD. Two regions, 14q32 and 15q11, containing all known orphan C/D box small nucleolar RNAs (snoRNAs), are particularly enriched in shifts to higher minor allele expression. We demonstrate that this allele shifting enhances snoRNA-targeted splicing changes in ASD-related target genes in idiopathic ASD and 15q11-q13 duplication syndrome. Together, these results implicate allelic imbalance and dysregulation of orphan C/D box snoRNAs in ASD pathogenesis.
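As a rough illustration of how allelic imbalance at a single heterozygous site can be quantified, the sketch below applies an exact two-sided binomial test to allele read counts. The abstract does not state which test the authors used, so the test itself, the function names, and the read counts are assumptions made for illustration only.

```python
from math import comb


def minor_allele_fraction(minor_reads: int, major_reads: int) -> float:
    """Share of reads at a heterozygous site that carry the minor allele."""
    return minor_reads / (minor_reads + major_reads)


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one. Intended for modest read depths; very large n
    would need a log-space implementation.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


# Hypothetical counts: 30 reads carry the minor allele, 10 the major allele.
fraction = minor_allele_fraction(30, 10)
p_value = binomial_two_sided_p(30, 40)
print(f"minor allele fraction = {fraction:.2f}, p = {p_value:.4g}")
```

Under the null hypothesis of balanced expression (p = 0.5), a small p-value together with a minor allele fraction above 0.5 would correspond to the shift toward higher minor allele expression described above.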
Modeling Disease Severity in Multiple Sclerosis Using Electronic Health Records
Objective:
To optimally leverage the scalability and unique features of the electronic health records (EHR) for research that would ultimately improve patient care, we need to accurately identify patients and extract clinically meaningful measures. Using multiple sclerosis (MS) as a proof of principle, we showcased how to leverage routinely collected EHR data to identify patients with a complex neurological disorder and derive an important surrogate measure of disease severity heretofore only available in research settings.
Methods:
In a cross-sectional observational study, 5,495 MS patients were identified from the EHR systems of two major referral hospitals using an algorithm that includes codified and narrative information extracted using natural language processing. In the subset of patients who receive neurological care at an MS Center where disease measures have been collected, we used routinely collected EHR data to extract two clinically relevant aggregate indicators of MS severity: multiple sclerosis severity score (MSSS) and brain parenchymal fraction (BPF, a measure of whole brain volume).
Results:
The EHR algorithm that identifies MS patients has an area under the curve of 0.958, 83% sensitivity, 92% positive predictive value, and 89% negative predictive value when a 95% specificity threshold is used. The correlation between EHR-derived and true MSSS has a mean R² = 0.38 ± 0.05, and that between EHR-derived and true BPF has a mean R² = 0.22 ± 0.08. To illustrate its clinical relevance, derived MSSS captures the expected difference in disease severity between relapsing-remitting and progressive MS patients after adjusting for sex, age of symptom onset and disease duration (p = 1.56 × 10⁻¹²).
Conclusion:
Incorporation of sophisticated codified and narrative EHR data accurately identifies MS patients and provides estimation of a well-accepted indicator of MS severity that is widely used in research settings but not part of the routine medical records. Similar approaches could be applied to other complex neurological disorders.
National Institute of General Medical Sciences (U.S.) (NIH U54-LM008748)
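The performance figures quoted in the Results (sensitivity, specificity, positive and negative predictive value) all derive from a 2×2 confusion matrix. A minimal sketch of the standard definitions follows; the function name and the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard screening metrics from confusion-matrix counts.

    tp/fn: true cases flagged / missed by the algorithm.
    tn/fp: non-cases correctly excluded / wrongly flagged.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that are flagged
        "specificity": tn / (tn + fp),  # share of non-cases that are excluded
        "ppv": tp / (tp + fp),          # share of flagged patients who are cases
        "npv": tn / (tn + fn),          # share of excluded patients who are non-cases
    }


# Hypothetical chart-review sample, for illustration only.
metrics = diagnostic_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on how common the condition is in the evaluated sample, whereas sensitivity and specificity do not.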
Colchicine and therapy of cardiovascular disease: Not merely a theory
The cytoskeleton is a sophisticated cellular system consisting of actin filaments, intermediate filaments and microtubules (MT), accompanied by a large number of associated structural and motor proteins. Microtubules are dynamically assembling and disassembling structures. They are pivotal for many cell functions, e.g. intracellular traffic of membrane-bound organelles in endocytosis and protein secretion, as well as a variety of inflammatory and signal transduction pathways. Tubulin is the major building protein of MT. Depending on the dose, agents that bind to tubulin either inhibit its assembly, that is, MT formation, or disassemble the preformed MT. Such tubulin-binding agents are named MT-disassembling agents or antitubulins, colchicine being a classical member of this group. Herein, we describe in brief the scientific saga of colchicine as related to the therapy of cardiovascular diseases such as acute coronary syndromes, myocardial infarction, atrial fibrillation, pericarditis, and hypertrophic cardiomyopathy.
Selected heterozygosity at cis-regulatory sequences increases the expression homogeneity of a cell population in humans
Background: Examples of heterozygote advantage in humans are scarce and limited to protein-coding sequences. Here, we attempt a genome-wide functional inference of advantageous heterozygosity at cis-regulatory regions. Results: The single-nucleotide polymorphisms bearing the signatures of balancing selection are enriched in active cis-regulatory regions of immune cells and epithelial cells, the latter of which provide barrier function and innate immunity. Examples associated with ancient trans-specific balancing selection are also discovered. Allelic imbalance in chromatin accessibility and divergence in transcription factor motif sequences indicate that these balanced polymorphisms cause distinct regulatory variation. However, a majority of these variants show no association with the expression level of the target gene. Instead, single-cell experimental data for gene expression and chromatin accessibility demonstrate that heterozygous sequences can lower cell-to-cell variability in proportion to selection strengths. This negative correlation is more pronounced for highly expressed genes and consistently observed when using different data and methods. Based on mathematical modeling, we hypothesize that extrinsic noise from fluctuations in transcription factor activity may be amplified in homozygotes, whereas it is buffered in heterozygotes. While high expression levels are coupled with intrinsic noise reduction, regulatory heterozygosity can contribute to the suppression of extrinsic noise. Conclusions: This mechanism may confer a selective advantage by increasing cell population homogeneity and thereby enhancing the collective action of the cells, especially of those involved in the defense systems in humans.
Association of Diabetic Ketoacidosis and HbA1c at Onset with Year-Three HbA1c in Children and Adolescents with Type 1 Diabetes: Data from the International SWEET Registry
Objective: To establish whether diabetic ketoacidosis (DKA) or HbA1c at onset is associated with year-three HbA1c in children with type 1 diabetes (T1D).
Methods: Children with T1D from the SWEET registry, diagnosed <18 years, with documented clinical presentation, HbA1c at onset and follow-up were included. Participants were categorized according to T1D onset: (a) DKA (DKA with coma, DKA without coma, no DKA); (b) HbA1c at onset (low [<10%], medium [10 to <12%], high [≥12%]). To adjust for demographics, linear regression was applied with interaction terms for DKA and HbA1c at onset groups (adjusted means with 95% CI). Association between year-three HbA1c and both HbA1c and presentation at onset was analyzed (Vuong test).
Results: Among 1420 children (54% males; median age at onset 9.1 years [Q1;Q3: 5.8;12.2]), 6% of children experienced DKA with coma, 37% DKA without coma, and 57% no DKA. Year-three HbA1c was lower in the low compared to high HbA1c at onset group, both in the DKA without coma (7.1% [6.8;7.4] vs 7.6% [7.5;7.8], P = .03) and in the no DKA group (7.4% [7.2;7.5] vs 7.8% [7.6;7.9], P = .01), without differences between low and medium HbA1c at onset groups. Year-three HbA1c did not differ among HbA1c at onset groups in the DKA with coma group. HbA1c at onset as an explanatory variable was more closely associated with year-three HbA1c compared to presentation at onset groups (P = .02).
Conclusions: Year-three HbA1c is more closely related to HbA1c than to DKA at onset; earlier hyperglycemia detection might be crucial to improving year-three HbA1c.
The strength of co-authorship in gene name disambiguation
Background: A biomedical entity mention in articles and other free texts is often ambiguous. For example, 13% of the gene names (aliases) might refer to more than one gene. The task of Gene Symbol Disambiguation (GSD), a special case of Word Sense Disambiguation (WSD), is to assign a unique gene identifier to every identified gene name alias in biology-related articles. Supervised and unsupervised machine learning WSD techniques have been applied in the biomedical field with promising results. Here we examine how a special feature of biological articles, namely that the authors of each document are known, can be exploited for the GSD task through graph-based semi-supervised methods.
Results: Our key hypothesis is that a biologist refers to each particular gene by a fixed gene alias and that this holds for the co-authors as well. To make use of the co-authorship information we built the inverse co-author graph on MedLine abstracts. The nodes of the inverse co-author graph are articles, and there is an edge between two nodes if and only if the two articles share an author. We introduce two methods that use graph-based distances between abstracts for the GSD task. We found that a disambiguation decision can be made in 85% of cases with an extremely high (99.5%) precision rate just by using information obtained from the inverse co-author graph. We incorporated the co-authorship information into two GSD systems in order to attain full coverage, and in experiments our procedure achieved precision of 94.3%, 98.85%, 96.05% and 99.63% on the human, mouse, fly and yeast GSD evaluation sets, respectively.
Conclusion: Based on the promising results obtained so far, we suggest that the co-authorship information and the circumstances of the articles' release (like the title of the journal or the year of publication) can be a crucial building block of any sophisticated similarity measure among biological articles, and hence the methods introduced here should be useful for other biomedical natural language processing tasks (like organism or target disease detection) as well.
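The inverse co-author graph described in the abstract (articles as nodes, an edge whenever two articles share an author) can be sketched as follows. The abstract does not specify which graph distance the authors use, so the breadth-first shortest-path distance here is an assumption, and the function names and toy data are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_inverse_coauthor_graph(article_authors: dict[str, list[str]]) -> dict[str, set[str]]:
    """Return an adjacency map linking articles that share at least one author."""
    articles_by_author = defaultdict(set)
    for article, authors in article_authors.items():
        for author in authors:
            articles_by_author[author].add(article)

    graph = {article: set() for article in article_authors}
    for articles in articles_by_author.values():
        # Every pair of articles written by the same author gets an edge.
        for u, v in combinations(sorted(articles), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def graph_distance(graph: dict[str, set[str]], source: str, target: str):
    """Shortest path length between two articles (BFS); None if unconnected."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Toy data, for illustration only.
toy = {"a1": ["Smith", "Lee"], "a2": ["Lee", "Kim"], "a3": ["Kim"], "a4": ["Park"]}
g = build_inverse_coauthor_graph(toy)
print(graph_distance(g, "a1", "a3"), graph_distance(g, "a1", "a4"))
```

Under the stated hypothesis, an ambiguous alias in one abstract could then be resolved by looking at how nearby abstracts (in this graph) with a known gene identifier use the same alias; abstracts with no path to any labelled abstract would need a fallback method.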
Optimizing both catalyst preparation and catalytic behaviour for the oxidative dehydrogenation of ethane of Ni-Sn-O catalysts
Bulk Ni-Sn-O catalysts have been synthesized, tested in the oxidative dehydrogenation of ethane and characterized by several physicochemical techniques. The catalysts were prepared by evaporation of the corresponding salts using several additives in the synthesis gel, i.e. ammonium hydroxide, nitric acid, glyoxylic acid or oxalic acid. The catalysts were finally calcined at 500 °C in air. Important changes in the catalytic behaviour have been observed depending on the additive. In particular, the catalytic performance improves markedly when additives such as glyoxylic or oxalic acid are used. Thus the productivity to ethylene increases sixfold compared to the reference Ni-Sn-O catalyst if appropriate templates are used, and this is the result of an improvement in both the catalytic activity and the selectivity to ethylene. This improved performance has been explained in terms of the decrease of the crystallite size (and the increase in the surface area of the catalyst) as well as the modification of the lattice parameter of nickel oxide.
The authors would like to acknowledge the DGICYT in Spain (CTQ2015-68951-C3-1-R and CTQ2012-37925-C03-2) for financial support. We also thank the University of Valencia and SCSIE-UV for assistance.
Solsona Espriu, BE.; López Nieto, JM.; Agouram, S.; Soriano Rodríguez, MD.; Dejoz, A.; Vázquez, MI.; Concepción Heydorn, P. (2016). Optimizing both catalyst preparation and catalytic behaviour for the oxidative dehydrogenation of ethane of Ni-Sn-O catalysts. Topics in Catalysis. 59(17-18):1564-1572. https://doi.org/10.1007/s11244-016-0674-z
