813 research outputs found
Predicting Proteome-Early Drug Induced Cardiac Toxicity Relationships (Pro-EDICToRs) with Node Overlapping Parameters (NOPs) of a new class of Blood Mass-Spectra graphs
The 11th International Electronic Conference on Synthetic Organic Chemistry, session Computational Chemistry.
Blood Serum Proteome-Mass Spectra (SP-MS) may allow the detection of Proteome-Early Drug Induced Cardiac Toxicity Relationships (called here Pro-EDICToRs). However, because the SP contains thousands of proteins, identifying general Pro-EDICToRs patterns instead of a single protein marker may represent a more realistic alternative. To this end, we first introduced a novel Cartesian 2D spectrum graph for SP-MS. Next, we introduced the graph node-overlapping parameters (nopk) to characterize SP-MS numerically, using them as inputs to seek a Quantitative Proteome-Toxicity Relationship (QPTR) classifier for Pro-EDICToRs with accuracy higher than 80%. Principal Component Analysis (PCA) on the nopk values present in the QPTR model explains 82.7% of the variance with one factor (F1). These nopk values were then used to construct, for the first time, a Pro-EDICToRs Complex Network whose nodes (samples) are linked by edges (similarity between two samples). We compared the topology of two sub-networks (cardiac toxicity and control samples), finding extreme relative differences for the re-linking (P) and Zagreb (M2) indices (9.5% and 54.2%, respectively) out of 11 parameters. We also compared the sub-networks with well-known ideal random networks, including the Barabasi-Albert, Kleinberg Small World, Erdos-Renyi, and Eppstein Power Law models. Finally, we proposed Partial Order (PO) schemes of the 115 samples based on LDA probabilities, F1-scores and/or network node degrees. PCA-CN and LDA-PCA based POs with Tanimoto's coefficients equal to or higher than 0.75 are promising for the study of Pro-EDICToRs. These results show that simple QPTR models based on MS graph numerical parameters are an interesting tool for proteome research.
The authors thank projects funded by the Xunta de Galicia (PXIB20304PR and BTF20302PR) and the Ministerio de Sanidad y Consumo (PI061457). González-Díaz H.
acknowledges tenure track research position funded by the Program Isidro Parga Pondal, Xunta de Galici
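The abstract above compares sub-networks by topological indices such as the Zagreb (M2) index. As a minimal sketch, assuming the standard definition of the second Zagreb index (the sum over all edges of the product of the two endpoint degrees) and a simple undirected graph given as an edge list, the index could be computed as follows. The function name `zagreb_m2` and the toy graphs are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def zagreb_m2(edges):
    """Second Zagreb index: sum of deg(u) * deg(v) over all edges (u, v).

    Assumes a simple undirected graph: each edge is listed once and
    there are no self-loops.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(degree[u] * degree[v] for u, v in edges)


# Toy graphs (hypothetical, for illustration only).
path_graph = [("a", "b"), ("b", "c")]
triangle = [("a", "b"), ("b", "c"), ("a", "c")]

m2_path = zagreb_m2(path_graph)
m2_triangle = zagreb_m2(triangle)
print(m2_path, m2_triangle)
```

How the paper turns two such index values into a relative difference is not stated in the abstract, so that step is left out here.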
Statistical Methods to Enhance Clinical Prediction with High-Dimensional Data and Ordinal Response
Technological progress now makes it possible to study the molecular
configuration of single cells or of whole tissue samples. Such
high-dimensional omics data from molecular biology, produced in large
quantities, can be generated at ever lower cost and are therefore used
more and more often in clinical settings.
Personalized diagnosis, or the prediction of treatment success, on the
basis of such high-throughput data is a modern application of machine
learning techniques.
In practice, clinical parameters such as health status or the side
effects of a therapy are often recorded on an ordinal scale (for
example good, normal, bad).
It is common to treat classification problems with an ordinal endpoint
like general multi-class problems and thus to ignore the information
contained in the ordering of the classes. Neglecting this information,
however, can reduce classification quality or even produce an
unfavourable unordered classification.
Classical approaches that model an ordinal endpoint directly, such as
the cumulative link model, typically cannot be applied to
high-dimensional data.
In this work we present hierarchical twoing (hi2), an algorithm for
classifying high-dimensional data into ordinal categories. hi2 exploits
the power of the very well understood binary classification to classify
into ordinal categories as well. An open-source implementation of hi2
is available online.
In a comparison study on the classification of both real and simulated
data with an ordinal endpoint, established methods designed
specifically for ordered categories do not generally produce better
results than state-of-the-art non-ordinal classifiers. An algorithm's
ability to handle high-dimensional data dominates classification
performance. We show that our algorithm hi2 achieves consistently good
results and in many cases outperforms the other methods.
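The idea of reducing an ordinal problem to a sequence of binary decisions can be illustrated with a short sketch. This is not the hi2 algorithm itself, whose details the abstract does not give; it only assumes that the ordered classes are split recursively into a lower and an upper group and that a binary decision is made at each split. The names `predict_ordinal` and `toy_is_above`, and the thresholds, are hypothetical.

```python
def predict_ordinal(x, n_classes, is_above):
    """Walk a hierarchy of binary splits over ordered classes 0..n_classes-1.

    is_above(x, cut) stands in for a trained binary classifier that answers
    whether x belongs to the classes with index >= cut.
    """
    lo, hi = 0, n_classes  # candidate classes are lo .. hi - 1
    while hi - lo > 1:
        cut = (lo + hi) // 2
        if is_above(x, cut):
            lo = cut
        else:
            hi = cut
    return lo


# Toy stand-in for the binary classifiers: a score compared with fixed
# thresholds (assumed values, for illustration only).
THRESHOLDS = [0.33, 0.66]


def toy_is_above(x, cut):
    return x >= THRESHOLDS[cut - 1]


labels = ["bad", "normal", "good"]
predictions = [labels[predict_ordinal(x, len(labels), toy_is_above)]
               for x in (0.1, 0.5, 0.9)]
print(predictions)
```

Because every split separates a contiguous lower group from a contiguous upper group, the ordering of the classes is used at each decision, which a generic multi-class classifier would ignore.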
Extraction of pharmacokinetic evidence of drug-drug interactions from the literature
Drug-drug interaction (DDI) is a major cause of morbidity and mortality and a subject of intense scientific interest. Biomedical literature mining can aid DDI research by extracting evidence for large numbers of potential interactions from published literature and clinical databases. Though DDI is investigated in domains ranging in scale from intracellular biochemistry to human populations, literature mining has not been used to extract specific types of experimental evidence, which are reported differently for distinct experimental goals. We focus on pharmacokinetic evidence for DDI, essential for identifying causal mechanisms of putative interactions and as input for further pharmacological and pharmacoepidemiological investigations. We used manually curated corpora of PubMed abstracts and annotated sentences to evaluate the efficacy of literature mining on two tasks: first, identifying PubMed abstracts containing pharmacokinetic evidence of DDIs; second, extracting sentences containing such evidence from abstracts. We implemented a text mining pipeline and evaluated it using several linear classifiers and a variety of feature transforms. The most important textual features in the abstract and sentence classification tasks were analyzed. We also investigated the performance benefits of using features derived from PubMed metadata fields, various publicly available named entity recognizers, and pharmacokinetic dictionaries. Several classifiers performed very well in distinguishing relevant and irrelevant abstracts (reaching F1 ≈ 0.93, MCC ≈ 0.74, iAUC ≈ 0.99) and sentences (F1 ≈ 0.76, MCC ≈ 0.65, iAUC ≈ 0.83). We found that word bigram features were important for achieving optimal classifier performance and that features derived from Medical Subject Headings (MeSH) terms significantly improved abstract classification.
We also found that some drug-related named entity recognition tools and dictionaries led to slight but significant improvements, especially in classification of evidence sentences. Based on our thorough analysis of classifiers and feature transforms and the high classification performance achieved, we demonstrate that literature mining can aid DDI discovery by supporting automatic extraction of specific types of experimental evidence.
Funding: National Institutes of Health, National Library of Medicine Program, grant 01LM011945-01 "BLR: Evidence-based Drug-Interaction Discovery: In-Vivo, In-Vitro and Clinical," a grant from the Indiana University Collaborative Research Program 2013, "Drug-Drug Interaction Prediction from Large-scale Mining of Literature and Patient Records," as well as a grant from the joint program between the Fundação Luso-Americana para o Desenvolvimento (Portugal) and National Science Foundation (USA), 2012-2014, "Network Mining For Gene Regulation And Biochemical Signaling."
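The abstract above reports classifier quality as F1 and MCC. As a minimal sketch, assuming the standard definitions of both measures for a binary classifier, they could be computed from confusion-matrix counts as follows. The function names and the counts in the example are made up for illustration and are not results from the paper.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for a relevant/irrelevant abstract classifier.
tp, tn, fp, fn = 90, 95, 10, 5
print(round(f1_score(tp, fp, fn), 3), round(mcc(tp, tn, fp, fn), 3))
```

Unlike F1, MCC also takes true negatives into account, which may be why both are reported side by side when the relevant and irrelevant classes differ in size.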
Evaluation of face recognition algorithms under noise
One of the major applications of computer vision and image processing is face recognition,
where a computerized algorithm automatically identifies a person's face from
a large image dataset or even from a live video. This thesis addresses facial recognition,
a topic that has been widely studied due to its importance in many applications
in both civilian and military domains. The application of face recognition systems
has expanded from security purposes to social networking sites, managing fraud, and
improving user experience. Numerous algorithms have been designed to perform face
recognition with good accuracy. This problem is challenging due to the dynamic nature
of the human face and the different poses that it can take. Regardless of the
algorithm, facial recognition accuracy can be heavily affected by the presence of noise.
This thesis presents a comparison of traditional and deep learning face recognition
algorithms in the presence of noise. For this purpose, Gaussian and salt-and-pepper
noise are applied to the face images drawn from the ORL Dataset. The
image recognition is performed using each of the following eight algorithms: principal
component analysis (PCA), two-dimensional PCA (2D-PCA), linear discriminant
analysis (LDA), independent component analysis (ICA), discrete cosine transform
(DCT), support vector machine (SVM), convolutional neural network (CNN) and
AlexNet. The ORL dataset was used in the experiments to calculate the evaluation accuracy
for each of the investigated algorithms. Each algorithm is evaluated with two
experiments; in the first experiment only one image per person is used for training,
whereas in the second experiment, five images per person are used for training.
The investigated traditional algorithms are implemented in MATLAB and the deep
learning approaches are implemented in Python. The results show that
the best performance was obtained using the DCT algorithm with 92% dominant
eigenvalues and 95.25% accuracy, whereas for deep learning, the best performance
was obtained using a CNN with an accuracy of 97.95%, which makes it the best choice under noisy
conditions.
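The two noise types named in the abstract can be sketched in a few lines. This is a minimal illustration on an 8-bit grayscale image stored as a list of rows; the noise levels (`sigma`, `density`) and the function names are assumptions, since the abstract does not state the parameters or code used in the thesis.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to each pixel and clip to 0..255."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in image]


def add_salt_and_pepper(image, density, rng):
    """Set a fraction `density` of pixels to 0 (pepper) or 255 (salt)."""
    noisy = []
    for row in image:
        new_row = []
        for p in row:
            r = rng.random()
            if r < density / 2:
                new_row.append(0)      # pepper
            elif r < density:
                new_row.append(255)    # salt
            else:
                new_row.append(p)      # pixel left unchanged
        noisy.append(new_row)
    return noisy


rng = random.Random(0)                 # fixed seed for repeatability
image = [[128] * 4 for _ in range(3)]  # hypothetical 3x4 gray image
gaussian = add_gaussian_noise(image, sigma=20.0, rng=rng)
salt_pepper = add_salt_and_pepper(image, density=0.2, rng=rng)
```

Gaussian noise perturbs every pixel slightly, whereas salt-and-pepper noise corrupts only some pixels but pushes them to the extremes, so the two can affect a recognition algorithm quite differently.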
ATR-FTIR Spectroscopy-Linked Chemometrics: A Novel Approach to the Analysis and Control of the Invasive Species Japanese Knotweed
Japanese knotweed (Reynoutria japonica), an invasive plant species, causes negative environmental and socio-economic impacts. In the United Kingdom it exists as a female clone, and its extensive rhizome system enables rapid vegetative spread. Plasticity permits this species to occupy a broad geographic range and survive harsh abiotic conditions. It is notoriously difficult to control with traditional management strategies, which include repetitive herbicide application and costly carbon-intensive rhizome excavation. This problem is complicated by crossbreeding with the closely related species, Giant knotweed (Reynoutria sachalinensis), to give the more vigorous hybrid, Bohemian knotweed (Fallopia x Bohemica), which produces viable seed. These species, hybrids, and backcrosses form a morphologically similar complex known as Japanese knotweed "sensu lato" and are often misidentified. The research herein explores the opportunities offered by advances in the application of attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy-linked chemometrics within plant sciences, for the identification and control of knotweed, to enhance our understanding of knotweed biology, and the potential of this technique. ATR-FTIR spectral profiles of Japanese knotweed leaf material and xylem sap samples, which include important biological absorptions due to lipids, proteins, carbohydrates, and nucleic acids, were used to: identify plants from different growing regions, highlighting the plasticity of this clonal species; differentiate between related species and hybrids; and predict key physiological characteristics such as hormone concentrations and root water potential.
Technical advances were made for the application of ATR-FTIR spectroscopy to plant science, including definition of the environmental factors that exert the most significant influence on spectral profiles, evaluation of sample preparation techniques, and identification of key wavenumbers for prediction of hormone concentrations and abiotic stress. The presented results cement the position of concatenated mid-infrared spectroscopy and machine learning as a powerful approach for the study of plant biology, extending its reach beyond the field of crop science to demonstrate a potential for the discrimination between and control of invasive plant species.
Trends in application of NIR and hyperspectral imaging for food authentication
Food fraud can damage consumer health and confidence, destroy brands and generate large economic losses in the industry. Food authentication makes it possible to determine whether the composition, geographical origin, genetic variety and farming system of a food correspond to what has been declared on the label. Although there are currently standardized methods to identify certain adulterants, the complexity of the food, the complexity of the supply chain and the appearance of new adulterants require the continuous development of analytical techniques to detect food fraud. NIR and hyperspectral imaging (HSI) in tandem with chemometrics are non-destructive, non-invasive and accurate techniques for food authentication. This review focuses on NIR and HSI approaches to food authentication, including adulteration by substitution, geographical origin and farming system. In this context, the advances in NIR and HSI approaches reported since 2014 are discussed regarding their potential use in food authentication. Both techniques have been shown to offer the efficiency, precision and selectivity needed to detect adulterants and identify geographic origin, genetic variety and farming system. Portability and remote access are presented as the next step for the industrialization of NIR and HSI devices.