462 research outputs found

    Smart Approach for the Design of Highly Selective Aptamer-Based Biosensors

    Aptamers are chemically synthesized single-stranded DNA or RNA oligonucleotides now widely used in sensors and nanoscale devices as highly sensitive biorecognition elements. With proper design, aptamers are able to bind to a specific target molecule with high selectivity. To date, the systematic evolution of ligands by exponential enrichment (SELEX) process is employed to isolate aptamers. Nevertheless, this method requires complex and time-consuming procedures. In silico methods comprising machine learning models have recently been proposed to reduce the time and cost of aptamer design. In this work, we present a new in silico approach allowing the generation of highly sensitive and selective RNA aptamers towards a specific target, here represented by ammonium dissolved in water. By using machine learning and bioinformatics tools, a rational design of aptamers is demonstrated. This "smart" SELEX method is experimentally validated by choosing the best five aptamer candidates obtained from the design process and applying them as functional elements in an electrochemical sensor to detect the target molecule, ammonium, at different concentrations. We observed that the use of five different aptamers leads to a significant difference in the sensor's response. This can be explained by considering the aptamers' conformational change due to their interaction with the target molecule. We studied these conformational changes using a molecular dynamics simulation and suggested a possible explanation of the experimental observations. Finally, electrochemical measurements exposing the same sensors to different molecules were used to confirm the high selectivity of the designed aptamers. The proposed in silico SELEX approach can potentially reduce the cost and time needed to identify aptamers and could potentially be applied to any target molecule.

    Investigating modularity and transparency within bioinspired connectionist architectures using genetic and epigenetic models

    Machine learning algorithms allow computers to deal with incomplete data in tasks such as speech recognition and object detection. Some machine learning algorithms take inspiration from biological systems due to useful properties such as robustness, allowing algorithms to be flexible and domain agnostic. This comes at a cost, resulting in difficulty when one attempts to understand the reasoning behind decisions. This is problematic when such models are applied in real-world situations where accountability, legality, and maintenance are of concern. Artificial gene regulatory networks (AGRNs) are a type of connectionist architecture inspired by gene regulatory mechanisms. AGRNs are of interest within this thesis due to their ability to solve tasks in chaotic dynamical systems despite their relatively small size. The overarching aim of this work was to investigate the properties of connectionist architectures to improve the transparency of their execution. Initially, the evolutionary process and internal structure of AGRNs were investigated. Following this, the creation of an external control layer used to improve the transparency of execution of an external connectionist architecture was attempted. When investigating the evolutionary process of AGRNs, pathways were found that, when followed, produced more performant networks in a shorter time frame. Evidence that AGRNs are capable of performing well despite internal interference was found when investigating their modularity, where it was also discovered that they do not develop strict modularity consistently. A control layer inspired by epigenetics that selectively deactivates nodes in trained artificial neural networks (ANNs) was developed; the analysis of its behaviour provided an insight into the internal workings of the ANN.

    Deep learning techniques for biomedical data processing

    Interest in Deep Learning (DL) has grown exponentially in the last ten years, producing a significant increase in both theoretical and applicative studies. On the one hand, the versatility and the ability to tackle complex tasks have led to the rapid and widespread diffusion of DL technologies. On the other hand, the dizzying increase in the availability of biomedical data has made classical analyses, carried out by human experts, progressively less feasible. At the same time, the need for efficient and reliable automatic tools to support clinicians, at least in the most demanding tasks, has become increasingly pressing. In this survey, we will introduce a broad overview of DL models and their applications to biomedical data processing, specifically to medical image analysis, sequence processing (RNA and proteins) and graph modeling of molecular data interactions. First, the key concepts of DL architectures will be introduced, with particular reference to neural networks for structured data, convolutional neural networks, generative adversarial models, and siamese architectures. Subsequently, their applicability for the analysis of different types of biomedical data will be shown, in areas ranging from diagnostics to the understanding of the characteristics underlying the process of transcription and translation of our genetic code, up to the discovery of new drugs. Finally, the prospects and future expectations of DL applications to biomedical data will be discussed.

    Theoretical and computational modeling of RNA-ligand interactions

    Ribonucleic acid (RNA) is a polymeric nucleic acid that plays a variety of critical roles in gene expression and regulation at the level of transcription and translation. Recently, there has been an enormous interest in the development of therapeutic strategies that target RNA molecules. Instead of modifying the product of gene expression, i.e., proteins, RNA-targeted therapeutics aims to modulate the relevant key RNA elements in the disease-related cellular pathways. Such approaches have two significant advantages. First, diseases with related proteins that are difficult or impossible to drug become druggable by targeting the corresponding messenger RNAs (mRNAs) that encode the amino acid sequences. Second, besides coding mRNAs, the vast majority of the human genome sequences are transcribed to noncoding RNAs (ncRNAs), which serve as enzymatic, structural, and regulatory elements in cellular pathways of most human diseases. Targeting noncoding RNAs would open up remarkable new opportunities for disease treatment. The first step in modeling the RNA-drug interaction is to understand the 3D structure of the given RNA target. With current theoretical models, accurate prediction of 3D structures for large RNAs from sequence remains computationally infeasible. One of the major challenges comes from the flexibility in the RNA molecule, especially in loop/junction regions, and the resulting rugged energy landscape. However, structure probing techniques, such as the “selective 2′-hydroxyl acylation analyzed by primer extension” (SHAPE) experiment, enable the quantitative detection of the relative flexibility and hence structure information of RNA structural elements. Therefore, one may incorporate the SHAPE data into RNA 3D structure prediction.
In the first project, we investigate the feasibility of using a machine-learning-based approach to predict the SHAPE reactivity from the 3D RNA structure and compare the machine-learning result to that of a physics-based model. In the second project, in order to provide a user-friendly tool for RNA biologists, we developed a fully automated web interface, “SHAPE predictoR” (SHAPER), for predicting the SHAPE profile from any given 3D RNA structure. In a cellular environment, various factors, such as metal ions and small molecules, interact with an RNA molecule to modulate RNA cellular activity. RNA is a highly charged polymer with each backbone phosphate group carrying one unit of negative (electronic) charge. In order to fold into a compact functional tertiary structure, it requires metal ions to reduce Coulombic repulsive electrostatic forces by neutralizing the backbone charges. In particular, the Mg²⁺ ion is essential for the folding and stability of RNA tertiary structures. In the third project, we introduce a machine-learning-based model, the “Magnesium convolutional neural network” (MgNet) model, to predict Mg²⁺ binding sites for a given 3D RNA structure, and show the use of the model in investigating the important coordinating RNA atoms and identifying novel Mg²⁺ binding motifs. Besides Mg²⁺ ions, small molecules, such as drug molecules, can also bind to an RNA to modulate its activities. Motivated by the tremendous potential of RNA-targeted drug discovery, in the fourth project, we develop a novel approach to predicting RNA-small molecule binding. Specifically, we develop a statistical potential-based scoring/ranking method (SPRank) to identify the native binding mode of the small molecule from a pool of decoys and estimate the binding affinity for the given RNA-small molecule complex. The results tested on a widely used data set suggest that SPRank can achieve (moderately) better performance than the current state-of-the-art models.

    Vibrational spectroscopy: are we close to finding a solution for early pancreatic cancer diagnosis?

    Pancreatic cancer (PC) is an aggressive and lethal neoplasm, ranking seventh in the world for cancer deaths, with an overall 5-year survival rate of below 10%. The knowledge about PC pathogenesis is rapidly expanding. New aspects of tumor biology, including its molecular and morphological heterogeneity, have been reported to explain the complicated "cross-talk" that occurs between the cancer cells and the tumor stroma or the nature of pancreatic ductal adenocarcinoma-associated neural remodeling. Nevertheless, there are currently no specific and sensitive diagnostic options for PC. Vibrational spectroscopy (VS) shows a promising role in the development of early diagnosis technology. In this review, we summarize recent reports about improvements in spectroscopic methodologies, briefly explain and highlight the drawbacks of each of them, and discuss available solutions. The important aspects of spectroscopic data evaluation with multivariate analysis and a convolutional neural network methodology are depicted. We conclude by presenting a study design for systemic verification of the VS-based methods in the diagnosis of PC.

    Transformative Machine Learning

    The key to success in machine learning (ML) is the use of effective data representations. Traditionally, data representations were hand-crafted. Recently it has been demonstrated that, given sufficient data, deep neural networks can learn effective implicit representations from simple input representations. However, for most scientific problems, the use of deep learning is not appropriate as the amount of available data is limited, and/or the output models must be explainable. Nevertheless, many scientific problems do have significant amounts of data available on related tasks, which makes them amenable to multi-task learning, i.e. learning many related problems simultaneously. Here we propose a novel and general representation learning approach for multi-task learning that works successfully with small amounts of data. The fundamental new idea is to transform an input intrinsic data representation (i.e., hand-crafted features) to an extrinsic representation based on what a pre-trained set of models predicts about the examples. This transformation has the dual advantages of producing significantly more accurate predictions and providing explainable models. To demonstrate the utility of this transformative learning approach, we have applied it to three real-world scientific problems: drug design (quantitative structure-activity relationship learning), predicting human gene expression (across different tissue types and drug treatments), and meta-learning for machine learning (predicting which machine learning methods work best for a given problem). In all three problems, transformative machine learning significantly outperforms the best intrinsic representation.

    An Ensemble Learning Model for COVID-19 Detection from Blood Test Samples

    Current research endeavors in the application of artificial intelligence (AI) methods to the diagnosis of COVID-19 have proven indispensable, with very promising results. Despite these promising results, there are still limitations in real-time detection of COVID-19 using reverse transcription polymerase chain reaction (RT-PCR) test data, such as limited datasets, imbalanced classes, a high misclassification rate of models, and the need for specialized research in identifying the best features and thus improving prediction rates. This study aims to investigate and apply the ensemble learning approach to develop prediction models for effective detection of COVID-19 using routine laboratory blood test results. Hence, an ensemble machine learning-based COVID-19 detection system is presented, aiming to aid clinicians in diagnosing this virus effectively. The experiment was conducted using custom convolutional neural network (CNN) models as a first-stage classifier and 15 supervised machine learning algorithms as a second-stage classifier: K-Nearest Neighbors, Support Vector Machine (Linear and RBF), Naive Bayes, Decision Tree, Random Forest, MultiLayer Perceptron, AdaBoost, ExtraTrees, Logistic Regression, Linear and Quadratic Discriminant Analysis (LDA/QDA), Passive, Ridge, and Stochastic Gradient Descent Classifier. Our findings show that an ensemble learning model based on DNN and ExtraTrees achieved a mean accuracy of 99.28% and area under curve (AUC) of 99.4%, while AdaBoost gave a mean accuracy of 99.28% and AUC of 98.8% on the San Raffaele Hospital dataset. The comparison of the proposed COVID-19 detection approach with other state-of-the-art approaches using the same dataset shows that the proposed method outperforms several other COVID-19 diagnostic methods.