2,230 research outputs found

    A computational model for anti-cancer drug sensitivity prediction

    Various methods have been developed to build models for predicting drug response in cancer treatment based on patient data through machine learning algorithms. Drug prediction models can offer better patient data classification, optimising sensitivity identification in cancer therapy for suitable drugs. In this paper, a computational model based on Deep Neural Networks has been designed for prediction of anti-cancer drug response based on genetic expression data using publicly available drug profiling datasets from the Cancer Cell Line Encyclopedia (CCLE). The model consists of several parts, including continuous drug response prediction, discretization and a drug sensitivity result output. Regularization and compression of neuron connections are also implemented to make the model compact and efficient, outperforming other widely used algorithms, such as elastic net (EN), random forest (RF), support vector regression (SVR) and a simple artificial neural network (ANN), in sensitivity analysis and predictive accuracy.
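
The abstract above mentions a discretization stage between the continuous prediction and the sensitivity output but does not say how it works. The following is only a minimal sketch of one plausible scheme, assuming tertile cut-offs and assuming that a higher predicted value means a more sensitive cell line; the function name `discretize_response` and the three labels are hypothetical and are not taken from the paper.

```python
import numpy as np

def discretize_response(predicted, low_q=1/3, high_q=2/3):
    """Map continuous predicted drug responses to three sensitivity labels.

    Assumption (not from the paper): higher values mean higher sensitivity,
    and the cut-offs are the empirical quantiles of the predictions.
    """
    predicted = np.asarray(predicted, dtype=float)
    lo, hi = np.quantile(predicted, [low_q, high_q])
    labels = np.full(predicted.shape, "intermediate", dtype=object)
    labels[predicted <= lo] = "resistant"   # bottom of the distribution
    labels[predicted >= hi] = "sensitive"   # top of the distribution
    return labels

if __name__ == "__main__":
    # Hypothetical continuous outputs of a regression model for six cell lines.
    print(discretize_response([0.1, 0.5, 0.9, 1.5, 2.0, 3.0]))
```

Quantile cut-offs keep the classes balanced on any response scale; fixed thresholds would be the alternative if the response measure has a clinically meaningful cut-off.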

    생물학적 사전 지식을 ν™œμš©ν•œ κ³ μ°¨μ›μ˜ 닀쀑 였믹슀 관계λ₯Ό μ°ΎλŠ” 컴퓨터 곡학적 μ ‘κ·Ό 방법

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, August 2021. Sun Kim. Understanding how cells function or respond to external stimuli is one of the most important questions in biology and medicine. Thanks to advances in instrumental technologies, scientists can routinely measure events within cells in single biological experiments. Notable examples are multi-omics data: sequencing of genomes, quantification of gene expression, and identification of epigenetic events that regulate expression of genes. In order to better understand cellular mechanisms, it is essential to identify regulatory relationships between multi-omics regulators and genes. However, regulatory relationships are very complex and it is infeasible to validate all condition-specific relationships experimentally. Thus, there is an urgent need for an efficient computational method to extract relationships from different types of high-dimensional omics data. 
One way to address these high-dimensional data is to incorporate external biological knowledge, such as relationships between omics and functions of genes curated in various databases. In my doctoral study, I developed three computational approaches to identify regulatory relationships from multi-omics data utilizing biological prior knowledge. The first study proposes a method to predict one-to-m relationships between miRNAs and genes. The computational challenge of miRNA target prediction is that there are many miRNA target candidates, and the ratio of false positives to false negatives needs to be adjusted. This challenge is addressed by utilizing literature knowledge to determine the association between miRNA-gene pairs and a given context. In this study, I developed ContextMMIA to predict miRNA-gene relationships from miRNA and gene expression data. ContextMMIA computes scores of miRNA-gene relationships based on statistical significance and literature relevance and prioritizes the relationships based on the scores. In experiments on breast cancer data with different prognoses, ContextMMIA predicted differentially activated miRNA-gene relationships in invasive breast cancer. The experimentally verified miRNA-gene relationships were predicted with high priority, and those genes are known to be involved in breast cancer-related pathways. The second study proposes a method to predict n-to-one relationships between regulators and a gene in drug response. The computational challenge of drug response prediction is how to integrate multi-omics data of 20,000 genes to determine drug response mediator genes. This challenge is addressed by utilizing low-dimensional embedding methods, literature knowledge of drug-gene associations, and gene-gene interaction knowledge. For this problem, I developed DRIM to predict drug response relationships from multi-omics data and drug-induced time-series gene expression data. 
DRIM uses an autoencoder, tensor decomposition, and drug-gene associations to determine n-to-one relationships from multi-omics data. Then, regulatory relationships of mediator genes are determined by gene-gene interaction knowledge and cross-correlation of drug-induced time-series gene expression data. In experiments on breast cancer cell line data, DRIM extracted mediator genes relevant to drug response and regulatory relationships of genes involved in the PI3K-Akt pathway targeted by lapatinib. In addition, DRIM revealed distinct patterns of relationships in breast cancer cell lines with different lapatinib resistance. The third study proposes a method to predict n-to-m relationships between regulators and genes. In order to predict n-to-m relationships, this study formulated an objective function that measures the deviation between observed gene expression values and estimated gene expression values derived from gene regulatory networks. The computational challenge of minimizing the objective function is to navigate a search space of relationships that grows exponentially with the number of regulators and genes. This challenge is addressed by iterative local optimization with regulator-gene interaction knowledge. In this study, I developed a two-step iterative RL-based method to predict n-to-m relationships from regulator and gene expression data. The first step explores n-to-one gene-oriented relationships, selecting regulators with a reinforcement-learning-based heuristic to add edges to the network. The second step explores one-to-m regulator-oriented relationships, stochastically selecting genes to remove edges from the network. In experiments on breast cancer cell line data, the proposed method constructed breast cancer subtype-specific networks from the regulator and gene expression profiles with a more accurate gene expression estimation than previous combinatorial optimization methods. 
Moreover, regulatory relationships involved in the networks were associated with breast cancer subtypes. In summary, in this thesis, I proposed computational methods for predicting one-to-m, n-to-one, and n-to-m relationships between multi-omics regulators and genes utilizing external domain knowledge. The proposed methods are expected to deepen our knowledge of cellular mechanisms by understanding gene regulatory interactions through analysis of ever-increasing molecular biology data such as The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia.

    Contents:
    Chapter 1 Introduction
      1.1 Biological background
        1.1.1 Multi-omics analysis
        1.1.2 Multi-omics relationships indicating cell state
        1.1.3 Biological prior knowledge
      1.2 Research problems for the multi-omics relationship
      1.3 Computational challenges and approaches in the exploring multi-omics relationship
      1.4 Outline of the thesis
    Chapter 2 Literature-based condition-specific miRNA-mRNA target prediction
      2.1 Computational Problem & Evaluation criterion
      2.2 Related works
      2.3 Motivation
      2.4 Methods
        2.4.1 Identifying genes and miRNAs based on the user-provided context
        2.4.2 Omics Score
        2.4.3 Context Score
        2.4.4 Confidence Score
      2.5 Results
        2.5.1 Pathway analysis
        2.5.2 Reproducibility of validated targets in humans
        2.5.3 Sensitivity tests when different keywords are used
      2.6 Summary
    Chapter 3 DRIM: A web-based system for investigating drug response at the molecular level by condition-specific multi-omics data integration
      3.1 Computational Problem & Evaluation criterion
      3.2 Related works
      3.3 Motivation
      3.4 Methods
        3.4.1 Step 1: Input
        3.4.2 Step 2: Identifying perturbed sub-pathway with time-series
        3.4.3 Step 3: Embedding multi-omics for selecting potential mediator genes
        3.4.4 Step 4: Construct TF-regulatory time-bounded network and identify regulatory path
        3.4.5 Step 5: Analysis result on the web
      3.5 Case study: Comparative analysis of breast cancer cell lines that have different sensitivity with lapatinib
        3.5.1 Multi-omics analysis result before drug treatment
        3.5.2 Time-series gene expression analysis after drug treatment
      3.6 Summary
    Chapter 4 Combinatorial modeling and optimization using iterative RL search for inferring sample-specific regulatory network
      4.1 Computational Problem & Evaluation criterion
      4.2 Related works
      4.3 Motivation
      4.4 Methods
        4.4.1 Formulating an objective function
        4.4.2 Overview of an iterative search method
        4.4.3 G-step for exploring n-to-one gene-oriented relationship
        4.4.4 R-step for exploring one-to-m regulator-oriented relationship
      4.5 Results
        4.5.1 Cancer cell line data
        4.5.2 Hyperparameters
        4.5.3 Quantitative evaluation
        4.5.4 Qualitative evaluation
      4.6 Summary
    Chapter 5 Conclusions
    Abstract in Korean
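
The third study in the thesis abstract above describes an objective function that measures the deviation between observed gene expression and expression estimated from a regulatory network. The abstract does not give its form, so the sketch below is an illustration under stated assumptions only: each gene is estimated by ordinary least squares from the regulators connected to it, and the deviation is the sum of squared residuals. The function name `network_deviation` and the matrix layout are hypothetical.

```python
import numpy as np

def network_deviation(X, Y, A):
    """Sum of squared differences between observed and estimated expression.

    X: samples x regulators expression matrix.
    Y: samples x genes expression matrix.
    A: regulators x genes boolean matrix; A[r, g] is True if r regulates g.
    Assumption (not from the thesis): a linear model with intercept per gene.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    A = np.asarray(A, dtype=bool)
    total = 0.0
    for g in range(Y.shape[1]):
        regs = np.flatnonzero(A[:, g])
        if regs.size == 0:
            # No incoming edge: fall back to the gene's mean expression.
            estimate = np.full(Y.shape[0], Y[:, g].mean())
        else:
            design = np.column_stack([X[:, regs], np.ones(X.shape[0])])
            coef, *_ = np.linalg.lstsq(design, Y[:, g], rcond=None)
            estimate = design @ coef
        total += float(np.sum((Y[:, g] - estimate) ** 2))
    return total

if __name__ == "__main__":
    X = np.array([[0, 1], [1, 0], [2, 1], [3, 0]])
    Y = np.array([[1], [3], [5], [7]])           # gene 0 = 2 * regulator 0 + 1
    print(network_deviation(X, Y, [[True], [False]]))   # close to 0
    print(network_deviation(X, Y, [[False], [False]]))  # larger deviation
```

A search procedure such as the two-step add-edge and remove-edge loop described in the abstract would evaluate a function of this kind for each candidate network; the search itself is not sketched here.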

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

    Tumor-on-a-chip model for advancement of anti-cancer nano drug delivery system

    Despite explosive growth in the development of nano-drug delivery systems (NDDS) targeting tumors in the last few decades, clinical translation rates are low owing to the lack of efficient models for evaluating and predicting responses. Microfluidics-based tumor-on-a-chip (TOC) systems provide a promising approach to address these challenges. The integrated engineered platforms can recapitulate complex in vivo tumor features at a microscale level, such as the tumor microenvironment, three-dimensional tissue structure, and dynamic culture conditions, thus improving the correlation between results derived from preclinical and clinical trials in evaluating anticancer nanomedicines. The specific focus of this review is to describe recent advances in TOCs for the evaluation of nanomedicine, categorized into six sections based on the drug delivery process: circulation behavior after infusion, endothelial and matrix barriers, tumor uptake, therapeutic efficacy, safety, and resistance. We also discuss current issues and future directions for an end-use perspective of TOCs

    Multi-Organs-on-Chips for Testing Small-Molecule Drugs: Challenges and Perspectives.

    Organ-on-a-chip technology has been used in testing small-molecule drugs for screening potential therapeutics and regulatory protocols. The technology is expected to boost the development of novel therapies and accelerate the discovery of drug combinations in the coming years. This has led to the development of multi-organ-on-a-chip (MOC) for recapitulating various organs involved in the drug-body interactions. In this review, we discuss the current MOCs used in screening small-molecule drugs and then focus on the dynamic process of drug absorption, distribution, metabolism, and excretion. We also address appropriate materials used for MOCs at low cost and scale-up capacity suitable for high-performance analysis of drugs and commercial high-throughput screening platforms

    Correlation between cell line chemosensitivity and protein expression pattern as new approach for the design of targeted anticancer small molecules

    BACKGROUND AND RATIONALE: Over the past few decades, several databases with a significant amount of biological data related to cancer cells and anticancer agents (e.g., the National Cancer Institute database, NCI; the Cancer Cell Line Encyclopedia, CCLE; the Genomics of Drug Sensitivity in Cancer portal, GDSC) have been developed. The huge amount of heterogeneous biological data extractable from these databanks (above all, drug response and protein expression) provides a real foundation for predictive cancer chemogenomics, which investigates the relationships between genomic traits and the response of cancer cells to drug treatment in order to identify novel therapeutic molecules and targets. Recently, many computational and statistical approaches have been proposed to integrate and correlate these heterogeneous biological data (protein expression and drug response), with the aim of assigning a putative mechanism of action to anticancer small molecules with unknown biological targets. The main limitation of all these computational methods is the need for experimental drug response data (post-screening data). From this point of view, the possibility of predicting in silico the antiproliferative activity of new or untested small molecules against specific cell lines could enable correlations to be found between the predicted drug response and protein expression of the desired target from the very earliest stages of research. Such an approach could allow the selection of compounds whose molecular mechanisms are more likely to be connected with the target of interest prior to the in vitro assays, which would be a critical aid in the design of new targeted anticancer agents. RESULTS: In the present study, we aimed to develop a new computational protocol based on the correlation of drug activity and protein expression data to support the discovery of new targeted anticancer agents. 
Compared with the approaches reported in the literature, the main novelty of the proposed protocol is the use of predicted antiproliferative activity data instead of experimental data. To this aim, in the first phase of the research, the new in silico Antiproliferative Activity Predictor (AAP) tool, able to predict the anticancer activity (expressed as GI50) of new or untested small molecules against the NCI-60 panel, was developed. The ligand-based tool, which took advantage of the consolidated expertise of the research group in the manipulation of molecular descriptors, was adequately validated, and the reliability of the prediction was further confirmed by the analysis of an in-house database and subsequent evaluation of a set of molecules selected by the NCI for the one-dose/five-dose antiproliferative assays. In the second part of the study, a new computational method to correlate drug activity data and protein expression pattern data was proposed and evaluated by analysing several case studies of targeted drugs tested by NCI, confirming the reliability of the proposed method for the biological data analysis. In the last part of the project, the proposed correlation approach was applied to design new small molecules as selective inhibitors of Cdc25 phosphatase, a well-known protein involved in carcinogenic processes. By means of this approach, integrated with other classical ligand- and structure-based techniques, it was possible to screen a large database of molecular structures and to select the ones with an optimal relationship with the target of interest. In vitro antiproliferative and enzymatic inhibition assays of the selected compounds led to the identification of new structurally heterogeneous inhibitors of Cdc25 proteins and confirmed the results of the in silico analysis. 
CONCLUSIONS: Collectively, the obtained results showed that the correlation between protein expression pattern and chemosensitivity is an innovative, alternative, and effective method to identify new modulators for the selected targets. In contrast to traditional in silico methods, the proposed protocol allows for the selection of molecular structures with heterogeneous scaffolds, which are not strictly related to the binding sites and with chemical-physical features that may be more suitable for all the pathways involved in the overall mechanism. The biological assays further corroborate the robustness and the reliability of this new approach and encourage its application in the anticancer targeted drug discovery field
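
The abstract above rests on correlating a compound's activity profile across cell lines with the expression profile of candidate target proteins across the same cell lines. The study's actual correlation measure and data layout are not stated here, so the following is only a sketch assuming Pearson correlation and a proteins-by-cell-lines expression matrix; `rank_targets_by_correlation` and all input values are hypothetical.

```python
import numpy as np

def rank_targets_by_correlation(activity, expression, protein_names):
    """Rank proteins by Pearson correlation with a compound's activity profile.

    activity: one value per cell line (e.g. a predicted -log10 GI50).
    expression: proteins x cell lines matrix, same cell-line order.
    Proteins with constant expression get a correlation of 0.
    """
    activity = np.asarray(activity, dtype=float)
    expression = np.asarray(expression, dtype=float)
    a = activity - activity.mean()
    e = expression - expression.mean(axis=1, keepdims=True)
    denom = np.sqrt((a @ a) * np.sum(e * e, axis=1))
    r = np.divide(e @ a, denom, out=np.zeros_like(denom), where=denom > 0)
    order = np.argsort(-r)  # highest correlation first
    return [(protein_names[i], float(r[i])) for i in order]

if __name__ == "__main__":
    activity = [1.0, 2.0, 3.0, 4.0]              # four hypothetical cell lines
    expression = [[2, 4, 6, 8], [4, 3, 2, 1], [1, 1, 1, 1]]
    for name, r in rank_targets_by_correlation(activity, expression, ["P1", "P2", "P3"]):
        print(name, round(r, 3))
```

A rank-based measure such as Spearman correlation could be substituted if the activity values are not expected to relate linearly to expression.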