941 research outputs found

    Modeling of proteins

    Compendium of articles. In paper I, four proposed assumptions in the context of categorical variable mapping in protein classification problems were tested: (1) translation, (2) permutation, (3) constant, and (4) eigenvalues. The results suggest that these four assumptions are valid. In paper II, the proposed approach achieves an accuracy, sensitivity and specificity of classification forecasts of 97.69%, 95.02% and 98.26%, respectively, illustrating that a combination of DNA methylation with nonlinear methods such as artificial neural networks might be useful in the task of identifying patients with a carcinoma. In paper III, it was shown that gene expression data can be successfully analyzed with machine learning techniques in order to differentiate healthy patients from patients with interstitial lung disease systemic sclerosis (ILD-SSc). In paper IV, following a machine learning approach, it was possible to identify a list of genes that appear to be related to inflammatory bowel disease. Programa de Doctorat en Química Teòrica i Modelització Computaciona
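
The accuracy, sensitivity and specificity quoted above are standard confusion-matrix ratios. As a minimal sketch of how the three figures relate, assuming a binary patient/control classification, they can be computed as follows. The function name and the counts are hypothetical and are not taken from the papers.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp: patients correctly classified as patients
    fn: patients wrongly classified as controls
    tn: controls correctly classified as controls
    fp: controls wrongly classified as patients
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total      # share of all cases classified correctly
    sensitivity = tp / (tp + fn)      # share of patients that are detected
    specificity = tn / (tn + fp)      # share of controls that are cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts, chosen only to illustrate the arithmetic.
acc, sens, spec = classification_metrics(tp=90, fn=10, tn=95, fp=5)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting all three matters when classes are unbalanced, since a high accuracy can hide a low sensitivity.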

    Aproximaciones bioinformáticas para identificación de perfiles epigenéticos en procesos neuropatológicos

    Degenerative neurological diseases, such as Alzheimer's disease, Multiple Sclerosis and Huntington Disease, are not well understood and have a significant impact on patients' quality of life and survival. The focus of this dissertation is finding biomarkers for identifying these diseases, ideally in a rapid and reliable manner. The analysis was carried out using DNA CpG methylation data. In recent years there have been very significant technological improvements: it is currently possible to obtain the methylation levels of hundreds of thousands of CpGs in a patient quickly and reliably. It is, however, challenging to analyze this amount of new data. A reasonable way to tackle this issue is to use machine learning techniques that have proven useful in many other fields. In this dissertation I developed a nonlinear approach to identifying combinations of CpG DNA methylation data as biomarkers for Alzheimer's disease (AD). It will be shown that this approach increases the accuracy of detecting patients with AD compared with directly using all the data available. I also analyzed the case of Huntington Disease (HD). Using a nonlinear approach, I was able to reduce the number of CpGs considered from hundreds of thousands to 237. It will be shown that using only these 237 CpGs and nonlinear techniques such as artificial neural networks makes it possible to accurately differentiate between control and HD patients. Additionally, I present a technique, based on the concept of Shannon Entropy, to select CpGs as inputs for nonlinear classification algorithms. It will be shown that this approach generates accurate classifications that are a statistically significant improvement over using all the data available or randomly selecting the same number of CpGs.
The results seem to illustrate clearly that the analysis of DNA methylation data for identifying patients suffering from the degenerative neurological diseases mentioned above needs to be carried out carefully. Being able to analyze hundreds of thousands of CpG levels does not necessarily translate into better results, as some of these levels might be unrelated to the disease and only add noise to the analysis. It will be shown that the proposed algorithms generate accurate results while decreasing the number of CpGs used. For instance, in the case of Alzheimer's disease, the proposed algorithm yields a sensitivity of 0.9007 and a specificity of 0.9485. One of the underlying expectations is that in the future there will be curative treatments for these illnesses, which do not currently exist. It is also assumed that early detection, as with many other diseases, might be important when such treatments appear. With current technology it is relatively simple to analyze DNA methylation data, and hence it can become an interesting biomarker in the context of these illnesses.

    Nuevo centro Las Cruces: renovación urbana, capacitación y emprendimiento en la ciudad de Bogotá

    Undergraduate thesis. The Nuevo Centro Las Cruces project is an urban and architectural proposal that changes the image of a traditional sector of downtown Bogotá. Building on concepts such as habitat and sustainability, the city model incorporates new architectural elements that define an edge, aiming in this way to change the perception of insecurity, inequity and poverty held not only by its inhabitants but also by the rest of the city. Undergraduate. Arquitect

    Value Investing and Size Effect in the South Korean Stock Market

    There are indications that value investing strategies have been able to outperform the overall market in several countries across the globe. In this article, the specific case of South Korea is analyzed. It would appear that, from a rigorous statistical point of view, there is no strong evidence supporting the outperformance of value stocks versus growth stocks in South Korea, particularly when measured on a yearly basis. These results were consistent when using the MSCI value and growth indexes as well as when constructing portfolios using P/E, P/B, cash flow per share and average 5-year sales growth. For the majority of the years, the statistical tests performed failed to reject the hypothesis that the monthly returns come from distributions with different medians. The tests yielded rather consistent results on a yearly basis, but over long periods of time (decades) the results were more mixed, pointing in some cases to value investing outperforming over that very long time frame. It should be noted that the final value of the portfolios was rather different when using criteria, such as low P/E, typically associated with value stocks. The tests also failed to reject the hypothesis of different means for the monthly returns of small, medium and large companies.
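
The abstract does not name the statistical test used to compare the medians of monthly returns. As a hedged illustration only, the sketch below uses a two-sided permutation test on the difference in medians, which is a swapped-in technique and not necessarily the authors' method. The function name, the sample values and the number of permutations are all assumptions.

```python
import random
import statistics


def median_permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in medians.

    Illustrative stand-in test; the source does not say which test was used.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(statistics.median(a) - statistics.median(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign returns to the two groups
        diff = abs(statistics.median(pooled[:len(a)])
                   - statistics.median(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical monthly returns for one year (12 observations per group).
value_returns = [0.01 * i for i in range(12)]
growth_same = list(value_returns)                     # identical sample
growth_shifted = [r + 1.0 for r in value_returns]     # clearly different median

p_same = median_permutation_test(value_returns, growth_same)
p_shifted = median_permutation_test(value_returns, growth_shifted)
print(p_same, p_shifted)
```

With only twelve monthly observations per year, a test of this kind has limited power, which is one plausible reason yearly comparisons and multi-decade comparisons can point in different directions.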

    Identification of Systemic Sclerosis through Machine Learning Algorithms and Gene Expression

    Systemic sclerosis (SSc) is a chronic autoimmune disease that remains poorly understood. It is believed that the cause of the illness is a combination of genetic and environmental factors. The evolution of the illness also varies greatly from patient to patient. A common complication of the illness, associated with higher mortality, is interstitial lung disease (ILD). We present in this paper an algorithm (using machine learning techniques) that is able to identify, with 92.2% accuracy, patients suffering from ILD-SSc using gene expression data obtained from peripheral blood. The data were obtained from public sources (GEO accession GSE181228) and contain genetic data for 134 patients at an initial stage as well as at a follow-up date (12 months later) for 98 of these patients. Additionally, there are 45 control (healthy) cases. The algorithm also identified 172 genes that might be involved in the illness. These 172 genes appeared in all of the 20 most accurate classification models among a total of half a million models estimated. Their frequency might suggest that they are related to the illness to some degree. The proposed algorithm, besides differentiating between controls and patients, was also able to distinguish among different variants of the illness (diffuse variants). This may be significant from a treatment point of view, as the different variants have different associated prognoses.
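
The step in which genes are retained because they appear in every one of the most accurate models can be sketched as a set intersection. This is a minimal illustration under stated assumptions: each model is represented as a pair of accuracy and gene set, and the function name, gene labels and accuracies are hypothetical rather than taken from the paper.

```python
def genes_shared_by_top_models(models, top_n):
    """Return the genes used by every one of the top_n most accurate models.

    models: list of (accuracy, genes) pairs, where genes is an iterable of
    gene identifiers used as inputs by that model.
    """
    # Rank models from most to least accurate and keep the best top_n.
    best = sorted(models, key=lambda m: m[0], reverse=True)[:top_n]
    if not best:
        return set()
    # A gene is kept only if it appears in all of the selected models.
    return set.intersection(*(set(genes) for _, genes in best))


# Hypothetical models; the paper reports 20 top models out of about 500,000.
models = [
    (0.92, {"G1", "G2", "G3"}),
    (0.90, {"G1", "G3", "G4"}),
    (0.70, {"G5"}),
    (0.91, {"G1", "G3"}),
]
shared = genes_shared_by_top_models(models, top_n=3)
print(sorted(shared))
```

Requiring presence in all top models is a strict criterion; a frequency threshold would be a looser alternative, but the abstract describes the strict version.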

    Impuestos a los asalariados.

    From 1972 to 1980, the direct tax on labor generally showed a tendency to grow as a percentage of total income. From 1980, the indirect form of taxation, the value-added tax (IVA), was established at a rate of 10%, which was raised to 15% for 1983. Between 1982 and 1984, the tax percentage behaved as follows relative to previous years: workers earning between 1.5 and 3 minimum wages saw significant increases (0.3, 1.10 and 2.20%), while incomes equivalent to six or more minimum wages showed substantial reductions.

    An Entropy Approach to Multiple Sclerosis Identification

    Multiple sclerosis (MS) is a relatively common neurodegenerative illness that frequently causes a high level of disability in patients. While its cause is not fully understood, it is likely due to a combination of genetic and environmental factors. Diagnosing multiple sclerosis through a simple clinical examination can be challenging, as the evolution of the illness varies significantly from patient to patient, with some patients experiencing long periods of remission. In this regard, having a quick and inexpensive tool to help identify the illness, such as DNA CpG (cytosine-phosphate-guanine) methylation, might be useful. In this paper, a technique is presented, based on the concept of Shannon Entropy, to select CpGs as inputs for non-linear classification algorithms. It will be shown that this approach generates accurate classifications that are a statistically significant improvement over using all the data available or randomly selecting the same number of CpGs. The analysis controlled for factors such as the age, gender and smoking status of the patient. This approach managed to reduce the number of CpGs used while at the same time significantly increasing the accuracy.
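
The abstract does not give the exact selection rule, so the following is only a sketch of one way Shannon entropy could be used to rank CpGs. It assumes methylation levels are beta values in [0, 1], discretizes them into equal-width bins, computes H = -Σ p·log2(p) per CpG, and keeps the k CpGs with the highest entropy. The binning, the choice of highest rather than lowest entropy, and all names and values are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=10):
    """Shannon entropy in bits of values in [0, 1] after equal-width binning."""
    # Map each value to a bin index; 1.0 is folded into the last bin.
    counts = Counter(min(int(v * bins), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_cpgs(methylation, k):
    """Return the k CpG identifiers whose methylation levels have highest entropy.

    methylation: dict mapping a CpG identifier to its levels across samples.
    """
    ranked = sorted(methylation,
                    key=lambda cpg: shannon_entropy(methylation[cpg]),
                    reverse=True)
    return ranked[:k]


# Hypothetical methylation levels for three CpGs across four samples.
methylation = {
    "cpg_a": [0.10, 0.10, 0.10, 0.10],  # one bin occupied: 0 bits
    "cpg_b": [0.05, 0.35, 0.65, 0.95],  # four bins occupied: 2 bits
    "cpg_c": [0.20, 0.20, 0.80, 0.80],  # two bins occupied: 1 bit
}
selected = select_cpgs(methylation, k=2)
print(selected)
```

The selected CpGs would then be passed to a non-linear classifier; a CpG that is constant across all samples carries zero entropy and cannot help separate patients from controls.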

    Alzheimer Identification through DNA Methylation and Artificial Intelligence Techniques

    A nonlinear approach to identifying combinations of CpG DNA methylation data as biomarkers for Alzheimer's disease (AD) is presented in this paper. It will be shown that the presented algorithm can substantially reduce the number of CpGs used while generating forecasts that are more accurate than those obtained using all the CpGs available. It is assumed that the process can, in principle, be non-linear; hence, a non-linear approach might be more appropriate. The proposed algorithm selects which CpGs to use as input data in a classification problem that tries to distinguish between patients suffering from AD and healthy control individuals. This type of classification problem is suitable for techniques such as support vector machines. The algorithm was used both at the single-dataset level and with multiple datasets. Developing robust algorithms for multiple datasets is challenging, due to the impact that small differences in laboratory procedures have on the data obtained. The approach followed in the paper can be expanded to multiple datasets, allowing for a gradually more granular understanding of the underlying process. A 92% successful classification rate was obtained using the proposed method, which is higher than the result obtained using all the CpGs available. This is likely due to the reduction in the dimensionality of the data achieved by the algorithm, which in turn helps to reduce the risk of reaching a local minimum.

    Neural Network Aided Detection of Huntington Disease

    Huntington Disease (HD) is a degenerative neurological disease that has a significant impact on the patient's quality of life and eventually causes death. In this paper we present an approach to create a biomarker that uses DNA CpG methylation data as an input to identify HD patients. DNA CpG methylation is a well-known epigenetic marker for disease state. Technological advances have made it possible to quickly analyze hundreds of thousands of CpGs. This large amount of information might introduce noise, as potentially not all DNA CpG methylation levels will be related to the presence of the illness. In this paper, we were able to reduce the number of CpGs considered from hundreds of thousands to 237 using a non-linear approach. It will be shown that using only these 237 CpGs and non-linear techniques such as artificial neural networks makes it possible to accurately differentiate between control and HD patients. An underlying assumption in this paper is that there are no indications suggesting that the process is linear, and therefore non-linear techniques, such as artificial neural networks, are a valid tool for analyzing this complex disease. It should be noted that the dataset analyzed is relatively small. However, the results seem relatively consistent, and the analysis can be repeated with larger datasets as they become available.