105 research outputs found
Obtaining useful information for improving a subject from the results of multiple-choice exams
[Abstract] Evaluation processes should be applied to teachers and even to the subject itself, not only to students.
To this end, an analysis is proposed of the results obtained by students in the assessment test used in the subject Marcos de Desenvolvemento (Grao en Enxeñaría Informática, Facultade de Informática). The exam is multiple choice (4 options per question, only one valid, with points deducted for wrong answers). The exams are analyzed along two lines: on the one hand, the rates of correct, wrong, and blank answers for each question; on the other, the percentage of each option (a, b, c, d, blank) chosen in each question.
This simple study, automated with a spreadsheet, nevertheless yields interesting conclusions:
• It detects ambiguities or errors in the wording of questions, which generally result in a high percentage of blank answers.
• It detects knowledge gaps in some areas of the subject, which give rise to questions with a high error rate. Each question is associated with a theoretical block, so it can be established in which aspects students show more or less knowledge.
Both aspects can be used to detect errors in the design of the subject and/or the exam, making it possible to define improvement plans for upcoming academic years.
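The two lines of analysis described in the abstract above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' spreadsheet: the answer encoding (an empty string for a blank), the function names, the thresholds, and the sample data are all hypothetical.

```python
from collections import Counter

OPTIONS = ("a", "b", "c", "d", "")  # "" marks a blank answer (assumed encoding)


def question_stats(answers, key):
    """Per-question rates of correct/wrong/blank answers and option shares.

    answers: one list per student, holding that student's answer to each question.
    key: the correct option for each question.
    """
    n_students = len(answers)
    stats = []
    for q, correct in enumerate(key):
        counts = Counter(student[q] for student in answers)
        blank = counts.get("", 0)
        right = counts.get(correct, 0)
        wrong = n_students - right - blank
        stats.append({
            "correct": right / n_students,
            "wrong": wrong / n_students,
            "blank": blank / n_students,
            "options": {opt: counts.get(opt, 0) / n_students for opt in OPTIONS},
        })
    return stats


def flag_questions(stats, blank_threshold=0.4, wrong_threshold=0.5):
    """Flag questions by blank rate (possible ambiguity) or error rate (possible gap).

    The thresholds are arbitrary illustration values, not taken from the study.
    """
    return {
        "ambiguous": [q for q, s in enumerate(stats) if s["blank"] >= blank_threshold],
        "knowledge_gap": [q for q, s in enumerate(stats) if s["wrong"] >= wrong_threshold],
    }


# Hypothetical data: 4 students, 2 questions.
answers = [["a", "b"], ["a", ""], ["c", ""], ["a", "d"]]
key = ["a", "b"]
stats = question_stats(answers, key)
flags = flag_questions(stats)
```

Grouping the flagged questions by their theoretical block would then point to the areas of the subject that need attention.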
Random Forest Classification Based on Star Graph Topological Indices for Antioxidant Proteins
[Abstract] Aging and quality of life are important research topics nowadays in areas such as life sciences, chemistry, and pharmacology. People live longer and thus want to spend that extra time with a better quality of life. In this regard, there exists a tiny subset of molecules in nature, named antioxidant proteins, that may influence the aging process. However, testing every single protein in order to identify its properties is quite expensive and inefficient. For this reason, this work proposes a model in which the primary structure of the protein is represented using complex network graphs, which can be used to reduce the number of proteins to be tested for antioxidant biological activity. The graph obtained as a representation helps describe the complex system by means of topological indices. More specifically, in this work, Randić’s Star Networks have been used as well as the associated indices, calculated with the S2SNet tool. In order to simulate the existing proportion of antioxidant proteins in nature, a dataset containing 1999 proteins, of which 324 are antioxidant proteins, was created. Using this data as input, Star Graph Topological Indices were calculated with the S2SNet tool. These indices were then used as input to several classification techniques. Among the techniques utilised, the Random Forest has shown the best performance, achieving a score of 94% correctly classified instances. Although the target class (antioxidant proteins) represents a tiny subset inside the dataset, the proposed model is able to achieve a percentage of 81.8% correctly classified instances for this class, with a precision of 81.3%.
Funding: Galicia. Consellería de Economía e Industria; 10SIN105004PR | Galicia. Consellería de Economía e Industria; O9SIN010105PR | Ministerio de Economía y Competitividad; TIN-2009-0770
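The antioxidant-protein abstract above reports a per-class rate and a precision alongside the overall accuracy, because the dataset is imbalanced. The sketch below illustrates why: it computes the accuracy of a trivial majority-class classifier for the stated dataset size (1999 proteins, 324 antioxidant), and defines recall and precision. The function names and the confusion counts in the last lines are hypothetical and are not results from the paper.

```python
def majority_baseline_accuracy(total, positives):
    """Accuracy of a trivial classifier that always predicts the majority class."""
    negatives = total - positives
    return max(positives, negatives) / total


def recall(tp, fn):
    """Share of true members of the class that were recognised (per-class hit rate)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Share of instances predicted as the class that really belong to it."""
    return tp / (tp + fp)


# Dataset sizes taken from the abstract: 1999 proteins, 324 antioxidant.
baseline = majority_baseline_accuracy(1999, 324)  # about 0.838

# Hypothetical confusion counts, for illustration only.
example_recall = recall(tp=8, fn=2)
example_precision = precision(tp=8, fp=2)
```

A classifier that never predicts the antioxidant class would already reach roughly 84% accuracy on such a dataset, so the class-level figures carry most of the information.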
Bio-AIMS collection of chemoinformatics web tools based on molecular graph information and artificial intelligence models
[Abstract] Encoding molecular information into molecular descriptors is the first step of in silico chemoinformatics methods in drug design. Machine Learning methods are a complex solution for finding prediction models for specific biological properties of molecules. These models connect molecular structure information, such as atom connectivity (molecular graphs) or the physical-chemical properties of an atom or group of atoms, to the molecular activity (Quantitative Structure - Activity Relationship, QSAR). Due to the complexity of proteins, the prediction of their activity is a complicated task and the interpretation of the models is more difficult. The current review presents a series of 11 prediction models for proteins, implemented as free Web tools on an Artificial Intelligence Model Server in Biosciences, Bio-AIMS (http://bio-aims.udc.es/TargetPred.php). Six tools predict protein activity, two models evaluate drug - protein target interactions and the other three calculate protein - protein interactions. The input information is based on the protein 3D structure for nine models, 1D peptide amino acid sequence for three tools and drug SMILES formulas for two servers. The molecular graph descriptor-based Machine Learning models could be useful tools for in silico screening of new peptides/proteins as future drug targets for specific treatments.
Funding: Red Gallega de Investigación y Desarrollo de Medicamentos; R2014/025 | Instituto de Salud Carlos III; PI13/0028
The Rücker–Markov invariants of complex bio-systems: applications in parasitology and neuroinformatics
[Abstract] Rücker's walk count (WC) indices are well-known topological indices (TIs) used in Chemoinformatics to quantify the molecular structure of drugs represented by a graph in Quantitative structure–activity/property relationship (QSAR/QSPR) studies. In this work, we introduce for the first time the higher-order (kth order) analogues (WCk) of these indices using Markov chains. In addition, we report new QSPR models for large complex networks of different Bio-Systems useful in Parasitology and Neuroinformatics. The new type of QSPR models can be used for model checking to calculate numerical scores S(Lij) for links Lij (checking or re-evaluation of network connectivity) in large networks of all these fields. The method may be summarized as follows: (i) first, the WCk(j) values are calculated for all jth nodes in a complex network already created; (ii) a linear discriminant analysis (LDA) is used to seek a linear equation that discriminates experimentally confirmed connected or linked pairs of nodes (Lij = 1) from non-linked ones (Lij = 0); (iii) the new model is validated with external series of pairs of nodes; (iv) the equation obtained is used to re-evaluate the connectivity quality of the network, connecting/disconnecting nodes based on the quality scores calculated with the new connectivity function. The linear QSPR models obtained yielded the following results in terms of overall test accuracy for re-construction of complex networks of different Bio-Systems: parasite–host networks (93.14%), NW Spain fasciolosis spreading networks (71.42/70.18%) and CoCoMac Brain Cortex co-activation network (86.40%). Thus, this work can contribute to the computational re-evaluation or model checking of connectivity (collation) in complex systems of any science field.
Funding: Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo; Ibero-NBIC, 209RT-0366 | Ministerio de Ciencia e Innovación; TIN2009-0770
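As background for step (i) in the abstract above, the classical (non-Markov) walk count of a node is the number of walks of length k starting at that node, which equals the corresponding row sum of the k-th power of the adjacency matrix. The sketch below computes only this classical quantity. It does not reproduce the Markov-chain analogues WCk introduced in the paper, whose exact definition is not given in the abstract. The function name and the example graph are hypothetical.

```python
def walk_counts(adjacency, k):
    """Number of walks of length k starting at each node (row sums of A^k).

    adjacency: square 0/1 matrix as a list of lists.
    """
    n = len(adjacency)
    counts = [1] * n  # k = 0: exactly one zero-length walk per node
    for _ in range(k):
        # One more step: counts <- A * counts
        counts = [
            sum(adjacency[i][j] * counts[j] for j in range(n))
            for i in range(n)
        ]
    return counts


# Hypothetical example: a path graph 0 - 1 - 2.
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
wc1 = walk_counts(path, 1)  # node degrees
wc2 = walk_counts(path, 2)
wc3 = walk_counts(path, 3)
```

For k = 1 the walk counts reduce to the node degrees. Higher orders capture progressively wider neighbourhoods, and values of this kind are the sort of node descriptors that a discriminant model could use for pairs of nodes.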
Applied Computational Techniques on Schizophrenia Using Genetic Mutations
[Abstract] Schizophrenia is a complex disease, with both genetic and environmental influences. Machine learning techniques can be used to associate different genetic variations at different genes with a (schizophrenic or non-schizophrenic) phenotype. Several machine learning techniques were applied to schizophrenia data to obtain the results presented in this study. Considering these data, Quantitative Genotype – Disease Relationships (QDGRs) can be used for disease prediction. One of the best machine learning-based models obtained after this exhaustive comparative study was implemented online; this model is an artificial neural network (ANN). Thus, the tool offers the possibility to introduce Single Nucleotide Polymorphism (SNP) sequences in order to classify a patient with schizophrenia. Besides this comparative study, a method for variable selection, based on ANNs and evolutionary computation (EC), is also presented. This method uses half as many variables as the original ANN, and the variables obtained are among those found in other publications. In the future, QDGR models based on nucleic acid information could be expanded to other diseases.
Funding: Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo; 209RT-0366 | Xunta de Galicia; 10SIN105004PR | Instituto de Salud Carlos III; RD07/0067/0005 | Xunta de Galicia; Ref. 2009/5
Continuous improvement of teaching quality based on the analysis of assessment results
[Abstract] The objective of any teacher should be the continuous improvement of their subjects. This paper shows an approach for adapting teaching to the aspects most needed within a subject. For this, it is necessary to take note of the weaknesses shown by the students. Therefore, an exhaustive analysis of performance is proposed, beyond a simple numerical evaluation, with the aim of directing teaching efforts to the areas in which a greater need is detected. Thus, to assess theoretical knowledge, a statistical analysis is presented based on the results of the theoretical test (a multiple-choice exam), analyzing not only the number of wrong answers but also where and in what percentage they occur. For the practical part, a rubric is developed that allows an exhaustive correction of the assignments, while leaving open the possibility of recording any necessary observations on all points of interest. The proposal is presented in the context of a specific subject, Marcos de Desarrollo (Development Frameworks), since it is the subject in which it was put into practice. However, the proposed method is totally generic and can be transferred with little change to any other subject.
Markov Mean Properties for Cell Death-Related Protein Classification
[Abstract] Cell death (CD) is a dynamic biological function involved in physiological and pathological processes. Due to the complexity of CD, there is a demand for fast theoretical methods that can help to find new CD molecular targets. The current work presents the first classification model to predict CD-related proteins based on Markov Mean Properties. These protein descriptors have been calculated with the MInD-Prot tool using the topological information of the amino acid contact networks of the 2423 protein chains, five atom physicochemical properties and the protein 3D regions. The Machine Learning algorithms from Weka were used to find the best classification model for CD-related protein chains using all 20 attributes. The most accurate algorithm to solve this problem was K*. After several feature subset methods, the best model found is based on only 11 variables and is characterized by an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.992 and a true positive rate (TP Rate) of 88.2% (validation set). 7409 protein chains labeled with “unknown function” in the Protein Data Bank (PDB) were analyzed with the best model in order to predict CD-related biological activity. Thus, several proteins have been predicted to have a CD-related function in Homo sapiens: 3DRX: involved in virus-host interaction biological process, protein homooligomerization; 4DWF: involved in cell differentiation, chromatin modification, DNA damage response, protein stabilization; 1IUR: involved in ATP binding, chaperone binding; 1J7D: involved in DNA double-strand break processing, histone ubiquitination, nucleotide-binding oligomerization; 1UTU: linked with DNA repair, regulation of transcription; 3EEC: participating in cellular membrane organization, egress of virus within host cell, class mediator resulting in cell cycle arrest, negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle and apoptotic process.
Other proteins from bacteria predicted as CD-related are 2G3V: a CAG pathogenicity island protein 13 from Helicobacter pylori; 4G5A: a hypothetical protein in Bacteroides thetaiotaomicron; 1YLK: involved in the nitrogen metabolism of Mycobacterium tuberculosis; and 1XSV: with possible DNA/RNA binding domains. The results demonstrate that CD-related proteins can be predicted using molecular information encoded in the protein 3D structure, and thus that new molecular targets involved in cell-death processes can be predicted.
Funding: Xunta de Galicia; 10SIN105004PR | Instituto de Salud Carlos III; PI13/0028
S2SNet: a tool for transforming characters and numeric sequences into star network topological indices in chemoinformatics, bioinformatics, biomedical, and social-legal sciences
[Abstract] The study of complex systems such as proteins/DNA/RNA or dynamics of tax law systems can be carried out with the complex network theory. This allows the numerical quantification of the significant information contained by the sequences of amino acids, nucleotides or types of tax laws. In this paper we describe S2SNet, a new Python tool with a graphical user interface that can transform any sequence of characters or numbers into series of invariant star network topological indices. The application is based on Python reusable processing procedures that perform different functions such as reading sequence data, transforming numerical series into character sequences, changing letter codification of strings and drawing the star networks of each sequence using Graphviz package as graphical back-end. S2SNet was previously used to obtain classification models for natural/random proteins, breast/colon/prostate cancer-related proteins, DNA sequences of mycobacterial promoters and for early detection of diseases and drug-induced toxicities using the blood serum proteome mass spectrum. In order to show the extended practical potential of S2SNet, this work presents several examples of application for proteins, DNA/RNA, blood proteome mass spectra and time evolution of the financial law recurrence. The obtained topological indices can be used to characterize systems by creating classification models, clustering or pattern search with statistical, Neural Network or Machine Learning methods. The free availability of S2SNet, the flexibility of analyzing diverse systems and the Python portability make it an ideal tool in fields such as Bioinformatics, Proteomics, Genomics, and Biomedicine or Social, Economic and Political Sciences
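To make the idea of turning a sequence into a star network and then into a topological index more concrete, here is a minimal sketch. It is not S2SNet's code. It assumes a simple non-embedded layout, which the abstract does not spell out: a dummy centre node, one branch per distinct symbol, and each symbol appended to the end of its branch in order of appearance. The Randić connectivity index (the sum over edges of 1/sqrt(deg(u)·deg(v))) is used as one well-known example of a topological index. The function names are hypothetical.

```python
import math


def star_graph_edges(sequence):
    """Edges of an assumed non-embedded star graph for a sequence.

    Node 0 is the dummy centre; node i (1-based) is the i-th symbol,
    attached to the previous node on the branch of the same symbol.
    """
    last_on_branch = {}  # symbol -> last node placed on that branch
    edges = []
    for i, symbol in enumerate(sequence, start=1):
        parent = last_on_branch.get(symbol, 0)  # 0 = centre for a new branch
        edges.append((parent, i))
        last_on_branch[symbol] = i
    return edges


def randic_index(edges):
    """Randić connectivity index: sum over edges of 1 / sqrt(deg(u) * deg(v))."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(1 / math.sqrt(degree[u] * degree[v]) for u, v in edges)


# Hypothetical toy sequence with two symbol types.
edges = star_graph_edges("ABA")
index = randic_index(edges)
```

Because the branches depend only on the symbols, the same construction applies to amino acid letters, nucleotides, or any recoded numeric series, which matches the breadth of applications listed in the abstract.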
Supporting the personal autonomy of children with autism spectrum disorder: a software system design and development
Technology attracts the interest of children with ASD. Health professionals and developers are interested in designing programs that facilitate the daily lives of individuals with ASD. However, some special requirements must be considered in the software design process for people with ASD. The aim is to describe the principles and decision making of an interactive software design process for children with ASD.
Kernel-Based Feature Selection Techniques for Transport Proteins Based on Star Graph Topological Indices
[Abstract] The transport of molecules inside cells is a very important topic, especially in Drug Metabolism. Experimental testing of new proteins for the transporter molecular function is expensive and inefficient due to the large number of new peptides. Therefore, there is a need for cheap and fast theoretical models to predict transporter proteins. In the current work, the primary structure of a protein is represented as a molecular Star graph, characterized by a series of topological indices. The dataset was made up of 2,503 protein chains, out of which 413 have transporter molecular function and 2,090 have no transporter function. These indices were used as input to several classification techniques to find the best Quantitative Structure Activity Relationship (QSAR) model that can evaluate the transporter function of a new protein chain. Among several feature selection techniques, the Support Vector Machine Recursive Feature Elimination allows us to obtain a classification model based on 20 attributes with a true positive rate of 83% and a false positive rate of 16.7%.
Funding: Xunta de Galicia; 1OSIN105004P
- …
