30 research outputs found

    BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains

    The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we review the activities and outcomes of the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. To implement semantic technologies efficiently in the life sciences, participants formed sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups and discuss the observations made, the conclusions drawn, and the software development projects that emerged from these activities.

    The effect of microwave irradiation on the folding of β-peptides: a computational approach

    Technology is becoming ever more integrated into our lives. Rapid development has brought us many devices that make everyday life easier. Many of these communication, navigation, and spectroscopy devices emit microwaves during operation. Because exposure to microwave radiation is so widespread, it is important to examine its possible detrimental health effects. To this end, we investigated the effect of microwave irradiation on peptides, the building blocks of our bodies. We took a computational approach and used molecular dynamics to analyze the folding behaviour of a beta-peptide in methanol under microwave radiation. Decoupled thermostats allowed us to set the translational, vibrational, and rotational temperatures of the system of the beta-peptide in solution separately. We compared the folding behaviour under conventional heating and microwave heating and concluded that the increased rotational motion of the polar methanol molecules caused by the microwaves breaks the hydrogen bonds between the solvent and the solute, which leads to increased formation of intramolecular hydrogen bonds within the helix. This results in a more compact folding of the beta-peptide. Such compact peptides can aggregate and form amyloid fibrils, which have been linked to neurodegenerative disorders such as Alzheimer's disease.

    PROTEIN SOLUBILITY CLASSIFICATION IN BIOMEDICAL CONCEPTS SPACE

    Proteins are an essential part of every organism, and each protein has its own function, which depends largely on the protein's structure. Structure is therefore an important research topic, and researchers often isolate individual proteins from complex mixtures to study their structural properties. The isolation process is strongly influenced by the protein's solubility, since insoluble proteins are usually harder to isolate than soluble ones. In addition, low protein solubility has been linked to several diseases. For these reasons, researchers often wish to identify in advance which proteins are more likely to be soluble. As a result, several protein solubility classification methods based on supervised machine learning have been proposed. Roughly speaking, these algorithms take a set of soluble and insoluble proteins as input, learn their differences, and produce a classifier that can be used to predict solubility for new proteins. In this thesis we propose a new method for protein solubility classification, which uses text mining techniques to define protein attributes. The method extracts biomedical knowledge from the scientific literature, based on the protein's name or identification number, and presents this knowledge in the form of so-called biomedical concept attributes. These attributes are a novel way of describing proteins in the classification process, since today's state-of-the-art classification methods mostly use attributes derived from the protein's sequence. To evaluate the new method, the thesis describes the classification scheme for an empirical study that measures the impact of the new attributes on protein solubility classification. In the study, the twenty most common sequence-derived attribute datasets are analysed; we gradually add five types of biomedical concept attributes to them and measure the performance of the resulting classifiers for three commonly used classification methods. The thesis thus makes several original scientific contributions. First, it analyses protein databases that contain information about protein solubility. Second, it presents the method for extracting biomedical concept attributes. Third, it provides an original comparison of methods that use biomedical concept attributes with those that use only sequence-derived attributes, and demonstrates that the new attributes increase the performance of some classifiers. It also gives an algorithm for implementing the most successful classifier with biomedical concept attributes. Finally, it identifies types of words and word associations from the medical literature that are associated with protein solubility. The thesis comprises eight chapters, which present in detail the theoretical background of supervised machine learning, text mining, and protein structure and solubility.

    Extracting New Temporal Features to Improve the Interpretability of Undiagnosed Type 2 Diabetes Mellitus Prediction Models

    Type 2 diabetes mellitus (T2DM) often results in high morbidity and mortality. In addition, T2DM presents a substantial financial burden for individuals and their families, health systems, and societies. According to studies and reports, the incidence and prevalence of T2DM are increasing rapidly worldwide. Several models have been built to predict future T2DM onset or to detect undiagnosed T2DM in patients. In addition to the performance of such models, their interpretability is crucial for health experts, especially in personalized clinical prediction models. Data collected over 42 months from health check-up examinations and prescribed-drug data repositories of four primary healthcare providers were used in this study. We propose a framework for undiagnosed T2DM prediction consisting of LogicRegression-based feature extraction and Least Absolute Shrinkage and Selection Operator (LASSO) based prediction modeling. Performance of the models was measured using the area under the ROC curve (AUC) with corresponding confidence intervals. Results show that LogicRegression-based feature extraction produced simpler models, which are easier for healthcare experts to interpret, especially in cases with many binary features. Models developed using the proposed framework achieved an AUC of 0.818 (95% confidence interval (CI): 0.812−0.823), comparable to more complex models (i.e., models with a larger number of features) in which all features were included in prediction model development, with an AUC of 0.816 (95% CI: 0.810−0.822). However, the difference in the number of features used was significant. This study proposes a framework for building interpretable models in healthcare that can contribute to higher trust in prediction models among healthcare experts.
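
The abstract above reports AUC values with 95% confidence intervals but does not say how the intervals were computed. As an illustration only, the following minimal sketch computes AUC from its rank interpretation and a percentile bootstrap interval. The bootstrap is an assumption made here, not the study's stated method, and the function names `auc` and `bootstrap_ci` are hypothetical.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC (an assumed method, see lead-in)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap an AUC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data, invented for illustration; not taken from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores, n_boot=200)
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch; a rank-based formulation would be preferable on data of the size the study describes.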

    Influence of security mechanisms on web services interoperability

    When information systems are integrated using web services, security must be addressed appropriately. This paper examines the interoperability of web services within the WSS (Web Services Security) standard, a set of specifications intended for securing web services. We analyzed the support for developing secure web services on the Java platform and in the .NET framework, examined their interoperability, and implemented a secure service built in the Microsoft .NET environment together with its client in the Java environment. We identified and analyzed the problems that arose when connecting the two.

    Comprehensive Decision Tree Models in Bioinformatics

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of the reasoning behind the classification model are possible. Methods: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. Results: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase in accuracy for the less complex, visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains on bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that tree tuning times are significantly lower for the proposed method than for manual tuning of the decision tree.
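
The idea of tuning a tree by its dimensions alone, without consulting any performance measure, can be sketched as follows. This is a hypothetical illustration, not the paper's procedure: trees are represented as nested tuples, the candidate trees and their pruning parameters are invented, and the rule "keep the largest tree that fits the size budget" is an assumption.

```python
def depth(tree):
    """Edges on the longest root-to-leaf path; any non-tuple value is a leaf."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree)

def leaves(tree):
    """Number of leaves in a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return 1
    return sum(leaves(child) for child in tree)

def pick_by_size(candidates, max_depth, max_leaves):
    """Return the (param, tree) pair with the most leaves that fits the budget.

    Only tree dimensions are inspected; no accuracy is computed or used.
    Returns None when no candidate fits.
    """
    fitting = [
        (param, tree)
        for param, tree in candidates
        if depth(tree) <= max_depth and leaves(tree) <= max_leaves
    ]
    if not fitting:
        return None
    return max(fitting, key=lambda pair: leaves(pair[1]))

# Hypothetical trees, as if grown with three different pruning settings.
candidates = [
    (0.30, ("A", "B")),                                   # depth 1, 2 leaves
    (0.10, (("A", "B"), "C")),                            # depth 2, 3 leaves
    (0.01, ((("A", "B"), "C"), ("D", ("E", "F")))),       # depth 3, 6 leaves
]
chosen = pick_by_size(candidates, max_depth=2, max_leaves=4)
```

With a budget of depth 2 and 4 leaves, the deepest candidate is rejected for its size and the middle one is selected, so the choice depends only on what fits the display.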