Predicting missing annotations in Gene Ontology with Knowledge Graph Embeddings and True Path Rule
Gene Ontology (GO) and its Annotations (GOA) provide a controlled and evolving vocabulary for gene products and gene functions that is widely used in molecular biology. GO and GOA are updated and maintained both automatically from biological publications and manually by curators. These knowledge bases, however, are often incomplete for two reasons: 1) research in the biological domain is still ongoing; 2) the amount of experimental evidence might not yet be sufficient to validate annotations. In this paper, we address the gap in evidence between gene products and their annotations by making link predictions using Knowledge Graph Embedding (KGE) methods. Through the application of the True Path Rule (TPR) in the training stage of KGE, we were able to improve the performance of traditional KGE methods. We report two experimental scenarios with GO and GO Chicken Annotation datasets to show the contribution of embedding TPR to prediction accuracy.
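The True Path Rule states that a gene product annotated with a GO term is implicitly annotated with every ancestor of that term. The abstract does not say how the rule enters KGE training, so the following is only a minimal sketch of one plausible preprocessing step: expanding annotation pairs with their ancestors before they are used as training triples. The toy ontology, the term identifiers, and the function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy ontology (assumption): child term -> direct parent terms.
PARENTS: Dict[str, Set[str]] = {
    "GO:C": {"GO:B"},
    "GO:B": {"GO:A"},
    "GO:A": set(),
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return all ancestors of `term` by walking parent links iteratively."""
    seen: Set[str] = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def apply_true_path_rule(
    annotations: Set[Tuple[str, str]],
    parents: Dict[str, Set[str]],
) -> Set[Tuple[str, str]]:
    """Add (gene, ancestor) for every ancestor of each annotated term."""
    expanded = set(annotations)
    for gene, term in annotations:
        for ancestor in ancestors(term, parents):
            expanded.add((gene, ancestor))
    return expanded


if __name__ == "__main__":
    # "geneX" is an illustrative placeholder, not a real gene product.
    for pair in sorted(apply_true_path_rule({("geneX", "GO:C")}, PARENTS)):
        print(pair)
```

In this sketch the expanded pairs could be passed to any KGE trainer as additional positive triples; whether the authors do this or enforce the rule differently is not stated in the abstract.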
Design and implementation of a contracting scheme in peer-to-peer distributed systems (Denk-uç yerpaylaşan sistemlerde bir kontratlama tasarımı ve gerçekleştirimi).
Today, with the widespread use of the Internet in many areas, common procedures frequently encountered in business life, such as contracting and negotiation, need to be automated. The distributed structure of the Internet and the difficulty of gathering dispersed resources in one center require such a system to have a distributed architecture. In this study, the automation of a form of contracting through business processes was proposed and carried out for the first time. A peer-to-peer process contracting overlay, which we call Peer-Con, was developed. The system is an extension of the Java Agent Development Framework (JADE) and uses the IEEE Foundation for Intelligent Physical Agents (FIPA) Agent Communication Language (ACL) standard. The distinguishing features of the system are a cost-aware, flexible representation of process capabilities; the description of an operator that decides whether given capabilities result in an agreement; and self-organization of peer connectivity for better contracting performance. The system can easily be adapted to different domains while the core functionality remains the same. Practical use of Peer-Con is shown by two applications from different domains: Driving Route Calculation on Web Maps and Digital Signal Processing Module (DSPM) product planning. M.S. - Master of Science
Machine learning-based semantic link prediction for biomedical knowledge graphs (Biyomedikal bilgi çizgeleri için makine öğrenmesi tabanlı anlamsal bağ tahmini)
The construction of large knowledge bases containing millions of facts about real-world entities and their relationships has attracted great interest in recent times. Many databases in the biomedical field are now accessible through Semantic Web technologies, so these knowledge bases are presented as knowledge graphs in the form of Linked Data. Knowledge graphs are a powerful model for describing data, and their underlying graph structure also enables the application of graph mining algorithms. This thesis presents a number of approaches for discovering missing links and predicting new links in large knowledge graphs. A hybrid, machine learning-based approach is proposed that uses knowledge graphs to discover new links between entities in biological and biomedical information networks. Two groups of features, local and global, based on the structural and semantic properties of the knowledge graphs, are used for link prediction. Local features are based on network proximity, and global features are based on a vector representation of the semantic graph. The machine learning models trained with these two feature groups were evaluated separately and jointly. In addition, test scenarios were developed for cases that are often ignored when evaluating link prediction methods, and the proposed method was tested on these scenarios. The utility of the proposed approaches was demonstrated by successfully applying them to two problems that are important for drug discovery and public health in the biomedical field: predicting new drug-drug interactions and predicting novel drug indications. The method's superiority over existing approaches was thereby demonstrated.
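The abstract describes local features only as being "based on network proximity" and does not name specific measures. As an illustration, the sketch below computes two widely used proximity scores, common neighbors and the Jaccard coefficient, for a candidate pair of nodes. The choice of these two measures, the toy graph, and all node names are assumptions made for the example, not details from the thesis.

```python
from typing import Dict, Set

# Hypothetical undirected toy graph (assumption): node -> set of neighbors.
GRAPH: Dict[str, Set[str]] = {
    "drugA": {"target1", "target2"},
    "drugB": {"target2", "target3"},
    "target1": {"drugA"},
    "target2": {"drugA", "drugB"},
    "target3": {"drugB"},
}


def common_neighbors(adj: Dict[str, Set[str]], u: str, v: str) -> int:
    """Number of nodes adjacent to both u and v."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def jaccard(adj: Dict[str, Set[str]], u: str, v: str) -> float:
    """Shared neighbors divided by all neighbors of u or v (0.0 if none)."""
    union = adj.get(u, set()) | adj.get(v, set())
    if not union:
        return 0.0
    return common_neighbors(adj, u, v) / len(union)


if __name__ == "__main__":
    print(common_neighbors(GRAPH, "drugA", "drugB"))
    print(jaccard(GRAPH, "drugA", "drugB"))
```

Scores like these could be concatenated with embedding-based (global) features and fed to a classifier, which matches the hybrid setup the abstract outlines only in broad terms.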
Use of open linked data in bioinformatics space: A case study
In the life sciences, the semantic web can support many aspects of bio- and health informatics, with exciting applications appearing in areas ranging from plant genetics to drug discovery. Using semantic technologies with open linked data provides two kinds of advantages: the ability to search multiple datasets through a single framework, and the ability to search relationships and paths of relationships that span different datasets. The Bio2RDF project creates a network of coherently linked data across biological databases. As part of the Bio2RDF project, an integrated bioinformatics warehouse on the semantic web has been built. In this paper, a use case is defined with a query over multiple distant data sources that are semantically available through Bio2RDF. The validation of the results by traditional search techniques and a discussion of future directions are presented. © 2013 IEEE