35 research outputs found

    Knowledge graph matching with inter-service information transfer

    No full text
    Knowledge graph matching is an approach to creating entity mappings between structured data and linked data sources. This paper describes an automated, SPARQL-query-based matching engine developed for the Columns Property Annotation (CPA) task of the ISWC 2020 SemTab challenge (Semantic Web Challenge on Tabular Data to Knowledge Graph Matching). The proposed approach uses text correction via different knowledge base services/libraries, as well as numeric interval definitions to identify negligible numeric differences. The approach (submitted as TeamTR) achieved F1-scores of 0.916, 0.873 and 0.837 in Rounds 1, 2 and 3 of the CPA task, respectively.
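
    To make the numeric-interval idea concrete, below is a minimal Python sketch (not the TeamTR implementation) that asks a knowledge base which properties of a given entity hold a value within a small tolerance of a table cell's number; the DBpedia endpoint, the tolerance value and the function name are illustrative assumptions.

    # Hypothetical sketch of the CPA idea: find a property of a knowledge-base
    # entity whose numeric value falls within a tolerance interval around the
    # number seen in the table, instead of requiring exact equality.
    from SPARQLWrapper import SPARQLWrapper, JSON

    ENDPOINT = "https://dbpedia.org/sparql"  # assumed knowledge base endpoint

    def candidate_properties(subject_uri, value, tolerance=0.01):
        """Properties of subject_uri whose numeric object lies in [value*(1-t), value*(1+t)]."""
        low, high = value * (1 - tolerance), value * (1 + tolerance)
        query = f"""
        SELECT DISTINCT ?p WHERE {{
            <{subject_uri}> ?p ?o .
            FILTER(isNumeric(?o) && ?o >= {low} && ?o <= {high})
        }}"""
        sparql = SPARQLWrapper(ENDPOINT)
        sparql.setQuery(query)
        sparql.setReturnFormat(JSON)
        bindings = sparql.query().convert()["results"]["bindings"]
        return [b["p"]["value"] for b in bindings]

    # Example: which DBpedia properties of Berlin hold a value close to 3644826?
    print(candidate_properties("http://dbpedia.org/resource/Berlin", 3644826))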

    Classification of linked data sources using semantic scoring

    No full text
    Linked data sets are created using Semantic Web technologies; they are usually large, and their number is growing. Query execution is therefore costly, and knowing the content of such datasets should help in targeted querying. Our aim in this paper is to classify linked data sets by their knowledge content. Earlier projects such as LOD Cloud, LODStats, and SPARQLES analyze linked data sources in terms of content, availability and infrastructure. In these projects, linked data sets are classified and tagged principally using the VoID vocabulary and analyzed according to their content, availability and infrastructure. Although all linked data sources listed in these projects appear to be classified or tagged, there are only a limited number of studies on automated tagging and classification of newly arriving linked data sets. Here, we focus on automated classification of linked data sets using semantic scoring methods. We collected the SPARQL endpoints of 1,328 unique linked datasets from the Datahub, LOD Cloud, LODStats, SPARQLES, and SpEnD projects. We then queried textual descriptions of resources in these data sets using their rdfs:comment and rdfs:label property values. We analyzed these texts with document analysis techniques, treating every SPARQL endpoint as a separate document. In this regard, we used the WordNet semantic relations library combined with an adapted term frequency-inverse document frequency (tf-idf) analysis of the words and their semantic neighbours. From the WordNet database, we extracted information about comment/label objects in linked data sources using the hypernym, hyponym, homonym, meronym, region, topic and usage semantic relations. We obtained significant results for the hypernym and topic semantic relations: we can find words that identify data sets, which can be used for automatic classification and tagging of linked data sources. Using these words, we experimented with different classifiers and different scoring methods, which yielded better classification accuracy.
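
    A minimal sketch of the scoring pipeline described above, assuming NLTK's WordNet interface and scikit-learn's TfidfVectorizer; the endpoint texts, the restriction to hypernyms only, and the function name are illustrative assumptions rather than the paper's exact procedure.

    # Treat each endpoint's rdfs:label/rdfs:comment text as one document,
    # expand it with WordNet hypernym lemmas, then score terms with tf-idf.
    # Requires: nltk.download("wordnet")
    from nltk.corpus import wordnet as wn
    from sklearn.feature_extraction.text import TfidfVectorizer

    def expand_with_hypernyms(text):
        """Append the hypernym lemmas of every token to the document text."""
        tokens = text.lower().split()
        expanded = list(tokens)
        for tok in tokens:
            for synset in wn.synsets(tok):
                for hyper in synset.hypernyms():
                    expanded.extend(hyper.lemma_names())
        return " ".join(expanded)

    # Placeholder descriptions standing in for harvested rdfs:comment/label text.
    endpoint_docs = {
        "endpoint_a": "films directed by european directors",
        "endpoint_b": "proteins genes and biological pathways",
    }
    corpus = [expand_with_hypernyms(t) for t in endpoint_docs.values()]
    tfidf = TfidfVectorizer().fit_transform(corpus)
    print(tfidf.shape)  # (number of endpoints, vocabulary size after expansion)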

    A methodology on converting 10-K filings into a machine learning dataset and its applications

    No full text
    Companies listed on a stock exchange are required to share their annual reports with the U.S. Securities and Exchange Commission (SEC) within the first three months following the fiscal year. These reports, namely 10-K filings, are made available to the public by the SEC through the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) database. 10-K filings use standard file formats (XBRL, HTML, PDF) to publish the financial reports of the companies. Although the file formats impose a standard structure, the content and the metadata of the financial reports (e.g. tag names) are not strictly bound to a pre-defined schema. This study proposes a data collection and preprocessing method to semantify the financial reports and use the collected data for further analysis (i.e. machine learning). The analysis of eight different datasets, created during the study with the proposed data transformation methods, is presented. As a use case, five different machine learning algorithms were applied to these datasets to predict whether the corresponding company belongs to the S&P 500 index. The strong machine learning results indicate that the dataset generation methodology is successful and that the datasets are ready for further use.
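
    As an illustration of the S&P 500 use case, the sketch below trains a single classifier on a feature table extracted from 10-K filings; the CSV file name, the column names and the choice of a random forest are assumptions, not the paper's actual datasets or algorithms.

    # Predict S&P 500 membership from numeric features derived from 10-K filings.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import f1_score

    df = pd.read_csv("10k_features.csv")         # hypothetical preprocessed dataset
    X = df.drop(columns=["ticker", "in_sp500"])  # financial features from the filings
    y = df["in_sp500"]                           # 1 if the company is in the S&P 500

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    model = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)
    print("F1:", f1_score(y_test, model.predict(X_test)))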

    Low-diameter topic-based pub/sub overlay network construction with minimum–maximum node degree

    No full text
    In the construction of effective and scalable overlay networks, publish/subscribe (pub/sub) network designers prefer to keep both the diameter and the maximum node degree of the network low. However, existing algorithms are not capable of decreasing the maximum node degree and the network diameter simultaneously. To address this issue in an overlay network with multiple topics, we present a heuristic algorithm, called constant-diameter minimum–maximum degree (CD-MAX), which decreases the maximum node degree while keeping the diameter of the overlay network at most two. The proposed algorithm, which builds on the greedy merge algorithm, selects the node with the minimum number of neighbors. The output of the CD-MAX algorithm is enhanced by a refinement stage, the CD-MAX-Ref algorithm, which further reduces the maximum node degree. Simulation results indicate that the CD-MAX and CD-MAX-Ref algorithms improve the maximum node degree by up to 64% and run up to four times faster than similar algorithms.
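
    The sketch below illustrates the general greedy idea only, not the paper's exact CD-MAX or CD-MAX-Ref procedure: for every topic, the subscriber with the currently lowest degree becomes a hub and the remaining subscribers attach to it, so subscribers of the same topic stay within two hops of each other; the function name and data layout are assumptions.

    # Greedy, hub-per-topic overlay construction favouring low-degree nodes.
    from collections import defaultdict

    def build_overlay(subscriptions):
        """subscriptions: dict mapping topic -> list of node ids."""
        degree = defaultdict(int)
        edges = set()
        for topic, nodes in subscriptions.items():
            hub = min(nodes, key=lambda n: degree[n])  # lowest-degree subscriber becomes hub
            for n in nodes:
                if n != hub and (hub, n) not in edges and (n, hub) not in edges:
                    edges.add((hub, n))
                    degree[hub] += 1
                    degree[n] += 1
        return edges, dict(degree)

    edges, degree = build_overlay({"t1": [1, 2, 3], "t2": [2, 3, 4], "t3": [1, 4]})
    print(edges, "max degree:", max(degree.values()))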

    Detecting dangerous maritime refugee migration paths through cell phone activities

    No full text
    In the 21st century, the world has experienced devastating wars that have caused people to migrate, creating problems in host countries. Among possible migration routes, maritime routes are preferred because coasts cannot be controlled as strictly as alternative passages. However, maritime migration poses life-threatening risks due to unsafe boats, transportation between undesignated areas, lack of life-saving equipment, and the dangerous weather conditions chosen for covert crossings. Refugees and migrants may die or go missing at sea during these migrations. Most refugees are unaware of the high risks they face as they set out in search of better living conditions. In this study, we propose that such dangerous maritime migration activities can be detected to some extent through cell phone activity, and that early signs of migrants gathering from other regions and setting out by sea may be tracked. By collecting media reports of failed attempts by migrants and linking them to the D4R cell phone data, we obtained some indications of the feasibility of early warning systems based on the analysis of cell phone calls.
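
    As a rough illustration of how such an early-warning signal might be computed, the sketch below flags days on which a coastal cell site's call count deviates strongly from its own historical mean; the file name, column names and z-score threshold are assumptions and do not reflect the D4R data schema.

    # Flag abnormal spikes in daily call counts per coastal cell site.
    import pandas as pd

    calls = pd.read_csv("coastal_cdr_daily.csv")  # hypothetical: site_id, date, n_calls
    stats = (calls.groupby("site_id")["n_calls"]
                  .agg(["mean", "std"])
                  .rename(columns={"mean": "mu", "std": "sigma"})
                  .reset_index())
    calls = calls.merge(stats, on="site_id")
    calls["zscore"] = (calls["n_calls"] - calls["mu"]) / calls["sigma"]
    alerts = calls[calls["zscore"] > 3]           # days with unusually high activity
    print(alerts[["site_id", "date", "n_calls", "zscore"]])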

    SpEnD: Linked data SPARQL endpoints discovery using search engines

    No full text
    Linked data endpoints are online query gateways to semantically annotated linked data sources. To query these data sources, the SPARQL query language is used as the standard. Although a linked data endpoint (i.e. SPARQL endpoint) is a basic Web service, it provides a platform for federated online querying and data linking methods. For linked data consumers, SPARQL endpoint availability and discovery are crucial for live querying and semantic information retrieval. Current studies show that the availability of linked datasets is very low, while the locations of linked data endpoints change frequently. There are linked data repositories that collect and list the available linked data endpoints or resources, yet around half of the endpoints listed in existing repositories are not accessible (temporarily or permanently offline). These endpoint URLs are shared through repository websites such as Datahub.io; however, they are weakly maintained and revised only by their publishers. In this study, a novel meta-crawling method is proposed for discovering and monitoring linked data sources on the Web. We implemented the method in a prototype system named SPARQL Endpoints Discovery (SpEnD). SpEnD starts with a “search keyword” discovery process that finds keywords relevant to the linked data domain and, specifically, to SPARQL endpoints. The collected search keywords are then used to find linked data sources via popular search engines (Google, Bing, Yahoo, Yandex). Using this method, most of the SPARQL endpoints currently listed in existing endpoint repositories, as well as a significant number of new SPARQL endpoints, have been discovered. We analyze our findings in detail in comparison to the Datahub collection.
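
    The sketch below covers only the final validation step, assuming a list of candidate URLs has already been harvested from search-engine result pages for keywords such as "sparql endpoint": each URL is probed with a trivial SPARQL ASK query over the standard SPARQL protocol, and only responsive URLs are kept; the function name and candidate list are placeholders.

    # Keep only URLs that answer a SPARQL ASK query.
    import requests

    def is_live_endpoint(url, timeout=10):
        """Return True if the URL answers ASK { ?s ?p ?o } with a boolean result."""
        try:
            r = requests.get(
                url,
                params={"query": "ASK { ?s ?p ?o }"},
                headers={"Accept": "application/sparql-results+json"},
                timeout=timeout,
            )
            return r.ok and r.json().get("boolean") is True
        except (requests.RequestException, ValueError):
            return False

    candidates = ["https://dbpedia.org/sparql"]  # placeholder list of harvested URLs
    print([u for u in candidates if is_live_endpoint(u)])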

    A Discovery and Analysis Engine for Semantic Web

    No full text
    The Semantic Web promotes common data formats and exchange protocols on the Web toward better interoperability among systems and machines. Although Semantic Web technologies are being used to semantically annotate data and resources for easier reuse, the ad hoc discovery of these data sources remains an open issue. Popular Semantic Web endpoint repositories such as SPARQLES, the Linking Open Data Project (LOD Cloud), and LODStats do not include recently published datasets and are not updated frequently by the publishers. Hence, there is a need for a web-based dynamic search engine that discovers these endpoints and datasets at frequent intervals. To address this need, a novel web meta-crawling method is proposed for discovering Linked Data sources on the Web. We implemented the method in a prototype system named SPARQL Endpoints Discovery (SpEnD). In this paper, we describe the design and implementation of SpEnD, together with an analysis and evaluation of its operation, in comparison to the aforementioned static endpoint repositories in terms of time performance, availability, and size. Findings indicate that SpEnD outperforms existing Linked Data resource discovery methods.
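
    To illustrate the availability comparison at a similar level of abstraction, the short sketch below probes each discovered endpoint several times and reports the fraction of successful probes; it assumes a checker such as the is_live_endpoint function sketched earlier, and the number of rounds and pause length are arbitrary.

    # Repeatedly probe endpoints and report the fraction of successful checks.
    import time

    def availability(urls, check_endpoint, rounds=3, pause_sec=60):
        """Return, per URL, the share of probe rounds in which it responded."""
        up_counts = {u: 0 for u in urls}
        for _ in range(rounds):
            for u in urls:
                if check_endpoint(u):
                    up_counts[u] += 1
            time.sleep(pause_sec)
        return {u: up_counts[u] / rounds for u in urls}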