151 research outputs found
Ollivier-Ricci Curvature for Hypergraphs: A Unified Framework
Bridging geometry and topology, curvature is a powerful and expressive invariant. While the utility of curvature has been theoretically and empirically confirmed in the context of manifolds and graphs, its generalization to the emerging domain of hypergraphs has remained largely unexplored. On graphs, Ollivier-Ricci curvature measures differences between random walks via Wasserstein distances, thus grounding a geometric concept in ideas from probability and optimal transport. We develop ORCHID, a flexible framework generalizing Ollivier-Ricci curvature to hypergraphs, and prove that the resulting curvatures have favorable theoretical properties. Through extensive experiments on synthetic and real-world hypergraphs from different domains, we demonstrate that ORCHID curvatures are both scalable and useful to perform a variety of hypergraph tasks in practice.
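For orientation, the standard graph-level definition that the abstract alludes to can be written as below. This is the usual textbook form for graphs, not the hypergraph generalization developed in the paper; the symbols are assumptions introduced here: \mu_x is the random-walk probability measure at node x, W_1 is the Wasserstein-1 distance, and d is the shortest-path distance.

```latex
\kappa(x, y) \;=\; 1 - \frac{W_1(\mu_x, \mu_y)}{d(x, y)}, \qquad x \neq y
```

Under this form, curvature is positive when the random walks started at x and y are closer to each other (in transport cost) than x and y themselves, and negative when they are farther apart.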
Developing a Prediction Model for Author Collaboration in Bioinformatics Research Using Graph Mining Techniques and Big Data Applications
Scientific collaboration has increased dramatically due to web-based technologies, advanced communication systems, and information and scientific databases. The present study aims to provide a predictive model for author collaborations in bioinformatics research output using graph mining techniques and big data applications. The study is applied-developmental research adopting a mixed-method approach, i.e., a mix of quantitative and qualitative measures. The research population consisted of all bioinformatics research documents indexed in PubMed (n = 699,160). The correlations of bioinformatics articles were examined in terms of weight and strength based on article sections including title, abstract, keywords, journal title, and author affiliation using graph mining techniques and big data applications. Eventually, the prediction model of author collaboration in bioinformatics research was developed using the abovementioned tools and expert-assigned weights. The calculations and data analysis were carried out using Expert Choice, Excel, and Spark, with the Scala and Python programming languages, on a big data server. Accordingly, the research was conducted in three phases: 1) identifying and weighting the factors contributing to authors’ similarity measurement; 2) implementing the co-authorship prediction model; and 3) integrating the first and second phases (i.e., integrating the weights obtained in the previous phases). The results showed that journal title, citation, article title, author affiliation, keywords, and abstract scored 0.374, 0.374, 0.091, 0.075, 0.055, and 0.031, respectively. Moreover, the journal title achieved the highest score in the model for the co-author recommender system. As the data in bibliometric information networks is static, using content-based features for similarity measures proved remarkably effective, so that the recommender system can offer the most suitable collaboration suggestions.
The model is expected to work efficiently in other databases and to provide suitable recommendations for author collaborations in other subject areas. By integrating expert opinion and systemic weights, the model can help alleviate the current information overload and make it easier for authors to find collaborators. https://dorl.net/dor/20.1001.1.20088302.2021.19.2.1.
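A minimal sketch of how the reported weights could combine per-feature similarities into a single author-similarity score. The weight values are taken from the abstract; the weighted-sum combination, the function name `combined_similarity`, and the example similarity values are assumptions for illustration, not the study's implementation.

```python
# Feature weights as reported in the abstract (they sum to 1.0).
WEIGHTS = {
    "journal_title": 0.374,
    "citation": 0.374,
    "article_title": 0.091,
    "author_affiliation": 0.075,
    "keywords": 0.055,
    "abstract": 0.031,
}


def combined_similarity(feature_sims):
    """Weighted sum of per-feature similarities.

    Assumption: each similarity lies in [0, 1]; features missing from
    `feature_sims` are treated as similarity 0.
    """
    return sum(w * feature_sims.get(name, 0.0) for name, w in WEIGHTS.items())


# Hypothetical similarities between two authors (illustrative values only).
example = {"journal_title": 1.0, "citation": 0.5, "keywords": 0.2}
score = combined_similarity(example)
print(round(score, 3))
```

Because the weights sum to 1, the combined score stays in [0, 1] whenever the individual similarities do, which makes scores comparable across author pairs.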
Implementation of a topic map data model for a Web-based information resource
Cataloged from PDF version of article. The Web has become a vast information resource in recent years. Millions of
people use the Web on a regular basis and the number is increasing rapidly. The
Web is the world's largest venue for social, economic, educational, and
other activities, and anyone from anywhere in the world can visit this
huge place without leaving their seat. Due to
its size, finding desired data on the Web in a timely and cost-effective way is
a problem of wide interest. In the last several years, many search engines have
been created to help Web users find desired information. However, most of these
search engines employ topic-independent search methods that rely heavily on
keyword-based approaches where the users are presented with a lot of
unnecessary search results.
In this thesis, we present a data model using topic maps standards for Web-based
information resources. In this model, topics, topic associations and topic
occurrences (called topic metalinks and topic sources in this study) are the
fundamental concepts. In fact, the presented model is a metadata model that
describes the content of the Web-based information resource and creates virtual
knowledge maps over the modeled information resource. Thus, semantic indexing
of the Web-based information resource is performed for allowing efficient search
and querying the data on the resource.
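The three concepts named above (topics, topic metalinks, topic sources) can be sketched as a small data structure. This is an illustrative assumption about one possible shape, not the thesis's actual schema; the class names, field names, and example URL are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    # Topic sources (occurrences): locations of resources about the topic.
    sources: list = field(default_factory=list)


@dataclass
class Metalink:
    # A typed association between two topics, identified by name.
    kind: str
    from_topic: str
    to_topic: str


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)
    metalinks: list = field(default_factory=list)

    def add_topic(self, name, source=None):
        """Create the topic if needed and optionally attach a source."""
        topic = self.topics.setdefault(name, Topic(name))
        if source is not None:
            topic.sources.append(source)
        return topic

    def relate(self, kind, from_topic, to_topic):
        """Record a metalink, creating either topic if it does not exist."""
        self.add_topic(from_topic)
        self.add_topic(to_topic)
        self.metalinks.append(Metalink(kind, from_topic, to_topic))

    def related(self, name):
        """Names of topics linked to `name` in either direction, sorted."""
        found = set()
        for link in self.metalinks:
            if link.from_topic == name:
                found.add(link.to_topic)
            elif link.to_topic == name:
                found.add(link.from_topic)
        return sorted(found)


tm = TopicMap()
tm.add_topic("search engines", "https://example.org/search-engines")  # hypothetical URL
tm.relate("subtopic_of", "search engines", "information retrieval")
print(tm.related("search engines"))
```

Keeping metalinks separate from the resources themselves is what lets such a model act as a metadata layer over the underlying documents.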
Additionally, we employ full text indexing in the presented model using a
widely accepted method, the inverted file index. Due to the rapid increase of
data, the dynamic update of the inverted file index during the addition of new
documents is inevitable. We have implemented an efficient dynamic update
scheme in the presented model for the employed inverted file index method.
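As a rough illustration of an inverted file index that is updated as documents arrive, here is a minimal in-memory sketch. It is not the update scheme implemented in the thesis (which the abstract does not describe); the class name, whitespace tokenization, and AND-query semantics are assumptions.

```python
from collections import defaultdict


class InvertedIndex:
    """Maps each term to the documents containing it (with term counts)."""

    def __init__(self):
        # term -> {doc_id: term frequency}
        self.postings = defaultdict(dict)

    def add_document(self, doc_id, text):
        """Update postings in place; no rebuild of the existing index."""
        for term in text.lower().split():
            counts = self.postings[term]
            counts[doc_id] = counts.get(doc_id, 0) + 1

    def search(self, query):
        """Return ids of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], {}))
        for term in terms[1:]:
            result &= set(self.postings.get(term, {}))
        return result


index = InvertedIndex()
index.add_document(1, "topic maps for the web")
index.add_document(2, "web search engines")
index.add_document(3, "topic web")  # added later; only affected postings change
print(sorted(index.search("topic web")))
```

Adding a document touches only the posting lists of the terms it contains, which is the property a dynamic update scheme aims to preserve at disk scale.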
The presented topic map data model combines the strengths of both
keyword-based search and topic-centric search methods. We also provide a
prototype search engine verifying that our model contributes substantially
to the problem of efficient and effective search and querying of Web-based
information resources. Kutlutürk, Mustafa. M.S.
Sustainability in software engineering: a systematic literature review
Background: Supporting sustainability in software engineering is becoming an active area of research. We want to contribute the first Systematic Literature Review (SLR) in this field to aid researchers who are motivated to contribute to that topic by providing a body of knowledge as a starting point, because we know from our own experience that this search can be tedious and time-consuming. Aim: We aim to provide an overview of different aspects of sustainability in software engineering research with regard to research activity, investigated topics, identified limitations, proposed approaches, used methods, available studies, and considered domains. Method: The applied method is an SLR in five reliable and commonly used databases according to the (quasi-standard) protocol by Kitchenham et al. [1]. We assessed the first 100 results of each database ordered by relevance with respect to the search query. Results: Of 500 classified publications, we regard 96 as relevant for our research questions. We sketch a taxonomy of their topics and domains, and provide lists of used methods and proposed approaches. Most of the excluded publications were ruled out because of an unfitting usage of terms within the search query. Conclusions: Currently, there is little research coverage on the different aspects of sustainability in software engineering while other disciplines are already more active. Future work includes extending the study by reviewing a higher number of publications, including dedicated journal and workshop searches, and snowballing. Peer Reviewed. Postprint (author's final draft)
Metadata-based and personalized web querying
Cataloged from PDF version of article. The advent of the Web has raised new searching and querying problems.
Keyword-matching-based querying techniques, which have been widely used by search
engines, return thousands of Web documents for a single query, and most of these
documents are generally unrelated to the users’ information needs. Towards the
goal of better meeting the information needs of Web users, a recent promising
approach is to index the Web by using metadata and annotations.
In this thesis, we model and query Web-based information resources using
metadata for improved Web searching capabilities. Employing metadata for
querying the Web increases the precision of the query outputs by returning semantically
more meaningful results. Our Web data model, named “Web information
space model”, consists of Web-based information resources (HTML/XML documents
on the Web), expert advice repositories (domain-expert-specified metadata
for information resources), and personalized information about users (captured
as user profiles that indicate users’ preferences about experts as well as users’
knowledge about topics). Expert advice is specified using topics and relationships
among topics (i.e., metalinks), along the lines of the recently proposed topic maps
standard. Topics and metalinks constitute metadata that describe the contents of
the underlying Web information resources. Experts assign scores to topics, metalinks,
and information resources to represent their “importance”. User
profiles store users’ preferences and navigational history information about the
information resources that the user visits. User preferences, knowledge level on
topics, and history information are used for personalizing the Web search, and
improving the precision of the results returned to the user.
We store expert advice and user profiles in an object-relational database
management system, and extend SQL for efficient querying of Web-based information
resources through the Web information space model. SQL extensions
include the clauses for propagating input importance scores to output tuples, the
clause that specifies query stopping condition, and new operators (i.e., text similarity
based selection, text similarity based join, and topic closure). Importance
score propagation and query stopping condition allow ranking of query outputs,
and limiting the output size. Text similarity based operators and topic closure
operator support sophisticated querying facilities. We develop a new algebra
called Sideway Value generating Algebra (SVA) to process these SQL extensions.
We also propose evaluation algorithms for the text similarity based SVA directional
join operator, and report experimental results on the performance of the
operator. We demonstrate experimentally the effectiveness of metadata-based
personalized Web search through SQL extensions over the Web information space
model against keyword matching based Web search techniques. Özel, Selma Ayşe. Ph.D.
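To make the idea of a text-similarity-based join with propagated importance scores and a stopping condition concrete, here is a small sketch. It does not reproduce the thesis's SQL syntax or the SVA operators; cosine similarity over term counts, multiplying the input scores with the similarity, and the names `similarity_join`, `threshold`, and `max_results` are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between whitespace-tokenized term-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def similarity_join(left, right, threshold=0.3, max_results=None):
    """Join (text, importance) pairs whose texts are similar enough.

    Assumption: the output score is the product of both input importance
    scores and the text similarity. Results are ranked by that score, and
    `max_results` plays the role of a simple stopping condition.
    """
    matches = []
    for left_text, left_score in left:
        for right_text, right_score in right:
            sim = cosine_similarity(left_text, right_text)
            if sim >= threshold:
                matches.append((left_text, right_text,
                                left_score * right_score * sim))
    matches.sort(key=lambda m: m[2], reverse=True)
    return matches if max_results is None else matches[:max_results]


# Hypothetical topics and documents with expert-assigned importance scores.
topics = [("web search", 0.9), ("graph mining", 0.5)]
documents = [("web search engines", 1.0), ("cooking recipes", 1.0)]
for left_text, right_text, score in similarity_join(topics, documents, max_results=1):
    print(left_text, "|", right_text, "|", round(score, 3))
```

This nested-loop version compares every pair; the evaluation algorithms mentioned in the abstract presumably exist to avoid that cost on larger inputs.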