
    S-Match: an algorithm and an implementation of semantic matching

    We think of Match as an operator which takes two graph-like structures and produces a mapping between those nodes of the two graphs that correspond semantically to each other. Semantic matching is a novel approach where semantic correspondences are discovered by computing, and returning as a result, the semantic information implicitly or explicitly codified in the labels of nodes and arcs. In this paper we present an algorithm implementing semantic matching and discuss its implementation within the S-Match system. We also test S-Match against three state-of-the-art matching systems. The results, though preliminary, look promising, in particular with regard to precision and recall.
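    As a toy illustration of the idea, the sketch below derives semantic relations between node labels of two structures from hand-written synonym and hypernym tables. The tables, labels and relation symbols are invented for illustration; S-Match itself obtains relations from lexical resources such as WordNet and from the structure of the graphs.

```python
# A minimal sketch of label-level semantic matching, not the actual S-Match
# algorithm. SYNONYMS and HYPERNYMS are hypothetical stand-ins for a lexical
# resource.

SYNONYMS = {("image", "picture"), ("auto", "car")}
HYPERNYMS = {("dog", "animal"), ("car", "vehicle")}  # (narrower, broader)

def label_relation(a: str, b: str) -> str:
    """Return the semantic relation holding between two node labels."""
    a, b = a.lower(), b.lower()
    if a == b or (a, b) in SYNONYMS or (b, a) in SYNONYMS:
        return "="          # equivalence
    if (a, b) in HYPERNYMS:
        return "<"          # a is subsumed by b
    if (b, a) in HYPERNYMS:
        return ">"          # a subsumes b
    return "?"              # no relation found

def match(nodes_a: list[str], nodes_b: list[str]) -> list[tuple[str, str, str]]:
    """Compute a mapping between semantically corresponding node pairs."""
    return [(x, y, label_relation(x, y))
            for x in nodes_a for y in nodes_b
            if label_relation(x, y) != "?"]

print(match(["Image", "Car"], ["Picture", "Vehicle"]))
# [('Image', 'Picture', '='), ('Car', 'Vehicle', '<')]
```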

    Cross-concordances: terminology mapping and its effectiveness for information retrieval

    The German Federal Ministry for Education and Research funded a major terminology mapping initiative, which concluded in 2007. The task of this initiative was to organize, create and manage 'cross-concordances' between controlled vocabularies (thesauri, classification systems, subject heading lists), centred on the social sciences but quickly extending to other subject areas. 64 crosswalks with more than 500,000 relations were established. In the final phase of the project, a major evaluation effort was conducted to test and measure the effectiveness of the vocabulary mappings in an information system environment. The paper reports on the cross-concordance work and the evaluation results.
    Comment: 19 pages, 4 figures, 11 tables, IFLA conference 200
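    A minimal sketch of how such a crosswalk might be applied at query time, assuming it is stored as (source term, target term, relation) triples. The vocabularies, terms and ranking rule below are hypothetical and are not taken from the project.

```python
# Hypothetical cross-concordance lookup: map a query term from a source
# vocabulary into a target vocabulary via typed mapping relations.

CROSSWALK = [
    ("Arbeitsmarkt", "Labour market", "exact"),
    ("Jugendarbeitslosigkeit", "Unemployment", "broader"),
    ("Sozialpolitik", "Social policy", "exact"),
]

def expand_query(term: str, min_relation: str = "exact") -> list[str]:
    """Return target-vocabulary terms mapped from `term`, filtered by
    relation strength (exact matches rank above broader/narrower ones)."""
    ranking = {"exact": 2, "broader": 1, "narrower": 1}
    return [tgt for src, tgt, rel in CROSSWALK
            if src == term and ranking[rel] >= ranking[min_relation]]

print(expand_query("Arbeitsmarkt"))                               # ['Labour market']
print(expand_query("Jugendarbeitslosigkeit", min_relation="broader"))  # ['Unemployment']
```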

    Knowledge-based Data Processing for Multilingual Natural Language Analysis

    Natural Language Processing (NLP) empowers intelligent machines by enhancing human language understanding for language-based human-computer communication. Recent developments in processing power, as well as the availability of large volumes of linguistic data, have increased the demand for data-driven methods for automatic semantic analysis. This paper proposes multilingual data processing using feature extraction with classification based on deep learning architectures. Here, the input text data is collected in various languages and processed to remove missing and null values. Features are extracted from the processed data using Histogram Equalization based Global Local Entropy (HEGLE) and classified using a Kernel-based Radial Basis Function (Ker_Rad_BF). These architectures can be utilized to process natural language. In this article we present solutions to the multilingual sentiment analysis problem and compare precision across algorithms to identify the best option for multilingual sentiment analysis. For the HASOC dataset, the proposed HEGLE_Ker_Rad_BF achieved an accuracy of 98%, a precision of 97%, a recall of 90.5%, an F1-score of 85%, an RMSE of 55.6%, and a loss-curve analysis value of 44%. For the TRAC dataset, it achieved an accuracy of 98%, a precision of 97%, a recall of 91%, an F1-score of 87%, and an RMSE of 55%.
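    The kernel-based classification step can be pictured with a generic RBF-kernel classifier. The sketch below, assuming scikit-learn, substitutes a plain TF-IDF character n-gram extractor for the paper's HEGLE features, so it illustrates only the radial-basis-function classification idea, not the proposed pipeline; the toy texts and labels are invented.

```python
# A minimal sketch of RBF-kernel classification over multilingual text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

texts = ["das ist gut", "this is offensive", "c'est bien", "that is hateful"]
labels = [0, 1, 0, 1]   # toy labels: 0 = not offensive, 1 = offensive

# Character n-grams are a common language-agnostic feature choice.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    SVC(kernel="rbf", gamma="scale"),   # radial basis function kernel
)
model.fit(texts, labels)
print(model.predict(["this is very offensive"]))
```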

    Hybrid Search: Effectively Combining Keywords and Semantic Searches

    This paper describes hybrid search, a search method supporting both document and knowledge retrieval via the flexible combination of ontology-based search and keyword-based matching. Hybrid search smoothly copes with the lack of semantic coverage of document content, which is one of the main limitations of current semantic search methods. In this paper we define hybrid search formally, discuss its compatibility with current semantic trends and present a reference implementation: K-Search. We then show how the method outperforms both keyword-based search and pure semantic search in terms of precision and recall in a set of experiments performed on a collection of about 18,000 technical documents. Experiments carried out with professional users show that users understand the paradigm and consider it very powerful and reliable. K-Search has been ported to two applications released at Rolls-Royce plc for searching technical documentation about jet engines.
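    One simple way to picture the combination is a weighted blend of a keyword score and an ontology-annotation score, as in the hypothetical sketch below; the documents, annotations and weighting rule are invented, and K-Search's actual combination logic is richer than this.

```python
# A toy blend of keyword matching and ontology-based (annotation) matching.

DOCS = {
    "d1": {"text": "fan blade crack inspection report",
           "annotations": {"Component:FanBlade", "Issue:Crack"}},
    "d2": {"text": "combustor maintenance schedule",
           "annotations": {"Component:Combustor"}},
}

def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms appearing in the document text."""
    terms = query.lower().split()
    return sum(t in text.lower() for t in terms) / len(terms)

def semantic_score(concepts: set[str], annotations: set[str]) -> float:
    """Fraction of query concepts covered by the document annotations."""
    return len(concepts & annotations) / len(concepts) if concepts else 0.0

def hybrid_search(query: str, concepts: set[str], alpha: float = 0.5):
    """Rank documents by a weighted mix of keyword and semantic matching."""
    scored = {
        doc_id: alpha * keyword_score(query, d["text"])
                + (1 - alpha) * semantic_score(concepts, d["annotations"])
        for doc_id, d in DOCS.items()
    }
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

print(hybrid_search("fan blade crack", {"Component:FanBlade"}))
# [('d1', 1.0), ('d2', 0.0)]
```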

    SAFE: Self-Attentive Function Embeddings for Binary Similarity

    The binary similarity problem consists in determining whether two functions are similar by considering only their compiled form. Advanced techniques for binary similarity have recently gained momentum, as they can be applied in several fields, such as copyright disputes, malware analysis and vulnerability detection, and thus have an immediate practical impact. Current solutions compare functions by first transforming their binary code into multi-dimensional vector representations (embeddings), and then comparing the vectors through simple and efficient geometric operations. However, embeddings are usually derived from binary code via manual feature extraction, which may fail to capture important function characteristics, or may consider features that are not important for the binary similarity problem. In this paper we propose SAFE, a novel architecture for the embedding of functions based on a self-attentive neural network. SAFE works directly on disassembled binary functions, does not require manual feature extraction, is computationally more efficient than existing solutions (i.e., it does not incur the computational overhead of building or manipulating control flow graphs), and is more general, as it works on stripped binaries and on multiple architectures. We report the results of a quantitative and qualitative analysis showing that SAFE provides a noticeable performance improvement with respect to previous solutions. Furthermore, we show that clusters of our embedding vectors are closely related to the semantics of the implemented algorithms, paving the way for further interesting applications (e.g. semantic-based binary function search).
    Comment: Published in International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA) 201
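    Once functions are mapped to vectors, similarity checking reduces to cheap geometry. The sketch below illustrates that comparison step only; `embed` is a hypothetical placeholder standing in for SAFE's self-attentive embedding network.

```python
# Comparing function embeddings with cosine similarity.
import numpy as np

def embed(disassembled_function: str) -> np.ndarray:
    """Placeholder for SAFE's self-attentive embedding network: here we
    just derive an arbitrary unit vector from the input text."""
    rng = np.random.default_rng(abs(hash(disassembled_function)) % 2**32)
    v = rng.standard_normal(128)        # 128-d embedding, for illustration
    return v / np.linalg.norm(v)

def similarity(f1: str, f2: str) -> float:
    """Cosine similarity between two function embeddings (unit vectors)."""
    return float(embed(f1) @ embed(f2))

print(similarity("push ebp; mov ebp, esp", "push rbp; mov rbp, rsp"))
```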

    Weak signal identification with semantic web mining

    We investigate an automated identification of weak signals according to Ansoff to improve strategic planning and technological forecasting. Literature shows that weak signals can be found in the organization's environment and that they appear in different contexts. We use internet information to represent the organization's environment, selecting those websites that are related to a given hypothesis. In contrast to related research, a methodology is provided that uses latent semantic indexing (LSI) for the identification of weak signals. This improves on existing knowledge-based approaches because LSI considers aspects of meaning and is thus able to identify similar textual patterns in different contexts. A new weak signal maximization approach is introduced that replaces the commonly used prediction modeling approach in LSI. It enables calculating the largest number of relevant weak signals represented by singular value decomposition (SVD) dimensions. A case study identifies and analyses weak signals to predict trends in the field of on-site medical oxygen production. This supports the planning of research and development (R&D) for a medical oxygen supplier. As a result, it is shown that the proposed methodology enables organizations to identify weak signals from the internet for a given hypothesis. This helps strategic planners to react ahead of time.
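    The LSI step can be sketched with scikit-learn: documents are projected onto SVD dimensions, and a dimension supported by very few documents is flagged as a candidate weak signal. The corpus and threshold below are invented, and the flagging rule is a simplification of the paper's maximization approach.

```python
# Latent semantic indexing via truncated SVD over a TF-IDF matrix.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "portable oxygen concentrator for home use",
    "hospital pipeline oxygen supply systems",
    "membrane-based on-site oxygen generation pilot",
    "oxygen cylinder logistics and distribution",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(docs)

svd = TruncatedSVD(n_components=3, random_state=0)
loadings = svd.fit_transform(X)         # document-by-dimension matrix

# A dimension driven by very few documents may carry a weak signal.
for k in range(svd.n_components):
    supporters = [i for i, row in enumerate(loadings) if abs(row[k]) > 0.4]
    if len(supporters) == 1:
        print(f"candidate weak signal in dimension {k}: doc {supporters[0]}")
```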

    Forecasting the Spreading of Technologies in Research Communities

    Technologies such as algorithms, applications and formats are an important part of the knowledge produced and reused in the research process. Typically, a technology is expected to originate in the context of a research area and then spread and contribute to several other fields. For example, Semantic Web technologies have been successfully adopted by a variety of fields, e.g., Information Retrieval, Human Computer Interaction, Biology, and many others. Unfortunately, the spreading of technologies across research areas may be a slow and inefficient process, since it is easy for researchers to be unaware of potentially relevant solutions produced by other research communities. In this paper, we hypothesise that it is possible to learn typical technology propagation patterns from historical data and to exploit this knowledge (i) to anticipate where a technology may be adopted next and (ii) to alert relevant stakeholders about emerging and relevant technologies in other fields. To do so, we propose the Technology-Topic Framework, a novel approach which uses a semantically enhanced technology-topic model to forecast the propagation of technologies to research areas. A formal evaluation of the approach on a set of technologies in the Semantic Web and Artificial Intelligence areas has produced excellent results, confirming the validity of our solution.
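    As a hypothetical illustration of learning from historical adoption records, the sketch below suggests where a technology might spread next based on co-adoption overlap with other technologies. The records and the rule are invented; the actual Technology-Topic Framework uses a semantically enhanced technology-topic model rather than this simple heuristic.

```python
# Toy propagation forecast from (technology, research_area, year) records.
from collections import defaultdict

ADOPTIONS = [
    ("ontology_matching", "semantic_web", 2004),
    ("ontology_matching", "information_retrieval", 2007),
    ("word_embeddings", "information_retrieval", 2014),
    ("word_embeddings", "biology", 2017),
]

def next_areas(technology: str) -> list[str]:
    """Suggest areas a technology may spread to, based on the areas reached
    by other technologies that were co-adopted with it somewhere."""
    adopted = {a for t, a, _ in ADOPTIONS if t == technology}
    scores = defaultdict(int)
    for other_tech in {t for t, _, _ in ADOPTIONS if t != technology}:
        other_areas = {a for t, a, _ in ADOPTIONS if t == other_tech}
        if adopted & other_areas:               # co-adopted in some area...
            for area in other_areas - adopted:  # ...suggest its other areas
                scores[area] += 1
    return sorted(scores, key=scores.get, reverse=True)

print(next_areas("ontology_matching"))   # ['biology']
```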