32 research outputs found

Evaluation and Improvement of Semantically-Enhanced Tagging System

Author: Alsharif Majdah Hussain
Publication venue: Software Technology Research Laboratory
Publication date: 01/12/2013

The Social Web, or ‘Web 2.0’, focuses on interaction and collaboration between website users. It is credited with the existence of tagging systems, amongst other things such as blogs and wikis. Tagging systems like YouTube and Flickr offer their users simplicity and freedom in creating and sharing their own content, and folksonomy is therefore a very active research area in which many improvements have been proposed to overcome existing disadvantages such as the lack of semantic meaning, ambiguity, and inconsistency. TE is a tagging system that proposes solutions to the problems of multilingualism, lack of semantic meaning, and shorthand writing (which is very common in the social web) with the aid of semantic and social resources. The current research presents an addition to the TE system in the form of an embedded stemming component that addresses the problem of differing lexical forms. Before this, the TE system had to be explored thoroughly and its efficiency determined, in order to decide whether embedding additional components to enhance performance was practical. This involved analysing the algorithm analytically to determine its time and space complexity. TE has a time growth rate of O(N²), which is polynomial, so the algorithm is considered efficient. Nonetheless, recommended modifications such as batch SQL execution can improve this. Regarding space complexity, the number of tags per photo represents the problem size; as it grows, the required memory grows linearly. Based on these findings, the TE system is re-implemented on Flickr instead of YouTube because of a recent YouTube restriction; Flickr is a platform of greater benefit to a multilingual tagging system, since the language barrier is meaningless in this case. The re-implementation is achieved using ‘flickrj’ (a Java interface for the Flickr APIs).
Next, the stemming component is added to normalise tags before the ontologies are queried. The component is embedded using the Java implementation of the Porter2 stemmer, which supports many languages, including Italian. The impact of the stemming component on the performance of the TE system, in terms of the size of the index table and the number of retrieved results, is investigated in an experiment that showed a 48% reduction in the size of the index table. This also means that search queries have fewer system tags to compare against the search keywords, which can speed up the search. Furthermore, the experiment runs similar search trials on two versions of the TE system, one without the stemming component and one with it, and found that the latter produced more results, provided that it works with valid words and valid stems. Embedding the stemming component in the new TE system has lessened the storage overhead of the generated system tags by reducing the size of the index table, which makes the system suited to many applications such as text classification, summarization, email filtering, and machine translation.
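The effect of tag normalisation on the index table can be illustrated with a minimal sketch. It is not the TE implementation: `toy_stem` is a hypothetical suffix stripper standing in for the Porter2 stemmer, and `build_index` and the sample tags are assumptions made only for this example.

```python
# Toy suffix stripper used for illustration only.
# ASSUMPTION: this is NOT the Porter2 algorithm used in the thesis.
SUFFIXES = ("ing", "es", "ed", "s")


def toy_stem(tag: str) -> str:
    """Lower-case a tag and strip one common English suffix, if any."""
    word = tag.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(tags, normalise=None):
    """Map each (optionally normalised) tag to the set of photo ids using it."""
    index = {}
    for photo_id, tag in tags:
        key = normalise(tag) if normalise else tag.lower()
        index.setdefault(key, set()).add(photo_id)
    return index


# Hypothetical (photo id, tag) pairs.
SAMPLE_TAGS = [
    (1, "garden"), (2, "gardens"), (3, "gardening"),
    (4, "walk"), (5, "walked"), (6, "walking"),
]

plain_index = build_index(SAMPLE_TAGS)
stemmed_index = build_index(SAMPLE_TAGS, normalise=toy_stem)

print(len(plain_index), len(stemmed_index))
print(sorted(stemmed_index[toy_stem("Gardens")]))
```

In this toy data the stemmed index has fewer keys than the plain one, and a query for one lexical form retrieves photos tagged with the others. The size of the reduction depends entirely on the sample and says nothing about the 48% figure reported in the abstract.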

De Montfort University Open Research Archive

A Survey on Semantic Processing Techniques

Author: Cambria Erik
Chen Guanyi
He Kai
Mao Rui
Ni Jinjie
Yang Zonglin
Zhang Xulang
Publication date: 22/10/2023

Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyze five semantic processing tasks: word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions. Comment: Published in Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missing from the published version due to the publication policies. Please contact Prof. Erik Cambria for details.

arXiv.org e-Print Archive

Exploring Text Mining and Analytics for Applications in Public Security: An in-depth dive into a systematic literature review

Author: Carvalho Victor Diogho Heuer de
Costa Ana Paula Cabral Seixas
Publication venue: SciELO Preprints
Publication date: 19/01/2023

Text mining and related analytics emerge as a technological approach to support human activities in extracting useful knowledge through texts in several formats. From a managerial point of view, it can help organizations in planning and decision-making processes, providing information that was not previously evident through textual materials produced internally or even externally. In this context, within the public/governmental scope, public security agencies are great beneficiaries of the tools associated with text mining, in several aspects, from applications in the criminal area to the collection of people's opinions and sentiments about the actions taken to promote their welfare. This article reports details of a systematic literature review focused on identifying the main areas of text mining application in public security, the most recurrent technological tools, and future research directions. The searches covered four major article bases (Scopus, Web of Science, IEEE Xplore, and ACM Digital Library), selecting 194 materials published between 2014 and the first half of 2021, among journals, conferences, and book chapters. There were several findings concerning the targets of the literature review, as presented in the results of this article

SciELO Preprints

Graph machine learning approaches to classifying the building and ground relationship: Architectural 3D topological model to retrieve similar architectural precedents

Author: Alymani Abdulrahman

Architects struggle to choose the best form for how a building meets the ground and may benefit from a suggestion based on precedents. A precedent suggestion may help architects decide how the building should meet the ground. Machine learning (ML), as a part of artificial intelligence (AI), can play a role in this scenario by determining the most appropriate relationship from a set of examples provided by trained architects. A key feature of the system is that it classifies three-dimensional (3D) prototypes of architectural precedent models using a topological graph instead of two-dimensional (2D) images. The trained classifier then predicts and retrieves similar architectural precedents to enable the designer to develop or reconsider their design. The research uses mixed methods. A qualitative interview validates the taxonomy collected in the literature review and image sorting survey to study the similarity of human classification of the building and ground relationship (BGR). Moreover, the researcher uses two primary technologies in developing the BGR tool. The first is a software library (Topologic) that enhances the representation of 3D models by using non-manifold topology. The second is an end-to-end deep graph convolutional neural network (DGCNN). This study employs a two-stage experimental workflow. In the first stage, a sizable synthetic database of building relationships and ground topologies is created by generative simulation for a 3D prototype of architectural precedents. These topologies are then converted into semantically rich topological dual graphs. In the second stage, the prototype architectural graphs are imported into the DGCNN model for graph classification. The results of this experiment show that this approach can recognise architectural forms using more semantically relevant and structured data, and that using a unique data set prevents direct comparison.
Our experiments have shown that the proposed workflow achieves highly accurate results that align with DGCNN’s performance on benchmark graphs. Additionally, the study demonstrates the effectiveness of using different machine learning approaches, such as Deep Graph Library (DGL) and Unsupervised Graph Level Representation Learning (UGLRL). This research demonstrates the potential of AI to help designers identify the topology of architectural solutions and place them within the most relevant architectural canons
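The conversion of a topological model into a dual graph, mentioned in the abstract above, can be sketched in a few lines. This is a simplified illustration and does not use the Topologic API. The cell names, face ids and the `dual_graph` helper are hypothetical, and the sketch assumes one common reading of a dual graph: one node per cell, and an edge wherever two cells share a face.

```python
from itertools import combinations

# Hypothetical toy model: each cell (a 3D space) is described by the
# ids of its bounding faces. Names and ids are assumptions.
CELLS = {
    "ground": {"f1", "f2"},
    "podium": {"f2", "f3"},
    "tower": {"f3", "f4"},
}


def dual_graph(cells):
    """Return (nodes, edges): one node per cell, one edge per shared face."""
    edges = set()
    # Sort by cell name so the output order is deterministic.
    for (a, faces_a), (b, faces_b) in combinations(sorted(cells.items()), 2):
        if faces_a & faces_b:  # the two cells share at least one face
            edges.add((a, b))
    return sorted(cells), sorted(edges)


nodes, edges = dual_graph(CELLS)
print(nodes)
print(edges)
```

A graph classifier such as a DGCNN would take node and edge lists of this kind, together with node features, as input. The feature encoding used in the thesis is not described in the abstract and is left out here.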

Online Research @ Cardiff

KEER2022

Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2022

Pre-title: KEER2022. Diversities. Resource description: 25 July 202

UPCommons. Portal del coneixement obert de la UPC

Theories of Informetrics and Scholarly Communication

Publication venue: 'Walter de Gruyter GmbH'
Publication date: 10/02/2021

Scientometrics has become an essential element in the practice and evaluation of science and research, including both the evaluation of individuals and national assessment exercises. Yet, researchers and practitioners in this field have lacked clear theories to guide their work. As early as 1981, then doctoral student Blaise Cronin published "The need for a theory of citing", a call to arms for the fledgling scientometric community to produce foundational theories upon which the work of the field could be based. More than three decades later, the time has come to reach out to the field again and ask how it has responded to this call. This book compiles the foundational theories that guide informetrics and scholarly communication research. It is a much-needed compilation by leading scholars in the field that gathers together the theories that guide our understanding of authorship, citing, and impact.

Directory of Open Access Books (DOAB)

Theories of Informetrics and Scholarly Communication

Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2016

Scientometrics has become an essential element in the practice and evaluation of science and research, including both the evaluation of individuals and national assessment exercises. Yet, researchers and practitioners in this field have lacked clear theories to guide their work. As early as 1981, then doctoral student Blaise Cronin published "The need for a theory of citing", a call to arms for the fledgling scientometric community to produce foundational theories upon which the work of the field could be based. More than three decades later, the time has come to reach out to the field again and ask how it has responded to this call. This book compiles the foundational theories that guide informetrics and scholarly communication research. It is a much-needed compilation by leading scholars in the field that gathers together the theories that guide our understanding of authorship, citing, and impact.

SSOAR - Social Science Open Access Repository

Theories of Informetrics and Scholarly Communication

Publication venue: 'Walter de Gruyter GmbH'

OAPEN Library

Proceedings of the 10th International Conference on CMC and Social Media Corpora for the Humanities (CMC-Corpora 2023), 14–15 September 2023, University of Mannheim, Germany

Publication venue: Institut für Deutsche Sprache (IDS)
Publication date: 01/01/2023

MAnnheim DOCument Server