23 research outputs found
Increasing Reproducibility in Science by Interlinking Semantic Artifact Descriptions in a Knowledge Graph
One of the pillars of the scientific method is reproducibility: the ability to replicate the results of a prior study if the same procedures are followed. A lack of reproducibility can lead to wasted resources, false conclusions, and a loss of public trust in science. Ensuring reproducibility is challenging due to the heterogeneity of the methods used in different fields of science. In this article, we present an approach for increasing the reproducibility of research results, by semantically describing and interlinking relevant artifacts such as data, software scripts or simulations in a knowledge graph. In order to ensure the flexibility to adapt the approach to different fields of science, we devise a template model, which allows defining typical descriptions required to increase reproducibility of a certain type of study. We provide a scoring model for gradually assessing the reproducibility of a certain study based on the templates and provide a knowledge graph infrastructure for curating reproducibility descriptions along with semantic research contribution descriptions. We demonstrate the feasibility of our approach with an example in data science.
Social mining for sustainable cities: thematic study of gender-based violence coverage in news articles and domestic violence in relation to COVID-19
We argue that social computing and its diverse applications can contribute to the attainment of sustainable development goals (SDGs), specifically to the SDGs concerning gender equality and empowerment of all women and girls, and to make cities and human settlements inclusive. To achieve the above goals for the sustainable growth of societies, it is crucial to study gender-based violence (GBV) in a smart city context, which is a common component of violence across socio-economic groups globally. This paper analyzes the nature of news articles reported in English newspapers of Pakistan, India, and the UK, accumulating 12,693 gender-based violence-related news articles. For the qualitative textual analysis, we employ Latent Dirichlet allocation for topic modeling and propose a Doc2Vec based word-embeddings model to classify gender-based violence-related content, called GBV2Vec. Further, by leveraging GBV2Vec, we also build an online tool that analyzes the sensitivity of gender-based violence-related content from the textual data. We run a case study on GBV concerning COVID-19 by feeding the data collected through Google News API. Finally, we show different news reporting trends and the nature of the gender-based violence committed during the testing times of COVID-19. The approach and the toolkit that this paper proposes will be of great value to decision-makers and human rights activists, given the prompt and coordinated performance against gender-based violence in smart city context, and can contribute to the achievement of SDGs for sustainable growth of human societies.
Detailed analysis of Ethereum network on transaction behavior, community structure and link prediction
Ethereum, the second-largest cryptocurrency after Bitcoin, has attracted wide attention in the last few years and accumulated significant transaction records. However, the underlying Ethereum network structure is still relatively unexplored. Also, very few attempts have been made to perform link prediction on the Ethereum transaction network. This paper presents a Detailed Analysis of the Ethereum Network on Transaction Behavior, Community Structure, and Link Prediction (DANET) framework to investigate various valuable aspects of the Ethereum network. Specifically, we explore the change in wealth distribution and accumulation on the Ethereum Featured Transactional Network (EFTN) and further study its community structure. We then search for a suitable link prediction model on EFTN by employing state-of-the-art Variational Graph Auto-Encoders. The link prediction experiments demonstrate outstanding prediction accuracy on Ethereum networks. Moreover, usage statistics of the Ethereum network are visualized and summarized through the experiments, allowing us to formulate conjectures on the current use of this technology and its future development. Subjects: Data Mining and Machine Learning, Data Science, Emerging Technologies
Automated Discovery of Product Feature Inferences Within Large-Scale Implicit Social Media Data
Recently, social media has emerged as an alternative, viable source to extract large-scale, heterogeneous product features in a time- and cost-efficient manner. One of the challenges of utilizing social media data to inform product design decisions is the existence of implicit data such as sarcasm, which accounts for 22.75% of social media data, and can potentially create bias in the predictive models that learn from such data sources. For example, if a customer says "I just love waiting all day while this song downloads," an automated product feature extraction model may incorrectly associate a positive sentiment of "love" with the cell phone's ability to download. While traditional text mining techniques are designed to handle well-formed text where product features are explicitly inferred from the combination of words, these tools would fail to process social messages that include implicit product feature information. In this paper, we propose a method that enables designers to utilize implicit social media data by translating each implicit message into its equivalent explicit form, using the word concurrence network. A case study of Twitter messages that discuss smartphone features is used to validate the proposed method. The results from the experiment not only show that the proposed method improves the interpretability of implicit messages, but also shed light on potential applications in the design domains where this work could be extended.
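The abstract above does not specify how its word concurrence network is built. As a rough illustration only, the following Python sketch reads the term as a word co-occurrence network, where two words are linked whenever they appear in the same message and the link weight counts how many messages contain both. The function name `build_cooccurrence_network`, the tokenization, and the second and third sample messages are assumptions made for this sketch, not the authors' method or data.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(messages):
    """Count, for each unordered word pair, how many messages contain both words.

    Hypothetical helper for illustration; not taken from the paper.
    """
    edges = Counter()
    for message in messages:
        # Lowercase, strip simple punctuation, and keep each word once per message.
        words = {w.strip('.,!?"\'').lower() for w in message.split()} - {""}
        # Sorting makes every pair appear in a single canonical (a, b) order.
        for a, b in combinations(sorted(words), 2):
            edges[(a, b)] += 1
    return edges


# The first message is the abstract's own example; the others are invented.
messages = [
    "I just love waiting all day while this song downloads",
    "Waiting all day for a download is so slow",
    "This phone downloads songs fast and I love it",
]
network = build_cooccurrence_network(messages)
print(network[("all", "day")])
print(network[("downloads", "love")])
```

In a network like this, "love" is linked both to "waiting" and to "downloads", which hints at why a purely word-level sentiment model could misread the sarcastic message.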
An ensemble heterogeneous classification methodology for discovering health-related knowledge in social media messages
The role of social media as a source of timely and massive information has become more apparent since the era of Web 2.0. Multiple studies have illustrated the use of information in social media to discover biomedical and health-related knowledge. Most methods proposed in the literature employ traditional document classification techniques that represent a document as a bag of words. These techniques work well when documents are rich in text and conform to standard English; however, they are not optimal for social media data where sparsity and noise are norms. This paper aims to address the limitations posed by the traditional bag-of-words-based methods and proposes to use heterogeneous features in combination with ensemble machine learning techniques to discover health-related information, which could prove to be useful to multiple biomedical applications, especially those needing to discover health-related knowledge in large-scale social media data. Furthermore, the proposed methodology could be generalized to discover different types of information in various kinds of textual data.
Development of a service evolution map for service design through application of text mining to service documents
As digital convergence has proliferated and products have become smarter, various service concepts have emerged based on the capabilities of products. It has become a main concern to illuminate historical changes and status of service concepts according to the utilisation of product elements to provide valuable information for service development. However, a lacuna still remains in the literature regarding a systematic and quantitative approach to this problem. This study proposes a service evolution map as a tool for analysing the evolutionary paths of service concepts based on the utilisation of product elements. The proposed service evolution map consists of two layers with the time dimension: a product element layer for the utilisation of product elements and a service concept layer for the evolutionary paths of service concepts. Based on the service documents describing what the services are, text mining, co-word analysis, and modified formal concept analysis are employed to develop the product element and service concept layers, respectively. A case study of mobile application services is presented to illustrate the proposed approach. This study is expected to be a basis of future research on the interaction between products and services and service concept design based on the creative utilisation of product elements.