16 research outputs found

    Exploring Pattern Mining Algorithms for Hashtag Retrieval Problem

    Hashtags are an iconic feature for retrieving hot topics of discussion on Twitter and other social networks. This paper incorporates pattern mining approaches to improve the accuracy of retrieving relevant information and to speed up search. A novel algorithm called PM-HR (Pattern Mining for Hashtag Retrieval) first transforms the set of tweets into a transactional database using one of two strategies (trivial or temporal). The set of relevant patterns is then discovered and used as a knowledge base for finding the tweets relevant to users' queries through a similarity search process. Extensive experiments are carried out on large and varied tweet collections; the proposed PM-HR outperforms the baseline hashtag retrieval approaches in terms of runtime and is very competitive in terms of accuracy.
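    The pipeline described in this abstract (tweets to transactions, pattern discovery, similarity search) can be illustrated with a minimal sketch. This is not the authors' PM-HR implementation: the one-transaction-per-tweet mapping, the brute-force pattern counting, the query expansion step, and the Jaccard score are all assumptions chosen for brevity, and every function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def tweets_to_transactions(tweets):
    """Assumed 'trivial' strategy: one transaction per tweet, items are lowercase tokens."""
    return [frozenset(t.lower().split()) for t in tweets]


def frequent_patterns(transactions, min_support=2, max_size=2):
    """Brute-force count of itemsets up to max_size; keep those meeting min_support."""
    counts = Counter()
    for items in transactions:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(items), k):
                counts[frozenset(combo)] += 1
    return {p: c for p, c in counts.items() if c >= min_support}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query, tweets, patterns):
    """Expand the query with overlapping frequent patterns, then rank tweets (assumed scoring)."""
    q = frozenset(query.lower().split())
    expanded = set(q)
    for p in patterns:
        if p & q:
            expanded |= p
    scored = [(jaccard(expanded, t), tw)
              for t, tw in zip(tweets_to_transactions(tweets), tweets)]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [tw for score, tw in ranked if score > 0]
```

    In this sketch, a query for a single hashtag also matches tweets containing hashtags that frequently co-occur with it, which is one plausible way mined patterns could act as a knowledge base.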

    A data mining approach to ontology learning for automatic content-related question-answering in MOOCs.

    The advent of Massive Open Online Courses (MOOCs) allows a massive volume of registrants to enrol. This research aims to offer MOOC registrants automatic content-related feedback to fulfil their cognitive needs. A framework is proposed that consists of three modules: the subject ontology learning module, the short text classification module, and the question-answering module. Unlike previous research, a regular expression parser approach is used to identify relevant concepts for ontology learning, and the relevant concepts are extracted from unstructured documents. To build the concept hierarchy, a frequent pattern mining approach is used, guided by a heuristic function that ensures sibling concepts are at the same level in the hierarchy. As this process does not require specific lexical or syntactic information, it can be applied to any subject. To validate the approach, the resulting ontology is used in a question-answering system that analyses students' content-related questions and generates answers for them. Textbook end-of-chapter questions and answers are used to validate the question-answering system. The resulting ontology is compared against the use of Text2Onto for the question-answering system, and it achieved favourable results. Finally, different indexing approaches based on a subject's ontology are investigated for classifying short text in MOOC forum discussion data: unigram-based, concept-based and hierarchical concept indexing. The experimental results show that the ontology-based feature indexing approaches outperform the unigram-based indexing approach. Experiments are done in binary classification and multi-label classification settings. The results are consistent and show that hierarchical concept indexing outperforms both concept-based and unigram-based indexing. The bagging and random forest classifiers achieved the best results among the tested classifiers.

    An Intelligent Anomaly Detection Scheme for Micro-services Architectures with Temporal and Spatial Data Analysis

    This is the author accepted manuscript. The final version is available from IEEE via the DOI in this record. Service-oriented 5G mobile systems are commonly believed to reshape the landscape of the Internet with ubiquitous services and infrastructures. The micro-services architecture has attracted significant interest from both academia and industry, offering the capabilities of agile development and scalable capacity. Emerging mobile edge computing, which can be empowered by micro-services, is able to maintain efficient resource utilization in 5G systems. However, such capabilities impose significant challenges on micro-services system management. Although substantial data are produced for system maintenance, the interleaved temporal-spatial information has not been fully exploited. Additionally, the flood of data puts heavy pressure on automatic analysis tools, and automated digestion of data is urgently needed for system maintenance. In this paper, we propose a new learning-based anomaly detection framework for service-provision systems with micro-services architectures, using service execution logs (temporally) and query traces (spatially). It includes two major parts: logging and tracing representation, and two-stage identification via a sequential model and temporal-spatial analysis. The experimental results show that the temporal-spatial features can accurately capture the nature of operational data. The proposed framework performs well on anomaly detection and helps gain in-depth insights into large-scale systems.

    Predicting controlled vocabulary based on text and citations: Case studies in medical subject headings in MEDLINE and patents

    This dissertation makes three contributions in the area of controlled vocabulary prediction of Medical Subject Headings. The first contribution is a new partial matching measure based on distributional semantics. The second contribution is a probabilistic model based on text similarity and citations. The third contribution is a case study of cross-domain vocabulary prediction in US patents. Medical Subject Headings (MeSH) are an important life sciences controlled vocabulary. They are an ideal ground to study controlled vocabulary prediction due to their complexity, hierarchical nature, and practical significance. The dissertation begins with an updated analysis of human indexing consistency in MEDLINE. This study demonstrates the need for partial matching measures to account for indexing variability. Here, I develop four measures combining the MeSH hierarchy and contextual similarity. These measures provide several new tools for evaluating and diagnosing controlled vocabulary models. Next, a generalized predictive model is introduced. This model uses citations and abstract similarity as inputs to a hybrid KNN classifier. Citations and abstracts are found to be complementary in that they reliably produce unique and relevant candidate terms. Finally, the predictive model is applied to a corpus of approximately 65,000 biomedical US patents. This case study explores differences in the vocabulary of MEDLINE and patents, as well as the prospect for MeSH prediction to open new scholarly opportunities in economics and health policy research.
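    The idea of feeding both abstract similarity and citations into a KNN-style classifier can be sketched as follows. This is only an illustration under stated assumptions, not the dissertation's model: the bag-of-words cosine similarity, the fixed vote of 1.0 per cited document, and the name `predict_terms` are all hypothetical choices.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_terms(abstract, cited_ids, corpus, k=2, top_n=3):
    """Rank candidate terms from text neighbours and cited documents.

    corpus maps doc_id -> (abstract_text, set_of_indexing_terms).
    Text neighbours vote with their similarity; cited documents vote
    with an assumed fixed weight of 1.0.
    """
    query = Counter(abstract.lower().split())
    sims = sorted(
        ((cosine(query, Counter(text.lower().split())), doc_id)
         for doc_id, (text, _) in corpus.items()),
        reverse=True,
    )
    votes = Counter()
    for sim, doc_id in sims[:k]:
        if sim > 0:
            for term in corpus[doc_id][1]:
                votes[term] += sim
    for doc_id in cited_ids:
        if doc_id in corpus:
            for term in corpus[doc_id][1]:
                votes[term] += 1.0
    return [term for term, _ in votes.most_common(top_n)]
```

    Here a term can become a candidate through either channel alone, which mirrors the abstract's observation that citations and abstracts each contribute unique candidate terms.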

    Sustainability Conversations for Impact: Transdisciplinarity on Four Scales

    Sustainability is a dynamic, multi-scale endeavor. Coherence can be lost between scales – from project teams, to organizations, to networks, and, most importantly, down to conversations. Sustainability researchers have embraced transdisciplinarity, as it is grounded in science, shared language, broad participation, and respect for difference. Yet transdisciplinarity at these four scales is not well defined. In this dissertation I extend transdisciplinarity out from the project to networks and organizations, and down into conversation, adding novel lenses and quantitative approaches. In Chapter 2, I propose that transdisciplinarity incorporate academic disciplines that help cross scales: Organizational Learning, Knowledge Management, Applied Cooperation, and Data Science. In Chapter 3, I use a mixed-method approach to study a transdisciplinary organization, the Maine Aquaculture Hub, as it develops strategy. Using social network analysis and conversation analytics, I evaluate how the Hub’s network-convening, strategic thinking and conversation practices turn organization-scale transdisciplinarity into strategic advantage. In Chapters 4 and 5, conversation is the nexus of transdisciplinarity. I study seven public aquaculture lease scoping meetings (informal town halls) and classify conversation activity by “discussion discipline,” i.e., rhetorical and social intent. I compute the relationship between discussion discipline proportions and three sustainability outcomes: intent-to-act, options-generation, and relationship-building. I consider exogenous factors, such as signaling, gender balance, timing and location. I show that where inquiry is high, so is innovation. Where acknowledgement is high, so is intent-to-act. Where respect is high, so is relationship-building. Indirectness and sarcasm dampen outcomes. I propose seven interventions to improve sustainability conversation capacity, such as nudging, networks, and using empirical models.
Chapter 5 explores those empirical models: I use natural language processing (NLP) to detect the discussion disciplines by training a model on the previously coded transcripts. I then use that model to classify 591 open-source conversation transcripts, and regress the sustainability outcomes, per transcript, on discussion discipline proportions. I show that all three conversation outcomes can be predicted by the discussion disciplines, with the most statistically significant being intent-to-act, which responds directly to acknowledgement and respect. Conversation AI is the next frontier of transdisciplinarity for sustainability solutions.

    Algebraic Structures of Neutrosophic Triplets, Neutrosophic Duplets, or Neutrosophic Multisets

    Neutrosophy (1995) is a new branch of philosophy that studies triads of the form (⟨A⟩, ⟨neutA⟩, ⟨antiA⟩), where ⟨A⟩ is an entity (i.e., element, concept, idea, theory, logical proposition, etc.), ⟨antiA⟩ is the opposite of ⟨A⟩, while ⟨neutA⟩ is the neutral (or indeterminate) between them, i.e., neither ⟨A⟩ nor ⟨antiA⟩. Based on neutrosophy, the neutrosophic triplets were founded, which have a similar form (x, neut(x), anti(x)) and satisfy several axioms for each element x in a given set. This collective book presents original research papers by many neutrosophic researchers from around the world that report on the state of the art and recent advancements of neutrosophic triplets, neutrosophic duplets, neutrosophic multisets and their algebraic structures, which were defined only recently, in 2016, but have already gained interest from researchers worldwide. Connections between classical algebraic structures and neutrosophic triplet / duplet / multiset structures are also studied, along with numerous neutrosophic applications in various fields, such as multi-criteria decision making, image segmentation, medical diagnosis, fault diagnosis, clustering data, neutrosophic probability, human resource management, strategic planning, forecasting models, multi-granulation, supplier selection problems, typhoon disaster evaluation, skin lesion detection, and mining algorithms for big data analysis.
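    As a small illustration of the triplet form (x, neut(x), anti(x)), the sketch below enumerates triplets in the integers modulo n under multiplication. The abstract does not list the axioms, so the ones used here are an assumption based on the commonly cited definition: x * neut(x) = x, x * anti(x) = neut(x), and neut(x) differs from the classical identity 1. The function name is hypothetical.

```python
def neutrosophic_triplets(n):
    """Enumerate (x, neut, anti) in Z_n under multiplication mod n.

    Assumed axioms: x * neut == x, x * anti == neut, and neut != 1
    (the classical identity). Multiplication mod n is commutative,
    so checking one side of each axiom is enough.
    """
    triplets = []
    for x in range(n):
        for neut in range(n):
            if neut == 1 or (x * neut) % n != x:
                continue
            for anti in range(n):
                if (x * anti) % n == neut:
                    triplets.append((x, neut, anti))
    return triplets
```

    For example, in Z_10 the element 2 has neutral 6 (since 2 * 6 = 12 ≡ 2) and an opposite 8 (since 2 * 8 = 16 ≡ 6), so (2, 6, 8) is among the triplets returned by `neutrosophic_triplets(10)`.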