
    Detecting Popularity of Ideas and Individuals in Online Community

    Research in the last decade has prioritized the effects of online texts and online behaviors on user information prediction. However, previous research overlooks the overall meaning of online texts and more detailed features of users’ online behaviors. The purpose of this research is to detect adopted ideas, the popularity of ideas, and the popularity of individuals by identifying the overall meaning of online texts and centrality features based on users’ online interactions within an online community. To gain insight into the research questions, online discussions on the MyStarbucksIdea website are examined. MyStarbucksIdea launched in 2008 and encouraged people to submit new ideas for improving Starbucks’ products and services. Starbucks adopted hundreds of ideas from this crowdsourcing platform. Based on the example of the MyStarbucksIdea community, a new document representation approach, Doc2Vec, combined with the users’ centrality features, was utilized in this research. Additionally, it is essential to study the surface-level features of online texts, the sentiment features of online texts, and the features of users’ online behaviors to determine idea adoption as well as the popularity of ideas and individuals in the online community. Furthermore, supervised machine learning approaches, including Logistic Regression, Support Vector Machine, and Random Forest, with adjustments for imbalanced classes, served as the classifiers for the experiments. The results of the experiments showed that the classifications of idea adoption, the popularity of ideas, and the popularity of individuals were all considered successful. The overall meaning of idea texts and users’ centrality features were most accurate in detecting adopted ideas and the popularity of ideas. The overall meaning of idea texts and the features of users’ online behaviors were most accurate in detecting the popularity of individuals. These results accord with those of previous studies, which used behavioral and textual features to predict user information, and extend them by providing the new document embedding approach and the centrality features. The models used in this research can become a much-needed tool for popularity prediction in future research.

    Leadership in Action: How Top Hackers Behave A Big-Data Approach with Text-Mining and Sentiment Analysis

    This paper examines hacker behavior in dark forums and identifies its significant predictors in the light of leadership theory for communities of practice. We combine online forum features with text mining and sentiment analysis of messages. We create a multinomial logistic regression model to achieve role-based hacker classification and validate our model with actual hacker forum data. We identify the total number of messages, the number of threads, hacker keyword frequency, and sentiments as the most significant predictors of expert hacker behavior. We also demonstrate that, while disseminating technical knowledge, the hacker community follows the Pareto principle. As a recommendation for future research, we build a unique keyword lexicon of the most significant terms derived by the tf-idf measure. Such investigation of hacker behavior is particularly relevant for organizations in the proactive prevention of cyber-attacks. Foresight on online hacker behavior can help businesses avoid losses from breaches and the additional costs of attack-preventive measures.

    Knowledge Discovery and Management within Service Centers

    These days, most enterprise service centers deploy Knowledge Discovery and Management (KDM) systems to address the challenge of delivering a resourceful service request resolution in a timely manner while efficiently utilizing huge amounts of data. These KDM systems facilitate prompt responses to critical service requests and, where possible, try to prevent service requests from being triggered in the first place. Nevertheless, in most cases, the information required for a request resolution is dispersed and buried under a mountain of irrelevant information on the Internet in unstructured and heterogeneous formats. These heterogeneous data sources and formats complicate access to reusable knowledge and increase the response time required to reach a resolution. Moreover, state-of-the-art methods neither support effective integration of domain knowledge with KDM systems nor promote the assimilation of reusable knowledge or Intellectual Capital (IC). With the goal of providing an improved service request resolution within the shortest possible time, this research proposes an IC Management System. The proposed tool efficiently utilizes domain knowledge in the form of semantic web technology to extract the most valuable information from raw unstructured data and uses that knowledge to formulate a service resolution model as a combination of efficient data search, classification, clustering, and recommendation methods. Our proposed solution also handles the technology categorization of a service request, which is crucial in the request resolution process. The system has been extensively evaluated with several experiments and has been used in a real enterprise customer service center.

    The rationales behind free and proprietary software selection in organisations

    The aim of this paper is to critically examine the important assumptions behind the software-selection function in organisations. Software is incorporated in many situations within enterprises due to its unique ability to efficiently and effectively augment business functions and processes. Proprietary software, with its inherent advantages and disadvantages, remains dominant over "Free and Open-Source Software" (FOSS) in a large number of cases. However, the arrival of cloud computing almost certainly mandates a heterogeneous software environment. Open standards, upon which most FOSS is based, promote the free exchange of information, a founding requirement of the systems embedded in organisations. Despite evidence to the contrary, the fact that FOSS is also available at low financial cost, combined with the benefits implicit in facilitating inter-process communication, supports the view that it would be attractive to organisations. This paper approaches the paradoxical situation by examining the relevant literature in a broad number of disciplines. An important aspect examined is the role that management, and in particular the executive, plays in the software-selection function. It is on the basis of these findings that the rationales of use for both proprietary software and FOSS are discussed in a multi-disciplinary context. Understanding the rationales behind the software-selection function may provide academics and practitioners with insight into what many would consider an ICT-centric problem. However, by abstracting to the management context, as opposed to the technical context, the organisational issues surrounding both proprietary software and FOSS adoption are counter-intuitively brought to the forefront.

    Theory-driven Bilateral Dynamic Preference Learning for Person and Job Match: A Process-oriented Multi-step Multi-objective Method

    Person-job matching is a typical dynamic process with bilateral interactions between job seekers and jobs, along with sample imbalance issues. These characteristics pose significant challenges when designing an intelligent person-job match method. In this paper, we propose a novel process-oriented view of the person-job matching problem and formulate it as a multi-step multi-objective bilateral match learning problem. Our method combines profile features and historical sequential behaviors to learn the bilateral attributes and dynamic preferences, with multimodal data integrated through various attention mechanisms, such as the orthogonal multi-head and gated mechanisms. The method includes a sequence update module to learn the bilateral preferences and how they update in response to feedback. Furthermore, the multi-step constraint effectively addresses the problem of imbalanced samples through partial relationships and information transmission between the multiple objectives. Extensive experiments show that our method outperforms state-of-the-art methods in providing successful matches and improving recruitment efficiency.

    How Fair Is IS Research?

    While neither information systems nor machine learning is neutral, the identification of discrimination is more difficult if a system learns from data, and discrimination can be introduced at several stages. Therefore, this article investigates whether IS research has taken up this topic. A literature analysis is conducted, and its discussion shows that technology, organization, and human aspects have to be considered, making it a topic not only for data scientists or computer scientists, but for information systems researchers as well.

    Automatic Identification of Assumptions from the Hibernate Developer Mailing List

    During the software development life cycle, assumptions are an important type of software development knowledge that can be extracted from textual artifacts. Analyzing assumptions can help to, for example, comprehend software design and further facilitate software maintenance. Manual identification of assumptions by stakeholders is rather time-consuming, especially when analyzing a large dataset of textual artifacts. To address this problem, one promising way is to use automatic techniques for assumption identification. In this study, we conducted an experiment to evaluate the performance of existing machine learning classification algorithms for automatic assumption identification, through a dataset extracted from the Hibernate developer mailing list. The dataset is composed of 400 'Assumption' sentences and 400 'Non-Assumption' sentences. Seven classifiers using different machine learning algorithms were selected and evaluated. The experiment results show that the SVM algorithm achieved the best performance (with a precision of 0.829, a recall of 0.812, and an F1-score of 0.819). Additionally, according to the ROC curves and related AUC values, the SVM-based classifier comparatively performed better than other classifiers for the binary classification of assumptions.

    Identifying Security and Privacy Violation Rules in Trigger-Action IoT Platforms with NLP Models

    Trigger-Action platforms are systems that enable users to easily define, in terms of conditional rules, custom behaviors concerning Internet-of-Things (IoT) devices and web services. Unfortunately, although these tools stimulate the creativity of users in building automation, they may also introduce serious risks for the users. Indeed, trigger-action rules can lead to the possibility of users harming themselves, for example by unintentionally disclosing non-public information, or unwillingly exposing their smart environment to cyber-threats. In this paper, we propose to use Natural Language Processing (NLP) techniques to detect automation rules, defined within Trigger-Action IoT platforms, that potentially violate the security or privacy of the users. The proposed NLP-based models capture the semantic and contextual information of the trigger-action rules by applying classification techniques to different combinations of rule features. We evaluate the proposed solution with the mainstream trigger-action platform, namely IFTTT, by training the NLP models with a dataset of 76,741 rules labeled by using an ensemble of three semi-supervised learning techniques. The experimental results demonstrate that the model based on BERT (Bidirectional Encoder Representations from Transformers) obtains the highest performance when trained on all features, achieving average Precision and Recall values between 88% and 93%. We also compare the achieved performance with that of a baseline system implementing information flow analysis.