
    Modeling the Pattern of Relationship Between the Abilities of Universitas Lancang Kuning Graduates and the Needs of the Business and Industrial World

    The rapid growth of data warehouses has created a condition of rich data but poor information. Data mining is the discovery of new information by searching large amounts of data for particular patterns or rules, with the aim of extracting interesting patterns or important information from that data. By applying the association-rule data mining technique to graduate tracer data linked to the users of graduates, namely businesses and industry, the study aims to produce information about the patterns of their relationship. Graduates' ability categories are measured on a parameter scale of less necessary, reasonably necessary, needed, and highly needed in the world of business and industry. The algorithm used is the Apriori algorithm, and the resulting information is presented as support and confidence values for each category of graduate ability.
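    To make the support and confidence values concrete, here is a minimal sketch in Python. The tracer records are hypothetical (each transaction is the set of ability categories an employer marked as needed); this illustrates the metrics the Apriori algorithm reports, not the study's code.

```python
# Hypothetical transactions: each record lists the ability categories an
# employer rated as "needed" for one graduate.
transactions = [
    {"communication", "teamwork", "IT skills"},
    {"communication", "IT skills"},
    {"teamwork", "IT skills"},
    {"communication", "teamwork"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item in the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of the full rule divided by support of the antecedent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

# Rule: communication -> IT skills
ant, con = {"communication"}, {"IT skills"}
print(f"support={support(ant | con, transactions):.2f}, "
      f"confidence={confidence(ant, con, transactions):.2f}")
```

    Apriori's contribution is to enumerate only those itemsets whose subsets are already frequent; the support and confidence reported per ability category are exactly the two quantities computed above.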

    Marketing relations and communication infrastructure development in the banking sector based on big data mining

    Purpose: The article studies methodological tools for applying big data mining technologies in the modern digital space, whose further implementation can become the basis for realizing the relationship-marketing concept in the banking sector of the Russian Federation's economy. Structure/Methodology/Approach: To develop relationship marketing in the banking sector of the digital economy, it is necessary, first, to identify the opportunities and advantages of big data mining in banking marketing; second, to identify the sources and methods of processing big data; and third, to study examples of the successful use of big data mining by Russian banks and to formulate recommendations on implementing big data technologies in a digital banking marketing strategy. Findings: The authors' analysis showed that processing open online and offline sources of information with big data technologies significantly increases the amount of data available for intelligent analysis, as a result of which the interaction between a bank and its target client reaches a new level of partnership. Practical Implications: The conclusions and generalizations of the study can be applied in the practice of managing financial institutions, and the results can be used by bank management to form a digital marketing strategy for long-term communication. Originality/Value: The main contribution of this study is that the authors have identified the main directions of using big data in relationship marketing to generate additional profit, as well as the possibility of intelligent analysis of the client base aimed at expanding market share and retaining customers in the banking sector of the economy.

    Structural Deep Embedding for Hyper-Networks

    Network embedding has recently attracted considerable attention in data mining. Existing network embedding methods mainly focus on networks with pairwise relationships. In the real world, however, relationships among data points can go beyond pairwise: three or more objects may be involved in each relationship, represented by a hyperedge, thus forming hyper-networks. These hyper-networks pose great challenges to existing network embedding methods when the hyperedges are indecomposable, that is, when no subset of the nodes in a hyperedge can form another hyperedge. Such indecomposable hyperedges are especially common in heterogeneous networks. In this paper, we propose a novel Deep Hyper-Network Embedding (DHNE) model to embed hyper-networks with indecomposable hyperedges. More specifically, we theoretically prove that any linear similarity metric in embedding space, as commonly used in existing methods, cannot maintain the indecomposability property of hyper-networks, and we therefore propose a new deep model that realizes a non-linear tuplewise similarity function while preserving both local and global proximities in the resulting embedding space. We conduct extensive experiments on four different types of hyper-networks, including a GPS network, an online social network, a drug network, and a semantic network. The empirical results demonstrate that our method significantly and consistently outperforms the state-of-the-art algorithms. Comment: Accepted by AAAI 1
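    A minimal sketch, assuming illustrative dimensions and random weights rather than the authors' DHNE architecture, of what a non-linear tuplewise similarity function looks like: the embeddings of all nodes in a hyperedge are concatenated and scored jointly through a non-linearity, instead of being compared pairwise with a linear metric.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                             # embedding dimension (illustrative)
emb = rng.normal(size=(100, d))   # embeddings for 100 nodes

# Tuplewise scorer for 3-node hyperedges: concatenate the three embeddings
# and apply a one-hidden-layer MLP, ending in a sigmoid that can be read as
# "probability this triple forms a hyperedge".
W1 = rng.normal(size=(3 * d, 16))
b1 = np.zeros(16)
w2 = rng.normal(size=16)

def hyperedge_score(i, j, k):
    x = np.concatenate([emb[i], emb[j], emb[k]])
    h = np.tanh(x @ W1 + b1)            # the non-linearity is the point: the
    return 1 / (1 + np.exp(-(h @ w2)))  # abstract proves a linear metric fails

print(hyperedge_score(0, 1, 2))
```

    In the real model the weights would be trained so that observed hyperedges score high and corrupted tuples score low; the sketch only shows the tuplewise, non-linear shape of the scoring function.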

    Data mining Twitter for cancer, diabetes, and asthma insights

    Twitter may serve as a data resource to support healthcare research, but the literature on the potential of Twitter data for healthcare is still limited. The purpose of this study was to contrast the processes by which a large collection of unstructured disease-related tweets could be converted into structured data for further analysis, with the objective of gaining insight into the content and behavioral patterns associated with disease-specific communication on Twitter. Twelve months of Twitter data related to cancer, diabetes, and asthma were collected to form a baseline dataset containing over 34 million tweets. As Twitter data in its raw form would have been difficult to manage, three separate data reduction methods were contrasted to identify a method of generating the analysis files that maximized classification precision and data retention. Each disease file was then run through a Chi-square Automatic Interaction Detector (CHAID) analysis to demonstrate how user behavior insights vary by disease; CHAID, a technique created by Gordon V. Kass in 1980, is a tool used to discover relationships between variables. The study followed the standard CRISP-DM data mining approach and demonstrates how the practice of mining Twitter data fits into this six-stage iterative framework. The study produced insights that provide a new lens into the potential of Twitter data as a valuable healthcare data source, as well as the nuances involved in working with the data.
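    As a hedged illustration of the CHAID step, the sketch below runs the chi-square test of independence that CHAID uses to decide whether a predictor variable should split the target; the contingency table of tweet counts is hypothetical, not data from the study.

```python
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows = tweet source (personal vs. news
# account), columns = whether the tweet mentions treatment.
table = [[120, 380],   # personal accounts: mentions / does not mention
         [260, 240]]   # news accounts:     mentions / does not mention

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.1f}, p={p:.4f}")  # a small p-value means CHAID would
                                      # split the tree on "account type"
```

    CHAID repeats this test for every candidate predictor at every node, merging categories that do not differ significantly and splitting on the predictor with the smallest adjusted p-value.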

    Applied information retrieval and multidisciplinary research: new mechanistic hypotheses in Complex Regional Pain Syndrome

    Background: Collaborative efforts of physicians and basic scientists are often necessary in the investigation of complex disorders, but difficulties can arise when large amounts of information need to be reviewed. Advanced information retrieval can be beneficial in combining and reviewing data obtained from the various scientific fields. In this paper, a team of investigators with varying backgrounds applied advanced information retrieval methods, in the form of text mining and entity-relationship tools, to review the current literature with the intention of generating new insights into the molecular mechanisms underlying a complex disorder. Complex Regional Pain Syndrome (CRPS) was chosen as an example of such a disorder: a painful and debilitating syndrome with a complex etiology that remains unraveled to a considerable extent, resulting in suboptimal diagnosis and treatment. Results: A text mining based approach combined with a simple network analysis identified Nuclear Factor kappa B (NFκB) as a possible central mediator in both the initiation and progression of CRPS. Conclusion: The result shows the added value of a multidisciplinary approach combined with information retrieval for hypothesis discovery in biomedical research. The new hypothesis, derived in silico, provides a framework for further mechanistic studies into the underlying molecular mechanisms of CRPS and requires evaluation in clinical and epidemiological studies.
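    A minimal sketch of the kind of simple network analysis described, assuming hypothetical co-occurrence counts mined from abstracts; ranking entities by degree centrality in the co-occurrence graph is one straightforward way a hub such as NFκB could surface.

```python
import networkx as nx

# Hypothetical co-occurrence edges mined from the literature:
# (entity A, entity B, number of papers mentioning both).
cooccurrences = [
    ("NFkB", "TNF-alpha", 14), ("NFkB", "IL-6", 11),
    ("NFkB", "substance P", 6), ("TNF-alpha", "IL-6", 9),
    ("IL-6", "bradykinin", 3),
]

G = nx.Graph()
for a, b, w in cooccurrences:
    G.add_edge(a, b, weight=w)

# Rank entities by degree centrality; a node connected to many distinct
# mediators is a candidate central player, to be validated experimentally.
ranking = sorted(nx.degree_centrality(G).items(), key=lambda kv: -kv[1])
print(ranking[0])
```

    The in-silico ranking only generates the hypothesis; as the conclusion notes, clinical and epidemiological evaluation is still required.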

    Scalable supergraph search in large graph databases

    © 2016 IEEE. Supergraph search is a fundamental problem in graph databases that is widely applied in many application scenarios. Given a graph database and a query-graph, supergraph search retrieves from the database all data-graphs contained in the query-graph. Most existing solutions for supergraph search follow the pruning-and-verification framework, which prunes false answers based on features in the pruning phase and performs subgraph isomorphism tests on the remaining graphs in the verification phase. However, they do not scale to large data-graphs and query-graphs, due to three drawbacks. First, they rely on a frequent subgraph mining algorithm to select features, which is expensive and cannot generate large features. Second, they require a costly verification phase. Third, they process features in a fixed order without considering their relationship to the query-graph. In this paper, we address these three drawbacks and propose new indexing and query processing algorithms. In indexing, we select features directly from the data-graphs without expensive frequent subgraph mining; the features form a feature-tree that contains features of all sizes, and both the cost sharing and the pruning power of the features are considered. In query processing, we propose a verification-free algorithm in which the order of processing features is query-dependent, again considering both cost sharing and pruning power. We explore two optimization strategies to further improve the algorithm's efficiency: the first applies a lightweight graph compression technique, and the second optimizes the inclusion of answers. Finally, we conduct extensive performance studies on two real large datasets to demonstrate the high scalability of our algorithms.
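    For context, a schematic sketch of the pruning-and-verification framework the paper improves upon (not the authors' verification-free algorithm). Graphs are simplified here to frozensets of edges so that containment is plain set inclusion; a real system would run subgraph isomorphism at both marked points.

```python
# Schematic pruning-and-verification for supergraph search.
def prune_then_verify(data_graphs, features, inverted_index, query):
    candidates = set(data_graphs)
    for f in features:
        if not f <= query:                   # feature absent from the query...
            candidates -= inverted_index[f]  # ...prune graphs containing it
    # Verification phase: test the surviving candidates directly
    # (the expensive step the paper's algorithm eliminates).
    return [g for g in candidates if g <= query]

g1 = frozenset({("a", "b"), ("b", "c")})
g2 = frozenset({("a", "b"), ("c", "d")})
feat = frozenset({("c", "d")})
index = {feat: {g2}}                         # graphs containing each feature
query = frozenset({("a", "b"), ("b", "c"), ("b", "d")})
print(prune_then_verify({g1, g2}, [feat], index, query))  # only g1 survives
```

    The paper's contribution sits on top of this picture: features are mined cheaply from the data-graphs themselves, ordered per query by shared cost and pruning power, and chosen so that the final verification loop is unnecessary.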

    Realistic Modeling of Handover Events in a Multi-Carrier 5G Network: A Preliminary Step Towards COP-KPI Relationship Realization

    The ever-increasing demand for mobile data traffic, along with new use cases, is set to make current cellular network technology obsolete and give rise to a newer and better one in the form of 5G. This emerging technology comes with a promise of massive capacity, ultra-high reliability, and close-to-zero latency; alongside it, however, comes additional complexity. 5G is expected to carry more than 5,000 configuration and optimization parameters (COPs). These COPs are the backbone of a network, as most Key Performance Indicators (KPIs) rely on their proper settings. To set these parameters optimally, it is imperative that the relationship between COPs and KPIs be understood. To date, however, this relationship is known only to some extent and is not fully realized. Mining the COP-KPI relationship is not a dead end: Machine Learning (ML) can be leveraged to learn KPI behavior under changes in COPs. Yet ML's full potential is bounded by the lack of representative data in the wireless community with which to effectively train these models, and gathering such data is itself a challenge. Real data from live networks is abundant yet not representative. Although a simulator is a promising source of data, its value depends on how realistic and detailed the modeling and implementation of its functions are. In this thesis, we present a realistic and comprehensive model of one of the most important functions of a wireless network: the handover function. In line with 3GPP standards, we have modeled and implemented more than 20 handover-related COPs. The model is incorporated into a Python-based simulator to generate data. Validation and evaluation are performed to establish the model's accuracy and its effectiveness in capturing the real handover procedure. Use cases are also presented to show the model's capability to simulate different COP settings and their effects on KPIs. This thesis is presented as an initial step towards generating representative datasets for training machine learning models of the COP-KPI relationship.
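    As a hedged illustration of the kind of handover COPs involved, the sketch below evaluates a simplified form of the 3GPP A3 entering condition (neighbor RSRP exceeding serving RSRP by an offset plus hysteresis, sustained for a time-to-trigger window). The parameter values are illustrative, not the thesis's settings, and the full 3GPP condition also includes per-frequency and per-cell offsets.

```python
# Simplified A3 event: the neighbor becomes offset-better than the serving
# cell and stays so for a full time-to-trigger (TTT) window.
A3_OFFSET_DB = 3.0       # illustrative COP values, not the thesis's settings
HYSTERESIS_DB = 1.0
TTT_MS = 160
SAMPLE_PERIOD_MS = 40    # measurement reporting period

def a3_triggered(serving_rsrp, neighbor_rsrp):
    """Return True if the A3 entering condition held for a full TTT window."""
    needed = TTT_MS // SAMPLE_PERIOD_MS
    streak = 0
    for s, n in zip(serving_rsrp, neighbor_rsrp):
        if n > s + A3_OFFSET_DB + HYSTERESIS_DB:
            streak += 1
            if streak >= needed:
                return True
        else:
            streak = 0   # condition broken: the TTT timer restarts
    return False

serving = [-95, -95, -96, -96, -97, -97]    # RSRP samples in dBm
neighbor = [-92, -90, -91, -90, -90, -89]
print(a3_triggered(serving, neighbor))      # True: handover would be triggered
```

    Each constant above is itself a COP; sweeping such values in a simulator and recording the resulting handover KPIs is exactly the kind of data generation the thesis targets.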