Modeling the Relationship Patterns Between the Competencies of Universitas Lancang Kuning Graduates and the Needs of the Business and Industrial World
The rapid growth of data warehouses has created a condition of rich data but poor information. Data mining is the discovery of new information by searching for patterns or rules in large volumes of data, with the aim of extracting interesting patterns or important information from that data. By combining graduate tracer-study data with data from the users of graduates, namely businesses and industry, this study aims to produce information about the pattern of their relationship through the association-rule data mining technique. Graduate ability categories are measured on a scale of less necessary, reasonably necessary, needed, and greatly needed in the business and industrial world. The algorithm used is the Apriori algorithm, and the resulting information is presented as the support and confidence values of each category of graduate ability.
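To make the support and confidence measures concrete, the following is a minimal sketch of how these values are computed for an association rule. The transaction records and category labels are hypothetical illustrations, not the study's actual tracer data:

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) over the transactions."""
    return (support(set(antecedent) | set(consequent), transactions)
            / support(antecedent, transactions))

# Hypothetical tracer-study records: each transaction pairs a graduate
# ability category with the employer's sector.
transactions = [
    frozenset({"english_skill:needed", "industry:manufacturing"}),
    frozenset({"english_skill:needed", "industry:services"}),
    frozenset({"english_skill:less_necessary", "industry:services"}),
    frozenset({"english_skill:needed", "industry:manufacturing"}),
]

print(support({"english_skill:needed"}, transactions))       # 0.75
print(confidence({"english_skill:needed"},
                 {"industry:manufacturing"}, transactions))   # ≈ 0.667
```

Apriori then enumerates itemsets level by level, keeping only those whose support exceeds a chosen minimum before computing rule confidences.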
Marketing relations and communication infrastructure development in the banking sector based on big data mining
Purpose: The article aims to study the methodological tools for applying big data mining technologies in the modern digital space, whose further implementation can become the basis for realizing the relationship-marketing concept in the banking sector of the Russian Federation's economy. Structure/Methodology/Approach: For the development of marketing relations in the banking sector of the digital economy, it seems necessary: first, to identify the opportunities and advantages of big data mining in banking marketing; second, to identify the sources and methods of processing big data; third, to study examples of the successful use of big data mining by Russian banks and to formulate recommendations on implementing big data technologies in a digital banking marketing strategy. Findings: The authors' analysis showed that big data processing of open online and offline information sources significantly increases the amount of data available for intelligent analysis, as a result of which the interaction between a bank and its target client reaches a new level of partnership. Practical Implications: The conclusions and generalizations of the study can be applied in the practice of managing financial institutions, and the results can be used by bank management to form a digital marketing strategy for long-term communication. Originality/Value: The main contribution of this study is that the authors have identified the main directions of using big data in relationship marketing to generate additional profit, as well as the possibilities of intelligent analysis of the client base aimed at expanding market share and retaining customers in the banking sector of the economy.
Structural Deep Embedding for Hyper-Networks
Network embedding has recently attracted a lot of attention in data mining. Existing network embedding methods mainly focus on networks with pairwise relationships. In the real world, however, relationships among data points can go beyond pairwise, i.e., three or more objects may be involved in each relationship, represented by a hyperedge, thus forming hyper-networks. These hyper-networks pose great challenges to existing network embedding methods when the hyperedges are indecomposable, that is, when no subset of the nodes in a hyperedge forms another hyperedge. Such indecomposable hyperedges are especially common in heterogeneous networks. In this paper, we propose a novel Deep Hyper-Network Embedding (DHNE) model to embed hyper-networks with indecomposable hyperedges. More specifically, we theoretically prove that any linear similarity metric in embedding space, as commonly used in existing methods, cannot maintain the indecomposability property in hyper-networks, and we thus propose a new deep model that realizes a non-linear tuplewise similarity function while preserving both local and global proximities in the resulting embedding space. We conduct extensive experiments on four different types of hyper-networks, including a GPS network, an online social network, a drug network, and a semantic network. The empirical results demonstrate that our method significantly and consistently outperforms state-of-the-art algorithms. Comment: Accepted by AAAI 2018.
Data mining Twitter for cancer, diabetes, and asthma insights
Twitter may serve as a data resource to support healthcare research, but the literature on the potential of Twitter data in healthcare is still limited. The purpose of this study was to contrast the processes by which a large collection of unstructured disease-related tweets could be converted into structured data for further analysis, with the objective of gaining insights into the content and behavioral patterns associated with disease-specific communication on Twitter. Twelve months of Twitter data related to cancer, diabetes, and asthma were collected to form a baseline dataset containing over 34 million tweets. As Twitter data in its raw form would have been difficult to manage, three separate data reduction methods were contrasted to identify a method for generating analysis files that maximizes classification precision and data retention. Each disease file was then run through a CHAID (chi-square automatic interaction detector) analysis to demonstrate how user behavior insights vary by disease. CHAID, introduced by Gordon V. Kass in 1980, is a technique for discovering relationships between variables. This study followed the standard CRISP-DM data mining approach and demonstrates how the practice of mining Twitter data fits into this six-stage iterative framework. The study produced insights that offer a new lens on the potential of Twitter as a valuable healthcare data source, as well as the nuances involved in working with the data.
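At the heart of CHAID is the chi-square test of independence, used to pick the predictor most strongly associated with the outcome at each split. The sketch below computes the Pearson chi-square statistic on hypothetical tweet counts (the variables and figures are illustrative, not from the study); real CHAID compares Bonferroni-adjusted p-values rather than raw statistics:

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical tweet counts: rows = user type, cols = tweet sentiment.
by_user_type = [[30, 70],   # patients:  negative, positive
                [60, 40]]   # marketers: negative, positive
by_platform  = [[45, 55],   # mobile:    negative, positive
                [48, 52]]   # desktop:   negative, positive

# CHAID would split on the predictor with the most significant association.
print(chi_square(by_user_type) > chi_square(by_platform))  # True
```

CHAID applies this test recursively, merging categories that are not significantly different before splitting on the best remaining predictor.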
Applied information retrieval and multidisciplinary research: new mechanistic hypotheses in Complex Regional Pain Syndrome
Background: Collaborative efforts of physicians and basic scientists are often necessary in the investigation of complex disorders. Difficulties can arise, however, when large amounts of information need to be reviewed. Advanced information retrieval can be beneficial in combining and reviewing data obtained from the various scientific fields. In this paper, a team of investigators with varying backgrounds has applied advanced information retrieval methods, in the form of text mining and entity-relationship tools, to review the current literature, with the intention of generating new insights into the molecular mechanisms underlying a complex disorder. Complex Regional Pain Syndrome (CRPS) was chosen as an example of such a disorder. CRPS is a painful and debilitating syndrome with a complex etiology that remains largely unresolved, resulting in suboptimal diagnosis and treatment. Results: A text mining based approach combined with a simple network analysis identified Nuclear Factor kappa B (NFκB) as a possible central mediator in both the initiation and progression of CRPS. Conclusion: The result shows the added value of a multidisciplinary approach combined with information retrieval for hypothesis discovery in biomedical research. The new hypothesis, which was derived in silico, provides a framework for further mechanistic studies into the underlying molecular mechanisms of CRPS and requires evaluation in clinical and epidemiological studies.
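The "simple network analysis" pattern described above can be sketched as a co-occurrence network over extracted entities, where a high-degree node suggests a candidate central mediator. The entity sets below are hypothetical stand-ins; the actual study used dedicated text-mining and entity-relationship tools over the literature:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical entity mentions extracted per abstract.
abstracts = [
    {"NFkB", "TNF-alpha", "inflammation"},
    {"NFkB", "IL-6", "pain"},
    {"NFkB", "TNF-alpha", "pain"},
    {"IL-6", "inflammation"},
]

# Build a co-occurrence network: entities are nodes, and an edge links
# two entities mentioned in the same abstract.
neighbors = defaultdict(set)
for entities in abstracts:
    for a, b in combinations(sorted(entities), 2):
        neighbors[a].add(b)
        neighbors[b].add(a)

# Degree centrality flags candidate central mediators.
central = max(neighbors, key=lambda e: len(neighbors[e]))
print(central)  # NFkB
```

More elaborate analyses would weight edges by co-occurrence frequency or use betweenness centrality, but even raw degree highlights well-connected hubs.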
Scalable supergraph search in large graph databases
© 2016 IEEE. Supergraph search is a fundamental problem in graph databases that is widely applied in many application scenarios. Given a graph database and a query-graph, supergraph search retrieves all data-graphs contained in the query-graph from the graph database. Most existing solutions for supergraph search follow the pruning-and-verification framework, which prunes false answers based on features in the pruning phase and performs subgraph isomorphism tests on the remaining graphs in the verification phase. However, these solutions do not scale to large data-graphs and query-graphs, due to three drawbacks. First, they rely on a frequent subgraph mining algorithm to select features, which is expensive and cannot generate large features. Second, they require a costly verification phase. Third, they process features in a fixed order without considering their relationship to the query-graph. In this paper, we address these three drawbacks and propose new indexing and query processing algorithms. In indexing, we select features directly from the data-graphs without expensive frequent subgraph mining. The features form a feature-tree that contains features of all sizes, and both the cost sharing and the pruning power of the features are considered. In query processing, we propose a verification-free algorithm, in which the order of processing features is query-dependent, again considering both cost sharing and pruning power. We explore two optimization strategies to further improve efficiency: the first applies a lightweight graph compression technique, and the second optimizes the inclusion of answers. Finally, we conduct extensive performance studies on two large real datasets to demonstrate the high scalability of our algorithms.
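The feature-based pruning idea can be illustrated in miniature: if a data-graph contains a feature that the query-graph lacks, the data-graph cannot be contained in the query and is pruned. The sketch below uses single labeled edges as features for simplicity, whereas the paper's index organizes subgraph features of all sizes in a feature-tree; the graphs are hypothetical:

```python
def features(graph):
    """Feature set of a graph given as an iterable of labeled edges."""
    return frozenset(graph)

# Hypothetical graph database; each graph is a list of labeled edges.
database = {
    "g1": [("A", "B"), ("B", "C")],
    "g2": [("A", "B"), ("C", "D")],
    "g3": [("A", "B")],
}
query = [("A", "B"), ("B", "C"), ("A", "C")]

# Prune any data-graph with a feature missing from the query-graph.
query_features = features(query)
candidates = [gid for gid, g in database.items()
              if features(g) <= query_features]
print(sorted(candidates))  # ['g1', 'g3']
```

With single-edge features this containment check is only a necessary condition, which is why classical methods need a verification phase; the paper's contribution is an index whose features are rich enough to make verification unnecessary.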
Recommended from our members
Distributionally Robust Performance Analysis with Applications to Mine Valuation and Risk
We consider several problems motivated by issues faced in the mining industry. In recent years, it has become clear that mines carry substantial tail risk in the form of environmental disasters, and this tail risk is not incorporated into common pricing and risk models. However, the data sets of extremal climate behavior that drive this risk are very small and generally inadequate for properly estimating tail behavior. We propose a data-driven methodology that produces reasonable worst-case scenarios given the data size constraints, and we incorporate this into a real-options-based model for the valuation of mines. We propose several iterations of the model, allowing the end user to choose the degree to which they wish to specify the financial consequences of the disaster scenario. Next, in order to perform a risk analysis on a portfolio of mines, we propose a method of estimating the correlation structure of high-dimensional max-stable processes. Using the techniques of (Liu et al., 2017) to map the relationship between normal correlations and max-stable correlations, we can then use techniques inspired by (Bickel et al., 2008; Liu et al., 2014; Rothman et al., 2009) to estimate the underlying correlation matrix while preserving a sparse, positive-definite structure. The correlation matrices are then used in the calculation of model-robust risk metrics (VaR, CVaR) using the Sample-Out-of-Sample methodology (Blanchet and Kang, 2017). We conclude with several new techniques developed in the field of robust performance analysis that, while not directly applied to mining, were motivated by our studies of distributionally robust optimization to address these problems.
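For readers unfamiliar with the risk metrics mentioned, the following is a minimal empirical sketch of VaR and CVaR on a hypothetical loss sample (the figures are illustrative; the thesis computes model-robust versions of these metrics, not this plain estimator):

```python
def var_cvar(losses, alpha=0.95):
    """Empirical Value-at-Risk and Conditional VaR at level alpha.

    VaR is the alpha-quantile of the loss distribution; CVaR is the
    mean loss at or beyond VaR, capturing tail severity.
    """
    ordered = sorted(losses)
    k = int(alpha * len(ordered))   # index of the alpha-quantile
    var = ordered[k]
    tail = ordered[k:]
    cvar = sum(tail) / len(tail)
    return var, cvar

# Hypothetical yearly losses (in $M) for a mine portfolio, including a
# rare disaster outcome in the tail.
losses = [1, 2, 2, 3, 3, 4, 4, 5, 6, 50]
var, cvar = var_cvar(losses, alpha=0.80)
print(var, cvar)  # 6 28.0
```

Note how CVaR, unlike VaR, is pulled up by the magnitude of the disaster loss in the tail, which is why it is favored for tail-risk-heavy portfolios.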
Realistic Modeling of Handover Events in a Multi-Carrier 5G Network: A Preliminary Step Towards COP-KPI Relationship Realization
The ever-increasing demand for mobile data traffic, along with new use cases, is set to make current cellular network technology obsolete and give rise to a newer and better one in the form of 5G. This arising technology promises massive capacity, ultra-high reliability, and close-to-zero latency; however, it also brings additional complexity. 5G is expected to carry more than 5,000 configuration and optimization parameters (COPs). These COPs are the backbone of a network, as most Key Performance Indicators (KPIs) rely on their proper settings. To set these parameters optimally, it is imperative that the relationship between COPs and KPIs be understood. To date, however, this relationship is known only to some extent and is not fully realized. But mining the COP-KPI relationship is not a dead end: Machine Learning (ML) can be leveraged to learn KPI behavior as COPs change. Yet ML's full potential is limited by the lack of representative data in the wireless community with which to effectively train these models, and gathering such data is, in itself, a challenge. Real data from live networks is abundant, yet not representative. Although a simulator is a promising source of data, its usefulness depends on how realistic and detailed the modeling and implementation of its functions are. In this thesis, we present a realistic and comprehensive model of one of the most important functions of a wireless network: the handover function. In line with 3GPP standards, we have modeled and implemented more than 20 handover-related COPs. The model is incorporated into a Python-based simulator to generate data. Validation and evaluation are performed to demonstrate the model's accuracy and its effectiveness in capturing the real handover procedure. Use cases are also presented to show its capability to simulate different COP settings and show their effects on KPIs. This thesis is presented as an initial step toward generating a representative dataset for training machine learning models of the COP-KPI relationship.
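As a flavor of the kind of handover COPs involved, the sketch below implements a simplified version of the 3GPP A3 measurement event, in which a handover is triggered when a neighbor cell's measurement exceeds the serving cell's by an offset plus hysteresis for a time-to-trigger window. The parameter values and RSRP traces are hypothetical, and the real A3 condition also includes cell- and frequency-specific offsets:

```python
def a3_triggered(serving_rsrp, neighbor_rsrp, offset_db=3.0,
                 hysteresis_db=1.0, ttt_samples=3):
    """Simplified 3GPP A3 entry condition: trigger when the neighbor
    exceeds the serving cell by offset + hysteresis for time-to-trigger
    consecutive measurement samples."""
    streak = 0
    for s, n in zip(serving_rsrp, neighbor_rsrp):
        if n > s + offset_db + hysteresis_db:
            streak += 1
            if streak >= ttt_samples:
                return True
        else:
            streak = 0  # condition broken; restart the TTT window
    return False

# Hypothetical RSRP traces (dBm), one value per measurement period:
# the UE moves away from the serving cell and toward the neighbor.
serving  = [-95, -96, -97, -98, -99, -100]
neighbor = [-97, -95, -93, -93, -92, -91]

print(a3_triggered(serving, neighbor))  # True
```

Each of the knobs above (offset, hysteresis, time-to-trigger) is itself a COP whose setting shifts KPIs such as handover failure and ping-pong rates, which is precisely the COP-KPI relationship the thesis aims to expose through simulation.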