84 research outputs found

    Experiences in Mining Educational Data to Analyze Teacher's Performance: A Case Study with High Educational Teachers

    Educational Data Mining (EDM) is a new paradigm that aims to mine and extract the knowledge needed to optimize the effectiveness of the teaching process. In a normal educational system it is often difficult to achieve fine-grained optimization because of the large amount of data collected and tangled throughout the system. EDM addresses this problem through its ability to mine and explore these raw data and, as a consequence, to extract knowledge. This paper describes several experiments on real educational data in which the effectiveness of Data Mining in turning educational data into knowledge is demonstrated. The experiments first aim to identify important factors of teacher behavior that influence student satisfaction. In addition to presenting the experience gained through the experiments, the paper aims to provide practical guidance on Data Mining solutions in a real application.

    User-Based Web Recommendation System: A Case Study of the National Museum of History

    With the explosion and rapidly growing market of the Internet, it is imperative that managers rethink how to use technology, especially the Internet, to deliver services faster, cheaper, and with better quality than their competitors do. A web site provides a communication channel that reveals real-time access data and rich information about customers. Therefore, the call for personalized web pages for customers has grown loud. To achieve personalized web pages, this study proposes a user-behavior-oriented recommendation algorithm that uses the web log files of the National Museum of History.
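    The abstract does not spell out the recommendation algorithm, so the following is only a generic user-based collaborative-filtering sketch over per-user page-visit counts such as might be parsed from web server logs; the function names, the toy log data, and the cosine-similarity neighbor scheme are all illustrative assumptions, not the study's actual method.

```python
from collections import Counter
import math

def cosine(u, v):
    # cosine similarity between two visit-count vectors (Counters)
    dot = sum(count * v.get(page, 0) for page, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(logs, target, k=2, n=3):
    # logs: {user: [visited page, ...]} as parsed from web log files
    profiles = {user: Counter(pages) for user, pages in logs.items()}
    anchor = profiles[target]
    # rank the other users by similarity to the target user
    neighbors = sorted(
        (u for u in profiles if u != target),
        key=lambda u: cosine(anchor, profiles[u]),
        reverse=True,
    )[:k]
    # score pages the target has not yet visited by neighbor visit counts
    scores = Counter()
    for u in neighbors:
        for page, count in profiles[u].items():
            if page not in anchor:
                scores[page] += count
    return [page for page, _ in scores.most_common(n)]

# toy logs, invented for illustration only
logs = {
    "alice": ["bronzes", "ceramics", "ceramics"],
    "bob": ["bronzes", "ceramics", "calligraphy"],
    "carol": ["jade", "textiles"],
}
suggestions = recommend(logs, "alice", k=1)
print(suggestions)  # ['calligraphy']
```

    The design choice here is the usual one for user-based schemes: similarity is computed between whole visit profiles, and only pages the target user has not seen are scored, so the most similar visitor's unshared pages become the recommendations.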

    Reliable Peer-to-Peer Access for Italian Citizens to Digital Government Services on the Internet

    In the delivery of e-government services to citizens, it should be clear that the viewpoint cannot simply be the standard client-supplier one commonly used to provide services on the Internet. In a modern society it has rather to be the peer-to-peer approach typical of democracies, where institutions and citizens are equal before the law. This is not yet a widely accepted standpoint in the digital government efforts going on in many advanced countries. The Italian government, in its ever increasing effort to provide citizens with easier access to online government services, has instead adopted and is pursuing this symmetric approach, which is going to represent a fundamental tool in the ongoing march towards e-democracy. In this paper we describe the organizations involved in the process and the Information Technology (IT) infrastructure enabling the effective management of the whole process while ensuring the mandatory security functions in a democratic manner. The organizational complexity lies in the distribution of responsibility for managing people's personal data among the more than 8000 Italian Municipalities and in the need to keep centralized control over all processes dealing with people's identity. The technical complexity stems from the need to support this distribution of responsibilities efficiently while ensuring, at the same time, interoperability of IT-based systems independent of the technical choices of the organizations involved, and fulfillment of privacy constraints. The IT architecture defined for this purpose features a clear separation between security services, provided at the infrastructure level, and application services, exposed on the Internet as Web Services.

    Efficient Discovery of Association Rules and Frequent Itemsets through Sampling with Tight Performance Guarantees

    The tasks of extracting (top-K) Frequent Itemsets (FI's) and Association Rules (AR's) are fundamental primitives in data mining and database applications. Exact algorithms for these problems exist and are widely used, but their running time is hindered by the need of scanning the entire dataset, possibly multiple times. High-quality approximations of FI's and AR's are sufficient for most practical uses, and a number of recent works explored the application of sampling for fast discovery of approximate solutions to the problems. However, these works do not provide satisfactory performance guarantees on the quality of the approximation, due to the difficulty of bounding the probability of under- or over-sampling any one of an unknown number of frequent itemsets. In this work we circumvent this issue by applying the statistical concept of Vapnik-Chervonenkis (VC) dimension to develop a novel technique for providing tight bounds on the sample size that guarantees approximation within user-specified parameters. Our technique applies both to absolute and to relative approximations of (top-K) FI's and AR's. The resulting sample size is linearly dependent on the VC-dimension of a range space associated with the dataset to be mined. The main theoretical contribution of this work is a proof that the VC-dimension of this range space is upper bounded by an easy-to-compute characteristic quantity of the dataset, which we call the d-index: the maximum integer d such that the dataset contains at least d transactions of length at least d such that no one of them is a superset of or equal to another. We show that this bound is strict for a large class of datasets.
    Comment: 19 pages, 7 figures. A shorter version of this paper appeared in the proceedings of ECML PKDD 201
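    If the superset/antichain condition in the definition is ignored, the d-index reduces to an h-index-style statistic over transaction lengths: the largest d such that at least d distinct transactions have length at least d. The sketch below computes that simpler quantity, which only upper-bounds the true d-index; the function name and toy data are illustrative assumptions.

```python
def d_index_upper_bound(transactions):
    """Upper bound on the d-index: the largest d such that at least d
    distinct transactions have length at least d. The full definition's
    requirement that no chosen transaction be a superset of (or equal
    to) another is deliberately ignored here."""
    unique = {frozenset(t) for t in transactions}  # drop exact duplicates
    lengths = sorted((len(t) for t in unique), reverse=True)
    d = 0
    for i, length in enumerate(lengths, start=1):
        if length >= i:
            d = i  # the i longest transactions all have length >= i
        else:
            break
    return d

# four distinct transactions with lengths 3, 3, 2, 2 -> bound is 2
print(d_index_upper_bound([{1, 2, 3}, {1, 2}, {4, 5, 6}, {7, 8}]))  # 2
```

    Since the sample-size bound grows linearly with the VC-dimension, and the d-index upper-bounds that dimension, even this cruder statistic already yields a valid (if slightly looser) sample size.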

    Towards an Improved Hoarding Procedure in a Mobile Environment

    Frequent disconnection has been a critical issue in wireless network communication, causing excessive delays in data delivery. In this paper, we formulate a management mechanism based on computational optimization to achieve efficient and fast computation and thereby reduce the inherent delay of the hoarding process. The simulation results obtained are evaluated on the basis of hoard size and delivery time.
    Keywords: Hoarding Procedure, Mobile Computing Environment, Computational Optimization

    Three light-weight execution engines in Java for web data-intensive data source contents: (extended abstract)

    Title from cover. "March, 1998." Includes bibliographical references (p. 8-9). Ricardo Ambrose ... [et al.]

    Data Mining Based on Association Rule Privacy Preserving

    The security of a large database containing crucial information becomes a serious issue when the data are shared over a network and must be protected against unauthorized access. Privacy-preserving data mining is a new research trend in protecting private data in data mining and statistical databases. Association analysis is a powerful tool for discovering relationships hidden in large databases, and association rule hiding algorithms offer strong and efficient protection for confidential and crucial data. Data modification and rule hiding are among the most important approaches to securing data. The objective of the proposed association rule hiding algorithm for privacy-preserving data mining is to hide certain information so that it cannot be discovered by association rule mining algorithms. The main approach of association rule hiding algorithms is to hide some generated association rules by increasing or decreasing the support or the confidence of the rules, so that rules whose items appear on the Left Hand Side (LHS) or the Right Hand Side (RHS) can no longer be deduced by association rule mining algorithms. The Increase Support of Left Hand Side (ISL) algorithm decreases the confidence of a rule by increasing the support of its LHS; it cannot work on both sides of a rule, only on modifications of the LHS. The Decrease Support of Right Hand Side (DSR) algorithm decreases the confidence of a rule by decreasing the support of its RHS, and thus works on modifications of the RHS. We propose a new algorithm that overcomes these limitations: it can increase the support of the LHS and decrease the support of the RHS of a rule correspondingly, so that more rules are hidden with fewer modifications. The efficiency of the proposed algorithm is compared with the ISL and DSR algorithms on real databases, on the basis of the number of rules hidden, CPU time, and the number of modified entries, and better results are obtained.
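    The ISL mechanic described above can be checked numerically: since conf(LHS → RHS) = supp(LHS ∪ RHS) / supp(LHS), raising the support of the LHS alone necessarily lowers the confidence of the rule. A minimal sketch, with a toy four-transaction database assumed purely for illustration:

```python
def support(db, itemset):
    # fraction of transactions that contain every item of the itemset
    itemset = set(itemset)
    return sum(1 for t in db if itemset <= t) / len(db)

def confidence(db, lhs, rhs):
    # conf(LHS -> RHS) = supp(LHS union RHS) / supp(LHS)
    return support(db, set(lhs) | set(rhs)) / support(db, lhs)

# toy transaction database, invented for illustration only
db = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}]
before = confidence(db, {"a"}, {"b"})  # 0.5 / 0.75 ~= 0.667
db[3].add("a")                         # ISL step: raise supp(LHS) only
after = confidence(db, {"a"}, {"b"})   # 0.5 / 1.0  =  0.5
print(before, after)
```

    A DSR step works on the other term of the ratio: deleting the RHS item from a supporting transaction shrinks supp(LHS ∪ RHS), again pushing the confidence below the hiding threshold.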