12 research outputs found
Providing a descriptor model to improve the effect of FeO in raw pellets
Data mining is the extraction of knowledge from large volumes of data. In this study, data mining is used to analyze a production system at the Golgohar Sirjan Company. The collected data were saved in an Excel file, then cleaned and prepared for use in the IBM SPSS Modeler software. FeO is one of the controlling parameters during pellet production. In this study, the C & R Tree classification method is used to predict the effect of FeO in the raw pellet against 10 input variables: SiO2, Fetot, CaO, S, Al2O3, MgO, P, Fe2O3, C.C.S, and Temp (dry temperature). The variables to which FeO in the raw pellet is most sensitive are evaluated and compared according to the accuracy of the models, and practical suggestions are presented to industry managers for improving pellet quality.
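As an illustration of the tree-based classification this abstract mentions, the following is a minimal sketch of how a CART-style tree might choose a split threshold on one numeric input by minimizing weighted Gini impurity. It is not the study's model or the IBM SPSS Modeler implementation: the function names, the temperature values, and the "high"/"low" FeO labels are all invented for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split on one input.

    Candidate thresholds are midpoints between consecutive distinct values.
    If no split improves on the unsplit node, threshold is None.
    """
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        threshold = (lo + hi) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Invented illustration data: a temperature input and a binned FeO class.
temp = [100, 110, 120, 130, 140, 150]
feo_class = ["high", "high", "high", "low", "low", "low"]

threshold, score = best_split(temp, feo_class)
print(threshold, score)
```

A full tree would repeat this search over every input variable and then recurse into each branch; ranking inputs by how much impurity their splits remove is one common way to gauge which variables the target is most sensitive to.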
Data Mining: How Popular Is It?
Data mining is a process used in industry to facilitate decision making. As the name implies, large volumes of data are mined, or sifted, to find useful information for decision making. With the advent of e-business, data mining has become more important to practitioners. The purpose of this paper is to gauge the importance of data mining by looking at the different application areas that have used it for decision making.
DATA MINING CLUSTERING IN HEALTHCARE
The accumulating amounts of data are making traditional analysis methods impractical. Novel tools employed in Data Mining (DM) provide a useful alternative framework that addresses this problem. This research suggests a technique to identify certain patient populations. Our model examines the patient population and clusters certain groups. Those subpopulations are then classified in terms of their appropriate medical treatment. As a result, we show the value of applying a DM model to more easily identify patients
A Data Mining Approach To identify Diabetes
Mounting amounts of data have made traditional data analysis methods impractical. Data mining (DM) tools provide a useful alternative framework that addresses this problem. This study follows a DM technique to identify diabetic patients. We develop a model that clusters the diabetes patients of a large healthcare company into different subpopulations. Consequently, we show the value of applying a DM model to identify diabetic patients.
Identifying Diabetic Patients: A Data Mining Approach
Mounting amounts of data have made traditional data analysis methods impractical. Data mining (DM) tools provide a useful alternative framework that addresses this problem. This study follows a DM technique to identify diabetic patients. We develop a model that clusters the diabetes patients of a large healthcare company into different subpopulations. Consequently, we show the value of applying a DM model to identify diabetic patients.
Assessment of landslide susceptibility using statistical- and artificial intelligence-based FR-RF integrated model and multiresolution DEMs
© 2019 by the authors. Landslides are among the most important geomorphological hazards: they cause significant ecological and economic losses and result in billions of dollars in financial losses and thousands of casualties per year. The occurrence of landslides in northern Iran (Alborz Mountain Belt) is often due to the geological and climatic conditions and to tectonic and human activities. To reduce or control the damage caused by landslides, landslide susceptibility mapping (LSM) and landslide risk assessment are necessary. In this study, the efficiency and integration of frequency ratio (FR) and random forest (RF) in statistical- and artificial intelligence-based models and different digital elevation models (DEMs) with various spatial resolutions were assessed in the field of LSM. The experiment was performed in Sangtarashan watershed, Mazandaran Province, Iran. The study area, which extends to 1072.28 km², is severely affected by landslides, which cause severe economic and ecological losses. An inventory of 129 landslides that occurred in the study area was prepared using various resources, such as historical landslide records, the interpretation of aerial photos and Google Earth images, and extensive field surveys. The inventory was split into training and test sets, which include 70% and 30% of the landslide locations, respectively. Subsequently, 15 topographic, hydrologic, geologic, and environmental landslide conditioning factors were selected as predictor variables of landslide occurrence on the basis of a literature review, fieldwork, and multicollinearity analysis. Phased array type L-band synthetic aperture radar (PALSAR), ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer), and SRTM (Shuttle Radar Topography Mission) DEMs were used to extract topographic and hydrologic attributes. The RF model showed that land use/land cover (16.95), normalised difference vegetation index (16.44), distance to road (15.32) and elevation (13.6) were the most important controlling variables. Assessment of model performance by calculating the area under the receiver operating characteristic curve showed that the FR-RF integrated model (0.917) achieved higher predictive accuracy than the individual FR (0.865) and RF (0.840) models. Comparison of the PALSAR, ASTER, and SRTM DEMs, with 12.5, 30 and 90 m spatial resolution, respectively, using the FR-RF integrated model showed that the prediction accuracy of FR-RF-PALSAR (0.917) was higher than that of FR-RF-ASTER (0.865) and FR-RF-SRTM (0.863). The results of this study could be used by local planners and decision makers for planning development projects and landslide hazard mitigation measures.
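The frequency ratio (FR) named in this abstract is commonly computed, for each class of a conditioning factor, as the share of landslides falling in that class divided by the share of the study area the class occupies. The sketch below assumes that common definition; it is not taken from the paper, and the land-cover class names and counts are invented.

```python
def frequency_ratio(landslides_per_class, pixels_per_class):
    """Frequency ratio per class: (landslide share) / (area share).

    Values above 1 suggest the class is associated with landslides more
    often than its area alone would imply; values below 1 suggest less.
    """
    total_landslides = sum(landslides_per_class.values())
    total_pixels = sum(pixels_per_class.values())
    return {
        cls: (landslides_per_class[cls] / total_landslides)
        / (pixels_per_class[cls] / total_pixels)
        for cls in pixels_per_class
    }


# Invented illustration data for one conditioning factor (land cover).
landslides = {"forest": 10, "farmland": 30, "rangeland": 10}
pixels = {"forest": 5000, "farmland": 3000, "rangeland": 2000}

fr = frequency_ratio(landslides, pixels)
for cls, value in fr.items():
    print(cls, round(value, 2))
```

In an integrated approach, per-class FR values like these can replace the raw class labels as numeric inputs to a second model such as a random forest. Whether the paper combines the two in exactly this way is not stated in the abstract.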
On Identifying Critical Nuggets Of Information During Classification Task
In large databases, there may exist critical nuggets: small collections of records or instances that contain domain-specific important information. This information can be used for future decision making, such as labeling critical, unlabeled data records and improving classification results by reducing false positive and false negative errors. In recent years, data mining efforts have focussed on pattern and outlier detection methods. However, not much effort has been dedicated to finding critical nuggets within a data set. This work introduces the idea of critical nuggets, proposes an innovative domain-independent method to measure criticality, suggests a heuristic to reduce the search space for finding critical nuggets, and isolates and validates critical nuggets from some real-world data sets. It seems that only a few subsets may qualify as critical nuggets, underscoring the importance of finding them. The proposed methodology can detect them. This work also identifies certain properties of critical nuggets and provides experimental validation of those properties. Critical nuggets were then applied to two important performance metrics for classification tasks: classification accuracy and misclassification cost. Experimental results helped validate that critical nuggets can assist in improving classification accuracy on real-world data sets when compared with standalone classification algorithms. The improvements in accuracy using the critical nuggets were statistically significant. Extensive studies were also undertaken on real-world data sets that utilized critical nuggets to help minimize misclassification costs. In this case as well, the critical-nuggets-based approach yielded statistically significant, lower misclassification costs than standalone classification methods.
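The two metrics this abstract evaluates, classification accuracy and misclassification cost, can rank classifiers differently. The sketch below shows how each might be computed from confusion-matrix counts. The counts and the per-error costs are invented for illustration and do not come from the study.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all records that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def misclassification_cost(fp, fn, cost_fp, cost_fn):
    """Total cost when each false positive and false negative has its own price."""
    return fp * cost_fp + fn * cost_fn


# Invented confusion-matrix counts for two hypothetical classifiers.
clf_a = {"tp": 40, "tn": 50, "fp": 2, "fn": 8}
clf_b = {"tp": 46, "tn": 42, "fp": 10, "fn": 2}

# Assumed costs: a false negative is taken to be ten times as costly.
COST_FP, COST_FN = 1, 10

acc_a = accuracy(**clf_a)
acc_b = accuracy(**clf_b)
cost_a = misclassification_cost(clf_a["fp"], clf_a["fn"], COST_FP, COST_FN)
cost_b = misclassification_cost(clf_b["fp"], clf_b["fn"], COST_FP, COST_FN)

print(acc_a, cost_a)
print(acc_b, cost_b)
```

With these invented numbers the first classifier is more accurate but incurs the higher total cost, which is why the two metrics are worth reporting separately.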
Semantic discovery and reuse of business process patterns
Patterns currently play an important role in modern information systems (IS) development, but their use has mainly been restricted to the design and implementation phases of the development lifecycle. Given the increasing significance of business modelling in IS development, patterns have the potential to provide a viable solution for promoting reusability of recurrent generalized models in the very early stages of development. As a statement of research-in-progress, this paper focuses on business process patterns and proposes an initial methodological framework for the discovery and reuse of business process patterns within the IS development lifecycle. The framework borrows ideas from the domain engineering literature and proposes the use of semantics to drive both the discovery of patterns and their reuse.
Knowledge Management Determinants Of Continuance Behavior: Evaluating The Air Force Knowledge Now Knowledge Management System
Knowledge management (KM) encompasses the set of capabilities, processes, tools, and techniques for the most effective use of knowledge by an organization. The goal of KM is to improve the organization's ability to create, transfer, retain, and apply knowledge. Knowledge management is a goal that many organizations seek to achieve. Organizations apply their strategies, plans, and implementation to achieve KM. Organizations use technology to implement their KM strategy. For some, this approach has worked well; however, for others, the results have fallen short. KM shortcomings revolve around employees' infrequent use of the technology. This research seeks to understand what influences a user's behavior to use a KM system and why a user becomes a routine user. This research provides a model of KM continuance behavior and post-acceptance usage behavior. Post-acceptance usage behavior is how an individual decides to use a system after its initial acceptance. The KM continuance model incorporates technology, community, individual, and organizational elements that influence a user's intentions and actual use of a KM system. The specific context of this research is a KM system known as the Air Force Knowledge Now (AFKN) system. AFKN emphasizes KM through expertise-sharing activities in Communities of Practice (CoPs). The AFKN KM system facilitates and enhances the relationships in the community. The data for this study were obtained by using an online questionnaire. The results are analyzed using Partial Least Squares structural equation modeling with a two-step data analysis approach. The first step assessed the properties of the measurement model. The second step assessed the path model. Path coefficients and t-values are generated to evaluate the 14 proposed hypotheses. The results of the investigation show that community and technology KM both positively influence a user's evaluation of the KM environment. The results produced a coefficient of determination of 60% for KM continued-use intention and 31% for KM continued-use behavior. The outcome of this research is a model that allows organizations to tailor their KM systems efforts to the organizational environment in order to maximize their resources. This investigation serves as a foundation for further research and development in areas of KM, KM systems, and post-acceptance usage.
Critical Success Factors in Data Mining Projects.
The increasing awareness of data mining technology, along with the attendant increase in the capturing, warehousing, and utilization of historical data to support evidence-based decision making, is leading many organizations to recognize that the effective use of data is the key element in the next generation of client-server enterprise information technology. The concept of data mining is gaining acceptance in business as a means of seeking higher profits and lower costs. To deploy data mining projects successfully, organizations need to know the key factors for successful data mining. Implementing emerging information systems (IS) can be risky if the critical success factors (CSFs) have been researched insufficiently or documented inadequately. While numerous studies have listed the advantages and described the data mining process, there is little research on the success factors of data mining. This dissertation identifies CSFs in data mining projects. Chapter 1 introduces the history of the data mining process and states the problems, purposes, and significance of this dissertation. Chapter 2 reviews the literature, discusses general concepts of data mining and data mining project contexts, and reviews general concepts of CSF methodologies. It also describes the identification process for the various CSFs used to develop the research framework. Chapter 3 describes the research framework and methodology, detailing how the CSFs were identified and validated from more than 1,300 articles published on data mining and related topics. The validated CSFs, organized into a research framework using 7 factors, generate the research questions and hypotheses. Chapter 4 presents analysis and results, along with the chain of evidence for each research question, the quantitative instrument, and survey results. In addition, it discusses how the data were collected and analyzed to answer the research questions. Chapter 5 concludes with a summary of the findings, describing assumptions and limitations and suggesting future research.