Search CORE

12 research outputs found

Improving Database Quality through Eliminating Duplicate Records

Author: Cather Martha E.
Sung Andrew H.
Wei Mingzhen
Publication venue: Scholars' Mine
Publication date: 01/11/2006

Redundant or duplicate data are the most troublesome problem in database management and applications. Approximate field matching is the key to resolving the problem: it identifies semantically equivalent string values in syntactically different representations. This paper considers token-based solutions and proposes a general field matching framework to generalize the field matching problem across domains. Introducing the concept of String Matching Points (SMP) in string comparison improves string matching accuracy and efficiency compared with other commonly applied field matching algorithms. The paper discusses the development of field matching algorithms from the general framework. The framework and corresponding algorithm are tested on a public data set from the NASA publication abstract database. The approach can be applied to similar problems in other databases.

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

DATA PUBLICATION IN THE OPEN ACCESS INITIATIVE

Author: Bertelmann Roland
Brase Jan
Diepenbroek Michael
Grobe Hannes
Höck Heinke
Klump Jens
Lautenschlager Michael
Schindler Uwe
Sens Irina
Wächter Joachim
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 15/06/2006

The ‘Berlin Declaration’ was published in 2003 as a guideline to policy makers to promote the Internet as a functional instrument for a global scientific knowledge base. Because knowledge is derived from data, the principles of the ‘Berlin Declaration’ should apply to data as well. Today, access to scientific data is hampered by structural deficits in the publication process. Data publication needs to offer authors an incentive to publish data through long-term repositories. Data publication also requires an adequate licence model that protects the intellectual property rights of the author while allowing further use of the data by the scientific community.

DigitalCommons@University of Nebraska

Development of a Prototype for Critical Disease Predictions using Data Mining

Author: Mohammad Taha Khan, Professor Dr. Shamimul Qamar, Dr. Ripu Ranjan Sinha
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/06/2016

The goal of this paper is to present a breast cancer prototype model along with the prediction of heart disease by employing data mining techniques. The data used in the study were retrieved from Public-Use Data, which is available online. The data comprised 699 and 909 records for breast cancer and heart disease, respectively. For prediction and mining, the decision tree algorithms C4.5 and C5.0 were applied to the data, and the results of both algorithms on both data sets were compared. The paper also outlines the significance of evidence-based medicine, a novel and innovative approach to healthcare decision making [5]. Clinical decisions must be supported by and based on scientific evidence, which ensures that they are sound and effective. The paper also shows the importance of data mining in modern healthcare.

International Journal on Recent and Innovation Trends in Computing and Communication

Discussing the Role of Classification Algorithms in Clinical Predictions with help of Case Studies

Author: Mohammad Taha Khan, Dr. Shamimul Qamar, F. Massin
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2015

This paper discusses the important role of classification algorithms in clinical predictions. Two case studies, one for breast cancer and the other for heart disease prediction, are presented using classification data mining techniques. Freely accessible online data are used for the case studies: publicly available data consisting of 909 records for heart disease and 699 for breast cancer. Two well-known decision tree algorithms, C4.5 and C5.0, are used to derive the prediction rules, and these rules are used to improve the quality of an open source Pathology Management System based on Care2x. The performance of the two algorithms is also compared. The paper further discusses the importance of open source software in healthcare and how a pathology management system can adopt Evidence Based Medicine (EBM). EBM is a new and important approach that can greatly improve decision making in health care. Its task is to prevent, diagnose and medicate diseases using medical evidence [5]. Clinical decisions must be based on scientific evidence that demonstrates effectiveness. This paper is an extension of our previous work ‘A Prototype of Cancer/Heart Disease Prediction Model Using Data Mining’.

International Journal on Recent and Innovation Trends in Computing and Communication

Case-Based-Reasoning System for Feature Selection and Diagnosing Disease; Case Study: Asthma

Author: Darabi Somayeh Akhavan
Heydarnejad Hassan
Teimourpour Babak
Zolnoori Maryam
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 30/05/2014

Asthma is a chronic inflammatory disease of the airways, and the reason for the reported increase in asthma prevalence is not yet clear. The purpose of the present research is to design a case-based reasoning (CBR) model to assist a physician in diagnosing the type of disease and the therapy needed. To design the system, the disease variables were first identified and given to patients as a questionnaire; after the relevant data were gathered, the CBR algorithm was applied to the data, which led to the asthma diagnosis. The system was tested on 325 asthmatic and non-asthmatic adult cases and was assessed at eighty percent accuracy. The results were promising. Because the factors of the disease differ between countries, this study was performed to determine risk factors for asthma in Iranian society; the results showed that the most important variables of asthma in Iran are symptoms hyperresponsivity, frequency of cough, cough. Keywords: data mining, case-based reasoning, asthma, diagnosis

International Institute for Science, Technology and Education (IISTE): E-Journals

Die Vergabe von DOI-Namen für Sozial- und Wirtschaftsdaten: Serviceleistungen der Registrierungsagentur da|ra

Author: Brigitte Hausstein

The GESIS Leibniz Institute for the Social Sciences and the ZBW Leibniz Information Centre for Economics operate, in cooperation with DataCite, the international initiative to improve access to research data, a DOI registration service for social and economic data. This infrastructure creates an important prerequisite for the permanent identification, preservation, and localization of research data from the social sciences and economics, and ultimately for citing them reliably. To test the technical and organizational solutions for assigning DOI names, GESIS carried out a pilot project in 2010 for registering social science data. Since then, 5200 studies have been registered and more than 7000 metadata records have been added to the information system. This contribution describes the technical and organizational implementation of the registration agency da|ra and shows how, in the establishment phase of the project, the existing DOI registration system can also be used for economic research data from 2012 onward. Keywords: research data, data citation, persistent identifier, DOI names

Research Papers in Economics

Numeric Data: Citation Techniques and Integration with Text

Author: Brase Jan
Farquhar Adam
Gastl Angela
Gruttemeier Herbert
Heijne Maria
Heller Alfred
Hitson Brian
Johnson Lorrie
McMahon Brian
Piguet Arlette
Rombouts Jeroen
Sandfær Mogens
Sens Irina
Publication date: 01/01/2009

Online Research Database In Technology

Die Vergabe von DOI-Namen für Sozial- und Wirtschaftsdaten: Serviceleistungen der Registrierungsagentur da|ra

Author: Hausstein Brigitte
Publication venue: 'Botanic Garden & Botanical Museum Berlin-Dahlem BGBM'
Publication date: 20/01/2014

"The GESIS Leibniz Institute for the Social Sciences and the ZBW Leibniz Information Centre for Economics operate, in cooperation with DataCite, the international initiative to improve access to research data, a DOI registration service for social and economic data. This infrastructure creates an important prerequisite for the permanent identification, preservation, and localization of research data from the social sciences and economics, and ultimately for citing them reliably. To test the technical and organizational solutions for assigning DOI names, GESIS carried out a pilot project in 2010 for registering social science data. Since then, 5200 studies have been registered and more than 7000 metadata records have been added to the information system. This contribution describes the technical and organizational implementation of the registration agency da|ra and shows how, in the establishment phase of the project, the existing DOI registration system can also be used for economic research data from 2012 onward." [Author's abstract]

SSOAR - Social Science Open Access Repository

Improving Quality of Lead Data of Customer Relationship Management in M-Commerce

Author: Dr Vijaya Kumar
Kavya M Gowda
Publication date: 24/04/2020

Abstract: With the growing trend of mobile computing, technologies and applications focus mainly on mobile e-commerce (m-commerce) and the mobile Web. As the mobile commerce market grows, customer relationship management (CRM) is one of the major applications; it incorporates the current marketing standard of relationship management (RM) and supports acquiring customers, understanding their requirements, and maintaining long-term relationships with them. Extending CRM applications to mobile devices is increasingly becoming a major goal for organizations seeking to optimize their workforce. The quality of acquired data has a major impact on an organization's productivity and on achieving the desired outcome. Data quality therefore forms the basis for major decisions at an organization's operational and strategic levels, which in turn leads to positive growth. To address this issue, we propose a novel quality model for customer lead data. Some activities are defined to analyse the performance of the system with respect to time, data quality, and reliability. This framework needs a feasible implementation module in the near future for improving the quality of CRM lead data in m-commerce.

CiteSeerX

An investigation into the problems of ineffective control of invasive plants in selected areas of South Africa : a case study of Campuloclinium macrocephalum (pompom weed)

Author: Mashiloane William Tlokotse
Publication date: 01/09/2011

Interference with the natural environment by invasive plants is a global concern. In South Africa, and in Gauteng Province in particular, interference with natural land by invasive plants that originated in other countries has been an endemic problem. These invasive plants pose a threat to biodiversity because of their wild and wide dispersion, spreading into neighbouring provinces such as Mpumalanga, Limpopo, North West and the Free State. Pompom weed is aggressive and difficult to control, and it can spread by both wind and water. This research project investigates problems associated with ineffective control of invasive plants in general and pompom weed in particular. State organs, Non-Governmental Organisations (NGOs) and farming communities were identified as relevant respondents in this study. Three hundred (300) validated questionnaires were distributed to these stakeholders, and 286 were adequately completed and returned. These were analysed and the data interpreted. The results showed that a lack of coordination and teamwork among all stakeholders is responsible for ineffective control of invasive plants in the country. The use of biological control was recommended for the control and eradication of the invasive plants.
Environmental Sciences. M.A. (Environmental Management)

Unisa Institutional Repository