7 research outputs found

    Benchmarking business analytics techniques in Big Data

    Technological developments and the growing dependence of organizations and society on the internet have led to the growth and variety of data. This growth and variety have become a challenge for the traditional techniques of Business Analytics. In this project, we conducted a benchmarking process that aimed to assess the performance of some Data Mining tools, such as RapidMiner, in a Big Data environment. First, we analyzed a study in which a group of Data Mining tools is evaluated and the best tool is determined according to the evaluation criteria. Next, the two best tools identified in that study were analyzed regarding their ability to analyze data in a Big Data environment. Finally, studies were carried out to evaluate the performance of the RapidMiner and KNIME tools in the Big Data environment. This work has been supported by national funds through FCT - Fundacao para a Ciencia e Tecnologia within the Project Scope: UID/CEC/00319/2019 and Deus ex Machina (DEM): Symbiotic technology for societal efficiency gains - NORTE-01-0145-FEDER-000026

    A framework to evaluate big data fabric tools

    Huge growth in data and information needs has led organizations to search for the most appropriate data integration tools for different types of business. Managing a large dataset requires appropriate resources, new methods, and powerful technologies. This has led to a surge of ideas, technologies, and tools offered by different suppliers. For this reason, it is important to understand the key factors that determine the need to invest in a big data project and then to categorize these technologies, to simplify choosing the one that best fits the context of the problem. The objective of this study is to create a model that will serve as a basis for evaluating the different alternatives and solutions capable of overcoming the major challenges of data integration. Finally, a brief analysis of three major data fabric solutions available on the market is also carried out: Talend Data Fabric, IBM Infosphere, and Informatica Platform.

    Analisis Performa Algoritma Decision Tree, Naive Bayes, K-Nearest Neighbor untuk Klasifikasi Zona Daerah Risiko Covid-19 di Indonesia

    The Covid-19 pandemic has reached Indonesia. The government is working to handle Covid-19, in part by producing a Covid-19 risk map. The risk map divides zones by Regency/City. The Covid-19 risk zones serve as the government's benchmark for setting policy in each region. The government weights 15 indicators to determine the zone. Updates to the Covid-19 risk zones on the website have been delayed several times. Classification can be an alternative way to determine the Covid-19 risk zone, so that zone changes can be made quickly and efficiently. There are many classification algorithms, each with its own strengths and weaknesses. Classification algorithms that offer good accuracy in a relatively short time are Decision Tree, Naïve Bayes, and K-Nearest Neighbor. The purpose of this study is to measure the performance of each algorithm, identify the best algorithm, and obtain the classification pattern of the best algorithm. The research method uses 10-fold cross validation to split the data and a confusion matrix to assess performance. The software used is Rapidminer and WEKA. The results show that all algorithms perform well, with values above 70%. None of the algorithms takes long to build a model. The best performance was obtained with the decision tree algorithm in WEKA, with a performance value of 88% and a time of 0.32 seconds. The classification pattern of the best algorithm yields 77 rules that divide 3 classification zones: low, medium, and high. The attributes that influence the classification of the Covid-19 risk zone are active, CR, CFR, incidence rate, positive, and death.
    Abstract: The Covid-19 pandemic occurred in Indonesia. The government is trying to handle Covid-19, in part by making a Covid-19 risk map. The Covid-19 risk map divides zones based on Regency/City. The Covid-19 risk zone is the government's benchmark for policy in each region. The government uses a weighting of 15 indicators to determine the zone. Changes to the Covid-19 risk zone on the website have been delayed several times. Classification can be an alternative for determining the Covid-19 risk zone, so that zone changes can be made quickly and efficiently. Many algorithms can be used for classification. Classification algorithms that have good accuracy with relatively fast run time are Decision Tree, K-Nearest Neighbor, and Naïve Bayes. The purpose of this study is to calculate the performance of each algorithm, get the best algorithm, and get the classification pattern from the best algorithm. The research method uses 10-fold cross validation to split the data and a confusion matrix to assess performance. The software used is Rapidminer. The results show that all algorithms have good performance values, which are above 70%. None of the algorithms requires a long time for modeling. The best performance value was obtained using the Decision Tree algorithm. The classification pattern of the best algorithm produces 20 rules that divide 3 classification zones, namely low, medium, and high. Attributes that influence the classification of the Covid-19 risk zone are active, CR, CFR, incidence rate, positive, and death.
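
    The evaluation procedure named in this abstract (10-fold cross validation scored with a confusion matrix) can be sketched as follows. This is a minimal illustration, not the authors' Rapidminer or WEKA workflow: it uses only K-Nearest Neighbor, the toy data and the single feature are hypothetical, and the helper names (`k_fold_indices`, `knn_predict`, `cross_validate`) are assumptions made for this example.

```python
import random
from collections import Counter

def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def cross_validate(X, y, labels, k_folds=10, k_neighbors=3):
    """Accumulate one confusion matrix over all folds (rows: actual, columns: predicted)."""
    pos = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for fold in k_fold_indices(len(X), k_folds):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in fold:
            pred = knn_predict(train_X, train_y, X[i], k_neighbors)
            matrix[pos[y[i]]][pos[pred]] += 1
    return matrix

def accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

# Hypothetical toy data: one numeric feature per region, three well-separated zones.
LABELS = ["low", "medium", "high"]
rng = random.Random(42)
X, y = [], []
for centre, label in zip((0.0, 5.0, 10.0), LABELS):
    for _ in range(30):
        X.append([centre + rng.uniform(-1.0, 1.0)])
        y.append(label)

confusion = cross_validate(X, y, LABELS)
print(confusion)
print(f"accuracy = {accuracy(confusion):.2f}")
```

    Pooling all ten folds into one confusion matrix means every sample is predicted exactly once by a model that never saw it, so accuracy can be read directly off the diagonal.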

    Selección de algoritmos de preprocesamiento de datos del Hospital Delicia Concepción Masvernat (Concordia, provincia de Entre Ríos) que permita el desarrollo de un componente de software para predicción de enfermedades cardiológicas

    The healthcare sector is undoubtedly one of the fields in which large volumes of data are managed, mainly in the clinical area. This reveals an important need to find ways to manage, integrate, analyze, and interpret this large body of data, with the aim of identifying behavior patterns that are useful in medical decision making. The main objective of the research project behind this article is to develop a software component capable of generating, through machine learning, a model with predictive capabilities for cardiac diseases, enabling better support for clinical diagnostic decisions and a significant advance in preventive medicine. This article presents an exhaustive review of data preprocessing tools for analyzing massive healthcare data, in terms of missing-value imputation, outlier detection, and data reduction, scaling, transformation, and partitioning. In addition, data science tools for the healthcare field are proposed. An in-depth analysis is presented to describe the pros and cons of existing tools for addressing practical challenges. The results obtained are useful for developing research based on disease prediction in the healthcare field. Sociedad Argentina de Informática e Investigación Operativ
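
    Four of the preprocessing steps this abstract lists (missing-value imputation, outlier detection, scaling, and partitioning) can be sketched as below. The abstract does not say which techniques the reviewed tools apply, so mean imputation, the 1.5 x IQR outlier rule, min-max scaling, and an 80/20 split are assumptions chosen for illustration, and the blood-pressure readings are hypothetical.

```python
import random
import statistics

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]

def iqr_outliers(values, factor=1.5):
    """Return indices of values outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = factor * (q3 - q1)
    return [i for i, v in enumerate(values) if v < q1 - spread or v > q3 + spread]

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def partition(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into a training part and a test part."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

# Hypothetical systolic blood-pressure readings; None marks a missing value.
readings = [120, 118, None, 125, 130, 122, 300, 119, None, 127]

filled = impute_mean(readings)                 # step 1: imputation
outliers = iqr_outliers(filled)                # step 2: outlier detection
cleaned = [v for i, v in enumerate(filled) if i not in outliers]
scaled = min_max_scale(cleaned)                # step 3: scaling
train, test = partition(scaled)                # step 4: partitioning

print("outlier indices:", outliers)
print("train size:", len(train), "test size:", len(test))
```

    The order of the steps is itself a design choice: imputing before outlier detection lets the extreme reading pull the imputed mean upward, so a median-based imputation may be preferable on skewed clinical data.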

    Classification of handwriting kinematics in automated diagnosis and monitoring of Parkinson's disease

    Parkinson's disease (PD) is one of the most prevalent neurodegenerative conditions. Currently, there is no standard clinical tool available to diagnose PD. One of the research priorities is to identify biomarkers that will improve the diagnostic process and can be used in clinical testing. At present, the only way to assess the disease is visual observation of the patient's symptoms, which is performed only by expert neurologists. As of now, there is no treatment to prevent the progression of PD. However, an elemental drug, 'Levodopa' (L-dopa), is available to control the disease by increasing dopamine cells in the brain. It is important to detect PD and start treatment in the early stages, as this helps to control the symptoms and significantly delays the development of motor complications. In this study, the fine motor symptoms of handwriting have been studied. As a first objective, I conducted experiments on a significant number of patients and age-matched controls (112 participants: 56 PD and 56 controls), and thus completed the task of data collection. The system developed extracts the dynamic features of the handwriting/drawing and reports the possible strength of those features, providing a basis for automated analysis. The advantage of this approach is that patients are not required to follow complex commands, and the analysis can be fully automated. I anticipate that, following appropriate clinical tests already planned, the system will be able to detect early disease symptoms remotely, outside hospitals or clinics. It could also be used for self-evaluation by patients with neuromuscular and motor neuron disorders. This device can be used without compromising the comfort of patients, who may still prefer writing with an ink pen on plain paper. This study proposes a new feature, the 'Composite Index of Speed and Pen-pressure' (CISP), to distinguish between different stages of Parkinson's disease.
The experiment also demonstrated a method which can be used with guided spiral drawing to improve classification results to predict Parkinson's disease. Further, I recommend using a panel of writing tasks which might prove to be an effective biomarker for cell loss in the substantia nigra and the associated dopamine deficiency. Thus, models developed can be used in designing an automated application for predicting and monitoring Parkinson's diseas

    Un plan de toma de decisiones basado en datos en función de las áreas de DigCompOrg en una escuela primaria en Grecia

    Digital technologies are a key element of great importance for educational organisations and can help lead the way towards quality education. The integration of digital technologies requires a process of educational innovation based on three basic pillars: pedagogical, technological and organisational. In the European Commission's proposal to promote the digitisation of education, digital competence is highlighted as one of the eight key competences that European governments should work on in the field of skills training. Another basic and important element of the key competences is the assessment models of digital competences [...] to new and better assessment methods. The main pillars of this thesis are the DigCompOrg-based assessment of digital competences and the collection of the resulting data for use in a Data Driven Decision Making (DDDM) model for school improvement. The DigCompOrg model guides the analysis and decision-making processes on organisational digitalisation, and DDDM exploits the importance of decision-making processes supported by real organisational data, in this case oriented towards school improvement processes. Data in the context of schools is understood as the set of information that is collected and organised to represent some aspect of the schools under study and, as the theoretical framework of our research indicates, the best way to retrieve information on data concerning digital competences is the DigCompOrg model. For the above reasons, the main objective of this research is to assess the digital competence of a primary school in Greece in terms of the DigCompOrg areas and to propose a data-driven decision making (DDDM) plan for school improvement based on the analysis of evidence of the school's actual state. We have therefore studied the specific case of one school with a view to its improvement, but we have also been able to integrate two theoretical models (DigCompOrg and DDDM) into a practical proposal. The research objectives are as follows: a) To analyse the degree of development of the digital competence of a school in Greece according to the areas covered by the DigCompOrg model, taking into account the opinions of teachers and students. b) To analyse how the variables of the DigCompOrg model affect each other on the basis of the teachers' questionnaire, in order to support self-evaluation and school improvement. c) To design a decision-making plan based on a DDDM model and the previous results obtained on the organisation's digital competence. The analysis of the research results highlighted a moderate negative correlation between the positive psychological influence of ICT use and the negative influence of ICT use on students' education. A moderate positive correlation was found between the positive psychological influence of ICT use and cooperation with other students through ICT use, which indicates that ICT use improved students' psychological development according to their perception and at the same time increased cooperation among them. Finally, this thesis arrived at a proposed DDDM action plan related to the DigCompOrg areas of the teaching dimension and extracted data for school improvement.