718,773 research outputs found

    Defining Big Data

    As Big Data becomes better understood, there is a need for a comprehensive definition of Big Data to support work in fields such as data quality for Big Data. Existing definitions either define Big Data by comparison with existing, usually relational, definitions, define it in terms of data characteristics, or combine data characteristics with the Big Data environment. In this paper we examine existing definitions of Big Data and discuss the strengths and limitations of the different approaches, with particular reference to issues related to data quality in Big Data. We identify the issues presented by incomplete or inconsistent definitions. We propose an alternative definition and relate this definition to our work on quality in Big Data.

    An Iterative Methodology for Defining Big Data Analytics Architectures

    Thanks to the advances achieved in the last decade, the lack of adequate technologies to deal with Big Data characteristics such as Data Volume is no longer an issue. Instead, recent studies highlight that one of the main Big Data issues is the lack of expertise to select adequate technologies and build the correct Big Data architecture for the problem at hand. To tackle this problem, we present our methodology for the generation of Big Data pipelines, based on several requirements derived from Big Data features that are critical for the selection of the most appropriate tools and techniques. Our approach reduces the know-how required to select and build Big Data architectures by providing a step-by-step methodology that guides Big Data architects in creating their Big Data pipelines for the case at hand. Our methodology has been tested in two use cases. This work has been funded by the ECLIPSE project (RTI2018-094283-B-C32) from the Spanish Ministry of Science, Innovation and Universities.

    Directional Decision Lists

    In this paper we introduce a novel family of decision lists consisting of highly interpretable models which can be learned efficiently in a greedy manner. The defining property is that all rules are oriented in the same direction. Particular examples of this family are decision lists with monotonically decreasing (or increasing) probabilities. On simulated data we empirically confirm that the proposed model family is easier to train than general decision lists. We exemplify the practical usability of our approach by identifying problem symptoms in a manufacturing process. Comment: IEEE Big Data for Advanced Manufacturing

    Ethics in tax practice: A study of the effect of practitioner firm size

    While much of the empirical accounting literature suggests that, if differences do exist, Big Four employees are more ethical than non-Big Four employees, this trend has not been evident in the recent media coverage of Big Four tax practitioners acting for multinationals accused of aggressive tax avoidance behaviour. However, there has been little exploration in the literature to date specifically of the relationship between firm size and ethics in tax practice. We aim here to address this gap, first exploring tax practitioners’ perceptions of the impact of firm size on ethics in tax practice using interview data in order to identify the salient issues involved. We then assess quantitatively whether employer firm size has an impact on the ethical reasoning of tax practitioners, using a tax context-specific adaptation of a well-known and validated psychometric instrument, the Defining Issues Test.

    Defining Interaction Design Patterns to Extract Knowledge from Big Data

    The Big Data domain offers valuable opportunities to gain knowledge. The User Interface (UI), the place where the user interacts to extract knowledge from data, must be adapted to address the complexities of the domain. Designing UIs for Big Data is a challenge that involves identifying and designing the user-data interaction involved in knowledge extraction. One widely used approach to designing such an interaction is design patterns, which describe solutions to common interaction design problems. This paper proposes a set of patterns to design UIs aimed at extracting knowledge from the Big Data systems data conceptual schemas. As a practical example, we apply the patterns to design UIs for the Diagnosis of Genetic Diseases domain, since it is a clear case of extracting knowledge from a complex set of genetic data. Our patterns provide valuable design guidelines for Big Data UIs. The authors thank the members of the PROS Center's Genome group for fruitful discussions. Secretaria Nacional de Educacion, Ciencia y Tecnologia (SENESCYT) and Escuela Politecnica Nacional from Ecuador have supported this work. This project also has the support of Generalitat Valenciana through project IDEO (PROMETEOII/2014/039) and the Spanish Ministry of Science and Innovation through project DataME (ref: TIN2016-80811-P). Iñiguez Jarrín, CE.; Panach Navarrete, JI.; Pastor López, O. (2018). Defining Interaction Design Patterns to Extract Knowledge from Big Data. Springer. 490-504. https://doi.org/10.1007/978-3-319-91563-0_30

    BIG MAC: A bolometer array for mid-infrared astronomy, Center Director's Discretionary Fund

    The infrared array referred to as Big Mac (for Marshall Array Camera) was designed for ground-based astronomical observations in the wavelength range 5 to 35 microns. It contains 20 discrete gallium-doped germanium bolometer detectors at a temperature of 1.4 K. Each bolometer is irradiated by a square field mirror constituting a single pixel of the array. The mirrors are arranged contiguously in four columns and five rows, thus defining the array configuration. Big Mac uses cold reimaging optics and an upward-looking dewar. The total Big Mac system also contains a telescope interface tube for mounting the dewar and a computer for data acquisition and processing. Initial astronomical observations at a major infrared observatory indicate that Big Mac meets its design specifications and performs well as a tool for astrophysics.

    Understanding the Potential and Challenges of Big Data in Schools and Education (Comprendiendo el potencial y los desafíos del Big Data en las escuelas y la educación)

    In recent years, the world has experienced a huge revolution centered around the gathering and application of big data in various fields. This has affected many aspects of our daily life, including government, manufacturing, commerce, health, communication, entertainment, and many more. So far, education has benefited only a little from the big data revolution. In this article, we review the potential of big data in the context of education systems. Such data may include log files drawn from online learning environments, messages on online discussion forums, answers to open-ended questions, grades on various tasks, demographic and administrative information, speech, handwritten notes, illustrations, gestures and movements, neurophysiologic signals, eye movements, and many more. By analyzing this data, it is possible to calculate a wide range of measurements of the learning process and to support various educational stakeholders with informed decision-making. We offer a framework for better understanding how big data can be used in education. The framework comprises several elements that need to be addressed in this context: defining the data; formulating data-collecting and storage apparatuses; and data analysis and the application of analysis products. We further review some key opportunities and some important challenges of using big data in education.

    Big Data -- A 21st Century Science Maginot Line? No-Boundary Thinking: Shifting from the Big Data Paradigm

    Whether your interests lie in scientific arenas, the corporate world, or in government, you have certainly heard the praises of big data: big data will give you new insights, allow you to become more efficient, and/or solve your problems. While big data has had some outstanding successes, many are now beginning to see that it is not the Silver Bullet that it has been touted to be. Here our main concern is the overall impact of big data; the current manifestation of big data is constructing a Maginot Line in science in the 21st century. Big data is no longer simply the phenomenon of having lots of data; the big data paradigm is putting the spirit of the Maginot Line into lots of data. Overall, big data is disconnecting researchers from science challenges. We propose No-Boundary Thinking (NBT), which applies no-boundary thinking to problem definition in order to address science challenges.