
    Parallel programming paradigms and frameworks in big data era

    With Cloud Computing emerging as a promising new approach for ad-hoc parallel data processing, major companies have started to integrate frameworks for parallel data processing into their product portfolios, making it easy for customers to access these services and to deploy their programs. We have entered the era of Big Data. The explosion and profusion of available data in a wide range of application domains raise new challenges and opportunities in a plethora of disciplines, ranging from science and engineering to biology and business. One major challenge is how to take advantage of the unprecedented scale of data, typically of heterogeneous nature, in order to acquire further insights and knowledge for improving the quality of the offered services. To exploit this new resource, we need to scale up and scale out both our infrastructures and standard techniques. Our society is already data-rich, but the question remains whether or not we have the conceptual tools to handle it. In this paper we discuss and analyze opportunities and challenges for efficient parallel data processing. Big Data is the next frontier for innovation, competition, and productivity, and many solutions continue to appear, partly supported by the considerable enthusiasm around the MapReduce paradigm for large-scale data analysis. We review various parallel and distributed programming paradigms, analyzing how they fit into the Big Data era, and present modern emerging paradigms and frameworks. To better support practitioners interested in this domain, we end with an analysis of ongoing research challenges towards a truly fourth-generation data-intensive science.
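    As a minimal illustration of the MapReduce paradigm mentioned in this abstract, the sketch below emulates the map, shuffle, and reduce phases of a word count in a single Python process; it is an illustrative toy, not any of the frameworks reviewed in the paper.

```python
# Single-process emulation of the MapReduce model: map, shuffle, reduce.
from collections import defaultdict
from itertools import chain

def map_phase(document: str):
    # Emit (word, 1) pairs for every word in one input document.
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    # Group intermediate values by key, as a framework would do between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Combine all counts observed for one word.
    return key, sum(values)

documents = ["big data needs parallel processing",
             "parallel frameworks process big data"]
intermediate = chain.from_iterable(map_phase(d) for d in documents)
counts = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
print(counts)  # e.g. {'big': 2, 'data': 2, 'parallel': 2, ...}
```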

    From Microsound to Vaporwave: internet-mediated musics, online methods, and genre

    How is the internet transforming musical practices? In this article, through a study of five prominent popular and crossover music genres spanning the period from the late 1990s to the present, we examine how the internet has augmented the creative, aesthetic, communicative and social dimensions of music. Analysing the internet-based practices associated with these genres poses methodological and theoretical challenges. It requires new research tools attentive to the online practices involved in their creation and reception. To this end, we adapt the Issue Crawler software, an established digital method that analyses networks of hyperlinking on the world-wide web. In addition, it requires a theoretical framework that can respond to music’s profuse mediations in the digital environment. We propose that a version of genre theory offers such a framework. The paper concludes by reflecting on the implications of our analysis for theorising music and place and for historical periodization after the internet.
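    Issue Crawler itself is a hosted tool, so as a hedged illustration of the general idea of hyperlink-network analysis, the Python sketch below crawls a hypothetical seed list and builds a directed link graph with requests, BeautifulSoup, and networkx; it is not the Issue Crawler's method or API, and the seed URLs are placeholders.

```python
# Generic hyperlink-network sketch: crawl seed pages, record who links to whom.
from urllib.parse import urljoin, urlparse

import networkx as nx
import requests
from bs4 import BeautifulSoup

def outlinks(url: str) -> set[str]:
    # Fetch one page and return the set of external hosts it links to.
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    hosts = set()
    for a in soup.find_all("a", href=True):
        host = urlparse(urljoin(url, a["href"])).netloc
        if host and host != urlparse(url).netloc:
            hosts.add(host)
    return hosts

# Hypothetical seed list of genre-related sites (placeholders, not real data).
seeds = ["https://example.org", "https://example.net"]
graph = nx.DiGraph()
for seed in seeds:
    for target in outlinks(seed):
        graph.add_edge(urlparse(seed).netloc, target)

# Simple in-degree centrality as a stand-in for richer co-link analysis.
print(sorted(nx.in_degree_centrality(graph).items(), key=lambda kv: -kv[1])[:10])
```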

    A large-scale empirical study on mobile performance: energy, run-time and memory

    Software performance concerns have been attracting research interest at an increasing rate, especially regarding energy performance in non-wired computing devices. In the context of mobile devices, several research works have been devoted to assessing the performance of software and its underlying code. One important contribution of such research efforts is the definition of sets of programming guidelines aimed at identifying efficient and inefficient programming practices and, consequently, at steering software developers to write performance-friendly code. Despite recent efforts in this direction, it is still almost unfeasible to obtain universal and up-to-date knowledge regarding software and source-code performance, particularly energy performance, where there has been growing interest in optimizing software energy consumption due to the power restrictions of such devices. There are still many difficulties reported by the community in measuring performance, namely in large-scale validation and replication. The Android ecosystem is a particular example, where the great fragmentation of the platform, the constant evolution of the hardware, the software platform and the development libraries themselves, and the fact that most of the platform tools are integrated into the IDE's GUI make it extremely difficult to perform performance studies based on large sets of data/applications. In this paper, we analyze the execution of a diversified corpus of applications of significant magnitude. We analyze the source-code performance of 1322 versions of 215 different Android applications, dynamically executed in more than 27,900 tested scenarios, using state-of-the-art black-box testing frameworks with different combinations of GUI inputs. Our empirical analysis allowed us to observe that semantic program changes, such as adding functionality and repairing bugs, are the changes most associated with a relevant impact on energy performance. Furthermore, we also demonstrate that several coding practices previously identified as energy-
    Funding: EC - European Commission (19135); National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, within project UIDP/50014/2020; by COST Action 19135: “CERICIRAS - Connecting Education and Research Communities for an Innovative Resource Aware Society”; and by Erasmus+ project No. 2020-1-PT01-KA203-078646: “SusTrainable - Promoting Sustainability as a Fundamental Driver in Software Development Training and Education”. The first author is also financed by FCT grant SFRH/BD/146624/201
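    As a hedged sketch of how black-box GUI testing of Android applications can be driven in the spirit described above, the following Python snippet shells out to the standard adb tools 'monkey' and 'dumpsys batterystats'; the package name and event count are hypothetical placeholders, and this is not the study's actual measurement pipeline.

```python
# Drive pseudo-random GUI events into one app and dump battery statistics.
import subprocess

PACKAGE = "com.example.app"  # hypothetical application under test

def run(cmd: list[str]) -> str:
    # Run an adb command and return its textual output.
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Reset battery statistics so the next reading covers only this scenario.
run(["adb", "shell", "dumpsys", "batterystats", "--reset"])

# Inject 500 pseudo-random GUI events (taps, swipes, key presses) into the app,
# pausing 100 ms between events.
run(["adb", "shell", "monkey", "-p", PACKAGE, "--throttle", "100", "-v", "500"])

# Dump the accumulated battery statistics for later per-package analysis.
report = run(["adb", "shell", "dumpsys", "batterystats", PACKAGE])
print(report[:1000])
```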

    Eliminating Code Duplication in Cascading Style Sheets

    Cascading Style Sheets (i.e., CSS) is the standard styling language, widely used for defining the presentation semantics of user interfaces for web, mobile and desktop applications. Despite its popularity, CSS has not received much attention from academia. Indeed, developing and maintaining CSS code is rather challenging, due to the inherent language design shortcomings, the interplay of CSS with other programming languages (e.g., HTML and JavaScript), the lack of empirically-evaluated coding best practices, and immature tool support. As a result, the quality of CSS code bases is poor in many cases. In this thesis, we focus on one of the major issues found in CSS code bases, i.e., duplicated code. In a large, representative dataset of CSS code, we found an average of 68% duplication in style declarations. To alleviate this, we devise techniques for refactoring CSS code (i.e., grouping style declarations into new style rules), or migrating CSS code to take advantage of the code abstraction features provided by CSS preprocessor languages (i.e., superset languages for CSS that augment it by adding extra features that facilitate code maintenance). Specifically for the migration transformations, we attempt to align the resulting code with manually-developed code, by relying on the knowledge gained by conducting an empirical study on the use of CSS preprocessors, which revealed the common coding practices of the developers who use CSS preprocessor languages. To guarantee the behavior preservation of the proposed transformations, we come up with a list of preconditions that should be met, and also describe a lightweight testing technique. By applying a large number of transformations on several web sites and web applications, it is shown that the transformations are indeed presentation-preserving, and can effectively reduce the amount of duplicated code in CSS.
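    As a minimal sketch of the grouping refactoring described above (extracting declarations duplicated across rules into one shared rule), the following Python snippet works on a toy dict-based representation of CSS; the selectors and the duplication threshold are hypothetical, and the thesis's behaviour-preservation preconditions are not modelled here.

```python
# Detect declarations shared by pairs of rules and emit a grouped rule.
from itertools import combinations

rules = {
    ".header": {"color": "#333", "margin": "0", "font-weight": "bold"},
    ".footer": {"color": "#333", "margin": "0", "padding": "1em"},
}

def shared_declarations(a: dict, b: dict) -> dict:
    # Declarations with identical property and value in both rules.
    return {prop: val for prop, val in a.items() if b.get(prop) == val}

for (sel_a, decl_a), (sel_b, decl_b) in combinations(rules.items(), 2):
    common = shared_declarations(decl_a, decl_b)
    if len(common) >= 2:  # only group when enough duplication is removed
        body = "; ".join(f"{p}: {v}" for p, v in common.items())
        print(f"{sel_a}, {sel_b} {{ {body} }}")  # new grouped style rule
        for prop in common:                      # drop duplicates from originals
            del decl_a[prop]
            del decl_b[prop]
```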

    Adaptive Big Data Pipeline

    Over the past three decades, data has evolved from being a simple software by-product to one of companies’ most important assets, used to understand their customers and foresee trends. Deep learning has demonstrated that big volumes of clean data generally provide more flexibility and accuracy when modeling a phenomenon. However, handling ever-increasing data volumes entails new challenges: the lack of expertise to select the appropriate big data tools for the processing pipelines, as well as the speed at which engineers can take such pipelines into production reliably, leveraging the cloud. We introduce a system called Adaptive Big Data Pipelines: a platform to automate data pipeline creation. It provides an interface to capture the data sources, transformations, destinations and execution schedule. The system sets up the cloud infrastructure, schedules and fine-tunes the transformations, and creates the data lineage graph. This system has been tested on data sets of 50 gigabytes, processing them in just a few minutes without user intervention.
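    As a hedged illustration of the kind of declarative pipeline specification the abstract describes (sources, transformations, destination and schedule, plus a derived lineage graph), the Python sketch below uses hypothetical class and field names rather than the actual system's interface.

```python
# Toy declarative pipeline specification with a derived lineage graph.
from dataclasses import dataclass

@dataclass
class PipelineSpec:
    name: str
    sources: list[str]            # e.g. object-store URIs or database tables
    transformations: list[str]    # ordered transformation steps
    destination: str              # where the processed data is written
    schedule: str = "@daily"      # cron-like execution schedule

    def lineage(self) -> list[tuple[str, str]]:
        # Edges of a simple data-lineage graph: every source feeds the first
        # transformation, then steps chain through to the destination.
        steps = self.transformations + [self.destination]
        edges = [(src, steps[0]) for src in self.sources]
        edges += [(steps[i], steps[i + 1]) for i in range(len(steps) - 1)]
        return edges

spec = PipelineSpec(
    name="sales_daily",
    sources=["s3://raw/sales/"],                       # hypothetical source
    transformations=["clean_nulls", "aggregate_by_region"],
    destination="warehouse.sales_summary",
)
print(spec.lineage())
```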

    High Energy Physics Forum for Computational Excellence: Working Group Reports (I. Applications Software II. Software Libraries and Tools III. Systems)

    Computing plays an essential role in all aspects of high energy physics. As computational technology evolves rapidly in new directions, and data throughput and volume continue to follow a steep trend-line, it is important for the HEP community to develop an effective response to a series of expected challenges. In order to help shape the desired response, the HEP Forum for Computational Excellence (HEP-FCE) initiated a roadmap planning activity with two key overlapping drivers: 1) software effectiveness, and 2) infrastructure and expertise advancement. The HEP-FCE formed three working groups, 1) Applications Software, 2) Software Libraries and Tools, and 3) Systems (including systems software), to provide an overview of the current status of HEP computing and to present findings and opportunities for the desired HEP computational roadmap. The final versions of the reports are combined in this document, and are presented along with introductory material.

    BlogForever D3.2: Interoperability Prospects

    This report evaluates the interoperability prospects of the BlogForever platform. To this end, existing interoperability models are reviewed; a Delphi study to identify crucial aspects for the interoperability of web archives and digital libraries is conducted; technical interoperability standards and protocols are reviewed regarding their relevance for BlogForever; a simple approach to consider interoperability in specific usage scenarios is proposed; and a tangible approach to develop a succession plan that would allow a reliable transfer of content from the current digital archive to other digital repositories is presented.

    Ground Processing Affordability for Space Vehicles

    Launch vehicles and most of their payloads spend the majority of their time on the ground. The cost of ground operations is very high. So why is so little attention so often given to ground processing during development? The current global space industry and economic environment are driving more need for efficiencies to save time and money. Affordability and sustainability are more important now than ever. We cannot continue to treat space vehicles as mere science projects. More RLVs (Reusable Launch Vehicles) are being developed for the gains of reusability, which are not available for ELVs (Expendable Launch Vehicles). More human-rated vehicles are being developed, with the retirement of the Space Shuttles and for a new global space race, yet these cost more than the many unmanned vehicles of today. We can learn many lessons on affordability from RLVs. DFO (Design for Operations) considers ground operations during design, development, and manufacturing, before the first flight. This is often minimized for space vehicles, but is very important. Vehicles are designed for launch and mission operations; you will not be able to do it again if it is too slow or costly to get there. Many times, technology changes faster than space products, such that what is launched includes outdated features, thus reducing competitiveness. Ground operations must be considered for the full product lifecycle, from concept to retirement. Once manufactured, launch vehicles along with their payloads and launch systems require a long path of processing before launch. Initial assembly and testing always discover problems to address. A solid integration program is essential to minimize these impacts, as was seen in the Constellation Ares I-X test rocket. For RLVs, landing/recovery and post-flight turnaround activities are performed. Multi-use vehicles require reconfiguration. MRO (Maintenance, Repair, and Overhaul) must be well planned, even for the unplanned problems. Defect limits and standard repairs need to be in place as well as easily added. Many routine inspections and maintenance tasks can be like an aircraft overhaul. Modifications and technology upgrades should be expected. Another factor affecting ground operations efficiency is trending. It is essential for RLVs, and also useful for ELVs, which fly the same or similar models again. Good data analysis of technical and processing performance will determine fixes and improvements needed for safety, design, and future processing. Collecting such data on new or low-frequency vehicles is a challenge. Lessons can be learned from the Space Shuttle, or even the Concorde aircraft. For all of the above topics, efficient business systems must be established for comprehensive program management and good throughput. Drawings, specifications, and manuals for an entire launch vehicle are often in different formats from multiple vendors, and they carry proprietary constraints. Nonetheless, the integration team must ensure that all needed data is compatible and visible to each appropriate team member. Ground processing systems for scheduling, tracking, problem resolution, etc. must be well laid out. The balance between COTS (commercial off the shelf) and custom software is difficult. Multiple customers, vendors, launch sites, and landing sites add to the complexity of efficient IT (Information Technology) tools.

    A Cybersecurity review of Healthcare Industry

    Background. Cybersecurity is not a new concept of our times; since the 1960s it has been a field of discussion and research. Although security defence mechanisms have evolved, attackers' capabilities have grown at an equal or greater pace. Proof of this is the precarious cybersecurity posture of many companies, which has led to an increase in ransomware attacks and the establishment of large criminal organisations dedicated to cybercrime. This situation makes clear the need for progress and investment in cybersecurity across many sectors, and it is especially relevant to the protection of critical infrastructures. Critical infrastructures are those strategic infrastructures whose operation is indispensable and admits no alternative, so that their disruption or destruction would have a serious impact on essential services. Healthcare services and infrastructures fall within this category. They provide a service whose interruption carries severe consequences, including the loss of human lives. A cyberattack can affect these healthcare services, bringing them to a total or partial standstill, as recent incidents have shown, in some cases even leading to loss of life. Moreover, these services hold a great deal of highly sensitive personal information. Medical data are highly valued on illegal markets and are therefore the target of attacks focused on their theft. It should also be noted that, like other sectors, healthcare services are currently undergoing a process of digitalisation. This evolution has largely neglected cybersecurity, contributing to the growth and severity of the attacks mentioned above.
    Methodology and research. The work presented in this thesis follows an experimental and deductive method. The research has focused on assessing the state of cybersecurity in healthcare infrastructures and on proposing improvements and mechanisms for detecting cyberattacks. The three scientific publications included in this thesis aim to provide solutions to, and evaluate, current problems in the field of healthcare infrastructures and systems. The first publication, 'Mobile malware detection using machine learning techniques', focused on developing new threat-detection techniques based on artificial intelligence and machine learning. This research produced a method for detecting potentially unwanted and malicious applications in Android mobile environments. Its design and implementation took into account the specific needs of healthcare environments, aiming for a simple, viable deployment suited to these centres, and obtained satisfactory results. The second publication, 'Interconnection Between Darknets', sought to identify and detect the theft and sale of medical data on darknets. This research led to the discovery and demonstration of the interconnection between different darknets. Searching for and analysing information in these networks showed how different networks share information and references with one another; analysing one darknet therefore entails analysing others in order to obtain a more complete picture of the first. Finally, the last publication, 'Security and privacy issues of data-over-sound technologies used in IoT healthcare devices', set out to investigate and evaluate the security of IoT (Internet of Things) medical devices. For this research a medical device was acquired, a portable electrocardiograph currently in use in several hospitals. The tests performed on this device uncovered multiple cybersecurity flaws. These findings highlighted the lack of mandatory cybersecurity certifications and reviews for medical products currently on the market. Unfortunately, the lack of a dedicated research budget did not allow the acquisition of further medical devices for subsequent cybersecurity evaluation.
    Conclusions. The work and research described above led to the following conclusions. Starting from the need for cybersecurity mechanisms in healthcare infrastructures, their particular design and operation must be taken into account: the cybersecurity tests and mechanisms designed must be applicable in real environments. Unfortunately, healthcare infrastructures currently contain technological systems that cannot be updated or modified. Many treatment and diagnostic machines run proprietary software and operating systems to which administrators and staff have no access. Given this situation, measures must be developed that can be applied in this ecosystem and that, as far as possible, reduce and mitigate the risk posed by these systems. This conclusion is tied to the lack of security in medical devices. Most medical devices have not followed a secure design process and have not been subjected to security testing by their manufacturers, since such testing represents a direct cost in product development. The only solution here is legislation that forces manufacturers to meet security standards, and although regulatory progress has been made, it will take years or decades to replace insecure devices. The impossibility of updating them, or flaws rooted in the products' hardware, makes it impossible to fix every security flaw that is discovered, ultimately forcing replacement of the device once a satisfactory alternative exists from a cybersecurity standpoint. For this reason, new cybersecurity mechanisms are needed that can be applied now and can mitigate these risks during this transition period. Finally, regarding data theft: although the preliminary investigations carried out in this thesis did not yield any significant findings on the theft and sale of data, darknets, and in particular the Tor network, have now become a key element of the Ransomware as a Business (RaaB) model, hosting extortion sites and contact points for these groups.
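    As a hedged illustration of the general approach named in the first publication (classifying Android applications as potentially unwanted using machine learning), the Python sketch below trains a small scikit-learn model on hypothetical permission-based features; the feature set, the in-line data, and the model choice are illustrative and are not taken from the thesis.

```python
# Toy classifier for flagging potentially unwanted Android applications.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Each row (hypothetical): [number of requested permissions, uses SEND_SMS,
# uses READ_CONTACTS, requests device-admin privileges]
X = [
    [4, 0, 0, 0], [6, 0, 1, 0], [5, 0, 0, 0],    # apps labelled benign (0)
    [18, 1, 1, 1], [22, 1, 1, 0], [15, 1, 0, 1], # apps labelled unwanted (1)
]
y = [0, 0, 0, 1, 1, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0, stratify=y
)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```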