Building Near-Real-Time Processing Pipelines with the Spark-MPI Platform
Advances in detectors and computational technologies provide new opportunities for applied research and the fundamental sciences. Concurrently, dramatic increases in the three Vs (Volume, Velocity, and Variety) of experimental data and in the scale of computational tasks have produced a demand for new real-time processing systems at experimental facilities. Recently, this demand was addressed by the Spark-MPI approach, which connects the Spark data-intensive platform with the MPI high-performance framework. In contrast with existing data management and analytics systems, Spark introduced a new middleware based on resilient distributed datasets (RDDs), which decoupled various data sources from high-level processing algorithms. The RDD middleware significantly advanced the scope of data-intensive applications, which now spans SQL queries, machine learning, and graph processing. Spark-MPI further extends the Spark ecosystem with MPI applications via the Process Management Interface. The paper explores this integrated platform within the context of online ptychographic and tomographic reconstruction pipelines.
Comment: New York Scientific Data Summit, August 6-9, 201
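The decoupling the abstract attributes to RDDs can be illustrated without Spark itself. The sketch below is a toy, plain-Python stand-in (class and method names are hypothetical, not from the paper or the Spark API) showing the core idea: a dataset abstraction with lazy transformations, so the same high-level algorithm works regardless of where the data came from.

```python
# Toy illustration of the RDD pattern: lazy transformations over an
# arbitrary data source, evaluated only when results are requested.
# Plain Python, no Spark dependency; names are hypothetical.

class MiniRDD:
    def __init__(self, source):
        self._source = source  # any iterable: a file, a socket, a detector stream...

    def map(self, fn):
        # Lazy: builds a generator, nothing is computed yet.
        return MiniRDD(fn(x) for x in self._source)

    def filter(self, pred):
        return MiniRDD(x for x in self._source if pred(x))

    def collect(self):
        # Evaluation happens only here.
        return list(self._source)

# The processing algorithm is written once, independently of the source.
frames = MiniRDD(range(6))
result = frames.filter(lambda x: x % 2 == 0).map(lambda x: x * x).collect()
# result == [0, 4, 16]
```

In real Spark the same shape appears as `sc.parallelize(...)` or a streaming source followed by `map`/`filter`/`collect`; Spark-MPI's contribution, per the abstract, is letting MPI processes participate in such pipelines.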
Using visual analytics to develop situation awareness in astrophysics
We present a novel collaborative visual analytics application for cognitively overloaded users in the astrophysics domain. The system was developed for scientists who need to analyze heterogeneous, complex data under time pressure, and who must make predictions and time-critical decisions rapidly and correctly under a constant influx of changing data. The Sunfall Data Taking system utilizes several novel visualization and analysis techniques to enable a team of geographically distributed domain specialists to effectively and remotely maneuver a custom-built instrument under challenging operational conditions. Sunfall Data Taking has been in production use for two years by a major international astrophysics collaboration (the largest data volume supernova search currently in operation) and has substantially improved the operational efficiency of its users. We describe the system design process by an interdisciplinary team, the system architecture, and the results of an informal usability evaluation of the production system by domain experts in the context of Endsley's three levels of situation awareness.
Research and Development Workstation Environment: the new class of Current Research Information Systems
Against the backdrop of the development of modern technologies in scientific research, a new class of Current Research Information Systems (CRIS) and related intelligent information technologies has arisen: the Research and Development Workstation Environment (RDWE), comprehensive problem-oriented information systems for supporting the scientific research and development lifecycle. This paper describes the design and development fundamentals of RDWE-class systems. The generalized information model of an RDWE-class system is represented in the article as a three-tuple composite web service comprising: a set of atomic web services, each of which can be designed and developed as a microservice or a desktop application, allowing it to be used independently as standalone software; a set of functions, the functional filling-up of the Research and Development Workstation Environment; and, for each function of the composite web service, the subset of atomic web services required to implement it. In accordance with this fundamental information model of the RDWE class, a system was developed for supporting research in ontology engineering (the automated building of applied ontologies in an arbitrary domain area) and in scientific and technical creativity (the automated preparation of application documents for patenting inventions in Ukraine). It is called the Personal Research Information System. A distinctive feature of such systems is that they can be oriented toward various types of scientific activity by combining a variety of functional services and adding new ones within an integrated cloud environment. The main results of our work are focused on enhancing the effectiveness of the scientist's research and development lifecycle in an arbitrary domain area.
Comment: In English, 13 pages, 1 figure, 1 table, added references in Russian. Published. Prepared for a special issue (UkrPROG 2018 conference) of the scientific journal "Problems of Programming" (Founder: National Academy of Sciences of Ukraine, Institute of Software Systems of NAS Ukraine)
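The three-tuple model the abstract describes (atomic services, functions, and the subset of services realizing each function) can be sketched as a small data structure. This is an illustration only; the service and function names below are hypothetical examples, not taken from the paper.

```python
# Sketch of the RDWE three-tuple composite web service model:
# (atomic services S, functions F, realization mapping F -> subsets of S).
# All concrete names here are invented for illustration.
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class CompositeWebService:
    atomic_services: Set[str]        # S: independently deployable services
    functions: List[str]             # F: the functional filling-up of the RDWE
    realization: Dict[str, Set[str]] # each function -> subset of S implementing it

    def services_for(self, function: str) -> Set[str]:
        """Return the atomic services required to implement a function."""
        return self.realization.get(function, set())


rdwe = CompositeWebService(
    atomic_services={"ontology-builder", "patent-drafting", "doc-store"},
    functions=["build-ontology", "prepare-patent-application"],
    realization={
        "build-ontology": {"ontology-builder", "doc-store"},
        "prepare-patent-application": {"patent-drafting", "doc-store"},
    },
)
```

Because each atomic service stands alone (microservice or desktop application), the same set S can realize different problem orientations simply by changing F and the realization mapping, which matches the "combining functional services" feature the abstract highlights.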
Transdisciplinarity seen through Information, Communication, Computation, (Inter-)Action and Cognition
Just as oil acted as a basic raw material and key driving force of industrial society, information acts as a raw material and principal mover of the knowledge society in the production, propagation, and application of knowledge. New developments in information processing and information communication technologies allow increasingly complex and accurate descriptions, representations, and models, which are often multi-parameter, multi-perspective, multi-level, and multidimensional. This leads to the necessity of collaborative work between different domains, with their corresponding specialist competences, sciences, and research traditions. We present several major transdisciplinary unification projects for information and knowledge, which proceed on the descriptive level, the logical level, and the level of generative mechanisms. A parallel process of boundary crossing and transdisciplinary activity is under way in the applied domains. Technological artifacts are becoming increasingly complex, and their design is strongly user-centered, which brings in not only function and various technological qualities but also other aspects, including aesthetics, user experience, ethics, and sustainability with its social and environmental dimensions. When integrating knowledge from a variety of fields, with contributions from different groups of stakeholders, numerous challenges are met in establishing a common view and a common course of action. In this context, information is our environment, and informational ecology determines both epistemology and spaces for action. We present some insights into the current state of the art of transdisciplinary theory and practice in information studies and informatics. We depict different facets of transdisciplinarity as we see it from our different research fields, which include information studies, computability, human-computer interaction, multi-operating-systems environments, and philosophy.
Comment: Chapter in a forthcoming book in World Scientific: Information Studies and the Quest for Transdisciplinarity. Mark Burgin and Wolfgang Hofkirchner, Editors
From Big Data to Big Displays: High-Performance Visualization at Blue Brain
Blue Brain has pushed high-performance visualization (HPV) to complement its HPC strategy since its inception in 2007. In 2011, this strategy was accelerated to develop innovative visualization solutions through increased funding and strategic partnerships with other research institutions.
We present the key elements of this HPV ecosystem, which integrates C++ visualization applications with novel collaborative display systems. We show how our strategy of transforming visualization engines into services enables a variety of use cases, not only for integration with high-fidelity displays, but also for building service-oriented architectures, linking into web applications, and providing remote services to Python applications.
Comment: ISC 2017 Visualization at Scale workshop
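The "engine as a service" pattern the abstract describes can be sketched minimally: instead of linking the C++ engine into every client, a thin endpoint accepts render requests and returns results, which is what makes web and remote Python clients possible. The sketch below is a hypothetical stdlib-only illustration, not Blue Brain's actual interface; the `render` stand-in and its parameters are assumptions.

```python
# Hypothetical sketch: a visualization engine wrapped behind an HTTP
# endpoint so remote clients request renders via JSON instead of linking
# the engine directly. Stdlib only; not the real Blue Brain API.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def render(params):
    # Stand-in for the real engine call (camera, resolution, etc. assumed).
    return {"status": "ok", "width": params.get("width", 1920)}


class RenderHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON request body, render, and reply with JSON.
        length = int(self.headers["Content-Length"])
        body = json.loads(self.rfile.read(length))
        reply = json.dumps(render(body)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(reply)
```

A deployment would run `HTTPServer(("", 8000), RenderHandler).serve_forever()`; the design choice is that any HTTP-capable client (a web page, a Python script, a display wall controller) becomes a consumer of the same engine.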
Applied business analytics approach to IT projects – Methodological framework
The design and implementation of a big data project differs from that of a typical business intelligence project that might run concurrently within the same organization. A big data initiative typically triggers a large-scale IT project that is expected to deliver the desired outcomes. The industry has identified two major methodologies for running a data-centric project, namely SEMMA (Sample, Explore, Modify, Model, and Assess) and CRISP-DM (Cross-Industry Standard Process for Data Mining). More generally, the professional organizations PMI (Project Management Institute) and IIBA (International Institute of Business Analysis) have defined their methods for project management and business analysis based on current industry best practices. However, big data projects pose new challenges that the existing methodologies do not consider. Building an end-to-end big data analytical solution for optimization of the supply chain, pricing and promotion, product launch, shop potential, and customer value faces both business and technical challenges. The most common business challenges are unclear or poorly defined business cases; irrelevant data; poor data quality; overlooked data granularity; improper contextualization of data; unprepared or poorly prepared data; non-meaningful results; and a lack of skills. Some of the technical challenges relate to a lack of resources and technology limitations; availability of data sources; storage difficulties; security issues; performance problems; limited flexibility; and ineffective DevOps. This paper discusses an applied business analytics approach to IT projects that addresses the aspects described above. The authors present their work on the research and development of a new methodological framework and analytical instruments applicable both to business endeavors and to educational initiatives targeting big data. The proposed framework is based on a proprietary methodology and advanced analytics tools.
It is focused on the development and implementation of practical solutions for project managers, business analysts, IT practitioners, and Business/Data Analytics students. Also under discussion are the necessary skills and knowledge for a successful big data business analyst, and some of the main organizational and operational aspects of big data projects, including continuous model deployment.
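For reference, the two named methodologies decompose into standard phase sequences, which the snippet below encodes (the phases are the published SEMMA and CRISP-DM definitions; the helper function is an illustration only, and the paper's own proprietary framework is not reproduced here).

```python
# Standard phase sequences of the two methodologies named in the abstract.
SEMMA = ["Sample", "Explore", "Modify", "Model", "Assess"]
CRISP_DM = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]


def phase_after(methodology, phase):
    """Return the next phase in a linear walk, or None at the end.
    (CRISP-DM is iterative in practice; this is the linearized view.)"""
    i = methodology.index(phase)
    return methodology[i + 1] if i + 1 < len(methodology) else None
```

Note that neither sequence contains an explicit continuous-deployment or DevOps phase, which is one concrete way to see the gap the paper argues big data projects expose in the existing methodologies.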