
    DimLift: Interactive Hierarchical Data Exploration through Dimensional Bundling

    The identification of interesting patterns and relationships is essential to exploratory data analysis. This becomes increasingly difficult in high-dimensional datasets. While dimensionality reduction techniques can be utilized to reduce the analysis space, these may unintentionally bury key dimensions within a larger grouping and obfuscate meaningful patterns. With this work we introduce DimLift, a novel visual analysis method for creating and interacting with dimensional bundles. Generated through an iterative dimensionality reduction or user-driven approach, dimensional bundles are expressive groups of dimensions that contribute similarly to the variance of a dataset. Interactive exploration and reconstruction methods via a layered parallel coordinates plot allow users to lift interesting and subtle relationships to the surface, even in complex scenarios of missing and mixed data types. We exemplify the power of this technique in an expert case study on clinical cohort data alongside two additional case examples from nutrition and ecology.

    What May Visualization Processes Optimize?

    In this paper, we present an abstract model of visualization and inference processes and describe an information-theoretic measure for optimizing such processes. In order to obtain such an abstraction, we first examined six classes of workflows in data analysis and visualization, and identified four levels of typical visualization components, namely disseminative, observational, analytical and model-developmental visualization. We noticed a common phenomenon at different levels of visualization, that is, the transformation of data spaces (referred to as alphabets) usually corresponds to the reduction of maximal entropy along a workflow. Based on this observation, we establish an information-theoretic measure of cost-benefit ratio that may be used as a cost function for optimizing a data visualization process. To demonstrate the validity of this measure, we examined a number of successful visualization processes in the literature, and showed that the information-theoretic measure can mathematically explain the advantages of such processes over possible alternatives.

    An intelligent information forwarder for healthcare big data systems with distributed wearable sensors

    © 2016 IEEE. An increasing number of elderly people wish to live an independent lifestyle rather than rely on intrusive care programmes. A big data solution is presented using wearable sensors capable of continuously monitoring the elderly, alerting the relevant caregivers when necessary and forwarding pertinent information to a big data system for analysis. A challenge for such a solution is the development of context-awareness from multidimensional, dynamic and nonlinear sensor readings that correlate only weakly with observable human behaviours and health conditions. To address this challenge, a wearable sensor system with an intelligent data forwarder is discussed in this paper. The forwarder adopts a Hidden Markov Model for human behaviour recognition. Locality sensitive hashing is proposed as an efficient mechanism to learn sensor patterns. A prototype solution is implemented to monitor the health conditions of dispersed users. It is shown that the intelligent forwarders can provide the remote sensors with context-awareness. They transmit only important information to the big data server for analytics when certain behaviours happen, and so avoid overwhelming communication and data storage. The system functions unobtrusively, whilst giving the users peace of mind in the knowledge that their safety is being monitored and analysed.

    Methodologies for the assessment of industrial and energy assets, based on data analysis and BI

    In July 2020, after the onset of the pandemic, Europe launched the Next Generation EU (NGEU) program. The resources deployed to revitalize Europe amount to 750 billion euros. The NGEU initiative directs significant resources to Italy. These funds can enable our country to boost investment and increase employment. The missions of the Italian Recovery and Resilience Plan (PNRR) include digitization, innovation and sustainable mobility (rail network investments, etc.). In this context, this doctoral thesis discusses the importance of infrastructure for society, with a special focus on energy, railway and motorway infrastructure. The central theme of sustainability, defined by the World Commission on Environment and Development (WCED) as "development that meets the needs of the present generation without compromising the ability of future generations to meet their needs", is also highlighted. Through their activities and relationships, organizations contribute positively or negatively to the goal of sustainable development. Sustainability becomes an integrated part of corporate culture. The first study in this thesis describes how Artificial Intelligence techniques can play a supporting role both for maintenance operators in tunnel monitoring and for those responsible for safety in operation. Relevant information can be extracted from large volumes of sensor data in an efficient, fast, dynamic and adaptive manner and made immediately usable by those operating machinery and services, to support rapid decisions. Performing sensor-based analysis in motorway tunnels represents a major technological breakthrough that would simplify tunnel management activities, and thus the detection of possible deterioration, while keeping risk within tolerance limits. The idea involves the creation of an algorithm for detecting faults that acquires real-time data from tunnel subsystem sensors and uses it to help identify the tunnel's state of service.
Artificial intelligence models were trained over a six-month period on hourly time series measured in a road tunnel forming part of the Italian motorway system. Verification was carried out with reference to a series of failures recorded by the sensors. The second research topic concerns the transfer capacity of high-voltage overhead lines (HVOHL), which is often limited by the critical temperature of the power line; this depends on the magnitude of the current transferred and on environmental conditions such as ambient temperature and wind. In order to use existing power lines more effectively (with a view to progressive decarbonization) and more safely with respect to critical power line temperatures, this work proposes a Dynamic Thermal Rating (DTR) approach using IoT sensors installed on a number of HVOHL located in different geographical locations in Italy. The objective is to estimate the temperature and ampacity of the OHL conductor, using a data-driven thermomechanical model with a Bayesian probabilistic approach, in order to improve the confidence interval of the results. This work shows that it might be possible to estimate a spatio-temporal temperature distribution for each OHL and an increase in the threshold values of the effective current, so as to optimize the OHL ampacity. The proposed model was validated using the Monte Carlo method. Finally, this thesis presents a study on Key Performance Indicators (KPIs) as indispensable allies of top management in the asset control phase. Managers are often overwhelmed by the huge number of KPIs available. Most managers struggle to understand and identify the few vital management metrics and instead collect and report a vast amount of everything that is easy to measure. As a result, they end up drowning in data while thirsty for information. This condition does not allow good systems management.
The aim of this research is to help the Asset Management System (AMS) of a railway infrastructure manager, using business intelligence (BI), to equip itself with a KPI management system in line with the asset management (AM) approach set out in the ISO 55000, 55001 and 55002 standards and the UIC (International Union of Railways) guideline, for the specific case of a railway infrastructure. This work starts from the study of these standards and continues with the exploration, definition and use of KPIs. Subsequently, the KPIs of a generic infrastructure are identified and analyzed, especially for the specific case of a railway infrastructure manager. These KPIs are fitted into the internal elements of the AM frameworks (ISO-UIC) for systematization. Moreover, the KPIs currently used in the company are analyzed and compared with the KPIs that an infrastructure manager should have. From this comparison, a gap analysis is carried out to optimize the AMS.

    Toward a new Framework of Strategic Alignment of Big Data projects: literature review

    The notion of strategic alignment is a permanent issue for enterprises; it consists of redesigning the architecture in order to reach a perfect harmony between the business architecture and the information technology architecture. Strategic alignment is therefore an old issue, first cited by Henderson and Venkatraman in the 70's, and the scientific community has made many contributions since then. With the emergence of Big Data, many scientists have focused on how to reach strategic alignment using the new technologies that Big Data provides. In our contribution, we define three main categories of approach: reaching strategic alignment through the Big Data ecosystem, through Big Data analytics capability, or through Big Data transformation. We focus on the last one and propose five bricks that a company can use to reach this harmony, by working on its strategy and business model, on its culture and organization, on its marketing strategy based on the client experience, on its technology choices and IT infrastructure, and finally on the treatment of the data gathered and the establishment of measures.

    Wearables for independent living in older adults: Gait and falls

    Solutions are needed to satisfy the care demands of older adults who wish to live independently. Wearable technology (wearables) is one approach that offers a viable means for ubiquitous, sustainable and scalable monitoring of the health of older adults in habitual free-living environments. Gait has been presented as a relevant (bio)marker in ageing and pathological studies, with objective assessment achievable by inertial-based wearables. Commercial wearables have struggled to provide accurate analytics and have been limited by non-clinically oriented gait outcomes. Moreover, some research-grade wearables also fail to provide transparent functionality due to limitations in proprietary software. Innovation within this field is often sporadic, with large heterogeneity of wearable types and algorithms for gait outcomes leading to a lack of pragmatic use. This review provides a summary of the recent literature on gait assessment through the use of wearables, focusing on the need for an algorithm fusion approach to measurement, culminating in the ability to better detect and classify falls. A brief overview of wearables in one pathological group is given, identifying appropriate work for researchers in other cohorts to utilise. Suggestions for how this domain needs to progress are also summarised.

    24th International Conference on Information Modelling and Knowledge Bases

    In the last three decades information modelling and knowledge bases have become essentially important subjects not only in academic communities related to information systems and computer science but also in the business area where information technology is applied. The series of European – Japanese Conference on Information Modelling and Knowledge Bases (EJC) originally started as a co-operation initiative between Japan and Finland in 1982. The practical operations were then organised by professor Ohsuga in Japan and professors Hannu Kangassalo and Hannu Jaakkola in Finland (Nordic countries). Geographical scope has expanded to cover Europe and also other countries. Workshop characteristic - discussion, enough time for presentations and limited number of participants (50) / papers (30) - is typical for the conference. Suggested topics include, but are not limited to: 1. Conceptual modelling: Modelling and specification languages; Domain-specific conceptual modelling; Concepts, concept theories and ontologies; Conceptual modelling of large and heterogeneous systems; Conceptual modelling of spatial, temporal and biological data; Methods for developing, validating and communicating conceptual models. 2. Knowledge and information modelling and discovery: Knowledge discovery, knowledge representation and knowledge management; Advanced data mining and analysis methods; Conceptions of knowledge and information; Modelling information requirements; Intelligent information systems; Information recognition and information modelling. 3. Linguistic modelling: Models of HCI; Information delivery to users; Intelligent informal querying; Linguistic foundation of information and knowledge; Fuzzy linguistic models; Philosophical and linguistic foundations of conceptual models. 4. 
Cross-cultural communication and social computing: Cross-cultural support systems; Integration, evolution and migration of systems; Collaborative societies; Multicultural web-based software systems; Intercultural collaboration and support systems; Social computing, behavioral modeling and prediction. 5. Environmental modelling and engineering: Environmental information systems (architecture); Spatial, temporal and observational information systems; Large-scale environmental systems; Collaborative knowledge base systems; Agent concepts and conceptualisation; Hazard prediction, prevention and steering systems. 6. Multimedia data modelling and systems: Modelling multimedia information and knowledge; Content-based multimedia data management; Content-based multimedia retrieval; Privacy and context enhancing technologies; Semantics and pragmatics of multimedia data; Metadata for multimedia information systems. Overall we received 56 submissions. After careful evaluation, 16 papers have been selected as long papers, 17 papers as short papers, 5 papers as position papers, and 3 papers for presentation of perspective challenges. We thank all colleagues for their support of this issue of the EJC conference, especially the program committee, the organising committee, and the programme coordination team. The long and the short papers presented at the conference are revised after the conference and published in the Series of “Frontiers in Artificial Intelligence” by IOS Press (Amsterdam). The books “Information Modelling and Knowledge Bases” are edited by the Editing Committee of the conference. We believe that the conference will be productive and fruitful in the advance of research and application of information modelling and knowledge bases. Bernhard Thalheim Hannu Jaakkola Yasushi Kiyok

    A Data-driven Methodology Towards Mobility- and Traffic-related Big Spatiotemporal Data Frameworks

    The human population is increasing at unprecedented rates, particularly in urban areas. This increase, along with the rise of a more economically empowered middle class, brings new and complex challenges to the mobility of people within urban areas. To tackle such challenges, transportation and mobility authorities and operators are trying to adopt innovative Big Data-driven Mobility- and Traffic-related solutions. Such solutions will help decision-making processes that aim to ease the load on an already overloaded transport infrastructure. The information collected from day-to-day mobility and traffic can help to mitigate some of these mobility challenges in urban areas. Road infrastructure and traffic management operators (RITMOs) face several limitations in effectively extracting value from the exponentially growing volumes of mobility- and traffic-related Big Spatiotemporal Data (MobiTrafficBD) that are being acquired and gathered. Research on the topics of Big Data, Spatiotemporal Data and especially MobiTrafficBD is scattered, and the existing literature does not offer a concrete, common methodological approach to set up, configure, deploy and use a complete Big Data-based framework to manage the lifecycle of mobility-related spatiotemporal data, mainly focused on geo-referenced time series (GRTS) and spatiotemporal events (ST Events), extract value from it and support the decision-making processes of RITMOs. This doctoral thesis proposes a data-driven, prescriptive methodological approach towards the design, development and deployment of MobiTrafficBD Frameworks focused on GRTS and ST Events.
Besides a thorough literature review on Spatiotemporal Data, Big Data and the merging of these two fields through MobiTrafficBD, the methodological approach comprises a set of general characteristics, technical requirements, logical components, data flows and technological infrastructure models, as well as guidelines and best practices that aim to guide researchers, practitioners and stakeholders, such as RITMOs, throughout the design, development and deployment phases of any MobiTrafficBD Framework. This work is intended to be a supporting methodological guide, based on widely used Reference Architectures and guidelines for Big Data, but enriched with inherent characteristics and concerns brought about by Big Spatiotemporal Data, such as in the case of GRTS and ST Events. The proposed methodology was evaluated and demonstrated in various real-world use cases that deployed MobiTrafficBD-based Data Management, Processing, Analytics and Visualisation methods, tools and technologies, under the umbrella of several research projects funded by the European Commission and the Portuguese Government.

    Approximated and User Steerable tSNE for Progressive Visual Analytics

    Progressive Visual Analytics aims at improving the interactivity in existing analytics techniques by means of visualization as well as interaction with intermediate results. One key method for data analysis is dimensionality reduction, for example, to produce 2D embeddings that can be visualized and analyzed efficiently. t-Distributed Stochastic Neighbor Embedding (tSNE) is a well-suited technique for the visualization of high-dimensional data. tSNE can create meaningful intermediate results but suffers from a slow initialization that constrains its application in Progressive Visual Analytics. We introduce a controllable tSNE approximation (A-tSNE), which trades off speed and accuracy, to enable interactive data exploration. We offer real-time visualization techniques, including a density-based solution and a Magic Lens to inspect the degree of approximation. With this feedback, the user can decide on local refinements and steer the approximation level during the analysis. We demonstrate our technique with several datasets, in a real-world research scenario and for the real-time analysis of high-dimensional streams, to illustrate its effectiveness for interactive data analysis.