8 research outputs found

    Alexandria: Extensible Framework for Rapid Exploration of Social Media

    The Alexandria system, under development at IBM Research, provides an extensible framework and platform for supporting a variety of big-data analytics and visualizations. The system is currently focused on enabling rapid exploration of text-based social media data. It provides tools to help construct "domain models" (i.e., families of keywords and extractors that focus attention on tweets and other social media documents relevant to a project), to rapidly extract and segment the relevant social media and its authors, to apply further analytics (such as finding trends and anomalous terms), and to visualize the results. The system architecture is centered around a variety of REST-based service APIs that enable flexible orchestration of the system capabilities; these are especially useful for supporting knowledge-worker-driven iterative exploration of social phenomena. The architecture also enables rapid integration of Alexandria capabilities with other social media analytics systems, as has been demonstrated through an integration with IBM Research's SystemG. This paper describes a prototypical usage scenario for Alexandria, along with the architecture and key underlying analytics. Comment: 8 pages.
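    The notion of a "domain model" as a family of keywords used to focus on relevant documents can be illustrated with a minimal sketch. This is not Alexandria's API: the topic names, keywords, and the functions `compile_domain_model`, `match_topics`, and `segment` are all hypothetical, and real extractors would likely be richer than whole-word keyword matching.

```python
import re

# Hypothetical domain model: topic name -> family of keywords (illustrative only).
DOMAIN_MODEL = {
    "flooding": ["flood", "levee", "storm surge"],
    "transit": ["subway", "bus delay"],
}


def compile_domain_model(model):
    """Build one case-insensitive, whole-word regex per topic."""
    compiled = {}
    for topic, keywords in model.items():
        alternatives = "|".join(re.escape(k) for k in keywords)
        compiled[topic] = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return compiled


def match_topics(text, compiled):
    """Return the sorted list of topics whose keywords occur in the text."""
    return sorted(topic for topic, rx in compiled.items() if rx.search(text))


def segment(docs, compiled):
    """Group documents by matched topic; documents matching nothing are dropped."""
    segments = {}
    for doc in docs:
        for topic in match_topics(doc["text"], compiled):
            segments.setdefault(topic, []).append(doc)
    return segments
```

    Because matching is whole-word, inflected forms such as "floods" would need to be listed as separate keywords in this sketch.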

    Visualization approach to effective decision making on hydrological data

    Temporal data is by nature arranged in time sequence, and the order of the data is highly significant. To visualize temporal data, that order has to be preserved so that trends and temporal patterns are revealed. Most visualization techniques, however, use technical visual representations such as bar charts and line graphs, an approach that is suitable for, and easily comprehended by, technical users only. To reduce the learning curve in understanding the developed prototype and to facilitate decision making, a metaphor-based visualization approach was used to represent temporal hydrological data. To evaluate the correctness of decision making, a similarity test was conducted using a data mining approach, specifically case-based reasoning. The test case (new data) was compared with cases extracted from previous operational data, and the case was closely examined by exploring the detailed data. Results were evaluated through usability testing and similarity testing. The prototype was demonstrated to a group of users, specifically three DID staff involved directly or indirectly with dam operation. The feedback received was positive: users needed only a short time to learn and understand the interface objects because the representation was familiar. A single look at the map gives them an overall picture of the patterns of dam water level and rainfall around the catchment area for the chosen time frame. The metaphor-based visualization serves as a basis for representing temporal and multivariate data, using an icon-based technique and colour coding to enhance interface usability and usefulness. This type of representation can be easily understood by a non-expert in the domain. The visualization assists users in decision making by representing patterns in a form close to the user's mental model through the use of metaphor. This helps speed up data exploration and thus the decision-making process. In critical situations, speed and accuracy are vital in the decision-making process.
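    The case-based reasoning step, in which a new case is compared with cases from previous operational data, can be sketched as a nearest-case retrieval. The abstract does not specify the similarity measure, so the one below (one minus the mean range-normalised absolute difference) is an assumption, as are the feature names `water_level_m` and `rainfall_mm` and the functions `feature_ranges`, `similarity`, and `retrieve`.

```python
def feature_ranges(case_base, features):
    """Per-feature (min, max) over the stored cases, used for normalisation."""
    return {
        f: (min(c[f] for c in case_base), max(c[f] for c in case_base))
        for f in features
    }


def similarity(a, b, ranges):
    """Similarity in [0, 1]: 1 minus the mean range-normalised absolute difference."""
    total = 0.0
    for f, (lo, hi) in ranges.items():
        span = hi - lo
        diff = abs(a[f] - b[f]) / span if span else 0.0
        total += min(diff, 1.0)  # clamp in case the new case lies outside the range
    return 1.0 - total / len(ranges)


def retrieve(new_case, case_base, features, k=1):
    """Return the k stored cases most similar to the new case."""
    ranges = feature_ranges(case_base, features)
    ranked = sorted(
        case_base,
        key=lambda c: similarity(new_case, c, ranges),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical past dam-operation cases (values are illustrative, not from the study).
CASE_BASE = [
    {"id": "A", "water_level_m": 28.0, "rainfall_mm": 10.0},
    {"id": "B", "water_level_m": 29.5, "rainfall_mm": 80.0},
    {"id": "C", "water_level_m": 27.0, "rainfall_mm": 0.0},
]
FEATURES = ["water_level_m", "rainfall_mm"]
```

    With these illustrative values, calling `retrieve({"water_level_m": 29.0, "rainfall_mm": 70.0}, CASE_BASE, FEATURES)` ranks case "B" first; the retrieved case is the one an operator would then inspect in detail.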

    Decision Support Elements and Enabling Techniques to Achieve a Cyber Defence Situational Awareness Capability

    This doctoral thesis performs a detailed analysis of the decision elements necessary to improve cyber defence situational awareness, with special emphasis on the perception and understanding of the analyst in a cybersecurity operations center (SOC). Two different architectures based on network flow forensics of data streams (NF3) are proposed. The first architecture uses Ensemble Machine Learning techniques, while the second is a Machine Learning variant of greater algorithmic complexity (lambda-NF3) that offers a more robust defense framework against adversarial attacks. Both proposals seek to effectively automate the detection of malware and its subsequent incident management, showing satisfactory results in approximating what has been called a next-generation cognitive computing SOC (NGC2SOC). The supervision and monitoring of events for the protection of an organisation's computer networks must be accompanied by visualisation techniques. In this case, the thesis addresses the generation of three-dimensional representations based on mission-oriented metrics and procedures that use an expert system based on fuzzy logic. Indeed, the state of the art shows serious deficiencies when it comes to implementing cyber defence solutions that reflect the relevance of the mission, resources and tasks of an organisation for a better-informed decision. The research work finally provides two key areas to improve decision-making in cyber defence: a solid and complete verification and validation framework to evaluate solution parameters, and the development of a synthetic dataset that unambiguously maps the phases of a cyber-attack to the Cyber Kill Chain and MITRE ATT&CK standards.
    Llopis Sánchez, S. (2023). Decision Support Elements and Enabling Techniques to Achieve a Cyber Defence Situational Awareness Capability [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/19424

    Investigating and reflecting on the integration of automatic data analysis and visualization in knowledge discovery

    The aim of this work is to survey and reflect on the various ways visualization and data mining can be integrated to achieve effective knowledge discovery by involving the best of human and machine capabilities. Following a bottom-up bibliographic research approach, the article categorizes the observed techniques into classes, highlighting current trends, gaps, and potential future directions for research. In particular, it looks at the strengths and weaknesses of information visualization (infovis) and data mining, at the purposes for which researchers in infovis use data mining techniques and, conversely, at how researchers in data mining employ infovis techniques. The article then proposes, on the basis of the extracted patterns, a series of potential extensions not found in the literature. Finally, we use this information to analyze the discovery process by comparing the analysis steps from the perspective of information visualization and data mining. The comparison brings to light new perspectives on how mining and visualization can best employ human and machine strengths. This activity leads to a series of reflections and research questions that can help to further advance the science of visual analytics.

    Visualisation of data to optimise strategic decision making

    1.1 Purpose of the study: The purpose of this research was to explain the principles that should be adopted when developing data visualisations for effective strategic decision making. 1.1.1 Main problem statement: Big data is produced at exponential rates and organisational executives may not possess the appropriate skill or knowledge to consume it for rigorous and timely strategic decision-making (Li, Tiwari, Alcock, & Bermell-Garcia, 2016; Marshall & De la Harpe, 2009; McNeely & Hahm, 2014). 1.1.2 Sub-problems: Organisational executives, including Chief Executive Officers (CEOs), Chief Financial Officers (CFOs) and Chief Operating Officers (COOs), possess unique and differing characteristics, including education, IT skill, goals and experiences, that impact on their strategic decision-making ability (Campbell, Chang, & Hosseinian-Far, 2015; Clayton, 2013; Krotov, 2015; Montibeller & Winterfeldt, 2015; Toker, Conati, Steichen, & Carenini, 2013; Xu, 2014). Furthermore, data visualisations are often not "fit-for-purpose", meaning they do not consistently or adequately guide executive strategic decision-making for organisational success (Nevo, Nevo, Kumar, Braasch, & Mathews, 2015). Finally, data visualisation development currently faces challenges, including resolving the interaction between data and human intuition, as well as the incorporation of big data to derive competitive advantage (Goes, 2014; Moorthy et al., 2015; Teras & Raghunathan, 2015). 1.1.3 Research Questions: Based on the challenges identified in sections 1.1.1 and 1.1.2, the researcher has identified three research questions. RQ1: What do individual organisational executives value and use in data and data visualisation for strategic decision-making purposes? RQ2: How does data visualisation impact on an executive's ability to use and digest relevant information, including on his/her decision-making speed and confidence? 
RQ3: What elements should data analysts consider when developing data visualisations? 1.2 Rationale: The study will provide guidance to data analysts on how to develop and rethink their data visualisation methods, based on responses from organisational executives tasked with strategic decision-making. By performing this study, data analysts and executives will both benefit, as data analysts will gain knowledge and understanding of what executives value and use in data visualisations, while executives will have a platform to raise their requirements, improving the effectiveness of data visualisations for strategic decision-making. 1.3 Research Method: Qualitative research was the research method used in this research study. Qualitative research could be described as using words rather than precise measurements or calculations when performing data collection and analysis and uses methods of observation, human experiences and inquiry to explain the results of a study (Bryman, 2015; Myers, 2013). Its importance in social science research has increased, as there is a need to further understand the connection of the research study to people's emotions, culture and experiences (Creswell, 2013; Lub, 2015). This supports the ontological view of the researcher, which is an interpretivist's view (Eriksson & Kovalainen, 2015; Ormston, Spencer, Barnard, & Snape, 2014). The epistemology was interpretivism, as the researcher interviewed executives and data analysts (Eriksson & Kovalainen, 2015; Ritchie, Lewis, Nicholls, & Ormston, 2013). Furthermore, literature relating to decision-making supported the researcher's interpretivist view, as people generally make decisions based on what they know at the time (Betsch & Haberstroh, 2014). 
Therefore, the researcher cannot separate the participant from his/her views (Dhochak & Sharma, 2016). The population for this research comprised 13 executives tasked with strategic decision-making, as well as 4 data analysts who were either internal (permanent employees) or external (consultants) to the organisation within the private sector. 1.4 Conclusion: RQ1: What do individual organisational executives value and use in data and data visualisation for strategic decision-making purposes? Based upon the findings, to answer RQ1, organisational executives must first be clear on the value of the decision. No benefit will be derived from data visualisation if the decision lacks value. The executives also stressed the importance of understanding how data relevancy was identified, based on the premise used by the data visualisation developers. Executives also value source data accuracy and avoiding the one-dimensional view that results from incorporating data from only one source. Hence the value of dynamism, or differing data angles, is important. In terms of the value in data visualisation, it must provide simplicity, clarity, intuitiveness, insightfulness, and the capability to reveal gaps, patterns and trends in a collaboration-enabling manner, supporting the requirements and decision objectives of the executive. However, an additional finding also identified the importance of the executive's knowledge of, and familiarity with, the topic at hand. Finally, the presenter of the visualisation must also provide a guiding force to assist the executive in reaching a final decision, but not actually formulate the decision for the executive. RQ2: How does data visualisation impact on an executive's ability to use and digest relevant information, including on his/her decision-making speed and confidence? Based on the findings, to answer RQ2, themes of consumption, speed and confidence can be used. However, the final themes of use and trust overlap with the initial three themes. 
Consumption is impacted by the data visualisation's ability to talk to the objective of the decision and the ability of the technology used to map the mental model and thinking processes of the decision-maker. Furthermore, data visualisations must not only identify the best decision, but also help the executive to define actionable steps to meet the goal of the decision. Executives appreciate the knowledge and skill of peers and prefer an open approach to decision-making, provided that each inclusion is to the benefit of the organisation as a whole. Benchmark statistics from similar industries also add to the consumption factor. Speed was only defined in terms of the data visualisation design, including the use of contrasting elements, such as colour, to highlight anomalies and areas of interest with greater speed. Furthermore, tolerance limits can also assist the executive in identifying where thresholds have been surpassed, or where areas of underperformance have occurred, focussing on problem areas within the organisation. Finally, confidence is not only impacted by the data visualisation itself but is also affected by the executive's knowledge of the decision and the factors affecting the decision, the ability of the data visualisation presenter to understand, guide and add value to the decision process, the accuracy and integrity of the data presented, the familiarity of the technology used to present the data visualisation and the ability of the data visualisation to enable explorative and collaborative methods for decision-making. RQ3: What elements should data analysts consider when developing data visualisations? Based on the findings, to answer RQ3, the trust theme identifies qualitative factors, relating to the presenter. The value, consumption and confidence themes all point to the relevance of having an open and collaborative organisational culture that enables the effective use of data visualisation. 
Collaboration brings individuals together and the power of knowledgeable individuals can enhance the final decision. In terms of the presenter, his/her organisational ranking, handling of complexity and multiple audience requirements, use of data in the data visualisation, ability to answer questions, his/her confidence and maturity, professionalism, delivery of the message when presenting, knowledge of the subject presented, understanding of the executive's objectives and data visualisation methodology, creation of a "WOW" factor and understanding the data journey are all important considerations

    Conceptual framework of a novel hybrid methodology between computational fluid dynamics and data mining techniques for medical dataset application

    This thesis proposes a novel hybrid methodology that couples computational fluid dynamics (CFD) and data mining (DM) techniques and applies it to a multi-dimensional medical dataset in order to study potential disease development statistically. The approach offers an alternative to the tedious and rigorous CFD methodology currently adopted to study the influence of geometric parameters on hemodynamics in the human abdominal aortic aneurysm. It is seen as a “marriage” between the medical and computing domains.

    L'AIS : une donnée pour l'analyse des activités en mer [AIS: a data source for analysing activities at sea]

    4 pages, session "Mer et littoral". International audience. This contribution presents methodological elements for describing human activities at sea with a view to supporting management. Various procedures, combining the use of spatio-temporal databases derived from archived AIS data with spatial analyses within a GIS, are tested in order to characterise maritime transport in the Iroise Sea (Brittany, France) in spatial, temporal and quantitative terms over the course of one year.
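    One way to characterise maritime traffic from archived AIS data in spatial, temporal and quantitative terms is to count distinct vessels per grid cell and per month. The sketch below is only an illustration of that idea, not the procedure tested in the contribution: the record fields (`mmsi`, `lat`, `lon`, `time`), the 0.1-degree cell size, and the function names are assumptions.

```python
import math
from collections import defaultdict
from datetime import datetime


def cell_of(lat, lon, size_deg=0.1):
    """Index of the regular grid cell containing a position (floor, so negatives work)."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))


def vessels_per_cell_month(reports, size_deg=0.1):
    """Count distinct vessels (by MMSI) seen in each (grid cell, month) pair."""
    seen = defaultdict(set)
    for r in reports:
        month = r["time"].strftime("%Y-%m")
        seen[(cell_of(r["lat"], r["lon"], size_deg), month)].add(r["mmsi"])
    return {key: len(ids) for key, ids in seen.items()}


# Hypothetical AIS position reports (illustrative values only).
REPORTS = [
    {"mmsi": 1, "lat": 48.35, "lon": -5.05, "time": datetime(2012, 1, 3)},
    {"mmsi": 1, "lat": 48.36, "lon": -5.02, "time": datetime(2012, 1, 4)},
    {"mmsi": 2, "lat": 48.37, "lon": -5.08, "time": datetime(2012, 1, 9)},
    {"mmsi": 2, "lat": 48.37, "lon": -5.08, "time": datetime(2012, 2, 1)},
]
```

    Counting distinct MMSI values rather than raw position reports avoids over-weighting vessels that transmit more frequently; in a GIS workflow the resulting counts would typically be joined back to the grid geometry for mapping.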