1,091 research outputs found

    eXamine: a Cytoscape app for exploring annotated modules in networks

    Get PDF
    Background. Biological networks have growing importance for the interpretation of high-throughput "omics" data. Statistical and combinatorial methods allow to obtain mechanistic insights through the extraction of smaller subnetwork modules. Further enrichment analyses provide set-based annotations of these modules. Results. We present eXamine, a set-oriented visual analysis approach for annotated modules that displays set membership as contours on top of a node-link layout. Our approach extends upon Self Organizing Maps to simultaneously lay out nodes, links, and set contours. Conclusions. We implemented eXamine as a freely available Cytoscape app. Using eXamine we study a module that is activated by the virally-encoded G-protein coupled receptor US28 and formulate a novel hypothesis about its functioning

    Practical data mining in a large utility company

    Get PDF
    We present in this paper the main applications of data mining techniques at Electricité de France, the French national electric power company. This includes electric load curve analysis and prediction of customer characteristics. Closely related with data mining techniques are data warehouse management problems: we show that statistical methods can be used to help to manage data consistency and to provide accurate reports even when missing data are present

    Practical data mining in a large utility company

    Get PDF
    We present in this paper the main applications of data mining techniques at Electricité de France, the French national electric power company. This includes electric load curve analysis and prediction of customer characteristics. Closely related with data mining techniques are data warehouse management problems: we show that statistical methods can be used to help to manage data consistency and to provide accurate reports even when missing data are present

    Medical imaging analysis with artificial neural networks

    Get PDF
    Given that neural networks have been widely reported in the research community of medical imaging, we provide a focused literature survey on recent neural network developments in computer-aided diagnosis, medical image segmentation and edge detection towards visual content analysis, and medical image registration for its pre-processing and post-processing, with the aims of increasing awareness of how neural networks can be applied to these areas and to provide a foundation for further research and practical development. Representative techniques and algorithms are explained in detail to provide inspiring examples illustrating: (i) how a known neural network with fixed structure and training procedure could be applied to resolve a medical imaging problem; (ii) how medical images could be analysed, processed, and characterised by neural networks; and (iii) how neural networks could be expanded further to resolve problems relevant to medical imaging. In the concluding section, a highlight of comparisons among many neural network applications is included to provide a global view on computational intelligence with neural networks in medical imaging

    Mapping the state of financial stability

    Get PDF
    The paper uses the Self-Organizing Map for mapping the state of financial stability and visualizing the sources of systemic risks as well as for predicting systemic financial crises. The Self-Organizing Financial Stability Map (SOFSM) enables a two-dimensional representation of a multidimensional financial stability space that allows disentangling the individual sources impacting on systemic risks. The SOFSM can be used to monitor macro-financial vulnerabilities by locating a country in the financial stability cycle: being it either in the pre-crisis, crisis, post-crisis or tranquil state. In addition, the SOFSM performs better than or equally well as a logit model in classifying in-sample data and predicting out-of-sample the global financial crisis that started in 2007. Model robustness is tested by varying the thresholds of the models, the policymaker’s preferences, and the forecasting horizons. JEL Classification: E44, E58, F01, F37, G01macroprudential supervision, prediction, Self-Organizing Map (SOM), Systemic financial crisis, systemic risk, visualization

    Exploratory Analysis of Functional Data via Clustering and Optimal Segmentation

    Full text link
    We propose in this paper an exploratory analysis algorithm for functional data. The method partitions a set of functions into KK clusters and represents each cluster by a simple prototype (e.g., piecewise constant). The total number of segments in the prototypes, PP, is chosen by the user and optimally distributed among the clusters via two dynamic programming algorithms. The practical relevance of the method is shown on two real world datasets

    Integrated characterisation of mud-rich overburden sediment sequences using limited log and seismic data: Application to seal risk

    Get PDF
    Muds and mudstones are the most abundant sediments in sedimentary basins and can control fluid migration and pressure. In petroleum systems, they can also act as source, reservoir or seal rocks. More recently, the sealing properties of mudstones have been used for nuclear waste storage and geological CO2 sequestration. Despite the growing importance of mudstones, their geological modelling is poorly understood and clear quantitative studies are needed to address 3D lithology and flow properties distribution within these sediments. The key issues in this respect are the high degree of heterogeneity in mudstones and the alteration of lithology and flow properties with time and depth. In addition, there are often very limited field data (log and seismic), with lower quality within these sediments, which makes the common geostatistical modelling practices ineffective. In this study we assess/capture quantitatively the flow-important characteristics of heterogeneous mud-rich sequences based on limited conventional log and post-stack seismic data in a deep offshore West African case study. Additionally, we develop a practical technique of log-seismic integration at the cross-well scale to translate 3D seismic attributes into lithology probabilities. The final products are probabilistic multiattribute transforms at different resolutions which allow prediction of lithologies away from wells while keeping the important sub-seismic stratigraphic and structural flow features. As a key result, we introduced a seismically-driven risk attribute (so-called Seal Risk Factor "SRF") which showed robust correspondence to the lithologies within the seismic volume. High seismic SRFs were often a good approximation for volumes containing a higher percentage of coarser-grained and distorted sediments, and vice versa. We believe that this is the first attempt at quantitative, integrated characterisation of mud-rich overburden sediment sequences using log and seismic data. Its application on modern seismic surveys can save days of processing/mapping time and can reduce exploration risk by basing decisions on seal texture and lithology probabilities

    Data exploration process based on the self-organizing map

    Get PDF
    With the advances in computer technology, the amount of data that is obtained from various sources and stored in electronic media is growing at exponential rates. Data mining is a research area which answers to the challange of analysing this data in order to find useful information contained therein. The Self-Organizing Map (SOM) is one of the methods used in data mining. It quantizes the training data into a representative set of prototype vectors and maps them on a low-dimensional grid. The SOM is a prominent tool in the initial exploratory phase in data mining. The thesis consists of an introduction and ten publications. In the publications, the validity of SOM-based data exploration methods has been investigated and various enhancements to them have been proposed. In the introduction, these methods are presented as parts of the data mining process, and they are compared with other data exploration methods with similar aims. The work makes two primary contributions. Firstly, it has been shown that the SOM provides a versatile platform on top of which various data exploration methods can be efficiently constructed. New methods and measures for visualization of data, clustering, cluster characterization, and quantization have been proposed. The SOM algorithm and the proposed methods and measures have been implemented as a set of Matlab routines in the SOM Toolbox software library. Secondly, a framework for SOM-based data exploration of table-format data - both single tables and hierarchically organized tables - has been constructed. The framework divides exploratory data analysis into several sub-tasks, most notably the analysis of samples and the analysis of variables. The analysis methods are applied autonomously and their results are provided in a report describing the most important properties of the data manifold. In such a framework, the attention of the data miner can be directed more towards the actual data exploration task, rather than on the application of the analysis methods. Because of the highly iterative nature of the data exploration, the automation of routine analysis tasks can reduce the time needed by the data exploration process considerably.reviewe

    Visualization Techniques For Malware Behavior Analysis

    Get PDF
    Malware spread via Internet is a great security threat, so studying their behavior is important to identify and classify them. Using SSDT hooking we can obtain malware behavior by running it in a controlled environment and capturing interactions with the target operating system regarding file, process, registry, network and mutex activities. This generates a chain of events that can be used to compare them with other known malware. In this paper we present a simple approach to convert malware behavior into activity graphs and show some visualization techniques that can be used to analyze malware behavior, individually or grouped. © 2011 SPIE.8019The Society of Photo-Optical Instrumentation Engineers (SPIE)Tufte, E.R., (2001) The Visual Display of Quantitative Information, , Graphic PressKeim, D., Visual data mining. Tutorial (1997) Proc. 23rd International Conference on Very Large Data BasesCleveland, W.S., (1993) Visualizing Data, , Hobart PressGrégio, A.R.A., Aplicação de técnicas de data mining para a análise de logs de tráfego tcp/ip (2007) Applied Computing at INPE - Brazilian Institute for Space Research, , Masters dissertationInselberg, A., The plane with parallel coordinates (1985) The Visual Computer, 1 (2), pp. 69-91Inselberg, A., (2009) Parallel Coordinates - Visual Multidimensional Geometry and its Applications, , SpringerKohonen, T., (1997) Self-Organizing Maps, , SpringerBeddow, J., Shape coding of multidimensional data on a mircocomputer display (1990) Proc. of the First IEEE Conference on Visualization, pp. 238-246Keim, D.A., Kriegel, H.-P., Using visualization to support data mining of large existing databases (1993) Proc. IEEE Visualization '93 WorkshopShneiderman, B., Tree visualization with tree-maps: A 2-D space-filling approach (1991) ACM Transactions on Graphics, 11, pp. 92-99www.shadowserver.orgwww.cert.brwww.cert.br/docs/whitepapers/spambotsCalais, P.H., Pires, D.E.V., Guedes, D.O., Meira Jr., W., Hoepers, C., Steding-Jessen, K., A campaign-based characterization of spamming strategies (2008) Proc. of Fifth Conference on E-mail and Anti-Spa

    Methods of visualization and analysis of cardiac depolarization in the three dimensional space

    Get PDF
    The master thesis presents methods for intellectual analysis and visualization 3D EKG in order to increase the efficiency of ECG analysis by extracting additional data. Visualization is presented as part of the signal analysis tasks considered imaging techniques and their mathematical description. Have been developed algorithms for calculating and visualizing the signal attributes are described using mathematical methods and tools for mining signal. The model of patterns searching for comparison purposes of accuracy of methods was constructed, problems of a clustering and classification of data are solved, the program of visualization of data is also developed. This approach gives the largest accuracy in a task of the intellectual analysis that is confirmed in this work. Considered visualization and analysis techniques are also applicable to the multi-dimensional signals of a different kind
    • …
    corecore