Performance Evolution Blueprint: Understanding the Impact of Software Evolution on Performance
Understanding the root of a performance drop or improvement requires analyzing different program executions at a fine-grained level. Such an analysis involves dedicated profiling and representation techniques. JProfiler and YourKit, two recognized code profilers, fail to provide both adequate metrics and visual representations, conveying a false sense of the root of the performance variation. We propose the performance evolution blueprint, a visual support for precisely comparing multiple software executions. Our blueprint is offered by Rizel, a code profiler for efficiently exploring the performance of a set of benchmarks across multiple software revisions.
Syntactic and Semantic Analysis and Visualization of Unstructured English Texts
People have complex thoughts, and they often express their thoughts with complex sentences in natural languages. This complexity may facilitate efficient communication among an audience with the same knowledge base, but for a different or new audience such a composition becomes cumbersome to understand and analyze. Analyzing such compositions using syntactic or semantic measures is a challenging job and forms the base step for natural language processing.
In this dissertation I explore and propose a number of new techniques to analyze and visualize the syntactic and semantic patterns of unstructured English texts.
The syntactic analysis is done through a proposed visualization technique that categorizes and compares different English compositions based on their reading complexity metrics. For the semantic analysis I use Latent Semantic Analysis (LSA) to uncover the hidden patterns in complex compositions. I have used this technique to analyze comments from a social visualization web site in order to detect irrelevant ones (e.g., spam). The patterns of collaboration are also studied through statistical analysis.
Word sense disambiguation is used to determine the correct sense of a word in a sentence or composition. Using a textual similarity measure, based on different word similarity measures and word sense disambiguation, on collaborative text snippets from a social collaborative environment reveals a direction for untying the knots of complex hidden patterns of collaboration.
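As a rough illustration of how LSA can surface off-topic comments, the sketch below builds a toy term-document matrix, truncates its SVD to a few latent dimensions, and compares comments by cosine similarity in that space; a spam-like comment scores far lower against an on-topic one. The data and names here are illustrative assumptions, not the dissertation's actual pipeline.

```python
import numpy as np

# Toy comments: two on-topic, one spam-like. All data is illustrative.
docs = [
    "chart shows population growth over time",
    "growth trend visible in the population chart",
    "buy cheap watches online now",
]

# Term-document matrix: one row per vocabulary word, one column per comment.
vocab = sorted({w for d in docs for w in d.split()})
A = np.array([[d.split().count(w) for d in docs] for w in vocab], float)

# LSA core: truncated SVD keeps only k latent "topic" dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T   # each comment as a k-dim vector

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Similarity of every comment to the first (on-topic) one; the spam
# comment shares no latent structure and scores near zero.
sims = [cos(doc_vecs[0], v) for v in doc_vecs]
```

In a real pipeline the matrix would be tf-idf weighted and far larger, but the detection idea is the same: comments whose low-rank representation is dissimilar to the on-topic cluster are candidates for removal.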
Augmenting IDEs with Runtime Information for Software Maintenance
Object-oriented language features such as inheritance, abstract types, late binding, or polymorphism lead to distributed and scattered code, rendering a software system hard to understand and maintain. The integrated development environment (IDE), the primary tool used by developers to maintain software systems, usually operates purely on static source code and does not reveal dynamic relationships between distributed source artifacts, which makes it difficult for developers to understand and navigate software systems. Another shortcoming of today's IDEs is the large amount of information with which they typically overwhelm developers. Large software systems encompass several thousand source artifacts such as classes and methods. These static artifacts are presented by IDEs in views such as trees or source editors. To gain an understanding of a system, developers have to open many such views, which leads to a workspace cluttered with different windows or tabs. Navigating through the code or maintaining a working context is thus difficult for developers working on large software systems. In this dissertation we address the question of how to augment IDEs with dynamic information to better navigate scattered code while at the same time not overwhelming developers with even more information in the IDE views. We claim that by first reducing the amount of information developers have to deal with, we are subsequently able to embed dynamic information in the familiar source perspectives of IDEs to better comprehend and navigate large software spaces. We propose means to reduce or mitigate this information overload by highlighting relevant source elements, by explicitly representing working context, and by automatically housekeeping the workspace in the IDE. We then improve navigation of scattered code by explicitly representing dynamic collaboration and software features in the static source perspectives of IDEs.
We validate our claim by conducting empirical experiments with developers and by analyzing recorded development sessions.
Scalable Profiling and Visualization for Characterizing Microbiomes
Metagenomics is the study of the combined genetic material found in microbiome samples, and it serves as an instrument for studying microbial communities, their biodiversities, and the relationships to their host environments. Creating, interpreting, and understanding microbial community profiles produced from microbiome samples is a challenging task as it requires large computational resources along with innovative techniques to process and analyze datasets that can contain terabytes of information.
The community profiles are critical because they provide information about what microorganisms are present in the sample, and in what proportions. This is particularly important as many human diseases and environmental disasters are linked to changes in microbiome compositions.
In this work we propose novel approaches for the creation and interpretation of microbial community profiles. This includes: (a) a cloud-based, distributed computational system that generates detailed community profiles by processing large DNA sequencing datasets against large reference genome collections, (b) the creation of Microbiome Maps: interpretable, high-resolution visualizations of community profiles, and (c) a machine learning framework for characterizing microbiomes from the Microbiome Maps that delivers deep insights into microbial communities.
The proposed approaches have been implemented in three software solutions: Flint, a large scale profiling framework for commercial cloud systems that can process millions of DNA sequencing fragments and produces microbial community profiles at a very low cost; Jasper, a novel method for creating Microbiome Maps, which visualizes the abundance profiles based on the Hilbert curve; and Amber, a machine learning framework for characterizing microbiomes using the Microbiome Maps generated by Jasper with high accuracy.
Results show that Flint scales well for reference genome collections an order of magnitude larger than those used by competing tools, while taking less than a minute to profile a million reads on the cloud with 65 commodity processors. Microbiome Maps produced by Jasper are compact, scalable representations of extremely complex microbial community profiles, with numerous demonstrable advantages, including the ability to display latent relationships that are hard to elicit. Finally, experiments show that by using images as input instead of unstructured tabular data, the carefully engineered software Amber can outperform other sophisticated machine learning tools available for the classification of microbiomes.
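The idea of laying an abundance profile along a Hilbert curve, as Jasper does, can be sketched with the standard index-to-coordinate conversion: because the curve preserves locality, entries adjacent in the profile land in adjacent image cells. This is a generic sketch under assumed names (`d2xy`, a random toy profile), not Jasper's actual layout or coloring.

```python
import numpy as np

def d2xy(order, d):
    """Map index d along a Hilbert curve covering a 2**order x 2**order
    grid to (x, y) cell coordinates (standard iterative conversion)."""
    x = y = 0
    s = 1
    while s < (1 << order):
        rx = 1 & (d // 2)
        ry = 1 & (d ^ rx)
        if ry == 0:                      # rotate/flip the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        d //= 4
        s *= 2
    return x, y

order = 3                                # 8x8 grid = 64 cells
abundances = np.random.rand(1 << (2 * order))   # toy abundance profile
grid = np.zeros((1 << order, 1 << order))
for d, a in enumerate(abundances):
    x, y = d2xy(order, d)
    grid[y, x] = a    # neighbors along the curve stay neighbors in the image
```

Rendering `grid` as a heatmap yields a Hilbert-ordered image; a real Microbiome Map would additionally fix a taxonomic ordering of the profile so that related organisms cluster into coherent image regions.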
Visualisation of Large-Scale Call-Centre Data
The contact centre industry employs 4% of the entire working population of the United Kingdom and the United States and generates gigabytes of operational data that require analysis to provide insight and to improve efficiency. This thesis is the result of a collaboration with QPC Limited, who provide data collection and analysis products for call centres. They provided a large data set featuring almost 5 million calls to be analysed. This thesis utilises novel visualisation techniques to create tools for the exploration of the large, complex call centre data set and to facilitate unique observations into the data.
A survey of information visualisation books is presented, providing a thorough background of the field. Following this, a feature-rich application that visualises large call centre data sets using scatterplots that support millions of points is presented. The application utilises both CPU and GPU acceleration for processing and filtering and is exhibited with millions of call events.
This is expanded upon with the use of glyphs to depict agent behaviour in a call centre. A technique is developed to cluster overlapping glyphs into a single parent glyph dependent on zoom level and a customizable distance metric. This hierarchical glyph represents the mean value of all child agent glyphs, removing overlap and reducing visual clutter. A novel technique for visualising individually tailored glyphs using a Graphics Processing Unit is also presented, and demonstrated rendering over 100,000 glyphs at interactive frame rates. An open-source code example is provided for reproducibility.
Finally, a novel interaction and layout method is introduced for improving the scalability of chord diagrams to visualise call transfers. An exploration of sketch-based methods for showing multiple links and direction is made, and a sketch-based brushing technique for filtering is proposed.
Feedback from domain experts in the call centre industry is reported for all applications developed.
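The zoom-dependent merging of overlapping glyphs described above can be sketched as a greedy clustering pass: a glyph within a threshold distance of an existing cluster's centroid is absorbed, and each parent glyph carries the mean position and value of its children. This is a minimal illustration under assumed names (`cluster_glyphs`, Euclidean distance), not the thesis's actual hierarchical algorithm.

```python
import math

def cluster_glyphs(glyphs, threshold):
    """glyphs: list of (x, y, value) tuples. Greedily merge glyphs that
    fall within `threshold` of a cluster centroid; return parent glyphs
    whose position and value are the mean of their children."""
    clusters = []                          # each cluster: list of members
    for g in glyphs:
        for c in clusters:
            cx = sum(m[0] for m in c) / len(c)
            cy = sum(m[1] for m in c) / len(c)
            if math.hypot(g[0] - cx, g[1] - cy) < threshold:
                c.append(g)                # close enough: absorb into parent
                break
        else:
            clusters.append([g])           # otherwise start a new parent
    return [(sum(m[0] for m in c) / len(c),
             sum(m[1] for m in c) / len(c),
             sum(m[2] for m in c) / len(c)) for c in clusters]

# Zooming out would raise the threshold, collapsing more agent glyphs:
parents = cluster_glyphs([(0, 0, 1.0), (1, 0, 3.0), (10, 10, 5.0)], 2.0)
```

Here the two nearby glyphs collapse into one parent at their mean, while the distant glyph survives on its own; tying the threshold to the zoom level gives the level-of-detail behaviour the thesis describes.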
How Visualization Supports the Daily Work in Traditional Humanities on the Example of Visual Analysis Case Studies
Attempts to convince humanities scholars of digital approaches are often met with
resistance. The so-called Digitization Anxiety is the phenomenon that
describes the fear of many traditional scientists of being replaced by digital
processes. This hinders not only the progress of the scientific domains themselves
– since a lot of digital potential is missing – but also makes the everyday work
of researchers unnecessarily difficult. Over the past eight years, we have
made various attempts to walk the tightrope between 'How can we help
traditional humanities to exploit their digital potential?' and 'How can we
make them understand that their expertise is not replaced by digital means, but
complemented?' We will present our successful interdisciplinary collaborations:
How they came about, how they developed, and the problems we encountered. In
the first step, we will look at the theoretical basics, which paint a comprehensive
picture of the digital humanities and introduce us to the topic of visualization.
The field of visualization has shown a special ability: It manages to walk the
tightrope and thus keeps digitization anxiety at bay, while not only making it
easier for scholars to access their data, but also enabling entirely new research
questions. After an introduction to our interdisciplinary collaborations with
the Musical Instrument Museum of Leipzig University, as well as with the
Bergen-Belsen Memorial, we will present a series of user scenarios that we
have collected in the course of 13 publications. These show our cooperation
partners solving different research tasks, which we classify using Brehmer and
Munzner’s Task Classification. In this way, we show that we provide researchers
with a wide range of opportunities: They can answer their traditional research
questions – and in some cases verify long-standing hypotheses about the data
for the first time – but also develop their own interest in previously impossible,
new research questions and approaches. Finally, we conclude our insights on
individual collaborative ideas with perspectives on our newest projects. These
have arisen from the growing interest of collaborators in the methods we deliver.
For example, we get insights into the music of real virtuosos of the 20th century.
The necessary music storage media can be heard for the first time through
digital tools without risking damage to the old material. In addition, we can
provide computer-aided analysis capabilities that help musicologists in their work.
In the course of the visualization project at the Bergen-Belsen memorial, we
will see that what was once a small diary project has grown into a multimodal
and international project with institutions of culture and science from eight
countries. This is dedicated not only to the question of preserving cultural
objects from Nazi persecution contexts but also to modern ways of disseminating
and processing knowledge around this context. Finally, we will compile our
experience and accumulated knowledge in the form of problems and challenges
at the border between computer science and traditional humanities. These will
serve as preparation and assistance for current and future parties interested in
such interdisciplinary collaborative projects.
U-TRACER® - the use of communication technologies in higher education - an information visualization tool for the Portuguese public higher education context
Doctorate in Multimedia in Education.
Information Visualization is gradually emerging to assist the representation and
comprehension of large datasets about Higher Education Institutions, making
the data more easily understood. The importance of gaining insights and
knowledge regarding higher education institutions is little disputed. Within this
knowledge, the emerging and urgent area in need of systematic understanding
is the use of communication technologies, an area that is having a
transformative impact on educational practices worldwide.
This study focused on the need to visually represent a dataset about how
Portuguese Public Higher Education Institutions are using Communication
Technologies as a support to teaching and learning processes. Project
TRACER identified this need, regarding the Portuguese public higher education
context, and carried out a national data collection. This study was developed
within project TRACER, and worked with the dataset collected in order to
conceptualize an information visualization tool, U-TRACER®.
The main goals of this study were: the conceptualization of the information
visualization tool U-TRACER®, to represent the data collected by project
TRACER; and understanding higher education decision makers' perception of
the tool's usefulness.
These goals allowed us to contextualize the phenomenon of information
visualization tools for higher education data and to identify the existing trends.
The research undertaken was of a qualitative nature and followed the case study
method, with four moments of data collection. The first moment regarded the conceptualization of the U-TRACER®, with two
focus group sessions with Higher Education professionals, with the aim of
defining the interaction features the U-TRACER® should offer. The second data
collection moment involved the proposal of the graphical displays that would
represent the dataset, whose reading effectiveness was tested by end-users.
The third moment involved a usability test of the U-TRACER®, performed by
higher education professionals, which resulted in the proposal of improvements
to the final prototype of the tool. The fourth
moment of data collection involved conducting exploratory, semi-structured
interviews with institutional decision makers regarding their perceived
usefulness of the U-TRACER®.
We consider that the results of this study contribute to two moments of
reflection. The first concerns the challenges of involving end-users in the
conceptualization of an information visualization tool, and the relevance of
effective visual displays for effective communication of data and information.
The second relates to how the higher education decision makers, stakeholders
of the U-TRACER® tool, perceive its usefulness, both for communicating their
institutions' data and for benchmarking exercises, as well as for supporting
decision processes. It also reflects on the main concerns about opening up data
about higher education institutions in a global market.
Visualization of biological data: Infrastructure, design and application
Visualization is an important component of biological data analysis. Ideally, visual methods are tightly integrated with analysis methods, so that it is seamless to plot data from different intermediate stages of the analysis. Bioconductor provides a substantial analysis platform but limited tools for genomic data visualization. Visual tools for genomic data (e.g., GenomeView, IGV, IGB) are primarily detached from the analysis engine. This research fills this gap by developing visualization methods that are integrated into the Bioconductor suite. There are three main components of the research:
* New visual tools for genomic data that utilize the latest research in visualization.
* Infrastructure development to support the visual tools, and analysis of other types of biological data.
* Application of the visualization methods to the analysis of RNA-seq and DNA-seq data.