19 research outputs found

XCluSim: a visual analytics tool for interactively comparing multiple clustering results of bioinformatics data

Author: Cho Young-Joon
Kim Bohyoung
Ko Bongkyung
L'Yi Sehi
Lee Jaeyong
Seo Jinwook
Shin DongHwa
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/02/2017

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: The primary goal of pathway analysis using transcriptome data is to find significantly perturbed pathways. However, pathway analysis is not always successful in identifying pathways that are truly relevant to the context under study. A major reason for this difficulty is that a single gene can be involved in multiple pathways. In the KEGG pathway database, there are 146 genes, each of which is involved in more than 20 pathways. Thus, activation of even a single gene will result in activation of many pathways. This complex relationship often makes pathway analysis very difficult. While much more powerful pathway analysis methods are needed, a readily available alternative is to incorporate literature information.
Results: In this study, we propose a novel approach for prioritizing pathways by combining results from both pathway analysis tools and literature information. The basic idea is as follows. Whenever there are enough articles that provide evidence on which pathways are relevant to the context, we can be assured that the pathways are indeed related to the context, which we term relevance in this paper. However, if few or no articles have been reported, then we should rely on the results from the pathway analysis tools, which we term significance in this paper. We realized this concept as an algorithm by introducing a Context Score and an Impact Score and then combining the two into a single score. Our method ranked truly relevant pathways significantly higher than existing pathway analysis tools in experiments with two data sets.
Conclusions: Our novel framework was implemented as ContextTRAP by utilizing two existing tools, TRAP and BEST. ContextTRAP will be a useful tool for the pathway-based analysis of gene expression data since the user can specify the context of the biological experiment in a set of keywords. The web version of ContextTRAP is available at http://biohealth.snu.ac.kr/software/contextTRA

SNU Open Repository and Archive

test project

Author: Sehi L'Yi
Publication venue: 'Center for Open Science'
Publication date: 01/04/2024

OSF Preprints

Understanding Visualization Authoring Techniques for Genomics Data in the Context of Personas and Tasks

Author: Nils Gehlenborg
Sehi L'Yi
Publication venue: OSF
Publication date: 01/04/2024

OSF Preprints

Gos: a declarative library for interactive genomics visualization in Python

Author: Nils Gehlenborg
Sehi L'Yi
Trevor Manz
Publication venue: Open Science Framework
Publication date: 27/06/2023

Gos is a declarative library for Python designed to create interactive multi-scale visualizations of genomics and epigenomics data. It provides a consistent and simple API to author custom visualizations based on the Gosling visualization grammar. Gos abstracts away technical complexities involved with configuring web-based genome browsers and integrates seamlessly within computational notebook environments to enable new interactive analysis workflows. Gos is released under the MIT License and available on the Python Package Index (PyPI). The source code is publicly available on GitHub (https://github.com/gosling-lang/gos), and documentation with examples can be found at https://gosling-lang.github.io/gos

OSF Preprints

The Role of Visualization in Genomics Data Analysis Workflows: The Interview

Author: Nils Gehlenborg
Qianwen Wang
Sehi L'Yi
Publication venue: Open Science Framework
Publication date: 27/06/2023

The diversity of genome-mapped data and analysis tasks makes it challenging for a single visualization tool to fulfill all visualization needs. To design a visualization tool that supports various genomics workflows of users, it is critical to first gain insights into the diverse workflows and the limitations of existing genomics tools for supporting them. In this paper, we conducted semi-structured interviews (N=9) to understand the role of visualization in genomics data analysis workflows. Our main goals were to identify various genomics workflows, from data analysis to visual exploration and presentation, and to observe challenges that genomics analysts encounter in these workflows when using existing tools. Through the interviews, we found several unique characteristics of genomics workflows, such as the use of multiple visualization tools and many repetitive tasks, which can significantly affect the overall performance. Based on our findings, we discuss implications for designing effective visualization authoring tools that tightly support genomics workflows, such as supporting automation and reproducibility.

OSF Preprints

Digital Accessibility of Life Sciences Data Portals and Journal Websites

Author: Alexander Lex
Nils Gehlenborg
Sehi L'Yi
Thomas C. Smits
Publication venue: Open Science Framework
Publication date: 03/11/2023

Enhancing the diversity and inclusion of the life sciences workforce has become an important problem, as highlighted by many organizations in the US, including NIH, NHGRI, and NSF. People with visual impairments are one of the groups that face barriers to entering the biology workforce. To overcome this challenge, it is important to understand their current barriers in biological research and education. The most common assistive technology used by people with visual impairments is the screen reader (45.2%). However, multiple studies found that many websites largely fail to meet accessibility guidelines, making it challenging or even impossible for screen reader users to access existing resources. To help gain better insights into how well people with visual impairments can access existing biological resources, we evaluated the digital accessibility of two essential resources for data-driven studies: data portals and journal websites. Using an automated evaluation tool, we collected accessibility evaluation data for a large corpus of resources (N=3,943). In addition, we collected metadata of individual resources (e.g., geospatial, temporal, and impact score data) for a more insightful analysis. All datasets, as well as the entire source code, are available online on Zenodo and GitHub under a CC-BY and MIT license, respectively.

OSF Preprints

GenoREC: A Recommendation System for Interactive Genomics Data Visualization

Author: Aditeya Pandey
Michelle Borkin
Nils Gehlenborg
Qianwen Wang
Sehi L'Yi
Publication venue: Open Science Framework
Publication date: 27/06/2023

Interpretation of genomics data is critically reliant on the application of a wide range of visualization tools. The large number of visualization techniques for genomics data and the variety of analysis tasks pose a significant challenge for genomics analysts: which visualization technique is most likely to help them generate insights into their data? Since genomics analysts typically have limited training in data visualization, their choices are often based on trial and error or guided by technical details, such as data formats that a specific tool can load. This approach prevents them from making effective visualization choices for the many combinations of data types and analysis questions that they encounter in their work. Visualization recommendation systems assist non-experts in creating data visualizations by recommending appropriate visualizations based on the data and task characteristics. However, existing visualization recommendation systems are not designed to handle domain-specific problems. To address these challenges, we designed GenoREC, a novel visualization recommendation system for genomics. GenoREC enables genomics analysts to select effective visualizations based on a description of their data and analysis tasks. Here we present the recommendation model that uses a knowledge-based method for choosing appropriate visualizations and a web application that enables analysts to input their requirements, explore recommended visualizations, and export them for their usage. Furthermore, we present the results of two user studies demonstrating that GenoREC recommends visualizations that are both accepted by domain experts and suited to address the given genomics analysis problem. All supplemental material is available on OSF: https://osf.io/y73pt

OSF Preprints

Enabling Multimodal User Interactions for Genomics Visualization Creation

Author: Man Qing Liang
Nils Gehlenborg
Qianwen Wang
Sehi L'Yi
Xiao Liu
Publication venue: Open Science Framework
Publication date: 27/06/2023

Visualization plays an important role in extracting insights from complex and large-scale genomics data. Traditional graphical user interfaces (GUIs) offer limited flexibility for custom visualizations. Our prior work, Gosling, enables expressive visualization creation using a grammar-based approach, but beginners may face challenges in constructing complex visualizations. To address this, we explore multimodal interactions, including sketches, example images, and natural language inputs, to streamline visualization creation. Specifically, we customize two deep learning models (YOLO v7 and GPT-3.5) to interpret user interactions and convert them into Gosling specifications. A workflow is proposed to progressively introduce and integrate multimodal interactions. We then present use cases demonstrating their effectiveness and identify challenges and opportunities for future research.

OSF Preprints

Understanding Visualization Authoring Techniques for Genomics Data in the Context of Personas and Tasks

Author: Anna Vilanova
Astrid van den Brandt
Huyen N. Nguyen
Nils Gehlenborg
Sehi L'Yi
Publication venue: Open Science Framework
Publication date: 01/04/2024

Genomics experts rely on visualization to extract and share insights from complex and large-scale datasets. Beyond off-the-shelf tools for data exploration, there is an increasing need for platforms that aid experts in authoring customized visualizations for both exploration and communication of insights. A variety of interactive techniques have been proposed for authoring data visualizations, such as template editing, shelf configuration, natural language input, and code editors. However, it remains unclear how genomics experts create visualizations and which techniques best support their visualization tasks and needs. To address this gap, we conducted two user studies with genomics researchers: (1) semi-structured interviews (n=20) to identify the tasks, user contexts, and current visualization authoring techniques and (2) an exploratory study (n=13) using visual probes to elicit users’ intents and desired techniques when creating visualizations. Our contributions include (1) a characterization of how visualization authoring is currently utilized in genomics visualization, identifying limitations and benefits in light of common criteria for authoring tools, and (2) generalizable design implications for genomics visualization authoring tools based on our findings on task- and user-specific usefulness of authoring techniques. All supplemental materials are available at https://osf.io/bdj4v

OSF Preprints