Introducing legacy program scripting to molecular biology toolkit (MBT)
Successfully navigating the ever-changing landscape of molecular visualization programs requires a common thread that can offer users an oasis from the maelstrom of new languages, proprietary applications, and minuscule software life-cycles. Cross-application scripting remains the benchmark, allowing scientists and researchers to speak a common tongue. The introduction of a new, powerful visualization language, Molecular Biology Toolkit (MBT), has tempted many users to abandon previous methodologies and adopt a new mode of research. MBT, however, is not without drawbacks. Its lack of scripting capabilities creates unmanageable complexity for unsophisticated end-users, namely those without the ability to program. MBT thus lacks the basic handholds needed for widespread acceptance in the molecular visualization community. As a toolkit package without its own mode of execution, its design challenges users to develop their own customized features and applications. However, because it can serve as either a text-based virtual molecular collection or a fully rendered 3D molecular representation, MBT has the tools researchers want in a new visualization program. By using JavaCC to parse legacy commands and, in turn, execute MBT methods, all from a single, simple command, I have reintroduced scripting to the modern molecular visualization landscape. Combining these two programs, this project takes steps to encourage the exciting molecular manipulations possible in MBT while bridging to the friendly, user-centric scripting patterns required by end-users not entrenched in software development.
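The parse-then-dispatch pattern described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract names JavaCC, a Java parser generator, whereas this Python sketch substitutes plain token splitting. The command names (`load`, `color`) and the handler functions are hypothetical stand-ins for toolkit calls, not MBT methods.

```python
import shlex
from typing import Callable, Dict, List

Handler = Callable[[List[str]], str]


class ScriptDispatcher:
    """Maps one-line, legacy-style text commands onto registered handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        # Command names are matched case-insensitively.
        self._handlers[name.lower()] = handler

    def execute(self, line: str) -> str:
        # Split the line into a command name and its arguments.
        tokens = shlex.split(line)
        if not tokens:
            raise ValueError("empty command")
        name, args = tokens[0].lower(), tokens[1:]
        if name not in self._handlers:
            raise ValueError(f"unknown command: {name}")
        return self._handlers[name](args)


# Hypothetical handlers; a real bridge would call toolkit methods here.
def load_structure(args: List[str]) -> str:
    if len(args) != 1:
        raise ValueError("usage: load <file>")
    return f"loaded {args[0]}"


def set_color(args: List[str]) -> str:
    if len(args) != 1:
        raise ValueError("usage: color <name>")
    return f"color set to {args[0]}"


dispatcher = ScriptDispatcher()
dispatcher.register("load", load_structure)
dispatcher.register("color", set_color)
```

With this in place, a script line such as `dispatcher.execute("color red")` is routed to `set_color`, which is the sense in which a single text command triggers an underlying method call.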
VOICE: Visual Oracle for Interaction, Conversation, and Explanation
We present VOICE, a novel approach for connecting large language models'
(LLM) conversational capabilities with interactive exploratory visualization.
VOICE introduces several innovative technical contributions that drive our
conversational visualization framework. Our foundation is a pack-of-bots that
can perform specific tasks, such as assigning tasks, extracting instructions,
and generating coherent content. We employ fine-tuning and prompt engineering
techniques to tailor bots' performance to their specific roles and accurately
respond to user queries, and a new prompt-based iterative scene-tree generation
establishes a coupling with a structural model. Our text-to-visualization
method generates a flythrough sequence matching the content explanation.
Finally, 3D natural language interaction provides capabilities to navigate and
manipulate the 3D models in real time. The VOICE framework can receive
arbitrary voice commands from the user and respond verbally, with each response
tightly coupled to the corresponding visual representation at low latency and high accuracy. We
demonstrate the effectiveness and high generalizability potential of our
approach by applying it to two distinct domains: analyzing three 3D molecular
models with multi-scale and multi-instance attributes, and showcasing its
effectiveness on a cartographic map visualization. A free copy of this paper
and all supplemental materials are available at https://osf.io/g7fbr/
MBEToolbox: a Matlab toolbox for sequence data analysis in molecular biology and evolution
BACKGROUND: MATLAB is a high-performance language for technical computing, integrating computation, visualization, and programming in an easy-to-use environment. It has been widely used in many areas, such as mathematics and computation, algorithm development, data acquisition, modeling, simulation, and scientific and engineering graphics. However, few functions are freely available in MATLAB to perform the sequence data analyses specifically required for molecular biology and evolution. RESULTS: We have developed a MATLAB toolbox, called MBEToolbox, aimed at filling this gap by offering efficient implementations of the most needed functions in molecular biology and evolution. It can be used to manipulate aligned sequences, calculate evolutionary distances, estimate synonymous and nonsynonymous substitution rates, and infer phylogenetic trees. Moreover, it provides an extensible, functional framework for users with more specialized requirements to explore and analyze aligned nucleotide or protein sequences from an evolutionary perspective. The full set of functions in the toolbox is accessible through the command line for seasoned MATLAB users. A graphical user interface, which may be especially useful for non-specialist end users, is also provided. CONCLUSION: MBEToolbox is a useful tool that can aid in the exploration, interpretation and visualization of data in molecular biology and evolution. The software is publicly available at and
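As an illustration of what "calculating evolutionary distances" from aligned sequences involves, the following is a minimal sketch of the standard Jukes-Cantor distance, d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites. It is written in Python rather than MATLAB and is not taken from MBEToolbox; the abstract does not state which substitution models the toolbox implements, so the choice of Jukes-Cantor and the function names here are assumptions made for the example.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned nucleotide sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined once sequences are saturated.
        raise ValueError("sequences too divergent for Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The correction matters because the raw proportion of differences underestimates the true number of substitutions once the same site has changed more than once; the logarithm compensates for those hidden changes under the model's equal-rates assumption.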
NaviCell: a web-based environment for navigation, curation and maintenance of large molecular interaction maps
Molecular biology knowledge can be systematically represented in a
computer-readable form as a comprehensive map of molecular interactions. There
exist a number of maps of molecular interactions containing detailed
description of various cell mechanisms. It is difficult to explore these large
maps, to comment on their content, and to maintain them. Though there exist several
tools addressing these problems individually, the scientific community still
lacks an environment that combines these three capabilities together. NaviCell
is a web-based environment for exploiting large maps of molecular interactions,
created in CellDesigner, allowing their easy exploration, curation and
maintenance. NaviCell combines three features: (1) efficient map browsing based
on the Google Maps engine; (2) semantic zooming for viewing different levels of
detail or abstraction of the map; and (3) an integrated web-based blog for
collecting community feedback. NaviCell can be easily used by experts in
the field of molecular biology for studying molecular entities of their
interest in the context of signaling pathways and cross-talks between pathways
within a global signaling network. NaviCell allows both exploration of detailed
molecular mechanisms represented on the map and a more abstract view of the map
up to a top-level modular representation. NaviCell facilitates curation,
maintenance and updating of the comprehensive maps of molecular interactions in an
interactive fashion thanks to an embedded blogging system. NaviCell provides an
easy way to explore large-scale maps of molecular interactions, thanks to the
Google Maps and WordPress interfaces, already familiar to many users. Semantic
zooming used for navigating geographical maps is adopted for molecular maps in
NaviCell, making any level of visualization meaningful to the user. In
addition, NaviCell provides a framework for community-based map curation.
Comment: 20 pages, 5 figures, submitte
Dynamic Influence Networks for Rule-based Models
We introduce the Dynamic Influence Network (DIN), a novel visual analytics
technique for representing and analyzing rule-based models of protein-protein
interaction networks. Rule-based modeling has proved instrumental in developing
biological models that are concise, comprehensible, easily extensible, and that
mitigate the combinatorial complexity of multi-state and multi-component
biological molecules. Our technique visualizes the dynamics of these rules as
they evolve over time. Using the data produced by KaSim, an open source
stochastic simulator of rule-based models written in the Kappa language, DINs
provide a node-link diagram that represents the influence that each rule has on
the other rules. That is, rather than representing individual biological
components or types, we instead represent the rules about them (as nodes) and
the current influence of these rules (as links). Using our interactive DIN-Viz
software tool, researchers are able to query this dynamic network to find
meaningful patterns about biological processes, and to identify salient aspects
of complex rule-based models. To evaluate the effectiveness of our approach, we
investigate a simulation of a circadian clock model that illustrates the
oscillatory behavior of the KaiC protein phosphorylation cycle.
Comment: Accepted to TVCG, in pres
Materials Informatics Transformer: A Language Model for Interpretable Materials Properties Prediction
Recently, the remarkable capabilities of large language models (LLMs) have
been illustrated across a variety of research domains such as natural language
processing, computer vision, and molecular modeling. We extend this paradigm by
utilizing LLMs for material property prediction by introducing our model
Materials Informatics Transformer (MatInFormer). Specifically, we introduce a
novel approach that involves learning the grammar of crystallography through
the tokenization of pertinent space group information. We further illustrate
the adaptability of MatInFormer by incorporating task-specific data pertaining
to Metal-Organic Frameworks (MOFs). Through attention visualization, we uncover
the key features that the model prioritizes during property prediction. The
effectiveness of our proposed model is empirically validated across 14 distinct
datasets, thereby underscoring its potential for high-throughput screening
through accurate material property prediction.