Improvement of Information Transfer Rates Using a Hybrid EEG-NIRS Brain-Computer Interface with a Short Trial Length: Offline and Pseudo-Online Analyses
Electroencephalography (EEG) and near-infrared spectroscopy (NIRS) are non-invasive neuroimaging methods that record the electrical and metabolic activity of the brain, respectively. Hybrid EEG-NIRS brain-computer interfaces (hBCIs) that use complementary EEG and NIRS information to enhance BCI performance have recently emerged to overcome the limitations of existing unimodal BCIs, such as vulnerability to motion artifacts for EEG-BCI or low temporal resolution for NIRS-BCI. However, with respect to NIRS-BCI, a relatively long trial length (≥10 s) is needed to fully induce a task-related brain activation, owing to the inherent hemodynamic delay, which lowers the information transfer rate (ITR; bits/min). To alleviate this ITR degradation, we propose a more practical hBCI operated by intuitive mental tasks, such as mental arithmetic (MA) and word chain (WC) tasks, performed within a short trial length (5 s). In addition, the suitability of the WC task, which has so far rarely been used in the BCI field, was assessed. In this experiment, EEG and NIRS data were simultaneously recorded while participants performed MA and WC tasks without preliminary training and remained relaxed (baseline; BL). Each task was performed for 5 s, a shorter time than in previous hBCI studies. Subsequently, a classification was performed to discriminate MA-related or WC-related brain activations from BL-related activations. Using the hBCI in offline/pseudo-online analyses, average classification accuracies of 90.0 ± 7.1/85.5 ± 8.1% and 85.8 ± 8.6/79.5 ± 13.4% for MA vs. BL and WC vs. BL, respectively, were achieved. These were significantly higher than those of the unimodal EEG- or NIRS-BCI in most cases. Given the short trial length and improved classification accuracy, the average ITRs were improved by more than 96.6% for MA vs. BL and 87.1% for WC vs. BL compared to those reported in previous studies.
The suitability of implementing a more practical hBCI based on intuitive mental tasks, without preliminary training and with a shorter trial length, was validated in comparison with previous studies.
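The ITR figures above follow the standard Wolpaw definition, which combines the number of classes, the classification accuracy, and the trial length. A minimal sketch (the 90.0% accuracy and 5 s trial length come from the abstract; the binary MA vs. BL setup is used for illustration):

```python
import math

def wolpaw_itr(n_classes: int, accuracy: float, trial_s: float) -> float:
    """Information transfer rate in bits/min (Wolpaw definition)."""
    p, n = accuracy, n_classes
    bits = math.log2(n)
    if 0 < p < 1:
        bits += p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n - 1))
    return bits * (60.0 / trial_s)

# 2-class MA vs. BL discrimination at 90% accuracy with a 5 s trial
print(round(wolpaw_itr(2, 0.90, 5.0), 2))  # → 6.37 bits/min
```

Shortening the trial from 10 s to 5 s doubles the selections per minute, which is why the abstract's combination of short trials and high accuracy yields the large ITR gains reported.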
Simultaneous Acquisition of EEG and NIRS during Cognitive Tasks for an Open Access Dataset
We provide an open access multimodal brain-imaging dataset of simultaneous electroencephalography (EEG) and near-infrared spectroscopy (NIRS) recordings. Twenty-six healthy participants performed three cognitive tasks: 1) n-back (0-, 2- and 3-back), 2) discrimination/selection response (DSR) and 3) word generation (WG) tasks. The data provided include: 1) measured data, 2) demographic data, and 3) basic analysis results. For the n-back (dataset A) and DSR tasks (dataset B), event-related potential (ERP) analysis was performed, and spatiotemporal characteristics and classification results for “target” vs. “non-target” (dataset A) and symbol “O” vs. symbol “X” (dataset B) are provided. Time-frequency analysis was performed to show the EEG spectral power differentiating the task-relevant activations. Spatiotemporal characteristics of hemodynamic responses are also shown. For the WG task (dataset C), the EEG spectral power and spatiotemporal characteristics of hemodynamic responses are analyzed, and the potential merit of hybrid EEG-NIRS BCIs was validated with respect to classification accuracy. We expect that the dataset provided will facilitate performance evaluation and comparison of many neuroimaging analysis techniques.
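ERP analysis of the kind applied to datasets A and B amounts to cutting epochs around event markers and averaging per condition. A minimal NumPy sketch, assuming a `(channels, samples)` array and marker sample indices (the channel count, sampling rate, and event layout here are illustrative, not the dataset's actual format):

```python
import numpy as np

def epoch_average(eeg, onsets, fs, tmin=-0.2, tmax=0.8):
    """Cut epochs [tmin, tmax] s around each onset and average them.

    eeg: (n_channels, n_samples) continuous signal.
    onsets: event marker positions, in samples."""
    lo, hi = int(tmin * fs), int(tmax * fs)
    epochs = [eeg[:, s + lo : s + hi] for s in onsets
              if s + lo >= 0 and s + hi <= eeg.shape[1]]
    return np.mean(epochs, axis=0)  # (n_channels, n_times) ERP

# toy demo: 2 channels, 2 s of data at 100 Hz, events at samples 50 and 120
rng = np.random.default_rng(0)
eeg = rng.standard_normal((2, 200))
erp = epoch_average(eeg, [50, 120], fs=100)
print(erp.shape)  # (2, 100)
```

Averaging across many such epochs is what makes the “target” vs. “non-target” ERP components in dataset A visible above the background EEG.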
Aligning Large Language Models via Fine-grained Supervision
Pre-trained large-scale language models (LLMs) excel at producing coherent
articles, yet their outputs may be untruthful, toxic, or fail to align with
user expectations. Current approaches focus on using reinforcement learning
with human feedback (RLHF) to improve model alignment, which works by
transforming coarse human preferences of LLM outputs into a feedback signal
that guides the model learning process. However, because this approach operates
on sequence-level feedback, it lacks the precision to identify the exact parts
of the output affecting user preferences. To address this gap, we propose a
method to enhance LLM alignment through fine-grained token-level supervision.
Specifically, we ask annotators to minimally edit less preferred responses
within the standard reward modeling dataset to make them more favorable,
ensuring changes are made only where necessary while retaining most of the
original content. The refined dataset is used to train a token-level reward
model, which is then used for training our fine-grained Proximal Policy
Optimization (PPO) model. Our experimental results demonstrate that this
approach achieves an absolute improvement in LLM performance, in terms of win
rate against the reference model, compared with the traditional PPO model.
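The token-level supervision described above can be derived by diffing the less preferred response against its minimally edited version: tokens the annotator removed or replaced receive a negative label, kept tokens a neutral one. A sketch using Python's difflib (whitespace tokenization and the -1/0 labeling are simplifying assumptions, not the paper's exact scheme):

```python
import difflib

def token_edit_labels(original: str, edited: str):
    """Label each token of the original response: -1 if the annotator's
    minimal edit removed/replaced it, 0 if it was kept."""
    orig_toks, edit_toks = original.split(), edited.split()
    labels = [0] * len(orig_toks)
    sm = difflib.SequenceMatcher(a=orig_toks, b=edit_toks)
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op in ("replace", "delete"):
            for i in range(i1, i2):
                labels[i] = -1  # token edited away -> penalize it
    return list(zip(orig_toks, labels))

print(token_edit_labels("the answer is definitely wrong",
                        "the answer is correct"))
```

Because the edits are minimal, most tokens keep label 0, so the resulting reward model can localize exactly which spans drove the preference rather than spreading a single sequence-level score over the whole output.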
Lessons learned from the early performance evaluation of Intel Optane DC Persistent Memory in DBMS
Non-volatile memory (NVM) is an emerging technology that has the
persistence characteristics of large-capacity storage devices (e.g., HDDs and
SSDs) while providing the low access latency and byte-addressability of
traditional DRAM. This unique combination of features opens up several
new design considerations when building database management systems (DBMSs),
such as replacing DRAM (as the main working space memory) or block devices (as
the persistent storage), or complementing both at the same time for several
DBMS components (such as access methods, storage engine, buffer management,
logging/recovery, etc.).
However, interacting with NVM requires changes to application software to
best use the device (e.g. mmap and clflush of small cache lines instead of
write and fsync of large page buffers). Before introducing (potentially major)
code changes to the DBMS for NVM, developers need a clear understanding of NVM
performance in various conditions to help make better design choices.
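The mmap-style access pattern mentioned above can be shown in miniature: the application stores directly through a memory mapping and flushes the dirty range, instead of issuing write()/fsync() on page-sized buffers. A Python sketch under the assumption that `mmap.flush()` stands in for the cache-line `clflush` a C program would issue against real PMem:

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pmem.bin")

# Pre-size the backing file: mmap needs a non-empty region to map.
with open(path, "wb") as f:
    f.truncate(4096)

# NVM-style access: store through the mapping, then flush the dirty range.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 4096) as m:
        m[0:5] = b"hello"  # byte-addressable store, no write() syscall
        m.flush()          # persistence point (clflush + fence analogue)

with open(path, "rb") as f:
    print(f.read(5))  # b'hello'
```

The contrast with the block path is the unit of persistence: here a five-byte store can be made durable on its own, whereas the write()/fsync() path forces at least a page-sized transfer, which is exactly the kind of behavioral difference the paper's micro-benchmarks quantify.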
In this paper, we provide extensive performance evaluations conducted with a
recently released NVM device, Intel Optane DC Persistent Memory (PMem), under
different configurations with several micro-benchmark tools. Further, we
evaluate OLTP and OLAP database workloads (i.e., TPC-C and TPC-H) with
Microsoft SQL Server 2019 when using the NVM device as an in-memory buffer pool
or persistent storage. From the lessons learned we share some recommendations
for future DBMS design with PMem, e.g., simple hardware or software changes are
not enough for the best use of PMem in DBMSs.
Better database cost/performance via batched I/O on programmable SSD
Data should be placed at the most cost- and performance-effective tier in the storage hierarchy. While performance and cost decrease with distance from the CPU, the cost/performance trade-off depends on how efficiently data can be moved across tiers. Log structuring improves this cost/performance by writing batches of pages from main memory to secondary storage using a conventional block-at-a-time I/O interface. However, log structuring incurs overhead in the form of recovery and garbage collection. With computational Solid-State Drives, it is now possible to design a storage interface that minimizes this overhead. In this paper, we offload log structuring from the CPU to the SSD. We define a new batch I/O storage interface and we design a Flash Translation Layer that takes care of log structuring on the SSD side. This removes the CPU computational and I/O load associated with recovery and garbage collection. We compare the performance of the Bw-tree key-value store with its LLAMA host-based log structuring to the same key-value software stack executing on a computational SSD equipped with a batch I/O interface. Our experimental results show the benefits of eliminating redundancies, minimizing interactions across storage layers, and avoiding the CPU cost of providing log structuring.
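The combination of batched appends, a logical-to-physical mapping table, and garbage collection described above can be sketched as a toy in-memory model (the class and its methods are hypothetical illustrations, not the paper's actual batch I/O interface):

```python
class BatchLogFTL:
    """Toy log-structured store: batched appends plus a logical->physical map,
    as an FTL on the SSD side would maintain."""

    def __init__(self):
        self.log = []   # append-only list of (logical_page, data)
        self.l2p = {}   # logical page number -> index into self.log

    def write_batch(self, pages):
        """One batch I/O: append every page, then update the mapping."""
        for lpn, data in pages:
            self.log.append((lpn, data))
            self.l2p[lpn] = len(self.log) - 1

    def read(self, lpn):
        return self.log[self.l2p[lpn]][1]

    def garbage_collect(self):
        """Reclaim space by copying only live (still-mapped) pages forward."""
        live = sorted(self.l2p.items(), key=lambda kv: kv[1])
        self.log = [(lpn, self.log[idx][1]) for lpn, idx in live]
        self.l2p = {lpn: i for i, (lpn, _) in enumerate(self.log)}

ftl = BatchLogFTL()
ftl.write_batch([(1, b"a"), (2, b"b")])
ftl.write_batch([(1, b"a2")])   # overwrite: page 1's old slot becomes stale
before = len(ftl.log)           # 3 entries in the log, one of them stale
ftl.garbage_collect()
print(before, len(ftl.log), ftl.read(1))  # 3 2 b'a2'
```

Offloading exactly this bookkeeping, the mapping updates and the live-page copying, is what relieves the host CPU of the recovery and garbage-collection load the paper measures.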
Accelerating Large-Scale Graph-based Nearest Neighbor Search on a Computational Storage Platform
K-nearest neighbor search is one of the fundamental tasks in various
applications, and the hierarchical navigable small world (HNSW) graph has
recently drawn attention in large-scale cloud services, as it easily scales up
the database while offering fast search. Meanwhile, a computational storage
device (CSD) that combines programmable logic and storage modules on a single
board has become popular as a way to address the data bandwidth bottleneck of
modern computing systems. In this paper, we propose a computational storage
platform, based on the SmartSSD CSD, that accelerates a large-scale graph-based
nearest neighbor search algorithm. To this end, we modify the algorithm to make
it more amenable to hardware and implement two types of accelerators using
HLS- and RTL-based methodologies with various optimization methods. In
addition, we scale the proposed platform up to 4 SmartSSDs and apply graph
parallelism to boost system performance further. As a result, the proposed
computational storage platform achieves a throughput of 75.59 queries per
second on the SIFT1B dataset at 258.66 W power dissipation, which is 12.83x
and 17.91x faster, and 10.43x and 24.33x more energy efficient, than the
conventional CPU-based and GPU-based server platforms, respectively. With
multi-terabyte storage and custom acceleration capability, we believe that the
proposed computational storage platform is a promising solution for
cost-sensitive cloud datacenters.
Comment: extension of FCCM 2021; accepted in IEEE Transactions on Computers.
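At its core, the graph-based search that HNSW builds on is greedy best-first traversal over a neighbor graph. A minimal single-layer sketch (the toy graph, Euclidean distance, and k=1 setup are illustrative; real HNSW adds a layer hierarchy and a beam-width parameter):

```python
import heapq
import math

def greedy_graph_search(graph, points, query, entry, k=1):
    """Best-first search over a neighbor graph (single HNSW-like layer)."""
    dist = lambda a, b: math.dist(a, b)
    visited = {entry}
    candidates = [(dist(points[entry], query), entry)]  # min-heap to expand
    best = [(-candidates[0][0], entry)]                 # max-heap of top-k
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -best[0][0] and len(best) >= k:
            break  # nearest remaining candidate cannot improve the result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            nd = dist(points[nb], query)
            if len(best) < k or nd < -best[0][0]:
                heapq.heappush(candidates, (nd, nb))
                heapq.heappush(best, (-nd, nb))
                if len(best) > k:
                    heapq.heappop(best)
    return sorted((-d, n) for d, n in best)  # [(distance, node), ...]

points = {0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (5, 5)}
graph = {0: [1, 3], 1: [0, 2], 2: [1], 3: [0]}
print(greedy_graph_search(graph, points, query=(2.1, 0), entry=0, k=1)[0][1])
# → 2 (the closest point to the query)
```

The dominant cost per step is fetching a node's neighbor list and computing a handful of distances, which is why moving the traversal next to multi-terabyte storage on a SmartSSD avoids shuttling the whole graph over the host's bandwidth-limited I/O path.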
A peptide encoded by a highly conserved gene belonging to the genus Streptomyces shows antimicrobial activity against plant pathogens
The genus Streptomyces has long been highlighted for the versatility and diversity of the antimicrobial agents its members produce, and it is a heavily sequenced taxon within the phylum Actinobacteria. In this study, 47 sequence profiles were identified as proteins highly conserved within the genus Streptomyces. Significant hits to 38 of the profiles were found in more than 2000 Streptomyces genomes, 11 of which were further conserved in more than 90% of the Actinobacterial genomes analyzed. Only a few of the genes corresponding to these sequence profiles have been functionally characterized; these play regulatory roles in morphology and antibiotic biosynthesis. Here we report a highly conserved sequence, SHC-AMP (Streptomyces highly conserved antimicrobial peptide), which exhibited antimicrobial activity against bacterial and fungal plant pathogens. In particular, Arabidopsis thaliana was effectively protected against infection with Pseudomonas syringae pv. tomato DC3000 by treatment with this peptide. The results indicate the potential application of this peptide as an antimicrobial agent for the control of plant diseases. Our results also suggest putative target genes for controlling Streptomyces spp., including the one exhibiting antimicrobial activity against a wide range of phytopathogens.
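The conservation screen described above reduces to counting, for each sequence profile, the fraction of genomes with a significant hit and keeping those above a cutoff. A toy sketch with made-up profile and genome identifiers (the 90% threshold follows the abstract; the data are invented for illustration):

```python
def conserved_profiles(hits, n_genomes, threshold=0.90):
    """hits: {profile: set of genome ids with a significant hit}.
    Return the profiles conserved in >= threshold of the genomes."""
    return {p for p, gs in hits.items() if len(gs) / n_genomes >= threshold}

# toy data: 3 profiles screened against 10 genomes
hits = {"profA": set(range(10)),  # hit in all 10 genomes
        "profB": set(range(9)),   # hit in 90% of genomes -> kept
        "profC": {0, 1, 2}}       # hit in 30% -> filtered out
print(sorted(conserved_profiles(hits, n_genomes=10)))  # ['profA', 'profB']
```

Scaled up to thousands of genomes, the same filter is what narrows the 47 candidate profiles down to the handful conserved broadly enough across Actinobacteria to be interesting as targets.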
Fast Statistical Alignment
We describe a new program for the alignment of multiple biological sequences that is both statistically motivated and fast enough for problem sizes that arise in practice. Our Fast Statistical Alignment program is based on pair hidden Markov models which approximate an insertion/deletion process on a tree and uses a sequence annealing algorithm to combine the posterior probabilities estimated from these models into a multiple alignment. FSA uses its explicit statistical model to produce multiple alignments which are accompanied by estimates of the alignment accuracy and uncertainty for every column and character of the alignment—previously available only with alignment programs which use computationally expensive Markov chain Monte Carlo approaches—yet can align thousands of long sequences. Moreover, FSA utilizes an unsupervised query-specific learning procedure for parameter estimation which leads to improved accuracy on benchmark reference alignments in comparison to existing programs. The centroid alignment approach taken by FSA, in combination with its learning procedure, drastically reduces the amount of false-positive alignment on biological data in comparison to that given by other methods. The FSA program and a companion visualization tool for exploring uncertainty in alignments can be used via a web interface at http://orangutan.math.berkeley.edu/fsa/, and the source code is available at http://fsa.sourceforge.net/.
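The pair-HMM machinery FSA builds on can be shown in miniature: a forward recursion over match and insert states sums the probability of a sequence pair over all alignments. A sketch with the textbook 3-state model and toy parameters (FSA's actual models, trees, and learned parameters differ):

```python
def pair_hmm_forward(x, y, delta=0.1, eps=0.3):
    """Forward probability P(x, y) under a 3-state (M/X/Y) pair HMM.

    delta: gap-open transition prob; eps: gap-extend transition prob."""
    pe = lambda a, b: 0.16 if a == b else 0.03  # pair emission in state M
    q = 0.25                                    # single-char emission in X/Y
    n, m = len(x), len(y)
    # fM/fX/fY[i][j]: forward probability after emitting x[:i] and y[:j]
    fM = [[0.0] * (m + 1) for _ in range(n + 1)]
    fX = [[0.0] * (m + 1) for _ in range(n + 1)]
    fY = [[0.0] * (m + 1) for _ in range(n + 1)]
    fM[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i and j:
                fM[i][j] = pe(x[i-1], y[j-1]) * (
                    (1 - 2 * delta) * fM[i-1][j-1]
                    + (1 - eps) * (fX[i-1][j-1] + fY[i-1][j-1]))
            if i:  # state X: emit x[i-1] against a gap
                fX[i][j] = q * (delta * fM[i-1][j] + eps * fX[i-1][j])
            if j:  # state Y: emit y[j-1] against a gap
                fY[i][j] = q * (delta * fM[i][j-1] + eps * fY[i][j-1])
    return fM[n][m] + fX[n][m] + fY[n][m]

# identical sequences should be far more probable than unrelated ones
print(pair_hmm_forward("ACGT", "ACGT") > pair_hmm_forward("ACGT", "TTTT"))
```

Running the matching backward recursion and combining the two gives the per-character match posteriors that FSA's sequence annealing step then assembles into a multiple alignment with column-level confidence estimates.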
