No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models
For decades, context-dependent phonemes have been the dominant sub-word unit
for conventional acoustic modeling systems. This status quo has begun to be
challenged recently by end-to-end models which seek to combine acoustic,
pronunciation, and language model components into a single neural network. Such
systems, which typically predict graphemes or words, simplify the recognition
process since they remove the need for a separate expert-curated pronunciation
lexicon to map from phoneme-based units to words. However, there has been
little previous work comparing phoneme-based versus grapheme-based sub-word
units in the end-to-end modeling framework, to determine whether the gains from
such approaches are primarily due to the new probabilistic model, or from the
joint learning of the various components with grapheme-based units.
In this work, we conduct detailed experiments which are aimed at quantifying
the value of phoneme-based pronunciation lexica in the context of end-to-end
models. We examine phoneme-based end-to-end models, which are contrasted
against grapheme-based ones on a large vocabulary English Voice-search task,
where we find that graphemes do indeed outperform phonemes. We also compare
grapheme and phoneme-based approaches on a multi-dialect English task, which
once again confirms the superiority of graphemes, greatly simplifying the system
for recognizing multiple dialects.
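To make the contrast between the two sub-word units concrete, the following minimal sketch shows how the same transcript maps to grapheme targets directly but needs a pronunciation lexicon to produce phoneme targets. The toy lexicon, its ARPAbet-style symbols, and the function names are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical toy lexicon; a real system would use an expert-curated one.
TOY_LEXICON = {
    "voice": ["V", "OY", "S"],
    "search": ["S", "ER", "CH"],
}


def to_graphemes(text):
    """Grapheme targets: characters of the transcript, no lexicon required."""
    return ["<space>" if ch == " " else ch for ch in text.lower()]


def to_phonemes(text, lexicon):
    """Phoneme targets: every word must be looked up in the lexicon."""
    units = []
    for word in text.lower().split():
        if word not in lexicon:
            # Out-of-lexicon words are the failure mode graphemes avoid.
            raise KeyError(f"no pronunciation for {word!r}")
        units.extend(lexicon[word])
    return units


grapheme_targets = to_graphemes("Voice search")
phoneme_targets = to_phonemes("Voice search", TOY_LEXICON)
```

The grapheme path works for any word, whereas the phoneme path fails on anything missing from the lexicon, which is the dependency the abstract describes end-to-end grapheme models as removing.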
EXIT: Extrapolation and Interpolation-based Neural Controlled Differential Equations for Time-series Classification and Forecasting
Deep learning inspired by differential equations is a recent research trend
and has achieved state-of-the-art performance on many machine learning
tasks. Among these approaches, time-series modeling with neural controlled differential
equations (NCDEs) is considered a breakthrough. In many cases, NCDE-based
models not only provide better accuracy than recurrent neural networks (RNNs)
but also make it possible to process irregular time-series. In this work, we
enhance NCDEs by redesigning their core part, i.e., generating a continuous
path from a discrete time-series input. NCDEs typically use interpolation
algorithms to convert discrete time-series samples to continuous paths.
However, we propose to i) generate another latent continuous path using an
encoder-decoder architecture, which corresponds to the interpolation process of
NCDEs, i.e., our neural network-based interpolation vs. the existing explicit
interpolation, and ii) exploit the generative characteristic of the decoder,
i.e., extrapolation beyond the time domain of original data if needed.
Therefore, our NCDE design can use both the interpolated and the extrapolated
information for downstream machine learning tasks. In our experiments with 5
real-world datasets and 12 baselines, our extrapolation and interpolation-based
NCDEs outperform existing baselines by non-trivial margins.
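As a rough illustration of the explicit interpolation step that the abstract says NCDEs typically use, the sketch below turns irregularly sampled points into a continuous piecewise-linear path X(t). It is not EXIT's learned encoder-decoder path; the function name `linear_path` and the sample data are hypothetical. Note that this explicit path is undefined outside the observed time domain, which is the gap the abstract's extrapolation component addresses.

```python
from bisect import bisect_right


def linear_path(times, values):
    """Return X(t), a piecewise-linear path through irregular samples."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) samples")
    if any(b <= a for a, b in zip(times, times[1:])):
        raise ValueError("times must be strictly increasing")

    def X(t):
        if t < times[0] or t > times[-1]:
            # Explicit interpolation offers no value beyond the data.
            raise ValueError("t outside observed time domain")
        # Index of the segment [times[i], times[i + 1]] containing t.
        i = min(bisect_right(times, t) - 1, len(times) - 2)
        t0, t1 = times[i], times[i + 1]
        w = (t - t0) / (t1 - t0)
        return (1 - w) * values[i] + w * values[i + 1]

    return X


# Irregularly spaced observations (illustrative values).
path = linear_path([0.0, 0.5, 2.0], [1.0, 2.0, 5.0])
```

Calling `path(t)` at any time between the first and last observation returns an interpolated value, so a downstream differential-equation solver can query the input at arbitrary times.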
Characteristics of Human Brain Activity during the Evaluation of Service-to-Service Brand Extension
Brand extension is a marketing strategy that applies a previously established brand name to new goods or services. A number of studies have reported the characteristics of human event-related potentials (ERPs) in response to the evaluation of goods-to-goods brand extension. In contrast, human brain responses to the evaluation of service extension are relatively unexplored. The aim of this study was to investigate the cognitive processes underlying the evaluation of service-to-service brand extension with electroencephalography (EEG). A total of 56 text stimuli, each composed of a service brand name (S1) followed by an extended service name (S2), were presented to participants. EEG was recorded while participants were asked to evaluate whether a given brand extension was acceptable or not. The behavioral results revealed that participants could evaluate brand extensions even though they had little knowledge about the extended services, indicating the role of the brand in the evaluation of the services. Additionally, we developed a method of grouping brand extension stimuli according to the fit levels obtained from behavioral responses, instead of grouping stimuli a priori. The ERP analysis identified three components during the evaluation of brand extension: N2, P300, and N400. No difference in N2 amplitude was found among the different levels of fit between S1 and S2. The P300 amplitude for the low level of fit was greater than those for higher levels (p < 0.05). The N400 amplitude was more negative for the mid- and high-level fits than for the low level. The ERP results for P300 and N400 indicate that the early stage of brand extension evaluation might first detect a low-fit brand extension as an improbable target, followed by a late stage that integrates S2 into S1. Along with previous findings, our results demonstrate that the cognitive evaluation of service-to-service brand extension differs from that of goods-to-goods extension.
[Title not recoverable: original non-Latin characters lost in encoding]
Department of Biomedical Engineering (Human Factors Engineering)
We encounter a lot of choices every day based on subjective values that we assign to the choice alternatives. However, we cannot put the same amount of effort into every single choice due to limited time and cognitive resources. These capacity limitations would lead us to be efficient in value-based decision-making. In this dissertation, I proposed two accounts of efficient ways of making value-based choices: the precision of value representation and efficient information acquisition during the choice process.
Many researchers have reported that value differences are encoded to make decisions, and thus it is conceivable that humans would adopt an efficient strategy for distinguishing one option from another to select the best one. This would be affected by how precisely the values are represented, though the dimension of value differences might not be completely consistent with that of the representation of each value. As in sensory neural circuits, humans may preferentially process statistically likely (e.g., frequent in the environment) and/or biologically meaningful signals, a principle known as efficient coding. In value-based decision-making within the same category, high-valued items would be more likely to be considered and more beneficial to decision-makers than low-valued ones. Thus, I speculated that a more precise representation would be observed for high-valued items. However, when there are too many alternatives to evaluate, humans are likely to make a quick and adequate choice to save time and energy. Thus, instead of relying on precise representations, they would gather information efficiently (e.g., through selective attention, the amount of deliberation, etc.) from the alternatives.
The first study aimed to investigate whether high values would be more precisely represented than low values and how the precision of value representation would be related to choice performance. I conducted human behavior experiments using sets of binary choices between snacks and observed that participants had more precise representations of high-valued items than of low-valued ones. In addition, they made faster and more accurate choices for high-valued pairs than for low-valued ones only when the value differences were large (i.e., an interaction effect of value magnitude and value difference). Then, to demonstrate that the precision of value representation would determine choice reaction time and accuracy, I simulated data using a sequential sampling model whose decision value incorporated the precision of value representation as well as the value difference. I further developed an alternative model based on previous findings of greater attention to high values. As a result, only the proposed model could reproduce the interaction effect of value magnitude and value difference on choice performance, whereas the alternative model could not. Also, the choice reaction times of the group data were more similar to those simulated with the proposed model than to those simulated with the alternative one. These findings imply that precise value representation of high-valued items would be an efficient way of making decisions, taking less time while still yielding good choices.
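The abstract does not specify the form of the sequential sampling model, so the following is only a minimal sketch under stated assumptions: evidence accumulates to a fixed threshold with unit noise, and the decision value (drift) is the value difference divided by the pooled standard deviation of the two value representations. The function names, parameters, and numeric values are all hypothetical.

```python
import random


def simulate_choice(v_a, v_b, sd_a, sd_b, threshold=1.0, dt=0.01,
                    max_time=10.0, rng=None):
    """Simulate one binary choice; return (choice, reaction_time)."""
    rng = rng or random.Random()
    # Assumed decision value: value difference scaled by representational
    # imprecision, so sharper representations give a stronger drift.
    pooled_sd = (sd_a ** 2 + sd_b ** 2) ** 0.5
    drift = (v_a - v_b) / pooled_sd
    evidence, t = 0.0, 0.0
    while abs(evidence) < threshold and t < max_time:
        # Euler step of a diffusion process with unit noise.
        evidence += drift * dt + (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return ("A" if evidence > 0 else "B"), t


def summarize(n, seed=0, **kwargs):
    """Return (accuracy, mean reaction time) over n simulated trials."""
    rng = random.Random(seed)
    results = [simulate_choice(rng=rng, **kwargs) for _ in range(n)]
    accuracy = sum(choice == "A" for choice, _ in results) / n
    mean_rt = sum(rt for _, rt in results) / n
    return accuracy, mean_rt


# Same value difference (A is better by 1), different precision.
precise = summarize(400, v_a=5.0, v_b=4.0, sd_a=0.3, sd_b=0.3)
imprecise = summarize(400, v_a=5.0, v_b=4.0, sd_a=1.0, sd_b=1.0)
```

Under these assumptions, the precise condition yields a larger drift for the same value difference, so simulated choices tend to be both faster and more accurate. Other ways of entering precision into the model, such as scaling the accumulation noise instead of the drift, would behave differently.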
The second study was conducted to confirm the supposition that humans have a sharp representation of high-valued items due to efficient coding, in that high-valued items have been encountered more frequently and/or entail greater benefits than low-valued ones. I ran choice experiments as in the first study, with an additional task manipulating either choice frequency or choice outcome for low-valued pairs. In the former condition, participants were additionally exposed to the low-valued pairs; in the latter, monetary rewards were biased toward the low-valued pairs, which paid more than high-valued pairs, so that choices between low-valued items could be as beneficial as those between high-valued items. The items were valuated again after the manipulation. Any manipulation that sharpens the precision of low-valued items would point to the basis of the findings in the first study. The results showed that repeated exposure to the low-valued pairs made the value representation of the low-valued items much narrower than that of the high-valued ones. On the other hand, earning more from the low-valued pairs than from the high-valued ones did not produce additional improvement in precision for low-magnitude items. Thus, values would become precisely represented via frequent exposure rather than via benefits from the choice outcome.
The last study aimed to investigate efficient decision-making among multiple alternatives. Since the benefit of saving cognitive energy would outweigh the benefit of an accurate choice, decision-makers do not need precise value representations of the alternatives. Instead, they would gather information efficiently (e.g., through selective attention, quick accumulation, etc.). I measured eye movements and electroencephalography (EEG) during the choice task to address how each item is processed during the choice. As a result, for high-valued alternatives compared with low-valued ones, participants put less effort into comparing alternatives (a small number of fixations), ignored more alternatives (a low percentage of alternatives considered), spent less time on the candidates (short fixation duration on unchosen items), and stuck to the one they were about to choose (high dwell time on the chosen item). In addition, the chosen item among high-valued alternatives was treated as less informative (attenuated frontocentral N300/N400 effect) but received sustained attention (parieto-occipital sustained negativity). Behaviorally, inaccurate but fast choices were made among high-valued alternatives, while participants reported more enjoyment, confidence, and satisfaction, but not regret, than for the low-valued alternatives. In sum, fewer items were compared during choices among high-valued alternatives, saving capacity at the expense of making the best choice. However, once attended, the high-valued item was treated as an apparent choice with attention sustained, indicating efficient information acquisition during decision-making.
In sum, the three studies show that high precision of value representation for high-valued alternatives, arising from efficient coding, produces fast and accurate choices among high-valued alternatives (Study 1 and Study 2), and that efficiently allocating attention across multiple options helps save time and cognitive resources while making adequate choices with efficiently acquired information from high-valued alternatives (Study 3). These findings shed light on two underlying mechanisms, efficient coding and efficient information acquisition, that save time and cognitive resources during value-based decision-making.