45 research outputs found

    Wide-Angle Foveation for All-Purpose Use

    This paper proposes a model of a wide-angle space-variant image that serves as a guide for designing a fovea sensor. First, an advanced wide-angle foveated (AdWAF) model is formulated with all-purpose use in mind. The model uses both Cartesian (linear) and logarithmic coordinates in both planar and spherical projection. It thus divides its wide-angle field of view into four areas, so that it can flexibly represent images produced by various types of lenses. The first simulation compares the model with other lens models in terms of image height and resolution. The result shows that the AdWAF model can reduce image data by 13.5% compared to a log-polar lens model when both have the same resolution in the central field of view. The AdWAF image is remapped, using the proposed model, from an actual input image taken by the prototype fovea lens, a wide-angle foveated (WAF) lens. The second simulation compares the model with other foveation models used for the existing log-polar chip and vision system. The third simulation estimates the scale-invariant property by comparison with the existing fovea lens and the log-polar lens. The AdWAF model gives its planar logarithmic part a complete scale-invariant property, while the fovea lens has an error of at most 7.6% in its spherical logarithmic part. The fourth simulation computes optical flow to examine the unidirectional property when a fovea sensor based on the AdWAF model moves, compared to a pinhole camera. The result, obtained using the concept of a virtual cylindrical screen, indicates that the proposed model has advantages in the computation and application of optical flow when the fovea sensor moves forward.

    An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification

    Wav2vec2 has achieved success in applying the Transformer architecture and self-supervised learning to speech recognition. Recently, these techniques have come to be used not only for speech recognition but also for speech processing as a whole. This paper introduces an effective end-to-end speaker identification model that applies a Transformer-based contextual model. We explored the relationship between the hyper-parameters and performance in order to discern the structure of an effective model. Furthermore, we propose a pooling method, Temporal Gate Pooling, with powerful learning ability for speaker identification. We applied Conformer as the encoder and BEST-RQ for pre-training, and evaluated the model on the VoxCeleb1 speaker identification task. The proposed method achieved an accuracy of 87.1% with 28.5M parameters, demonstrating precision comparable to wav2vec2 with 317.7M parameters. Code is available at https://github.com/HarunoriKawano/speaker-identification-with-tgp. Comment: 5 pages, 3 figures

    Eccentricity estimator for wide-angle fovea sensor by FMI descriptor approach

    This paper proposes a method for estimating eccentricity, which corresponds to the incident angle to a fovea sensor. The method applies the Fourier-Mellin Invariant (FMI) descriptor to estimate rotation, scale, and translation, taking into account both the geometrical distortion and the non-uniform resolution of a space-variant image from the fovea sensor. The paper focuses on two points. One is to use multi-resolution images computed by the discrete wavelet transform to properly reduce noise caused by foveation. The other is to use a variable window function to change the effective field of view (FOV) without sacrificing high accuracy (a window function is generally used to reduce DFT leakage caused by both ends of a signal). The simulation compares the root mean square (RMS) of the foveation noise between uniform and non-uniform resolutions as the resolution level and the FOV level are changed, respectively. Experimental results show that the proposed method is consistent with the wide-angle space-variant image from the fovea sensor, i.e., it does not sacrifice high accuracy in the central FOV.

    Image Extraction by Wide Angle Foveated Lens for Overt-Attention

    This paper defines Wide Angle Foveated (WAF) imaging. The proposed model combines a Cartesian coordinate system, a log-polar coordinate system, and a unique camera model composed of planar projection and spherical projection, for all-purpose use of a single imaging device. For pattern recognition, the central field of view (FOV) is given translation invariance, and the intermediate FOV is given rotation and scale invariance. Further, the peripheral FOV is more useful for controlling the camera's view direction, because its image height is linear in the incident angle to the optical center point of the camera model. Thus, this imaging model improves usability especially when the camera is moved dynamically, that is, for overt attention. Moreover, simulation results of image extraction show the advantages of the proposed model in terms of the magnification factor of the central FOV, the accuracy of scale invariance, and the flexibility to describe other WAF vision sensors.

    Machine Vision System to Induct Binocular Wide-Angle Foveated Information into Both the Human and Computers - Feature Generation Algorithm based on DFT for Binocular Fixation

    This paper introduces a machine vision system suited to cooperative work between a human and a computer. The system provides images input from a stereo camera head not only to the processor but also to the user's sight as binocular wide-angle foveated (WAF) information; it is thus applicable to Virtual Reality (VR) systems such as tele-existence or expert training. The stereo camera head acquires the required input images, foveated by special wide-angle optics, under camera view direction control, and a 3D head-mounted display (HMD) presents fused 3D images to the user. Moreover, an analog video signal processing device, inspired by the structure of the human visual system, provides WAF information to multiple processors and the user in a unique way. Because the design concept is to mimic the human visual system, the developed vision system is also expected to be applicable to research on the human brain and vision. Further, an algorithm is proposed that generates features using the Discrete Fourier Transform (DFT) for binocular fixation, in order to provide well-fused 3D images to the 3D HMD. The paper examines the influence of applying this algorithm to space-variant images such as WAF images, based on experimental results.

    Eccentricity Compensator for Log-Polar Sensor

    This paper aims at acquiring robust rotation-, scale-, and translation-invariant features from a space-variant image produced by a fovea sensor. The proposed eccentricity compensator model corrects the deformation that occurs in a log-polar image when the fovea sensor is not centered on a target, that is, when eccentricity exists. An image simulator in discrete space remaps a compensated log-polar image using this model. The paper also proposes unreliable feature omission (UFO), which reduces local high-frequency noise in the space-variant image using the discrete wavelet transform. It discards coefficients that are regarded as unreliable based on digitized errors of the input image. The first simulation mainly tests the geometric performance of the compensator in the noise-free case. The result shows that the compensator performs well: its root mean square error (RMSE) changes by at most 2.54% for eccentricity within 34.08 deg. The second simulation applies UFO to the log-polar image remapped by the compensator, taking its space-variant resolution into account. The result leads to the conclusion that UFO performs better with more white Gaussian noise (WGN), even if the resolution of the compensated log-polar image is not isotropic.

    In Situ Enzyme Activity in the Dissolved and Particulate Fraction of the Fluid from Four Pitcher Plant Species of the Genus Nepenthes

    The genus Nepenthes, a group of carnivorous plants, has pitchers that trap insects and digest them in the contained fluid to gain nutrients. A distinctive character of the pitcher fluid is its digestive enzyme activity, which may be derived from the plant and from dwelling microbes. However, little is known about in situ digestive enzymes in the fluid. Here we examined the pitcher fluid from four species of Nepenthes. High bacterial density was observed within the fluids, ranging from 7×10⁶ to 2.2×10⁸ cells ml⁻¹. We measured the activity of three common enzymes in the fluid: acid phosphatases, β-d-glucosidases, and β-d-glucosaminidases. All the tested enzymes, detected in the fluid of all the pitcher species, showed activity that considerably exceeded that observed in aquatic environments such as freshwater, seawater, and sediment. Our results indicate that high enzyme activity within a pitcher could assist in the rapid decomposition of prey to maximize efficient nutrient use. In addition, we filtered the fluid to distinguish between dissolved enzyme activity and particle-bound activity. Filtration significantly decreased the activity of all enzymes, while pH and Nepenthes species did not affect the enzyme activity. This suggests that enzymes bound to bacteria and other organic particles also contribute significantly to the total enzyme activity of the fluid. Since organic particles are themselves usually colonized by attached and highly active bacteria, it is possible that microbe-derived enzymes also play an important role in nutrient recycling within the fluid and affect the metabolism of the Nepenthes pitcher plant.