Real-Time Decoding of an Integrate and Fire Encoder
Neuronal encoding models range from the detailed, biophysically based Hodgkin-Huxley model to the statistical linear time-invariant model, which specifies firing rates in terms of the extrinsic signal. Decoding the former is intractable, while the latter does not adequately capture the nonlinearities present in the neuronal encoding system. For practical applications, we wish to record the output of neurons, namely spikes, and decode this signal quickly in order to act on it, for example to drive a prosthetic device. Here, we introduce a causal, real-time decoder for the biophysically based integrate-and-fire encoding neuron model. We show that the upper bound of the real-time reconstruction error decreases polynomially in time, and that the L2 norm of the error is bounded by a constant that depends on the density of the spikes as well as the bandwidth and the decay of the input signal. We numerically validate the effect of these parameters on the reconstruction error. (National Science Foundation (U.S.), Emerging Frontiers in Research and Innovation Grant 1137237)
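As a concrete illustration of the encoder being decoded, a minimal integrate-and-fire model can be sketched in a few lines; the time step, threshold, and reset rule below are illustrative assumptions, not the parameters used in the paper:

```python
def integrate_and_fire(signal, dt=0.01, threshold=1.0):
    """Encode a sampled signal into spike times: integrate the input
    and emit a spike (then reset, keeping the excess) whenever the
    running integral crosses the threshold."""
    integral = 0.0
    spikes = []
    for i, x in enumerate(signal):
        integral += x * dt
        if integral >= threshold:
            spikes.append(i * dt)
            integral -= threshold  # reset, carrying over the excess
    return spikes

# A constant input fires at a rate proportional to its amplitude.
signal = [1.0] * 1000  # 10 s of a unit input sampled at dt = 0.01 s
spikes = integrate_and_fire(signal)
```

A decoder then faces the inverse problem: recovering the input signal from these spike times alone.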
Wildfire Spread Prediction Using Attention Mechanisms In U-Net
An investigation into using attention mechanisms for better feature extraction in wildfire spread prediction models. This research examines the U-Net architecture for image segmentation, a process that partitions images by classifying each pixel into one of two classes. The deep learning models explored in this research integrate modern deep learning architectures and the techniques used to optimize them. The models are trained on 12 distinct observational variables derived from the Google Earth Engine catalog. Evaluation is conducted with accuracy, Dice coefficient score, ROC-AUC, and F1-score. This research concludes that when U-Net is augmented with attention mechanisms, the attention component improves feature suppression and recognition, improving overall performance. Furthermore, employing ensemble modeling reduces bias and variance, leading to more consistent and accurate predictions. When performing inference on wildfire propagation at 30-minute intervals, the architecture presented in this research achieved an ROC-AUC score of 86.2% and an accuracy of 82.1%.
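The Dice coefficient used in the evaluation can be computed directly from binary masks; a minimal sketch (mask values and names are illustrative):

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice score for binary masks given as flat 0/1 sequences:
    Dice = 2|A n B| / (|A| + |B|); eps guards the empty-mask case."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)

pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)  # 2*2 / (3 + 3) ~= 0.667
```

Unlike plain accuracy, this metric is insensitive to the large background class, which is why it is favored for segmentation tasks such as this one.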
DESIGN FRAMEWORK FOR INTERNET OF THINGS BASED NEXT GENERATION VIDEO SURVEILLANCE
Modern artificial intelligence and machine learning open up a new era for video surveillance systems. Next-generation video surveillance in the Internet of Things (IoT) environment is an emerging research area because of high bandwidth demands, big-data generation, resource-constrained surveillance nodes, and high energy consumption in real-time applications. This thesis discusses the opportunities and functional requirements that a next-generation video surveillance system should meet with the help of video analytics, artificial intelligence, and machine learning. It also proposes a new video surveillance system architecture that introduces fog computing into the IoT-based system, and describes the facilities and benefits of the proposed system, which can meet the forthcoming requirements of surveillance. The thesis further examines the challenges and issues faced by video surveillance in the IoT environment and evaluates a fog-cloud integrated architecture to identify and eliminate those issues.
The focus of this thesis is to evaluate IoT-based video surveillance systems. To this end, two case studies were performed to explore energy- and bandwidth-efficient video surveillance. In the first case study, an IoT-based power-efficient color frame transmission and generation algorithm for video surveillance is presented. The conventional approach is to transmit the R, G, and B components of every frame. With the proposed technique, instead of sending all components, one color frame is sent first, followed by a series of gray-scale frames. After a certain number of gray-scale frames, another color frame is sent, followed by the same number of gray-scale frames, and this process repeats. At the decoder, color information is extracted from the color frame and used to colorize the gray-scale frames. In the second case study, a bandwidth-efficient, low-complexity frame reproduction technique, also applicable to IoT-based video surveillance, is presented. With this technique, a pixel intensity is transmitted only if it differs substantially from the corresponding pixel of the previous frame; if the intensity is similar or nearly similar, the information is not transferred. To this end, a bit stream is created for every frame following a predefined protocol. On the cloud side, the frame can be reproduced from the bit stream by applying the protocol in reverse.
Experimental results of the two case studies show that the proposed IoT-based approach outperforms traditional techniques in terms of both energy efficiency and video quality, and can therefore enable sensor nodes in the IoT to perform more operations under energy constraints.
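The difference-based technique of the second case study can be sketched as follows, assuming a simple list of (index, intensity) updates as the transmitted payload and an illustrative threshold; the thesis's actual bit-stream protocol may differ:

```python
def encode_frame(frame, prev_frame, threshold=10):
    """Transmit only pixels whose intensity differs from the previous
    frame by more than the threshold, as (index, intensity) pairs."""
    return [(i, p) for i, (p, q) in enumerate(zip(frame, prev_frame))
            if abs(p - q) > threshold]

def decode_frame(updates, prev_frame):
    """Cloud-side reproduction: apply the transmitted updates to the
    previously decoded frame; untransmitted pixels keep their old value."""
    frame = list(prev_frame)
    for i, p in updates:
        frame[i] = p
    return frame

prev = [100, 100, 100, 100]
curr = [100, 180, 103, 100]          # only pixel 1 changed heavily
updates = encode_frame(curr, prev)   # [(1, 180)]; the small +3 change is dropped
decoded = decode_frame(updates, prev)
```

The small below-threshold differences that are dropped are the source of the bandwidth savings, at the cost of slight reconstruction error.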
Engineering evaluations and studies. Volume 3: Exhibit C
High-rate multiplexer asymmetry and jitter, data-dependent amplitude variations, and transition density are discussed.
You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models
RESTful APIs are popular web services, requiring documentation to ease their
comprehension, reusability and testing practices. The OpenAPI Specification
(OAS) is a widely adopted and machine-readable format used to document such
APIs. However, manually documenting RESTful APIs is a time-consuming and
error-prone task, resulting in unavailable, incomplete, or imprecise
documentation. As RESTful API testing tools require an OpenAPI specification as
input, insufficient or informal documentation hampers testing quality.
Recently, Large Language Models (LLMs) have demonstrated exceptional
abilities to automate tasks based on their colossal training data. Accordingly,
such capabilities could be utilized to assist the documentation and testing
process of RESTful APIs.
In this paper, we present RESTSpecIT, the first automated RESTful API
specification inference and black-box testing approach leveraging LLMs. The
approach requires minimal user input compared to state-of-the-art RESTful API
inference and testing tools: given an API name and an LLM key, HTTP requests
are generated and mutated with data returned by the LLM. By sending the
requests to the API endpoint, HTTP responses can be analyzed for inference and
testing purposes. RESTSpecIT utilizes an in-context prompt masking strategy,
requiring no model fine-tuning. Our evaluation demonstrates that RESTSpecIT is
capable of: (1) inferring specifications with 85.05% of GET routes and 81.05%
of query parameters found on average, (2) discovering undocumented and valid
routes and parameters, and (3) uncovering server errors in RESTful APIs.
Inferred specifications can also be used as testing tool inputs.
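The request-mutation step can be illustrated with a network-free sketch; the URL, parameter names, and values below are placeholders (in the actual approach, candidate values are returned by the LLM):

```python
from urllib.parse import urlencode, urlsplit, parse_qsl, urlunsplit

def mutate_request(url, candidate_params):
    """Yield mutated request URLs, adding one candidate query
    parameter at a time to the base request."""
    scheme, netloc, path, query, frag = urlsplit(url)
    base = dict(parse_qsl(query))
    for name, value in candidate_params.items():
        params = dict(base)
        params[name] = value
        yield urlunsplit((scheme, netloc, path, urlencode(params), frag))

# Hypothetical API endpoint and candidate parameters.
base_url = "https://api.example.com/v1/search?q=test"
candidates = {"limit": "10", "sort": "asc"}
mutants = list(mutate_request(base_url, candidates))
# Each mutant keeps the original query and adds one candidate parameter.
```

Sending each mutant and inspecting the HTTP status and response body is what lets the approach infer which routes and parameters the API actually accepts.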
Automatic Rural Road Centerline Extraction from Aerial Images for a Forest Fire Support System
In the last decades, Portugal has been severely affected by forest fires which have caused
massive damage both environmentally and socially. Having a well-structured and precise
mapping of rural roads is critical to help firefighters to mitigate these events. The
traditional process of extracting rural road centerlines from aerial images is extremely
time-consuming and tedious, because the mapping operator has to manually label the road
area and extract the road centerline.
A frequent challenge in extracting rural road centerlines is the high environmental
complexity and the road occlusions caused by vehicles, shadows, wild vegetation, and
trees, which produce heterogeneous segments that require further refinement. This
dissertation proposes an approach to automatically detect rural road segments and to
extract the road centerlines from aerial images.
The proposed method comprises two main steps. In the first step, an architecture based
on a deep learning model (DeepLabV3+) is used to extract road feature maps and detect
the rural roads. In the second step, an optimization is first applied to improve road
connections and to remove small white artifacts from the image predicted by the neural
network. Finally, a morphological approach extracts the rural road centerlines from the
previously detected roads using thinning algorithms such as the Zhang-Suen and Guo-Hall
methods.
With the automation of these two stages, it is now possible to detect and extract road
centerlines from complex rural environments automatically and faster than with traditional
methods, and the resulting data can be integrated into a Geographical Information System
(GIS), allowing the creation of real-time mapping applications.
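The Zhang-Suen thinning algorithm used for centerline extraction has a well-known standard formulation; a minimal pure-Python sketch (the dissertation's implementation may differ):

```python
def zhang_suen_thin(image):
    """Zhang-Suen thinning on a binary image given as a list of 0/1 rows.
    Repeatedly peels boundary pixels in two sub-passes until no pixel
    changes, leaving a one-pixel-wide skeleton."""
    img = [row[:] for row in image]
    h, w = len(img), len(img[0])

    def neighbours(y, x):
        # P2..P9: the 8 neighbours, clockwise from the pixel above.
        return [img[y-1][x], img[y-1][x+1], img[y][x+1], img[y+1][x+1],
                img[y+1][x], img[y+1][x-1], img[y][x-1], img[y-1][x-1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if img[y][x] != 1:
                        continue
                    p = neighbours(y, x)
                    b = sum(p)  # number of non-zero neighbours
                    # a: number of 0 -> 1 transitions around the circle
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1
                            for i in range(8))
                    if step == 0:
                        cond = (p[0] * p[2] * p[4] == 0
                                and p[2] * p[4] * p[6] == 0)
                    else:
                        cond = (p[0] * p[2] * p[6] == 0
                                and p[0] * p[4] * p[6] == 0)
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_delete.append((y, x))
            for y, x in to_delete:
                img[y][x] = 0
                changed = True
    return img

# A solid 3x5 bar (padded with a zero border) thins toward a line.
thick = [[0] * 7] + [[0] + [1] * 5 + [0] for _ in range(3)] + [[0] * 7]
thin = zhang_suen_thin(thick)
```

The deferred deletion (collecting `to_delete` before applying it) is essential: each sub-pass must evaluate all pixels against the same snapshot of the image.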
Minimum-error, energy-constrained source coding by sensory neurons
Neural coding, the process by which neurons represent, transmit, and manipulate physical signals, is critical to the function of the nervous system. Despite years of study, neural coding is still not fully understood. Efforts to model neural coding could improve both the understanding of the nervous system and the design of artificial devices which interact with neurons. Sensory receptors and neurons transduce physical signals into a sequence of action potentials, called a spike train. The principles which underlie the translation from signal to spike train are still under investigation.
From the perspective of an organism, neural codes which maximize the fidelity of the encoded signal (minimize encoding error) provide a competitive advantage. Selective pressure over evolutionary timescales has likely encouraged neural codes which minimize encoding error. At the same time, neural coding is metabolically expensive, which suggests that selective pressure would also encourage neural codes which minimize energy. Based on these assumptions, this work proposes a principle of neural coding which captures the trade-off between error and energy as a constrained optimization problem of minimizing encoding error while satisfying a constraint on energy.
A solution to the proposed optimization problem is derived in the limit of high spike-rates. The solution is to track the instantaneous reconstruction error, and to time spikes when the error crosses a threshold value. In the limit of large signals, the threshold level is a constant, but in general it is signal dependent. This coding model, called the neural source coder, implies neurons should be able to track reconstruction error internally, using the error signal to precisely time spikes. Mathematically, this model is similar to existing adaptive threshold models, but it provides a new way to understand coding by sensory neurons.
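The threshold-crossing rule can be illustrated with a toy sketch; the leaky reconstruction kernel and all parameter values below are simplifying assumptions, not the dissertation's exact model:

```python
def neural_source_coder(signal, dt=0.001, threshold=0.5, tau=0.05):
    """Toy error-tracking coder: maintain a leaky reconstruction of the
    signal and emit a spike whenever the instantaneous reconstruction
    error reaches the threshold; each spike corrects the estimate by
    one threshold step."""
    recon = 0.0
    spikes, trace = [], []
    decay = dt / tau
    for i, x in enumerate(signal):
        recon -= recon * decay       # leaky decay of the estimate
        if x - recon >= threshold:   # error crossed the threshold
            spikes.append(i * dt)
            recon += threshold
        trace.append(recon)
    return spikes, trace

# A step input produces an initial burst, then steady firing that
# keeps the reconstruction error below the threshold.
signal = [1.0] * 1000
spikes, trace = neural_source_coder(signal)
```

The key property this sketch shares with the model in the text is the feedback loop: spike times are determined by the error between the signal and its own reconstruction, not by the signal alone.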
Comparing the predictions of the neural source coder to experimental data recorded from a peripheral neuron, the coder is able to predict spike times with considerable accuracy. Intriguingly, this is also true for a cortical neuron which has a low spike-rate. Reconstructions using the neural source coder show lower error than other spiking neuron models. The neural source coder also predicts the asymmetric spike-rate adaptation seen in sensory neurons (the primary-like response). An alternative expression for the neural source coder is as an instantaneous-rate coder of a rate function which depends on the signal, signal derivative, and encoding parameters. The instantaneous rate closely predicts experimental peri-stimulus time histograms.
The addition of a stochastic threshold to the neural source coder accounts for the spike-time jitter observed in experimental datasets. Jittered spike-trains from the neural source coder show long-term interval statistics which closely match experimental recordings from a peripheral neuron. Moreover, the spike trains have strongly anti-correlated intervals, a feature observed in experimental data. Interestingly, jittered spike-trains do not improve reconstruction error for an individual neuron, but reconstruction error is reduced in simulations of small populations of independent neurons. This suggests that jittered spike-trains provide a method for small populations of sensory neurons to improve encoding error.
Finally, a sound coding method for applying the neural source coder to timing spikes for cochlear implants is proposed. For each channel of the cochlear implant, a neural source coder can be used to time pulses to follow the patterns expected by peripheral neurons. Simulations show reduced reconstruction error compared to standard approaches using the signal envelope. Initial experiments with normal-hearing subjects show that a vocoder simulating this cochlear implant sound coding approach results in better speech perception thresholds when compared to a standard noise vocoder. Although further experiments with cochlear implant users are critical, initial results encourage further study of the proposed sound-coding method.
Overall, the proposed principle of minimum-error, energy-constrained encoding for sensory neural coding can be implemented by a spike-timing model with a feedback loop which computes reconstruction error. This model of neural source coding predicts a wide range of experimental observations from both peripheral and cortical neurons. The close agreement between experimental data and the predictions of the neural source coder suggests a fundamental principle underlying neural coding.
End-to-End Simultaneous Speech Translation with Differentiable Segmentation
End-to-end simultaneous speech translation (SimulST) outputs translation
while receiving the streaming speech inputs (a.k.a. streaming speech
translation), and hence needs to segment the speech inputs and then translate
based on the currently received speech. However, segmenting the speech inputs at
unfavorable moments can disrupt the acoustic integrity and adversely affect the
performance of the translation model. Therefore, learning to segment the speech
inputs at those moments that are beneficial for the translation model to
produce high-quality translation is the key to SimulST. Existing SimulST
methods, either using the fixed-length segmentation or external segmentation
model, always separate segmentation from the underlying translation model,
where the gap results in segmentation outcomes that are not necessarily
beneficial for the translation process. In this paper, we propose
Differentiable Segmentation (DiSeg) for SimulST to directly learn segmentation
from the underlying translation model. DiSeg makes hard segmentation
differentiable through the proposed expectation training, enabling it to be
jointly trained with the translation model and thereby learn
translation-beneficial segmentation. Experimental results demonstrate that
DiSeg achieves state-of-the-art performance and exhibits superior segmentation
capability.
Comment: Accepted at ACL 2023 Findings.