12,164 research outputs found

    Undergraduate Catalog of Studies, 2023-2024

    Unsupervised learning-based approach for detecting 3D edges in depth maps

    3D edge features, which represent the boundaries between different objects or surfaces in a 3D scene, are crucial for many computer vision tasks, including object recognition, tracking, and segmentation. They also have numerous real-world applications in robotics, such as vision-guided grasping and manipulation of objects. To extract these features from noisy real-world depth data, reliable 3D edge detectors are indispensable. However, currently available 3D edge detection methods are either highly parameterized or require ground truth labelling, which makes them challenging to use in practical applications. To this end, we present a new 3D edge detection approach using unsupervised classification. Our method learns features from depth maps at three different scales using an encoder-decoder network, from which edge-specific features are extracted. These edge features are then clustered using learning to classify each point as an edge or not. The proposed method has two key benefits. First, it eliminates the need for manual fine-tuning of data-specific hyper-parameters and automatically selects threshold values for edge classification. Second, the method does not require any labelled training data, unlike many state-of-the-art methods that require supervised training with extensive hand-labelled datasets. The proposed method is evaluated on five benchmark datasets with single and multi-object scenes, and compared with four state-of-the-art edge detection methods from the literature. Results demonstrate that the proposed method achieves competitive performance, despite not using any labelled data or relying on hand-tuning of key parameters.

    Research on detection of transmission line corridor external force object containing random feature targets

    With the objective of achieving “double carbon,” the power grid is placing greater importance on the security of transmission lines. Transmission line corridors present complex situations involving external force targets and irregularly featured objects, including smoke. For this reason, in this paper, the high-performance YOLOX-S model is selected for transmission line corridor external force object detection and improved to enhance its multi-object detection capability and irregular feature extraction capability. Firstly, to enhance the perception of external force objects in complex environments, we improve the feature output capability by adding the global context block after the output of the backbone. Then, we integrate the convolutional block attention module into the feature fusion operation to enhance the recognition of objects with random features among the external force targets by incorporating an attention mechanism. Finally, we utilize EIoU to enhance the accuracy of object detection boxes, enabling the successful detection of external force targets in transmission line corridors. Through training and validating the model with the established external force dataset, the improved model demonstrates the capability to successfully detect external force objects and achieves favorable results in multi-class target detection. While there is improvement in the detection capability of external force objects with random features, the results indicate the need to enhance smoke recognition, particularly in further distinguishing between smoke and fog.
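
As a rough illustration of the EIoU box-regression term mentioned in the abstract above, the following sketch computes the commonly cited form of the loss (1 - IoU plus normalized centre, width, and height penalties) for two axis-aligned boxes. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabiliser are assumptions made for this example; the paper's own implementation inside YOLOX-S may differ.

```python
def eiou_loss(box, gt, eps=1e-9):
    """EIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; not the paper's code.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt

    # Widths and heights of the predicted and ground-truth boxes.
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared centre distance, normalised by the enclosing box diagonal.
    dx = (x1 + x2) / 2.0 - (gx1 + gx2) / 2.0
    dy = (y1 + y2) / 2.0 - (gy1 + gy2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalised by the enclosing box sides.
    width_term = (w - gw) ** 2 / (cw * cw + eps)
    height_term = (h - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

Identical boxes give a loss close to zero, while disjoint boxes give a loss above one because the IoU term vanishes and the centre penalty is positive.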

    Neuromodulatory effects on early visual signal processing

    Understanding how the brain processes information and generates simple to complex behavior constitutes one of the core objectives in systems neuroscience. However, when studying different neural circuits, their dynamics, and their interactions, researchers often assume fixed connectivity, overlooking a crucial factor - the effect of neuromodulators. Neuromodulators can modulate circuit activity depending on several aspects, such as different brain states or sensory contexts. Therefore, considering the modulatory effects of neuromodulators on the functionality of neural circuits is an indispensable step towards a more complete picture of the brain’s ability to process information. Generally, this issue affects all neural systems; hence this thesis tries to address it with an experimental and computational approach to resolve neuromodulatory effects at the cell-type level in a well-defined system, the mouse retina. In the first study, we established and applied a machine-learning-based classification algorithm to identify individual functional retinal ganglion cell types, which enabled detailed cell type-resolved analyses. We applied the classifier to newly acquired data of light-evoked retinal ganglion cell responses and successfully identified their functional types. Here, the cell type-resolved analysis revealed that a particular principle of efficient coding applies to all types in a similar way. In a second study, we focused on the issue of inter-experimental variability that can occur during the process of pooling datasets. As a result, further downstream analyses may be complicated by the subtle variations between the individual datasets. To tackle this, we proposed a theoretical framework based on an adversarial autoencoder with the objective of removing inter-experimental variability from the pooled dataset, while preserving the underlying biological signal of interest.
In the last study of this thesis, we investigated the functional effects of the neuromodulator nitric oxide on the retinal output signal. To this end, we used our previously developed retinal ganglion cell type classifier to unravel type-specific effects and established a paired recording protocol to account for type-specific time-dependent effects. We found that certain retinal ganglion cell types showed adaptational type-specific changes and that nitric oxide distinctly modulated a particular group of retinal ganglion cells. In summary, I first present several experimental and computational methods that allow the study of functional neuromodulatory effects on the retinal output signal in a cell type-resolved manner and, second, use these tools to demonstrate their feasibility for studying the neuromodulator nitric oxide.

    Applications of Deep Learning Models in Financial Forecasting

    In financial markets, deep learning techniques sparked a revolution, reshaping conventional approaches and amplifying predictive capabilities. This thesis explored the applications of deep learning models to unravel insights and methodologies aimed at advancing financial forecasting. The crux of the research problem lies in the applications of predictive models within financial domains, characterised by high volatility and uncertainty. This thesis investigated the application of advanced deep-learning methodologies in the context of financial forecasting, addressing the challenges posed by the dynamic nature of financial markets. These challenges were tackled by exploring a range of techniques, including convolutional neural networks (CNNs), long short-term memory networks (LSTMs), autoencoders (AEs), and variational autoencoders (VAEs), along with approaches such as encoding financial time series into images. Through analysis, methodologies such as transfer learning, convolutional neural networks, long short-term memory networks, generative modelling, and image encoding of time series data were examined. These methodologies collectively offered a comprehensive toolkit for extracting meaningful insights from financial data. The present work investigated the practicality of a deep learning CNN-LSTM model within the Directional Change (DC) framework to predict significant DC events, a task crucial for timely decision-making in financial markets. Furthermore, the potential of autoencoders and variational autoencoders to enhance financial forecasting accuracy and remove noise from financial time series data was explored. Leveraging their capacity within financial time series, these models offered promising avenues for improved data representation and subsequent forecasting. To further contribute to financial prediction capabilities, a deep multi-model was developed that harnessed the power of pre-trained computer vision models.
This innovative approach aimed to predict the VVIX, utilising the cross-disciplinary synergy between computer vision and financial forecasting. By integrating knowledge from these domains, novel insights into the prediction of market volatility were provided.

    Deep ensemble model-based moving object detection and classification using SAR images

    In recent decades, image processing and computer vision models have played a vital role in moving object detection in synthetic aperture radar (SAR) images. Capturing moving objects in SAR images is a difficult task. In this study, a new automated model for detecting moving objects is proposed using SAR images. The proposed model has four main steps, namely, preprocessing, segmentation, feature extraction, and classification. Initially, the input SAR image is pre-processed using a histogram equalization technique. Then, the weighted Otsu-based segmentation algorithm is applied for segmenting the object regions from the pre-processed images. When using the weighted Otsu, the segmented grayscale images are not only clear but also retain the detailed features of grayscale images. Next, feature extraction is carried out by gray-level co-occurrence matrix (GLCM), median binary patterns (MBPs), and additive harmonic mean estimated local Gabor binary pattern (AHME-LGBP). The final step is classification using deep ensemble models, where the objects are classified by an ensemble deep learning technique that combines the bidirectional long short-term memory (Bi-LSTM), recurrent neural network (RNN), and improved deep belief network (IDBN) models, trained with the features extracted previously. The combined models increase the accuracy of the results significantly. Furthermore, ensemble modeling reduces the variance and modeling method bias, which decreases the chances of overfitting. Compared to a single contributing model, ensemble models perform better and make better predictions. Additionally, an ensemble lessens the spread or dispersion of the model performance and prediction accuracy. Finally, the performance of the proposed model is compared with that of conventional models with respect to different measures.
In the mean-case scenario, the proposed ensemble model has a minimum error value of 0.032, which is better than that of the other models. In the median- and best-case scenarios, the ensemble model has lower error values of 0.029 and 0.015, respectively.
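
For readers unfamiliar with the segmentation step named in the abstract above, the following is a minimal sketch of the classic (unweighted) Otsu threshold, which picks the gray level that maximises the between-class variance of the histogram. It is not the weighted variant the authors describe, whose weighting scheme the abstract does not specify. The function name `otsu_threshold` and the flat list-of-integers input are assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximising between-class variance.

    Pixels with value <= t form one class, the rest the other.
    Classic Otsu only; hypothetical helper for illustration.
    """
    # Build the gray-level histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))

    w0 = 0          # number of pixels in the lower class
    sum0 = 0.0      # sum of gray levels in the lower class
    best_t, best_var = 0, -1.0

    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2.
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t

    return best_t
```

On a clearly bimodal input, any returned threshold falls between the two modes, so thresholding at `t` separates them.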

    Histopathology image classification: highlighting the gap between manual analysis and AI automation

    The field of histopathological image analysis has evolved significantly with the advent of digital pathology, leading to the development of automated models capable of classifying tissues and structures within diverse pathological images. Artificial intelligence algorithms, such as convolutional neural networks, have shown remarkable capabilities in pathology image analysis tasks, including tumor identification, metastasis detection, and patient prognosis assessment. However, traditional manual analysis methods have generally shown low accuracy in diagnosing colorectal cancer using histopathological images. This study investigates the use of AI in image classification and image analytics on histopathological images, alongside the histogram of oriented gradients method. The study develops an AI-based architecture for image classification using histopathological images, aiming to achieve high performance with less complexity through specific parameters and layers. In this study, we investigate the complicated state of histopathological image classification, explicitly focusing on categorizing nine distinct tissue types. Our research used open-source multi-centered image datasets that included 100,000 non-overlapping images from 86 patients for training and 7180 non-overlapping images from 50 patients for testing. The study compares two distinct approaches, training artificial intelligence-based algorithms and manual machine learning models, to automate tissue classification. This research comprises two primary classification tasks: binary classification, distinguishing between normal and tumor tissues, and multi-class classification, encompassing nine tissue types: adipose, background, debris, stroma, lymphocytes, mucus, smooth muscle, normal colon mucosa, and tumor. Our findings show that artificial intelligence-based systems can achieve 0.91 and 0.97 accuracy in binary and multi-class classifications, respectively.
In comparison, the histogram of oriented gradients features and the Random Forest classifier achieved accuracy rates of 0.75 and 0.44 in binary and multi-class classifications, respectively. Our artificial intelligence-based methods are generalizable, allowing them to be integrated into histopathology diagnostics procedures and improve diagnostic accuracy and efficiency. The CNN model outperforms existing machine learning techniques, demonstrating its potential to improve the precision and effectiveness of histopathology image analysis. This research emphasizes the importance of maintaining data consistency and applying normalization methods during the data preparation stage for analysis. It particularly highlights the potential of artificial intelligence to assess histopathological images.

    On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse

    The recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people’s lives. Therefore, end users would like to be insured against potential harm. One popular way to achieve this is to provide end users access to algorithmic recourse, which gives end users negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model’s latent space, allowing us to generate counterfactuals that lie in regions with data support. Second, we observe that small changes applied to the recourses prescribed to end users likely invalidate the suggested recourse after being noisily implemented in practice. Motivated by this observation, we design methods for the generation of robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used code-base for counterfactual explanation and algorithmic recourse algorithms and the vast array of evaluation measures in the literature make it difficult to compare the performance of different algorithms. To solve this problem, we provide an open source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new experiments.
In summary, our work contributes to a more reliable interaction of end users and machine learned models by covering fundamental aspects of the recourse process and suggests new solutions towards generating realistic and robust counterfactual explanations for algorithmic recourse.

    Using Image Translation To Synthesize Amyloid Beta From Structural MRI

    Amyloid-beta and brain atrophy are known hallmarks of Alzheimer’s Disease (AD) and can be quantified with positron emission tomography (PET) and structural magnetic resonance imaging (MRI), respectively. PET uses radiotracers that bind to amyloid-beta, whereas MRI can measure brain morphology. PET scans have limitations including cost, invasiveness (they involve injections and ionizing radiation exposure), and limited accessibility, making PET impractical for screening early-onset AD. Conversely, MRI is cheaper, less invasive (free from ionizing radiation), and more widely available; however, it cannot provide the necessary molecular information. There is a known relationship between amyloid-beta and brain atrophy. This thesis aims to synthesize amyloid-beta PET images from structural MRI using image translation, an advanced form of machine learning. The developed models reported high similarity metrics between the real and synthetic PET images and a high degree of accuracy in radiotracer quantification. The results are highly impactful, as they enable amyloid-beta measurements from every MRI, for free.

    Sustainable Collaboration: Federated Learning for Environmentally Conscious Forest Fire Classification in Green Internet of Things (IoT)

    Forests are an invaluable natural resource, playing a crucial role in the regulation of both local and global climate patterns. Additionally, they offer a plethora of benefits such as medicinal plants, food, and non-timber forest products. However, with the growing global population, the demand for forest resources has escalated, leading to a decline in their abundance. The reduction in forest density has detrimental impacts on global temperatures and raises the likelihood of forest fires. To address these challenges, this paper introduces a Federated Learning framework empowered by the Internet of Things (IoT). The proposed framework integrates with an intelligent system, leveraging mounted cameras strategically positioned in areas highly susceptible to forest fires. This integration enables the timely detection and monitoring of forest fire occurrences and plays its part in avoiding major catastrophes. The proposed framework incorporates the Federated Stochastic Gradient Descent (FedSGD) technique to aggregate the global model in the cloud. The dataset employed in this study comprises two classes: fire and non-fire images. This dataset is distributed among five nodes, allowing each node to independently train the model on its own device. Following the local training, the learned parameters are shared with the cloud for aggregation, ensuring a collective and comprehensive global model. The effectiveness of the proposed framework is assessed by comparing its performance metrics with those of recent work. The proposed algorithm achieved an accuracy of 99.27% and stands out by leveraging the concept of collaborative learning. This approach distributes the workload among nodes, relieving the server from excessive burden. Each node is empowered to obtain the best possible model for classification, even if it possesses limited data.
This collaborative learning paradigm enhances the overall efficiency and effectiveness of the classification process, ensuring optimal results in scenarios where data availability may be constrained.
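
The server-side aggregation described in the abstract above can be sketched roughly as follows, assuming the usual gradient-averaging reading of FedSGD: each node reports a gradient computed on its local data, and the server takes one step along the average weighted by node sample counts. The abstract itself speaks of sharing learned parameters; for a single local step with a shared learning rate the two views coincide. The function name `fedsgd_round` and the flat-list representation of weights and gradients are assumptions made for this example, not the paper's implementation.

```python
def fedsgd_round(weights, client_grads, client_sizes, lr):
    """One FedSGD round: size-weighted gradient average, then one SGD step.

    weights       -- current global parameters (list of floats)
    client_grads  -- one gradient list per node, same length as weights
    client_sizes  -- number of local samples held by each node
    lr            -- server learning rate
    Hypothetical helper for illustration only.
    """
    total = sum(client_sizes)
    aggregated = [0.0] * len(weights)

    # Weight each node's gradient by its share of the overall data.
    for grad, size in zip(client_grads, client_sizes):
        share = size / total
        for i, g in enumerate(grad):
            aggregated[i] += share * g

    # Apply a single gradient-descent step to the global model.
    return [w - lr * a for w, a in zip(weights, aggregated)]
```

With five nodes, as in the study, `client_grads` and `client_sizes` would each have five entries; nodes with more local images contribute proportionally more to the update.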