60 research outputs found

    Deep Clustering and Deep Network Compression

    The use of deep learning has grown rapidly in recent years, making it a much-discussed topic across a diverse range of fields, especially computer vision, text mining, and speech recognition. Deep learning methods have proven to be robust in representation learning and have achieved extraordinary results. Their success is primarily due to the ability of deep learning to discover and automatically learn feature representations by mapping input data into abstract and composite representations in a latent space. Deep learning’s ability to deal with high-level representations from data has inspired us to make use of learned representations, aiming to enhance unsupervised clustering and to evaluate the characteristic strength of internal representations in order to compress and accelerate deep neural networks. Traditional clustering algorithms attain limited performance as the dimensionality increases. Therefore, the ability to extract high-level representations provides beneficial components that can support such clustering algorithms. In this work, we first present DeepCluster, a clustering approach embedded in a deep convolutional auto-encoder (DCAE). We introduce two clustering methods, namely DCAE-Kmeans and DCAE-GMM. DeepCluster groups data points into clusters in the latent space through a joint-cost function that simultaneously optimizes the clustering objective and the DCAE objective, producing stable representations that are appropriate for the clustering process. Both qualitative and quantitative evaluations of the proposed methods are reported, showing the efficiency of deep clustering on several public datasets in comparison to the previous state-of-the-art methods. Following this, we propose a new version of the DeepCluster model to include varying degrees of discriminative power. This introduces a mechanism which enables the imposition of regularization techniques and the involvement of a supervision component. 
The key idea of our approach is to distinguish the discriminatory power of numerous structures when searching for a compact structure to form robust clusters. The effectiveness of injecting various levels of discriminatory powers into the learning process is investigated alongside the exploration and analytical study of the discriminatory power obtained through the use of two discriminative attributes: data-driven discriminative attributes with the support of regularization techniques, and supervision discriminative attributes with the support of the supervision component. An evaluation is provided on four different datasets. The use of neural networks in various applications is accompanied by a dramatic increase in computational costs and memory requirements. Making use of the characteristic strength of learned representations, we propose an iterative pruning method that simultaneously identifies the critical neurons and prunes the model during training without involving any pre-training or fine-tuning procedures. We introduce a majority voting technique to compare the activation values among neurons and assign a voting score to evaluate their importance quantitatively. This mechanism effectively reduces model complexity by eliminating the less influential neurons and aims to determine a subset of the whole model that can represent the reference model with much fewer parameters within the training process. Empirically, we demonstrate that our pruning method is robust across various scenarios, including fully-connected networks (FCNs), sparsely-connected networks (SCNs), and convolutional neural networks (CNNs), using two public datasets. Moreover, we also propose a novel framework to measure the importance of individual hidden units by computing a measure of relevance to identify the most critical filters and prune them to compress and accelerate CNNs. 
Unlike existing methods, we introduce the use of the activation of feature maps to detect valuable information and the essential semantic parts, with the aim of evaluating the importance of feature maps, inspired by novel neural network interpretability. A majority voting technique based on the degree of alignment between a semantic concept and individual hidden unit representations is utilized to evaluate feature maps’ importance quantitatively. We also propose a simple yet effective method to estimate new convolution kernels based on the remaining crucial channels to accomplish effective CNN compression. Experimental results show the effectiveness of our filter selection criteria, which outperform the state-of-the-art baselines. To conclude, we present a comprehensive, detailed review of time-series data analysis, with emphasis on deep time-series clustering (DTSC), and a founding contribution to the area of applying deep clustering to time-series data by presenting the first case study in the context of movement behavior clustering utilizing the DeepCluster method. The results are promising, showing that the latent space encodes sufficient patterns to facilitate accurate clustering of movement behaviors. Finally, we identify the state of the art and present an outlook on this important field of DTSC from five important perspectives.
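
The joint-cost idea in the DeepCluster abstract above (optimizing a reconstruction objective and a clustering objective together) can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything here is an assumption for illustration: the function name `joint_clustering_loss`, the weighting factor `lam`, the use of mean squared error for reconstruction, and a k-means-style nearest-centroid term in the latent space. The arrays `x_rec` and `z` stand in for the decoder output and latent codes that an autoencoder would produce.

```python
import numpy as np

def joint_clustering_loss(x, x_rec, z, centroids, lam=0.1):
    """Hypothetical joint cost: reconstruction loss + lam * k-means-style loss."""
    # Reconstruction term: mean squared error between input and decoder output.
    rec = np.mean((x - x_rec) ** 2)
    # Squared Euclidean distance from every latent point to every centroid, shape (n, k).
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Hard assignment of each point to its nearest centroid.
    assign = d2.argmin(axis=1)
    # Clustering term: mean squared distance to the assigned centroid.
    clu = d2[np.arange(len(z)), assign].mean()
    return rec + lam * clu, assign

# Toy usage with made-up values (not data from the thesis).
x = np.zeros((2, 3))
x_rec = np.ones((2, 3))
z = np.array([[0.0, 0.0], [2.0, 2.0]])
centroids = np.array([[0.0, 0.0], [2.0, 3.0]])
loss, assign = joint_clustering_loss(x, x_rec, z, centroids)
print(loss, assign)
```

In a real training loop both terms would be differentiated with respect to the encoder and decoder weights; a Gaussian-mixture variant would replace the hard nearest-centroid term with a likelihood term.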

    Simulation and implementation of novel deep learning hardware architectures for resource constrained devices

    Corey Lammie designed mixed-signal memristive-complementary metal–oxide–semiconductor (CMOS) and field-programmable gate array (FPGA) hardware architectures, which were used to reduce the power and resource requirements of Deep Learning (DL) systems during both inference and training. Disruptive design methodologies, such as those explored in this thesis, can be used to facilitate the design of next-generation DL systems.

    Artificial intelligence within the interplay between natural and artificial computation: Advances in data science, trends and applications

    Artificial intelligence and all its supporting tools, e.g. machine and deep learning in computational intelligence-based systems, are rebuilding our society (economy, education, life-style, etc.) and promising a new era for the social welfare state. In this paper we summarize recent advances in data science and artificial intelligence within the interplay between natural and artificial computation. Recent works published in the latter field and the state of the art are reviewed in a comprehensive and self-contained way to provide a baseline framework for the international community in artificial intelligence. Moreover, this paper aims to provide a complete analysis and some relevant discussions of the current trends and insights within several theoretical and application fields covered in the essay, from theoretical models in artificial intelligence and machine learning to the most prospective applications in robotics, neuroscience, brain computer interfaces, medicine and society, in general. BMS - Pfizer (U01 AG024904). Spanish Ministry of Science, projects: TIN2017-85827-P, RTI2018-098913-B-I00, PSI2015-65848-R, PGC2018-098813-B-C31, PGC2018-098813-B-C32, RTI2018-101114-B-I, TIN2017-90135-R, RTI2018-098743-B-I00 and RTI2018-094645-B-I00; the FPU program (FPU15/06512, FPU17/04154) and Juan de la Cierva (FJCI-2017–33022). Autonomous Government of Andalusia (Spain) projects: UMA18-FEDERJA-084. Consellería de Cultura, Educación e Ordenación Universitaria of Galicia: ED431C2017/12, accreditation 2016–2019, ED431G/08, ED431C2018/29, Comunidad de Madrid, Y2018/EMT-5062 and grant ED431F2018/02. PPMI, a public-private partnership, is funded by The Michael J. Fox Foundation for Parkinson’s Research and funding partners, including Abbott, Biogen Idec, F. Hoffman-La Roche Ltd., GE Healthcare, Genentech and Pfizer Inc.

    From Fully-Supervised Single-Task to Semi-Supervised Multi-Task Deep Learning Architectures for Segmentation in Medical Imaging Applications

    Medical imaging is routinely performed in clinics worldwide for the diagnosis and treatment of numerous medical conditions in children and adults. With the advent of these medical imaging modalities, radiologists can visualize both the structure of the body as well as the tissues within the body. However, analyzing these high-dimensional (2D/3D/4D) images demands a significant amount of time and effort from radiologists. Hence, there is an ever-growing need for medical image computing tools to extract relevant information from the image data to help radiologists perform efficiently. Image analysis based on machine learning has pivotal potential to improve the entire medical imaging pipeline, providing support for clinical decision-making and computer-aided diagnosis. To be effective in addressing challenging image analysis tasks such as classification, detection, registration, and segmentation, specifically for medical imaging applications, deep learning approaches have shown significant improvement in performance. While deep learning has shown its potential in a variety of medical image analysis problems including segmentation, motion estimation, etc., generalizability is still an unsolved problem and many of these successes are achieved at the cost of a large pool of datasets. For most practical applications, getting access to a copious dataset can be very difficult, often impossible. Annotation is tedious and time-consuming. This cost is further amplified when annotation must be done by a clinical expert in medical imaging applications. Additionally, the applications of deep learning in the real-world clinical setting are still limited due to the lack of reliability caused by the limited prediction capabilities of some deep learning models. Moreover, while using a CNN in an automated image analysis pipeline, it’s critical to understand which segmentation results are problematic and require further manual examination. 
To this extent, the estimation of uncertainty calibration in a semi-supervised setting for medical image segmentation is still rarely reported. This thesis focuses on developing and evaluating optimized machine learning models for a variety of medical imaging applications, ranging from fully-supervised, single-task learning to semi-supervised, multi-task learning that makes efficient use of annotated training data. The contributions of this dissertation are as follows: (1) developing a fully-supervised, single-task transfer learning for the surgical instrument segmentation from laparoscopic images; and (2) utilizing supervised, single-task, transfer learning for segmenting and digitally removing the surgical instruments from endoscopic/laparoscopic videos to allow the visualization of the anatomy being obscured by the tool. The tool removal algorithms use a tool segmentation mask and either instrument-free reference frames or previous instrument-containing frames to fill in (inpaint) the instrument segmentation mask; (3) developing fully-supervised, single-task learning via efficient weight pruning and learned group convolution for accurate left ventricle (LV), right ventricle (RV) blood pool and myocardium localization and segmentation from 4D cine cardiac MR images; (4) demonstrating the use of our fully-supervised memory-efficient model to generate dynamic patient-specific right ventricle (RV) models from cine cardiac MRI dataset via an unsupervised learning-based deformable registration field; and (5) integrating a Monte Carlo dropout into our fully-supervised memory-efficient model with inherent uncertainty estimation, with the overall goal to estimate the uncertainty associated with the obtained segmentation and error, as a means to flag regions that feature less than optimal segmentation results; (6) developing semi-supervised, single-task learning via self-training (through meta pseudo-labeling) in concert with a Teacher network that instructs the Student 
network by generating pseudo-labels given unlabeled input data; (7) proposing largely-unsupervised, multi-task learning to demonstrate the power of a simple combination of a disentanglement block, variational autoencoder (VAE), generative adversarial network (GAN), and a conditioning layer-based reconstructor for performing two of the most critical tasks in medical imaging: segmentation of cardiac structures and reconstruction of the cine cardiac MR images; (8) demonstrating the use of 3D semi-supervised, multi-task learning for jointly learning multiple tasks in a single backbone module: uncertainty estimation, geometric shape generation, and cardiac anatomical structure segmentation of the left atrial cavity from 3D Gadolinium-enhanced magnetic resonance (GE-MR) images. This dissertation summarizes the impact of the contributions of our work in terms of demonstrating the adaptation and use of deep learning architectures featuring different levels of supervision to build a variety of image segmentation tools and techniques that can be used across a wide spectrum of medical image computing applications centered on facilitating and promoting the widespread computer-integrated diagnosis and therapy data science
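
Contribution (5) in the abstract above mentions Monte Carlo dropout as a way to estimate segmentation uncertainty and flag unreliable regions. A minimal sketch of that general technique follows; it is not the thesis's model. The two-layer network, the weight arrays `w1` and `w2`, the function name `mc_dropout_predict`, and the dropout rate are all hypothetical, and each row of `x` stands in for the features of one pixel.

```python
import numpy as np

def mc_dropout_predict(x, w1, w2, p_drop=0.5, n_samples=50, seed=0):
    """Hypothetical MC dropout: keep dropout active at inference, average many passes."""
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_samples):
        # Hidden layer with ReLU activation.
        h = np.maximum(x @ w1, 0.0)
        # Sample a fresh dropout mask on every pass (inverted-dropout scaling).
        mask = rng.random(h.shape) >= p_drop
        h = h * mask / (1.0 - p_drop)
        # Sigmoid output: per-pixel foreground probability.
        logits = h @ w2
        probs.append(1.0 / (1.0 + np.exp(-logits)))
    probs = np.stack(probs)
    # Predictive mean is the estimate; variance across passes is the uncertainty.
    return probs.mean(axis=0), probs.var(axis=0)

# Toy usage with random weights (illustrative only).
rng = np.random.default_rng(1)
x = rng.normal(size=(6, 4))
w1 = rng.normal(size=(4, 8))
w2 = rng.normal(size=(8, 1))
mean, var = mc_dropout_predict(x, w1, w2)
print(mean.ravel(), var.ravel())
```

Pixels whose variance exceeds a chosen threshold could then be flagged for manual review; the threshold itself would have to be calibrated on validation data.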

    WLAN-paikannuksen elinkaaren tukeminen (Supporting the lifecycle of WLAN positioning)

    The advent of GPS positioning at the turn of the millennium provided consumers with worldwide access to outdoor location information. For the purposes of indoor positioning, however, the GPS signal rarely penetrates buildings well enough to maintain the same level of positioning granularity as outdoors. Arriving around the same time, wireless local area networks (WLAN) have gained widespread support both in terms of infrastructure deployments and client proliferation. A promising approach to bridge the location context then has been positioning based on WLAN signals. In addition to being readily available in most environments needing support for location information, the adoption of a WLAN positioning system is financially low-cost compared to dedicated infrastructure approaches, partly due to operating on an unlicensed frequency band. Furthermore, the accuracy provided by this approach is enough for a wide range of location-based services, such as navigation and location-aware advertisements. In spite of this attractive proposition and extensive research in both academia and industry, WLAN positioning has yet to become the de facto choice for indoor positioning. This is despite over 20 000 publications and the foundation of several companies. The main reasons for this include: (i) the cost of deployment, and re-deployment, which is often significant, if not prohibitive, in terms of work hours; (ii) the complex propagation of the wireless signal, which -- through interaction with the environment -- renders it inherently stochastic; (iii) the use of an unlicensed frequency band, which means the wireless medium faces fierce competition by other technologies, and even unintentional radiators, that can impair traffic in unforeseen ways and impact positioning accuracy. 
This thesis addresses these issues by developing novel solutions for reducing the effort of deployment, including optimizing the indoor location topology for the use of WLAN positioning, as well as automatically detecting sources of cross-technology interference. These contributions pave the way for WLAN positioning to become as ubiquitous as the underlying technology. GPS positioning was opened to public use at the turn of the millennium, and since then it has been available for outdoor positioning anywhere in the world. Indoors, however, the GPS signal rarely penetrates buildings well enough to offer comparable positioning accuracy. Wireless local area networks (WLAN), including access points and client devices, became widespread rapidly around the same time. The use of these networks' signals has therefore offered promising opportunities for indoor positioning from the outset. Most environments already have WLAN networks in place, so deploying positioning is inexpensive compared with systems that require dedicated hardware. This is partly due to the unlicensed frequency band, which enables affordable terminal devices. The accuracy offered by WLAN positioning is also sufficient for many location-based services, such as navigation and location-aware advertisements. Despite this promising starting point and extensive research, WLAN positioning has not managed to establish itself as the primary indoor positioning method. There has been no lack of effort: over the years, more than 20 000 scientific articles have been published and several companies founded. There are several reasons for this. First, setting up and maintaining positioning takes time and effort. Second, the propagation of the wireless signal and its interaction with the environment are highly complex, which makes modeling difficult. 
Third, different technologies and devices compete for use of the unlicensed frequency band, which leads to sporadic communication interference that affects positioning accuracy. The dissertation presents new methods that can significantly reduce the installation costs of a positioning system, automatically partition the environment for WLAN positioning, and identify potential sources of wireless interference. These advances help WLAN positioning become part of everyday use.

    Investigating a deep learning approach to real-time air quality prediction and visualisation on UK highways

    The construction of intercity highways by the United Kingdom (UK) government has resulted in a progressive increase in vehicle emissions and pollution from noise, dust, and vibrations amid growing concerns about air pollution. Existing roadside pollution monitoring devices have faced limitations due to their fixed locations, limited sensitivity, and inability to capture the full spatial variability, which can result in less accurate measurements of transient and fine-scale pollutants like nitrogen oxides and particulate matter. Reports on regional highways across the country are based on a limited number of fixed monitoring stations that are sometimes located far from the highway. These periodic and coarse-grained measurements cause inefficient highway air quality reporting, leading to inaccurate air quality forecasts. A multi-target neural network is a type of machine learning model that offers the advantage of simultaneously predicting multiple pollutants, enhancing predictive accuracy and efficiency by capturing complex interdependencies among various air quality parameters. The potential of this and similar multi-target prediction techniques is yet to be fully exploited in the air quality space due to the unavailability of the right data set. To address these limitations, this doctoral thesis proposes and implements a framework which adopts cutting-edge digital technologies such as Internet of Things, Big Data and Deep Learning for a more efficient way of capturing and forecasting traffic-related air pollution (TRAP). The empirical component of the study involves a detailed comparative analysis of advanced predictive models, incorporating an enriched dataset that includes road elevation, vehicle emission factors, and background maps, alongside traditional traffic flow, weather, and pollution data. The research adopts a multi-target regression approach to forecast concentrations of NO2, PM2.5, and PM10 across multiple time steps. 
Various models were tested, with Fastai's tabular model, Prophet's time-series model, and scikit-learn's multioutput regressor being central to the experimentation. The Fastai model demonstrated superior performance, evidenced by its root-mean-square error (RMSE) scores for each pollutant. Statistical analysis using the Friedman and Wilcoxon tests confirmed that the Fastai model's advantage was statistically significant, further supported by an algorithmic audit that identified key features contributing to the model's predictive power. This doctoral thesis not only advances the methodology for air quality monitoring and forecasting along highways but also lays the groundwork for future research aimed at refining air quality assessment practices and enhancing environmental health standards.
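
The per-pollutant RMSE evaluation mentioned in the abstract above can be sketched as follows. This is only an illustration of the metric for a multi-target regression output, assuming one column per pollutant; the function name `rmse_per_target` and the numbers in the usage example are made up and are not results from the thesis.

```python
import numpy as np

def rmse_per_target(y_true, y_pred, names):
    """Root-mean-square error computed separately for each target column."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Average squared errors over samples (axis 0), keeping one value per target.
    errors = np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))
    return dict(zip(names, errors.tolist()))

# Toy usage: two time steps, three pollutants (illustrative values only).
y_true = [[10.0, 5.0, 8.0], [20.0, 7.0, 12.0]]
y_pred = [[13.0, 5.0, 8.0], [16.0, 7.0, 10.0]]
print(rmse_per_target(y_true, y_pred, ["NO2", "PM2.5", "PM10"]))
```

Reporting one score per pollutant, rather than a single pooled score, keeps a large error on one target from being hidden by small errors on the others.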

    Behavior quantification as the missing link between fields: Tools for digital psychiatry and their role in the future of neurobiology

    The great behavioral heterogeneity observed between individuals with the same psychiatric disorder and even within one individual over time complicates both clinical practice and biomedical research. However, modern technologies are an exciting opportunity to improve behavioral characterization. Existing psychiatry methods that are qualitative or unscalable, such as patient surveys or clinical interviews, can now be collected at a greater capacity and analyzed to produce new quantitative measures. Furthermore, recent capabilities for continuous collection of passive sensor streams, such as phone GPS or smartwatch accelerometer, open avenues of novel questioning that were previously entirely unrealistic. Their temporally dense nature enables a cohesive study of real-time neural and behavioral signals. To develop comprehensive neurobiological models of psychiatric disease, it will be critical to first develop strong methods for behavioral quantification. There is huge potential in what can theoretically be captured by current technologies, but this in itself presents a large computational challenge, one that will necessitate new data processing tools, new machine learning techniques, and ultimately a shift in how interdisciplinary work is conducted. In my thesis, I detail research projects that take different perspectives on digital psychiatry, subsequently tying ideas together with a concluding discussion on the future of the field. I also provide software infrastructure where relevant, with extensive documentation. Major contributions include scientific arguments and proof of concept results for daily free-form audio journals as an underappreciated psychiatry research datatype, as well as novel stability theorems and pilot empirical success for a proposed multi-area recurrent neural network architecture. Comment: PhD thesis cop

    AI and IoT Meet Mobile Machines

    Infrastructure construction is a cornerstone of society and a catalyst for the economy. Therefore, improving the efficiency of mobile machinery and reducing its cost of use have enormous economic benefits in the vast and growing construction market. In this thesis, I envision a novel concept, the smart working site, to increase productivity through fleet management from multiple aspects, using Artificial Intelligence (AI) and the Internet of Things (IoT).

    A review of natural language processing in contact centre automation

    Contact centres have been highly valued by organizations for a long time. However, the COVID-19 pandemic has highlighted their critical importance in ensuring business continuity, economic activity, and quality customer support. The pandemic has led to an increase in customer inquiries related to payment extensions, cancellations, and stock inquiries, each with varying degrees of urgency. To address this challenge, organizations have taken the opportunity to re-evaluate the function of contact centres and explore innovative solutions. Next-generation platforms that incorporate machine learning techniques and natural language processing, such as self-service voice portals and chatbots, are being implemented to enhance customer service. These platforms offer robust features that equip customer agents with the necessary tools to provide exceptional customer support. Through an extensive review of existing literature, this paper aims to uncover research gaps and explore the advantages of transitioning to a contact centre that utilizes natural language solutions as the norm. Additionally, we will examine the major challenges faced by contact centre organizations and offer reco