
    Surgical Data Science - from Concepts toward Clinical Translation

    Recent developments in data science in general and machine learning in particular have transformed the way experts envision the future of surgery. Surgical Data Science (SDS) is a new research field that aims to improve the quality of interventional healthcare through the capture, organization, analysis, and modeling of data. While an increasing number of data-driven approaches and clinical applications have been studied in the fields of radiological and clinical data science, translational success stories are still lacking in surgery. In this publication, we shed light on the underlying reasons and provide a roadmap for future advances in the field. Based on an international workshop involving leading researchers in the field of SDS, we review current practice, key achievements and initiatives, as well as available standards and tools for a number of topics relevant to the field, namely (1) infrastructure for data acquisition, storage and access in the presence of regulatory constraints, (2) data annotation and sharing, and (3) data analytics. We further complement this technical perspective with (4) a review of currently available SDS products and the translational progress from academia and (5) a roadmap for faster clinical translation and exploitation of the full potential of SDS, based on an international multi-round Delphi process.

    Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense

    Recent progress in deep learning is essentially based on a "big data for small tasks" paradigm, under which massive amounts of data are used to train a classifier for a single narrow task. In this paper, we call for a shift that flips this paradigm upside down. Specifically, we propose a "small data for big tasks" paradigm, wherein a single artificial intelligence (AI) system is challenged to develop "common sense", enabling it to solve a wide range of tasks with little training data. We illustrate the potential power of this new paradigm by reviewing models of common sense that synthesize recent breakthroughs in both machine and human vision. We identify functionality, physics, intent, causality, and utility (FPICU) as the five core domains of cognitive AI with humanlike common sense. When taken as a unified concept, FPICU is concerned with the questions of "why" and "how", beyond the dominant "what" and "where" framework for understanding vision. They are invisible in terms of pixels but nevertheless drive the creation, maintenance, and development of visual scenes. We therefore coin them the "dark matter" of vision. Just as our universe cannot be understood by merely studying observable matter, we argue that vision cannot be understood without studying FPICU. We demonstrate the power of this perspective to develop cognitive AI systems with humanlike common sense by showing how to observe and apply FPICU with little training data to solve a wide range of challenging tasks, including tool use, planning, utility inference, and social learning. In summary, we argue that the next generation of AI must embrace "dark" humanlike common sense for solving novel tasks.

    Comment: For high-quality figures, please refer to http://wellyzhang.github.io/attach/dark.pd

    Investigation of 3D Body Shapes and Robot Control Algorithms for a Virtual Fitting Room

    The electronic version of this thesis does not contain the publications. Virtual fitting constitutes a fundamental element of the developments expected to raise the commercial prosperity of online garment retailers to a new level, as it is expected to reduce the manual labor and physical effort required during the fitting phase. Nevertheless, most previously proposed computer vision and graphics methods have failed to model the human body accurately and realistically, especially when it comes to 3D modeling of the whole body.
The failure is largely related to the huge amounts of data and computation required, which in turn stem mainly from an inability to properly account for simultaneous variations in the body surface. In addition, most of the foregoing techniques cannot render realistic movement representations in real time. This project intends to overcome the aforementioned shortcomings so as to satisfy the requirements of a virtual fitting room. The proposed methodology consists of scanning and performing specific analyses of both the user's body and the prospective garment to be virtually fitted; modeling, extracting measurements, and assigning reference points on them; segmenting the 3D visual data imported from the mannequins; and, finally, superimposing, fitting, and depicting the resulting garment model on the user's body. The project is intended to gather sufficient amounts of visual data using a 3D laser scanner and the Kinect optical camera, and to manage the data in the form of a usable database, in order to experimentally implement the devised algorithms. The latter will provide a realistic visual representation of the garment on the body and enhance the size-advisor system in the context of the virtual fitting room under study.
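The size-advisor step mentioned above can be sketched as a comparison between body measurements extracted from the scan data and a garment size chart. The chart values, tolerance rule, and function names below are illustrative assumptions, not the thesis's actual system:

```python
# Hypothetical size-advisor sketch: pick the garment size whose measurements
# (in cm) deviate least from the scanned body, penalizing sizes that are
# smaller than the body in any dimension. All chart values are assumed.

SIZE_CHART = {
    "S": {"chest": 88, "waist": 72, "hip": 94},
    "M": {"chest": 96, "waist": 80, "hip": 102},
    "L": {"chest": 104, "waist": 88, "hip": 110},
}

def recommend_size(body: dict, chart: dict = SIZE_CHART) -> str:
    """Return the size with the lowest total deviation from the body,
    weighting too-tight dimensions three times as heavily as loose ones."""
    def score(garment):
        total = 0.0
        for key, body_val in body.items():
            diff = garment[key] - body_val
            # a garment tighter than the body is worse than a loose one
            total += abs(diff) if diff >= 0 else 3 * abs(diff)
        return total
    return min(chart, key=lambda size: score(chart[size]))

print(recommend_size({"chest": 95, "waist": 79, "hip": 100}))  # → M
```

A real system would derive the body dictionary from the 3D scan's landmark measurements rather than take it as input directly.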

    Material Management Framework utilizing Near Real-Time Monitoring of Construction Operations

    Materials management is a vital process in the delivery of construction facilities. Studies by the Construction Industry Institute (CII) have demonstrated that materials and installed equipment can constitute 40–70% of the total construction hard cost and affect 80% of the project schedule. Despite its significance, most construction industry sectors suffer from poor material management processes, including inaccurate warehouse records, over-ordering and large surpluses of material at project completion, poor site storage practices, running out of materials, late deliveries, double-handling of components, out-of-specification material, and out-of-sequence deliveries, all of which result in low productivity, delays in construction, and cost overruns. Inefficient material management can be attributed to the complex, unstructured, and dynamic nature of the construction industry, which has not been considered in a large number of studies available in this field. The literature reveals that available computer-based materials management systems focus on (1) integration of the materials management functions and (2) application of Automated Data Collection (ADC) technologies to collect materials localization and tracking data for their computerized materials management systems. Moreover, in studies that focused on applying ADC technologies in construction materials management, positioning and tracking critical resources on construction sites and identifying unique materials received at the job site are the main applications of the technologies used. Although various studies have substantially improved materials management processes in the construction industry, the benefits of considering the dynamic nature of construction (in terms of near real-time progress monitoring using state-of-the-art technologies and techniques) and its integration with a dynamic materials management system have been left out.
    In contrast with other studies, this research presents a construction materials management framework capable of considering the dynamic nature of construction projects. It includes a vital component that monitors project progress in near real-time to estimate the installation and consumption of materials. This framework consists of three models: a "preconstruction model," a "construction model," and a "data analysis and reporting model." The framework enables (1) generation of optimized material delivery schedules based on Material Requirement Planning (MRP) and minimum total cost, (2) issuance of material Purchase Orders (POs) according to the optimized delivery schedules, (3) tracking the status of POs (expediting methods), (4) collection and assessment of material data as it arrives on site, (5) consideration of the inherent dynamics of construction operations by monitoring project progress to update the project schedule and estimate near real-time consumption of materials, and eventually (6) frequent updating of the MRP and the optimized delivery schedule throughout the construction phase. An optimized material delivery schedule and an optimized purchase schedule with the least cost are generated by the preconstruction model to avoid the consequences of early/late purchasing and excess/inadequate purchasing. Accurate assessment of project progress and estimation of installed or consumed materials are essential for an effective construction material management system. The construction model focuses on the collection of near real-time site data using ADC technologies. Project progress is visualized from two different perspectives: comparing as-built with as-planned, and comparing various as-built statuses captured at consecutive points in time.
    Due to recent improvements in digital photography and webcams, which have made this technology more cost-effective and practical for monitoring project progress, digital imaging (including 360° images) is selected and applied for project progress monitoring in the construction (data acquisition) model. In the last model, the data analysis and reporting model, Deep Learning (DL) and image processing algorithms are proposed to visualize and detect actual progress in terms of built elements in near real-time. In contrast with other studies, in which conventional computer vision algorithms are often used to monitor project progress, this research utilizes a deep Convolutional Auto-Encoder (CAE) and a Mask Region-based Convolutional Neural Network (R-CNN) to facilitate vision-based indoor and outdoor progress monitoring of construction operations. The updated project schedule based on the actual progress is the output of this model and serves as the primary input for the developed material management framework to update the MRP and the optimized material delivery and purchase schedules. The applicability of the models in the developed material management framework has been tested through laboratory and field experiments. The results demonstrated the accuracy and capabilities of the developed models in the framework.
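The MRP-update loop described in this abstract can be illustrated with a minimal netting calculation: given planned quantities, near real-time estimates of installed material, on-hand inventory, and open POs, compute what remains to be ordered. The field names and netting rule below are assumptions for illustration, not the dissertation's actual model:

```python
# Illustrative MRP netting sketch (assumed logic, not the dissertation's model):
# net requirement = (planned - installed) - (on-hand inventory + open POs),
# floored at zero so surpluses never produce negative orders.

def net_requirements(planned: dict, installed: dict,
                     on_hand: dict, on_order: dict) -> dict:
    needs = {}
    for material, qty_planned in planned.items():
        remaining = qty_planned - installed.get(material, 0)
        available = on_hand.get(material, 0) + on_order.get(material, 0)
        needs[material] = max(0, remaining - available)
    return needs

orders = net_requirements(
    planned={"rebar_t": 120, "cmu_blocks": 5000},
    installed={"rebar_t": 45, "cmu_blocks": 1800},  # from progress monitoring
    on_hand={"rebar_t": 20, "cmu_blocks": 600},
    on_order={"rebar_t": 30, "cmu_blocks": 0},
)
print(orders)  # → {'rebar_t': 25, 'cmu_blocks': 2600}
```

In the framework above, the "installed" figures would come from the vision-based progress monitoring model, which is what makes the MRP update near real-time.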

    Adaptive Robot Framework: Providing Versatility and Autonomy to Manufacturing Robots Through FSM, Skills and Agents

    The main conclusions that can be extracted from an analysis of the current situation and future trends of the industry, in particular manufacturing plants, are the following: there is a growing need to provide customization of products, a high variation of production volumes, and a downward trend in the availability of skilled operators due to the ageing of the population. Adapting to this new scenario is a challenge for companies, especially small and medium-sized enterprises (SMEs), which are suffering first-hand how their specialization is turning against them. The objective of this work is to provide a tool that can serve as a basis to face these challenges effectively. The presented framework, thanks to its modular architecture, allows focusing on the different needs of each particular company and offers the possibility of scaling the system to future requirements. The platform is divided into three layers, namely: the interface with robot systems, the execution engine, and the application development layer. Taking advantage of the ecosystem provided by this framework, different modules have been developed to face the mentioned challenges of the industry. On the one hand, to address the need for product customization, the integration of tools that increase the versatility of the cell is proposed. An example of such a tool is skill-based programming. By applying this technique, a process can be intuitively adapted to the variations or customizations that each product requires. The use of skills favours the reuse and generalization of developed robot programs. Regarding the variation of production volumes, a system that permits greater mobility and faster reconfiguration is necessary. If a line has a production peak in a certain situation, mechanisms for balancing the load at a reasonable cost are required. In this respect, the architecture allows easy integration of different robotic systems, actuators, sensors, etc.
    In addition, thanks to the developed calibration and set-up techniques, the system can be adapted to new workspaces at an effective time/cost. With respect to the third topic, an agent-based monitoring system is proposed. This module opens up a multitude of possibilities for integrating auxiliary modules of protection and safety for collaboration and interaction between people and robots, something that will be necessary in the not-so-distant future. To demonstrate the advantages and improved adaptability of the developed framework, a series of real use cases is presented. In each of them, a different problem has been solved using the developed skills, demonstrating how easily they adapt to different situations.
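The combination of skills and a finite-state machine described above can be sketched as follows. The skill names, parameters, and transition table are invented for illustration; a real execution engine would command robot hardware rather than return strings:

```python
# Minimal sketch of skill-based execution under an FSM (names are assumptions):
# a Skill is a reusable, parameterized action, and the FSM sequences skills.

class Skill:
    def __init__(self, name, action):
        self.name, self.action = name, action
    def execute(self, **params):
        return self.action(**params)

# Two toy skills; in a real cell these would drive a robot controller.
pick = Skill("pick", lambda pose: f"picked at {pose}")
place = Skill("place", lambda pose: f"placed at {pose}")

# FSM table: state -> (skill, parameters, next state); None ends execution.
fsm = {
    "start": (pick, {"pose": (0.1, 0.2)}, "transfer"),
    "transfer": (place, {"pose": (0.5, 0.2)}, None),
}

def run(fsm, state="start"):
    log = []
    while state is not None:
        skill, params, state = (*fsm[state][:2], fsm[state][2])
        log.append(skill.execute(**params))
    return log

print(run(fsm))  # → ['picked at (0.1, 0.2)', 'placed at (0.5, 0.2)']
```

Because the process logic lives in the FSM table while the skills stay generic, adapting to a product variant means editing parameters in the table, not reprogramming the robot, which is the reuse benefit the abstract claims.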

    Advancing Multi-Modal Deep Learning: Towards Language-Grounded Visual Understanding

    Using deep learning, computer vision now rivals people at object recognition and detection, opening doors to tackle new challenges in image understanding. Among these challenges, understanding and reasoning about language-grounded visual content is of fundamental importance to advancing artificial intelligence. Recently, multiple datasets and algorithms have been created as proxy tasks towards this goal, with visual question answering (VQA) being the most widely studied. In VQA, an algorithm needs to produce an answer to a natural language question about an image. However, our survey of datasets and algorithms for VQA uncovered several sources of dataset bias and sub-optimal evaluation metrics that allowed algorithms to perform well by merely exploiting superficial statistical patterns. In this dissertation, we describe new algorithms and datasets that address these issues. We developed two new datasets and evaluation metrics that enable a more accurate measurement of the abilities of a VQA model, and also expand VQA to include new abilities, such as reading text, handling out-of-vocabulary words, and understanding data visualization. We also created new algorithms for VQA that have helped advance the state of the art, including an algorithm that surpasses humans on two different chart question answering datasets about bar charts, line graphs, and pie charts. Finally, we provide a holistic overview of several yet-unsolved challenges in not only VQA but vision and language research at large. Despite enormous progress, we find that a robust understanding and integration of vision and language is still an elusive goal, and much of the progress may be misleading due to dataset bias, superficial correlations, and flaws in standard evaluation metrics. We carefully study and categorize these issues for several vision and language tasks and outline several possible paths towards the development of safe, robust, and trustworthy AI for language-grounded visual understanding.
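For reference, the consensus accuracy used by the original VQA benchmark, one of the standard metrics this line of work critiques, counts an answer as fully correct if at least 3 of the 10 human annotators gave it (the string normalization applied in the official evaluation is omitted here for brevity):

```python
# Consensus accuracy from the original VQA benchmark: an answer is fully
# correct if at least 3 of the 10 human annotators agree with it; partial
# credit is given below that threshold. Metrics like this, combined with
# dataset bias, are what allow models to score well via surface statistics.

def vqa_accuracy(predicted: str, human_answers: list) -> float:
    matches = sum(1 for a in human_answers if a == predicted.lower().strip())
    return min(matches / 3.0, 1.0)

humans = ["yes"] * 8 + ["no"] * 2
print(vqa_accuracy("yes", humans))              # → 1.0
print(round(vqa_accuracy("no", humans), 2))     # → 0.67
```

A model that always answers "yes" to yes/no questions can score well under this metric on a biased dataset, which illustrates the evaluation flaw the dissertation targets.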

    Determination of Elevations for Excavation Operations Using Drone Technologies

    Using deep learning technology to rapidly estimate depth information from a single image has been studied in many situations, but it is new in construction site elevation determination, and the challenges are not limited to the lack of datasets. This dissertation presents the results of utilizing drone ortho-imaging and deep learning to estimate construction site elevations for excavation operations. It provides two flexible options for fast elevation determination: a low-high-ortho-image-pair-based method and a single-frame-ortho-image-based method. The success of this research project advanced the use of ortho-imaging in construction surveying, strengthened CNNs (convolutional neural networks) to work with large-scale images, and contributed to dense image pixel matching across different scales. This research project has three major tasks. First, the high-resolution ortho-image and elevation-map datasets were acquired using the low-high-ortho-image-pair-based 3D-reconstruction method. In detail, a vertical drone path is designed first to capture a 2:1 scale ortho-image pair of a construction site at two different altitudes. Then, to simultaneously match the pixel pairs and determine elevations, the developed pixel matching and virtual elevation algorithm provides the candidate pixel pairs in each virtual plane for matching, and four-scaling patch feature descriptors are used to match them. Experimental results show that 92% of pixels in the pixel grid were strongly matched, with elevation accuracy within ±5 cm. Second, the acquired high-resolution datasets were applied to train and test the ortho-image encoder and elevation-map decoder, where max-pooling and up-sampling layers link the ortho-image and elevation-map in the same pixel coordinates.
    This convolutional encoder-decoder was supplemented with an input ortho-image overlapping disassembling and output elevation-map assembling algorithm to crop the high-resolution datasets into multiple small-patch datasets for model training and testing. Experimental results indicated that the 128×128-pixel small patch had the best elevation estimation performance: 21.22% of the selected points were exactly matched with the ground truth, and 31.21% were accurately matched within ±5 cm. Finally, vegetation was identified in the high-resolution ortho-images and removed from the corresponding elevation-maps using the developed CNN-based image classification model and the vegetation-removing algorithm. Experimental results concluded that the developed CNN model, using 32×32-pixel ortho-image and class-label small-patch datasets, had 93% accuracy in identifying objects and localizing object edges.
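Why a 2:1 low-high ortho-image pair carries elevation information can be seen from a simple pinhole assumption: a point at elevation e imaged from altitude h appears with scale proportional to 1/(h - e), so the scale ratio between the two images determines e. This back-of-the-envelope model is only an illustration; the dissertation's actual algorithm sweeps virtual elevation planes and matches four-scaling patch descriptors:

```python
# Pinhole-model illustration (an assumption, not the dissertation's algorithm):
# with image scale s ∝ 1/(h - e), the ratio r = s_low / s_high between images
# taken at altitudes h_low and h_high satisfies
#   r = (h_high - e) / (h_low - e),
# which solves to e = (h_high - r * h_low) / (1 - r).

def elevation_from_scale_ratio(h_low: float, h_high: float, r: float) -> float:
    """Recover elevation e (same units as the altitudes) from the scale ratio."""
    return (h_high - r * h_low) / (1.0 - r)

# A ground point (e = 0) seen from 50 m and 100 m gives the nominal 2:1 ratio;
# a point 10 m above ground shifts the ratio to (100-10)/(50-10) = 2.25.
print(elevation_from_scale_ratio(50, 100, 2.25))  # → 10.0
```

Measuring r per pixel pair is exactly the hard part, which is why the developed method instead hypothesizes elevations (virtual planes), predicts where each candidate pixel pair should fall, and verifies the match with patch feature descriptors.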