40 research outputs found

    Novel Techniques for Automated Dental Identification

    Get PDF
    Automated dental identification is one of the best candidates for postmortem identification. With the large number of victims encountered in mass disasters, automating the process of postmortem identification is receiving an increased attention. This dissertation introduces new approaches for different stages of Automated Dental Identification system: These stages include segmentations, classification, labeling, and matching:;We modified the seam carving technique to adapt the problem of segmenting dental image records into individual teeth. We propose a two-stage teeth segmentation approach for segmenting the dental images. In the first stage, the teeth images are preprocessed by a two-step thresholding technique, which starts with an iterative thresholding followed by an adaptive thresholding to binarize the teeth images. In the second stage, we adapt the seam carving technique on the binary images, using both horizontal and vertical seams, to separate each individual tooth. We have obtained an optimality rate of 54.02% for the bitewing type images, which is superior to all existing fully automated dental segmentation algorithms in the literature, and a failure rate of 1.05%. For the periapical type images, we have obtained a high optimality rate of 58.13% and a low failure rate of 0.74 which also surpasses the performance of existing techniques. An important problem in automated dental identification is automatic classification of teeth into four classes (molars, premolars, canines, and incisors). A dental chart is a key to avoiding illogical comparisons that inefficiently consume the limited computational resources, and may mislead decision-making. We tackle this composite problem using a two-stage approach. The first stage, utilizes low computational-cost, appearance-based features, using Orthogonal Locality Preserving Projections (OLPP) for assigning an initial class. The second stage applies a string matching technique, based on teeth neighborhood rules, to validate initial teeth-classes and hence to assign each tooth a number corresponding to its location in the dental chart, even in the presence of a missed tooth. The experimental results of teeth classification show that on a large dataset of bitewing and periapical films, the proposed approach achieves overall classification accuracy of 77% and teeth class validation enhances the overall teeth classification accuracy to 87% which is slightly better than the performance obtained from previous methods based on EigenTeeth the performance of which is 75% and 86%, respectively.;We present a new technique that searches the dental database to find a candidate list. We use dental records of the FBI\u27s Criminal Justice Service (CJIC) ADIS database, that contains 104 records (about 500 bitewing and periapical films) involving more than 2000 teeth, 47 Antemortem (AM) records and 57 Postmortem (PM) records with 20 matched records.;The proposed approach consists of two main stages, the first stage is to preprocess the dental records (segmentation and teeth labeling classification) in order to get a reliable, appearance-based, low computational-cost feature. In the second stage, we developed a technique based on LaplacianTeeth using OLPP algorithm to produce a candidate list. The proposed technique can correctly retrieve the dental records 65% in the 5 top ranks while the method based on EigenTeeth remains at 60%. The proposed approach takes about 0.17 seconds to make record to record comparison while the other method based on EigenTeeth takes about 0.09 seconds.;Finally, we address the teeth matching problem by presenting a new technique for dental record retrieval. The technique is based on the matching of the Scale Invariant feature Transform (SIFT) descriptors guided by the teeth contour between the subject and reference dental records. Our fundamental objective is to accomplish a relatively short match list, with a high probability of having the correct match reference. The proposed technique correctly retrieves the dental records with performance rates of 35% and 75% in the 1 and 5 top ranks respectively, and takes only an average time of 4.18 minutes to retrieve a match list. This compares favorably with the existing technique shape-based (edge direction histogram) method which has the performance rates of 29% and 46% in the 1 and 5 top ranks respectively.;In summary, the proposed ADIS system accurately retrieves the dental record with an overall rate of 80% in top 5 ranks when a candidate list of 20 is used (from potential match search) whereas a candidate size of 10 yields an overall rate of 84% in top 5 ranks and takes only a few minutes to search the database, which compares favorably against most of the existing methods in the literature, when both accuracy and computational complexity are considered

    Top-K link recommendation for development of P2P social networks

    Get PDF
    Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2014.Thesis (Master's) -- Bilkent University, 2014.Includes bibliographical references leaves 45-48.The common approach for implementing social networks has been using centralized infrastructures, which inherently include problems of privacy, censorship, scalability, and fault-tolerance. Although decentralized systems offer a natural solution, significant research is needed to build an end-to-end peer-to-peer social network where data is stored among trusted users. The centralized algorithms need to be revisited for a P2P setting, where the nodes have connectivity to only neighbors, have no information of global topology, and may go offline and churn resulting in changes of the graph structure. The social graph algorithms should be designed as robust to node failures and network changes. We model P2P social networks as uncertain graphs where each node can go offline, and we introduce link recommendation algorithms that support the development of decentralized social networks. We propose methods to recommend top-k links to improve the underlying topology and efficiency of the overlay network, while preserving the locality of the social structure. Our approach aims to optimize the probabilistic reachability, improve the robustness of the local network and avoid loss from failures of the peers. We model the problem through discrete optimization and assign a score to each node to capture both the topological connectivity and the social centrality of the corresponding node. We evaluate the proposed methods with respect to performance and quality measures developed for P2P social networks.Aytaş, YusufM.S

    Smooth Scan: Statistics-Oblivious Access Paths

    Get PDF
    Query optimizers depend heavily on statistics representing column distributions to create efficient query plans. In many cases, though, statistics are outdated or non-existent, and the process of refreshing statistics is very expensive, especially for ad-hoc workloads on ever bigger data. This results in suboptimal plans that severely hurt performance. The main problem is that any decision, once made by the optimizer, is fixed throughout the execution of a query. In particular, each logical operator translates into a fixed choice of a physical operator at run-time. In this paper we advocate for continuous adaptation and morphing of physical operators throughout their lifetime, by adjusting their behavior in accordance with the statistical properties of the data. We demonstrate the benefits of the new paradigm by designing and implementing an adaptive access path operator called Smooth Scan, which morphs continuously within the space of traditional index access and full table scan. Smooth Scan behaves similarly to an index scan for low selectivity; if selectivity increases, however, Smooth Scan progressively morphs its behavior toward a sequential scan. As a result, a system with Smooth Scan requires no optimization decisions up front nor does it need accurate statistics to provide good performance. We implement Smooth Scan in PostgreSQL and, using both synthetic benchmarks as well as TPC-H, we show that it achieves robust performance while at the same time being statistics-oblivious

    Benchmark methodologies for the optimized physical synthesis of RISC-V microprocessors

    Get PDF
    As technology continues to advance and chip sizes shrink, the complexity and design time required for integrated circuits have significantly increased. To address these challenges, Electronic Design Automation (EDA) tools have been introduced to streamline the design flow. These tools offer various methodologies and options to optimize power, performance, and chip area. However, selecting the most suitable methods from these options can be challenging, as they may lead to trade-offs among power, performance, and area. While architectural and Register Transfer Level (RTL) optimizations have been extensively studied in existing literature, the impact of optimization methods available in EDA tools on performance has not been thoroughly researched. This thesis aims to optimize a semiconductor processor through EDA tools within the physical synthesis domain to achieve increased performance while maintaining a balance between power efficiency and area utilization. By leveraging floorplanning tools and carefully selecting technology libraries and optimization options, the CV32E40P open-source processor is subjected to various floorplans to analyze their impact on chip performance. The employed techniques, including multibit components prefer option, multiplexer tree prefer option, identification and exclusion of problematic cells, and placement blockages, lead to significant improvements in cell density, congestion mitigation, and timing. The optimized synthesis results demonstrate a 71\% enhancement in chip design performance without a substantial increase in area, showcasing the effectiveness of these techniques in improving large-scale integrated circuits' performance, efficiency, and manufacturability. By exploring and implementing the available options in EDA tools, this study demonstrates how the processor's performance can be significantly improved while maintaining a balanced and efficient chip design. The findings contribute valuable insights to the field of electronic design automation, offering guidance to designers in selecting suitable methodologies for optimizing processors and other integrated circuits

    Automatic synthesis and optimization of chip multiprocessors

    Get PDF
    The microprocessor technology has experienced an enormous growth during the last decades. Rapid downscale of the CMOS technology has led to higher operating frequencies and performance densities, facing the fundamental issue of power dissipation. Chip Multiprocessors (CMPs) have become the latest paradigm to improve the power-performance efficiency of computing systems by exploiting the parallelism inherent in applications. Industrial and prototype implementations have already demonstrated the benefits achieved by CMPs with hundreds of cores.CMP architects are challenged to take many complex design decisions. Only a few of them are:- What should be the ratio between the core and cache areas on a chip?- Which core architectures to select?- How many cache levels should the memory subsystem have?- Which interconnect topologies provide efficient on-chip communication?These and many other aspects create a complex multidimensional space for architectural exploration. Design Automation tools become essential to make the architectural exploration feasible under the hard time-to-market constraints. The exploration methods have to be efficient and scalable to handle future generation on-chip architectures with hundreds or thousands of cores.Furthermore, once a CMP has been fabricated, the need for efficient deployment of the many-core processor arises. Intelligent techniques for task mapping and scheduling onto CMPs are necessary to guarantee the full usage of the benefits brought by the many-core technology. These techniques have to consider the peculiarities of the modern architectures, such as availability of enhanced power saving techniques and presence of complex memory hierarchies.This thesis has several objectives. The first objective is to elaborate the methods for efficient analytical modeling and architectural design space exploration of CMPs. The efficiency is achieved by using analytical models instead of simulation, and replacing the exhaustive exploration with an intelligent search strategy. Additionally, these methods incorporate high-level models for physical planning. The related contributions are described in Chapters 3, 4 and 5 of the document.The second objective of this work is to propose a scalable task mapping algorithm onto general-purpose CMPs with power management techniques, for efficient deployment of many-core systems. This contribution is explained in Chapter 6 of this document.Finally, the third objective of this thesis is to address the issues of the on-chip interconnect design and exploration, by developing a model for simultaneous topology customization and deadlock-free routing in Networks-on-Chip. The developed methodology can be applied to various classes of the on-chip systems, ranging from general-purpose chip multiprocessors to application-specific solutions. Chapter 7 describes the proposed model.The presented methods have been thoroughly tested experimentally and the results are described in this dissertation. At the end of the document several possible directions for the future research are proposed

    Routing optimization algorithms in integrated fronthaul/backhaul networks supporting multitenancy

    Get PDF
    Mención Internacional en el título de doctorEsta tesis pretende ayudar en la definición y el diseño de la quinta generación de redes de telecomunicaciones (5G) a través del modelado matemático de las diferentes cualidades que las caracterizan. En general, la ambición de estos modelos es realizar una optimización de las redes, ensalzando sus capacidades recientemente adquiridas para mejorar la eficiencia de los futuros despliegues tanto para los usuarios como para los operadores. El periodo de realización de esta tesis se corresponde con el periodo de investigación y definición de las redes 5G, y, por lo tanto, en paralelo y en el contexto de varios proyectos europeos del programa H2020. Por lo tanto, las diferentes partes del trabajo presentado en este documento cuadran y ofrecen una solución a diferentes retos que han ido apareciendo durante la definición del 5G y dentro del ámbito de estos proyectos, considerando los comentarios y problemas desde el punto de vista de todos los usuarios finales, operadores y proveedores. Así, el primer reto a considerar se centra en el núcleo de la red, en particular en cómo integrar tráfico fronthaul y backhaul en el mismo estrato de transporte. La solución propuesta es un marco de optimización para el enrutado y la colocación de recursos que ha sido desarrollado teniendo en cuenta restricciones de retardo, capacidad y caminos, maximizando el grado de despliegue de Unidades Distribuidas (DU) mientras se minimizan los agregados de las Unidades Centrales (CU) que las soportan. El marco y los algoritmos heurísticos desarrollados (para reducir la complexidad computacional) son validados y aplicados a redes tanto a pequeña como a gran (nivel de producción) escala. Esto los hace útiles para los operadores de redes tanto para la planificación de la red como para el ajuste dinámico de las operaciones de red en su infraestructura (virtualizada). Moviéndonos más cerca de los usuarios, el segundo reto considerado se centra en la colocación de servicios en entornos de nube y borde (cloud/edge). En particular, el problema considerado consiste en seleccionar la mejor localización para cada función de red virtual (VNF) que compone un servicio en entornos de robots en la nube, que implica restricciones estrictas en las cotas de retardo y fiabilidad. Los robots, vehículos y otros dispositivos finales proveen competencias significativas como impulsores, sensores y computación local que son esenciales para algunos servicios. Por contra, estos dispositivos están en continuo movimiento y pueden perder la conexión con la red o quedarse sin batería, cosa que reta aún más la entrega de servicios en este entorno dinámico. Así, el análisis realizado y la solución propuesta abordan las restricciones de movilidad y batería. Además, también se necesita tener en cuenta los aspectos temporales y los objetivos conflictivos de fiabilidad y baja latencia en el despliegue de servicios en una red volátil, donde los nodos de cómputo móviles actúan como una extensión de la infraestructura de cómputo de la nube y el borde. El problema se formula como un problema de optimización para colocación de VNFs minimizando el coste y también se propone un heurístico eficiente. Los algoritmos son evaluados de forma extensiva desde varios aspectos por simulación en escenarios que reflejan la realidad de forma detallada. Finalmente, el último reto analizado se centra en dar soporte a servicios basados en el borde, en particular, aprendizaje automático (ML) en escenarios del Internet de las Cosas (IoT) distribuidos. El enfoque tradicional al ML distribuido se centra en adaptar los algoritmos de aprendizaje a la red, por ejemplo, reduciendo las actualizaciones para frenar la sobrecarga. Las redes basadas en el borde inteligente, en cambio, hacen posible seguir un enfoque opuesto, es decir, definir la topología de red lógica alrededor de la tarea de aprendizaje a realizar, para así alcanzar el resultado de aprendizaje deseado. La solución propuesta incluye un modelo de sistema que captura dichos aspectos en el contexto de ML supervisado, teniendo en cuenta tanto nodos de aprendizaje (que realizan las computaciones) como nodos de información (que proveen datos). El problema se formula para seleccionar (i) qué nodos de aprendizaje e información deben cooperar para completar la tarea de aprendizaje, y (ii) el número de iteraciones a realizar, para minimizar el coste de aprendizaje mientras se garantizan los objetivos de error predictivo y tiempo de ejecución. La solución también incluye un algoritmo heurístico que es evaluado ensalzando una topología de red real y considerando tanto las tareas de clasificación como de regresión, y cuya solución se acerca mucho al óptimo, superando las soluciones alternativas encontradas en la literatura.This thesis aims to help in the definition and design of the 5th generation of telecommunications networks (5G) by modelling the different features that characterize them through several mathematical models. Overall, the aim of these models is to perform a wide optimization of the network elements, leveraging their newly-acquired capabilities in order to improve the efficiency of the future deployments both for the users and the operators. The timeline of this thesis corresponds to the timeline of the research and definition of 5G networks, and thus in parallel and in the context of several European H2020 programs. Hence, the different parts of the work presented in this document match and provide a solution to different challenges that have been appearing during the definition of 5G and within the scope of those projects, considering the feedback and problems from the point of view of all the end users, operators and providers. Thus, the first challenge to be considered focuses on the core network, in particular on how to integrate fronthaul and backhaul traffic over the same transport stratum. The solution proposed is an optimization framework for routing and resource placement that has been developed taking into account delay, capacity and path constraints, maximizing the degree of Distributed Unit (DU) deployment while minimizing the supporting Central Unit (CU) pools. The framework and the developed heuristics (to reduce the computational complexity) are validated and applied to both small and largescale (production-level) networks. They can be useful to network operators for both network planning as well as network operation adjusting their (virtualized) infrastructure dynamically. Moving closer to the user side, the second challenge considered focuses on the allocation of services in cloud/edge environments. In particular, the problem tackled consists of selecting the best the location of each Virtual Network Function (VNF) that compose a service in cloud robotics environments, that imply strict delay bounds and reliability constraints. Robots, vehicles and other end-devices provide significant capabilities such as actuators, sensors and local computation which are essential for some services. On the negative side, these devices are continuously on the move and might lose network connection or run out of battery, which further challenge service delivery in this dynamic environment. Thus, the performed analysis and proposed solution tackle the mobility and battery restrictions. We further need to account for the temporal aspects and conflicting goals of reliable, low latency service deployment over a volatile network, where mobile compute nodes act as an extension of the cloud and edge computing infrastructure. The problem is formulated as a cost-minimizing VNF placement optimization and an efficient heuristic is proposed. The algorithms are extensively evaluated from various aspects by simulation on detailed real-world scenarios. Finally, the last challenge analyzed focuses on supporting edge-based services, in particular, Machine Learning (ML) in distributed Internet of Things (IoT) scenarios. The traditional approach to distributed ML is to adapt learning algorithms to the network, e.g., reducing updates to curb overhead. Networks based on intelligent edge, instead, make it possible to follow the opposite approach, i.e., to define the logical network topology around the learning task to perform, so as to meet the desired learning performance. The proposed solution includes a system model that captures such aspects in the context of supervised ML, accounting for both learning nodes (that perform computations) and information nodes (that provide data). The problem is formulated to select (i) which learning and information nodes should cooperate to complete the learning task, and (ii) the number of iterations to perform, in order to minimize the learning cost while meeting the target prediction error and execution time. The solution also includes an heuristic algorithm that is evaluated leveraging a real-world network topology and considering both classification and regression tasks, and closely matches the optimum, outperforming state-of-the-art alternatives.This work has been supported by IMDEA Networks InstitutePrograma de Doctorado en Ingeniería Telemática por la Universidad Carlos III de MadridPresidente: Pablo Serrano Yáñez-Mingot.- Secretario: Andrés García Saavedra.- Vocal: Luca Valcarengh

    Integrated Software Synthesis for Signal Processing Applications

    Get PDF
    Signal processing applications usually encounter multi-dimensional real-time performance requirements and restrictions on resources, which makes software implementation complex. Although major advances have been made in embedded processor technology for this application domain -- in particular, in technology for programmable digital signal processors -- traditional compiler techniques applied to such platforms do not generate machine code of desired quality. As a result, low-level, human-driven fine tuning of software implementations is needed, and we are therefore in need of more effective strategies for software implementation for signal processing applications. In this thesis, a number of important memory and performance optimization problems are addressed for translating high-level representations of signal processing applications into embedded software implementations. This investigation centers around signal processing-oriented dataflow models of computation. This form of dataflow provides a coarse grained modeling approach that is well-suited to the signal processing domain and is increasingly supported by commercial and research-oriented tools for design and implementation of signal processing systems. Well-developed dataflow models of signal processing systems expose high-level application structure that can be used by designers and design tools to guide optimization of hardware and software implementations. This thesis advances the suite of techniques available for optimization of software implementations that are derived from the application structure exposed from dataflow representations. In addition, the specialized architecture of programmable digital signal processors is considered jointly with dataflow-based analysis to streamline the optimization process for this important family of embedded processors. The specialized features of programmable digital signal processors that are addressed in this thesis include parallel memory banks to facilitate data parallelism, and signal-processing-oriented addressing modes and address register management capabilities. The problems addressed in this thesis involve several inter-related features, and therefore an integrated approach is required to solve them effectively. This thesis proposes such an integrated approach, and develops the approach through formal problem formulations, in-depth theoretical analysis, and extensive experimentation

    Resource and thermal management in 3D-stacked multi-/many-core systems

    Full text link
    Continuous semiconductor technology scaling and the rapid increase in computational needs have stimulated the emergence of multi-/many-core processors. While up to hundreds of cores can be placed on a single chip, the performance capacity of the cores cannot be fully exploited due to high latencies of interconnects and memory, high power consumption, and low manufacturing yield in traditional (2D) chips. 3D stacking is an emerging technology that aims to overcome these limitations of 2D designs by stacking processor dies over each other and using through-silicon-vias (TSVs) for on-chip communication, and thus, provides a large amount of on-chip resources and shortens communication latency. These benefits, however, are limited by challenges in high power densities and temperatures. 3D stacking also enables integrating heterogeneous technologies into a single chip. One example of heterogeneous integration is building many-core systems with silicon-photonic network-on-chip (PNoC), which reduces on-chip communication latency significantly and provides higher bandwidth compared to electrical links. However, silicon-photonic links are vulnerable to on-chip thermal and process variations. These variations can be countered by actively tuning the temperatures of optical devices through micro-heaters, but at the cost of substantial power overhead. This thesis claims that unearthing the energy efficiency potential of 3D-stacked systems requires intelligent and application-aware resource management. Specifically, the thesis improves energy efficiency of 3D-stacked systems via three major components of computing systems: cache, memory, and on-chip communication. We analyze characteristics of workloads in computation, memory usage, and communication, and present techniques that leverage these characteristics for energy-efficient computing. This thesis introduces 3D cache resource pooling, a cache design that allows for flexible heterogeneity in cache configuration across a 3D-stacked system and improves cache utilization and system energy efficiency. We also demonstrate the impact of resource pooling on a real prototype 3D system with scratchpad memory. At the main memory level, we claim that utilizing heterogeneous memory modules and memory object level management significantly helps with energy efficiency. This thesis proposes a memory management scheme at a finer granularity: memory object level, and a page allocation policy to leverage the heterogeneity of available memory modules and cater to the diverse memory requirements of workloads. On the on-chip communication side, we introduce an approach to limit the power overhead of PNoC in (3D) many-core systems through cross-layer thermal management. Our proposed thermally-aware workload allocation policies coupled with an adaptive thermal tuning policy minimize the required thermal tuning power for PNoC, and in this way, help broader integration of PNoC. The thesis also introduces techniques in placement and floorplanning of optical devices to reduce optical loss and, thus, laser source power consumption.2018-03-09T00:00:00

    Smooth Scan: Robust Query Execution with a Statistics-oblivious Access Operator

    Get PDF
    Query optimizers depend heavily on statistics representing column distributions to create efficient query plans. In many cases, though, statistics are outdated or non-existent, and the process of refreshing statistics is very expensive, especially for ad-hoc workloads on ever bigger data. This results in suboptimal plans that severely hurt performance. The main problem is that any decision, once made by the optimizer, is fixed throughout the execution of a query. In particular, each logical operator translates into a fixed choice of a physical operator at run-time. In this paper we advocate for continuous adaptation and morphing of physical operators throughout their lifetime, by adjusting their behavior in accordance with the statistical properties of the data. We demonstrate the benefits of the new paradigm by designing and implementing an adaptive access path operator called Smooth Scan, which morphs continuously within the space of traditional index access and full table scan. Smooth Scan behaves similarly to an index scan for low selectivity; if selectivity increases, however, Smooth Scan progressively morphs its behavior toward a sequential scan. As a result, a system with Smooth Scan requires no access path decisions up front nor does it need accurate statistics to provide good performance. We implement Smooth Scan in PostgreSQL and, using both synthetic benchmarks as well as TPC-H, we show that it achieves robust performance while at the same time being statistics-oblivious
    corecore