36 research outputs found

    Electronic Post-Compensation of Optical Transmission Impairments Using Digital Backward Propagation

    Systems and methods of compensating for transmission impairment are disclosed. One such method comprises: receiving an optical signal that has been distorted in the physical domain by an optical transmission channel; and propagating the distorted optical signal backward in the electronic domain through a corresponding virtual optical transmission channel.
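
    The virtual backward channel is typically realised with a split-step method in which the fibre's dispersion, loss, and Kerr nonlinearity are applied with inverted signs. The sketch below is a minimal single-channel illustration of that idea, assuming example fibre parameters (beta2, gamma, alpha), lumped per-step loss correction, and one common sign convention; it is not the specific compensation scheme of this work.

        import numpy as np

        def digital_backpropagation(rx_field, length_km, n_steps,
                                    beta2=-21.7e-27, gamma=1.3e-3,
                                    alpha_db_km=0.2, sample_rate=64e9):
            """Propagate the received complex baseband field backward through a
            virtual fibre using a split-step method with inverted channel signs."""
            dz = length_km * 1e3 / n_steps                   # step size (m)
            alpha = alpha_db_km / 4.343 / 1e3                # power loss (1/m)
            w = 2 * np.pi * np.fft.fftfreq(rx_field.size, d=1 / sample_rate)
            disp_inv = np.exp(1j * 0.5 * beta2 * w**2 * dz)  # inverse dispersion
            field = np.asarray(rx_field, dtype=complex)
            for _ in range(n_steps):
                # Undo chromatic dispersion in the frequency domain
                field = np.fft.ifft(np.fft.fft(field) * disp_inv)
                # Undo loss and Kerr nonlinearity in the time domain
                field *= np.exp(alpha * dz / 2)
                field *= np.exp(-1j * gamma * np.abs(field)**2 * dz)
            return field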

    An embedded adaptive optics real time controller

    The design and realisation of a low-cost, high-speed control system for adaptive optics (AO) is presented. This control system is built around a field-programmable gate array (FPGA). FPGA devices represent a fundamentally different approach to implementing control systems from that of conventional central processing units. The performance of the FPGA control system is demonstrated in a purpose-built laboratory AO experiment in which closed-loop AO correction is shown. An alternative application of the control system is demonstrated in the field of optical tweezing, where it is used to study the motion dynamics of particles trapped within laser foci.
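
    The abstract does not spell out the control law, but closed-loop AO correction is commonly a matrix-vector wavefront reconstruction followed by an integrator update of the deformable-mirror commands. The sketch below assumes that standard structure; the reconstructor matrix, gain, and dimensions are placeholders rather than values from this work, and the real system performs an equivalent computation inside the FPGA.

        import numpy as np

        def ao_closed_loop_step(slopes, recon_matrix, dm_commands, gain=0.3):
            """One closed-loop iteration: reconstruct the residual wavefront from
            Shack-Hartmann slopes, then apply a simple integrator update."""
            residual = recon_matrix @ slopes        # zonal/modal wavefront estimate
            return dm_commands - gain * residual    # integrator update of DM commands

        # Illustrative sizes: 64 subapertures (128 slopes) driving 97 actuators
        rng = np.random.default_rng(0)
        recon = rng.normal(size=(97, 128))          # placeholder reconstructor
        commands = np.zeros(97)
        commands = ao_closed_loop_step(rng.normal(size=128), recon, commands)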

    X‐ray microscopy and automatic detection of defects in through silicon vias in three‐dimensional integrated circuits

    Through silicon vias (TSVs) are a key enabling technology for interconnection and the realization of complex three-dimensional integrated circuit (3D-IC) components. To perform failure analysis without destructive sample preparation, X-ray microscopy (XRM) is an emerging method for analyzing the internal structure of samples. However, the literature still lacks evaluated scan recipes and best practices for XRM parameter settings when studying TSVs. There has also been growing interest in automated machine learning and deep learning approaches for qualitative and quantitative inspection processes in recent years. Deep learning based object detection in particular is a well-known methodology for fast detection and classification that can work with large volumetric XRM datasets. Therefore, a combined XRM and deep learning object detection workflow for automatic, micrometer-accurate defect localization on liner TSVs was developed in this work. Two measurement setups are introduced, with detailed information about the parameters used for either a full IC device scan or a detailed TSV scan. Both can depict delamination defects and finer structures in TSVs at low or high resolution, respectively. The combination of a 0.4× objective with a beam voltage of 40 kV proved to be a good choice for achieving optimal imaging contrast in the full-device scan, whereas detailed TSV scans demonstrated that a 20× objective along with a beam voltage of 140 kV significantly improves image quality. A database with 30,000 objects was created for automated data analysis, so that a well-established object detection method for automated defect analysis could be integrated into the analysis process. This RetinaNet-based object detection method achieves a very strong average precision of 0.94. It supports the detection of defective TSVs in both top view and side view, so that defects can be detected at different depths. Consequently, the proposed workflow can be used for failure analysis, quality control, or process optimization in R&D environments.
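
    The abstract names a RetinaNet-based detector, but its trained weights and code are not part of this record. As an illustration only, a detector of the same architecture can be instantiated from torchvision and applied slice by slice to XRM data; the class count, score threshold, and grayscale preprocessing below are assumptions, and torchvision's retinanet_resnet50_fpn stands in for the authors' trained network.

        import torch
        from torchvision.models.detection import retinanet_resnet50_fpn

        # Illustrative classes: background, intact TSV, delaminated TSV
        model = retinanet_resnet50_fpn(weights=None, num_classes=3)
        model.eval()

        def detect_defects(xrm_slice, score_thresh=0.5):
            """Detect objects on one grayscale XRM slice (H x W, float in [0, 1])."""
            img = torch.as_tensor(xrm_slice, dtype=torch.float32)
            img = img.unsqueeze(0).repeat(3, 1, 1)      # grayscale -> 3 channels
            with torch.no_grad():
                pred = model([img])[0]                  # dict of boxes/scores/labels
            keep = pred["scores"] > score_thresh
            return pred["boxes"][keep], pred["labels"][keep], pred["scores"][keep]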

    Stream Processor Development using Multi-Threshold NULL Convention Logic Asynchronous Design Methodology

    Decreasing transistor feature size has led to an increase in the number of transistors in integrated circuits (ICs), allowing for the implementation of more complex logic. However, such logic also requires more complex clock tree synthesis (CTS) to avoid timing violations, as the clock must reach many more gates over larger areas. Thus, timing analysis requires significantly more computing power and designer involvement than in the past. For these reasons, IC designers have been pushed to move away from conventional synchronous (SYNC) architectures and explore novel methodologies such as asynchronous, self-timed architectures. This dissertation evaluates the nominal active energy, voltage-scaled active energy, and leakage power dissipation across two cores of a stream processor: a Smoothing Filter (SF) and Histogram Equalization (HEQ). Both cores were implemented in Multi-Threshold NULL Convention Logic (MTNCL) and in clock-gated synchronous methodologies using a gate-level netlist to avoid architectural discrepancies and guarantee impartial comparisons. MTNCL designs consumed more active energy than their synchronous counterparts due to the dual-rail encoding system; however, the high-threshold-voltage (High-Vt) transistors used in MTNCL threshold gates reduced leakage power dissipation by up to 227%. During voltage-scaling simulations, MTNCL circuits showed a high level of robustness, as the output results were logically valid across all voltage sweeps without any additional circuitry. SYNC circuits, however, needed extra logic, such as a DVS controller, to adjust the circuit's speed when VDD changed. Although SYNC circuits still consumed less average energy, MTNCL power gains accelerated when switching to lower voltage domains.

    Parallelization of an X-ray tomographic reconstruction algorithm for hybrid multi-GPU and multi-core platforms

    This final-year project (Proyecto de Fin de Carrera) addresses the need to parallelize and optimize both memory and the available processing resources in order to minimize the processing time of a medical imaging application: tomographic image reconstruction from data obtained with X-ray computed tomography (CT) systems based on cone-beam geometry. CT is a medical imaging modality that uses X-rays to provide images of slices of the body under study. Instead of acquiring a single planar image (projection), as in conventional radiography, CT acquires a set of projections at different angles around the body. The computer then combines these projection data into a final reconstructed 3D volume, which allows slices of the body to be visualized in any direction. For maximum performance, the different stages of the reconstruction process have been taken into account in order to find the optimal infrastructure for each of them, dividing the work between tasks to be performed by the CPU and those to be performed on the GPU. Furthermore, to achieve the most satisfactory results, additional techniques such as parallelization, asynchronous I/O, and memory alignment have been used.
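
    Cone-beam (FDK-type) reconstruction ultimately reduces to filtering the projections and backprojecting them into the volume, and the backprojection is the step the thesis distributes between the CPU and the GPUs. The NumPy sketch below shows only the backprojection idea, simplified to 2D parallel-beam geometry; the cone-beam weighting, filtering, CUDA kernels, asynchronous I/O, and memory alignment of the actual implementation are not reproduced.

        import numpy as np

        def backproject(sinogram, angles_deg, size):
            """Unfiltered backprojection of a parallel-beam sinogram
            (n_angles x n_detectors) onto a size x size slice."""
            recon = np.zeros((size, size))
            center = (size - 1) / 2.0
            ys, xs = np.mgrid[0:size, 0:size]
            xs, ys = xs - center, ys - center
            n_det = sinogram.shape[1]
            det_center = (n_det - 1) / 2.0
            for proj, theta in zip(sinogram, np.deg2rad(angles_deg)):
                # Detector bin hit by each pixel for this projection angle
                t = xs * np.cos(theta) + ys * np.sin(theta) + det_center
                idx = np.clip(np.round(t).astype(int), 0, n_det - 1)
                recon += proj[idx]
            return recon * np.pi / len(angles_deg)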

    CMOS optical centroid processor for an integrated Shack-Hartmann wavefront sensor

    A Shack-Hartmann wavefront sensor is used to detect the distortion of light in an optical wavefront. It does this by sampling the wavefront with an array of lenslets and measuring the displacement of focused spots from reference positions. These displacements are linearly related to the local wavefront tilts, from which the entire wavefront can be reconstructed. In most Shack-Hartmann wavefront sensors, a CCD is used to sample the entire wavefront, typically at a rate of 25 to 60 Hz, and a whole frame of light spots is read out before their positions are processed. This results in a data bottleneck. In this design, parallel processing is achieved by incorporating local centroid processing for each focused spot, so that only reduced-bandwidth data needs to be transferred off-chip at a high rate. Incorporating centroid processing at the sensor level requires levels of circuit integration not possible with CCD technology. Instead, a standard 0.7 µm CMOS technology was used, but photodetector structures for this technology are not well characterised. Characterisation of several common photodiode structures was therefore carried out, which showed good responsivity of the order of 0.3 A/W. Prior to fabrication on-chip, a hardware emulation system using a reprogrammable FPGA was built, which implemented the centroiding algorithm successfully. Subsequently, the design was implemented as a single-chip CMOS solution. The fabricated optical centroid processor successfully computed and transmitted the centroids at a rate of more than 2.4 kHz, which, when integrated as an array of tilt sensors, will allow a data rate that is independent of the number of tilt sensors employed. Besides removing the data bottleneck present in current systems, the design also offers advantages in terms of power consumption, system size, and cost. The design was also shown to be extremely scalable towards a complete low-cost real-time adaptive optics system.
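
    The per-lenslet processing described above is standard intensity-weighted centroiding: the spot's centre of mass is computed inside each subaperture window, and its displacement from a reference position gives the local tilt. The sketch below is a software equivalent of that computation; the function names and the small-angle tilt conversion are illustrative, while the chip itself performs the operation in mixed-signal CMOS hardware.

        import numpy as np

        def spot_centroid(window):
            """Intensity-weighted centroid (centre of mass) of one spot window."""
            total = window.sum()
            ys, xs = np.mgrid[0:window.shape[0], 0:window.shape[1]]
            return (xs * window).sum() / total, (ys * window).sum() / total

        def local_tilt(window, ref_xy, focal_length, pixel_pitch):
            """Local wavefront tilt from the centroid displacement relative to the
            reference spot position (small-angle approximation)."""
            cx, cy = spot_centroid(window)
            dx = (cx - ref_xy[0]) * pixel_pitch
            dy = (cy - ref_xy[1]) * pixel_pitch
            return dx / focal_length, dy / focal_length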

    Adaptive Full Aperture Wavefront Sensor Study

    This grant and the work described were in support of a Seven Segment Demonstrator (SSD) and a review of wavefront sensing techniques proposed by the Government and contractors for the Next Generation Space Telescope (NGST) program. A team developed the SSD concept. For completeness, some of the information included in this report has also been included in the final report of a follow-on contract (H-27657D) entitled "Construction of Prototype Lightweight Mirrors". The original purpose of this GTRI study was to investigate how various wavefront sensing techniques might be most effectively employed with large (greater than 10 meter) aperture space-based telescopes used for commercial and scientific purposes. However, due to changes in the scope of the work performed on this grant and in light of the initial studies completed for the NGST program, only a portion of this report addresses wavefront sensing techniques. The wavefront sensing techniques proposed by the Government and contractors for the NGST were summarized in proposals and briefing materials developed by three study teams: NASA Goddard Space Flight Center, TRW, and Lockheed-Martin. In this report, GTRI reviews these approaches and makes recommendations concerning them. The objectives of the SSD were to demonstrate the functionality and performance of a seven-segment prototype array of hexagonal mirrors and supporting electromechanical components addressing design issues critical to optics deployed in large space-based telescopes for astronomy and in space-based optical communications systems. The SSD was intended to demonstrate technologies supporting the following capabilities: transportation in dense packaging within existing launcher payload envelopes, followed by on-orbit deployment to form a large-aperture space telescope; provision of very large (greater than 10 meter) primary reflectors of low mass and cost; forming a segmented primary or quaternary mirror into a quasi-continuous surface, with individual subapertures phased so that near diffraction-limited imaging in the visible wavelength region is achieved; continuous compensation of the optical wavefront for perturbations caused by imperfections, natural disturbances, and equipment-induced vibrations and deflections, to provide near diffraction-limited imaging performance in the visible wavelength region; and demonstration of the feasibility of fabricating such systems with reduced mass and cost compared to past approaches.

    Fixed-Ratio Compression Hardware Design for Display Devices

    Doctoral dissertation, Department of Electrical and Computer Engineering, Seoul National University Graduate School, February 2016 (advisor: 이혁재). Compression schemes for display devices differ from general video compression standards in several respects. First, they target specialized applications. Second, to meet compression-gain, power-consumption, and real-time constraints, the hardware must be small and the target compression ratio is low. Third, they must suit raster scan order. Fourth, to bound the frame memory size or to allow random access, the target compression ratio must be met exactly, per compression unit, in real time. This dissertation proposes three compression algorithms, and corresponding hardware architectures, that satisfy these requirements. For LCD overdrive, a compression scheme based on BTC (block truncation coding) is proposed. To increase the compression gain, a target compression ratio of 12 is used, and two main techniques improve the compression efficiency: the first saves bits by exploiting spatial correlation with neighboring blocks, and the second uses 2×16 coding blocks for simple regions and 2×8 coding blocks for complex regions, spending the bits saved by the first technique to meet the target ratio when 2×8 blocks are used. For low-cost near-lossless frame memory compression, a scheme based on 1D SPIHT (set partitioning in hierarchical trees) is proposed. SPIHT is very effective at meeting a fixed target compression ratio, but although its 1D form suits raster scan order, it has received little attention. This dissertation proposes a hardware architecture that resolves the main drawback of 1D SPIHT, its speed: the algorithm is restructured to expose parallelism, the dependencies that hinder parallel processing in the encoder are resolved so that pipelined scheduling becomes possible, and the decoder is modified so that each pass running in parallel can predict in advance the length of the bitstream it will decode. For high-fidelity RGBW color image compression, a prediction-based scheme is proposed. The prediction consists of two differencing stages, the first exploiting spatial correlation and the second exploiting inter-color correlation. For entropy coding, VLC (variable length coding) offers high compression efficiency but has difficulty meeting an exact target compression ratio, so a fixed-length compression scheme based on Golomb-Rice coding is proposed. The proposed encoder consists of a pre-coder and a post-coder: the pre-coder performs the actual encoding for one specific case and computes predicted encoding information for all other cases, and the post-coder generates the final bitstream from this information.
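
    The fixed-length coder in the third design builds on Golomb-Rice codes, in which a value is split into a unary-coded quotient and a k-bit binary remainder. The sketch below shows only that basic codeword structure for a single non-negative integer; the dissertation's pre-coder/post-coder pipeline and the mechanism that enforces an exact per-unit compression ratio are not reproduced.

        def golomb_rice_encode(value, k):
            """Golomb-Rice codeword of a non-negative integer with divisor 2**k:
            unary quotient ('1' * q + '0') followed by k binary remainder bits."""
            q, r = value >> k, value & ((1 << k) - 1)
            remainder = format(r, "0{}b".format(k)) if k else ""
            return "1" * q + "0" + remainder

        def golomb_rice_decode(bits, k):
            """Inverse of golomb_rice_encode for a single codeword."""
            q = bits.index("0")
            r = int(bits[q + 1:q + 1 + k], 2) if k else 0
            return (q << k) | r

        assert golomb_rice_decode(golomb_rice_encode(19, 2), 2) == 19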