
    DESIGN OF ANALOG CIRCUITS USING PSEUDO FLOATING GATE

    In this paper we present the pseudo floating gate and its bidirectional property. An inverter can be implemented using this property: it is made bidirectional simply by interchanging Vdd and GND, with no additional circuitry or amplifier. We first implement the inverter using a pseudo floating gate. The bidirectionality of the gate is then developed further to control signal flow conditions. Finally, we use this inverter to implement a differentiator and an integrator. Typical applications are in filter design and I/O ports in ICs. Linearity and AC simulations are presented to show the good properties and versatility of the approach for bidirectional analog circuit design.

    An Analog VLSI Deep Machine Learning Implementation

    Machine learning systems provide automated data processing and see a wide range of applications. Direct processing of raw high-dimensional data such as images and video by machine learning systems is impractical both due to prohibitive power consumption and the “curse of dimensionality,” which makes learning tasks exponentially more difficult as dimension increases. Deep machine learning (DML) mimics the hierarchical presentation of information in the human brain to achieve robust automated feature extraction, reducing the dimension of such data. However, the computational complexity of DML systems limits large-scale implementations on standard digital computers. Custom analog signal processing (ASP) can yield much higher energy efficiency than digital signal processing (DSP), presenting a means of overcoming these limitations. The purpose of this work is to develop an analog implementation of a DML system. First, an analog memory is proposed as an essential component of the learning systems. It uses the charge trapped on the floating gate to store an analog value in a non-volatile way. The memory is compatible with a standard digital CMOS process and allows random-accessible bi-directional updates without the need for an on-chip charge pump or high-voltage switch. Second, architecture and circuits are developed to realize an online k-means clustering algorithm in analog signal processing. It achieves automatic recognition of underlying data patterns and online extraction of data statistical parameters. This unsupervised learning system constitutes the computation node in the deep machine learning hierarchy. Third, a 3-layer, 7-node analog deep machine learning engine is designed featuring online unsupervised trainability and non-volatile floating-gate analog storage. It utilizes a massively parallel reconfigurable current-mode analog architecture to realize efficient computation. 
Algorithm-level feedback is leveraged to provide robustness to circuit imperfections in analog signal processing. At a processing speed of 8300 input vectors per second, it achieves a peak energy efficiency of 1×10^12 operations per second per watt. In addition, an ultra-low-power tunable bump circuit is presented to provide similarity measures in analog signal processing. It incorporates a novel wide-input-range tunable pseudo-differential transconductor. The circuit demonstrates tunability of bump center, width and height with a power consumption significantly lower than previous works.
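
The online k-means clustering mentioned in this abstract can be illustrated in software. The following is a minimal sketch of the generic sequential k-means update (move the nearest centroid a small step toward each incoming sample); it does not model the analog circuits, and the function name `online_kmeans`, the learning rate `lr`, and the initialisation from random samples are assumptions made for illustration.

```python
import math
import random


def online_kmeans(samples, k, lr=0.1, seed=0):
    """Sequential k-means: update one centroid per incoming sample.

    Illustrative sketch only; `lr` and the initialisation are assumptions,
    not parameters taken from the abstract.
    """
    rng = random.Random(seed)
    # Start from k distinct samples (assumed initialisation).
    centroids = [list(s) for s in rng.sample(list(samples), k)]
    for x in samples:
        # Pick the centroid closest to the sample (Euclidean distance).
        winner = min(range(k), key=lambda j: math.dist(centroids[j], x))
        # Move only the winning centroid a fraction `lr` toward the sample.
        centroids[winner] = [c + lr * (xi - c)
                             for c, xi in zip(centroids[winner], x)]
    return centroids
```

Because each sample touches a single centroid and is then discarded, the update needs no stored dataset, which is the property that makes this family of algorithms attractive for streaming hardware.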

    Integrated high-voltage switched-capacitor DC-DC converters

    The focus of this work is on the integrated circuit (IC) level integration of high-voltage switched-capacitor (SC) converters with the goal of fully integrated power management solutions for system-on-chip (SoC) and system-in-package (SiP) applications. The full integration of SC converters provides a low-cost and compact power supply solution for modern electronics. Currently, there are almost no fully integrated SC converters with input voltages above 5 V. The purpose of this work is to provide solutions for higher input voltages. The increasing challenges of a compact and efficient power supply on the chip are addressed. High-voltage rated components and the increased losses caused by parasitics reduce not only power density but also efficiency. Loss mechanisms in high-voltage SC converters are investigated, resulting in an optimized model for high-voltage SC converters. The model developed allows an appropriate comparison of different semiconductor technologies and converter topologies. Methods and design proposals for loss reduction are presented. Control of power switches with their supporting circuits is a further challenge for high-voltage SC converters. The aim of this work is to develop fully integrated SC converters with a wide input voltage range. Different topologies and concepts are investigated. The implemented fully integrated SC converter has an input voltage range of 2 V to 13 V. This is twice the range of existing converters. This is achieved by an implemented buck and boost mode as well as 17 conversion ratios. Experimental results show a peak efficiency of 81.5%. This is the highest published peak efficiency for fully integrated SC converters with an input voltage above 5 V. With the help of the model developed in this work, a three-phase SC converter topology for input voltages up to 60 V is derived and then investigated and discussed. 
Another focus of this work is on the power supply of sensor nodes and smart home applications with low power consumption. Highly integrated micro power supplies that operate directly from mains voltage are particularly suitable for these applications. The micro power supply proposed in this work utilizes the high-voltage SC converter developed. The output power is 14 times higher and the power density eleven times higher than prior work. Since many power switches are built into modern multi-ratio SC converters, the switch control circuits must be optimized with regard to low power consumption and area requirements. In this work, different level shifter concepts are investigated and a low-power high-voltage level shifter for 50 V applications based on a capacitive level shifter is introduced. The level shifter developed exceeds the state of the art by a factor of more than eleven with an energy consumption of 2.1 pJ per transition. A propagation delay of 1.45 ns is achieved. The presented high-voltage level shifter is the first level shifter for 50 V applications with a propagation delay below 2 ns and an energy consumption below 20 pJ per transition. Compared to the state of the art, the figure of merit is significantly improved by a factor of two. Furthermore, various charge pump concepts are investigated and evaluated within the context of this work. The charge pump optimized in this work improves the state of the art by a factor of 1.6 in terms of efficiency. Bidirectional switches must be implemented at certain locations within the power stage to prevent reverse conduction. The topology of a bidirectional switch developed in this work reduces the dynamic switching losses by 70% and the area consumption including the required charge pumps by up to 65% compared to the state of the art. These improvements make it possible to control the power switches in a fast and efficient way. 
Index terms: integrated power management, high input voltage, multi-ratio SC converter, level shifter, bidirectional switch, micro power supply
Abstract (translated from German): The focus of this work is the investigation of switched-capacitor (SC) voltage converters for higher input voltages. The aim is to offer solutions for power management fully integrated on the semiconductor chip, enabling system-on-chip (SoC) and system-in-package (SiP) designs. Full integration of SC converters provides a low-cost and compact power supply solution for modern electronics. This work addresses the continuing trend toward ever more compact electronics and toward higher supply voltages. Currently there are very few fully integrated SC converters with an input voltage above 5 V. The challenges of a compact and efficient on-chip power supply, which grow with increasing voltage, are investigated in this work. The higher voltage rating of the components used correlates with increased losses and increased area consumption, which degrade the efficiency and power density of SC converters. Part of this work is the investigation of these loss mechanisms and the development of a model optimized specifically for higher voltages. The presented model enables optimal sizing of the converters on the one hand and fair comparisons between different SC converter architectures and semiconductor technologies on the other. Accordingly, the chosen architecture, the chosen semiconductor technology, and the combination of the two all have a considerable influence on converter performance. The aim of this work is to develop a fully integrated SC converter with a wide and high input voltage range. To this end, various circuit architectures and concepts were investigated. The presented fully integrated SC converter has an input voltage range of 2 V to 13 V. This is a doubling compared to the state of the art. It is achieved through implemented step-up and step-down operating modes and 17 conversion ratios. Experimental results show a peak efficiency of 81.5%. This is the highest published peak efficiency for fully integrated SC converters with an input voltage above 5 V. Using the model developed in this work, a three-phase SC converter architecture for input voltages up to 60 V is developed and then analyzed and discussed. A further focus of this work is the compact power supply of low-power sensor nodes for applications such as smart home and the Internet of Things (IoT). Highly integrated micro power supplies that can be operated directly from the 230 V RMS mains (or 110 V RMS) are particularly well suited to these applications. The micro power supply presented in this work uses an SC converter for high input voltages developed in this work. The output power achieved is 14 times higher than the state of the art. SC converters for high voltages require many power switches, so the switch drive circuitry must be designed with particular attention to low power consumption and small area of the required circuit blocks. This work covers both the analysis of different level shifter concepts and the development of a low-power level shifter for 50 V applications. With an energy consumption of 2.1 pJ per signal transition, the developed capacitively coupled level shifter reduces consumption by a factor of more than eleven compared to the state of the art. The propagation delay achieved is 1.45 ns. The presented high-voltage level shifter is thus the first level shifter for 50 V applications with a propagation delay below 2 ns and an energy consumption below 20 pJ per signal transition. Compared to the state of the art, the figure of merit is significantly improved by a factor of two. In addition, various charge pump concepts are investigated and evaluated in this work. The charge pump optimized in this work improves on the state of the art by a factor of 1.6 in terms of efficiency. The bidirectional switch circuit architecture developed in this work reduces dynamic switching losses by 70% and the required area, including the required charge pump, by up to 65% compared to the state of the art. These improvements make it possible to drive the power switches quickly and efficiently.
Keywords: integrated power management, high input voltage, multi-ratio SC converter, level shifter, bidirectional switch, micro power supply
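
To see why a multi-ratio SC converter benefits from many conversion ratios over a wide input range, the sketch below picks, for a given input and output voltage, the smallest ratio M that still reaches the target and reports the ideal efficiency bound Vout / (M × Vin) of a switched-capacitor stage. The ratio set `RATIOS` and the 3.3 V target used in the usage note are hypothetical: the abstract states that 17 ratios exist but does not list them or give the output voltage.

```python
from fractions import Fraction

# Hypothetical ratio set for illustration only; the abstract does not
# list the 17 conversion ratios of the implemented converter.
RATIOS = [Fraction(1, 4), Fraction(1, 3), Fraction(1, 2), Fraction(2, 3),
          Fraction(1), Fraction(3, 2), Fraction(2)]


def pick_ratio(v_in, v_out, ratios=RATIOS):
    """Return the smallest ratio M with M * v_in >= v_out, or None."""
    feasible = [m for m in ratios if float(m) * v_in >= v_out]
    return min(feasible) if feasible else None


def ideal_efficiency(v_in, v_out, m):
    """Upper bound on efficiency of an ideal SC stage: v_out / (M * v_in)."""
    return v_out / (float(m) * v_in)
```

With these assumed values, `pick_ratio(13.0, 3.3)` selects 1/3 (buck) and `pick_ratio(2.0, 3.3)` selects 2 (boost). The closer M × Vin sits to Vout, the higher the bound, which is why a finer ratio grid helps across a 2 V to 13 V input range.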

    Modeling of deadtime events in power converters with half-bridge modules for a highly accurate hardware-in-the-loop fixed point implementation in FPGA

    Hardware-in-the-loop (HIL) simulations of power converters must achieve a truthful representation in real time with simulation steps on the order of microseconds or tens of nanoseconds. The numerical solution for the differential equations that model the state of the converter can be calculated using the fourth-order Runge–Kutta method, which is notably more accurate than Euler methods. However, when the mathematical error due to the solver is drastically reduced, other sources of error arise. In the case of converters that use deadtimes to control the switches, such as any power converter including half-bridge modules, the inductor current reaching zero during deadtimes generates a model error large enough to offset the advantages of the Runge–Kutta method. A specific model is needed for such events. In this paper, an approximation is proposed, where the time step is divided into two semi-steps. This serves to recover the accuracy of the calculations at the expense of needing a division operation. A fixed-point implementation in VHDL is proposed, reusing a block along several calculation cycles to compute the needed parameters for the Runge–Kutta method. The implementation in a low-cost field-programmable gate array (FPGA) (Xilinx Artix-7) achieves an integration time of 1 µs. The calculation errors are six orders of magnitude smaller for both capacitor voltage and inductor current for the worst case, the one where the current reaches zero during the deadtimes in 78% of the simulated cycles. The accuracy achieved with the proposed fixed-point implementation is very close to that of 64-bit floating point and can operate in real time with a resolution of 1 µs. Therefore, the results show that this approach is suitable for modeling converters based on half-bridge modules by using FPGAs. 
This solution is intended for easy integration into any HIL system, including commercial HIL systems, showing that its application even with relatively high integration steps (1 µs) surpasses the results of techniques with even faster integration steps that do not take these events into account.
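
The idea of splitting a time step into two semi-steps when the inductor current reaches zero can be sketched in floating-point Python. This is one possible reading of the abstract, not the paper's fixed-point VHDL design: the circuit model (a freewheeling inductor feeding an RC load), the component values `L`, `C`, `R`, and the linear estimate of the crossing instant are all assumptions made for illustration.

```python
def rk4(f, x, h):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    def shifted(a, k):  # returns x + a * k, element by element
        return tuple(xi + a * ki for xi, ki in zip(x, k))
    k1 = f(x)
    k2 = f(shifted(0.5 * h, k1))
    k3 = f(shifted(0.5 * h, k2))
    k4 = f(shifted(h, k3))
    return tuple(xi + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))


L, C, R = 100e-6, 100e-6, 10.0  # hypothetical component values


def conducting(x):
    """Deadtime with current freewheeling: state x = (i_L, v_C)."""
    i, v = x
    return (-v / L, (i - v / R) / C)


def blocked(x):
    """Current has reached zero: the inductor is disconnected."""
    _, v = x
    return (0.0, -v / (R * C))


def deadtime_step(x, h):
    """Advance one step, splitting it in two if i_L crosses zero."""
    i0 = x[0]
    if i0 <= 0.0:
        return rk4(blocked, (0.0, x[1]), h)
    x1 = rk4(conducting, x, h)
    if x1[0] >= 0.0:
        return x1
    frac = i0 / (i0 - x1[0])  # the single division: crossing instant
    xa = rk4(conducting, x, frac * h)                      # first semi-step
    return rk4(blocked, (0.0, xa[1]), (1.0 - frac) * h)    # second semi-step
```

Without the split, a plain step from `(0.02, 5.0)` with `h = 1e-6` yields a negative inductor current, which the blocked switch cannot carry; with the split the current ends at exactly zero and only the capacitor keeps discharging.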

    Quantum Computing

    Quantum mechanics, the theory describing the fundamental workings of nature, is famously counterintuitive: it predicts that a particle can be in two places at the same time, and that two remote particles can be inextricably and instantaneously linked. These predictions have been the topic of intense metaphysical debate ever since the theory's inception early last century. However, supreme predictive power combined with direct experimental observation of some of these unusual phenomena leave little doubt as to its fundamental correctness. In fact, without quantum mechanics we could not explain the workings of a laser, nor indeed how a fridge magnet operates. Over the last several decades quantum information science has emerged to seek answers to the question: can we gain some advantage by storing, transmitting and processing information encoded in systems that exhibit these unique quantum properties? Today it is understood that the answer is yes. Many research groups around the world are working towards one of the most ambitious goals humankind has ever embarked upon: a quantum computer that promises to exponentially improve computational power for particular tasks. A number of physical systems, spanning much of modern physics, are being developed for this task, ranging from single particles of light to superconducting circuits, and it is not yet clear which, if any, will ultimately prove successful. Here we describe the latest developments for each of the leading approaches and explain what the major challenges are for the future. Comment: 26 pages, 7 figures, 291 references. Early draft of Nature 464, 45-53 (4 March 2010). Published version is more up-to-date and has several corrections, but is half the length with far fewer references.

    Low-power accelerators for cognitive computing

    Deep Neural Networks (DNNs) have achieved tremendous success for cognitive applications, and are especially efficient in classification and decision-making problems such as speech recognition or machine translation. Mobile and embedded devices increasingly rely on DNNs to understand the world. Smartphones, smartwatches and cars perform discriminative tasks, such as face or object recognition, on a daily basis. Despite the increasing popularity of DNNs, running them on mobile and embedded systems comes with several main challenges: delivering high accuracy and performance with a small memory and energy budget. Modern DNN models consist of billions of parameters requiring huge computational and memory resources and, hence, they cannot be directly deployed on low-power systems with limited resources. The objective of this thesis is to address these issues and propose novel solutions in order to design highly efficient custom accelerators for DNN-based cognitive computing systems. First, we focus on optimizing the inference of DNNs for sequence processing applications. We perform an analysis of the input similarity between consecutive DNN executions. Then, based on the high degree of input similarity, we propose DISC, a hardware accelerator implementing a Differential Input Similarity Computation technique to reuse the computations of the previous execution, instead of computing the entire DNN. We observe that, on average, more than 60% of the inputs of any neural network layer tested exhibit negligible changes with respect to the previous execution. Avoiding the memory accesses and computations for these inputs results in 63% energy savings on average. Second, we propose to further optimize the inference of FC-based DNNs. We first analyze the number of unique weights per input neuron of several DNNs. 
Exploiting common optimizations, such as linear quantization, we observe a very small number of unique weights per input for several FC layers of modern DNNs. Then, to improve the energy efficiency of FC computation, we present CREW, a hardware accelerator that implements a Computation Reuse and an Efficient Weight Storage mechanism to exploit the large number of repeated weights in FC layers. CREW greatly reduces the number of multiplications and provides significant savings in model memory footprint and memory bandwidth usage. We evaluate CREW on a diverse set of modern DNNs. On average, CREW provides 2.61x speedup and 2.42x energy savings over a TPU-like accelerator. Third, we propose a mechanism to optimize the inference of RNNs. RNN cells perform element-wise multiplications across the activations of different gates, sigmoid and tanh being the common activation functions. We perform an analysis of the activation function values, and show that a significant fraction are saturated towards zero or one in popular RNNs. Then, we propose CGPA to dynamically prune activations from RNNs at a coarse granularity. CGPA avoids the evaluation of entire neurons whenever the outputs of peer neurons are saturated. CGPA significantly reduces the amount of computation and memory accesses while avoiding sparsity to a large extent, and can be easily implemented on top of conventional accelerators such as TPU with negligible area overhead, resulting in 12% speedup and 12% energy savings on average for a set of widely used RNNs. Finally, in the last contribution of this thesis we focus on static DNN pruning methodologies. DNN pruning reduces memory footprint and computational work by removing connections and/or neurons that are ineffectual. However, we show that prior pruning schemes require an extremely time-consuming iterative process that requires retraining the DNN many times to tune the pruning parameters. 
Then, we propose a DNN pruning scheme based on Principal Component Analysis and the relative importance of each neuron's connections that automatically finds the optimized DNN in one shot.
Abstract (translated from Catalan): Deep neural networks (DNNs) have achieved enormous success in cognitive applications and are especially efficient in classification and decision-making problems such as speech recognition or machine translation. Mobile devices rely increasingly on DNNs to understand the world. Smartphones and smartwatches, and even cars, perform discriminative tasks such as face or object recognition on a daily basis. Despite the growing popularity of DNNs, running them on mobile systems presents several challenges: delivering high accuracy and performance with a small memory and energy budget. Modern DNNs consist of millions of parameters that require enormous computational and memory resources and therefore cannot be used directly in low-power systems with limited resources. The objective of this thesis is to address these problems and propose new solutions for designing efficient accelerators for DNN-based cognitive computing systems. First, we focus on optimizing DNN inference for sequence processing applications. We analyze the similarity of the inputs between consecutive DNN executions. We then propose DISC, an accelerator that implements a differential computation technique, based on the high degree of input similarity, to reuse the computations of the previous execution instead of computing the whole network. We observe that, on average, more than 60% of the inputs of any layer of the DNNs used show minor changes with respect to the previous execution. Avoiding the memory accesses and computations for these inputs yields energy savings of 63% on average. Second, we propose to optimize the inference of DNNs based on FC layers. We first analyze the number of unique weights per input neuron in several networks. Exploiting common optimizations such as linear quantization, we observe a very small number of unique weights per input in several FC layers of modern DNNs. Then, to improve the energy efficiency of FC layer computation, we present CREW, an accelerator that implements an efficient mechanism for computation reuse and weight storage. CREW reduces the number of multiplications and provides significant savings in memory usage. We evaluate CREW on a diverse set of modern DNNs. On average, CREW provides a 2.61x speedup and 2.42x energy savings. Third, we propose a mechanism to optimize RNN inference. Recurrent network cells perform element-wise multiplications of the activations of different gates, with sigmoid and tanh being the usual activation functions. We analyze the activation function values and show that a significant fraction are saturated towards zero or one in a set of popular RNNs. We then propose CGPA to dynamically prune RNN activations at a coarse granularity. CGPA avoids evaluating entire neurons whenever the outputs of peer neurons are saturated. CGPA significantly reduces the amount of computation and memory accesses, achieving on average a 12% improvement in performance and energy savings. Finally, in the last contribution of this thesis we focus on static DNN pruning methodologies. Pruning reduces memory footprint and computational work by removing redundant connections or neurons. However, we show that previous pruning schemes use a very long iterative process that requires training the DNNs many times to tune the pruning parameters. We then propose a pruning scheme based on principal component analysis and the relative importance of each neuron's connections that automatically finds the optimized DNN in one shot, without the need to manually tune multiple parameters.
Postprint (published version)
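
The computation reuse described for DISC can be illustrated for a single fully connected layer: when only a few inputs change between consecutive executions, the previous output can be corrected with the weighted input differences instead of being recomputed. The sketch below is a software illustration of that general idea, not the accelerator's design; the function names and the `threshold` parameter are assumptions.

```python
def fc_forward(weights, x):
    """Plain fully connected layer: y[j] = sum_i weights[i][j] * x[i]."""
    n_out = len(weights[0])
    return [sum(weights[i][j] * x[i] for i in range(len(x)))
            for j in range(n_out)]


def fc_differential(weights, x_prev, y_prev, x_new, threshold=0.0):
    """Update y_prev using only the inputs that changed by more than threshold.

    Returns the new outputs and the number of inputs that were skipped.
    `threshold` is a hypothetical knob for this sketch.
    """
    y = list(y_prev)
    skipped = 0
    for i, (old, new) in enumerate(zip(x_prev, x_new)):
        delta = new - old
        if abs(delta) <= threshold:
            skipped += 1  # no weight fetch, no multiplications for this input
            continue
        for j, w in enumerate(weights[i]):
            y[j] += w * delta
    return y, skipped
```

With `threshold=0.0` the result matches a full recomputation up to floating-point rounding. A positive threshold trades accuracy for more skipped inputs, and a real design would then have to keep the old value of each skipped input as the reference so that small changes do not accumulate unnoticed.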

    Transistor-Like Spin Nano-Switches: Physics and Applications

    Progress in the last two decades has effectively integrated spintronics and nanomagnetics into a single field, creating a new class of spin-based devices that are now being widely used in magnetic memory devices. However, it is not clear if these advances could also be used to build logic devices.

    Realtime image noise reduction FPGA implementation with edge detection

    The purpose of this dissertation was to develop and implement, in a Field Programmable Gate Array (FPGA), a noise reduction algorithm for images acquired from a sensor in real time. A Moving Average filter was chosen for its low computational cost, speed, good precision and low to medium hardware resource utilization. The technique is simple to implement; however, if all pixels are filtered indiscriminately, the result is an undesirably blurry image. Since the human eye is more sensitive to contrast, a technique was introduced to preserve sharp contour transitions, which, in the author’s opinion, is the dissertation's main contribution. Synthetic and real images were tested. The synthetic images, composed of both sharp and soft tone transitions, were generated with a purpose-developed algorithm, while the real images were captured with an 8-kbit (8192 shades) high-resolution sensor and scaled up to 10 × 10^3 shades. A least-squares polynomial data smoothing filter, Savitzky-Golay, was used for comparison. It can be adjusted using 3 degrees of freedom: the window frame length, which varies the size of the filtering relation between neighboring pixels; the derivative order, which varies the curviness; and the polynomial coefficients, which change the adaptability of the curve. The Moving Average filter permits only one degree of freedom, the window frame length. Tests revealed promising results with 2nd and 4th polynomial orders. Higher-quality results were achieved with Savitzky-Golay, which better preserves signal characteristics, especially at high frequencies. The FPGA algorithms were implemented in 64-bit integer registers, serving two purposes: increasing precision, and hence reducing the error compared with floating-point registers; and accommodating the register growth from cumulative multiplications. Results were then compared with MATLAB’s double-precision 64-bit floating-point computations to verify the error difference between the two. 
The comparison parameters used were Mean Squared Error, Signal-to-Noise Ratio and Similarity coefficient.
Abstract (translated from Portuguese): The objective of this dissertation was to develop and implement, in an FPGA, a noise reduction algorithm for images acquired in real time. A Moving Average filter was chosen because it does not demand high computational complexity, is fast, has good precision and requires moderate resource utilization. The technique is simple, but if applied indiscriminately to all pixels, the result is an undesirable blurred image. Since the human eye is more sensitive to contrast, a technique was introduced to preserve contours, which, in the author's opinion, is the work's main contribution. Synthetic and real images were used in the tests. The synthetic ones, composed of strong and soft contrasts, were generated by a purpose-developed algorithm. The real ones were captured with an 8-kbit (8192 shades) high-resolution sensor and scaled to 10 × 10^3 shades. A least-squares polynomial smoothing filter, Savitzky-Golay, was used for comparison. It has 3 degrees of freedom: the window size, which varies the size of the filtering relation between neighboring pixels; the derivative order, which varies the curvature of the filter; and the polynomial coefficients, which vary the adaptability of the curve to the points being smoothed. The Moving Average filter is adjustable only in window size. The tests proved promising for the 2nd and 4th polynomial orders. Qualitative results were obtained with the Savitzky-Golay filter, which has better signal preservation characteristics, especially at high frequencies. The FPGA algorithms were implemented in 64-bit fixed-point registers, serving two purposes: increasing precision, reducing the error compared with a floating-point implementation; and accommodating the cumulative effect of the multiplications. The results were compared with the 64-bit calculations obtained with MATLAB to verify the error difference between the two.
The measurement parameters were MSE, SNR and the Similarity coefficient.
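
A moving average that leaves sharp transitions untouched can be sketched for a single image row as follows. The abstract does not describe how the dissertation detects edges, so the local-contrast test and the names `window` and `edge_threshold` are assumptions chosen for illustration; integer division is used to echo an integer-register implementation.

```python
def edge_preserving_moving_average(row, window=3, edge_threshold=50):
    """Smooth a row of integer pixels, skipping windows that contain an edge.

    The edge test (max - min of the window above a threshold) is an
    assumed criterion, not the one used in the dissertation.
    """
    half = window // 2
    out = list(row)  # border pixels are left unchanged
    for n in range(half, len(row) - half):
        neigh = row[n - half:n + half + 1]
        if max(neigh) - min(neigh) > edge_threshold:
            continue  # likely a contour: keep the original pixel
        out[n] = sum(neigh) // window  # integer average
    return out
```

For the row `[10, 12, 11, 13, 200, 202, 201, 199]` with the default parameters, the two flat regions are smoothed while the pixels on either side of the step from 13 to 200 keep their original values.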