7 research outputs found

    Enhancing the Ranking Context of Dense Retrieval Methods through Reciprocal Nearest Neighbors

    Sparse annotation poses persistent challenges to training dense retrieval models; for example, it distorts the training signal when unlabeled relevant documents are used spuriously as negatives in contrastive learning. To alleviate this problem, we introduce evidence-based label smoothing, a novel, computationally efficient method that prevents penalizing the model for assigning high relevance to false negatives. To compute the target relevance distribution over candidate documents within the ranking context of a given query, we assign a non-zero relevance probability to the candidates most similar to the ground truth, based on the degree of their similarity to the ground-truth document(s). To estimate relevance, we leverage an improved similarity metric based on reciprocal nearest neighbors, which can also be used independently to rerank candidates in post-processing. Through extensive experiments on two large-scale ad hoc text retrieval datasets, we demonstrate that reciprocal nearest neighbors can improve the ranking effectiveness of dense retrieval models, both when used for label smoothing and when used for reranking. This indicates that by considering relationships between documents and queries beyond simple geometric distance, we can effectively enhance the ranking context. Comment: EMNLP 202
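
The idea in this abstract, giving non-zero target relevance to candidates that are mutual nearest neighbors of the labeled document, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `mass` parameter, and the rule of splitting the smoothing mass in proportion to raw similarity are all assumptions made for illustration, and similarities are assumed to be positive.

```python
from typing import List, Set


def top_k(sim: List[List[float]], i: int, k: int) -> Set[int]:
    """Indices of the k items most similar to item i (excluding i itself)."""
    others = [j for j in range(len(sim)) if j != i]
    others.sort(key=lambda j: sim[i][j], reverse=True)
    return set(others[:k])


def reciprocal_neighbors(sim: List[List[float]], i: int, k: int) -> Set[int]:
    """Items j such that j is in i's top-k AND i is in j's top-k."""
    return {j for j in top_k(sim, i, k) if i in top_k(sim, j, k)}


def smoothed_targets(sim: List[List[float]], positive: int, k: int,
                     mass: float = 0.2) -> List[float]:
    """Hypothetical smoothing rule: keep (1 - mass) on the labeled positive
    and share `mass` among its reciprocal neighbors by similarity."""
    targets = [0.0] * len(sim)
    neighbors = reciprocal_neighbors(sim, positive, k)
    if not neighbors:
        # No mutual neighbors: fall back to the usual one-hot label.
        targets[positive] = 1.0
        return targets
    total = sum(sim[positive][j] for j in neighbors)  # assumed > 0
    targets[positive] = 1.0 - mass
    for j in neighbors:
        targets[j] = mass * sim[positive][j] / total
    return targets


# Toy similarity matrix: items 0 and 1 are close, items 2 and 3 are close.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
targets = smoothed_targets(sim, positive=0, k=1)
```

Here item 1 receives part of the target mass because it and the labeled item 0 rank each other first; an item that appears in the positive's neighbor list without reciprocating gets nothing.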

    CODER: An efficient framework for improving retrieval through COntextual Document Embedding Reranking

    Contrastive learning has been the dominant approach to training dense retrieval models. In this work, we investigate the impact of ranking context, an often overlooked aspect of learning dense retrieval models. In particular, we examine the effect of its constituent parts: jointly scoring a large number of negatives per query, using retrieved (query-specific) instead of random negatives, and a fully list-wise loss. To incorporate these factors into training, we introduce Contextual Document Embedding Reranking (CODER), a highly efficient retrieval framework. When reranking, it incurs only a negligible computational overhead on top of a first-stage method at run time (delay per query on the order of milliseconds), allowing it to be easily combined with any state-of-the-art dual encoder method. After fine-tuning through CODER, which is a lightweight and fast process, models can also be used as stand-alone retrievers. Evaluating CODER in a large set of experiments on the MS MARCO and TripClick collections, we show that the contextual reranking of precomputed document embeddings leads to a significant improvement in retrieval performance. This improvement becomes even more pronounced when more relevance information per query is available, as shown on the TripClick collection, where we establish new state-of-the-art results by a large margin. Comment: EMNLP 202
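
One of the ingredients named in this abstract, a list-wise loss computed jointly over all candidates for a query, can be sketched as follows. This is a generic softmax cross-entropy over dot-product scores, not CODER itself; the function names and the toy embeddings are assumptions made for illustration.

```python
import math
from typing import List


def dot(u: List[float], v: List[float]) -> float:
    """Dot-product similarity between two embeddings."""
    return sum(a * b for a, b in zip(u, v))


def listwise_loss(query: List[float], candidates: List[List[float]],
                  targets: List[float]) -> float:
    """Cross-entropy between a target relevance distribution and the
    softmax over the scores of ALL candidates for this query."""
    scores = [dot(query, d) for d in candidates]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return -sum(t * (s - log_z) for s, t in zip(scores, targets) if t > 0)


# Toy example: one query scored against two precomputed document embeddings.
query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.0, 1.0]]
loss_good = listwise_loss(query, candidates, [1.0, 0.0])  # label on best match
loss_bad = listwise_loss(query, candidates, [0.0, 1.0])   # label on worse match
```

Because every candidate contributes to the normalizer, adding more retrieved negatives per query changes the loss for all of them at once, which is the sense in which the loss depends on the whole ranking context.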

    Comprehensive comparison and experimental validation of band-structure calculation methods in III–V semiconductor quantum wells

    We present and thoroughly compare band structures computed with density functional theory, tight-binding, k·p and non-parabolic effective mass models. Parameter sets for the non-parabolic Γ, L and X valleys and for the intervalley bandgaps are extracted for bulk InAs, GaAs and InGaAs. We then consider quantum wells with thicknesses ranging from 3 nm to 10 nm, and the dependence of the bandgap on film thickness is compared with experiments for In0.53Ga0.47As quantum wells. The impact of the band structure on the drain current of nanoscale MOSFETs is simulated with ballistic transport models. The results provide a rigorous assessment of III–V semiconductor band-structure calculation methods and calibrated band parameters for device simulations.

    Design and implementation of all-optical integrated systems for multiformat regeneration

    187 pp. All-optical signal regeneration and wavelength conversion are fundamental prerequisites for implementing next-generation optical transmission systems, which are expected to extend the transparency of optical networks and to fully exploit the huge bandwidth that optical fibre offers as a physical transmission medium, thereby making very high data rates feasible. With this in mind, this thesis studies theoretically, and simulates with suitable software, systems for all-optical 2R regeneration of signals in multiple modulation formats (OOK, DPSK, DQPSK) and at two transmission rates (22 Gbaud, 44 Gbaud). The systems are based on a Mach-Zehnder Interferometer (MZI) switch that uses Semiconductor Optical Amplifiers (SOAs) as active components. The simulation layouts are modelled on the optical regenerator chip manufactured by CIP on behalf of the Photonics Communications Research Laboratory (PCRL). The main objective of the thesis was to determine the optimal values of the operating parameters of these systems. For an OOK input signal, these are the values that give the greatest improvement in the Quality factor and the Extinction Ratio of the output signal. For DPSK and DQPSK input signals, they are the values for which the output signal shows the least amplitude jitter and phase variation and therefore, after decoding, a better Quality factor and Extinction Ratio. Operating characteristics of the systems were also determined, such as the range of input-signal Q factor values for which regeneration is observed and the input-signal Q factor value for which regeneration is strongest. Moreover, the thesis examines, both theoretically and through simulations, the possibility of improving the DQPSK regenerator configuration by using a coherent scheme (a 90° optical hybrid) as the first stage of the device to generate the control signals of the SOA-MZIs. Lastly, it presents the results of the experimental evaluation of CIP's chip, which was carried out by the Photonics Communications Research Laboratory.
    Γεώργιος Π. Ζερβέας, Βασίλειος Θ. Κωστέα