45 research outputs found

    Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs

    Most NLP tasks are modeled as supervised learning and thus require labeled training data to train effective models. However, manually producing such data at sufficient quality and quantity is known to be costly and time-intensive. Current research addresses this bottleneck by exploring a novel paradigm called zero-shot learning via dataset generation. Here, a powerful LLM is prompted with a task description to generate labeled data that can be used to train a downstream NLP model. For instance, an LLM might be prompted to "generate 500 movie reviews with positive overall sentiment, and another 500 with negative sentiment." The generated data could then be used to train a binary sentiment classifier, effectively leveraging an LLM as a teacher to a smaller student model. With this demo, we introduce Fabricator, an open-source Python toolkit for dataset generation. Fabricator implements common dataset generation workflows, supports a wide range of downstream NLP tasks (such as text classification, question answering, and entity recognition), and is integrated with well-known libraries to facilitate quick experimentation. With Fabricator, we aim to support researchers in conducting reproducible dataset generation experiments using LLMs and help practitioners apply this approach to train models for downstream tasks. Comment: 3 figures and 2 tables

    Analysis of riboflavin/ultraviolet a corneal cross-linking by molecular spectroscopy

    Corneal cross-linking (CXL) with riboflavin and ultraviolet A light is a therapeutic procedure to restore the mechanical stability of corneal tissue. The treatment is applied to pathological tissue, such as keratoconus, and induces the formation of new cross-links. At present, the molecular mechanisms of induced cross-linking are still not known exactly. In this study, we investigated molecular alterations within porcine cornea tissue after treatment with riboflavin and ultraviolet A light by surface-enhanced Raman spectroscopy (SERS). For that purpose, after CXL treatment, a thin silver layer was vapor-deposited onto cornea flaps. To explore molecular alterations induced by the photochemical process, hierarchical cluster analysis (HCA) was used. The detailed analysis of SERS spectra reveals that there is no general change in collagen secondary structure, while modifications on amino acid side chains are the most dominant outcome. The formation of secondary and aromatic amine groups as well as methylene and carbonyl groups was observed. Even though successful cross-linking could not be registered in all treated samples, Raman signals of newly formed chemical groups are already present in riboflavin-only treated corneas.

    Imaging the tympanic membrane oscillation ex vivo with Doppler optical coherence tomography during simulated Eustachian catarrh

    Recently, optical coherence tomography (OCT) was utilized in multiple studies for structural and functional imaging of the middle ear and the tympanic membrane. Since Doppler OCT allows both spatially resolved measurement of the tympanic membrane oscillation and high-resolution imaging, it is regarded as a promising tool for future in vivo applications. In this study, Doppler OCT is utilized for the visualization of the tympanic membrane oscillation in temporal bones with simulated Eustachian catarrh, which was realized by generating a depression in the tympanic cavity. The transfer function, meaning the oscillation amplitude normalized to the applied sound pressure, is measured in a frequency-resolved manner in the range from 0.5 kHz to 6 kHz and with a lateral spatial resolution of 0.4 mm. Typical oscillation patterns could be observed at ambient pressure in the tympanic cavity. Under depression, the characteristic oscillation patterns were observed with largely congruent appearance but at higher frequencies.
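    The transfer function described in this abstract (oscillation amplitude divided by applied sound pressure, evaluated per frequency) can be illustrated with a minimal sketch. The function name, the units, and the sample values below are assumptions made for illustration only; they are not taken from the study.

```python
def transfer_function(amplitudes, pressures):
    """Normalize each oscillation amplitude to the sound pressure applied
    at the same frequency. Units are hypothetical (e.g. nm and Pa)."""
    if len(amplitudes) != len(pressures):
        raise ValueError("amplitudes and pressures must have equal length")
    return [a / p for a, p in zip(amplitudes, pressures)]


# Hypothetical measurement at three frequencies (kHz); values are made up.
frequencies_khz = [0.5, 1.0, 6.0]
amplitudes_nm = [10.0, 20.0, 3.0]
pressures_pa = [2.0, 4.0, 1.5]

for f, h in zip(frequencies_khz, transfer_function(amplitudes_nm, pressures_pa)):
    print(f"{f} kHz: {h} nm/Pa")
```

    Normalizing per frequency makes measurements taken at different sound pressure levels comparable, which is what allows oscillation patterns at ambient pressure and under depression to be compared.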

    Core–shell bioprinting as a strategy to apply differentiation factors in a spatially defined manner inside osteochondral tissue substitutes

    One of the key challenges in osteochondral tissue engineering is to define specified zones with varying material properties, cell types and biochemical factors supporting locally adjusted differentiation into the osteogenic and chondrogenic lineage, respectively. Herein, extrusion-based core–shell bioprinting is introduced as a potent tool allowing a spatially defined delivery of cell types and differentiation factors TGF-β3 and BMP-2 in separated compartments of hydrogel strands, and, therefore, a local supply of matching factors for chondrocytes and osteoblasts. Ink development was based on blends of alginate and methylcellulose, in combination with varying concentrations of the nanoclay Laponite whose high affinity binding capacity for various molecules was exploited. Release kinetics of model molecules was successfully tuned by Laponite addition. Core–shell bioprinting was proven to generate well-oriented compartments within one strand as monitored by optical coherence tomography in a non-invasive manner. Chondrocytes and osteoblasts were applied each in the shell while the respective differentiation factors (TGF-β3, BMP-2) were provided by a Laponite-supported core serving as central factor depot within the strand, allowing directed differentiation of cells in close contact to the core. Experiments with bi-zonal constructs, comprising an osteogenic and a chondrogenic zone, revealed that the local delivery of the factors from the core reduces effects of these factors on the cells in the other scaffold zone. These observations prove the general suitability of the suggested system for co-differentiation of different cell types within a zonal construct.

    Non-rigid Point Cloud Registration for Middle Ear Diagnostics with Endoscopic Optical Coherence Tomography

    Purpose: Middle ear infection is the most prevalent inflammatory disease, especially among the pediatric population. Current diagnostic methods are subjective and depend on visual cues from an otoscope, which limits otologists' ability to identify pathology. To address this shortcoming, endoscopic optical coherence tomography (OCT) provides both morphological and functional in-vivo measurements of the middle ear. However, due to the shadow of prior structures, interpretation of OCT images is challenging and time-consuming. To facilitate fast diagnosis and measurement, the readability of OCT data is improved by merging morphological knowledge from ex-vivo middle ear models with OCT volumetric data, so that OCT applications can be further promoted in daily clinical settings. Methods: We propose C2P-Net: a two-staged non-rigid registration pipeline for complete to partial point clouds, which are sampled from ex-vivo and in-vivo OCT models, respectively. To overcome the lack of labeled training data, a fast and effective generation pipeline in Blender3D is designed to simulate middle ear shapes and extract in-vivo noisy and partial point clouds. Results: We evaluate the performance of C2P-Net through experiments on both synthetic and real OCT datasets. The results demonstrate that C2P-Net generalizes to unseen middle ear point clouds and is capable of handling realistic noise and incompleteness in synthetic and real OCT data. Conclusion: In this work, we aim to enable diagnosis of middle ear structures with the assistance of OCT images. We propose C2P-Net: a two-staged non-rigid registration pipeline for point clouds to support the interpretation of in-vivo noisy and partial OCT images for the first time. Code is available at: https://gitlab.com/nct_tso_public/c2p-net

    In vivo imaging of human oral hard and soft tissues by polarization-sensitive optical coherence tomography

    Since optical coherence tomography (OCT) provides three-dimensional high-resolution images of biological tissue, the benefit of polarization contrast in the field of dentistry is highlighted in this study. Polarization-sensitive OCT (PS OCT) with phase-sensitive recording is used for imaging dental and mucosal tissues in the human oral cavity in vivo. An enhanced polarization contrast of oral structures is reached by analyzing the signals of the co- and cross-polarized channels of the swept source PS OCT system quantitatively with respect to reflectivity, retardation, optic axis orientation, and depolarization. The calculation of these polarization parameters enables a high tissue-specific contrast imaging for the detailed physical interpretation of human oral hard and soft tissues. For the proof-of-principle, imaging of composite restorations and mineralization defects at premolars as well as gingival, lingual, and labial oral mucosa was performed in vivo within the anterior oral cavity. The achieved contrast-enhanced results of the investigated human oral tissues by means of polarization-sensitive imaging are evaluated by the comparison with conventional intensity-based OCT.

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

    Polarization sensitive optical coherence tomography utilizing a buffered swept source laser

    We present an approach for polarization sensitive optical coherence tomography (PS-OCT) that solely requires a modification of the light source, a buffered swept source laser. For this purpose, a single-mode fiber-based Fourier domain mode locked laser is extended by fourfold buffering with manual fiber polarization controllers to emit alternating sweep polarizations, while the polarization contrast calibration is realized by a high-speed polarimeter. As the introduced setup utilizes standard scanning and detection units, the proposed method is a promising way to enhance various swept source OCT systems by polarization sensitive imaging. Preliminary measurements of a human fingernail with different polarization contrasts demonstrate the feasibility of the concept.

    Large-Scale Label Interpretation Learning for Few-Shot Named Entity Recognition

    Few-shot named entity recognition (NER) detects named entities within text using only a few annotated examples. One promising line of research is to leverage natural language descriptions of each entity type: the common label PER might, for example, be verbalized as "person entity." In an initial label interpretation learning phase, the model learns to interpret such verbalized descriptions of entity types. In a subsequent few-shot tagset extension phase, this model is then given a description of a previously unseen entity type (such as "music album") and optionally a few training examples to perform few-shot NER for this type. In this paper, we systematically explore the impact of a strong semantic prior to interpret verbalizations of new entity types by massively scaling up the number and granularity of entity types used for label interpretation learning. To this end, we leverage an entity linking benchmark to create a dataset with orders of magnitude more distinct entity types and descriptions than currently used datasets. We find that this increased signal yields strong results in zero- and few-shot NER in in-domain, cross-domain, and even cross-lingual settings. Our findings indicate significant potential for improving few-shot NER through heuristical data-based optimization. Comment: 8 pages

    Differentiation of Occlusal Discolorations and Carious Lesions with Hyperspectral Imaging In Vitro

    Stains and stained incipient lesions can be challenging to differentiate with established clinical tools. New diagnostic techniques are required for improved distinction to enable early noninvasive treatment. This in vitro study evaluates the performance of artificial intelligence (AI)-based classification of hyperspectral imaging data for early occlusal lesion detection and differentiation from stains. Sixty-five extracted permanent human maxillary and mandibular bicuspids and molars (International Caries Detection and Assessment System [ICDAS] II 0–4) were imaged with a hyperspectral camera (Diaspective Vision TIVITA® Tissue, Diaspective Vision, Pepelow, Germany) at a distance of 350 mm, acquiring spatial and spectral information in the wavelength range 505–1000 nm; 650 fissural spectra were used to train classification algorithms (models) for automated distinction between stained but sound enamel and stained lesions. Stratified 10-fold cross-validation was used. The model with the highest classification performance, a fine k-nearest neighbor classification algorithm, was used to classify five additional tooth fissural areas. Polarization microscopy of ground sections served as reference. Compared to stained lesions, stained intact enamel showed higher reflectance in the wavelength range 525–710 nm but lower reflectance in the wavelength range 710–1000 nm. A fine k-nearest neighbor classification algorithm achieved the highest performance with a Matthews correlation coefficient (MCC) of 0.75, a sensitivity of 0.95 and a specificity of 0.80 when distinguishing between intact stained and stained lesion spectra. The superposition of color-coded classification results on further tooth occlusal projections enabled qualitative assessment of the entire fissure’s enamel health. AI-based evaluation of hyperspectral images is highly promising as a complementary method to visual and radiographic examination for early occlusal lesion detection.
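    For readers unfamiliar with the metrics reported in this abstract, the sketch below shows how sensitivity, specificity, and the Matthews correlation coefficient (MCC) follow from a binary confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; the study's actual confusion matrix and class balance are not given in the abstract.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts assuming balanced classes; NOT data from the study.
sens, spec, mcc = binary_metrics(tp=95, fn=5, tn=80, fp=20)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.2f}")
```

    Unlike sensitivity or specificity alone, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are of unequal size.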