Portail HAL de Télécom Paris
    11116 research outputs found

    U-DREAM: Unsupervised Dereverberation guided by a Reverberation Model

    No full text
    International audience. This paper explores the outcome of training state-of-the-art dereverberation models with supervision settings ranging from weakly supervised to virtually unsupervised, relying solely on reverberant signals and an acoustic model for training. Most existing deep learning approaches require paired dry and reverberant data, which are difficult to obtain in practice. We instead develop a sequential learning strategy motivated by a maximum-likelihood formulation of the dereverberation problem, wherein acoustic parameters and dry signals are estimated from reverberant inputs using deep neural networks, guided by a reverberation matching loss. Our most data-efficient variant requires only 100 reverberation-parameter-labeled samples to outperform an unsupervised baseline, demonstrating the effectiveness and practicality of the proposed method in low-resource scenarios.

    Soft-Di[M]O: Improving One-Step Discrete Image Generation with Soft Embeddings

    No full text
    International audience. One-step generators distilled from Masked Diffusion Models (MDMs) compress multiple sampling steps into a single forward pass, enabling efficient text and image synthesis. However, they suffer from two key limitations: they inherit modeling bias from the teacher, and their discrete token outputs block gradient flow, preventing post-distillation refinements such as adversarial training, reward-based fine-tuning, and Test-Time Embedding Optimization (TTEO). In this work, we introduce soft embeddings, a simple relaxation that replaces discrete tokens with the expected embeddings under the generator's output distribution. Soft embeddings preserve representation fidelity for one-step discrete generators while providing a fully differentiable continuous surrogate that is compatible with teacher backbones and tokenizer decoders. Integrating soft embeddings into the Di[M]O distillation framework (denoted Soft-Di[M]O) makes one-step generators end-to-end trainable and enables straightforward application of GAN-based refinement, differentiable reward fine-tuning, and TTEO. Empirically, across multiple MDM teachers (e.g., MaskBit, MaskGen), Soft-Di[M]O achieves state-of-the-art one-step results: improved class-to-image performance, a one-step FID of 1.56 on ImageNet-256 with GAN-based refinement, higher GenEval and HPS scores on text-to-image with reward fine-tuning, and further gains from TTEO.
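To make the soft-embedding idea in the abstract above concrete, here is a minimal sketch in plain Python. It contrasts the discrete path (arg-max token, then embedding lookup) with the relaxed path (expected embedding under the output distribution). The function names and the tiny codebook `TABLE` are hypothetical illustrations, not taken from the Soft-Di[M]O code, and the sketch ignores everything else in the distillation framework.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_embedding(logits, table):
    """Discrete path: pick the arg-max token, then look up its embedding.
    The arg-max is piecewise constant, so no gradient reaches the logits."""
    token = max(range(len(logits)), key=lambda i: logits[i])
    return list(table[token])

def soft_embedding(logits, table):
    """Relaxed path: expected embedding sum_k p_k * table[k].
    Every operation here is differentiable with respect to the logits."""
    probs = softmax(logits)
    dim = len(table[0])
    return [sum(p * row[d] for p, row in zip(probs, table)) for d in range(dim)]

# Hypothetical 3-token codebook with 2-dimensional embeddings.
TABLE = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
```

When the output distribution is sharply peaked, the soft embedding approaches the hard one, which is one way to read the abstract's claim that representation fidelity is preserved.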

    Cloud Agent (SSI wallet) for ClinConNet platform

    No full text

    Des Poids aux Couches : Compression de Réseau Neuronal Profond pour une Inférence Efficace (From Weights to Layers: Deep Neural Network Compression for Efficient Inference)

    No full text
    Deep learning models continue to grow in depth and computational cost, yet modern inference pipelines remain constrained by latency, memory, and energy budgets. This thesis investigates where redundant computation hides in over-parameterized architectures, and how to remove it safely. We first analyze the Sparse Double Descent phenomenon and show how aggressive sparsification can paradoxically enhance generalization. We characterize this behavior and propose regularization and distillation-based approaches supported by an entropy-based metric. Building on this metric, we introduce three families of depth-reduction strategies: entropy-based pruning (EGP, EASIER), BatchNorm-guided layer collapse (TLC), and Optimal Transport–based inductive regularization (LaCoOT). Together, these methods remove up to 70% of network depth across CNNs, Transformers, and diffusion models, often with minimal performance degradation and sometimes even with gains in accuracy. Finally, we extend the notion of redundancy to the operand by proposing FOLDER, a training-free token-pruning module that accelerates multimodal LLMs by up to 2.4 times with preserved or improved performance. Collectively, these contributions advance the understanding of redundancy in deep networks and propose general strategies for improving inference efficiency, paving the way toward more sustainable and adaptive deep learning models.

    RV-Sec5: Enhancing RISC-V Security Evaluation via Targeted ISA-Level Instrumentation using gem5

    No full text
    International audience. The modularity of the RISC-V Instruction Set Architecture (ISA) has accelerated its adoption in security-critical domains, yet it introduces significant challenges for pre-silicon security validation. Current evaluation methods often rely either on high-level emulation, which overlooks microarchitectural side effects, or on post-silicon testing, which identifies vulnerabilities too late in the design cycle. This paper presents RV-Sec5, a systematic framework for ISA-level security evaluation that leverages the gem5 simulator. Unlike standard simulators, RV-Sec5 introduces a methodology to map high-level security invariants (such as privilege isolation and memory protection) directly to automated, cycle-accurate instrumentation points within the ISA decoder. This approach bridges the semantic gap between abstract security policies and low-level hardware execution. We demonstrate the framework's efficacy through a case study involving unauthorized Control and Status Register (CSR) modifications, showing how RV-Sec5 detects privilege escalation attempts and monitors microarchitectural anomalies, such as TLB flushes and cache state changes, in real time.
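As an illustration of the kind of privilege-isolation invariant the abstract above mentions, the following sketch checks CSR accesses against the address convention of the RISC-V privileged specification, where bits [9:8] of a CSR address give the lowest privilege level allowed to access it and bits [11:10] equal to `0b11` mark it read-only. This is not RV-Sec5's instrumentation and does not use gem5; the function names and the `(csr_addr, priv, is_write)` trace format are assumptions made for the example. Real hardware applies further checks that are omitted here.

```python
# Privilege levels as encoded by the RISC-V privileged specification.
USER, SUPERVISOR, MACHINE = 0b00, 0b01, 0b11

def csr_min_privilege(csr_addr):
    """Bits [9:8] of the 12-bit CSR address: lowest privilege allowed."""
    return (csr_addr >> 8) & 0b11

def csr_is_read_only(csr_addr):
    """Bits [11:10] equal to 0b11 mark a read-only CSR."""
    return (csr_addr >> 10) & 0b11 == 0b11

def csr_access_allowed(csr_addr, current_priv, is_write):
    """Address-based check only; other architectural checks are ignored."""
    if current_priv < csr_min_privilege(csr_addr):
        return False
    if is_write and csr_is_read_only(csr_addr):
        return False
    return True

def audit(trace):
    """Return the (csr_addr, priv, is_write) events that violate the check.
    The trace format is hypothetical, chosen for this example."""
    return [event for event in trace if not csr_access_allowed(*event)]

MSTATUS, CYCLE = 0x300, 0xC00  # standard CSR addresses
```

For example, `audit([(MSTATUS, USER, True)])` flags a user-mode write to `mstatus`, the sort of unauthorized CSR modification the case study describes.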

    Walking lanes / walking lines: Bodily alignments and passing through doorways

    No full text
    International audience. Very few studies analyse the role of artefacts in shaping joint locomotion in public places. By video-recording pedestrians passing through doorways in a mall, we have observed how openings and doors contribute to mobile formations such as walking lanes or files. Doors play a major part as a focus for common direction. Doors occasion a modification of speed and a re-arrangement of spatial proximity between pedestrians during the process of passing through. We argue that mobile formats such as walking together in public places are based on culturally-methodic dynamics of bodily orientation to others. They are also based on a conjoint orientation to apertures that afford entry spaces to doors through which pedestrians wish to pass. Physical-artefactual boundaries such as doors, sidewalks and lanes play a major role in shaping joint locomotion. We focus on a particular case of locomotion driven by artefacts: passing through doors, shaped by serial arrangements of pedestrians in a following/followed format. We treat this case of mobile formation as a specific, genuine form of aggregate in its own right, distinct from side-by-side walking and other forms of mobile file.

    The Hi-Audio Online Platform for Recording and Distributing Multi-Track Music Datasets

    No full text
    International audience. This paper introduces the Hi-Audio online platform, an open-source tool designed to support musicians and researchers in the field of Music Information Retrieval (MIR). The platform enables the recording, uploading, and sharing of multitrack musical compositions, aiming to build an open-access audio database to advance research in music technology. Uploaded audio files are automatically analyzed upon synchronization with the server, leveraging signal processing techniques and machine learning models to generate rich metadata. The platform facilitates remote and asynchronous collaboration via a web-based interface accessible at hiaudio.fr. Furthermore, a novel built-in method for accurate and robust round-trip latency estimation in the browser is proposed and integrated into the platform, demonstrating its applicability in real-world distributed recording scenarios. Finally, an initial user evaluation with musicians was conducted to assess usability and practical relevance under realistic usage conditions. The evaluation combined task-based performance analysis with standardized usability and workload measures. The results indicate high task completion rates for core recording functions and show that the platform can be used effectively by musicians with minimal prior training.
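The abstract above does not describe how the platform estimates round-trip latency, so the sketch below shows only a generic textbook approach and should not be read as the Hi-Audio method. It plays a known probe signal, records it back, and takes the lag that maximizes the cross-correlation between the two. All names and the pseudo-random probe are assumptions for illustration.

```python
import random

def make_probe(length, seed=0):
    """Pseudo-random +/-1 probe; its sharp autocorrelation peak makes
    the alignment unambiguous."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def estimate_delay(reference, recorded, max_lag):
    """Return the lag in samples (0..max_lag) that maximizes the
    cross-correlation between the probe and the recording."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(
            ref * recorded[i + lag]
            for i, ref in enumerate(reference)
            if i + lag < len(recorded)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def samples_to_ms(lag, sample_rate_hz):
    """Convert a lag in samples to milliseconds."""
    return 1000.0 * lag / sample_rate_hz
```

A real recording also contains noise, room response, and clock drift, so a robust estimator needs more than this brute-force peak search.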

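    Du réseau des Micro-Folies à un potentiel « patrimoine en réseau » : Discussion d’une initiative étatique par une approche multiscalaire (From the Micro-Folies network to a potential "networked heritage": discussion of a state initiative through a multiscalar approach)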

    No full text
    International audience

    Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation

    No full text
    While music remains a challenging domain for generative models like Transformers, a two-pronged approach has recently proved successful: inserting musically relevant structural information into the positional encoding (PE) module, and using kernel approximation techniques based on Random Fourier Features (RFF) to lower the computational cost from quadratic to linear. Yet it is not clear how such RFF-based efficient PEs compare with those based on rotation matrices, such as Rotary Positional Encoding (RoPE). In this paper, we present a unified framework based on kernel methods to analyze both families of efficient PEs. We use this framework to develop a novel PE method called RoPEPool, capable of extracting causal relationships from temporal sequences. Using RFF-based PEs and rotation-based PEs, we demonstrate how seemingly disparate PEs can be jointly studied by considering the interactions they induce between two descriptive levels of the data: the input, capturing quickly varying components, and the prior, capturing slowly varying components. For empirical validation, we use a symbolic music generation task, namely melody harmonization. We show that RoPEPool, combined with highly informative structural priors, outperforms all methods.
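The kernel-approximation step mentioned in the abstract above can be illustrated with the classic Random Fourier Features construction for a Gaussian kernel. The abstract does not say which kernel the paper uses, so the unit-bandwidth Gaussian, the feature count, and all function names here are assumptions, and the sketch does not implement RoPEPool or any positional encoding.

```python
import math
import random

def make_rff(dim, num_features, seed=0):
    """Build a feature map phi with phi(x) . phi(y) ~ exp(-||x - y||^2 / 2).
    Frequencies are drawn from N(0, I) and phases from U(0, 2*pi)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
               for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, phases)
        ]

    return features

def gaussian_kernel(x, y):
    """Exact unit-bandwidth Gaussian kernel, for comparison."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))
```

Because the kernel becomes an inner product of feature vectors, sums over all keys in an attention layer can be computed once and reused for every query. That reuse is the source of the quadratic-to-linear cost reduction.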

    Conséquences de l’entrée en application du Data Act dans le secteur automobile (Consequences of the Data Act's entry into application in the automotive sector)

    No full text
    International audience. This article is part of a special report on legal and cybersecurity issues in the automotive sector. It introduces the main changes brought about by the entry into application of the European Data Act, taking a sector-specific approach. It presents the specific consequences for the automotive sector, the steps required to achieve compliance, the regulation's interaction with other sectoral rules such as UN R156 or the ITS Directive, and the European Commission's recent guidelines on applying the regulation to data from connected vehicles.

    0 full texts
    7,567 metadata records
    Updated in last 30 days.
    Portail HAL de Télécom Paris is based in France