8,623 research outputs found

    Sunyaev-Zel'dovich clusters reconstruction in multiband bolometer camera surveys

    We present a new method for the reconstruction of Sunyaev-Zel'dovich (SZ) galaxy clusters in future SZ-survey experiments using multiband bolometer cameras such as Olimpo, APEX, or Planck. Our goal is to optimise SZ cluster extraction from the observed noisy maps. We emphasize that none of the algorithms used in the detection chain is tuned on prior knowledge of the SZ cluster signal or of other astrophysical sources (Optical Spectrum, Noise Covariance Matrix, or covariance of SZ Cluster wavelet coefficients). First, a blind separation of the different astrophysical components that contribute to the observations is conducted using an Independent Component Analysis (ICA) method. Then, a recent nonlinear filtering technique in the wavelet domain, based on multiscale entropy and the False Discovery Rate (FDR) method, is used to detect and reconstruct the galaxy clusters. Finally, we use the Source Extractor software to identify the detected clusters. The proposed method was applied to realistic simulations of observations. In terms of global detection efficiency, the new method gives results comparable to those of the Pierpaoli et al. method while remaining a blind algorithm. A preprint with full-resolution figures is available at the URL: w10-dapnia.saclay.cea.fr/Phocea/Vie_des_labos/Ast/ast_visu.php?id_ast=728 Comment: Submitted to A&A. 32 Pages, text onl
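
The abstract above names the False Discovery Rate (FDR) method as part of its detection step without spelling it out. As a rough illustration only, the sketch below applies the standard Benjamini-Hochberg procedure to a plain list of p-values. It is not the authors' wavelet-domain filter; the function name `benjamini_hochberg`, the target rate `q`, and the sample p-values are all assumptions made for this example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True where the null hypothesis is rejected.

    Standard Benjamini-Hochberg step-up rule: sort the p-values, find the
    largest rank k with p_(k) <= k * q / m, and reject the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            cutoff = rank  # keep the largest rank that passes
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


# Hypothetical p-values, e.g. one per candidate coefficient.
example_p = [0.6, 0.001, 0.041, 0.008, 0.039]
example_detections = benjamini_hochberg(example_p, q=0.05)
```

With these made-up values only the two smallest p-values pass the threshold, so two of the five candidates are kept as detections.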

    Estimating the number of components of a multicomponent nonstationary signal using the short-term time-frequency Rényi entropy

    This article proposes a method for estimating the local number of signal components using the short-term Rényi entropy of signals in the time-frequency plane. (Additional details can be found in the comprehensive book on Time-Frequency Signal Analysis and Processing; see http://www.elsevier.com/locate/isbn/0080443354. In addition, the most recent upgrade of the original software package that calculates Time-Frequency Distributions and Instantaneous Frequency estimators can be downloaded from the web site: www.time-frequency.net. This was the first software developed in the field, and it was first released publicly in 1987 at the 1st ISSPA conference held in Brisbane, Australia, and then continuously updated.) The time-frequency Rényi entropy provides a measure of the complexity of a nonstationary multicomponent signal in the time-frequency plane. When the complexity of a signal corresponds to the number of its components, this information is measured as the Rényi entropy of the time-frequency distribution (TFD) of the signal. This article presents a solution to the problem of detecting the number of components that are present in a short time interval of the signal TFD, using the short-term Rényi entropy. The method is automatic and does not require any prior information about the signal. The algorithm is applied to both synthetic and real data, using a quadratic separable kernel TFD. The results confirm that the short-term Rényi entropy can be an effective tool for estimating the local number of components present in the signal. The key aspect of selecting a suitable TFD is also discussed
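
The counting idea in the abstract above can be illustrated with a small sketch. It rests on a known property of the Rényi entropy: adding a disjoint, equal-energy copy of a component raises the entropy by one bit, so comparing a signal's entropy with that of a one-component reference gives a component count. Everything here is an assumption for illustration, including the entropy order `alpha=3`, the reference-signal comparison, the helper names, and the idealised binary TFD; the paper's actual algorithm and kernel are not reproduced.

```python
import math


def renyi_entropy(tfd, alpha=3):
    """Rényi entropy in bits of a non-negative 2-D array (list of rows).

    Assumes the array has non-zero total energy.
    """
    total = sum(sum(row) for row in tfd)
    s = sum((v / total) ** alpha for row in tfd for v in row)
    return math.log2(s) / (1 - alpha)


def estimate_components(tfd, reference_tfd, alpha=3):
    """Component count relative to a one-component reference TFD."""
    return 2 ** (renyi_entropy(tfd, alpha) - renyi_entropy(reference_tfd, alpha))


def local_component_counts(tfd, reference_tfd, width, alpha=3):
    """Apply the estimate inside a sliding time window of `width` columns."""
    n_time = len(tfd[0])
    counts = []
    for start in range(n_time - width + 1):
        window = [row[start:start + width] for row in tfd]
        ref_window = [row[start:start + width] for row in reference_tfd]
        counts.append(estimate_components(window, ref_window, alpha))
    return counts


def toy_tfd(ridges, n_freq=16, n_time=8):
    """Idealised TFD: unit energy on each ridge {freq_row: (t_start, t_stop)}."""
    tfd = [[0.0] * n_time for _ in range(n_freq)]
    for row, (start, stop) in ridges.items():
        for t in range(start, stop):
            tfd[row][t] = 1.0
    return tfd


reference = toy_tfd({4: (0, 8)})              # one component, always present
signal = toy_tfd({4: (0, 8), 11: (4, 8)})     # second component starts at t = 4
global_count = estimate_components(signal, reference)
local_counts = local_component_counts(signal, reference, width=2)
```

On this toy input the global estimate is about 1.5, which matches neither half of the signal, while the windowed estimate moves from 1 at the start to 2 at the end. That gap is the motivation for a short-term measure.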

    Rate-Distortion Classification for Self-Tuning IoT Networks

    Many future wireless sensor networks and the Internet of Things are expected to follow a software-defined paradigm, where protocol parameters and behaviors will be dynamically tuned as a function of the signal statistics. New protocols will then be injected as software when certain events occur. For instance, new data compressors could be (re)programmed on-the-fly as the monitored signal type or its statistical properties change. We consider a lossy compression scenario, where the application tolerates some distortion of the gathered signal in return for improved energy efficiency. To reap the full benefits of this paradigm, we discuss an automatic sensor profiling approach where the signal class, and in particular the corresponding rate-distortion curve, is automatically assessed using machine learning tools (namely, support vector machines and neural networks). We show that this curve can be reliably estimated on-the-fly through the computation of a small number (from ten to twenty) of statistical features on time windows of a few hundred samples
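
For readers unfamiliar with the term, a rate-distortion curve records how many bits a lossy compressor spends against how much error it introduces. The sketch below traces three points of such a curve empirically. It is only an illustration under stated assumptions: the compressor is a plain uniform quantizer, rate is approximated by the empirical entropy of the quantizer indices, distortion is mean squared error, and the sine-wave input is synthetic. None of this is the paper's compressor set or its feature-based estimator.

```python
import math
from collections import Counter


def empirical_entropy(symbols):
    """Empirical entropy in bits per symbol of a sequence of hashable symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def rate_distortion_point(samples, step):
    """Quantize with the given step; return (rate in bits/sample, MSE)."""
    indices = [round(x / step) for x in samples]
    reconstructed = [i * step for i in indices]
    mse = sum((x - r) ** 2 for x, r in zip(samples, reconstructed)) / len(samples)
    return empirical_entropy(indices), mse


# Synthetic window of a few hundred samples (hypothetical sensor signal).
window = [math.sin(2 * math.pi * k / 50) for k in range(200)]
curve = [rate_distortion_point(window, step) for step in (0.05, 0.2, 0.8)]
```

Coarser steps lower the rate and raise the distortion, so the three points trace a decreasing curve. The abstract's proposal is to predict this curve from a few statistical features rather than measure it by running compressors as done here.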

    Spectral Sequence Motif Discovery

    Sequence discovery tools play a central role in several fields of computational biology. In the framework of Transcription Factor binding studies, motif finding algorithms of increasingly high performance are required to process the big datasets produced by new high-throughput sequencing technologies. Most existing algorithms are computationally demanding and often cannot support the large size of new experimental data. We present a new motif discovery algorithm that is built on a recent machine learning technique, referred to as the Method of Moments. Based on spectral decompositions, this method is robust under model misspecification and is not prone to locally optimal solutions. We obtain an algorithm that is extremely fast and designed for the analysis of big sequencing data. In a few minutes, we can process datasets of hundreds of thousands of sequences and extract motif profiles that match those computed by various state-of-the-art algorithms. Comment: 20 pages, 3 figures, 1 tabl

    Ridge detection for nonstationary multicomponent signals with time-varying wave-shape functions and its applications

    We introduce a novel ridge detection algorithm for time-frequency (TF) analysis, particularly tailored for intricate nonstationary time series encompassing multiple non-sinusoidal oscillatory components. The algorithm is rooted in the distinctive geometric patterns that emerge in the TF domain due to such non-sinusoidal oscillations. We term this method shape-adaptive mode decomposition-based multiple harmonic ridge detection (SAMD-MHRD). A swift implementation is available when supplementary information is at hand. We demonstrate the practical utility of SAMD-MHRD through its application to a real-world challenge. We employ it to devise a cutting-edge walking activity detection algorithm, leveraging accelerometer signals from an inertial measurement unit across diverse body locations of a moving subject

    Machine Learning and Alternative Data Analytics for Fashion Finance

    This dissertation investigates the application of Machine Learning, Natural Language Processing and computational finance to a novel area, Fashion Finance, and specifically to identifying investment opportunities within the Apparel industry using influential alternative data sources such as Instagram. Fashion investment is challenging due to the ephemeral nature of the industry and the difficulty faced by investors who lack an understanding of how to analyze trend-driven consumer brands. Unstructured online data (e-commerce stores, social media, online blogs, news, etc.) introduce new opportunities for extracting investment signals. We focus on how trading signals can be generated from Instagram data and from events reported in news articles. Part of this research work was done in collaboration with Arabesque Asset Management. Farfetch, the online luxury retailer, and Living Bridge Private Equity provided industry advice.

    Research Datasets
    The datasets used for this research are collected from various sources and include the following types of data:
    - Financial data: daily stock prices of 50 U.S. and European Apparel and Footwear equities, daily U.S. Retail Trade and U.S. Consumer Non-Durables sector indices, Form 10-K reports.
    - Instagram data: daily Instagram profile followers for 11 fashion companies.
    - News data: 0.5 million news articles that mention the selected 50 equities.

    Research Experiments
    The thesis consists of the following studies:
    1. Relationship between Instagram Popularity and Stock Prices. This study investigates a link between the changes in a company's popularity (daily follower counts) on Instagram and its stock price and revenue movements. We use cross-correlation analysis to find whether the signals derived from the followers' data could help to infer a company's future financial performance. Two hypothetical trading strategies are designed to test whether the changes in a company's Instagram popularity could improve the returns. To test the hypotheses, the Wilcoxon signed-rank test is used.
    2. Dynamic Density-based News Clustering. The aim of this study is twofold: 1) analyse the characteristics of relevant news event articles and how they differ from the noisy/irrelevant news; 2) using the insights, design an unsupervised framework that clusters news articles and identifies event clusters without predefined parameters or expert knowledge. The framework incorporates the density-based clustering algorithm DBSCAN, where the clustering parameters are selected dynamically with a Gaussian Mixture Model and by maximizing the inter-cluster Information Entropy.
    3. ALGA: Automatic Logic Gate Annotator for Event Detection. We design a news classification model for detecting fashion events that are likely to impact a company's stock price. The articles are represented by the following text embeddings: TF-IDF, Doc2Vec and BERT (Transformer Neural Network). The study comprises two parts: 1) we design a domain-specific automatic news labelling framework, ALGA. The framework incorporates topic extraction (Latent Dirichlet Allocation) and clustering (DBSCAN) algorithms in addition to other filters to annotate the dataset; 2) using the labelled dataset, we train a Logistic Regression classifier for identifying financially relevant news. The model shows state-of-the-art results in the domain-specific financial event detection problem.

    Contribution to Science
    This research work presents the following contributions to science:
    - Original work applying Machine Learning and Natural Language Processing to the analysis of alternative data on ephemeral fashion assets.
    - New metrics to measure and track a fashion brand's popularity for investment decision making.
    - A dynamic news events clustering framework that finds event clusters of various sizes in news articles without predefined parameters.
    - The original Automatic Logic Gate Annotator framework (ALGA) for automatic labelling of news articles for the financial event detection task.
    - An Apparel and Footwear news events classifier, built on the datasets generated by the ALGA framework, that shows state-of-the-art performance in a domain-specific financial event detection task.
    - The Fashion Finance Dictionary, which contains 320 phrases related to various financially relevant events in the Apparel and Footwear industry
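
The first study above relies on cross-correlation analysis between follower changes and financial series. A minimal sketch of that kind of analysis follows, assuming a simple lagged Pearson correlation. The helper names and both series are hypothetical: the data is synthetic and built so that the second series repeats the first with a two-day delay, and nothing here comes from the thesis datasets.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlation(leader, follower, max_lag):
    """Correlate leader[t] with follower[t + lag] for lag = 0..max_lag."""
    n = len(leader)
    return {
        lag: pearson(leader[:n - lag], follower[lag:])
        for lag in range(max_lag + 1)
    }


# Synthetic daily changes: the "price" series echoes the "followers" series
# two days later (the first two price values are arbitrary filler).
follower_changes = [1, 3, 2, 5, 4, 6, 3, 7, 5, 8]
price_changes = [0, 0] + follower_changes[:-2]

correlations = lagged_correlation(follower_changes, price_changes, max_lag=3)
best_lag = max(correlations, key=correlations.get)
```

On this constructed data the correlation peaks at a lag of two days. With real series the peak would be much weaker, and a significance test such as the Wilcoxon signed-rank test mentioned in the abstract would still be needed.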

    A signal complexity-based approach for AM–FM signal modes counting

    Frequency modulated signals appear in many applied disciplines, including geology, communication, biology and acoustics. They are naturally multicomponent, i.e., they consist of multiple waveforms, each with a specific time-dependent frequency (instantaneous frequency). In most practical applications, the number of modes, which is unknown, is needed for correctly analyzing a signal; for instance, for separating each individual component and for estimating its instantaneous frequency. Detecting the number of components is a challenging problem, especially in the case of interfering modes. The Rényi entropy-based approach has proven to be suitable for counting signal modes, but it is limited to well-separated components. This paper addresses this issue by introducing a new notion of signal complexity. Specifically, the spectrogram of a multicomponent signal is seen as a non-stationary process where interference alternates with non-interference. Complexity concerning the transition between consecutive spectrogram sections is evaluated by means of a modified Run Length Encoding. Based on a spectrogram time-frequency evolution law, complexity variations are studied for accurately estimating the number of components. The presented method is suitable for multicomponent signals with non-separable modes, as well as time-varying amplitudes, and shows robustness to noise
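
To make the Run Length Encoding step more concrete, the sketch below applies plain RLE to a thresholded spectrogram section, represented as a list of 0/1 values along the frequency axis, and counts the runs of ones. This is ordinary RLE, not the modified version the abstract refers to. The function names and the two example sections are hypothetical.

```python
def run_length_encode(bits):
    """Plain run-length encoding: list of (value, run_length) pairs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1      # extend the current run
        else:
            runs.append([b, 1])   # start a new run
    return [(value, length) for value, length in runs]


def count_active_runs(bits):
    """Number of runs of ones, i.e. separate energy bands in the section."""
    return sum(1 for value, _ in run_length_encode(bits) if value == 1)


# Hypothetical thresholded spectrogram sections (frequency bins, low to high).
separated_section = [0, 1, 1, 0, 0, 0, 1, 1, 0]    # two distinct modes
interfering_section = [0, 0, 1, 1, 1, 1, 0, 0, 0]  # modes overlap into one band

separated_runs = count_active_runs(separated_section)
interfering_runs = count_active_runs(interfering_section)
```

The separated section yields two runs and the interfering one yields a single run. The change in run structure between consecutive sections is the kind of transition that a complexity measure of this type could track.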