17 research outputs found

    Astro-COLIBRI: An Advanced Platform for Real-Time Multi-Messenger Astrophysics

    Observations of transient phenomena like Gamma-Ray Bursts (GRBs), Fast Radio Bursts (FRBs), stellar flares and explosions (novae and supernovae), combined with the detection of novel cosmic messengers like high-energy neutrinos and gravitational waves, have revolutionized astrophysics in recent years. The discovery potential of both multi-messenger (MM) and multi-wavelength (MWL) follow-up observations, as well as serendipitous observations, could be maximized with a tool that quickly provides an overview of the relevant information associated with each new detection. Here we present Astro-COLIBRI, a novel and comprehensive platform for this challenge. Astro-COLIBRI's architecture comprises a public RESTful API, real-time databases, a cloud-based alert system and a website, as well as apps for iOS and Android as clients for users. Astro-COLIBRI evaluates incoming messages of astronomical observations from all available alert streams in real time, filters them by user-specified criteria and puts them into their MWL and MM context. The clients provide a graphical representation with an easy-to-grasp summary of the relevant data to allow for the fast identification of interesting phenomena, provide an assessment of observing conditions at a large selection of observatories around the world, and much more. Here the key features of Astro-COLIBRI are presented. We outline the architecture, summarize the data resources used, and provide examples of applications and use cases. Focussing on the high-energy domain, we'll discuss the use of the platform in searches for high-energy gamma-ray counterparts to high-energy neutrinos, gamma-ray bursts and gravitational waves. Comment: Proceedings of the 38th International Cosmic Ray Conference (ICRC2023)

    Étude de méthodes d'augmentation de données pour la reconnaissance d'entités nommées en astrophysique

    In this paper, we investigate the effectiveness of data augmentation for named entity recognition in astrophysics. To this end, we compare three augmentation methods using two recent annotated corpora in the domain: DEAL and TDAC, both in English. We generated artificial data using rule-based and language model-based approaches. The data was then iteratively added to fine-tune an entity detection system. The results show a threshold effect: adding artificial data beyond a certain quantity is no longer beneficial and can decrease the F-measure. On both corpora, the threshold varies with the method and depends on the language model employed. This study also highlights that data augmentation is more effective for small corpora, consistent with previous studies. Indeed, our experiments demonstrate the potential to improve the F-measure by 1 point on the DEAL corpus and by up to 2 points on the TDAC corpus.
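The iterative procedure described in this abstract (add artificial data in steps, re-score, and watch for the point where the F-measure stops improving) can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `find_augmentation_threshold`, the `train_and_score` callback and the toy scores are all assumptions introduced here.

```python
from typing import Callable, List, Sequence, Tuple


def find_augmentation_threshold(
    real_data: Sequence,
    artificial_batches: Sequence[Sequence],
    train_and_score: Callable[[list], float],
) -> Tuple[int, float, List[Tuple[int, float]]]:
    """Add artificial batches one at a time and record the score after each.

    `train_and_score` is a hypothetical callback that fine-tunes a model on
    the given data and returns its F-measure on a held-out set.
    Returns (number of batches giving the best score, best score, history).
    """
    current = list(real_data)
    best_score = train_and_score(current)  # baseline: real data only
    best_n = 0
    history = [(0, best_score)]
    for n, batch in enumerate(artificial_batches, start=1):
        current = current + list(batch)
        score = train_and_score(current)
        history.append((n, score))
        if score > best_score:
            best_score, best_n = score, n
    return best_n, best_score, history


# Toy stand-in for training: the score depends only on the dataset size and
# peaks after two artificial batches (made-up numbers, for illustration only).
TOY_SCORES = {3: 0.70, 4: 0.71, 5: 0.72, 6: 0.71}


def toy_scorer(data: list) -> float:
    return TOY_SCORES[len(data)]


best_n, best_score, history = find_augmentation_threshold(
    ["s1", "s2", "s3"], [["a1"], ["a2"], ["a3"]], toy_scorer
)
print(best_n, best_score)
```

In the toy run the score rises for the first two batches and drops with the third, which mirrors the threshold effect the abstract reports; real experiments would replace `toy_scorer` with actual fine-tuning and evaluation.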

    A Majority Voting Strategy of a SciBERT-based Ensemble Models for Detecting Entities in the Astrophysics Literature (Shared Task)

    Detecting Entities in the Astrophysics Literature (DEAL) is a shared task proposed within the first Workshop on Information Extraction from Scientific Publications (WIESP) at AACL-IJCNLP 2022. The task calls for systems that identify astrophysical named entities. This article presents our system, which applies a majority voting strategy to an ensemble of 32 SciBERT models. Our system ranked second and outperformed the baseline provided by the organisers, achieving an F1 score of 0.7993 and a Matthews Correlation Coefficient (MCC) score of 0.8978 in the testing phase.
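Majority voting over an ensemble, as mentioned in this abstract, can be illustrated with a short sketch. It assumes each model emits one label per token and that ties go to the earliest model in the list; the abstract does not say how the authors break ties, and the label names below are illustrative only.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-token labels from several models by majority vote.

    `predictions` holds one label sequence per model, all of equal length.
    Ties are broken in favour of the earliest model in the list (an
    assumption made for this sketch).
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must label the same tokens")

    voted = []
    for token_labels in zip(*predictions):
        counts = Counter(token_labels)
        top = max(counts.values())
        # Pick the first label, in model order, that reaches the top count.
        voted.append(next(lab for lab in token_labels if counts[lab] == top))
    return voted


# Three hypothetical models labelling the same three tokens.
model_outputs = [
    ["B-Telescope", "O", "B-CelestialObject"],
    ["B-Telescope", "O", "O"],
    ["O", "O", "B-CelestialObject"],
]
print(majority_vote(model_outputs))
# → ['B-Telescope', 'O', 'B-CelestialObject']
```

Voting token by token keeps the sketch simple; a real system might instead vote on whole entity spans to avoid inconsistent tag sequences.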

    TDAC, the First Time-Domain Astrophysics Corpus: Analysis and First Experiments on Named Entity Recognition

    The increased interest in time-domain astronomy over the last decades has led to a substantial increase in the publication of observation reports, which saturates astrophysicists' capacity to read, analyze and classify information. Because the detected astronomical events are short-lived, information characterizing new phenomena has to be communicated and analyzed very rapidly so that other observatories can react and conduct their follow-up observations. This paper introduces TDAC: a Time-Domain Astrophysics Corpus. TDAC is the first corpus based on astrophysical observation reports. We also present the named entity recognition experiments we conducted, based on our own annotations and on annotations from the WIESP DEAL shared task.

    Astro-COLIBRI: a new platform for real-time multi-messenger astrophysics

    Flares of known astronomical sources and new transient phenomena occur on different timescales, from sub-seconds to several days or weeks. The discovery potential of both serendipitous observations and multi-messenger (MM) and multi-wavelength (MWL) follow-up observations could be maximized with a tool that quickly provides an overview of both persistent sources and transient events in the relevant phase space. We here present COincidence LIBrary for Real-time Inquiry (Astro-COLIBRI), a novel and comprehensive tool for this task. Astro-COLIBRI's architecture comprises a RESTful API, a real-time database, a cloud-based alert system and a website, as well as apps for iOS and Android as clients for users. The structure of Astro-COLIBRI is optimized for performance and reliability and exploits concepts such as multi-index database queries, a global content delivery network (CDN), and direct data streams from the database to the clients to allow for a seamless user experience. Astro-COLIBRI evaluates incoming VOEvent messages of astronomical observations in real time, filters them by user-specified criteria and puts them into their MWL and MM context. The clients provide a graphical representation with an easy-to-grasp summary of the relevant data to allow for the fast identification of interesting phenomena, and provide an assessment of observing conditions at a large selection of observatories around the world. In this contribution, the key features of Astro-COLIBRI are presented. We'll outline the architecture, summarize the data resources used and provide examples of applications and use cases. Focussing on the high-energy domain, we'll for example showcase the search for high-energy gamma-ray counterparts to high-energy neutrinos, gamma-ray bursts and gravitational waves.
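Filtering incoming alerts by user-specified criteria, as this abstract describes, might look like the sketch below. It is not Astro-COLIBRI's API or implementation: the `Alert` class, the field names, the alert names and the cone-search criterion are assumptions chosen for illustration. The angular separation uses the standard haversine formula.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Alert:
    """Hypothetical, simplified alert record (not the VOEvent schema)."""
    name: str
    kind: str       # e.g. "GRB", "neutrino", "GW"
    ra_deg: float   # right ascension in degrees
    dec_deg: float  # declination in degrees


def angular_separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def filter_alerts(alerts: Iterable[Alert], kinds: Set[str],
                  center_ra: float, center_dec: float,
                  radius_deg: float) -> List[Alert]:
    """Keep alerts of the requested kinds within a cone around a position."""
    return [
        a for a in alerts
        if a.kind in kinds
        and angular_separation_deg(a.ra_deg, a.dec_deg,
                                   center_ra, center_dec) <= radius_deg
    ]


# Made-up alerts for illustration.
alerts = [
    Alert("GRB-A", "GRB", 10.0, 20.0),
    Alert("NU-B", "neutrino", 11.0, 20.5),
    Alert("GRB-C", "GRB", 200.0, -45.0),
]
selected = filter_alerts(alerts, {"GRB", "neutrino"}, 10.5, 20.0, 2.0)
print([a.name for a in selected])
```

A production system would more likely rely on an astronomy library for coordinate handling and would account for localization uncertainties, which this sketch ignores.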
