SafeWebUH at SemEval-2023 Task 11: Learning Annotator Disagreement in Derogatory Text: Comparison of Direct Training vs Aggregation
Subjectivity and difference of opinion are key social phenomena, and it is
crucial to take these into account in the annotation and detection process of
derogatory textual content. In this paper, we use four datasets provided by
SemEval-2023 Task 11 and fine-tune a BERT model to capture the disagreement in
the annotation. We find that individual annotator modeling and aggregation
lower the Cross-Entropy score by an average of 0.21 compared with direct
training on the soft labels. Our findings further demonstrate that annotator
metadata contributes an average 0.029 reduction in the Cross-Entropy score.
Comment: SemEval Task 11 paper (System
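The comparison in this abstract rests on scoring a predicted soft label against the distribution of annotator votes with cross-entropy. The following is a minimal sketch of that arithmetic only, assuming binary labels; the votes, the predicted probabilities, and the helper names (`soft_label`, `aggregate`, `cross_entropy`) are hypothetical and are not taken from the paper, which fine-tunes a BERT model rather than using fixed numbers.

```python
import math


def soft_label(votes):
    """Fraction of annotators who chose the positive class (1)."""
    return sum(votes) / len(votes)


def aggregate(per_annotator_probs):
    """Average per-annotator predicted probabilities into one soft prediction."""
    return sum(per_annotator_probs) / len(per_annotator_probs)


def cross_entropy(target_pos, pred_pos, eps=1e-12):
    """Binary cross-entropy between a soft target and a predicted probability."""
    p = min(max(pred_pos, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target_pos * math.log(p) + (1.0 - target_pos) * math.log(1.0 - p))


# Hypothetical item: three of four annotators marked the text as derogatory.
votes = [1, 1, 1, 0]
target = soft_label(votes)

# Hypothetical output of a model trained directly on the soft labels.
direct_pred = 0.60

# Hypothetical outputs of one model head per annotator, then aggregated.
annotator_preds = [0.90, 0.80, 0.85, 0.30]
aggregated_pred = aggregate(annotator_preds)

ce_direct = cross_entropy(target, direct_pred)
ce_aggregated = cross_entropy(target, aggregated_pred)

print(f"soft target      : {target:.4f}")
print(f"direct CE        : {ce_direct:.4f}")
print(f"aggregated CE    : {ce_aggregated:.4f}")
```

With these made-up numbers the aggregated prediction happens to lie closer to the soft target, so its cross-entropy is lower; that illustrates how the metric behaves and is not evidence for the paper's reported result.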
An Active Instance-based Machine Learning method for Stellar Population Studies
We have developed a method for the fast and accurate determination of stellar
population parameters, intended for application to high-resolution galaxy
spectra. The method is based on an optimization technique that combines active
learning with an instance-based machine learning algorithm. We tested the
method with the retrieval of the star-formation history and dust content in
"synthetic" galaxies with a wide range of S/N ratios. The "synthetic" galaxies
were constructed using two different grids of high-resolution theoretical
population synthesis models. The results of our controlled experiment show
that our method can estimate with good speed and accuracy the parameters of the
stellar populations that make up the galaxy even for very low S/N input. For a
spectrum with S/N=5 the typical average deviation between the input and fitted
spectrum is less than 10^{-5}. Additional improvements are achieved using
prior knowledge.
Comment: 14 pages, 25 figures, accepted by Monthly Notice