25 research outputs found

Improving Polish to English Neural Machine Translation with Transfer Learning: Effects of Data Volume and Language Similarity

Authors: Chia Zheng Lin; Eronen Juuso; Masui Fumito; Nowakowski Karol; Ptaszynski Michal
Publication date: 01/06/2023

This paper investigates the impact of data volume and the use of similar languages on transfer learning in a machine translation task. We find that having more data generally leads to better performance, as it allows the model to learn more patterns and generalizations from the data. However, related languages can also be particularly effective when there is limited data available for a specific language pair, as the model can leverage the similarities between the languages to improve performance. To demonstrate, we fine-tune the mBART model for a Polish-English translation task using the OPUS-100 dataset. We evaluate the performance of the model under various transfer learning configurations, including different transfer source languages and different shot levels for Polish, and report the results. Our experiments show that a combination of related languages and larger amounts of data outperforms the model trained on related languages or larger amounts of data alone. Additionally, we show the importance of related languages in zero-shot and few-shot configurations.

arXiv.org e-Print Archive

Hinta-arviojärjestelmä huoneistoille (Price Estimation System for Apartments)

Author: Eronen Juuso
Publication date: 07/11/2018

The objective of this thesis was to compare different regression methods for predicting the price of a housing unit, to design and develop price estimation software, and to determine which features are most important when determining the price of a housing unit. The software was developed for Alma Mediapartners Oy to be integrated with their housing marketplace Etuovi.com. Amazon’s SageMaker cloud machine learning platform was used to develop and deploy the software. The data consisted of Etuovi.com’s housing unit advertisements posted by real estate agencies and individual customers. The algorithm comparison and the software development used data covering a period of one year. The features used for training the models included, for example, location, size of the housing unit, age of the building and house type. The tested algorithms included linear regression, regression tree, random forest, gradient boosting and extreme gradient boosting, of which extreme gradient boosting performed best. The final model showed that over half of the test samples had an error of less than one percent, while 80% of the samples had an error of less than ten percent. Less than four percent of the test samples had an error of 25% or more. The software was developed on Amazon SageMaker following SageMaker’s developer guide. The software fetches the housing unit dataset from Etuovi.com’s data warehouse and trains a model on the dataset on a virtual machine using the extreme gradient boosting algorithm. The trained model is then hosted in the cloud and can be integrated with Etuovi.com as an independent component.
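
The error shares quoted in the abstract (under 1%, under 10%, 25% or more) can be computed with a few lines of code. The sketch below assumes that "error" means relative error against the true price, which the abstract does not state explicitly; the function name `share_within` and the prices are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def share_within(actual: Sequence[float], predicted: Sequence[float],
                 threshold: float) -> float:
    """Fraction of samples whose relative error is strictly below threshold."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for a, p in zip(actual, predicted)
               if abs(p - a) / a < threshold)
    return hits / len(actual)


# Hypothetical prices in euros (illustration only, not data from the thesis).
actual = [100_000.0, 200_000.0, 150_000.0, 300_000.0]
predicted = [100_500.0, 215_000.0, 150_000.0, 390_000.0]

under_1_percent = share_within(actual, predicted, 0.01)
under_10_percent = share_within(actual, predicted, 0.10)
at_least_25_percent = 1.0 - share_within(actual, predicted, 0.25)
print(under_1_percent, under_10_percent, at_least_25_percent)
```

Reporting several thresholds side by side, as the abstract does, gives a fuller picture of the error distribution than a single average would.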

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

TUT DPub

Tyttökoripalloilijoiden voimaharjoittelu. Opas valmentajille (Strength Training for Girl Basketball Players: A Guide for Coaches)

Authors: Eronen Juuso; Lemström Emmi
Publication date: 01/01/2020

This thesis deals with strength training for female basketball players aged 12 to 15. In the theoretical framework, we discuss girls' physical development and strength training on the basis of source literature and research findings. In basketball, injuries most often affect the lower limbs. Strength training started before or during puberty lays a good foundation for the future (Mero et al. 2012, 110). The purpose of strength training started before adulthood is to practise technique and to prevent injuries. In addition, strength training adds variety to training and develops sport-specific performance (Mäennenä 2019, 275). The thesis was carried out as a product development process adapted from Jämsä and Manninen's product development model. Product development is divided into five basic stages. The product was developed together with the commissioning organisation so that it meets their needs. Product development began by identifying the problem and the need for development. The next stage was ideation, in which a solution to the identified problem was sought. This was followed by the drafting, development, and finishing stages. The purpose of the thesis was to create a guide that serves as a tool for coaches in carrying out strength training for the girls on the team. The aim is to increase coaches' knowledge of strength training for young basketball players. In the guide, we cover different ways of carrying out strength training. Most of the exercises in the guide can be done either as bodyweight exercises or with weights, resistance equipment, or other equipment, depending on which aspect of strength is to be developed. The thesis was commissioned by Kouvolan Kouvot Ry, a basketball club based in Kouvola.

Theseus

Analysis of Atmospheric Precursor of Extreme Summers in Central Europe

Authors: Canete Laetitia; Eronen Tommi; Gorelov Dmitry; Hakala Jani; Jokinen Ari; Kankainen Anu; Kolhinen Veli; Koponen Jukka; Moore Iain; Reinikainen Juuso; Rinta-Antila Sami
Publication venue: Philipps-Universität Marburg, Geographie
Publication date: 01/01/2014

In this thesis, the characteristics of extremely hot and dry summers in Central Europe and of the preceding winter-spring transition periods are analysed, with the aim of identifying potential precursors of extreme summers in the atmospheric circulation. Both land surface-atmosphere interactions and large-scale atmospheric circulation regimes are analysed and discussed in context. The analysis is based on in situ observations, satellite-based remote sensing observations, and reanalysis data. In this work, extremely hot and dry summers are defined using the combination of solar irradiance and precipitation as the central proxy. The most extreme summers in terms of solar radiation surplus and precipitation deficit were analysed together with the respective preceding winter-spring transition periods in the study area 47°N-56°N; 4°E-15°E (Germany and adjacent areas). The analysis is based on regional means of the accumulated monthly means of solar radiation and precipitation in the study area for the winter-spring transition period (February, March, April; FMA) and the summer period (June, July, August; JJA), and on the analysis of the seasonal anomalies of the 850 hPa geopotential over the North Atlantic and Europe. The model experiments of other authors for southern and south-eastern Europe were confirmed: for the most extreme hot summers, a dominance of anticyclonic circulation regimes, with the associated solar radiation surplus and precipitation deficit (relative to the long-term mean), was also observed for Central Europe. Two of the three most extreme sunny and dry summers in Central Europe in the period 1958-2011 were already preconditioned in the respective preceding FMA period by extremely large positive solar radiation anomalies and extremely large negative precipitation anomalies.
For the same years, a preconditioning of the atmosphere during the FMA period could also be identified: a dipole in the pressure anomaly (anomaly of the 850 hPa geopotential), with a centre of negative anomaly over southern Greenland and a centre of positive anomaly over the North Sea and Fennoscandia. As a measure of the strength of this dipole, the new Greenland-North Sea Dipole Index (GNDI) was introduced. In the majority of years with extremely sunny and dry summers, the GNDI of the preceding FMA period exceeds a value of 20. No connection between the NAO or the AO and the extreme summers could be established. One of the summers in the time series identified as extremely sunny and dry was not preconditioned. However, an extremely strong El Niño event occurred in the winter before this event. On the other hand, there was one year with an unusually sunny and dry FMA period (preconditioning) that was followed by a rather wet summer with average solar irradiance. An extremely strong La Niña event had occurred in the preceding winter. This leads to the conclusion that extremely strong ENSO events can influence the European climate on the seasonal scale: strong El Niño events can cause extreme summers, and strong La Niña events can generate signals that have the potential to disrupt the link between sunny and dry FMA periods and the subsequent summers. In the context of extremely hot and dry summers in Central Europe, these results show the following:
- In addition to the dominance of anticyclonic pressure systems in the atmospheric circulation and the land surface-atmosphere interactions, ENSO is a further important factor in the development of extreme summers in Central Europe.
- Effects triggered by ENSO have the potential to destroy the links between the FMA period and the subsequent summer with respect to the interplay of atmospheric circulation and land surface-atmosphere interactions that is responsible for the development of extremely hot and dry summers.
These findings are combined in the newly developed Central European Drought Index (CEDI). With the CEDI, all extremely hot and dry summers in the upper 10% percentile and one of the extreme summers in the upper 20% percentile in Central Europe can be correctly "hindcast". The results of this work contribute to a better understanding of the development of extreme summers in Central Europe and are therefore a valuable contribution to improving summer seasonal forecasting for this region. On the basis of the results, the newly developed CEDI can be expected to make a substantial contribution to the development of an early warning system for extremely hot and dry summers.

Crossref

Jyväskylä University Digital Archive

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

High-precision mass measurements of 25Al and 30P at JYFLTRAP

Authors: Canete Laetitia; Eronen Tommi; Gorelov Dmitry; Hakala Jani; Jokinen Ari; Kankainen Anu; Kolhinen Veli; Koponen Jukka; Moore Iain; Reinikainen Juuso; Rinta-Antila Sami
Publication venue: Springer Science and Business Media LLC
Publication date: 20/12/2017

The masses of the astrophysically relevant nuclei 25Al and 30P have been measured with a Penning trap for the first time. The mass-excess values for 25Al (Δ = −8915.962(63) keV) and 30P (Δ = −20200.854(64) keV) obtained with the JYFLTRAP double Penning trap mass spectrometer are in good agreement with the Atomic Mass Evaluation 2012 values but about 5 to 10 times more precise. High precision is required for calculating resonant proton-capture rates of the astrophysically important reactions 25Al(p,γ)26Si and 30P(p,γ)31S. In this work, Q(p,γ) = 5513.99(13) keV and Q(p,γ) = 6130.64(24) keV were obtained for 25Al and 30P, respectively. The effect of the more precise values on the resonant proton-capture rates has been studied. In addition to nuclear astrophysics, the measured QEC value of 25Al, 4276.805(45) keV, is relevant for studies of T = 1/2 mirror beta decays, which have the potential to be used to test the Conserved Vector Current hypothesis.
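
Values such as −8915.962(63) keV use the concise uncertainty notation, in which the digits in parentheses give the uncertainty in the last quoted decimal places. The following is a minimal sketch of how such a string might be split into a value and an uncertainty; the helper name `parse_concise` is hypothetical and not part of any library.

```python
import re
from typing import Tuple


def parse_concise(text: str) -> Tuple[float, float]:
    """Split 'value(uncertainty)' concise notation into (value, uncertainty)."""
    # Normalise the Unicode minus sign (U+2212) used in the abstract.
    cleaned = text.strip().replace("\u2212", "-")
    match = re.fullmatch(r"([+-]?\d+)(?:\.(\d+))?\((\d+)\)", cleaned)
    if match is None:
        raise ValueError(f"not in concise notation: {text!r}")
    integer, fraction, unc_digits = match.groups()
    fraction = fraction or ""
    value = float(f"{integer}.{fraction}" if fraction else integer)
    # The bracketed digits apply to the last quoted decimal places.
    uncertainty = int(unc_digits) * 10.0 ** (-len(fraction))
    return value, uncertainty


al25 = parse_concise("−8915.962(63)")    # mass excess of 25Al in keV
p30 = parse_concise("−20200.854(64)")    # mass excess of 30P in keV
print(al25, p30)
```

Read this way, the two mass excesses carry uncertainties of 0.063 keV and 0.064 keV respectively.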

Jyväskylä University Digital Archive

Mass of astrophysically relevant 31Cl and the breakdown of the isobaric multiplet mass equation

Authors: Canete Laetitia; Eronen Tommi; Hakala Jani; Jokinen Ari; Kankainen Anu; Koponen Jukka; Moore Iain; Nesterenko Dmitrii; Reinikainen Juuso; Rinta-Antila Sami; Voss Annika; Äystö Juha
Publication venue: American Physical Society (APS)
Publication date: 27/01/2016

The mass of 31Cl has been measured with the JYFLTRAP double-Penning-trap mass spectrometer at the Ion Guide Isotope Separator On-Line (IGISOL) facility. The determined mass-excess value, −7034.7(34) keV, is 15 times more precise than in the Atomic Mass Evaluation 2012. The quadratic form of the isobaric multiplet mass equation for the T = 3/2 quartet at A = 31 fails (χ²ₙ = 11.6) and a nonzero cubic term, d = −3.5(11) keV, is obtained when the new mass value is adopted. 31Cl has been found to be less proton-bound, with a proton separation energy of Sp = 264.6(34) keV. Energies for the excited states in 31Cl and the photodisintegration rate on 31Cl have been determined with significantly improved precision by using the new Sp value. The improved photodisintegration rate helps to constrain astrophysical conditions where 30S can act as a waiting point in the rapid proton capture process in type-I x-ray bursts.
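
For reference, the isobaric multiplet mass equation is conventionally written as a polynomial in the isospin projection T_z. The coefficient names a, b, and c below follow the usual convention and are not defined in the abstract itself; only the cubic coefficient d is quoted there.

```latex
\begin{align}
  M(T_z) &= a + b\,T_z + c\,T_z^{2}
    && \text{(quadratic form)} \\
  M(T_z) &= a + b\,T_z + c\,T_z^{2} + d\,T_z^{3}
    && \text{(with cubic term)}
\end{align}
```

For a T = 3/2 quartet there are four members (T_z = −3/2, −1/2, +1/2, +3/2), so the quadratic form leaves one degree of freedom for a goodness-of-fit test, while the cubic form reproduces the four masses exactly and yields d directly.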

arXiv.org e-Print Archive

Jyväskylä University Digital Archive

Crossref

MEDIENwissenschaft: Rezensionen, Reviews wird 20!

Authors: Canete Laetitia; Eronen Tommi; Gorelov Dmitry; Hakala Jani; Jokinen Ari; Kankainen Anu; Kolhinen Veli; Koponen Jukka; Moore Iain; Nesterenko Dmitrii; Reinikainen Juuso; Rinta-Antila Sami; Äystö Juha
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2003

Among the key parameters for the reaction network calculations for the rapid proton capture (rp) process, occurring e.g. in type I X-ray bursts, are the masses of the involved nuclei. Nowadays, masses of even rather exotic nuclei can be measured very precisely employing Penning-trap mass spectrometry. With the JYFLTRAP Penning trap at the IGISOL facility, masses of around 100 neutron-deficient nuclei have been determined with a typical precision of a few keV. Most recently, 25Al, 30P, 31Cl, and 52Co have been measured. Of these, the precision of the mass-excess value of 31Cl was improved from 50 to 3.4 keV, and the mass of 52Co was experimentally determined for the first time. The mass of 31Cl is relevant for estimating the waiting-point conditions for 30S, as the 31Cl(γ, p)30S–30S(p, γ)31Cl equilibrium ratio depends exponentially on the Q value. For 52Co, located on the path towards 56Ni, a deviation from the extrapolated mass value has been revealed. In this contribution, recent JYFLTRAP experiments for the rp process will be discussed.
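
To see why the improvement from 50 keV to 3.4 keV matters for a quantity that depends exponentially on Q, the sketch below evaluates by what factor a term of the form exp(Q/kT) changes when Q shifts by its uncertainty. The temperature of 1 GK is an assumed, illustrative value that does not appear in the abstract, and the function name is hypothetical.

```python
import math

# Boltzmann constant in keV per kelvin (8.617333262e-5 eV/K).
BOLTZMANN_KEV_PER_K = 8.617333262e-8


def ratio_change_factor(delta_q_kev: float, temperature_k: float) -> float:
    """Factor by which exp(Q / kT) changes when Q shifts by delta_q_kev."""
    return math.exp(delta_q_kev / (BOLTZMANN_KEV_PER_K * temperature_k))


ASSUMED_TEMPERATURE_K = 1.0e9  # illustrative assumption, not from the abstract

factor_old = ratio_change_factor(50.0, ASSUMED_TEMPERATURE_K)  # 50 keV uncertainty
factor_new = ratio_change_factor(3.4, ASSUMED_TEMPERATURE_K)   # 3.4 keV uncertainty
print(f"old: x{factor_old:.2f}, new: x{factor_new:.2f}")
```

Under this assumption, a 50 keV uncertainty leaves the ratio uncertain by nearly a factor of two, whereas 3.4 keV corresponds to a few percent.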

Crossref

Jyväskylä University Digital Archive

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

Precision 71Ga – 71Ge mass-difference measurement

Authors: Alanssari M.; Canete Laetitia; Eronen Tommi; Frekers D.; Hakala Jani; Holl M.; Jokinen Ari; Kankainen Anu; Koponen Jukka; Moore Iain; Nesterenko Dmitrii; Pohjalainen Ilkka; Reinikainen Juuso; Rinta-Antila Sami; Voss Annika
Publication venue: Elsevier BV
Publication date: 08/08/2016

The 71Ga(νe, e−)71Ge reaction Q value has been measured with the JYFLTRAP mass spectrometer at the IGISOL facility of the University of Jyväskylä to be Q = 232.443(93) keV. This value agrees with previous measurements but is considerably more precise. The Q value is discussed in the context of the solar neutrino capture rate in 71Ga.

Jyväskylä University Digital Archive

Looking for Razors and Needles in a Haystack: Multifaceted Analysis of Suicidal Declarations on Social Media—A Pragmalinguistic Approach

Authors: Gniewosz Leliwa; Ida Dziublewska; Jan Piesiewicz; Juuso Eronen; Kamil Soliwoda; Maciej Brochocki; Marcin Fortuna; Marek Godny; Maria Dowgiallo; Michal Marcinczuk; Michal Ptaszynski; Michal Wroczynski; Monika Zasko-Zielinska; Olimpia Hubert; Patrycja Tempska; Paula Karbowska; Pawel Skrzek
Publication venue: MDPI AG
Publication date: 09/11/2021

In this paper, we study the language used by suicidal users on the Reddit social media platform. To do that, we first collect a large-scale dataset of Reddit posts and annotate it with highly trained and expert annotators under a rigorous annotation scheme. Next, we perform a multifaceted analysis of the dataset, including: (1) the analysis of user activity before and after posting a suicidal message, and (2) a pragmalinguistic study of the vocabulary used by suicidal users. In the second part of the analysis, we apply LIWC, a dictionary-based toolset widely used in psychology and linguistic research, which provides a wide range of linguistic category annotations on text. However, since raw LIWC scores are not sufficiently reliable or informative, we propose a procedure that reduces the risk of unreliable LIWC scores leading to misleading conclusions by analyzing each category not separately but in pairs with other categories. The analysis of the results supported the validity of the proposed approach by revealing valuable information on the vocabulary used by suicidal users and helped to pinpoint false predictors. For example, we were able to specify that death-related words, typically associated with suicidal posts in the majority of the literature, become false predictors when they co-occur with apostrophes, even in high-risk subreddits. On the other hand, the category-pair based disambiguation helped to specify that death becomes a predictor only when co-occurring with future-focused language, informal language, discrepancy, or 1st person pronouns. The promising applicability of the approach was additionally analyzed for its limitations, where we found that although LIWC is a useful and easily applicable tool, the lack of any contextual processing makes it unsuitable for application in psychological and linguistic studies. We conclude that the disadvantages of LIWC can be easily overcome by creating a number of high-performance AI-based classifiers trained to annotate categories similar to those of LIWC, which we plan to pursue in future work.
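
The idea of analyzing dictionary categories in pairs can be sketched as a simple co-occurrence count. The example below is a toy illustration only: the three-category lexicon, the sample texts, and the function names are all hypothetical, and it reproduces neither the LIWC dictionary nor the authors' actual procedure.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Set

# Hypothetical toy lexicon; real LIWC categories are far larger.
TOY_LEXICON: Dict[str, Set[str]] = {
    "death": {"die", "dead", "death"},
    "future": {"will", "tomorrow", "soon"},
    "i": {"i", "me", "my"},
}


def categories_in(post: str, lexicon: Dict[str, Set[str]]) -> Set[str]:
    """Return the names of all categories with at least one word in the post."""
    tokens = set(post.lower().split())
    return {name for name, words in lexicon.items() if tokens & words}


def pair_counts(posts: Iterable[str], lexicon: Dict[str, Set[str]]) -> Counter:
    """Count, over all posts, how often each pair of categories co-occurs."""
    counts: Counter = Counter()
    for post in posts:
        present = sorted(categories_in(post, lexicon))
        counts.update(combinations(present, 2))
    return counts


# Neutral sample sentences in which "death" words carry no suicidal meaning.
posts = [
    "the battery will die soon",
    "my phone is dead",
    "i will call tomorrow",
    "my plant will die soon",
]
counts = pair_counts(posts, TOY_LEXICON)
print(counts)
```

Counting pairs rather than single categories makes it possible to see in which company a category appears, which is the kind of signal the abstract uses to separate real predictors from false ones.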

Multidisciplinary Digital Publishing Institute