
    Modelling old-age retirement : An adaptive multi-outcome LAD-lasso regression approach

    Using unique administrative register data, we investigate old-age retirement under the statutory pension scheme in Finland. The analysis is based on multi-outcome modelling of pensions and working lives together with a range of explanatory variables. An adaptive multi-outcome LAD-lasso regression method is applied to obtain estimates of earnings and socioeconomic factors affecting old-age retirement and to decide which of these variables should be included in our model. The proposed statistical technique produces robust and less biased regression coefficient estimates in the context of skewed outcome distributions and an excess number of zeros in some of the explanatory variables. The results underline the importance of late life course earnings and employment to the final amount of pension and reveal differences in pension outcomes across socioeconomic groups. We conclude that adaptive LAD-lasso regression is a promising statistical technique that could be usefully employed in studying various topics in the pension industry. Peer reviewed.

    Applying Hamiltonian Monte Carlo to financial time series

    Markov chain Monte Carlo (MCMC) methods have been an important part of Bayesian statistics since the 1990s. Many traditional MCMC algorithms, such as the Metropolis algorithm and Gibbs sampling, remain very popular among researchers. These simple simulation algorithms become less efficient the more complex the model is. This thesis introduces Hamiltonian Monte Carlo (HMC), which aims to solve the problem of simulating from complex models. Because the HMC algorithm is mathematically demanding, its operation is first presented through simple examples, after which its structure and theoretical background are examined in depth. In addition, the efficiency and autocorrelations of HMC and the Metropolis algorithm are compared in two financial models, and the challenges of implementing the algorithm are discussed along the way. As an illustrative application, the two financial models are used to model the returns on equity and fixed-income investments. The Bayesian approach is a natural way to assess the uncertainty of the parameters of financial models. In both selected models HMC turned out to be slower in wall-clock time than the Metropolis algorithm: obtaining comparable results required considerably fewer iterations with HMC than with the Metropolis algorithm, but generating a single draw was considerably slower with HMC. The autocorrelation between the members of the chain produced by HMC was, however, significantly smaller than with the Metropolis algorithm.
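    As a rough illustration of the kind of simple example the abstract mentions, the following is a minimal HMC sketch. It is not the thesis's implementation: the target (a one-dimensional standard normal), the function names, and the tuning values `step_size` and `n_steps` are all assumptions chosen for brevity. Each iteration draws a fresh momentum, simulates Hamiltonian dynamics with the leapfrog integrator, and accepts or rejects the end point with a Metropolis step.

```python
import math
import random


def potential(q):
    # Potential energy U(q) = -log density of the assumed toy target,
    # a standard normal, up to an additive constant.
    return 0.5 * q * q


def grad_potential(q):
    # Gradient of U(q) for the standard normal target.
    return q


def leapfrog(q, p, step_size, n_steps):
    # Leapfrog integrator: half step in momentum, alternating full steps
    # in position and momentum, and a final half step in momentum.
    p -= 0.5 * step_size * grad_potential(q)
    for i in range(n_steps):
        q += step_size * p
        if i < n_steps - 1:
            p -= step_size * grad_potential(q)
    p -= 0.5 * step_size * grad_potential(q)
    return q, p


def hmc_sample(n_samples, step_size=0.2, n_steps=20, q0=0.0, seed=0):
    # Illustrative tuning values; real models need careful tuning.
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample momentum from N(0, 1)
        h_old = potential(q) + 0.5 * p * p
        q_new, p_new = leapfrog(q, p, step_size, n_steps)
        h_new = potential(q_new) + 0.5 * p_new * p_new
        # Metropolis correction: accept with probability min(1, exp(h_old - h_new)).
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


if __name__ == "__main__":
    draws = hmc_sample(5000)
    mean = sum(draws) / len(draws)
    var = sum((x - mean) ** 2 for x in draws) / len(draws)
    print(f"mean={mean:.3f} var={var:.3f}")
```

    Each HMC iteration here costs `n_steps` gradient evaluations, whereas a random-walk Metropolis step needs a single density evaluation. This is consistent with the trade-off reported in the abstract: fewer iterations but a higher cost per draw.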

    Modelling old-age retirement: an adaptive multi-outcome LAD-lasso regression approach

    Using unique administrative register data, we investigate old-age retirement under the statutory pension scheme in Finland. The analysis is based on multi-outcome modelling of pensions and working lives together with a range of explanatory variables. An adaptive multi-outcome LAD-lasso regression method is applied to obtain estimates of earnings and socioeconomic factors affecting old-age retirement and to decide which of these variables should be included in our model. The proposed statistical technique produces robust and less biased regression coefficient estimates in the context of skewed outcome distributions and an excess number of zeros in some of the explanatory variables. The results underline the importance of late life course earnings and employment to the final amount of pension and reveal differences in pension outcomes across socioeconomic groups. We conclude that adaptive LAD-lasso regression is a promising statistical technique that could be usefully employed in studying various topics in the pension industry.

    Capacitated spatial clustering with multiple constraints and attributes

    Capacitated spatial clustering, a type of unsupervised machine learning method, is often used to tackle problems in data compression, classification, logistics optimization and infrastructure optimization. Depending on the application at hand, a multitude of extensions to the clustering problem may be necessary. In this article, we propose a number of novel extensions to PACK, a recent capacitated partitional spatial clustering method which uses an optimization algorithm that is based on linear programming tasks. These extensions relate to the relocation and location preference of cluster centers, outliers, and non-spatial attributes, and they can be considered jointly. In the context of edge server placement, these improve the spatial location of servers while considering, for example, application placement on the servers in response to spatial application usage patterns. We demonstrate the usefulness of an extended version of PACK with an example using simulated data, as well as a real-world example of edge server placement for a city region with various setups. These setups are evaluated with summary statistics about spatial proximity and attribute similarity. As a result, the similarity of the clusters improved by 53% at best, while the proximity degraded by only 18%. The extensions provide valuable means for including non-spatial information in the cluster analysis and for attaining better overall proximity and similarity.

    Scaling up an edge server deployment

    In this article, we study the scaling up of edge computing deployments. In edge computing, deployments are scaled up by adding more computational capacity atop the initial deployment, as deployment budgets allow. However, without careful consideration, adding new servers may not improve proximity to the mobile users, which is crucial for the Quality of Experience of users and the Quality of Service of the network operators. In this paper, we propose a novel method for scaling up an edge computing deployment by selecting the optimal number of new edge servers and their placement, and re-allocating access points optimally to the old and new edge servers. The algorithm is evaluated with two scenarios, using data on a real-world large-scale wireless network deployment. The evaluation shows that the proposed method is stable on a real city-scale deployment, resulting in optimized Quality of Service for the network operator.

    EDISON: an edge-native method and architecture for distributed interpolation

    Spatio-temporal interpolation provides estimates of observations in unobserved locations and time slots. In smart cities, interpolation helps to provide a fine-grained contextual and situational understanding of the urban environment, in terms of both short-term (e.g., weather, air quality, traffic) and long-term (e.g., crime, demographics) spatio-temporal phenomena. Various initiatives improve spatio-temporal interpolation results by including additional data sources such as vehicle-fitted sensors, mobile phones, or micro weather stations of, for example, smart homes. However, the underlying computing paradigm in such initiatives is predominantly centralized, with all data collected and analyzed in the cloud. This solution is not scalable: as the spatial and temporal density of sensor data grows, the required transmission bandwidth and computational capacity become infeasible. To address the scaling problem, we propose EDISON: algorithms for distributed learning and inference, and an edge-native architecture for distributing spatio-temporal interpolation models, their computations, and the observed data vertically and horizontally between device, edge and cloud layers. We demonstrate EDISON functionality in a controlled, simulated spatio-temporal setup with 1 M artificial data points. While the main motivation of EDISON is the distribution of the heavy computations, the results show that EDISON also provides an improvement over alternative approaches, reaching at best a 10% smaller RMSE than a global interpolation and a 6% smaller RMSE than a baseline distributed approach.