1,088 research outputs found

    Advances in Public Transport Platform for the Development of Sustainability Cities

    Get PDF
    Modern societies demand high and varied mobility, which in turn requires a complex transport system adapted to social needs that guarantees the movement of people and goods in an economically efficient and safe way, all while remaining subject to a new environmental rationality and to the logic of the sustainability paradigm. From this perspective, an efficient and flexible transport system that provides intelligent and sustainable mobility patterns is essential to our economy and our quality of life. The current transport system poses growing and significant challenges for the environment, human health and sustainability, while current mobility schemes have focused far more on the private vehicle, which has conditioned the lifestyles of citizens and cities as well as urban and territorial sustainability. Transport carries considerable weight in the framework of sustainable development due to its environmental pressures, its associated social and economic effects, and its interrelations with other sectors. The continuous growth that this sector has experienced over the last few years, and its foreseeable increase, even considering the change in trends due to the current situation of generalized crisis, make the challenge of sustainable transport a strategic priority at the local, national, European and global levels. This Special Issue will pay attention to research approaches focused on the relationship between developments in transport and their significant impact on the environment, from the perspective of efficiency

    Data analytics 2016: proceedings of the fifth international conference on data analytics

    Get PDF

    Capturing Evolution Genes for Time Series Data

    Full text link
    The modeling of time series is becoming increasingly critical in a wide variety of applications. Overall, data evolve by following different patterns, which are generally caused by different user behaviors. Given a time series, we define the evolution gene to capture the latent user behaviors and to describe how these behaviors lead to the generation of the time series. In particular, we propose a uniform framework that recognizes the different evolution genes of segments by learning a classifier, and adopts an adversarial generator to implement the evolution gene by estimating the segments' distribution. Experimental results based on a synthetic dataset and five real-world datasets show that our approach can not only achieve good prediction results (e.g., an average improvement of +10.56% in terms of F1), but is also able to provide explanations of the results. Comment: a preprint version. arXiv admin note: text overlap with arXiv:1703.10155 by other authors
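    The abstract above describes the general recipe of splitting a series into segments and learning a classifier over them. The toy sketch below illustrates only that segment-classification step, with hand-made features and labels; it is not the authors' evolution-gene framework and it omits their adversarial generator entirely.

```python
# Minimal sketch of labelling time-series segments with a classifier; segment
# length, features and the two latent "behaviours" are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def segment(series, length=24):
    """Split a 1-D series into non-overlapping segments of fixed length."""
    n = len(series) // length
    return series[: n * length].reshape(n, length)

def segment_features(segments):
    """Simple per-segment descriptors: mean, standard deviation, trend slope."""
    slopes = np.polyfit(np.arange(segments.shape[1]), segments.T, 1)[0]
    return np.column_stack([segments.mean(axis=1), segments.std(axis=1), slopes])

# Toy data: segments drawn from two hypothetical latent behaviours.
rng = np.random.default_rng(0)
calm = rng.normal(0.0, 0.2, size=(50, 24))
bursty = rng.normal(0.0, 1.5, size=(50, 24)) + np.linspace(0, 3, 24)
X = segment_features(np.vstack([calm, bursty]))
y = np.array([0] * 50 + [1] * 50)            # latent behaviour labels

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
new_series = rng.normal(0.0, 1.5, size=24 * 4) + np.tile(np.linspace(0, 3, 24), 4)
print(clf.predict(segment_features(segment(new_series))))  # one label per segment
```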

    Land Use and Transport: Settlement Patterns and the Demand for Travel. Stage 2 Background Technical Report

    Get PDF

    Applications of Federated Learning in Smart Cities: Recent Advances, Taxonomy, and Open Challenges

    Full text link
    Federated learning plays an important role in the development of smart cities. With the growth of big data and artificial intelligence, data privacy protection becomes a problem in this process, and federated learning is capable of solving it. This paper starts from the current developments of federated learning and its applications in various fields and conducts a comprehensive investigation. We summarize the latest research on the application of federated learning in the various domains of smart cities, providing an in-depth view of its current development in the Internet of Things, transportation, communications, finance, medicine and other fields. Before that, we introduce the background, definition and key technologies of federated learning. Furthermore, we review the key technologies and the latest results. Finally, we discuss the future applications and research directions of federated learning in smart cities
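    As background for the privacy argument above, the sketch below shows the generic federated-averaging loop in which clients train locally and only model weights reach the server. It is a minimal illustration with invented data and a toy linear model, not a system taken from the surveyed papers.

```python
# Generic federated averaging sketch: raw data never leaves the clients.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Each client refines the global linear model on its private data only."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)    # least-squares gradient
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server aggregates models, weighting each client by its sample count."""
    return np.average(np.stack(client_weights), axis=0,
                      weights=np.asarray(client_sizes, dtype=float))

rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])
clients = []
for n in (200, 50, 120):                      # three clients of different sizes
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w + rng.normal(scale=0.1, size=n)))

global_w = np.zeros(2)
for _ in range(20):                           # communication rounds
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])
print(global_w)                               # approaches [2, -1] without sharing data
```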

    Data-Driven Methods for Data Center Operations Support

    Get PDF
    During the last decade, cloud technologies have been evolving at an impressive pace, such that we are now living in a cloud-native era where developers can leverage an unprecedented landscape of (possibly managed) services for orchestration, compute, storage, load-balancing, monitoring, etc. The possibility to have on-demand access to a diverse set of configurable virtualized resources allows for building more elastic, flexible and highly-resilient distributed applications. Behind the scenes, cloud providers sustain the heavy burden of maintaining the underlying infrastructures, consisting of large-scale distributed systems, partitioned and replicated among many geographically dislocated data centers to guarantee scalability, robustness to failures, high availability and low latency. The larger the scale, the more cloud providers have to deal with complex interactions among the various components, such that monitoring, diagnosing and troubleshooting issues become incredibly daunting tasks. To keep up with these challenges, development and operations practices have undergone significant transformations, especially in terms of improving the automations that make releasing new software, and responding to unforeseen issues, faster and sustainable at scale. The resulting paradigm is nowadays referred to as DevOps. However, while such automations can be very sophisticated, traditional DevOps practices fundamentally rely on reactive mechanisms that typically require careful manual tuning and supervision from human experts. To minimize the risk of outages, and the related costs, it is crucial to provide DevOps teams with suitable tools that can enable a proactive approach to data center operations. This work presents a comprehensive data-driven framework to address the most relevant problems that can be experienced in large-scale distributed cloud infrastructures. These environments are indeed characterized by a very large availability of diverse data, collected at each level of the stack, such as: time series (e.g., physical host measurements, virtual machine or container metrics, networking component logs, application KPIs); graphs (e.g., network topologies, fault graphs reporting dependencies among hardware and software components, performance issue propagation networks); and text (e.g., source code, system logs, version control system history, code review feedback). Such data are also typically updated with relatively high frequency and subject to distribution drifts caused by continuous configuration changes to the underlying infrastructure. In such a highly dynamic scenario, traditional model-driven approaches alone may be inadequate at capturing the complexity of the interactions among system components. DevOps teams would certainly benefit from having robust data-driven methods to support their decisions based on historical information. For instance, effective anomaly detection capabilities may also help in conducting more precise and efficient root-cause analysis, while leveraging accurate forecasting and intelligent control strategies would improve resource management. Given their ability to deal with high-dimensional, complex data, Deep Learning-based methods are the most straightforward option for the realization of the aforementioned support tools. On the other hand, because of their complexity, this kind of model often requires huge processing power, and suitable hardware, to be operated effectively at scale. These aspects must be carefully addressed when applying such methods in the context of data center operations: automated operations approaches must be dependable and cost-efficient, so as not to degrade the very services they are built to improve
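    To make the anomaly-detection task mentioned above concrete, here is a deliberately simple rolling z-score baseline over a single host metric. The thesis argues for deep-learning methods, so this is only a hedged illustration; the window size, threshold and synthetic trace are assumptions.

```python
# Rolling z-score baseline for flagging anomalous points in a host metric.
import numpy as np

def rolling_zscore_anomalies(series, window=60, threshold=4.0):
    """Flag points that deviate strongly from the recent rolling statistics."""
    flags = np.zeros(len(series), dtype=bool)
    for t in range(window, len(series)):
        past = series[t - window:t]
        mu, sigma = past.mean(), past.std()
        if sigma > 0 and abs(series[t] - mu) / sigma > threshold:
            flags[t] = True
    return flags

# Toy CPU-utilisation trace with an injected spike at t = 500.
rng = np.random.default_rng(7)
cpu = 40 + 5 * np.sin(np.arange(1000) / 50) + rng.normal(scale=1.0, size=1000)
cpu[500] += 30
print(np.flatnonzero(rolling_zscore_anomalies(cpu)))  # expected to flag the spike near t = 500
```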

    Methodologies for the assessment of industrial and energy assets, based on data analysis and BI

    Get PDF
    In July 2020, after the onset of the pandemic, Europe launched the Next Generation EU (NGEU) programme. The amount of resources deployed to revitalize Europe reached €750 billion. The NGEU initiative directs significant resources to Italy; these funds can enable the country to boost investment and increase employment. The missions of the Italian Recovery and Resilience Plan (PNRR) include digitization, innovation and sustainable mobility (rail network investments, etc.). In this context, this doctoral thesis discusses the importance of infrastructure for society, with a special focus on energy, railway and motorway infrastructure. The central theme of sustainability, defined by the World Commission on Environment and Development (WCED) as "development that meets the needs of the present generation without compromising the ability of future generations to meet their needs", is also highlighted. Through their activities and relationships, organizations contribute positively or negatively to the goal of sustainable development, and sustainability becomes an integral part of corporate culture.
    The first line of research in this thesis describes how Artificial Intelligence techniques can support both maintenance operators in tunnel monitoring and those responsible for operational safety. Relevant information can be extracted from large volumes of sensor data in an efficient, fast, dynamic and adaptive manner and made immediately usable by those operating machinery and services, to support rapid decisions. Performing sensor-based analysis in motorway tunnels represents a major technological step forward that would simplify tunnel management activities, and thus the detection of possible deterioration, while keeping risk within tolerance limits. The idea involves the creation of a fault-detection algorithm that acquires real-time data from the tunnel subsystem sensors and uses it to help identify the tunnel's state of service. The artificial intelligence models were trained over a six-month period on time series with one-hour granularity, measured on a road tunnel that is part of the Italian motorway system. The verification was carried out with reference to a series of failures recorded by the sensors.
    The second line of research relates to the transfer capacity of high-voltage overhead lines (HV OHL), which is often limited by the critical temperature of the power line; this depends on the magnitude of the current transferred and on the environmental conditions, i.e. ambient temperature, wind, etc. In order to use existing power lines more effectively (with a view to progressive decarbonization) and more safely with respect to critical line temperatures, this work proposes a Dynamic Thermal Rating (DTR) approach using IoT sensors installed on a number of HV OHL located in different geographical areas of Italy. The objective is to estimate the temperature and ampacity of the OHL conductor using a data-driven thermomechanical model with a Bayesian probabilistic approach, in order to improve the confidence interval of the results. This work shows that it might be possible to estimate a spatio-temporal temperature distribution for each OHL and an increase in the threshold values of the effective current, so as to optimize the OHL ampacity. The proposed model was validated using the Monte Carlo method.
    Finally, this thesis presents a study on KPIs as indispensable allies of top management in the asset control phase. Managers are often overwhelmed by the availability of a huge number of Key Performance Indicators (KPIs): most struggle to understand and identify the few vital management metrics and instead collect and report a vast amount of everything that is easy to measure. As a result, they end up drowning in data while thirsty for information, a condition that does not allow good systems management. The aim of this research is to help the Asset Management System (AMS) of a railway infrastructure manager use business intelligence (BI) to equip itself with a KPI management system in line with the asset management (AM) approach presented by the ISO 55000, 55001 and 55002 standards and the UIC (International Union of Railways) guidelines, for the specific case of a railway infrastructure. This work starts from the study of these standards and continues with the exploration, definition and use of KPIs. Subsequently, the KPIs of a generic infrastructure are identified and analyzed, especially for the specific case of a railway infrastructure manager, and fitted into the internal elements of the AM frameworks (ISO and UIC) for systematization. Moreover, the KPIs currently used in the company are analyzed and compared with the KPIs that an infrastructure manager should have. Starting from here, a gap analysis is carried out to optimize the AMS
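    As a minimal illustration of the Bayesian idea behind the DTR work above (fusing a model-based prior with noisy sensor readings to tighten the confidence interval), the sketch below performs a conjugate Gaussian update on a conductor temperature. The prior, noise levels and readings are invented; this is not the thesis's thermomechanical model.

```python
# Conjugate Gaussian update: prior estimate of conductor temperature refined
# with IoT sensor readings; all numbers are hypothetical.
import numpy as np

def gaussian_update(prior_mean, prior_var, measurements, meas_var):
    """Posterior of a Gaussian mean under known-variance Gaussian observations."""
    n = len(measurements)
    post_var = 1.0 / (1.0 / prior_var + n / meas_var)
    post_mean = post_var * (prior_mean / prior_var + np.sum(measurements) / meas_var)
    return post_mean, post_var

prior_mean, prior_var = 62.0, 25.0               # deg C, e.g. from a steady-state heat balance
readings = np.array([58.9, 59.7, 59.2, 60.1])    # deg C, sensor samples
meas_var = 4.0                                   # assumed sensor noise variance

mean, var = gaussian_update(prior_mean, prior_var, readings, meas_var)
print(f"posterior: {mean:.1f} +/- {1.96 * np.sqrt(var):.1f} deg C (95% CI)")
```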

    The 8th International Conference on Time Series and Forecasting

    Get PDF
    The aim of ITISE 2022 is to create a friendly environment that could lead to the establishment or strengthening of scientific collaborations and exchanges among attendees. Therefore, ITISE 2022 is soliciting high-quality original research papers (including significant works-in-progress) on any aspect of time series analysis and forecasting, in order to motivate the generation and use of new knowledge, computational techniques and methods for forecasting in a wide range of fields

    Contributions to time series data mining departing from the problem of road travel time modeling

    Get PDF
    194 p.
    Advanced Traveler Information Systems (ATIS) collect, process and disseminate the data gathered by road sensors, helping users during their trips and easing the decisions they have to take before starting a journey and along the way [5]. For this purpose, ATIS need traffic models, since these allow the traffic variables that may be useful to travelers to be described, simulated and forecast. Specifically, among all the traffic variables that could be considered (flow, road occupancy, speeds, etc.), travel time is the most intuitive and the easiest to understand for users and, therefore, the one that takes on special importance in ATIS [6]. Travel time is the time a vehicle needs to go from one predefined point to another.
    Two main problems are distinguished in travel time modeling: estimation and prediction. Although the literature sometimes treats these two concepts as equivalent, they are in fact two distinct problems, with different characteristics and objectives, which call for different techniques. On the one hand, the goal of travel time estimation is to calculate how much time, on average, vehicles have spent on trips that have already been completed. For this, traffic information collected along the route and/or other data (weather, calendar information, etc.) can be used [1]. The different estimation methods can be classified according to the type and amount of data available, and they serve for a posteriori assessments. On the other hand, travel time prediction consists in calculating the times of trips that start in the present or in the future. For this, traffic data collected up to the moment the prediction is made, historical data and contextual information are used [8].
    As a consequence of the growth in the number of vehicles and in congestion, obtaining good travel time estimations and predictions is increasingly necessary, because it enables adequate traffic management. In view of this, a large variety of model types have been proposed and published in recent years. In the first part of this thesis we carry out a thorough review and analysis of this literature, from which we conclude that not all the proposed models are suitable for every road network, traffic situation and data type. Indeed, our most notable conclusion is that many of the published models do not meet the practical requirements of ATIS. First of all, many models can only be applied to short road stretches, and it is not clear how they could be extended to a whole road network. In addition, most models use a single type of data, whereas in practice it is common to have to work with more than one data type. Finally, limited flexibility in the face of atypical congestion is also a notable and common drawback. In view of all this, combined or hybrid models appear to be the most promising of all these proposals, because they are able to adapt to different patterns and because they allow different models and data types to be mixed.
    In this thesis we take hybrid or combined models for travel time prediction as our starting point. Specifically, we focus on those that begin by grouping the data according to their similarity. After clustering the data, these methods apply to each group a different travel time prediction model, more accurate and built expressly for that specific pattern. A special case within this family of models is the one that groups the data by means of time series clustering.
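    A hedged sketch of the cluster-then-predict scheme described above: daily travel-time profiles are grouped with k-means and a separate, simple predictor is fitted per cluster. The synthetic profiles, the number of clusters and the per-cluster model are illustrative assumptions, not the pipeline used in the thesis.

```python
# Hypothetical daily travel-time profiles (minutes), drawn from two day types.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
hours = np.arange(24)
weekday = 10 + 6 * np.exp(-((hours - 8) ** 2) / 4) + 5 * np.exp(-((hours - 18) ** 2) / 4)
weekend = 10 + 1.5 * np.exp(-((hours - 13) ** 2) / 16)
days = np.vstack([weekday + rng.normal(scale=0.5, size=24) for _ in range(50)]
                 + [weekend + rng.normal(scale=0.5, size=24) for _ in range(20)])

# Step 1: cluster whole-day series into "day types".
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(days)

# Step 2: per cluster, fit a model that predicts the travel time two hours
# ahead from the three most recent hourly observations.
models = {}
for c in range(2):
    X, y = [], []
    for day in days[kmeans.labels_ == c]:
        for t in range(3, 22):
            X.append(day[t - 3:t])
            y.append(day[t + 2])
    models[c] = Ridge().fit(np.array(X), np.array(y))

# Prediction for a new day: assign it to a day type, then use that type's model.
# (The full profile is used for the assignment here, for simplicity; the thesis
# deals with the harder case where only the first part of the day is known.)
new_day = weekday + rng.normal(scale=0.5, size=24)
c = int(kmeans.predict(new_day.reshape(1, -1))[0])
print(models[c].predict(new_day[7:10].reshape(1, -1)))  # forecast for hour 12
```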
    Time series clustering is an unsupervised data mining task in which the goal, given a set of time series, that is, a time series database, is to divide these series into homogeneous groups [3]. The aim is therefore for the series in the same group to be as similar as possible and for the series in different groups to be as dissimilar as possible. In traffic data and travel times it is very common to find days with different behaviors (e.g., weekdays and weekends). Thus, given a series made up of the travel times collected throughout a whole day, this type of method would first identify the corresponding day type and would then obtain the predictions with a model built specifically for that day type.
    This type of model based on time series clustering has hardly ever been used in the literature and, as a consequence, its benefits and drawbacks have not been studied in depth until now. For this reason, in the second chapter of this thesis we analyze whether identifying different day types at the beginning of the modeling procedure helps to obtain travel time predictions, obtaining positive results. However, in practice, building and using such combined models brings more than one difficulty. In this thesis we focus on two main problems and aim to propose a solution for each of them.
    To begin with, clustering time series requires some non-trivial decisions, such as choosing a suitable distance function. It has been shown more than once in the literature that this decision is very important and strongly conditions the results that will be obtained [7], and we show that this is also the case for traffic data. However, choosing a distance is far from easy. In recent years the research community has proposed a multitude of different distances for working with time series and, depending on the characteristics of each database, one or another appears to be the most suitable [3, 7]. To the best of our knowledge, there is no formal methodology that helps users make this choice, particularly in the context of time series clustering. The most common practice is to try a set of distances and to choose one based on the results obtained. Unfortunately, some of these distances are computationally very expensive to calculate, so this strategy is not at all efficient in practice.
    With the aim of simplifying this task, in the third chapter of the thesis we propose a multi-label classifier that automatically selects the most suitable distances for clustering a time series database. To build this classifier, we first define a set of features that describe certain aspects of a time series database. Among others, we propose methods to measure and quantify the level of noise in the data, the degree of autocorrelation, the number of atypical series, the periodicity and several other characteristics. These features constitute the input information required by the classifier, that is, its input variables. As output, the classifier returns the distances that are most suitable for a given database, chosen from a set of candidates. To verify the usefulness of this classifier, we have carried out an extensive set of experiments, both with synthetic databases created specifically for this work and with real data from the UCR archive [4].
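    The following sketch illustrates the distance-selection idea just described: summarize each time-series database with a few meta-features and train a multi-label classifier that marks which candidate distances tend to work well. The meta-features, the candidate distances and the training labels are invented for illustration and do not reproduce the classifier proposed in the thesis.

```python
# Meta-feature based, multi-label distance recommendation (illustrative only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

CANDIDATE_DISTANCES = ["euclidean", "dtw", "correlation"]

def meta_features(database):
    """Summarise a set of series: noise level, lag-1 autocorrelation, outlier rate."""
    diffs = np.diff(database, axis=1)
    noise = np.median(diffs.std(axis=1))
    ac1 = np.mean([np.corrcoef(s[:-1], s[1:])[0, 1] for s in database])
    z = np.abs(database - database.mean()) / (database.std() + 1e-9)
    outlier_rate = float(np.mean(z > 3))
    return np.array([noise, ac1, outlier_rate])

# Hypothetical training corpus: meta-features of past databases, plus one
# binary label per candidate distance (1 = that distance clustered it well).
rng = np.random.default_rng(3)
X_meta = rng.normal(size=(40, 3))
Y_good = (rng.random((40, len(CANDIDATE_DISTANCES))) > 0.5).astype(int)

selector = MultiOutputClassifier(RandomForestClassifier(random_state=0)).fit(X_meta, Y_good)

new_db = rng.normal(size=(100, 50)).cumsum(axis=1)    # a new, unseen database
recommended = selector.predict(meta_features(new_db).reshape(1, -1))[0]
print([d for d, ok in zip(CANDIDATE_DISTANCES, recommended) if ok])
```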
    The positive results obtained make it clear that the proposed classifier is useful for easing the choice of a distance function for clustering time series.
    Having presented this contribution, we return to combined models for travel time prediction and identify a second problem, which leads us to the second main contribution of the thesis. Recall that these combined models initially group the data using clustering algorithms, where each group represents a different pattern or traffic behavior. Then, to make predictions within each group, a different prediction model is built using only the historical data of that group. In our case, we apply time series clustering, so we obtain different day types. Afterwards, to make new predictions, when a new day begins we have to determine which cluster it belongs to in order to choose the model that must be used.
    Note that, at prediction time, the data for the whole day are not yet available. For example, if at ten in the morning we want to predict the time needed to go from one point to another at twelve noon (2 hours later), we will only have the information collected that day up to ten o'clock, together, of course, with the historical information. In this situation, with the partial information for that day, that is, only the first part of the series, we must decide which cluster it belongs to. Of course, if the information collected up to that moment is not sufficiently representative, choosing a specific cluster and model can be counterproductive, and it will probably be better to use a more general model built with all the historical data. In short, we want to assign new days to a cluster as soon as possible, while making as few errors as possible in these assignments. It is logical to think that the earlier the assignments are made, the greater the chance of making mistakes. Thus, the objective is to make the assignments as early as possible while guaranteeing an acceptable level of accuracy. In time series data mining this problem is known as early classification of time series [10].
    Time series classification [9, 10] is a well-known supervised data mining problem in which, given a set of time series and the class of each of them, the goal is to build a classifier capable of predicting the classes of new series. As a sub-problem of time series classification, early classification arises when there is a desire or need to classify a stream of data arriving over time into a specific class as soon as possible [10]. For example, in medical informatics the clinical data of a patient are monitored and collected over time, and the early detection of some conditions is decisive for the patient's state: arterial occlusion, for instance, is most easily detected through photoplethysmography (PPG) series [2], but a delay of a tenth of a second in the diagnosis can lead to completely different outcomes.
    Thus, in Chapter 4 of the thesis, as a second important contribution to time series data mining, we present an early time series classifier called ECDIRE (Early Classification framework for time series based on class DIscriminativeness and REliability of predictions). To build this classifier, in the training phase the method analyzes each class and calculates the point in time from which it can be distinguished from the other classes, while maintaining a previously established accuracy level.
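    A minimal sketch of the early-classification idea: probabilistic classifiers are trained on truncated series, and a growing prefix is only assigned to a class once the predicted probability exceeds a reliability threshold. The synthetic data, prefix lengths and threshold are assumptions; this is not the full ECDIRE procedure.

```python
# Prefix-based early classification with a probability threshold (toy example).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
T = 60

def make_series(cls, n):
    """Two classes that only start to differ after time step 30."""
    base = rng.normal(scale=0.3, size=(n, T))
    if cls == 1:
        base[:, 30:] += np.linspace(0, 2, T - 30)
    return base

X_train = np.vstack([make_series(0, 50), make_series(1, 50)])
y_train = np.array([0] * 50 + [1] * 50)

# One probabilistic classifier per prefix length (a simplification).
prefix_lengths = [10, 20, 30, 40, 50, 60]
clfs = {L: LogisticRegression(max_iter=1000).fit(X_train[:, :L], y_train)
        for L in prefix_lengths}

def classify_early(series, threshold=0.95):
    """Return (time step, class) as soon as the prediction looks reliable."""
    for L in prefix_lengths:
        proba = clfs[L].predict_proba(series[:L].reshape(1, -1))[0]
        if proba.max() >= threshold:
            return L, int(proba.argmax())
    return T, int(proba.argmax())             # fall back to the full series

# Prints the prefix length at which the decision was made and the predicted class.
print(classify_early(make_series(1, 1)[0]))
```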
    This accuracy level is set by the user according to their interests. The information obtained in this training phase determines when the classifications should be made and therefore helps to avoid assigning series too early. In addition, the ECDIRE method uses probabilistic classifiers, and the a posteriori probabilities obtained from this type of classifier provide a further way of controlling the accuracy of the resulting classifications.
    We have applied the ECDIRE method to 45 databases from the UCR archive, improving on the results obtained in the literature so far. In addition, to show how the method would be applied in a real case, we have also carried out experiments with a database created for a problem of detecting and identifying birds through their songs, obtaining satisfactory results.
    Finally, we return once more to travel time prediction and apply the two previous contributions to this problem. From the results obtained, we identify some adaptations that should be made to the two proposed methods for this specific problem. To begin with, besides choosing the distances, their parameters must also be chosen; for this we use indices such as the silhouette, although it remains to be clarified whether this is the best method for this task. Furthermore, we observe that thorough data cleaning and pre-processing are necessary, since atypical series and noise have a strong influence on the clustering solutions. Finally, our experiments are based on simple historical prediction models, which make predictions by averaging the travel times recorded at the same time of day; using more complex models could be an interesting option.
    In summary, this thesis starts from an analysis of the travel time modeling literature and, departing from it, makes two contributions to time series data mining: first, the design of a method for automatically selecting the distance with which to cluster a set of time series and, second, an early time series classifier based on probabilistic classifiers. Finally, we return once more to the travel time modeling problem and apply the two previous contributions in this context, opening up new lines of research for the future
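    As a closing aside, the simple historical baseline mentioned in the abstract above (predicting by averaging the travel times recorded at the same time of day) can be written in a few lines; the data shapes and the weekday profile below are assumptions.

```python
# Simple historical-average travel time model (illustrative data).
import numpy as np

def historical_average_model(history):
    """history: array of shape (n_days, 24) of hourly travel times in minutes."""
    return history.mean(axis=0)              # one expected travel time per hour

rng = np.random.default_rng(11)
weekday_profile = 10 + 6 * np.exp(-((np.arange(24) - 8) ** 2) / 4)
weekday_history = weekday_profile + rng.normal(scale=0.5, size=(60, 24))

model = historical_average_model(weekday_history)
print(f"predicted travel time at 08:00 on a weekday-type day: {model[8]:.1f} min")
```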