3,439 research outputs found

    Analytics of human presence and movement behaviour within specific environments

    The vast amounts of detailed information generated by Wi-Fi and other mobile communication technologies provide an invaluable opportunity to study different aspects of the presence and movement behaviours of people within a given environment; for example, a university campus, an organisation's office complex, or a city centre. Utilising such data, this thesis studies three main aspects of human presence and movement behaviour: spatio-temporal movement (where and when people move), user identification (how to uniquely identify people from their historical presence and movement records), and social grouping (how people interact). Previous research has mostly studied at most two of these three aspects. In contrast, we investigate all three in order to develop a coherent view of human presence and movement behaviour within selected environments. More specifically, we create stochastic models for movement prediction and user identification. We also devise a set of clustering models for the detection of social groups within a given environment. The thesis makes the following contributions:
    1. Proposes a family of predictive models that allows for inference of locations through a collaborative mechanism which does not require the profiling of individual users. These prediction models utilise suffix trees as their core underlying data structure, where predictions about a specific individual are computed over an aggregate model incorporating the collective record of observed behaviours of multiple users.
    2. Defines a mobility fingerprint as a profile constructed from the user's historical mobility traces. The proposed method for constructing such a profile is a principled and scalable implementation of a variable-length Markov model based on n-grams.
    3. Proposes density-based clustering methods that discover social groups by analysing activity traces of mobile users as they move around, from one location to another, within an observed environment.
    We utilise two large collections of mobility traces for the evaluation of the proposed models reported herein: a GPS data set from Nokia and an Eduroam network log from Birkbeck, University of London.
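
The collaborative prediction idea in this abstract (one aggregate model built from many users' traces, queried with a variable-length context) can be illustrated with a minimal sketch. This is not the thesis's implementation: a plain dictionary of n-gram counts stands in for the suffix tree, and the function names, the `max_order` parameter, and the sample location labels are all hypothetical.

```python
from collections import Counter, defaultdict


def build_ngram_model(traces, max_order=2):
    """Count which location follows each context of 1..max_order locations.

    All users' traces are pooled, so no per-user profile is kept.
    """
    model = defaultdict(Counter)
    for trace in traces:
        for i in range(1, len(trace)):
            for k in range(1, max_order + 1):
                if i - k < 0:
                    break  # not enough history for a context of length k
                context = tuple(trace[i - k:i])
                model[context][trace[i]] += 1
    return model


def predict_next(model, history, max_order=2):
    """Back off from the longest matching context to shorter ones."""
    for k in range(min(max_order, len(history)), 0, -1):
        context = tuple(history[-k:])
        if context in model:
            return model[context].most_common(1)[0][0]
    return None  # no context of any length has been observed


# Hypothetical traces from three different users.
traces = [
    ["library", "cafe", "lab"],
    ["library", "cafe", "lab"],
    ["gym", "cafe", "office"],
]
model = build_ngram_model(traces)
print(predict_next(model, ["library", "cafe"]))  # lab
print(predict_next(model, ["gym", "cafe"]))      # office
```

The back-off loop is what makes the model variable-length: a longer context is preferred when it has been seen, and a shorter one is used otherwise. A suffix tree would store the same contexts more compactly than this dictionary does.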

    Enhanced water demand analysis via symbolic approximation within an epidemiology-based forecasting framework

    Epidemiology-based models have been successfully adapted to challenges from various areas of engineering, such as energy use or asset management. This paper deals with urban water demand, with data analysis based on an epidemiology tool-set developed herein. This combination represents a novel framework in urban hydraulics. Specifically, various reduction tools for time series analysis are presented, based on a symbolic aggregate approximation (SAX) coding technique that works with simplified versions of the data sets. Then, a neural-network-based model that uses SAX-based knowledge generation from various time series is shown to improve forecasting abilities. This knowledge is produced by identifying water distribution district metered areas that are highly similar to a given target area and sharing their demand patterns with the latter. The proposal has been tested with databases from a Brazilian water utility, providing key knowledge for improving water management and hydraulic operation of the distribution system. This novel analysis framework shows several benefits in terms of accuracy and performance of neural network models for water demand.
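
For readers unfamiliar with SAX coding, the following is a minimal sketch of the standard three steps: z-normalisation, piecewise aggregate approximation (PAA), and discretisation against Gaussian breakpoints. It is not the paper's tool-set; the function name, the four-letter alphabet, and the sample readings are assumptions made for illustration, and the sketch requires the series length to be a multiple of the segment count.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Encode a numeric series as a SAX word of n_segments symbols."""
    if len(series) % n_segments != 0:
        raise ValueError("series length must be a multiple of n_segments")

    # Step 1: z-normalise so that series of different scale are comparable.
    mu, sigma = mean(series), pstdev(series)
    z = [0.0 if sigma == 0 else (x - mu) / sigma for x in series]

    # Step 2: PAA, replacing each equal-length segment by its mean.
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]

    # Step 3: map each segment mean to a symbol using breakpoints that
    # split the standard normal distribution into equiprobable regions.
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / a) for j in range(1, a)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


# Hypothetical demand readings: a low period followed by a high period.
print(sax([1, 1, 2, 2, 8, 8, 9, 9], n_segments=4))  # aadd
```

Two metered areas whose demand curves map to similar SAX words can then be compared as strings, which is far cheaper than comparing the raw series.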

    Enhanced Water Demand Analysis via Symbolic Approximation within an Epidemiology-Based Forecasting Framework

    Epidemiology-based models have been successfully adapted to challenges from various areas of engineering, such as energy use or asset management. This paper deals with urban water demand, with data analysis based on an epidemiology tool-set developed herein. This combination represents a novel framework in urban hydraulics. Specifically, various reduction tools for time series analysis are presented, based on a symbolic aggregate approximation (SAX) coding technique that works with simplified versions of the data sets. Then, a neural-network-based model that uses SAX-based knowledge generation from various time series is shown to improve forecasting abilities. This knowledge is produced by identifying water distribution district metered areas that are highly similar to a given target area and sharing their demand patterns with the latter. The proposal has been tested with databases from a Brazilian water utility, providing key knowledge for improving water management and hydraulic operation of the distribution system. This novel analysis framework shows several benefits in terms of accuracy and performance of neural network models for water demand.
    Navarrete-López, CF.; Herrera Fernández, AM.; Brentan, BM.; Luvizotto Jr., E.; Izquierdo Sebastián, J. (2019). Enhanced Water Demand Analysis via Symbolic Approximation within an Epidemiology-Based Forecasting Framework. Water. 11(246):1-17. https://doi.org/10.3390/w11020246

    The Parallelism Motifs of Genomic Data Analysis

    Genomic data sets are growing dramatically as the cost of sequencing continues to decline and small sequencing devices become available. Enormous community databases store and share this data with the research community, but some of these genomic data analysis problems require large-scale computational platforms to meet both the memory and computational requirements. These applications differ from the scientific simulations that dominate the workload on high-end parallel systems today, and they place different requirements on programming support, software libraries, and parallel architectural design. For example, they involve irregular communication patterns such as asynchronous updates to shared data structures. We consider several problems in high-performance genomics analysis, including alignment, profiling, clustering, and assembly for both single genomes and metagenomes. We identify some of the common computational patterns or motifs that help inform parallelization strategies and compare our motifs to some of the established lists, arguing that at least two key patterns, sorting and hashing, are missing.
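
To make the two patterns named at the end of this abstract concrete, here is a small serial sketch that counts k-mers (length-k substrings of sequencing reads) once with a hash table and once by sorting. The choice of k-mer counting as the example, the function names, and the sample reads are assumptions for illustration; the abstract does not describe a specific implementation.

```python
from collections import Counter
from itertools import groupby


def kmers(read, k):
    """Yield every length-k substring of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


def count_by_hashing(reads, k):
    """Hashing: each k-mer is an update to a shared hash table."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return dict(counts)


def count_by_sorting(reads, k):
    """Sorting: equal k-mers become adjacent and are counted in runs."""
    all_kmers = sorted(km for read in reads for km in kmers(read, k))
    return {km: sum(1 for _ in run) for km, run in groupby(all_kmers)}


# Hypothetical reads; both strategies must produce the same counts.
reads = ["ACGTAC", "GTACGT"]
assert count_by_hashing(reads, 3) == count_by_sorting(reads, 3)
print(count_by_hashing(reads, 3)["GTA"])  # 2
```

In a distributed setting the hashing variant turns into many small, irregular updates to a table spread across nodes, while the sorting variant turns into a bulk data exchange, which is one way to see why the two place different demands on a parallel system.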