1,032 research outputs found

    Alphabet indexing for approximating features of symbols

    We consider two maximization problems to find a mapping from a large alphabet forming given two sets of strings to a set of a very few symbols specifying a symbol wise transformation of strings. First we show that the problem to find a mapping that transforms the most of the strings as to form disjoint sets cannot be approximated within a ratio n^{1/16} in polynomial time, unless P = NP. Next we consider a mapping that retains the difference of the maximum number of pairs of strings over the given sets. We present a polynomial-time approximation algorithm that guarantees a ratio k(k − 1) for mappings to k symbols, as well as proving that the problem is hard to approximate within an arbitrary small ratio in polynomial time. Furthermore, we extend this algorithm as to deal with not only pairs but also tuples of strings and show that it achieves a constant approximation ratio.

    Towards a theory of patches

    Many applications have a need for indexing unstructured data. It turns out that a similar ad-hoc method is being used in many of them: that of considering small particles of the data. In this paper we formalize this concept as a tiling problem and consider the efficiency of dealing with this model in the pattern matching setting. We present an efficient algorithm for the one-dimensional tiling problem, and the one-dimensional tiled pattern matching problem. We prove the two-dimensional problem is hard and then develop an approximation algorithm with an approximation ratio converging to 2. We show that other two-dimensional versions of the problem are also hard, regardless of the number of neighbors a tile has.

    Approaches to Sequence Similarity Representation

    We discuss several approaches to similarity preserving coding of symbol sequences and possible connections of their distributed versions to metric embeddings. Interpreting sequence representation methods with embeddings can help develop an approach to their analysis and may lead to discovering useful properties.

    Enhanced water demand analysis via symbolic approximation within an epidemiology-based forecasting framework

    Epidemiology-based models have shown to have successful adaptations to deal with challenges coming from various areas of Engineering, such as those related to energy use or asset management. This paper deals with urban water demand, and data analysis is based on an Epidemiology tool-set herein developed. This combination represents a novel framework in urban hydraulics. Specifically, various reduction tools for time series analyses based on a symbolic approximate (SAX) coding technique able to deal with simple versions of data sets are presented. Then, a neural-network-based model that uses SAX-based knowledge-generation from various time series is shown to improve forecasting abilities. This knowledge is produced by identifying water distribution district metered areas of high similarity to a given target area and sharing demand patterns with the latter. The proposal has been tested with databases from a Brazilian water utility, providing key knowledge for improving water management and hydraulic operation of the distribution system. This novel analysis framework shows several benefits in terms of accuracy and performance of neural network models for water demand.
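
    For readers unfamiliar with SAX coding, the following is a minimal sketch of the general technique (z-normalisation, piecewise aggregation, then discretisation against Gaussian breakpoints). It is not the authors' implementation: the function name `sax`, its parameters, and the default alphabet are assumptions chosen for illustration, and the series length is assumed to be a multiple of the number of segments.

```python
from bisect import bisect_left
from statistics import NormalDist, fmean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Return a SAX word for `series` (hypothetical helper, not the paper's code)."""
    if len(series) % n_segments != 0:
        raise ValueError("series length must be a multiple of n_segments")
    mu, sigma = fmean(series), pstdev(series)
    # z-normalise; a constant series is mapped to all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Piecewise Aggregate Approximation: mean of each equal-width segment
    width = len(z) // n_segments
    paa = [fmean(z[i * width:(i + 1) * width]) for i in range(n_segments)]
    # Breakpoints cutting the standard normal into equiprobable regions
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    # Each segment mean becomes the symbol of the region it falls into
    return "".join(alphabet[bisect_left(breakpoints, v)] for v in paa)
```

    For instance, `sax([1, 1, 1, 1, 10, 10, 10, 10], 2, "ab")` yields the word `"ab"`: a low half followed by a high half. Series reduced to such words can then be compared symbol by symbol, which is one plausible way to look for areas with similar demand patterns.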

    Strategies for Representing Tone in African Writing Systems

    Tone languages provide some interesting challenges for the designers of new orthographies. One approach is to omit tone marks, just as stress is not marked in English (zero marking). Another approach is to do phonemic tone analysis and then make heavy use of diacritic symbols to distinguish the `tonemes' (exhaustive marking). While orthographies based on either system have been successful, this may be thanks to our ability to manage inadequate orthographies rather than to any intrinsic advantage which is afforded by one or the other approach. In many cases, practical experience with both kinds of orthography in sub-Saharan Africa has shown that people have not been able to attain the level of reading and writing fluency that we know to be possible for the orthographies of non-tonal languages. In some cases this can be attributed to a sociolinguistic setting which does not favour vernacular literacy. In other cases, the orthography itself might be to blame. If the orthography of a tone language is difficult to use or to learn, then a good part of the reason, I believe, is that the designer either has not paid enough attention to the function of tone in the language, or has not ensured that the information encoded in the orthography is accessible to the ordinary (non-linguist) user of the language. If the writing of tone is not going to continue to be a stumbling block to literacy efforts, then a fresh approach to tone orthography is required, one which assigns high priority to these two factors. This article describes the problems with orthographies that use too few or too many tone marks, and critically evaluates a wide range of creative intermediate solutions. I review the contributions made by phonology and reading theory, and provide some broad methodological principles to guide someone who is seeking to represent tone in a writing system. The tone orthographies of several languages from sub-Saharan Africa are presented throughout the article, with particular emphasis on some tone languages of Cameroon.

    Enhanced Water Demand Analysis via Symbolic Approximation within an Epidemiology-Based Forecasting Framework

    Epidemiology-based models have shown to have successful adaptations to deal with challenges coming from various areas of Engineering, such as those related to energy use or asset management. This paper deals with urban water demand, and data analysis is based on an Epidemiology tool-set herein developed. This combination represents a novel framework in urban hydraulics. Specifically, various reduction tools for time series analyses based on a symbolic approximate (SAX) coding technique able to deal with simple versions of data sets are presented. Then, a neural-network-based model that uses SAX-based knowledge-generation from various time series is shown to improve forecasting abilities. This knowledge is produced by identifying water distribution district metered areas of high similarity to a given target area and sharing demand patterns with the latter. The proposal has been tested with databases from a Brazilian water utility, providing key knowledge for improving water management and hydraulic operation of the distribution system. This novel analysis framework shows several benefits in terms of accuracy and performance of neural network models for water demand.
    Navarrete-López, CF.; Herrera Fernández, AM.; Brentan, BM.; Luvizotto Jr., E.; Izquierdo Sebastián, J. (2019). Enhanced Water Demand Analysis via Symbolic Approximation within an Epidemiology-Based Forecasting Framework. Water. 11(246):1-17. https://doi.org/10.3390/w11020246