296 research outputs found

    MLI: An API for Distributed Machine Learning

    MLI is an Application Programming Interface designed to address the challenges of building Machine Learning algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of high-performance, scalable, distributed algorithms. Our initial results show that, relative to existing systems, this interface can be used to build distributed implementations of a wide variety of common Machine Learning algorithms with minimal complexity and highly competitive performance and scalability.

    Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture

    We present the architecture behind Twitter's real-time related query suggestion and spelling correction service. Although these tasks have received much attention in the web search literature, the Twitter context introduces a real-time "twist": after significant breaking news events, we aim to provide relevant results within minutes. This paper provides a case study illustrating the challenges of real-time data processing in the era of "big data". We tell the story of how our system was built twice: our first implementation was built on a typical Hadoop-based analytics stack, but was later replaced because it did not meet the latency requirements necessary to generate meaningful real-time results. The second implementation, which is the system deployed in production, is a custom in-memory processing engine specifically designed for the task. This experience taught us that the current typical usage of Hadoop as a "big data" platform, while great for experimentation, is not well suited to low-latency processing, and points the way to future work on data analytics platforms that can handle "big" as well as "fast" data.

    Evaluation of Clustering Algorithms on HPC Platforms

    Clustering algorithms are among the most widely used kernels for generating knowledge from large datasets. These algorithms group a set of data elements (e.g., images, points, patterns) into clusters to identify patterns or common features of a sample. However, they are computationally very expensive, as they often involve costly fitness functions that must be evaluated for all points in the dataset. This computational cost is even higher for fuzzy methods, where each data point may belong to more than one cluster. In this paper, we evaluate different parallelisation strategies on different heterogeneous platforms for widely used state-of-the-art fuzzy clustering algorithms: Fuzzy C-means (FCM), Gustafson-Kessel FCM (GK-FCM) and Fuzzy Minimals (FM). The experimental evaluation includes performance and energy trade-offs. Our results show that, depending on the computational pattern of each algorithm, its mathematical foundation and the amount of data to be processed, each algorithm performs better on a different platform.
    Funding: This work has been partially supported by the Spanish Ministry of Science and Innovation, under the Ramon y Cajal Program (Grant No. RYC2018-025580-I) and by the Spanish "Agencia Estatal de Investigacion" under grant PID2020-112827GB-I00 /AEI/ 10.13039/501100011033, and under grants RTI2018-096384-B-I00, RTC-2017-6389-5 and RTC2019-007159-5, by the Fundacion Seneca del Centro de Coordinacion de la Investigacion de la Region de Murcia under Project 20813/PI/18, and by the "Conselleria de Educacion, Investigacion, Cultura y Deporte, Direccio General de Ciencia i Investigacio, Proyectos AICO/2020", Spain, under Grant AICO/2020/302.
    Citation: Cebrian, JM.; Imbernón, B.; Soto, J.; Cecilia-Canales, JM. (2021). Evaluation of Clustering Algorithms on HPC Platforms. Mathematics. 9(17):1-20. https://doi.org/10.3390/math9172156
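    The abstract above notes that in fuzzy methods each data point may belong to more than one cluster. As a rough illustration only, the sketch below computes the textbook Fuzzy C-means membership degrees for fixed cluster centres. The function name, sample points, centres and fuzzifier m = 2 are assumptions made for this example; it is not the parallel implementation evaluated in the paper.

```python
import math

def fcm_memberships(points, centres, m=2.0):
    """Return u[i][j], the degree to which point i belongs to cluster j.

    Textbook FCM update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centres]
        if 0.0 in dists:
            # The point coincides with a centre: share full membership
            # among the coinciding centres to avoid dividing by zero.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                   for d_j in dists]
        memberships.append(row)
    return memberships

# Hypothetical 2-D data: the middle point lies between the two centres,
# so it receives a non-zero membership in both clusters.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centres = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships(points, centres)
```

    Each row of u sums to 1, and every point needs the distance to every centre, which is one reason the per-point work is costly on large datasets.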

    High-throughput fuzzy clustering on heterogeneous architectures

    The Internet of Things (IoT) is pushing the next economic revolution, in which the main players are data and immediacy. IoT is producing ever larger amounts of data that are now classified as "dark data" because most are created but never analyzed. The efficient analysis of this data deluge is becoming mandatory in order to transform it into meaningful information. Among the techniques available for this purpose, clustering techniques, which classify different patterns into groups, have proven to be very useful for obtaining knowledge from the data. However, clustering algorithms are computationally hard, especially for large data sets, and therefore require the most powerful computing platforms on the market. In this paper, we investigate coarse- and fine-grain parallelization strategies on Intel and Nvidia architectures for the fuzzy minimals (FM) algorithm, a fuzzy clustering technique that has shown very good results in the literature. We provide an in-depth performance analysis of FM's main bottlenecks, reporting a speed-up factor of up to 40x compared to the sequential version.
    Funding: This work was partially supported by the Fundacion Seneca del Centro de Coordinacion de la Investigacion de la Region de Murcia under Project 20813/PI/18, and by Spanish Ministry of Science, Innovation and Universities under grants TIN2016-78799-P (AEI/FEDER, UE), RTI2018-096384-B-I00, RTI2018-098156-B-C53 and RTC-2017-6389-5.
    Citation: Cebrian, JM.; Imbernón, B.; Soto, J.; García, JM.; Cecilia-Canales, JM. (2020). High-throughput fuzzy clustering on heterogeneous architectures. Future Generation Computer Systems. 106:401-411. https://doi.org/10.1016/j.future.2020.01.022

    Scaling out Big Data Distributed Pricing in Gaming Industry

    Game companies have millions of customers, billions of transactions and petabytes of other data related to game events. The vast volume and complexity of this data make it practically impossible to process and analyze using traditional relational database management systems (RDBMSs). This kind of data can be identified as big data, and handling it efficiently requires multiple issues to be taken into account. These problems are more straightforward to address when developing a completely new system, which can be implemented with all the new techniques and platforms that support big data handling. If an existing system has to be modified to accommodate big data volumes, however, there are more issues to consider. The thesis begins by clarifying the term 'big data'. Scalability and parallelism are key factors in handling big data, so they are explained and some common ways of achieving them are reviewed. Next, different tools and platforms for parallel programming are presented. The relevance of big data in the gaming industry is briefly explained, as are the different monetization models that games use. Furthermore, price elasticity of demand is explained to give a better understanding of a Dynamic Pricing Engine and what it does. In this thesis, I solve a bottleneck that emerges in data transfer and processing when big data is introduced to an existing system, the Dynamic Pricing Engine Apprien, by using parallel programming to scale the system. Spark is used to fetch and process the distributed data. The main focus is on the impact of parallel programming compared with the current solution, which is implemented with PHP and MySQL. The Spark implementation is also run against different data storage solutions (MySQL and Hadoop with HDFS), and their performance is compared. The results show that using Spark significantly reduces the time needed to process the data. However, the importance of choosing the right data storage should not be underestimated, as the time needed to fetch the data varies widely between storage solutions.
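    The abstract above refers to price elasticity of demand as background for dynamic pricing. As a hedged illustration of the standard economic definition (percentage change in quantity demanded divided by percentage change in price), the sketch below uses the midpoint (arc) formula. The function name and all prices and quantities are hypothetical; nothing here is taken from the thesis or describes how its pricing engine works.

```python
def arc_elasticity(p0, q0, p1, q1):
    """Midpoint (arc) price elasticity of demand between two observations.

    p0, q0: price and quantity demanded before the price change
    p1, q1: price and quantity demanded after the price change
    """
    pct_change_q = (q1 - q0) / ((q0 + q1) / 2.0)
    pct_change_p = (p1 - p0) / ((p0 + p1) / 2.0)
    return pct_change_q / pct_change_p

# Hypothetical in-game item: raising the price from 4.0 to 6.0 lowers
# purchases from 120 to 80 in the observed period.
e = arc_elasticity(p0=4.0, q0=120.0, p1=6.0, q1=80.0)
is_elastic = abs(e) > 1.0  # demand reacts more than proportionally to price
```

    With these made-up numbers the elasticity is -1 (unit elastic), so revenue is the same at both price points: 4.0 × 120 = 6.0 × 80 = 480.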

    A System Development Kit for Big Data Applications on FPGA-based Clusters: The EVEREST Approach

    Modern big data workflows are characterized by computationally intensive kernels. The simulated results are often combined with knowledge extracted from AI models to ultimately support decision-making. These energy-hungry workflows are increasingly executed in data centers with energy-efficient hardware accelerators since FPGAs are well-suited for this task due to their inherent parallelism. We present the H2020 project EVEREST, which has developed a system development kit (SDK) to simplify the creation of FPGA-accelerated kernels and manage the execution at runtime through a virtualization environment. This paper describes the main components of the EVEREST SDK and the benefits that can be achieved in our use cases.
    Comment: Accepted for presentation at DATE 2024 (multi-partner project session).