10 research outputs found
Big data y NoSQL: su rol en la revolución del cloud computing y sus retos hacia la estandarización
El auge en la utilización de las tecnologías de información (TI) orientadas a cloud computing ha generado grandes cantidades y variedades de datos, o Big Data, en los centros de datos de las organizaciones proveedoras de estos servicios, generando retos a los desarrolladores y administradores de las TI para el manejo eficiente de esos recursos. Frente a este fenómeno, las bases de datos relacionales son consideradas por los profesionales de las TI como una obstrucción para ofrecer las características esenciales del cloud computing. Esto ha impulsado una serie de nuevos enfoques y tecnologías para la gestión de datos, conocidos como bases de datos no relacionales o NoSQL, diseñadas para proveer rapidez, escalabilidad, alta disponibilidad y elasticidad, facilitando la administración de datos en las soluciones tecnológicas en cloud computing. En este artículo se exponen los conceptos de Big Data, sus principales herramientas tecnológicas para la gestión de datos, el movimiento NoSQL, el rol de estos en la revolución del cloud computing y los retos de estas tecnologías hacia la estandarización
Recommended from our members
Bringing modular concurrency control to the next level
Database users face a tension between ease-of-programming and high performance: ACID transactions can greatly simplify the programming effort of database applications by providing four useful properties—atomicity, consistency, isolation, and durability, but enforcing these properties can degrade performance.
This dissertation eases this tension by improving the performance of ACID transactions for scenarios where data contention is the bottleneck. The approach that we take is federating concurrency control (CC) mechanisms. It is based on the observation that any single CC mechanism is bound to make trade-offs that cause it to perform well in some cases but poorly in others. A federation opens the opportunity of applying each mechanism only to the set of transactions or workloads where it shines, while maintaining isolation.
In particular, this work builds upon Modular Concurrency Control (MCC), a recent technique that federates CCs by partitioning transactions into groups, and by applying different CC mechanisms in each group.
This dissertation addresses two critical shortcomings in the current embodiment of MCC. First, cross-group data conflicts are handled with a single, unoptimized CC mechanism that can significantly limit performance. Second, configuring MCC is a complex task, which runs counter to MCC’s purpose: to improve performance without sacrificing ease-of-programming.
To address these problems, this dissertation presents Tebaldi, a new transactional database that brings Modular Concurrency Control to the next level, both figuratively and literally. Tebaldi introduces a new, hierarchical model to MCC that partitions transactions recursively to compose CC mechanisms in a multi-level tree. This model increases flexibility in federating CC mechanisms, which is the key to realizing the performance potential of federation. Tebaldi reduces configuration complexity by managing the MCC federation automatically: it can detect performance issues in the current workload in real-time, and automatically adjusts its configuration to improve its performance.Computer Science
Tècniques per a la traducció d'esquemes relacionals a no relacionals
En aquest projecte es vol buscar si existeixen tècniques o regles a l'hora de convertir un esquema relacional a no relacional. Per tal de fer-ho s'han realitzat proves de rendiment sobre un sistema relacional i les diferents variants de l'esquema en un sistema no relacional
Laajan mittakaavan Internet-sovelluksia varten kehitetyt hajautetut tietokannat
Suurten Internet-yritysten, kuten Googlen ja Amazonin tarjoamat palvelut edellyttävät valtavien hajautettujen tietomäärien käsittelyä ja varastoimista. Tiedon pitää olla hyvin saatavilla. Tietokantajärjestelmältä edellytetään myös hyvää suorituskykyä. Suorituskyvyn ylläpitämiseksi järjestelmän täytyy skaalautua niin, että tarpeen vaatiessa järjestelmään voidaan lisätä enemmän resursseja. Tietokannan rakenteen tulee olla lisäksi joustava ja helposti muokattavissa. Perinteiset relaatiotietokannat transaktionaalisine oikeellisuus- ja eristyvyysvaatimuksineen ovat olleet liian rajoittavia tähän tarkoitukseen, joten näiden laajan mittakaavan Internet-sovellusten vaatimuksiin on kehitetty muita vaihtoehtoja. Näitä järjestelmiä on alettu kutsua NoSQL-tietokantajärjestelmiksi. NoSQL-tietokannat ovat usein niin erikoistuneita, ettei relaatiomallia ja SQL-kyselykielen koko ilmaisuvoimaa tarvita tai voida käyttää. Näiden tietokantojen tietomalli perustuu avain-arvo-pariin, jossa varastoitu arvo on yksilöity indeksoitavan avaimen perusteella. Tietokannan skeema on taas usein hyvin joustava, tai tietokanta saattaa olla jopa kokonaan skeematon. Käytössä olevat funktiot ovatkin usein rajoittuneet yksittäisten avain-arvo-parien lukemiseen ja päivittämiseen. Näiden tietojen laajan mittakaavan rinnakkaiseen laskentaan on lisäksi kehitetty yksinkertainen MapReduce-ohjelmointiparadigma. Google ja Amazon hyödyntävät näitä järjestelmiä varten rakentamaansa laajan mittakaavan infrastruktuuria tarjoamalla sitä myös muiden yritysten sovelluksien alustaksi NoSQL-tietokantapalveluna.
Tässä tutkielmassa pyritään selventämään NoSQL-tietokantajärjestelmien tallennusratkaisun ja tiedon käsittelyn periaatteita, eroja relaatiotietokantajärjestelmiin sekä millaiseen käyttöön nämä uudet tietokantajärjestelmät oikein soveltuvat. Tutkielmassa esitellään myös MapReduce-ohjelmointiparadigma, NoSQL-tietokantapalveluna sekä joitakin NoSQL-tietokantajärjestelmien luokittelutapoja ja tietokannan tietomalleja. Tutkielma perustuu pääosin aikaisemmin aiheesta laadittuun kirjalliseen materiaaliin, kuten lehti- ja konferenssiartikkeleihin sekä kirjoihin.
NoSQL-tietokantajärjestelmien nykyistä kehitysvaihetta voidaan verrata aikaan ennen SQL:ää. Nämä järjestelmät ovat kovin heterogeeninen joukko, joten myös niiden luokittelu on vaikeaa. NoSQL-tietokantajärjestelmissä ei ole perinteisten relaatiotietokantajärjestelmien pitkälle kehitettyjä ominaisuuksia. Suurin osa edellä mainituista ominaisuuksista pitää toteuttaa sovelluslogiikassa, joten ne jäävät sovellusohjelmoijan vastuulle. Mikään tietokantajärjestelmä tai työkalu ei ole paras ratkaisu kaikkiin tehtäviin. Kussakin järjestelmässä on järkevää ja tehokasta käsitellä ja varastoida pääosin tietyn kaltaista sovellusalueen tietoa. Sopiva tietokantajärjestelmä tai työkalu riippuu täysin yrityksen ja sovelluksen vaatimuksista. Yrityksen tulee siis arvioida sovellusalueen tietojen vaatimuksia
Scalable data management for web applications
Steen, M.R. van [Promotor]Pierre, G.E.O. [Copromotor]Chi, C.H. [Copromotor
Flexibility in Data Management
With the ongoing expansion of information technology, new fields of application requiring data management emerge virtually every day. In our knowledge culture increasing amounts of data and work force organized in more creativity-oriented ways also radically change traditional fields of application and question established assumptions about data management. For instance, investigative analytics and agile software development move towards a very agile and flexible handling of data. As the primary facilitators of data management, database systems have to reflect and support these developments. However, traditional database management technology, in particular relational database systems, is built on assumptions of relatively stable application domains. The need to model all data up front in a prescriptive database schema earned relational database management systems the reputation among developers of being inflexible, dated, and cumbersome to work with. Nevertheless, relational systems still dominate the database market. They are a proven, standardized, and interoperable technology, well-known in IT departments with a work force of experienced and trained developers and administrators.
This thesis aims at resolving the growing contradiction between the popularity and omnipresence of relational systems in companies and their increasingly bad reputation among developers. It adapts relational database technology towards more agility and flexibility. We envision a descriptive schema-comes-second relational database system, which is entity-oriented instead of schema-oriented; descriptive rather than prescriptive. The thesis provides four main contributions: (1)~a flexible relational data model, which frees relational data management from having a prescriptive schema; (2)~autonomous physical entity domains, which partition self-descriptive data according to their schema properties for better query performance; (3)~a freely adjustable storage engine, which allows adapting the physical data layout used to properties of the data and of the workload; and (4)~a self-managed indexing infrastructure, which autonomously collects and adapts index information under the presence of dynamic workloads and evolving schemas. The flexible relational data model is the thesis\' central contribution. It describes the functional appearance of the descriptive schema-comes-second relational database system. The other three contributions improve components in the architecture of database management systems to increase the query performance and the manageability of descriptive schema-comes-second relational database systems. We are confident that these four contributions can help paving the way to a more flexible future for relational database management technology
Flexibility in Data Management
With the ongoing expansion of information technology, new fields of application requiring data management emerge virtually every day. In our knowledge culture increasing amounts of data and work force organized in more creativity-oriented ways also radically change traditional fields of application and question established assumptions about data management. For instance, investigative analytics and agile software development move towards a very agile and flexible handling of data. As the primary facilitators of data management, database systems have to reflect and support these developments. However, traditional database management technology, in particular relational database systems, is built on assumptions of relatively stable application domains. The need to model all data up front in a prescriptive database schema earned relational database management systems the reputation among developers of being inflexible, dated, and cumbersome to work with. Nevertheless, relational systems still dominate the database market. They are a proven, standardized, and interoperable technology, well-known in IT departments with a work force of experienced and trained developers and administrators.
This thesis aims at resolving the growing contradiction between the popularity and omnipresence of relational systems in companies and their increasingly bad reputation among developers. It adapts relational database technology towards more agility and flexibility. We envision a descriptive schema-comes-second relational database system, which is entity-oriented instead of schema-oriented; descriptive rather than prescriptive. The thesis provides four main contributions: (1)~a flexible relational data model, which frees relational data management from having a prescriptive schema; (2)~autonomous physical entity domains, which partition self-descriptive data according to their schema properties for better query performance; (3)~a freely adjustable storage engine, which allows adapting the physical data layout used to properties of the data and of the workload; and (4)~a self-managed indexing infrastructure, which autonomously collects and adapts index information under the presence of dynamic workloads and evolving schemas. The flexible relational data model is the thesis\' central contribution. It describes the functional appearance of the descriptive schema-comes-second relational database system. The other three contributions improve components in the architecture of database management systems to increase the query performance and the manageability of descriptive schema-comes-second relational database systems. We are confident that these four contributions can help paving the way to a more flexible future for relational database management technology