5 research outputs found
Benchmarking Summarizability Processing in XML Warehouses with Complex Hierarchies
Business Intelligence plays an important role in decision making. Based on
data warehouses and Online Analytical Processing, a business intelligence tool
can be used to analyze complex data. Still, summarizability issues in data
warehouses cause ineffective analyses that may become critical problems to
businesses. To settle this issue, many researchers have studied and proposed
various solutions, both in relational and XML data warehouses. However, they
find difficulty in evaluating the performance of their proposals since the
available benchmarks lack complex hierarchies. In order to contribute to
summarizability analysis, this paper proposes an extension to the XML warehouse
benchmark (XWeB) with complex hierarchies. The benchmark enables us to generate
XML data warehouses with scalable complex hierarchies as well as
summarizability processing. We experimentally demonstrated that complex
hierarchies can definitely be included into a benchmark dataset, and that our
benchmark is able to compare two alternative approaches dealing with
summarizability issues.Comment: 15th International Workshop on Data Warehousing and OLAP (DOLAP
2012), Maui : United States (2012
Flexible Integration and Efficient Analysis of Multidimensional Datasets from the Web
If numeric data from the Web are brought together, natural scientists can compare climate measurements with estimations, financial analysts can evaluate companies based on balance sheets and daily stock market values, and citizens can explore the GDP per capita from several data sources. However, heterogeneities and size of data remain a problem. This work presents methods to query a uniform view - the Global Cube - of available datasets from the Web and builds on Linked Data query approaches
Flexible Integration and Efficient Analysis of Multidimensional Datasets from the Web
If numeric data from the Web are brought together, natural scientists can compare climate measurements with estimations, financial analysts can evaluate companies based on balance sheets and daily stock market values, and citizens can explore the GDP per capita from several data sources. However, heterogeneities and size of data remain a problem. This work presents methods to query a uniform view - the Global Cube - of available datasets from the Web and builds on Linked Data query approaches
Business Intelligence on Non-Conventional Data
The revolution in digital communications witnessed over the last decade had a significant impact on the world of Business Intelligence (BI). In the big data era, the amount and diversity of data that can be collected and analyzed for the decision-making process transcends the restricted and structured set of internal data that BI systems are conventionally limited to. This thesis investigates the unique challenges imposed by three specific categories of non-conventional data: social data, linked data and schemaless data. Social data comprises the user-generated contents published through websites and social media, which can provide a fresh and timely perception about people’s tastes and opinions. In Social BI (SBI), the analysis focuses on topics, meant as specific concepts of interest within the subject area. In this context, this thesis proposes meta-star, an alternative strategy to the traditional star-schema for modeling hierarchies of topics to enable OLAP analyses. The thesis also presents an architectural framework of a real SBI project and a cross-disciplinary benchmark for SBI. Linked data employ the Resource Description Framework (RDF) to provide a public network of interlinked, structured, cross-domain knowledge. In this context, this thesis proposes an interactive and collaborative approach to build aggregation hierarchies from linked data. Schemaless data refers to the storage of data in NoSQL databases that do not force a predefined schema, but let database instances embed their own local schemata. In this context, this thesis proposes an approach to determine the schema profile of a document-based database; the goal is to facilitate users in a schema-on-read analysis process by understanding the rules that drove the usage of the different schemata. A final and complementary contribution of this thesis is an innovative technique in the field of recommendation systems to overcome user disorientation in the analysis of a large and heterogeneous wealth of data