1,025 research outputs found

    Approaches to Avoid Traditional Multidimensional Data Cube: A Survey

    Data analysis is a growing need of the current era, and it is no longer restricted to the business domain. Advances in technology have opened its doors to the common person, so data generation is increasing exponentially day by day. Incorporating such huge amounts of data into a data analysis system is a big challenge, and handling the variety of that data is also a difficult issue. Numerous options are emerging to solve these problems. To support decision-making systems, Online Analytical Processing (OLAP) is the most suitable option. OLAP takes a multidimensional approach to data analysis, covering current data as well as historical data, along with aggregated or summary data. Traditionally, a data cube is used to handle the aggregated data. This paper surveys the various research techniques proposed to enhance the performance of the data cube.
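The traditional data cube the survey refers to precomputes aggregates over every combination of dimensions. A minimal sketch of that idea in plain Python, with invented sales records and dimension names purely for illustration:

```python
from itertools import product
from collections import defaultdict

# Fact table rows: (region, product, year, sales amount) -- illustrative data
facts = [
    ("EU", "laptop", 2020, 120),
    ("EU", "phone",  2020,  80),
    ("US", "laptop", 2020, 200),
    ("US", "laptop", 2021, 150),
]

def cube(rows, n_dims):
    """Aggregate the measure over every combination of dimensions,
    using None as the 'ALL' wildcard (one cell per cuboid)."""
    out = defaultdict(int)
    for *keys, measure in rows:
        # Each fact contributes to 2^n_dims cells: keep or roll up each dim.
        for mask in product([True, False], repeat=n_dims):
            cell = tuple(k if keep else None for k, keep in zip(keys, mask))
            out[cell] += measure
    return dict(out)

c = cube(facts, 3)
print(c[("EU", None, None)])   # total EU sales -> 200
print(c[(None, None, None)])   # grand total -> 550
```

The exponential number of cells per fact (2^n for n dimensions) is exactly the materialization cost that the techniques surveyed in the paper try to avoid or reduce.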

    Open City Data Pipeline

    Statistical data about cities, regions and countries is collected for various purposes and by various institutions. Yet, while access to high-quality and recent data of this kind is crucial both for decision makers and for the public, all too often such collections of data remain isolated and not re-usable, let alone properly integrated. In this paper we present the Open City Data Pipeline, a focused attempt to collect, integrate, and enrich statistical data collected at city level worldwide, and republish this data in a reusable manner as Linked Data. The main features of the Open City Data Pipeline are: (i) we integrate and cleanse data from several sources in a modular, extensible, always up-to-date fashion; (ii) we use both Machine Learning techniques and ontological reasoning over equational background knowledge to enrich the data by imputing missing values; (iii) we assess the estimated accuracy of such imputations per indicator. Additionally, (iv) we make the integrated and enriched data available both in a web browser interface and as machine-readable Linked Data, using standard vocabularies such as QB and PROV, and linking to e.g. DBpedia. Lastly, in an exhaustive evaluation of our approach, we compare our enrichment and cleansing techniques to a preliminary version of the Open City Data Pipeline presented at ISWC2015: firstly, we demonstrate that the combination of equational knowledge and standard machine learning techniques significantly improves the quality of our missing-value imputations; secondly, we arguably show that the more data we integrate, the more reliable our predictions become. Hence, over time, the Open City Data Pipeline shall provide a sustainable effort to serve Linked Data about cities in increasing quality.
    Series: Working Papers on Information Systems, Information Business and Operation
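Points (ii) and (iii) above pair missing-value imputation with a per-indicator accuracy estimate. A minimal sketch of that pattern, where simple mean imputation with a leave-one-out error estimate stands in for the paper's ML/reasoning pipeline; city labels and numbers are invented for illustration:

```python
import statistics

# One indicator: city -> value, with None marking a missing observation
population_density = {"A": 4500.0, "B": 3100.0, "C": None, "D": 3800.0}

def impute_mean(values):
    """Fill missing entries with the mean of the observed ones."""
    mean = statistics.mean(v for v in values.values() if v is not None)
    return {k: (v if v is not None else mean) for k, v in values.items()}

def holdout_error(values):
    """Leave-one-out estimate of the imputation error for one indicator:
    hide each known value in turn and measure how far the imputation lands."""
    observed = {k: v for k, v in values.items() if v is not None}
    errors = []
    for k, true_v in observed.items():
        rest = [v for j, v in observed.items() if j != k]
        errors.append(abs(statistics.mean(rest) - true_v))
    return statistics.mean(errors)

filled = impute_mean(population_density)
err = holdout_error(population_density)   # published alongside the imputation
```

Reporting `err` next to each imputed indicator is the kind of per-indicator accuracy assessment the pipeline exposes, letting consumers of the data weigh imputed values appropriately.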

    Flexible Integration and Efficient Analysis of Multidimensional Datasets from the Web

    If numeric data from the Web are brought together, natural scientists can compare climate measurements with estimations, financial analysts can evaluate companies based on balance sheets and daily stock market values, and citizens can explore the GDP per capita from several data sources. However, heterogeneities and size of data remain a problem. This work presents methods to query a uniform view - the Global Cube - of available datasets from the Web and builds on Linked Data query approaches
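The "Global Cube" the thesis queries is a uniform view over heterogeneous Web datasets. A toy sketch of the integration step, where two sources publishing the same indicator under different entity labels are reconciled into shared dimensions; the datasets and mapping are invented for illustration:

```python
# Source A: GDP per capita keyed by country code and year (illustrative)
source_a = {("DE", 2020): 46200, ("FR", 2020): 39000}
# Source B: same indicator, but keyed by country name (illustrative)
source_b = {("Germany", 2021): 50800, ("France", 2021): 43700}

# Entity reconciliation: map source B's labels onto source A's codes
name_to_code = {"Germany": "DE", "France": "FR"}

def global_view(*mapped_sources):
    """Union of sources already mapped to shared dimension values."""
    view = {}
    for src in mapped_sources:
        view.update(src)
    return view

gdp_cube = global_view(
    source_a,
    {(name_to_code[name], year): v for (name, year), v in source_b.items()},
)
```

Once both sources share dimension values, a single query interface (in the thesis, a Linked Data query approach) can answer questions spanning all of them, e.g. GDP per capita for "DE" across both years.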

    Modeling, Annotating, and Querying Geo-Semantic Data Warehouses


    CalcHEP 3.4 for collider physics within and beyond the Standard Model

    We present version 3.4 of the CalcHEP software package, which is designed for the effective evaluation and simulation of high-energy-physics collider processes at parton level. The main features of CalcHEP are the computation of Feynman diagrams, integration over multi-particle phase space, and event simulation at parton level. The principal attractive key points along these lines are that it has: a) an easy startup even for those who are not familiar with CalcHEP; b) a friendly and convenient graphical user interface; c) the option for a user to easily modify a model or introduce a new model by either using the graphical interface or an external package, with the possibility of cross-checking the results in different gauges; d) a batch interface which allows one to perform very complicated and tedious calculations connecting production and decay modes for processes with many particles in the final state. With this feature set, CalcHEP can efficiently perform calculations with a high level of automation, from a theory in the form of a Lagrangian down to phenomenology in the form of cross sections, parton-level event simulation and various kinematical distributions. In this paper we report on the new features of CalcHEP 3.4, which improve the power of our package as an effective tool for the study of modern collider phenomenology.
    Comment: 82 pages, elsarticle LaTeX, 7 Figures. Changes from v1: 1) updated reference list and Acknowledgments; 2) 2->1 processes added to CalcHEP; 3) particle decays (e.g. Higgs boson) into virtual W/Z added, together with a comparison to results from the Hdecay package; 4) added interface with the Root package

    ICT tools for data management and analysis to support decisional process oriented to sustainable agri-food chains

    The agri-food sector is facing global challenges. The first concerns feeding a world population that, according to United Nations projections, will reach 9.3 billion people in 2050. The second is the demand by consumers for high-quality products obtained from ever more sustainable, safe and transparent agri-food chains. In particular, sustainable agriculture is a management strategy able to preserve the biological diversity, productivity, regeneration capacity, vitality and functional ability of an agricultural ecosystem, ensuring, today and in the future, its ecological, economic and social functions at the local, national and global scales, without harming other ecosystems. Therefore, to face the challenge of sustainable agriculture, farmers need to increase the quality and quantity of production while reducing the environmental impact through new management strategies and tools. This work explores the integration of several ICT technologies and methodologies in the agri-food sector for data acquisition, management and analysis, such as RFID (Radio Frequency IDentification) technology, Farm Management Information Systems (FMIS), Data Warehouses (DW) and On-Line Analytical Processing (OLAP). Finally, the adoption of ICT technologies by real farms is evaluated through a survey. Regarding the adoption of RFID technology, this work explores an opportunity for technology transfer related to the monitoring and control of agri-food products, based on the use of miniaturized, smart and innovative sensors. The information concerning the state of the product is transferred wirelessly in real time, as provided by RFID technology.
    In particular, two technical solutions involving RFID are analysed, highlighting their advantages and critical points compared with the classical systems used to ensure the traceability and quality of agri-food products. This work then explores the possibility of developing a framework that combines Business Intelligence (BI) technologies with Integrated Pest Management (IPM) principles to support farmers in the decisional process, thereby decreasing the environmental impact and improving production performance. IPM requires the simultaneous use of different crop protection techniques to control pests and pathogens through an ecological and economic approach. The proposed BI system is called BI4IPM; it combines On-Line Transaction Processing (OLTP) with OLAP to verify adherence to IPM technical specifications. BI4IPM is tested with data from real Apulian olive farms. The olive tree is one of the most important crops at a global scale, and Apulia is the first olive-producing region in Italy, with a large number of farms generating IPM data. Crop protection strategies are correlated with climate conditions, given the strong relation among climate, crops and pests. Therefore, this work presents a new, advanced OLAP model integrating the Growing Season Index (GSI), a phenology model, to indirectly compare farms from a climatic point of view. The proposed system allows analysing IPM data of different farms having the same phenological conditions over a year, in order to identify best practices and to highlight and explain the different practices adopted by farms working in different climatic conditions.
    Finally, a survey was performed to investigate how farms in Basilicata cluster according to the level of innovation adopted. A questionnaire asked farms whether they adopt ICT tools and, if so, in which management or production processes these tools are involved. A cluster analysis was then performed on the collected data. The results show that, using the k-means clustering method, two clusters appear: innovators and the others; while, using a boxplot representation, three groups emerge: innovators, early adopters and laggards.
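The Growing Season Index mentioned above is, in Jolly et al.'s published formulation, a product of three 0-to-1 ramp functions of minimum temperature, vapour pressure deficit (VPD) and photoperiod. A simplified sketch of a daily GSI computation; the threshold values below are illustrative assumptions, not taken from the thesis:

```python
def ramp(x, lo, hi, decreasing=False):
    """Linear 0..1 ramp between lo and hi (reversed when decreasing)."""
    f = min(max((x - lo) / (hi - lo), 0.0), 1.0)
    return 1.0 - f if decreasing else f

def daily_gsi(tmin_c, vpd_pa, photoperiod_h):
    """Daily GSI as the product of three limitation indices.
    Thresholds are assumed values for illustration only."""
    i_tmin = ramp(tmin_c, -2.0, 5.0)              # cold limitation
    i_vpd = ramp(vpd_pa, 900.0, 4100.0, True)     # drought limitation
    i_photo = ramp(photoperiod_h, 10.0, 11.0)     # daylength limitation
    return i_tmin * i_vpd * i_photo

print(daily_gsi(12.0, 800.0, 12.5))   # favourable day -> 1.0
print(daily_gsi(-5.0, 800.0, 12.5))   # too cold -> 0.0
```

Comparing farms whose daily GSI profiles match over a year is how the proposed OLAP model groups farms that are phenologically, and hence climatically, comparable.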

    ImageJ2: ImageJ for the next generation of scientific image data

    ImageJ is an image analysis program extensively used in the biological sciences and beyond. Due to its ease of use, recordable macro language, and extensible plug-in architecture, ImageJ enjoys contributions from non-programmers, amateur programmers, and professional developers alike. Enabling such a diversity of contributors has resulted in a large community that spans the biological and physical sciences. However, a rapidly growing user base, diverging plugin suites, and technical limitations have revealed a clear need for a concerted software engineering effort to support emerging imaging paradigms, to ensure the software's ability to handle the requirements of modern science. Due to these new and emerging challenges in scientific imaging, ImageJ is at a critical development crossroads. We present ImageJ2, a total redesign of ImageJ offering a host of new functionality. It separates concerns, fully decoupling the data model from the user interface. It emphasizes integration with external applications to maximize interoperability. Its robust new plugin framework allows everything from image formats, to scripting languages, to visualization to be extended by the community. The redesigned data model supports arbitrarily large, N-dimensional datasets, which are increasingly common in modern image acquisition. Despite the scope of these changes, backwards compatibility is maintained such that this new functionality can be seamlessly integrated with the classic ImageJ interface, allowing users and developers to migrate to these new methods at their own pace. ImageJ2 provides a framework engineered for flexibility, intended to support these requirements as well as accommodate future needs

    Enabling Ubiquitous OLAP Analyses

    An OLAP analysis session is carried out as a sequence of OLAP operations applied to multidimensional cubes. At each step of a session, an operation is applied to the result of the previous step in an incremental fashion. Due to its simplicity and flexibility, OLAP is the most widely adopted paradigm for exploring the data stored in data warehouses. With the goal of broadening access to OLAP analyses, in this thesis we touch on several critical topics. We first present our contributions to data extraction from service-oriented sources, which are nowadays used to provide access to many databases and analytic platforms. By addressing data extraction from these sources we take a step towards the integration of external databases into the data warehouse, thus providing richer data that can be analyzed through OLAP sessions. The second topic that we study is the visualization of multidimensional data, which we exploit to enable OLAP on devices with limited screen and bandwidth capabilities (i.e., mobile devices). Finally, we propose solutions to obtain multidimensional schemata from unconventional sources (e.g., sensor networks), which are crucial to performing multidimensional analyses.
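The incremental session model described above, where each operation consumes the previous step's result, can be sketched in a few lines. Cells map coordinate tuples to a measure; the operations, dimension names and data are simplified illustrations, not the thesis's actual implementation:

```python
from collections import defaultdict

# A tiny cube: each cell maps (dimension, value) coordinates to a measure
cells = {
    (("city", "Rome"),  ("year", 2020)): 10,
    (("city", "Rome"),  ("year", 2021)): 15,
    (("city", "Milan"), ("year", 2020)):  7,
}

def slice_(cube, dim, value):
    """Keep only the cells matching one dimension value."""
    return {k: v for k, v in cube.items() if (dim, value) in k}

def rollup(cube, dim):
    """Aggregate the measure, removing one dimension."""
    out = defaultdict(int)
    for coords, v in cube.items():
        out[tuple(c for c in coords if c[0] != dim)] += v
    return dict(out)

# A two-step session: each operation is applied to the previous result.
step1 = slice_(cells, "city", "Rome")
step2 = rollup(step1, "year")    # {(("city", "Rome"),): 25}
```

Because each step is a pure function of the previous result, a session is just a pipeline of such operations, which is also what makes rendering intermediate results on bandwidth-limited mobile devices tractable.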

    A framework for enriching Data Warehouse analysis with Question Answering systems

    Business Intelligence (BI) applications allow their users to query, understand, and analyze existing data within their organizations in order to acquire useful knowledge and thus make better strategic decisions. The core of a BI application is a Data Warehouse (DW), which integrates several heterogeneous structured data sources in a common repository. However, there is common agreement that the next generation of BI applications should consider data not only from internal data sources, but also from different external sources (e.g. Big Data, blogs, social networks, etc.), where relevant, up-to-date information about competitors may prove crucial for taking the right decisions. This external data is usually obtained through traditional Web search engines, with significant effort from users in analyzing the returned information and incorporating it into the BI application. In this paper, we propose to integrate the DW's internal structured data with external unstructured data obtained through Question Answering (QA) techniques. The integration is achieved seamlessly by presenting the data returned by the DW and the QA systems in dashboards that allow the user to handle both types of data. Moreover, the QA results are stored persistently in a new DW repository, in order to facilitate comparison of the results obtained with different questions, or even the same question on different dates.
    This paper has been partially supported by the MESOLAP (TIN2010-14860), GEODASBI (TIN2012-37493-C03-03), LEGOLANGUAGE (TIN2012-31224) and DIIM2.0 (PROMETEOII/2014/001) projects from the Spanish Ministry of Education and Competitiveness. Alejandro Maté is funded by the Generalitat Valenciana under an ACIF grant (ACIF/2010/298).
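The persistent QA repository described above exists so that answers to the same question asked on different dates can be compared. A minimal sketch of that storage pattern, with the QA engine stubbed out and all names and values invented for illustration:

```python
import datetime

qa_repository = []  # persistent store of QA results (a table in the new DW repo)

def ask_and_store(question, answer, date):
    """Record one QA result; a real system would invoke a QA engine here."""
    qa_repository.append({"question": question, "date": date, "answer": answer})

def history(question):
    """All stored answers to a question, oldest first, ready for a dashboard."""
    return sorted((r for r in qa_repository if r["question"] == question),
                  key=lambda r: r["date"])

q = "What is competitor X's market share?"
ask_and_store(q, "12%", datetime.date(2014, 1, 10))
ask_and_store(q, "14%", datetime.date(2014, 6, 10))
```

Keeping the date as part of each stored result is what turns one-off QA answers into a time series that can sit next to the DW's structured measures on the same dashboard.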
