48 research outputs found
Internship Report - Roaming Data Science (RoaDS) BI Solution in a Vodafone Environment
A telecom company (Vodafone) needed to implement a Business Intelligence solution for
Roaming data drawn from a wide set of different data sources. Based on the data visualizations this
solution provides, its key users with decision-making power can analyse the business and assess
infrastructure and software expansion needs. This document presents the scientific papers produced
at the various stages of the solution's development (state of the art, architecture design, and
implementation results). The Business Intelligence solution was designed and implemented with
OLAP methodologies and technologies on a Data Warehouse composed of Data Marts arranged in a
constellation schema; the visualization layer was custom-built in JavaScript (VueJS). To assess the
results, a questionnaire was created and filled in by the solution's key users; it showed that user
acceptance was satisfactory. The proposed objectives for the implementation of the BI solution
were achieved with all requirements met, with the infrastructure itself created from scratch on
Kubernetes. This BI platform can be expanded using column-oriented databases designed specifically
with OLAP workloads in mind, removing the need for an OLAP cube layer. Based on Machine
Learning algorithms, the platform will be able to produce the predictions needed to support decisions
about Vodafone's Roaming infrastructure
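The constellation arrangement mentioned above means several fact tables share the same dimension tables. A minimal sketch of that idea, with invented roaming dimensions and measures (none of these tables come from the Vodafone solution itself):

```python
from collections import defaultdict

# Hypothetical miniature of a constellation schema: two fact tables
# (traffic and revenue) sharing the same dimension tables.
dim_country = {1: "PT", 2: "DE"}        # country dimension
dim_operator = {10: "OpA", 20: "OpB"}   # partner-operator dimension

# Fact rows: (country_key, operator_key, measure)
fact_traffic = [(1, 10, 120), (1, 20, 80), (2, 10, 200)]   # e.g. MB used
fact_revenue = [(1, 10, 30.0), (2, 10, 55.5)]              # e.g. EUR

def rollup(fact, dim, key_index):
    """OLAP-style roll-up: aggregate a fact table along one shared dimension."""
    totals = defaultdict(float)
    for row in fact:
        totals[dim[row[key_index]]] += row[2]
    return dict(totals)

# Both facts can be rolled up against the same shared dimension,
# which is what makes the constellation useful for cross-analysis.
traffic_by_country = rollup(fact_traffic, dim_country, 0)
revenue_by_country = rollup(fact_revenue, dim_country, 0)
```

Because the dimensions are shared, results from different fact tables are directly comparable along the same axes.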
New Fundamental Technologies in Data Mining
The progress of data mining technology and its broad public popularity establish a need for a comprehensive text on the subject. The book series entitled "Data Mining" addresses that need by presenting in-depth descriptions of novel mining algorithms and many useful applications. Beyond helping readers understand each section in depth, the two books present useful hints and strategies for solving the problems discussed in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant developments in the field of data mining
Decision Support Systems
Decision support systems (DSS) have evolved over the past four decades from theoretical concepts into real-world computerized applications. DSS architecture contains three key components: a knowledge base, a computerized model, and a user interface. DSS simulate the cognitive decision-making functions of humans, based on artificial intelligence methodologies (including expert systems, data mining, machine learning, connectionism, logical reasoning, etc.), in order to perform decision support functions. The applications of DSS cover many domains, ranging from aviation monitoring, transportation safety, clinical diagnosis, weather forecasting, and business management to internet search strategy. By combining knowledge bases with inference rules, DSS are able to provide suggestions to end users that improve decisions and outcomes. This book is written as a textbook so that it can be used in formal courses examining decision support systems. It may be used by both undergraduate and graduate students from diverse computer-related fields. It will also be of value to established professionals as a text for self-study or for reference
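The knowledge-base-plus-inference-rules pattern the abstract describes can be sketched very simply. The facts, rules, and thresholds below are illustrative (loosely inspired by the aviation-monitoring domain mentioned above), not taken from any particular DSS:

```python
# A tiny knowledge base of current facts about the environment.
knowledge_base = {"visibility_km": 0.8, "wind_kts": 35, "runway_wet": True}

# Each rule pairs a condition over the knowledge base with a suggestion
# for the end user.  All names and thresholds are invented for the example.
rules = [
    (lambda kb: kb["visibility_km"] < 1.0, "Advise instrument approach"),
    (lambda kb: kb["wind_kts"] > 30,       "Flag crosswind limit check"),
    (lambda kb: kb["runway_wet"],          "Increase landing distance margin"),
]

def suggest(kb, rules):
    """Evaluate every rule against the knowledge base and collect the
    suggestions whose conditions hold."""
    return [advice for cond, advice in rules if cond(kb)]

suggestions = suggest(knowledge_base, rules)
```

Real DSS replace the flat rule list with expert systems, learned models, or forward-chaining engines, but the suggestion-from-facts flow is the same.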
IDEAS-1997-2021-Final-Programs
This document records the final program for each of the 26 meetings of the International Database and Engineering Application Symposium from 1997 through 2021. These meetings were organized in various locations on three continents. Most of the papers published during these years are in the digital libraries of IEEE (1997-2007) or ACM (2008-2021)
Cryptographic solutions of organization’s memory protection from the point of management’s knowledge
Modern companies face, every day, the problem of being overloaded with large
quantities of information and data, which hampers their operations and the making of
efficient business decisions. Finding new ways to apply knowledge management, in the
sense of efficiently using the organization's memory (its knowledge), is a growing
need for companies seeking to improve their business. Likewise, companies pay ever more
attention to, and place ever more emphasis on, protecting how the organization's memory
(the company's knowledge) is accessed, exchanged, and managed. Applying the concept of
business intelligence to the management of the organization's memory is becoming an
indispensable element of successful companies' strategies. Integrated automated
management of the organization's memory (a company's knowledge), although very complex,
is a solution for the interaction of knowledge management and information technology.
It makes it possible to fully explain a company's decision-making process, as well as
its flows of documents, information, and data. Through integrated automated management
of the organization's memory, a company gains access to detailed data that facilitate
concrete business decision-making. This also raises the requirement to protect such an
integrated automated process. In line with defined and adopted international standards
(ISO 27001), management in a system such as the organization's memory should ensure the
efficient implementation, monitoring, and improvement of a system for handling the
security of the organization's memory.
The protection and security of the organization's memory through cryptographic solutions
must balance user requirements, functionality within the organization's memory, and the
need to protect sensitive data and preserve its integrity. Such an integrated automated
process for managing the organization's memory is a solution that could find use both in
the field of intelligent systems learning and in existing systems for modern business
decision-making. One of the proposed ways of protecting the integrated system for
managing the organization's memory in this work is the ability to store data in
encrypted form, so that the data become accessible only through the company's
information system. This work presents an original cryptographic solution for protecting
the organization's memory from the standpoint of knowledge management. Only authorized
users of the system will have access to documents and data, based on defined access
permissions. The authenticity and immutability of documents would be ensured by means of
digital signatures, which the proposed cryptographic solutions provide in accordance
with current legal regulations for electronic documents. The work also examines
principles and models that ensure both data protection and privileged access to data,
all with the aim of making decisions based on the organization's memory. The best
example for such an analysis is security and intelligence agencies, of which there are
numerous examples, both of well-run organizations and of failures in their work
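The integrity-and-authenticity idea in the abstract can be sketched with a message authentication code: a stored document carries a tag that only holders of the organization's key can produce or verify. This is a minimal stand-in, not the thesis's actual cryptographic solution; a real deployment would use asymmetric digital signatures and proper key management, as the work proposes. The key and document below are invented for the example:

```python
import hashlib
import hmac

SECRET_KEY = b"org-memory-demo-key"   # illustrative key, not a real secret

def sign_document(document: bytes, key: bytes = SECRET_KEY) -> str:
    """Return a hex MAC binding the document's content to the key."""
    return hmac.new(key, document, hashlib.sha256).hexdigest()

def verify_document(document: bytes, tag: str, key: bytes = SECRET_KEY) -> bool:
    """Constant-time check that the document was not altered since signing."""
    return hmac.compare_digest(sign_document(document, key), tag)

# A document in the organization's memory is stored together with its tag;
# any modification of the content invalidates the tag.
doc = b"board decision: approve budget"
tag = sign_document(doc)
```

Access control and encrypted storage would sit on top of this: the information system checks the user's permissions before decrypting, then verifies the tag before trusting the content.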
Pervasive data science applied to the society of services
Master's dissertation in Information Systems Engineering and Management. With the technological progress of the last few years, and now with the actual implementation of the Internet of Things concept, an enormous amount of data can be observed being collected each minute. This brings along a problem: "How can we process such an amount of data in order to extract relevant knowledge in useful time?". That is not an easy issue to solve, because most of the time one needs to deal not just with huge volumes but also with different kinds of data, which makes the problem even more complex.
Today, and increasingly, huge quantities of the most varied types of data are produced. These data alone do not add value to the organizations that collect them, but when subjected to data analytics processes, they can be converted into crucial information sources for the core business. Therefore, the focus of this project is to explore this problem and give it a modular solution, adaptable to different realities, built on recent technologies, that allows users to access information wherever and whenever they wish.
In the first phase of this dissertation, bibliographic research, along with a review of the collected sources, was carried out in order to identify which kinds of solutions already exist and which questions remain open.
After this first stage, a solution composed of four layers was developed: the data are submitted to a treatment process (which includes eleven treatment functions that populate the multidimensional data model previously designed), and an OLAP layer was then constructed that handles not just structured but also unstructured data. In the end, it is possible to consult a set of four dashboards (available on a web application), based on more than twenty basic queries, that allow data to be filtered with a dynamic query.
For this case study, and as a proof of concept, the company IOTech was used; IOTech provided the data needed to support this dissertation, based on which five Key Performance Indicators were defined.
During this project two different methodologies were applied: Design Science Research, in the research component, and SCRUM, in the practical component
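The treatment layer described above (a chain of functions that clean raw records before they populate the multidimensional model) can be sketched as a simple pipeline. The field names and the two cleaning steps below are invented for illustration; the dissertation's eleven treatment functions are not reproduced here:

```python
# Raw IoT-style records as they might arrive from a source: stringly typed,
# with stray whitespace and occasional unparseable measurements.
raw = [
    {"sensor": " S1 ", "value": "21.5", "unit": "C"},
    {"sensor": "S2",   "value": "bad",  "unit": "C"},
    {"sensor": "S1",   "value": "22.0", "unit": "C"},
]

def trim_fields(rec):
    """Treatment 1: strip whitespace from every string field."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}

def parse_value(rec):
    """Treatment 2: coerce the measurement to float; reject bad records."""
    try:
        return dict(rec, value=float(rec["value"]))
    except ValueError:
        return None

TREATMENTS = [trim_fields, parse_value]

def treat(records, treatments=TREATMENTS):
    """Run each record through the treatment chain, dropping rejects."""
    out = []
    for rec in records:
        for fn in treatments:
            rec = fn(rec)
            if rec is None:
                break
        else:
            out.append(rec)
    return out

clean = treat(raw)
```

Only records that survive every treatment reach the dimensional model, which is what keeps the OLAP layer and the dashboards consistent.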
Flexible Integration and Efficient Analysis of Multidimensional Datasets from the Web
If numeric data from the Web are brought together, natural scientists can compare climate measurements with estimations, financial analysts can evaluate companies based on balance sheets and daily stock market values, and citizens can explore the GDP per capita from several data sources. However, the heterogeneity and size of the data remain a problem. This work presents methods to query a uniform view - the Global Cube - of available datasets from the Web and builds on Linked Data query approaches
Constructing data marts from web sources using a graph common model
At a time when humans and devices are generating more information than ever, activities such as data mining and machine learning become crucial. These activities enable us to understand and interpret the information we have and to predict, or better prepare ourselves for, future events. However, activities such as data mining cannot be performed without a data management layer to clean, integrate, process and make available the necessary datasets. To that extent, large and costly data flow processes such as Extract-Transform-Load are needed to extract from disparate information sources and generate ready-for-analysis datasets. These datasets generally take the form of multi-dimensional cubes from which different data views can be extracted for the purpose of different analyses. The effort involved in creating a multi-dimensional cube from integrated data sources is significant. In this research, we present a methodology to generate these cubes automatically or, in some cases, close to automatically, requiring very little user interaction. A construct called a StarGraph acts as a canonical model for our system, into which imported data sources are transformed. An ontology-driven process controls the integration of StarGraph schemas, and simple OLAP-style functions generate the cubes or datasets. An extensive evaluation is carried out using a large number of agri data sources, with user-defined case studies to identify sources for integration and the types of analyses required for the final data cubes
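The canonical-model idea above (map each heterogeneous source into one shared star structure, then generate cubes with simple OLAP-style functions) can be sketched as follows. The class and field names are invented for illustration and are not the actual StarGraph API from this research:

```python
from collections import defaultdict

class StarGraph:
    """Toy canonical model: one numeric measure plus named dimensions."""
    def __init__(self, measure, dimensions):
        self.measure = measure          # name of the numeric fact
        self.dimensions = dimensions    # names of the dimensions
        self.rows = []                  # (dim_values_tuple, measure_value)

    def add(self, dim_values, value):
        self.rows.append((tuple(dim_values), value))

def cube(star, dims):
    """OLAP-style function: aggregate the measure over a subset of dimensions."""
    idx = [star.dimensions.index(d) for d in dims]
    agg = defaultdict(float)
    for dim_values, value in star.rows:
        agg[tuple(dim_values[i] for i in idx)] += value
    return dict(agg)

# An agri-style source mapped onto the canonical star (values invented).
yields = StarGraph("tonnes", ["crop", "year"])
yields.add(["wheat", 2020], 3.0)
yields.add(["wheat", 2021], 3.5)
yields.add(["barley", 2020], 2.0)

# A cube over the "crop" dimension alone rolls the years together.
by_crop = cube(yields, ["crop"])
```

In the actual system, the transformation into the canonical model and the integration of schemas are ontology-driven rather than hand-written as here; the sketch shows only why a single canonical form makes cube generation a simple, uniform operation.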