163 research outputs found
Dynamic and secure remote access to databases with integration of soft access policies
The amount of data being created and shared has grown greatly in recent
years, thanks in part to social media and the growth of smart devices.
Managing the storage and processing of this data can give a competitive edge
when used to create new services, to enhance targeted advertising, etc. To
achieve this, the data must be accessed and processed. When applications
that access this data are developed, tools such as Java Database Connectivity,
ADO.NET and Hibernate are typically used. However, while these tools aim to
bridge the gap between databases and the object-oriented programming
paradigm, they focus only on the connectivity issue. This leads to increased
development time as developers need to master the access policies to write
correct queries. Moreover, when these tools are used in database applications within non-controlled
environments, other issues emerge, such as database credential
theft; application authentication; authorization and auditing of large groups of
new users seeking access to data, potentially with vague requirements;
network eavesdropping for data and credential disclosure; impersonating
database servers for data modification; application tampering for unrestricted
database access and data disclosure; etc.
Therefore, an architecture capable of addressing these issues is necessary to
build a reliable set of access control solutions to expand and simplify the
application scenarios of access control systems. The objective, then, is to
secure remote access to databases, since database applications may be
used in hard-to-control environments where physical access to the host
machines and network may not always be protected. Furthermore, the authorization
process should dynamically grant the appropriate permissions to users who
have not been explicitly authorized, in order to handle large groups seeking access to
data. This includes scenarios where the definition of the access requirements is
difficult due to their vagueness, usually requiring a security expert to authorize
each user individually. This is achieved by integrating and auditing soft access
policies based on fuzzy set theory in the access control decision-making
process. A proof-of-concept of this architecture is provided alongside a
functional and performance assessment.
Programa Doutoral em Informática
Disaster Response System
By integrating a geospatial information system into a conventional information system, a basic disaster response information system is implemented. The result is a report on various useful technologies and software engineering methodologies that could be used to implement a preliminary system, which in turn clarifies many of the uncertainties and surprises that are typical of such systems. The foundations of my project include the Unified Process of software development, relational data models, the decision tree technique, and class design principles such as the MVC pattern.
Extracting a Relational Database Schema from a Document Database
As NoSQL databases become increasingly used, more methodologies emerge for migrating from relational databases to NoSQL databases. Meanwhile, there is a lack of methodologies that assist with migration in the opposite direction, from NoSQL to relational. As software is iterated upon, use cases may change. A system originally developed with a NoSQL database may accrue needs that require the Atomicity, Consistency, Isolation, and Durability (ACID) features that NoSQL systems lack, such as consistency across nodes or consistency across re-used domain objects. Shifting requirements could result in the system being changed to use a relational database. While there are some tools available to transfer data between an existing document database and an existing relational database, there has been no work on automatically generating the relational database based on the data already in the NoSQL system. Not taking the existing data into account can lead to inconsistencies during data migration. This thesis describes a methodology to automatically generate a relational database schema from the implicit schema of a document database. It also includes details of how the methodology is implemented and what could be enhanced in future work.
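The core of such a methodology is recovering an implicit schema from the documents themselves. The following is a simplified sketch under stated assumptions: documents are plain Python dicts standing in for a document collection, the type-mapping rules are invented for illustration, and real methodologies must also handle nested documents and arrays, which this sketch ignores.

```python
# Sketch: derive a relational schema from the implicit schema of a
# document collection. Type-mapping rules are illustrative assumptions.

def infer_columns(documents):
    """Union the fields of all documents; a field absent from some documents becomes nullable."""
    py_to_sql = {int: "INTEGER", float: "REAL", str: "TEXT", bool: "BOOLEAN"}
    columns = {}
    for doc in documents:
        for field, value in doc.items():
            sql_type = py_to_sql.get(type(value), "TEXT")
            columns.setdefault(field, {"type": sql_type, "count": 0})
            columns[field]["count"] += 1
    total = len(documents)
    # A field present in every document is treated as NOT NULL.
    return {name: (info["type"], info["count"] == total)
            for name, info in columns.items()}

def to_create_table(table, columns):
    """Emit a CREATE TABLE statement from the inferred columns."""
    defs = [f"{name} {sql_type}" + (" NOT NULL" if required else "")
            for name, (sql_type, required) in columns.items()]
    return f"CREATE TABLE {table} ({', '.join(defs)});"
```

Deriving nullability from field frequency is one example of why generating the schema from the existing data, rather than from the application code alone, avoids inconsistencies during migration.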
THE USE OF RECOMMENDER SYSTEMS IN WEB APPLICATIONS – THE TROI CASE
In the digital age, ignoring digital marketing, surveys, reviews and online user behaviour is a recipe for even powerful businesses to fail, so such systems should be supported by artificial intelligence techniques. In this direction, the use of data mining to recommend relevant items, a state-of-the-art technique, increases user satisfaction as well as business revenue, alongside other information-gathering approaches that help our systems think and act like humans. To that end, this thesis elaborates on Recommender Systems: how people interact, and how to accurately calculate and identify what people like or dislike based on their previous online behaviour. The thesis also covers the methodologies recommender systems use and how mathematical equations help Recommender Systems calculate users' behaviour and similarities. Filters are important in a Recommender System: if similar users like the same product or item, what is the probability that a neighbouring user will like it too? Using collaborative filters, neighbourhood filters and hybrid recommender systems with various algorithms, a Recommender System can predict whether a particular user would prefer an item or not, based on the user's profile and activities. The use of Recommender Systems is beneficial to both service providers and users. The thesis also covers the strengths and weaknesses of Recommender Systems and how involving an ontology can improve them; ontology-based methods can be used to reduce problems that content-based recommender systems are known to suffer from. Given Kosovo's GDP and young people's job prospects, improvements are desirable: the demand is greater than the offer. I therefore thought of building an intelligent system that makes it easier for Kosovars to find the job that suits their profile, skills, knowledge, character and location.
That system is called TROI, a search engine that indexes and merges all locally operating job-seeking websites into one platform with intelligent features. The thesis presents the design, implementation, testing and evaluation of the TROI search engine. Testing is done through user experiments in a running environment of the TROI search engine. Results show that the functionality of the recommender system is satisfactory and helpful.
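The neighbourhood-based prediction described above can be shown in a few lines. This is a generic user-based collaborative-filtering sketch, not TROI's actual implementation; the user names and ratings are made up for illustration.

```python
# Minimal user-based collaborative filtering: cosine similarity between
# users, then a similarity-weighted average to predict an unseen rating.
from math import sqrt

def cosine(u, v):
    """Cosine similarity over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = sqrt(sum(u[i] ** 2 for i in common))
    nv = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv)

def predict(target, others, item):
    """Similarity-weighted average of the neighbours' ratings for an unseen item."""
    num = den = 0.0
    for other in others:
        if item in other:
            s = cosine(target, other)
            num += s * other[item]
            den += abs(s)
    return num / den if den else None
```

With one highly similar neighbour who rated a job posting, the prediction is dominated by that neighbour's rating, which is exactly the "if a similar user liked it, the neighbour probably will too" intuition.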
DACTyL:towards providing the missing link between clinical and telehealth data
This document conveys the findings of the Data Analytics, Clinical, Telehealth, Link (DACTyL) project. This nine-month project started in January 2013 and was conducted at Philips Research, in the Care Management Solution group, as part of the Data Analysis for Home Healthcare (DA4HH) project. The DA4HH charter is to perform and support retrospective analyses of data from Home Healthcare products, such as Motiva telehealth. These studies provide valid insights into actual clinical aspects, usage and behavior of installed products and services. The insights will help to improve service offerings, create clinical algorithms for better outcomes, and validate and substantiate claims of efficacy and cost-effectiveness. The DACTyL project aims at developing and implementing an architecture and infrastructure to meet the most pressing need of Motiva telehealth customers: insight into return on investment (ROI). These customers are hospitals that offer Motiva telehealth to their patients. To provide the Motiva service cost-effectively, they need insight into the actual cost, benefit and resource utilization of Motiva deployment compared to their usual routine care. Additional stakeholders for these ROI-related data are Motiva customer consultants and research scientists at Philips, who can use them to strengthen their messaging and service deliveries and arrive at better patient care.
Collaborative analysis of large time series data sets
The recent expansion of metrification on a daily basis has led to the production
of massive quantities of data, and in many cases, these collected metrics
are only useful for knowledge building when seen as a full sequence of
data ordered by time, which constitutes a time series. To find and interpret
meaningful behavioral patterns in time series, a multitude of analysis software
tools have been developed. Many of the existing solutions use annotations
to enable the curation of a knowledge base that is shared between a group
of researchers over a network. However, these tools also lack appropriate
mechanisms to handle a high number of concurrent requests and to properly
store massive data sets and ontologies, as well as suitable representations
for annotated data that are visually interpretable by humans and explorable by
automated systems. The goal of the work presented in this dissertation is to
iterate on existing time series analysis software and build a platform for the
collaborative analysis of massive time series data sets, leveraging state-of-the-art technologies for querying, storing and displaying time series and annotations.
A theoretical, domain-agnostic model was proposed to enable
the implementation of a distributed, extensible, secure and high-performance
architecture that handles multiple annotation proposals simultaneously and
avoids any data loss from overlapping contributions or unsanctioned changes.
Analysts can share annotation projects with peers, restricting a set of collaborators
to a smaller scope of analysis and to a limited catalog of annotation
semantics. Annotations can express meaning not only over a segment of time,
but also over a subset of the series that coexist in the same segment. A novel
visual encoding for annotations is proposed, where annotations are rendered
as arcs traced only over the affected series’ curves in order to reduce visual
clutter. Moreover, the implementation of a full-stack prototype with a reactive
web interface was described, directly following the proposed architecture and
visualization model, applied to the HVAC domain. The performance of
the prototype under different architectural approaches was benchmarked, and
the interface was tested for usability. Overall, the work described in this dissertation
contributes with a more versatile, intuitive and scalable time series
annotation platform that streamlines the knowledge-discovery workflow.
Mestrado em Engenharia Informática
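The conflict rule implied by the abstract (annotations can overlap in time as long as they target disjoint subsets of series) might be sketched as below. The data structures are assumptions for illustration, not the platform's actual model.

```python
# Sketch: two annotation proposals conflict only when they overlap both
# in time AND in the subset of series they target.
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    start: int          # segment start (e.g. epoch seconds)
    end: int            # segment end, exclusive
    series: frozenset   # ids of the series the annotation covers
    label: str = ""

def conflicts(a, b):
    """Time intervals intersect AND at least one series is shared."""
    overlap_in_time = a.start < b.end and b.start < a.end
    return overlap_in_time and bool(a.series & b.series)

def accept(existing, proposal):
    """Accept a proposal only if it conflicts with no committed annotation."""
    return all(not conflicts(proposal, committed) for committed in existing)
```

Rejecting, rather than silently overwriting, a conflicting proposal is one simple way to avoid the data loss from overlapping contributions that the dissertation's architecture is designed to prevent.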
Automatic information retrieval through text-mining
The dissertation is presented for obtaining the Master's Degree in Electrical Engineering and Computer Science at Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia.
Nowadays, a huge number of firms in the European Union catalogued as Small and Medium Enterprises (SMEs) employ a great portion of the active workforce in Europe. Nonetheless, SMEs cannot afford to implement methods or tools to systematically adopt innovation as part of their business process. Innovation is the engine of competitiveness in a globalized environment, especially in the current socio-economic situation. This thesis provides a platform that, when integrated with the ExtremeFactories (EF) project, helps SMEs become more competitive by means of a monitoring schedule functionality.
This thesis presents a text-mining platform with the ability to schedule the gathering of
information through keywords. In developing the platform, several
implementation choices were made; the one that deserves particular emphasis
is the framework, Apache Lucene Core 2, which supplies an efficient text-mining tool and is used extensively for the purposes of this thesis.
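The keyword-driven gathering step can be illustrated in isolation. In the thesis this is backed by Apache Lucene; the standalone sketch below only mimics the idea with a naive token match, and the scoring rule is an assumption, not Lucene's.

```python
# Toy keyword-matching pass: score documents by occurrences of watched
# keywords and return the matching ones, best first. Not Lucene; a
# self-contained imitation of the gathering idea.
import re

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def score(document, keywords):
    """Count how many times any monitored keyword occurs in the document."""
    wanted = {k.lower() for k in keywords}
    return sum(1 for token in tokenize(document) if token in wanted)

def gather(documents, keywords, min_hits=1):
    """Return the documents that match the keyword watch list, best first."""
    hits = [(score(d, keywords), d) for d in documents]
    return [d for s, d in sorted(hits, reverse=True) if s >= min_hits]
```

In the platform this kind of pass would run on a schedule, so that SMEs receive newly gathered documents for their watch lists without manual searching.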
Improving an Open Source Geocoding Server
A common problem in geocoding is that the postal addresses requested by the user differ from the addresses as described in the database. The online, open source geocoder called Nominatim is one of the most used geocoders nowadays. However, it lacks the interactivity that most online geocoders already offer: it provides no feedback to the user while typing addresses, and it cannot deal with misspellings introduced by the user in the requested address. This thesis is about extending the functionality of the Nominatim geocoder to provide fuzzy search and autocomplete features. In this work I propose a new index and search strategy for the OpenStreetMap reference dataset. I also extend the search algorithm to geocode new address types, such as street intersections. Both the original Nominatim geocoder and the proposed solution are compared using metrics such as the precision of the results, the match rate and the keystrokes saved by the autocomplete feature. The test addresses used in this work are a subset of the Swedish addresses available in the OpenStreetMap dataset. The results show that the proposed geocoder performs better than the original Nominatim geocoder: users get address suggestions as they type, adding interactivity to the original geocoder, and the proposed geocoder is able to find the right address in the presence of errors in the user query, with a match rate of 98%.
The demand for geospatial information has been increasing in recent years. More and more mobile applications and services require users to enter information about where they are, or the address of the place they want to find. The systems that convert postal addresses or place descriptions into coordinates are called geocoders. How good or bad a geocoder is depends not only on the information it contains, but also on how easy it is for users to find the desired addresses. There are many well-known websites that we use in everyday life to find the location of an address; sites like Google Maps, Bing Maps or Yahoo Maps are accessed by millions of users every day for such services. Among the main features of these geocoders are the ability to predict the address the user is writing in the search box and sometimes even to correct misspellings introduced by the user. To make it more complicated, these predictions and error corrections are performed in real time. The owners of these address search engines usually impose a limit on the number of addresses a user may search monthly, above which the user needs to pay a fee to keep using the system. This limit is usually high enough for end users, but it might not be enough for software developers who want to use geospatial data in their products. A free alternative to the address search engines mentioned above is Nominatim, an open source project whose purpose is to search addresses in the OpenStreetMap dataset. OpenStreetMap is a collaborative project that maps places in the real world to coordinates. The main drawback of Nominatim is that its usability is not as good as its competitors': Nominatim is unable to find addresses that are not correctly spelled, nor does it predict the user's needs. For this address search engine to be among the most used, the prediction and error correction features need to be added. In this thesis work I extend the search algorithms of Nominatim to add the functionality mentioned above. The address search engine proposed in this thesis offers a free and open source alternative to users and systems that require access to geospatial data without restrictions.
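The two features the thesis adds, autocomplete and typo-tolerant matching, can be sketched as below. This is a toy sketch, not Nominatim's index: a tiny in-memory address list stands in for the OpenStreetMap data, and the fuzzy match uses plain edit distance rather than the thesis's index strategy.

```python
# Sketch of the geocoder's two added features: prefix autocomplete and
# edit-distance-tolerant ("fuzzy") address matching.

def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def autocomplete(prefix, addresses, limit=5):
    """Suggest addresses that start with what the user has typed so far."""
    p = prefix.lower()
    return [a for a in addresses if a.lower().startswith(p)][:limit]

def fuzzy_search(query, addresses, max_edits=2):
    """Return the closest address within the allowed number of typos."""
    best = min(addresses, key=lambda a: levenshtein(query.lower(), a.lower()))
    return best if levenshtein(query.lower(), best.lower()) <= max_edits else None
```

A production geocoder would precompute an index instead of scanning every address per keystroke, but the sketch shows why the two features are separable: autocomplete is a prefix lookup, while fuzzy search is a nearest-neighbour problem under edit distance.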
- …