Implementation of an information retrieval system within a central knowledge management system
Numbered pages: I-XIII, 14-126. Internship carried out at Wipro Portugal SA and supervised by Eng. Hugo Neto. Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia, Universidade do Porto. 201
A Study on the Open Source Digital Library Software's: Special Reference to DSpace, EPrints and Greenstone
The richness of knowledge has changed access methods for all stakeholders in
retrieving key knowledge and relevant information. This paper presents a study
of three open source digital library management software packages used to assimilate and
disseminate information to a world audience. The methodology followed involves
an online survey and a study of related software documentation and associated
technical manuals.
Comment: 9 Pages, 3 Figures, 1 Table, "Published with International Journal of
Computer Applications (IJCA)"
Migrating Cornetto Lexicon to New XML Database Engine
The original Cornetto project set out to develop a new complex-structured lexicon for the Dutch language. The lexicon building process works with information from two existing electronic dictionaries: the Referentie Bestand Nederlands (RBN), which contains FrameNet-like structures, and the Dutch wordnet (DWN) with the usual wordnet structures. The resulting Cornetto lexicon is stored in a system called Cornetto database, which is built on the Dictionary Editor and Browser (DEB) platform. In this paper, we describe the transition of the Cornetto database system to a new database backend, chosen on the basis of a large set of tests that were run on four selected (out of more than twenty) available XML database systems. We present the technical details of the Cornetto editing process and the results before and after the database transition.
Enterprise Search in the European Union: A Techno-economic Analysis
This Report contributes to the work being carried out by IPTS on the potential of Search, discussing, in particular, the prospects of Enterprise search as well as the main challenges and opportunities. It is part of CHORUS+, an initiative supported by the Directorate General Information Society and Media. Information about CHORUS+ is available at http://avmediasearch.eu
JRC.J.3-Information Societ
Impliance: A Next Generation Information Management Appliance
ably successful in building a large market and adapting to the changes of the
last three decades, its impact on the broader market of information management
is surprisingly limited. If we were to design an information management system
from scratch, based upon today's requirements and hardware capabilities, would
it look anything like today's database systems?" In this paper, we introduce
Impliance, a next-generation information management system consisting of
hardware and software components integrated to form an easy-to-administer
appliance that can store, retrieve, and analyze all types of structured,
semi-structured, and unstructured information. We first summarize the trends
that will shape information management for the foreseeable future. Those trends
imply three major requirements for Impliance: (1) to be able to store, manage,
and uniformly query all data, not just structured records; (2) to be able to
scale out as the volume of this data grows; and (3) to be simple and robust in
operation. We then describe four key ideas that are uniquely combined in
Impliance to address these requirements, namely the ideas of: (a) integrating
software and off-the-shelf hardware into a generic information appliance; (b)
automatically discovering, organizing, and managing all data - unstructured as
well as structured - in a uniform way; (c) achieving scale-out by exploiting
simple, massive parallel processing, and (d) virtualizing compute and storage
resources to unify, simplify, and streamline the management of Impliance.
Impliance is an ambitious, long-term effort to define simpler, more robust, and
more scalable information systems for tomorrow's enterprises.
Comment: This article is published under a Creative Commons License Agreement
(http://creativecommons.org/licenses/by/2.5/). You may copy, distribute,
display, and perform the work, make derivative works and make commercial use
of the work, but you must attribute the work to the author and CIDR 2007.
3rd Biennial Conference on Innovative Data Systems Research (CIDR) January
7-10, 2007, Asilomar, California, US
Development of a centralized log management system
Logs are a crucial piece of any system and give a helpful insight into what it is
doing as well as what happened in case of failure. Every process running on a system
generates logs in some format. Generally, these logs are written to local storage
resources. As systems evolved, the number of logs to analyze increased and, as a
consequence, the need arose for a standardized log format,
minimizing dependencies and making the analysis process easier.
ams is a company that develops and creates sensor solutions. With twenty-two
design centers and three manufacturing locations, the company serves over eight
thousand clients worldwide. One design center is located in Funchal and includes a
team of application engineers who design and develop software applications for clients
inside the company. The application engineers' development process involves
several applications and programs, each with its own logging system.
Log entries generated by different applications are kept in separate storage
systems. A developer or administrator who wants to troubleshoot an issue that spans
several applications has to visit different database systems or locations
to collect the logs and correlate them across the several requests. This is a tiresome
process and, if the environment is auto-scaled, troubleshooting such an issue becomes
inconceivable.
This project aimed to solve these problems by creating a Centralized Log
Management System capable of handling logs from a variety of sources and of
providing services that help developers and administrators better understand
the different affected environments.
The deployed solution was developed using a set of different open-source
technologies, such as the Elastic Stack (Elasticsearch, Logstash and Kibana), Node.js,
GraphQL and Cassandra.
The present document describes the process and decisions taken to achieve the
solution
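The idea of a uniform log format described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the two input layouts, the field names (`timestamp`, `source`, `level`, `message`, `ts`, `severity`, `msg`) and the function names are all assumptions made for illustration, and the Elastic Stack itself is not used here.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical plain-text layout: "2020-01-31 12:00:05 ERROR disk full"
PLAIN_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.*)$")


def normalize_plain(line, source):
    """Convert one plain-text log line (assumed to be in UTC) to the common record."""
    match = PLAIN_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    stamp, level, message = match.groups()
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return {
        "timestamp": when.isoformat(),
        "source": source,
        "level": level.upper(),
        "message": message,
    }


def normalize_json(line, source):
    """Convert one JSON log line (hypothetical keys ts, severity, msg)."""
    raw = json.loads(line)
    when = datetime.fromtimestamp(raw["ts"], tz=timezone.utc)
    return {
        "timestamp": when.isoformat(),
        "source": source,
        "level": str(raw["severity"]).upper(),
        "message": raw["msg"],
    }


def merge_logs(*batches):
    """Merge normalized records from several sources into one timeline."""
    records = [record for batch in batches for record in batch]
    return sorted(records, key=lambda r: datetime.fromisoformat(r["timestamp"]))
```

Once every application's entries share one shape, correlating an issue across applications reduces to sorting and filtering a single collection instead of visiting each storage system in turn.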
Indexing and Searching Document Collections using Lucene
The amount of information available to a person is growing day by day; hence retrieving the correct information in a timely manner plays a very important role. This thesis discusses indexing document collections and fetching the right information with the help of a database. The primary role of the database is to store additional information that may or may not be available in the document collection itself. The indexing of the document collection is performed by Lucene, while the search application is tightly integrated with a database. In this thesis a highly efficient, scalable, customized search tool is built using Lucene. The search tool is capable of indexing and searching databases, PDF documents, Word documents and text files
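The combination of a full-text index with a database of additional information can be sketched in plain Python. This is a conceptual illustration of an inverted index, not Lucene's API and not the thesis's tool; the function names, the `metadata` dictionary standing in for the database, and the sample documents are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def search_with_metadata(index, metadata, query):
    """Attach database-style metadata (a plain dict here) to each hit."""
    return {doc_id: metadata.get(doc_id, {}) for doc_id in search(index, query)}
```

For example, with `documents = {"a.pdf": "Lucene indexes documents", "b.txt": "Databases store metadata for documents"}`, the query `"lucene documents"` matches only `a.pdf`, and `search_with_metadata` returns whatever the stand-in database holds for that id.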
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies into an optimized
solution to a specific real-world problem, big data systems are no exception.
As far as the storage aspect of any big data system is
concerned, the primary facet is the storage infrastructure, and
NoSQL seems to be the technology that fulfills its requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents
feature and use case analysis and comparison of the four main data models
namely document oriented, key value, graph and wide column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings the second facet of big data storage, big data file
formats, into the picture. The second half of the paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects are
also discussed
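The four data models named in this abstract can be contrasted by shaping the same small record in each of them. The sketch below uses plain Python structures as stand-ins; the keys, names and the `followed_names` helper are hypothetical and do not correspond to any particular NoSQL product.

```python
import json

# Document oriented: one self-describing, nested record per entity.
document_store = {
    "user:1": {"name": "Ada", "city": "London", "follows": ["user:2"]},
}

# Key value: the store sees only an opaque value behind each key.
key_value_store = {
    "user:1": json.dumps({"name": "Ada", "city": "London"}),
}

# Wide column: row key -> column family -> sparse set of columns.
wide_column_store = {
    "user:1": {
        "profile": {"name": "Ada", "city": "London"},
        "social": {"follows:user:2": ""},
    },
}

# Graph: entities are nodes, relationships are first-class edges.
graph_nodes = {"user:1": {"name": "Ada"}, "user:2": {"name": "Grace"}}
graph_edges = [("user:1", "FOLLOWS", "user:2")]


def followed_names(nodes, edges, start):
    """Traverse FOLLOWS edges from one node and return the target names."""
    return [
        nodes[target]["name"]
        for origin, label, target in edges
        if origin == start and label == "FOLLOWS"
    ]
```

The relationship query is a direct traversal in the graph shape, whereas the key value shape would require fetching and decoding whole values in application code, which is the kind of trade-off a feature comparison of data models has to weigh.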