Probabilistic SynSet Based Concept Location
Concept location is a common task in program comprehension, essential in many approaches used for software maintenance and software evolution. An important goal of this process is to discover a mapping between source code and human-oriented concepts.
Although programs are written in a strict and formal language, natural language terms and sentences, such as identifiers (variable or function names), constant strings, or comments, can still be found embedded in programs. Using terminology concepts and natural language processing techniques, these terms can be exploited to discover clues about which real-world concepts the source code is addressing.
This work extends symbol tables built by compilers with ontology-driven constructs, and extends synonym sets defined by linguists with Probabilistic SynSets automatically created from software-domain parallel corpora. Using a relational algebra, it creates semantic bridges between program elements and human-oriented concepts to enhance concept location tasks.
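The core idea can be illustrated with a minimal sketch. A probabilistic synset here is assumed to map each term to the probability that it expresses the same concept as a headword; the data, function names, and probabilities below are illustrative, not taken from the paper.

```python
def relatedness(identifier_terms, concept_synset):
    """Score how strongly a program element's terms match a concept.

    concept_synset: dict mapping term -> probability of synonymy
    with the concept (assumed representation).
    """
    if not identifier_terms:
        return 0.0
    total = sum(concept_synset.get(term, 0.0) for term in identifier_terms)
    return total / len(identifier_terms)

# Hypothetical synset for the concept "remove", e.g. learned from
# software-domain parallel corpora.
remove_synset = {"remove": 1.0, "delete": 0.85, "erase": 0.60, "drop": 0.40}

# Terms split from an identifier such as `delete_user_account`.
score = relatedness(["delete", "user", "account"], remove_synset)
```

A higher score suggests the program element is more likely to address the concept, which is the kind of semantic bridge the abstract describes.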
WIKI::SCORE: a collaborative environment for music transcription and publishing
Music sources are most commonly shared as music scores scanned or printed on paper sheets. These artifacts are rich in information, but since they are images it is hard to re-use and share their content in today's digital world. Although there are modern languages that can be used to transcribe music sheets, this is still a time-consuming task, because of the complexity involved in the process and the typically huge size of the original documents.
WIKI::SCORE is a collaborative environment where several people work together to transcribe music sheets to a shared medium, using a common notation. This eases the process of transcribing huge documents, and stores the document in a well-known notation that can later be used to publish the whole content in several formats, such as a PDF document, images, or audio files.
Defining a probabilistic translation dictionaries algebra
Probabilistic translation dictionaries have been around for some time, but there is a lack of a formal definition for their structure and base operations. In this article we start by discussing what these resources are, what researchers are using them for, and what tools can be used to create them. We include a formal definition and a proposal for an XML schema for dictionary interchange. This is followed by a discussion of a set of useful operations that can be performed over probabilistic translation dictionaries, such as union, intersection, domain restriction, and composition. Together with this algebra formalization, some insights on the usefulness and application of the operations are presented. This work is partially supported by Per-Fide.
The Per-Fide project is supported in part by a grant (Reference No. PTDC/CLEL-LI/108948/2008) from the Portuguese Foundation for Science and Technology and it is co-funded by the European Regional Development Fund
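Two of the operations named in the abstract, composition and domain restriction, can be sketched over an assumed representation of a probabilistic translation dictionary as a mapping from each source word to its weighted translations. The data model and function names below are illustrative, not the paper's formalization.

```python
def compose(d1, d2):
    """Compose two dictionaries: e.g. PT->EN composed with EN->FR
    yields PT->FR, multiplying translation probabilities along paths."""
    result = {}
    for word, translations in d1.items():
        out = {}
        for mid, p1 in translations.items():
            for target, p2 in d2.get(mid, {}).items():
                out[target] = out.get(target, 0.0) + p1 * p2
        if out:
            result[word] = out
    return result

def domain_restrict(d, words):
    """Restrict a dictionary to a given set of source words."""
    return {w: t for w, t in d.items() if w in words}

# Toy dictionaries (probabilities are made up for illustration).
pt_en = {"gato": {"cat": 0.9, "feline": 0.1}}
en_fr = {"cat": {"chat": 0.95}, "feline": {"félin": 0.8}}
pt_fr = compose(pt_en, en_fr)
```

Union and intersection would follow the same pattern, merging or filtering the per-word translation maps.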
Conclave: Writing programs to understand programs
Software maintainers are often challenged with source code changes to improve software systems, or eliminate defects, in unfamiliar programs. To undertake these tasks a sufficient understanding of the system, or at least a small part of it, is required. One of the most time-consuming tasks of this process is locating which parts of the code are responsible for some key functionality or feature. This paper introduces Conclave, an environment for software analysis that enhances program comprehension activities. Programmers use natural languages to describe and discuss the problem domain, programming languages to write source code, and markup languages to have programs talking with other programs, so this system has to cope with this heterogeneity of dialects and provide tools in all these areas to effectively contribute to the understanding process.
The source code, the problem domain, and the side effects of running the program are represented in the system using ontologies. A combination of tools, each specialized in a different kind of language, creates mappings between the different domains. Conclave provides facilities for feature location, code search, and views of the software that ease the process of understanding the code and devising changes. The underlying feature location technique explores natural language terms used in programs (e.g. function and variable names); using textual analysis and a collection of natural language processing techniques, it computes synonymous sets of terms. These sets are used to score relatedness between program elements and search queries or problem domain concepts, producing sorted ranks of program elements that address the search criteria or concepts, respectively. © Nuno Ramos Carvalho, José João Almeida, Maria João Varanda Pereira, and Pedro Rangel Henriques.
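The ranking step described at the end of the abstract can be sketched as follows. This is a bag-of-words stand-in, not Conclave's actual scoring: the element names and term sets are invented, and a real implementation would expand query terms through the synonym sets first.

```python
def rank_elements(query_terms, elements):
    """Rank program elements by overlap between their natural-language
    terms and the query terms; elements with no overlap are dropped.

    elements: dict mapping element name -> set of terms (assumed shape).
    """
    scored = []
    for name, terms in elements.items():
        score = len(set(query_terms) & terms)
        scored.append((score, name))
    return [name for score, name in sorted(scored, reverse=True) if score > 0]

# Terms extracted from hypothetical function identifiers.
elements = {
    "save_account": {"save", "account"},
    "render_page": {"render", "page"},
    "store_user_data": {"store", "user", "data"},
}
ranked = rank_elements(["save", "account", "store"], elements)
```

The output is a sorted rank of program elements addressing the search criteria, in the sense the abstract describes.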
PFTL: a systematic approach for describing filesystem tree processors
Today, most developers prefer to store information in databases. But plain filesystems were used for years, and are still used, to store information, commonly in files of heterogeneous formats organized in directory trees. This approach is a very flexible and natural way to create hierarchically organized structures of documents. We can devise a formal notation to describe a filesystem tree structure, similar to a grammar, assuming that filenames can be considered terminal symbols and directory names non-terminal symbols. This specification would allow us to derive correct language sentences (combinations of terminal symbols) and to associate semantic actions, which can produce arbitrary side effects, with each valid sentence, just as we do in common parser generation tools. These specifications can be used to systematically process files in directory trees, and the final result depends on the semantic actions associated with each production rule. In this paper we revamp an old idea of using a domain-specific language, similar to context-free grammars, to implement these specifications, and introduce some examples of applications that can be built using this approach.
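The grammar analogy can be made concrete with a small sketch, assuming a rule format of (directory-name pattern, filename glob, semantic action); this format and the rule names are illustrative, not PFTL's actual notation.

```python
import fnmatch
import os

def process_tree(root, rules):
    """Walk a directory tree; whenever a directory name matches a rule's
    non-terminal pattern and a file matches its terminal pattern, fire
    the rule's semantic action on that file, collecting the results."""
    results = []
    for dirpath, _dirs, files in os.walk(root):
        dirname = os.path.basename(dirpath)
        for dir_pattern, file_pattern, action in rules:
            if fnmatch.fnmatch(dirname, dir_pattern):
                for name in sorted(fnmatch.filter(files, file_pattern)):
                    results.append(action(os.path.join(dirpath, name)))
    return results

# Example "grammar": directory names as non-terminals, file globs as
# terminals, lambdas as semantic actions (all hypothetical).
rules = [
    ("articles", "*.txt", lambda p: ("index", os.path.basename(p))),
    ("images",   "*.png", lambda p: ("thumbnail", os.path.basename(p))),
]
```

As in parser generators, the final result depends entirely on the actions attached to each production.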
Deep learning powered question-answering framework for organizations digital transformation
In the context of digital transformation by governments, the public sector, and other organizations, much information is moving to digital platforms. Chatbots and similar question-answering systems are becoming popular for answering information queries, as opposed to browsing online repositories or webpages. State-of-the-art approaches for these systems may be laborious to implement, hard to train and maintain, and also require a high level of expertise. This work explores the definition of a generic framework to systematically build question-answering systems. A sandbox implementation of this framework enables the deployment of turnkey systems, directly from already existing collections of documents. These systems can then be used to provide a question-answering communication channel that enriches the organization's digital presence. This work is financed by the ERDF - European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme and by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, within project POCI-01-0145-FEDER-029946 (DaVinci).
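The "turnkey from existing documents" idea can be sketched with a toy retrieval step that answers a question with the best-matching passage. The framework described above uses deep learning; this bag-of-words stand-in, with invented documents, only shows the shape of the pipeline.

```python
import re

def best_passage(question, passages):
    """Return the passage sharing the most word tokens with the question
    (a naive stand-in for a learned retrieval/answering model)."""
    def tokens(text):
        return set(re.findall(r"[a-z]+", text.lower()))
    q = tokens(question)
    return max(passages, key=lambda p: len(q & tokens(p)))

# Hypothetical organizational document collection.
docs = [
    "Office hours are from 9am to 5pm on weekdays.",
    "Permits can be renewed online through the citizen portal.",
]
answer = best_passage("Can permits be renewed online?", docs)
```

A deployed system would replace `best_passage` with a trained model, but the interface, a question in and a passage or answer out over an existing document collection, stays the same.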
OML - Ontology Manipulation Language
Master's dissertation in Informatics. Ontologies are a common approach used nowadays for the formal representation of concepts in a structured way. Natural language processing, translation tasks, and building blocks for the new Web 2.0 (social networks, for example) are instances of areas where the adoption of this approach is emerging and quickly growing.
Ontologies are easy to store and can be easily built from other data structures. Due to their structural nature, data processing can be automated into simple operations. New knowledge can also be quickly inferred, many times based on simple mathematical properties. All these qualities brought together make ontologies a strong candidate for knowledge representation.
To perform all of these tasks over ontologies, custom-made tools are most often developed, and these can be hard to adapt for future uses.
The purpose of the work presented in this dissertation is to study and implement tools that can be used to manipulate and maintain ontologies in an abstract and intuitive way. We specify an expressive and powerful, yet simple, domain-specific language created to perform actions on ontologies. We use these actions to manipulate knowledge in ontologies, infer new relations or concepts, and also keep the existing ones valid. We developed a set of tools and engines that implement this language in order to be able to use it. We illustrate the use of this technology with some simple case studies.
Environmental monitoring of the Foz do Arelho submarine outfall using an autonomous underwater vehicle
- …
