25 research outputs found

    El proceso de minería de datos asistido por ontologías [The ontology-assisted data mining process]

    In this article we present the progress made in researching and developing the ontology-assisted data mining process. We also present a general model for applying ontologies, along with the types of ontologies proposed. The main motivation for including ontologies in this process is the need to incorporate prior knowledge into data mining studies. This knowledge may come from the process itself or from the application domain involved. Our goal is the overall improvement of the process, based on a better understanding of the application domain, of the results obtained in previous sessions, and of the application of the most suitable technique or techniques for the problem to be solved. Track: Software Engineering and Databases. Red de Universidades con Carreras en Informática (RedUNCI)

    Classifiers for modeling of mineral potential

    [Extract] Classification and allocation of land-use is a major policy objective in most countries. Such an undertaking, however, in the face of competing demands from different stakeholders, requires reliable information on resource potential. This type of information enables policy decision-makers to estimate the socio-economic benefits of different possible land-use types and then to allocate the most suitable land-use. The potential for several types of resources occurring on the earth's surface (e.g., forest, soil) is generally easier to determine than for those occurring in the subsurface (e.g., mineral deposits). In many situations, therefore, information on the potential for subsurface resources is not among the inputs to land-use decision-making [85]. Consequently, many potentially mineralized lands are usually alienated from, say, further exploration and exploitation of mineral deposits. Areas with mineral potential are characterized by geological features associated genetically and spatially with the type of mineral deposits sought. The term 'mineral deposits' means accumulations or concentrations of one or more useful naturally occurring substances, which are otherwise usually distributed sparsely in the earth's crust. The term 'mineralization' refers to the collective geological processes that result in the formation of mineral deposits. The term 'mineral potential' describes the probability or favorability for the occurrence of mineral deposits or mineralization. The geological features characteristic of mineralized land, which are called recognition criteria, are spatial objects indicative of or produced by individual geological processes that acted together to form mineral deposits.
Recognition criteria are sometimes directly observable; more often, their presence is inferred from one or more geographically referenced (or spatial) datasets, which are processed and analyzed appropriately to enhance, extract, and represent the recognition criteria as spatial evidence or predictor maps. Mineral potential mapping then involves integrating the predictor maps in order to classify areas of unique combinations of spatial predictor patterns, called unique conditions [51], as either barren or mineralized with respect to the mineral deposit-type sought.

    Doctor of Philosophy

    Dissertation. Public health surveillance systems are crucial for the timely detection of and response to public health threats. Since the terrorist attacks of September 11, 2001, and the release of anthrax in the following month, there has been a heightened interest in public health surveillance. The years immediately following these attacks were met with increased awareness and funding from the federal government, which has significantly strengthened the United States' surveillance capabilities; however, despite these improvements, today's public health surveillance systems face substantial challenges. Problems with the current surveillance systems include: a) lack of leveraging unstructured public health data for surveillance purposes; and b) lack of information integration and of the ability to leverage resources, applications, or other surveillance efforts, because the systems are built on a centralized model. This research addresses these problems by focusing on the development and evaluation of new informatics methods to improve public health surveillance. To address the problems above, we first identified a current public health surveillance workflow that is affected by the problems described and can be enhanced through current informatics techniques. The 122 Mortality Surveillance for Pneumonia and Influenza was chosen as the primary use case for this dissertation work. The second step involved demonstrating the feasibility of using unstructured public health data, in this case death certificates. For this we created and evaluated a pipeline, composed of a detection rule and a natural language processor, for the coding of death certificates and the identification of pneumonia and influenza cases. The second problem was addressed by presenting the rationale for creating a federated model that leverages grid technology concepts and tools for the sharing and epidemiological analysis of public health data.
As a case study of this approach, a secured virtual organization was created in which users can access two grid data services, using death certificates from the Utah Department of Health, and two analytical grid services, MetaMap and R. A scientific workflow was created using the published services to replicate the mortality surveillance workflow. To validate these approaches and provide proofs of concept, a series of real-world scenarios was conducted.

    Um sistema de telemedicina de baixo custo em larga escala [A low-cost, large-scale telemedicine system]

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico. Programa de Pós-Graduação em Ciência da Computação. Telemedicine is a service based on the use of communication and computing resources to provide remote medical assistance to a patient, giving a specialist physician the tools to do their job without direct contact between the two. For a telemedicine system to be functional and adequate, it is necessary to understand the operational and technical needs, and to determine which kinds of resources the physician needs in order to work properly even when away from their usual work environment. We present a model of a large-scale telemedicine network aimed at remote reporting of radiology and cardiology exams, in line with the reality, needs, and operational and functional capabilities identified for the State of Santa Catarina. We demonstrate the feasibility of using the already installed network infrastructure and computerized medical equipment to reduce patient travel costs and the waiting time for reports on computer-aided medical exams, speeding up medical care for patients. We ensure the security, storage, and correct display of the exams through the use of internationally adopted protocols and standards.

    Analysis and Modular Approach for Text Extraction from Scientific Figures on Limited Data

    Scientific figures are widely used as compact, comprehensible representations of important information. The reusability of these figures is, however, limited: one can rarely search for them directly, since they are mostly indexed by their surrounding text (e.g., publication or website), which often does not contain the full message of the figure. In this thesis, the focus is on making the content of scientific figures accessible by extracting the text from these figures. A modular pipeline for unsupervised text extraction from scientific figures, based on a thorough analysis of the literature, was built to address the problem. This modular pipeline was used to build several unsupervised approaches, in order to evaluate different methods from the literature as well as new methods and method combinations. Some supervised approaches were built as well for comparison. One challenge in evaluating the approaches was the lack of annotated data, which especially needed to be considered when building the supervised approach. Three existing datasets were used for evaluation, together with two manually created and annotated datasets comprising 241 scientific figures, for five evaluation datasets in total. Additionally, two existing datasets for text extraction from other types of images were used for pretraining the supervised approach. Several experiments showed the superiority of the unsupervised pipeline over common Optical Character Recognition engines and identified the best unsupervised approach. This unsupervised approach was compared with the best supervised approach, which, despite the limited amount of training data available, clearly outperformed the unsupervised approach.

    Geometric intersection problems

    Uma metodologia eficiente para recuperação de exames médicos DICOM por similaridade de caracteristicas visuais [An efficient methodology for retrieving DICOM medical exams by similarity of visual features]

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico. Programa de Pós-Graduação em Ciência da Computação. As medical exams in digital format became widespread, the need arose to develop techniques that can support medical decision-making. In this context, Content-Based Medical Image Retrieval (CBMIR) techniques [MULLER, 2004a] have been employed