7 research outputs found

    Bibliographic section

    Classification of Text Documents Using a Logical Analysis Approach.

    The main problem investigated in this dissertation is as follows. Given two samples of documents, each drawn from one of two disjoint collections, how can one obtain a set of patterns of text features that classifies every document in the two samples (and other unclassified documents) correctly into one and only one document class? A sample of 2,897 documents from the TIPSTER collection was used to investigate this problem, which was divided into four subproblems. The first subproblem consists of identifying the set of keywords that describe the documents' content. Computational results of twenty experiments suggested that single-word keywords addressed the main problem effectively. The second subproblem requires a methodology for constructing classification rules that infer the class of unclassified documents. A logical analysis approach called the One Clause At a Time (OCAT) algorithm is used to address this problem. Its accuracy is compared with that of the Vector Space Model (VSM), a benchmark methodology in document classification. Under identical experimental conditions, some computational results suggest that the OCAT algorithm is more accurate than the VSM on the main problem. The third subproblem consists of providing a methodology for constructing better rules as more documents become available. This problem was investigated using the OCAT algorithm under a guided and a random learning approach. Computational results on three samples of 510 documents indicate that the guided learning approach constructs more accurate rules. The fourth subproblem calls for an incremental version of the OCAT algorithm, needed to speed up the construction of the classification rules. Computational results on three samples of 336 documents each show that (i) the classification rules become accurate more rapidly, (ii) the CPU times are substantially reduced, and (iii) the rules become more complex as more documents are added to the experiment. In summary, the results of this research suggest with high confidence that the incremental OCAT algorithm can perform better than the VSM and that it can deliver better and faster results for the classification of large collections of documents.
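
For readers unfamiliar with the Vector Space Model used as the benchmark above, the following is a minimal sketch of a VSM-style classifier: documents become term-frequency vectors and an unclassified document is assigned to the class whose summed vector it is closest to by cosine similarity. The toy documents, the class labels and the function names are hypothetical, and this is not the dissertation's experimental setup or its OCAT algorithm.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector for one document."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def class_vector(docs):
    """Sum of the term-frequency vectors of one class.

    Cosine similarity ignores vector length, so the sum points in the
    same direction as the class centroid.
    """
    total = Counter()
    for doc in docs:
        total.update(tf_vector(doc))
    return total


def classify(text, class_vectors):
    """Return the label whose class vector is most similar to `text`."""
    vec = tf_vector(text)
    return max(class_vectors, key=lambda label: cosine(vec, class_vectors[label]))


# Hypothetical two-class training sample (not TIPSTER data).
training = {
    "energy": ["oil prices rise", "nuclear power plant energy"],
    "finance": ["bank interest rates", "stock market bank shares"],
}
class_vectors = {label: class_vector(docs) for label, docs in training.items()}

print(classify("oil and nuclear energy", class_vectors))
print(classify("bank shares fall", class_vectors))
```

A rule-based method such as OCAT differs in kind: it produces explicit logical clauses over keywords rather than a similarity score, which is why the abstract compares the two on accuracy under identical conditions.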

    Effect of a silvopastoral system on milk quality compared with a conventional production system

    In this study, the effect of a silvopastoral system (SSP) on milk quality was evaluated in comparison with a conventional production system, along with the plant factors that affect the levels of fat and protein in milk; in addition, six artificial neural networks (ANNs) were developed to predict the percentages of fat and protein. The SSP used botón de oro (Tithonia diversifolia; BO), while the conventional system used star grass (Cynodon nlemfuensis; ES). The variables evaluated for the pastures were dry matter (MS), ash (CEN), crude protein (CP), neutral detergent fiber (NDF) and energy (ENER); milk quality was measured as % fat and % protein. The analysis of variance showed no significant differences for the block source of variation in the bromatological variables of the pastures or in the milk quality variables, but highly significant differences (P < 0.01) were found among the three pastures evaluated: BO, ES(BO) and ES. MS, CP and CEN did not differ significantly between ES and ES(BO), whereas BO did differ. NDF, ENER, milk fat and milk protein were significantly different across the three treatments. BO gave above-average percentages of milk fat and protein. The ES(BO) association also showed higher percentages than ES. For the ANNs, those with the highest R² were selected. The R² values for milk fat and milk protein were 0.9601 and 0.9622 in BO, 0.957 and 0.8957 in ES(BO), and 0.9646 and 0.938 in ES, respectively. The silvopastoral system gave the best milk fat and protein values, which indicates better milk quality than under the conventional production system. Furthermore, the artificial neural networks predicted fat and protein values for the two systems studied with a high level of predictive accuracy. Maestrí
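
The R² figures reported for the neural networks are coefficients of determination. As a reminder of what that number measures, here is a minimal sketch of the standard formula, R² = 1 - SS_res / SS_tot. The function name and the sample values are hypothetical and are not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical milk-fat percentages and model predictions.
fat_observed = [3.4, 3.6, 3.9, 4.1]
fat_predicted = [3.5, 3.6, 3.8, 4.1]
print(round(r_squared(fat_observed, fat_predicted), 4))
```

A value near 1 means the predictions account for almost all of the variation in the measured values; a model that always predicts the mean scores 0.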

    Formal concept matching and reinforcement learning in adaptive information retrieval

    The superiority of the human brain in information retrieval (IR) tasks seems to come firstly from its ability to read and understand the concepts, ideas or meanings central to documents, in order to reason out the usefulness of documents to information needs, and secondly from its ability to learn from experience and be adaptive to the environment. In this work we attempt to incorporate these properties into the development of an IR model to improve document retrieval. We investigate the applicability of concept lattices, which are based on the theory of Formal Concept Analysis (FCA), to the representation of documents. This allows the use of more elegant representation units, as opposed to keywords, in order to better capture concepts/ideas expressed in natural language text. We also investigate the use of a reinforcement learning strategy to learn and improve document representations, based on the information present in query statements and user relevance feedback. Features or concepts of each document/query, formulated using FCA, are weighted separately with respect to the documents they are in, and organised into separate concept lattices according to a subsumption relation. Furthermore, each concept lattice is encoded in a two-layer neural network structure known as a Bidirectional Associative Memory (BAM), for efficient manipulation of the concepts in the lattice representation. This avoids implementation drawbacks faced by other FCA-based approaches. Retrieval of a document for an information need is based on concept matching between the concept lattice representations of a document and a query. The learning strategy works by making the similarity of relevant documents stronger and that of non-relevant documents weaker for each query, depending on the users' relevance judgements on retrieved documents. Our approach is radically different to existing FCA-based approaches in the following respects: concept formulation; weight assignment to object-attribute pairs; the representation of each document in a separate concept lattice; and encoding concept lattices in BAM structures. Furthermore, in contrast to the traditional relevance feedback mechanism, our learning strategy makes use of relevance feedback information to enhance document representations, thus making the document representations dynamic and adaptive to user interactions. The results obtained on the CISI, CACM and ASLIB Cranfield collections are presented and compared with published results. In particular, the performance of the system is shown to improve significantly as the system learns from experience. The School of Computing, University of Plymouth, UK
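
To make the notion of a concept lattice concrete, the sketch below enumerates the formal concepts of a tiny object-attribute context and checks the subsumption order between two of them. In FCA a formal concept is a pair (extent, intent) in which the extent is exactly the set of objects sharing every attribute of the intent, and the intent is exactly the set of attributes common to every object of the extent. The context, the names and the brute-force enumeration are assumptions for illustration only; this is not the thesis's concept formulation, weighting scheme or BAM encoding.

```python
from itertools import combinations

# Hypothetical context: each object (here, a text unit) with its attributes.
context = {
    "d1": {"neural", "retrieval"},
    "d2": {"neural", "lattice"},
    "d3": {"neural", "retrieval", "lattice"},
}


def common_attributes(objects, ctx):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set().union(*ctx.values())
    for obj in objects:
        shared &= ctx[obj]
    return shared


def objects_having(attrs, ctx):
    """Objects that carry every attribute in `attrs`."""
    return {obj for obj, obj_attrs in ctx.items() if attrs <= obj_attrs}


def formal_concepts(ctx):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Brute force: exponential in the number of objects, fine for a toy context.
    """
    concepts = set()
    objs = sorted(ctx)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, ctx)
            extent = objects_having(intent, ctx)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


def is_subconcept(a, b):
    """Subsumption order of the lattice: a <= b iff extent(a) is a subset of extent(b)."""
    return a[0] <= b[0]


concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Ordering the resulting pairs by extent inclusion gives the lattice; matching a document against a query would then compare two such structures rather than two flat keyword lists.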

    Use of neural networks as transfer components for retrieval in heterogeneous document collections

    "Increasing worldwide networking and the development of digital libraries are opening up new possibilities for searching across multiple data collections. This gives rise to the problem of semantic heterogeneity, since, for example, terms can have different meanings in different contexts. The transfer components needed to address it pose a new challenge, one for which neural networks are well suited." (author's abstract)