9 research outputs found

    Automatic assessment of electrical exams

    Electronic exams have become considerably more common, and many different exam systems have been created for them. The development of automatic assessment has also slowly gained popularity. Much of this development has focused on the assessment of essays. The same development can presumably be applied to other exam types as well. This thesis examines different exam types and how they have been made electronic, that is, how paper has been replaced by the computer as the answering medium. Computers have long been used for writing essays, but only recently have they begun to be used on a larger scale for taking exams. In making mathematics exams electronic, the biggest problem is writing mathematical formulas and symbols on a computer, but this is not an insurmountable obstacle. Using computers only for answering exams leaves much of their potential unused; computers can also assess exams automatically. This considerably reduces teachers' workload and increases students' opportunities for independent learning. Even a simple program that eases the management and assessment of the answer material obtained from exam systems serves this goal. This thesis presents a program built for this task, named PunaKynä.

    A Study of Retrieval Success with Original Works of Art Comparing the Subject Index Terms Provided by Experts in Art Museums With Those Provided By Novice and Intermediate Indexers

    This paper compares the retrieval success of terms of two different origins for searching online art museum collections: terms that are the natural byproducts of curatorial processes and terms provided by volunteer gallery teachers and students. The terms used by scholars and gallery teachers obtained the best retrieval, with approximately 15% of terms successfully retrieving the desired work. Little successful application of the terms available in the Art and Architecture Thesaurus (AAT) or of the terms used by scholars was seen in the online museum collections. Overall, the terms supplied by study participants had poor retrieval success. Applying additional index terms that describe the basic elements, materials and colors featured in the works, together with terms from the AAT, could improve retrieval.

    The Relative Effectiveness of Text and Images in Image Search Result Listings

    This study was conducted to determine the best type of image surrogate to use within search result sets: Text, Image Preview, or Text + Image Preview. Users' performance and satisfaction with the three image surrogates within search result sets were evaluated. Data was collected from 28 participants via a web-based system of questionnaires and logs of their interactions with result set presentations. Of the three surrogate types, Image Preview and Text + Image Preview consistently outperformed Text on measures of the time required to make relevance judgments, the quality of those relevance judgments, perceived ease of use and perceived usefulness. While relevance judgment scoring with Image Preview and Text + Image Preview surrogates was identical, answers to the post-session questionnaire indicated that users may prefer the Text + Image Preview surrogate, as it was "liked best overall" by more people.

    Human-Centered Content-Based Image Retrieval

    Images that lack (suitable) annotations cannot be retrieved through (traditional) Information Retrieval (IR) techniques. Access to such collections can be achieved by applying computer vision techniques to the IR problem, an approach termed Content-Based Image Retrieval (CBIR). In contrast with most purely technological approaches, the thesis Human-Centered Content-Based Image Retrieval approaches the problem from a human/user-centered perspective. Psychophysical experiments were conducted in which people were asked to categorize colors. The data gathered from these experiments was fed to a Fast Exact Euclidean Distance (FEED) transform (Schouten & Van den Broek, 2004), which enabled the segmentation of color space based on human perception (Van den Broek et al., 2008). This unique color space segmentation was exploited for texture analysis and image segmentation, and subsequently for full-featured CBIR. In addition, a unique CBIR benchmark was developed (Van den Broek et al., 2004, 2005). This benchmark was used to explore which parameters of the CBIR process (e.g., color and distance measures) influence retrieval results, and how. In contrast with other research, users' judgements were used as the metric. The online IR and CBIR system Multimedia for Art Retrieval (M4ART) (URL: http://www.m4art.org) has been (partly) founded on the techniques discussed in this thesis.

    References:
    - Broek, E.L. van den, Kisters, P.M.F., and Vuurpijl, L.G. (2004). The utilization of human color categorization for content-based image retrieval. Proceedings of SPIE (Human Vision and Electronic Imaging), 5292, 351-362. [see also Chapter 7]
    - Broek, E.L. van den, Kisters, P.M.F., and Vuurpijl, L.G. (2005). Content-Based Image Retrieval Benchmarking: Utilizing Color Categories and Color Distributions. Journal of Imaging Science and Technology, 49(3), 293-301. [see also Chapter 8]
    - Broek, E.L. van den, Schouten, Th.E., and Kisters, P.M.F. (2008). Modeling Human Color Categorization. Pattern Recognition Letters, 29(8), 1136-1144. [see also Chapter 5]
    - Schouten, Th.E. and Broek, E.L. van den (2004). Fast Exact Euclidean Distance (FEED) transformation. In J. Kittler, M. Petrou, and M. Nixon (Eds.), Proceedings of the 17th IEEE International Conference on Pattern Recognition (ICPR 2004), Vol. 3, pp. 594-597. August 23-26, Cambridge, United Kingdom. [see also Appendix C]

    Finding hidden semantics of text tables

    Combining data from different sources for further automatic processing is often hindered by differences in the underlying semantics and representation. Therefore, when linking information presented in documents in tabular form with data held in databases, it is important to determine as much information as possible about the table and its content. Important information about the table data is often given in the text surrounding the table in that document. The table's creators cannot clarify all the semantics in the table itself, so they use the table context or the text around it to give further information. These semantics are very useful when integrating and using this data, but are often difficult to detect automatically. We propose a solution to part of this problem based on a domain ontology. The input to our system is a document that contains tabular data, and the system aims to find semantics in the document that are related to the tabular data. The output of our system is a set of detected semantics linked to the corresponding table. The system uses elements of semantic detection, semantic representation, and data integration. Semantic detection uses a domain ontology, in which we store the concepts of that domain. This allows us to analyse the content of the document (text) and detect context information about the tables it contains. Our approach consists of two components: (1) extract, from the domain ontology, the concepts, synonyms, and relations that correspond to the table data; (2) build a tree for the paragraphs and use this tree to detect the hidden semantics by searching for words matching the extracted concepts. Semantic representation techniques then allow representation of the detected semantics of the table data. Our system represents the detected semantics as either 'semantic units' or 'enhanced metadata'. Semantic units are a flexible set of meta-attributes that describe the meaning of the data item along with the detected semantics. In addition, each semantic unit has a concept label associated with it that specifies the relationship between the unit and the real-world aspects it describes. In the enhanced metadata, table metadata is enhanced with the semantics and representation context found in the text. Integrating data in our proposed system takes place in two steps. First, the semantic units are converted to a common context, reflecting the application. This is achieved by using appropriate conversion functions. Second, the semantically identical semantic units will be identified and integrated into a common representation. This second step is the subject of future work. Thus the research has shown that semantics about a table are present in the surrounding text, and how these semantics can be located and used by transforming them into an appropriate form to enhance the basic table metadata.
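
    The two detection components described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the toy ontology, the two-level paragraph/sentence tree, and the names `ONTOLOGY`, `build_tree` and `detect_semantics` are all assumptions made for illustration, and relations between concepts are left out.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology (not from the source): concept label -> synonyms.
ONTOLOGY: Dict[str, Set[str]] = {
    "currency": {"currency", "usd", "dollars", "euros"},
    "unit": {"unit", "thousands", "millions"},
}


def build_tree(text: str) -> List[List[str]]:
    """Build a simple two-level tree: paragraphs, each a list of sentences."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [re.split(r"(?<=[.!?])\s+", p) for p in paragraphs]


def detect_semantics(
    tree: List[List[str]], ontology: Dict[str, Set[str]]
) -> List[Tuple[str, str, int, int]]:
    """Return (concept, matched term, paragraph index, sentence index) hits."""
    hits = []
    for p_idx, sentences in enumerate(tree):
        for s_idx, sentence in enumerate(sentences):
            # Lower-cased word set of the sentence, matched against synonyms.
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            for concept in sorted(ontology):
                for term in sorted(ontology[concept] & words):
                    hits.append((concept, term, p_idx, s_idx))
    return hits


text = (
    "Table 1 lists revenue. All figures are in millions of euros.\n\n"
    "The survey ran in 2004."
)
hits = detect_semantics(build_tree(text), ONTOLOGY)
```

    Here `hits` records that the second sentence of the first paragraph mentions a currency term and a unit term, which is the kind of context the abstract proposes to attach to the table as a semantic unit or as enhanced metadata.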

    Information Retrieval Beyond the Text Document

    published or submitted for publication

    Combinatoric Models of Information Retrieval Ranking Methods and Performance Measures for Weakly-Ordered Document Collections

    This dissertation answers three research questions: (1) What are the characteristics of a combinatoric measure, based on the Average Search Length (ASL), that performs the same as a probabilistic version of the ASL? (2) Does the combinatoric ASL measure produce the same performance result as the one obtained by ranking a collection of documents and calculating the ASL by empirical means? (3) When do the ASL and either the Expected Search Length, MZ-based E, or Mean Reciprocal Rank measure both imply that one document ranking is better than another? Concepts and techniques from enumerative combinatorics and other branches of mathematics were used in this research to develop combinatoric models and equations for several information retrieval ranking methods and performance measures. Empirical, statistical, and simulation means were used to validate these models and equations. The document cut-off performance measure equation variants developed in this dissertation can be used for performance prediction and to help study any vector V of ranked documents, at arbitrary document cut-off points, provided that (1) relevance is binary and (2) the following information can be determined from the ranked output: the document equivalence classes and their relative sequence, the number of documents in each equivalence class, and the number of relevant documents that each class contains. The performance measure equations yielded correct values for both strongly- and weakly-ordered document collections.
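
    As a rough illustration of how a search-length measure can be computed from exactly the inputs listed above, the sketch below takes the ASL to be the expected rank position of a relevant document and assumes that documents inside one equivalence class are ordered uniformly at random. Both are assumptions of this sketch, not the dissertation's equations, and the function name `average_search_length` is hypothetical.

```python
from fractions import Fraction
from typing import Sequence, Tuple


def average_search_length(classes: Sequence[Tuple[int, int]]) -> Fraction:
    """Expected rank of a relevant document in a weakly-ordered ranking.

    `classes` lists (class size, relevant documents in class) in ranked order.
    A document in a class of size n that starts after `offset` documents has
    expected position offset + (n + 1) / 2 under uniform random tie-breaking.
    """
    total_relevant = sum(rel for _, rel in classes)
    if total_relevant == 0:
        raise ValueError("the ranking contains no relevant documents")
    offset = 0
    weighted_positions = Fraction(0)
    for size, rel in classes:
        if not 0 <= rel <= size:
            raise ValueError("relevant count must lie between 0 and class size")
        weighted_positions += rel * (offset + Fraction(size + 1, 2))
        offset += size
    return weighted_positions / total_relevant


# Strong ordering: every class holds one document; relevant at ranks 1 and 3.
strong = average_search_length([(1, 1), (1, 0), (1, 1)])
# Weak ordering: a tie of 2 documents (1 relevant), then a tie of 3 (2 relevant).
weak = average_search_length([(2, 1), (3, 2)])
```

    With singleton classes the function reduces to the plain mean rank of the relevant documents, so the strongly-ordered case is a special case of the weakly-ordered one.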