
    Classification of Occluded Objects using Fast Recurrent Processing

    Full text link
    Recurrent neural networks are powerful tools for handling incomplete-data problems in computer vision, thanks to their significant generative capabilities. However, the computational demand of these algorithms is too high for real-time operation without specialized hardware or software solutions. In this paper, we propose a framework for adding recurrent processing capabilities to a feedforward network without sacrificing much computational efficiency. We assume a mixture model and generate samples of the last hidden layer according to the class decisions of the output layer, modify the hidden layer activity using the samples, and propagate the result to lower layers. For the visual occlusion problem, this iterative procedure emulates a feedforward-feedback loop, filling in the missing hidden-layer activity with meaningful representations. The proposed algorithm is tested on a widely used dataset and shown to achieve a 2× improvement in classification accuracy for occluded objects. Compared to Restricted Boltzmann Machines, our algorithm shows superior performance for occluded object classification. Comment: arXiv admin note: text overlap with arXiv:1409.8576 by another author.
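    The fill-in loop the abstract describes (sample the last hidden layer given the output layer's class decision, blend the samples into the hidden activity, re-read the output) can be sketched roughly as below. This is a minimal illustration that assumes per-class Gaussian statistics over the hidden layer and a simple blending weight; the paper's actual mixture model, and the propagation of the filled-in activity back to lower layers, are not reproduced here.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    class OcclusionFillingClassifier:
        """Feedforward classifier whose last hidden layer is iteratively
        re-sampled from class-conditional statistics (a rough sketch only)."""

        def __init__(self, W1, b1, W2, b2, class_means, class_stds, blend=0.5):
            self.W1, self.b1 = W1, b1        # input -> hidden weights
            self.W2, self.b2 = W2, b2        # hidden -> output weights
            self.class_means = class_means   # per-class mean hidden activity (assumed)
            self.class_stds = class_stds     # per-class std of hidden activity (assumed)
            self.blend = blend               # how strongly samples fill in the activity

        def forward(self, x):
            h = np.maximum(0.0, self.W1 @ x + self.b1)    # ReLU hidden layer
            return h, softmax(self.W2 @ h + self.b2)

        def classify_occluded(self, x, n_iter=5):
            h, p = self.forward(x)
            for _ in range(n_iter):
                c = int(p.argmax())                        # current class decision
                # sample hidden activity for the winning class
                sample = rng.normal(self.class_means[c], self.class_stds[c])
                # fill in (blend) the hidden activity with the sampled representation
                h = (1.0 - self.blend) * h + self.blend * np.maximum(0.0, sample)
                p = softmax(self.W2 @ h + self.b2)         # re-read the output layer
            return p

    # Toy dimensions: 8 inputs, 6 hidden units, 3 classes.
    W1, b1 = rng.normal(size=(6, 8)), np.zeros(6)
    W2, b2 = rng.normal(size=(3, 6)), np.zeros(3)
    means, stds = rng.random((3, 6)), np.full((3, 6), 0.1)
    net = OcclusionFillingClassifier(W1, b1, W2, b2, means, stds)
    print(net.classify_occluded(rng.normal(size=8)))
    ```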

    Problems and Analysis of Cheating to Reduce Similarity Undetected by Similarity Tools

    Get PDF
    This research aims to describe the problem of cheating to reduce similarity scores from the Turnitin software when writing scientific articles, as well as the factors behind the cheating carried out by students. The research method is qualitative, with a case-study approach. Data collection techniques include observation, documentation, and interviews with students taking the scientific paper course. The sample, selected using purposive sampling, consists of students taking the scientific paper course with a focus on mathematics research at the Open University, Study Program X. The findings indicate six cheating techniques used to reduce similarity that are undetectable by Turnitin: (1) making the spaces in a text abnormally large or small, (2) converting text files into images, (3) adding specific letters to the manuscript, (4) inserting specific small-sized numbers that are almost invisible, (5) intentionally making typing errors, and (6) adding specific symbols to the scientific article manuscript.
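    Most of these tricks leave traces in the character stream rather than in the visible wording, which is why text-matching tools miss them. As an illustration only, the sketch below flags some of the listed artifacts in a plain-text manuscript (runs of unusual spacing, characters that typically render invisibly, and symbols wedged between letters); detecting text converted to images or tiny font sizes would require inspecting the document's formatting, which this sketch does not do. The character sets and thresholds are illustrative assumptions, not part of the study.

    ```python
    import re

    # Characters that usually render as (nearly) invisible and can be used to
    # break up words without changing their appearance.
    INVISIBLE_CHARS = {"\u200b", "\u200c", "\u200d", "\ufeff", "\u00ad"}

    def flag_manipulation_artifacts(text: str) -> dict:
        """Return counts of suspicious patterns in a plain-text manuscript."""
        return {
            # (1) abnormal spacing: runs of three or more consecutive spaces
            "long_space_runs": len(re.findall(r" {3,}", text)),
            # zero-width / invisible characters inserted into words
            "invisible_chars": sum(text.count(c) for c in INVISIBLE_CHARS),
            # (6) symbols placed directly between two letters (apostrophes and
            # hyphens are excluded, since they occur in normal prose)
            "symbols_inside_words": len(re.findall(r"[A-Za-z][^\w\s'’-][A-Za-z]", text)),
        }

    if __name__ == "__main__":
        sample = "This  is a manu\u200bscript with odd   spacing and wo|rd breaks."
        print(flag_manipulation_artifacts(sample))
    ```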

    Matching of Descriptive Labels to Glossary Descriptions

    Full text link
    Semantic text similarity plays an important role in software engineering tasks in which engineers are asked to clarify the semantics of descriptive labels (e.g., business terms, table column names) that often consist of too short or too generic words and appear throughout their IT systems. We formulate this type of problem as a task of matching descriptive labels to glossary descriptions. We then propose a framework that leverages an existing semantic text similarity measurement (STS) and augments it using semantic label enrichment and set-based collective contextualization, where the former is a method for retrieving sentences relevant to a given label and the latter is a method for computing similarity between two contexts, each of which is derived from a set of texts (e.g., column names in the same table). We performed an experiment on two datasets derived from publicly available data sources. The results indicated that the proposed methods helped the underlying STS correctly match more descriptive labels with their descriptions.
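    The abstract does not spell out the enrichment or contextualization methods, so the sketch below only illustrates the general shape of such a pipeline: an off-the-shelf sentence encoder serves as the STS, and the other labels from the same table are concatenated into the query as a crude stand-in for contextualization. The model name, the string concatenation, and the toy glossary are all assumptions, not the paper's implementation.

    ```python
    from sentence_transformers import SentenceTransformer, util

    # Off-the-shelf STS backbone (an assumption; any sentence encoder would do).
    model = SentenceTransformer("all-MiniLM-L6-v2")

    def match_label(label: str, sibling_labels: list[str], glossary: dict[str, str]) -> str:
        """Match a short descriptive label (e.g. a column name) to the most
        similar glossary description, using the other labels in the same
        table as additional context."""
        # Label enrichment: append the surrounding labels so the encoder sees
        # more than one or two generic words.
        enriched = f"{label} (appears alongside: {', '.join(sibling_labels)})"

        terms = list(glossary.keys())
        descriptions = [f"{t}: {glossary[t]}" for t in terms]

        label_emb = model.encode(enriched, convert_to_tensor=True)
        desc_embs = model.encode(descriptions, convert_to_tensor=True)

        scores = util.cos_sim(label_emb, desc_embs)[0]   # similarity to each description
        return terms[int(scores.argmax())]

    # Example: match a cryptic column name against a small glossary.
    glossary = {
        "customer identifier": "A unique key assigned to each customer record.",
        "order date": "The calendar date on which an order was placed.",
    }
    print(match_label("cust_id", ["order_dt", "amount"], glossary))
    ```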

    A Study of the Text Information Contained in Web Images

    Get PDF
    Indexing and searching of web pages relies on analyzing text. Current technology cannot process the text embedded in the images of web pages efficiently or quickly enough. This poses a significant indexing problem, and an accessibility problem as well. To quantify the problem, we developed a software application that allows us to study the situation. We used this software to analyze a set of web pages representative of the current state of the Internet, and the results were analyzed and compared with previous studies. Note: this document originally contained additional material and/or software that can only be consulted at the Biblioteca de Ciència i Tecnologia.

    Analysis of the Correlation Between the Lexical Profile and Coh-Metrix 3.0 Text Easability and Readability Indices of the Korean CSAT From 1994–2022

    Get PDF
    The Korean College Scholastic Ability Test (CSAT) is a highly competitive standardized assessment that graduating high-school seniors complete in the hope of a good score that will improve their chances of admission to a university of their choice. The CSAT contains an English Section that scholars and educators alike have described as far too difficult for the official English language curriculum to serve as sufficient preparation. The test's lack of construct validity has prompted calls to revise it to better reflect the school curriculum so that it can serve the evaluative purpose for which it is intended. In recent years, automated text evaluation with the software Coh-Metrix 3.0 has allowed scholars to quantify dimensions of the CSAT English Section's text, such as cohesion and syntactic complexity, that contribute to its reading difficulty. Older research, conducted before this software entered the field, used word frequency counts in large corpora such as the British National Corpus (BNC) as a measure of word familiarity, which was thought to contribute directly to difficulty: as the proportion of low-frequency words in a text rises relative to high-frequency words, the word-knowledge burden of the text rises with it. Since the introduction of automated software-based tools like Coh-Metrix 3.0 and the Lexical Complexity Analyzer (LCA), these corpus-based methods have largely fallen by the wayside. In this paper, I maintain that, despite its lower sophistication, corpus-based lexical analysis can still produce uniquely meaningful findings because of the manual control it affords the researcher in calibrating the parameters of the text base and, most importantly, in selecting the ranges of word-family frequency best tailored to a text, rather than having those ranges or functions of frequency assigned automatically by software. This study reports correlations between the outputs of these two methodologies that both inform us about the validity of using Coh-Metrix 3.0 in CSAT studies and quantify the strength of the role of word frequency in causing the excessive difficulty of the CSAT English Section.
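    The corpus-based lexical profile defended here reduces, at its core, to measuring what share of a text's running words falls within the most frequent word families of a reference corpus: the lower that coverage, the heavier the word-knowledge burden. A minimal sketch of that computation is below; the tokenization, the use of single word forms rather than word families, and the toy frequency list are simplifying assumptions, not the study's procedure.

    ```python
    import re

    def band_coverage(text: str, frequency_ranked_words: list[str], band_size: int = 2000) -> float:
        """Share of a text's tokens covered by the `band_size` most frequent
        words of a reference corpus (a rough proxy for a BNC-style lexical profile)."""
        tokens = re.findall(r"[a-z]+", text.lower())
        band = set(frequency_ranked_words[:band_size])
        covered = sum(1 for t in tokens if t in band)
        return covered / len(tokens) if tokens else 0.0

    # Illustrative use: lower coverage means a higher proportion of
    # low-frequency words, i.e. a heavier word-knowledge burden.
    ranked = ["the", "of", "and", "to", "a", "in", "is", "that", "it", "was"]  # toy list
    passage = "The ontological ramifications of the text remained opaque."
    print(f"coverage: {band_coverage(passage, ranked, band_size=10):.2f}")
    ```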

    Japanese-English Parallel Corpora in the Classroom: Applications and Challenges

    Get PDF
    Computerized corpora have given linguists crucial new insights into language usage. With the help of software, it is possible to index the words that appear in a large collection of text and to analyze word usage and frequency. Data-Driven Learning looks at how students can benefit from their own direct use of corpora. While monolingual corpora have a steep learning curve and are often too difficult for language learners, a solution to this problem may be found in bilingual parallel corpora, which are built from authentically translated text. This article looks at Eijiro on the WEB and Weblio, two online Japanese-English parallel-corpus websites. Some guided practice exercises developed by the author for use in university-level English language writing classes in Japan are discussed, and some of the challenges in training students to use these resources to improve their English writing are presented.

    Mobile touch interfaces for the elderly

    Get PDF
    Elderly people are not averse to buying and using electronic gadgets. However, with certain devices there is a persistent complaint about the "buttons being too small". The arrival of mobile touch devices such as the iPhone and iPod Touch should therefore be able to circumvent that problem, because button size and arrangement are under software control. However, these devices have some accessibility issues, which are identified here. The accessibility issues stem from the one-size-fits-all concept. A solution is proposed that involves offering a range of interface styles. A new user gesture, called the shake, is proposed for switching between interface styles. A separate investigation is made into the different possibilities for free-text entry.
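    At its core, the proposal is a small piece of interaction logic: the device offers several interface styles differing mainly in control size, and a shake gesture cycles through them. A platform-neutral sketch of that state machine follows; the style names, button sizes, and the shake hook are illustrative assumptions, not the authors' implementation.

    ```python
    from dataclasses import dataclass

    @dataclass
    class InterfaceStyle:
        name: str
        button_points: int    # button size in typographic points (assumed values)
        buttons_per_row: int

    # A range of styles, from a dense default to a large-control layout
    # suited to users who find standard buttons too small.
    STYLES = [
        InterfaceStyle("standard", 44, 4),
        InterfaceStyle("large", 66, 3),
        InterfaceStyle("extra-large", 88, 2),
    ]

    class InterfaceStyleSwitcher:
        """Cycles through interface styles; meant to be driven by the
        device's shake-gesture callback."""

        def __init__(self, styles=STYLES):
            self.styles = styles
            self.index = 0

        @property
        def current(self) -> InterfaceStyle:
            return self.styles[self.index]

        def on_shake(self) -> InterfaceStyle:
            # The shake gesture advances to the next style, wrapping around.
            self.index = (self.index + 1) % len(self.styles)
            return self.current

    switcher = InterfaceStyleSwitcher()
    print(switcher.current.name)       # standard
    print(switcher.on_shake().name)    # large
    ```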