16 research outputs found

    Web Genre Detection

    Import 03/08/2012. The main aim of this thesis is to survey existing methods for the automatic detection of web page genres, and then to implement a selected method for several genres and test it. The thesis briefly defines web genres, explains how they differ from one another, and defines the features of web pages. It then describes previous experiments and some existing detection methods. To verify the accuracy of the selected method, a desktop application is designed that downloads web pages, parses them, and detects their genre. Finally, the detection accuracy of the implemented method is compared with that of other methods. 460 - Katedra informatiky (Department of Computer Science). výborn


    [EN] The purpose of this paper is to contribute to the analysis of cyberjournalistic documents by proposing a taxonomy to structure web-genre corpora. It takes into account the peculiarities of this field, the new genres, and their hybridization and complexity. Accordingly, the taxonomy presented in this paper does not follow a single theoretical framework; instead, it tries to gather the guidelines of various works that study online journalism and its genres. This theoretical flexibility is needed to produce a proposal that suits the current needs of the area. The paper also describes the main axes of the taxonomy, defines its communication unit, and notes the value and limitations of such a work. The result is a highly structured, document-oriented database, a tool that will enable users to understand current trends, to create new hybrids, and to detect the changes taking place in this field, which is widening the horizons of language use.
    Ezeiza Ramos, J.; Payá Ruiz, X.; Elordui Urkiza, U.; Epelde Pagola, I. (2011). TOWARDS A FACETED TAXONOMY TO STRUCTURE WEBGENRE CORPORA. Revista de Lingüística y Lenguas Aplicadas. 6:139-150. doi:10.4995/rlyla.2011.899

    Estimação da Usabilidade de Sites e-commerce Pelo Método da Máxima Verossimilhança

    This article investigates the performance of the Maximum Likelihood (ML) method in estimating the usability of e-commerce sites and compares the performance of the software packages BILOG-MG® and Excel® in estimating this usability. For this, we used real data from a study on the degree of usability of 361 e-commerce sites, to which 32 items were applied, calibrated with the unidimensional two-parameter logistic model (MLU2) of Item Response Theory (IRT). The ML estimation process was carried out in both BILOG-MG® and Excel®. The results showed that the ML method is deficient when there is a constant pattern of response, which may occur during the application of the first items of the questionnaire. However, the method performs well when the pattern of responses is not constant. Moreover, the procedure built in Excel® performed better than the conventional software BILOG-MG®. The item parameters also influence the ML estimation.
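
To illustrate the constant-response problem mentioned in the abstract above, the following is a minimal sketch of ML estimation under a two-parameter logistic model. The item parameters, the grid bounds, and all function names are assumptions made for illustration only; they are not taken from the article, from BILOG-MG®, or from the Excel® procedure it describes, and a simple grid search stands in for whatever optimizer those tools use.

```python
import math


def p_positive(theta, a, b):
    """2PL item response function: probability of a positive response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern at latent trait value theta."""
    total = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_positive(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total


def ml_estimate(responses, items, lo=-6.0, hi=6.0, steps=1201):
    """Grid-search ML estimate of theta on [lo, hi] (bounds are an assumption)."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))


# Hypothetical (discrimination a, difficulty b) pairs, not from the study.
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 0.5), (1.5, 1.0)]

mixed_estimate = ml_estimate([1, 1, 0, 0], items)     # interior maximum
constant_estimate = ml_estimate([1, 1, 1, 1], items)  # runs to the upper bound
```

With a mixed pattern the log-likelihood has a maximum inside the grid. With a constant pattern it keeps increasing as theta grows, so the search simply returns the upper bound, which is one way to see why a finite ML estimate does not exist in that case.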

    Genre analysis of structured e-mails for corpus profiling.

    This paper reports on our approach to the analysis of genre recognition using eye tracking. We focused on a collection of different types of e-mail that could represent different datasets, such as mailing lists for calls for papers, newsletters, etc. We found that genre analysis based on purpose, form and layout features is potentially effective for identifying the characteristics of these datasets, and we have highlighted some important new features of genres. The results from a pilot study showed a clear effect, with an interaction between the e-mail texts, the visual cues or features perceived, and the strategies employed for processing the texts. We found, in our small sample, that readers can determine the purpose and form of genres and that during this process some readers do skim the shape of the e-mails (form).

    An integrative semiotic methodology for IS research

    Semiotics studies the production, transmission and interpretation of meaning represented symbolically in signs and messages, primarily but not exclusively in language. For information systems (IS), the domain of semiosis consists of human and non-human interactions based on technologically mediated communication in the social, material and personal worlds. The paper argues that semiosis has immense bearing on the processes of communication central to the advanced information and communications technologies studied by IS scholars. Its use, separately or in mixed-methods approaches, enriches areas of central concern to the IS field and is particularly apt when researching internet-based development and applications, for example virtual worlds and social media. This paper presents a four-step structured methodology, informed by a central theoretical semiotic framework, to provide practical guidelines for operationalizing semiotics in IS research. Thus, using illustrative examples, the paper provides a step-by-step semiotic approach to research based on distinctive semiotic concepts and their relationships (producer, consumer, medium, code, message and content) and on how, at an integrating level, the personal, social and material worlds relate through sociation, embodiment and socio-materiality.


    Higher education requires intense information practices for knowledge diffusion, application, and innovation. Faculty assess and use a variety of documents when they teach their students. They make complex credibility assessments, and they use information with varying degrees of perceived credibility to achieve their teaching goals. Unfortunately, existing credibility research often stops once documents are selected. Our knowledge of the associations between credibility assessments and information use remains limited. Additionally, scholars agree professional tasks are associated with the genres of the documents used to accomplish these tasks. For example, instructional genres – including tutorials and lesson plans – are particularly useful to tasks related to educational pursuits. Despite the potential benefits that the identification of genres might provide in searching, navigation, and comprehension of information, researchers rarely exploit it to facilitate faculty’s document assessments and information use in support of their teaching. To solve the above problems, this study aimed at uncovering the associations between credibility assessments and information use tasks with respect to document genres in the context of university teaching. Specifically, it investigated whether there were associations: (1) between the criteria faculty employed to assess the credibility of the documents they used to support their teaching and the genres of these documents; (2) between the credibility criteria they employed to assess and the information use tasks they performed to use these documents; and (3) between the genres of these documents and the information use tasks they performed. Understanding the above associations could enhance our knowledge of the roles of document genres in making credibility assessments and information use decisions in the context of university teaching. This study took a mixed-method, bottom-up approach to uncovering the above associations. 
It first employed qualitative citation analysis to identify the genres of the documents faculty used in their courses, based on the citations in their teaching materials (e.g., syllabi, lecture slides, lab notes, and links to resources). Customized genre repertoires, in Excel format, were created to detail the contexts in which different genres were used. Semi-structured interviews were then conducted to collect data about the courses included in this study, the general criteria faculty employed to select documents for their courses, the tasks they performed to use the information in the genres this study selected for in-depth interviews, and the criteria they employed to assess the selected genres. Interviews were fully transcribed for qualitative content analysis. The results of this study indicate that the criteria faculty employed served as function enablers that bridged the selected genres and the information use tasks they performed to use these genres. Credibility was one of the function enablers that enabled faculty to use the selected genres to perform different tasks. It played different roles in different tasks. It played a leading role in teaching tasks that developed students’ advanced learning skills and helped students to continue their learning. It also played a leading role in information use tasks that involved subject experts, professional organizations, and diverse genres originating from heterogeneous sources. The results also indicate that the information use tasks faculty performed served as inclusion and exclusion criteria for genres. The information use tasks determined the information characteristics of genres that mattered in faculty’s task performance.
This study shed new light on existing knowledge about genre-task associations by: (1) exploring these associations in the context of university teaching; (2) explicating these associations through the perception of credibility; and (3) adding the criterion-genre and criterion-task associations to complement these associations. This study also enhanced our understanding of credibility in the context of university teaching. Finally, this study made several methodological contributions, including: (1) transforming citation analysis from bibliographic records to research tools that engaged participants and ensured the accuracy of data; (2) transforming citation analysis from bibliographic records to customized genre repertoires that preserved the contexts of information use; and (3) developing rules to consistently select genres for investigating task-genre associations across disciplinary boundaries.

    Proceedings of the 9th Dutch-Belgian Information Retrieval Workshop
