
    XLIndy: interactive recognition and information extraction in spreadsheets

    Over the years, spreadsheets have established their presence in many domains, including business, government, and science. However, challenges arise because spreadsheets are partially structured and carry implicit (visual and textual) information. This creates a bottleneck for automatic analysis and extraction of information. Therefore, we present XLIndy, a Microsoft Excel add-in with a machine learning back-end, written in Python. It showcases our novel methods for layout inference and table recognition in spreadsheets. For a selected task and method, users can visually inspect the results, change configurations, and compare different runs. This enables iterative fine-tuning. Additionally, users can manually revise the predicted layout and tables, and subsequently save them as annotations. These annotations are used to measure performance and (re-)train classifiers. Finally, data in the recognized tables can be extracted for further processing. XLIndy supports several standard formats, such as CSV and JSON. Peer reviewed. Postprint (author's final draft).
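The final extraction step mentioned in this abstract, writing a recognized table out as CSV or JSON, can be sketched with the Python standard library alone. This is only an illustration and not XLIndy's actual code: the function names `table_to_csv` and `table_to_json` and the sample table are hypothetical.

```python
import csv
import io
import json


def table_to_csv(header, rows):
    """Serialize one recognized table to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def table_to_json(header, rows):
    """Serialize one recognized table to a JSON array of row objects."""
    records = [dict(zip(header, row)) for row in rows]
    return json.dumps(records)


# Hypothetical table, standing in for a cell range recognized in a sheet.
header = ["region", "units"]
rows = [["North", 12], ["South", 7]]

csv_text = table_to_csv(header, rows)
json_text = table_to_json(header, rows)
```

CSV keeps the table's row and column order as plain text, while the JSON form pairs each value with its column name, which tends to be easier for downstream programs to consume.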

    Profile transformation in mobile technology based educational systems : a thesis presented in partial fulfillment of the requirements for the degree of Master of Information Science in Information Systems at Massey University, Palmerston North, New Zealand

    To meet the learning needs of various types of students, computer-aided education systems try to include new methods that provide personalized education to every student. Since the early 1970s, many adaptive educational systems have been created to provide training on a variety of subjects. Combined with the Internet, adaptive educational systems have become web-based and even more popular. Recently, the development of mobile technology has made web-based adaptive educational systems accessible through mobile phones, so students should also be able to receive adaptive educational content on mobile phones. This research project investigated the possible differences in students' preferences between the Personal Computer (PC) and the mobile phone, and then proposed a student profile transformation framework to address such differences. The project conducted two surveys on student profile transformation between PC and mobile phone. A demo web-based educational system that could be accessed from both PC and mobile phone was also developed so that survey participants could give more realistic and precise responses. Based on the Felder-Silverman Learning Style Theory (Felder, 1993; Felder & Silverman, 1988) and the results of the surveys, this thesis proposes a student profile template and a student profile transformation framework, both of which fully consider the influence of device capabilities and locations on students' preferences on mobile phones. Furthermore, the proposed framework integrates a solution for unsupported preferences and preference conflicts. With the proposed template and framework implemented, changes in students' preferences between PC and mobile phone are updated automatically according to device capabilities and locations, and students can then receive adaptive educational content that meets their updated preferences.

    HTMLPhish: Enabling Phishing Web Page Detection by Applying Deep Learning Techniques on HTML Analysis

    Recently, the development and implementation of phishing attacks have come to require little technical skill and cost. This has led to an ever-growing number of phishing attacks on the World Wide Web. Consequently, proactive techniques to fight phishing attacks have become extremely necessary. In this paper, we propose HTMLPhish, a deep learning based, data-driven, end-to-end automatic phishing web page classification approach. Specifically, HTMLPhish receives the content of the HTML document of a web page and employs Convolutional Neural Networks (CNNs) to learn the semantic dependencies in the textual contents of the HTML. The CNNs learn appropriate feature representations from the HTML document embeddings without extensive manual feature engineering. Furthermore, our proposed approach of concatenating the word and character embeddings allows our model to manage new features and ensures easy extrapolation to test data. We conduct comprehensive experiments on a dataset of more than 50,000 HTML documents whose ratio of phishing to benign web pages reflects the real-world distribution; HTMLPhish achieves over 93% accuracy and true positive rate. Also, HTMLPhish is a completely language-independent and client-side strategy, which can therefore conduct web page phishing detection regardless of the textual language.
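The idea of feeding a CNN with concatenated word and character embeddings of an HTML document can be illustrated with a minimal pure-Python sketch. It is not the HTMLPhish implementation. The tokenizer regex, the embedding dimension, the random (untrained) embedding tables, and the choice to concatenate along the sequence axis are all assumptions made here for illustration, as are the helper names.

```python
import random
import re


def word_tokens(html):
    # Assumption: words are alphanumeric runs; each punctuation mark is a token.
    return re.findall(r"\w+|[^\w\s]", html)


def char_tokens(html):
    return list(html)


def build_vocab(tokens):
    # Id 0 is reserved for tokens not seen when the vocabulary was built.
    vocab = {"<unk>": 0}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def random_table(size, dim, rng):
    # Stand-in for a trained embedding matrix: one random vector per id.
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(size)]


def embed(tokens, vocab, table):
    # Look up one vector per token; unknown tokens map to the "<unk>" row.
    return [table[vocab.get(tok, 0)] for tok in tokens]


rng = random.Random(0)
dim = 4
html = '<a href="x">Log in</a>'

words = word_tokens(html)
chars = char_tokens(html)
word_vocab = build_vocab(words)
char_vocab = build_vocab(chars)
word_table = random_table(len(word_vocab), dim, rng)
char_table = random_table(len(char_vocab), dim, rng)

# Concatenate the character-level and word-level embedded sequences into
# one matrix (a list of dim-sized rows) that a CNN could convolve over.
features = embed(chars, char_vocab, char_table) + embed(words, word_vocab, word_table)
```

In this sketch, a word that was never seen before falls back to the `<unk>` row at the word level, but its individual characters are still likely to be in the character vocabulary. That is one plausible reading of how character embeddings help a model cope with new features at test time.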

    Granular approach to adaptivity in problem-based learning

    The constructivist approach to learning has been around for quite some time. Constructivist theory has resulted in the development of a wide variety of learning environments; the problem-based learning (PBL) environment is one of the most suitable and most popular areas in which the theory is implemented. PBL is an attractive approach for fostering learners' critical problem-solving and self-directed learning skills. However, it is difficult to implement effective PBL environments. A majority of existing PBL environments suffer from the fact that students easily get inundated by the fine granularity of the problems and lose focus on the overall aims of the learning process. This project has introduced student adaptivity technology into PBL environments to improve the effectiveness and efficiency of the learning process. To demonstrate the idea of PBL with student adaptivity, a web-based prototype is implemented for Process Costing, within the field of Accounting. Based on the architecture of web-based intelligent educational systems, a problem base module is introduced. The basic architecture of the system is a typical three-tier, client-server structure. The client tier has the presentation interfaces, which are implemented as HTML frames and run in a web browser. The application programs that perform adaptation, developed in PHP, reside in the middle tier and communicate directly with the back-end databases (the problem base and the knowledge base), which form the third tier. The web server, as the communication channel, also resides in the middle tier. With the system, students work on real-world costing calculation problems, and the system evaluates students' results on the problems to provide adaptation to the students. In summary, this project has successfully introduced student adaptivity into the PBL environment. The strategies used in this thesis can be applied to pure PBL educational systems to improve their adaptation capability.

    Slovenian Virtual Gallery on the Internet

    The Slovenian Virtual Gallery (SVG) is a World Wide Web based multimedia collection of pictures, text, clickable maps and video clips presenting Slovenian fine art from the Gothic period up to the present day. Part of SVG is a virtual gallery space where pictures hang on the walls, while another part is devoted to current exhibitions of selected Slovenian art galleries. The first version of this application was developed in the first half of 1995. It was based on a file system for storing all the data and custom-developed software for search, automatic generation of HTML documents, scaling of pictures and remote management of the system. Due to the fast development of Web-related tools, a new version of SVG was developed in 1997, based on object-oriented relational database server technology. Both implementations are presented and compared in this article, along with issues related to the transition between the two versions. At the end, we also discuss some extensions to SVG. We present the GUI (Graphical User Interface) developed specially for presenting current exhibitions over the Web, which is based on the GlobalView panoramic navigation extension to the developed Internet Video Server (IVS). And since SVG operates with a lot of image data, we also address the problem of image content retrieval.

    Web-course search engine : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Computer Science at Massey University

    The World Wide Web is an amazing place on which people's lives rely more and more. The young generation in particular spends a significant amount of its play and study time on the Internet. Many tools have been developed to help educational users find educational resources. These tools include various search engines, Web directories and educational domain gateways. Nevertheless, these systems have many weaknesses that make them unsuitable for the specific search needs of learners. The research presented in this thesis describes the development of the Web-course search engine, a friendly, efficient and accurate helper that lets learners find what they want in the vast Internet ocean. The most attractive feature of this system is that it uses one universal language, which lets the searchers and the resources "communicate" with each other. Learners can then find the Web-based educational resources that best fit their needs, and course providers can provide all necessary information about their courseware. This universal language is a widely accepted Metadata standard. Following the Metadata standard, the system collects exact information about educational resources, provides adequate search parameters and returns evaluative results. By using the Web-course search engine, learners and other educational users are able to find useful, valuable and related educational resources more effectively and efficiently. As a result of this project, some suggestions for improving the search mechanism of the World Wide Web have been brought forward for future research.

    The WEB Book experiments in electronic textbook design

    This paper describes a series of three evaluations of electronic textbooks on the Web, which focused on assessing how appearance and design can affect users' sense of engagement and directness with the material. The EBONI Project's methodology for evaluating electronic textbooks is outlined and each experiment is described, together with an analysis of results. Finally, some recommendations for successful design are suggested, based on an analysis of all experimental data. These recommendations underline the main findings of the evaluations: that users want some features of paper books to be preserved in the electronic medium, while also preferring electronic text to be written in a scannable style.