14 research outputs found

    DOM-based Content Extraction of HTML Documents

    Web pages often contain clutter around the body of the article, as well as distracting features that take away from the information the user is pursuing. This can range from pop-up ads to flashy banners to unnecessary images and links scattered around the screen. Extraction of 'useful and relevant' content from web pages has many applications, ranging from lightweight environments such as cell phone and PDA browsing, to speech rendering for the visually impaired, to text summarization. Most approaches to removing the clutter or making the content more readable involve either changing the size of the font or simply removing certain HTML-denoted components like images, thus taking away from the webpage's inherent look and feel. Unlike Content Reformatting, which aims to reproduce the entire webpage in a more convenient form, our solution directly addresses Content Extraction. We have developed a framework that employs an easily extensible set of techniques that incorporate the advantages of previous work on content extraction while limiting the disadvantages. Our key insight is to work with the Document Object Model tree (after parsing and correcting the HTML), rather than with raw HTML markup. We have implemented our approach in a publicly available Web proxy that anyone can use to extract content from HTML web pages for their own purposes.
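
    The idea of filtering on a parsed tree rather than on raw markup can be illustrated with a minimal sketch. It uses only Python's standard `html.parser`; the `Node` and `TreeBuilder` classes, the `extract_content` function, and the `CLUTTER_TAGS` set are all hypothetical names chosen for illustration, and the tag list is an assumption, not the filter set used by the authors' proxy.

```python
from html.parser import HTMLParser

# Tags treated as clutter in this sketch (an assumption, not the paper's filters).
CLUTTER_TAGS = {"script", "style", "nav", "aside", "footer", "iframe", "img"}
# Void elements have no closing tag, so they are never made the current node.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class Node:
    """A minimal DOM-like node: a tag name plus ordered children."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # child Node objects and text strings, in document order


class TreeBuilder(HTMLParser):
    """Builds a small tree from HTML, tolerating stray closing tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag; ignore unmatched closers.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.children.append(text)


def extract_text(node):
    """Collects text in document order, skipping whole clutter subtrees."""
    parts = []
    for child in node.children:
        if isinstance(child, str):
            parts.append(child)
        elif child.tag not in CLUTTER_TAGS:
            parts.extend(extract_text(child))
    return parts


def extract_content(html):
    """Parses the HTML into a tree and returns the non-clutter text."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return " ".join(extract_text(builder.root))
```

    Because pruning happens on subtrees, dropping a `nav` element also drops every link nested inside it, which is harder to do reliably with pattern matching on raw markup.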

    User requirements for location based services : an analysis on the basis of literature

    The high global penetration of mobile telephony provides a strong basis for the development and diffusion of mobile business applications. Location based services in particular, i.e. mobile services that consider the user’s current location to add value to the service provided, are seen as having high potential to become a major market success. Nevertheless, the development of mobile business and location based services has so far lagged behind expert expectations. One of the reasons for this disappointing development is the failure of application developers to center their efforts on potential users and their needs and demands. This paper therefore reviews the existing literature on user requirements in mobile business and location based services. A definition and characterization of location based services is given, and a framework to categorize existing location based services is developed. Additionally, usefulness and usability are put in concrete terms, as they are identified as the main determinants of end-user acceptance of location based services. Security concerns of potential users of location based services are analyzed, and further limitations on the diffusion of location based services are discussed.

    Extraction Contextuelle de Concepts Ontologiques pour le Web Sémantique

    Much research on annotation, data integration, web services, and related topics relies on ontologies. The development of these applications depends on the conceptual richness of the ontologies. In this article, we present the extraction of ontological concepts from HTML documents. To improve this process, we propose an unsupervised hierarchical clustering algorithm called "Extraction de Concepts Ontologiques" (ECO), which applies the K-means partitioning algorithm incrementally and is guided by a structural context. This context exploits the HTML structure and the position of each word to optimize the weighting of each term and the selection of the semantically closest co-occurrent. Guided by this context, our algorithm follows an incremental process that successively refines the context of each word. It also offers a choice between fully automatic and interactive execution. We evaluated our proposal on a French-language corpus from the tourism domain. The results showed that our algorithm improves the conceptual quality and the relevance of the extracted ontological concepts.

    CaSePer: An efficient model for personalized web page change detection based on segmentation

    Users who visit a web page repeatedly at frequent intervals are more interested in the recent changes to the page than in its entire contents. Because web pages are increasingly dynamic, it is difficult for the user to identify the changes manually. This paper proposes an enhanced model for detecting changes in pages, called CaSePer (Change detection based on Segmentation with Personalization). Change detection is made fine-grained by introducing web page segmentation, and it is made efficient by performing it as a dual-step process. The proposed method reduces the complexity of change detection by focusing only on the segments in which changes have occurred. User-specific personalized change detection is also incorporated in the proposed model. The model is validated with a prototype implementation. The experiments conducted on the prototype confirm a 77.8% improvement and a 97.45% accuracy rate.
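
    One way to picture segment-focused change detection is the sketch below. The abstract does not spell out what the two steps are, so the split used here (a cheap digest comparison per segment, then a line diff only inside segments whose digests differ) is an assumption, as are the names `detect_changes`, `segment_digest`, and the `watched` parameter standing in for personalization. Segmentation itself is assumed to have been done already, with each page given as a dict from segment id to text.

```python
import difflib
import hashlib


def segment_digest(text):
    """Returns a fixed-size fingerprint of a segment's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_changes(old_segments, new_segments, watched=None):
    """Compares two versions of a segmented page.

    old_segments, new_segments: dicts mapping segment id -> segment text.
    watched: optional set of segment ids the user cares about (hypothetical
             stand-in for personalization); None means all segments.
    Returns a dict mapping each changed segment id to its added/removed lines.
    """
    changes = {}
    for seg_id in sorted(set(old_segments) | set(new_segments)):
        if watched is not None and seg_id not in watched:
            continue
        old = old_segments.get(seg_id, "")
        new = new_segments.get(seg_id, "")
        # Step 1 (assumed): cheap digest comparison to skip unchanged segments.
        if segment_digest(old) == segment_digest(new):
            continue
        # Step 2 (assumed): detailed line diff only inside the changed segment.
        diff = difflib.ndiff(old.splitlines(), new.splitlines())
        changes[seg_id] = [line for line in diff if line[0] in "+-"]
    return changes
```

    With this layout, the cost of the detailed diff scales with the size of the changed segments rather than with the whole page.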

    An Architectural Design of a Conference System for Mobile Terminals

    Demand for Internet services in the mobile environment is increasing rapidly with the growth of the Internet. Nevertheless, the technologies for these services are still in their early stages, and only a few simple services are provided compared with the diverse services on the wired Internet. Mobile devices are constrained in so many ways that the technologies must differ from those for desktop systems. A small display, the lack of a keyboard, and the low bandwidth of the mobile network must be considered when developing Internet services for the mobile environment. Internet technologies such as mobile IP, WAP, WML, VoiceXML, and mobile browsers have appeared for mobile Internet services. In this paper, the mobile Internet technologies are adapted to the audio teleconference service. Because this is one of the most important Internet services, and mobile devices usually have telephone functionality, the service is expected to be the killer application of mobile Internet services. Technologies including WML, VoiceXML, and H.323 are tailored appropriately, and an architecture for the service is proposed. The architectural model is implemented in a simulated mobile environment. The mobile audio teleconference service, together with WWW and ftp services, is shown to be feasible with the architecture and tailored technologies proposed in this paper.
    Contents: Abstract; Chapter 1 Introduction; Chapter 2 Review of Underlying Technologies; 2.1 VoiceXML (Voice eXtensible Markup Language); 2.2 WML (Wireless Markup Language); 2.3 H.323 (standard for conferencing systems); Chapter 3 Conference System for Mobile Terminals; 3.1 Assumptions for the conference system architecture; 3.2 Scenarios for the multiparty conference service; 3.3 Problems with the underlying technologies and solutions; 3.4 Basic structure of the proposed model; 3.4.1 Basic structure for the multiparty conference system; 3.4.2 Simple voice conference connection; 3.4.3 Data services during a voice conference; 3.5 Operation of the voice conference system; Chapter 4 Experiments; 4.1 Experimental environment; 4.2 Menu output via VoiceXML and WML; 4.3 Operation for the voice conference service; Chapter 5 Conclusion; References

    Sistema di adattamento automatico di applicazioni interattive desktop per dispositivi mobili

    This thesis presents a system, based on a proxy server, for automatically adapting web pages for access from mobile devices. The system carries out the transformation by taking into account the logical description of the pages and the cost, in terms of space occupied, of the elements that make up the user interface (text, images, buttons, etc.).

    Internet on mobiles: evolution of usability and user experience

    The mobile Internet is no longer a new phenomenon; the first mobile devices supporting web access were introduced over 10 years ago. During the past ten years technology and business infrastructure have evolved, and the number of mobile Internet users has increased all over the world. Service user interface, technology and business infrastructure have built a framework for service adaptation: they can act as enablers or as barriers. Users evaluate how the new technology adds value to their life based on multiple factors. This dissertation focuses on human-computer interaction research and practices. The overall goal of my research has been to improve the usability and the user experience of mobile Internet services. My research has sought answers to questions relevant to the service development process. The questions have varied over the years, the main question being: how to design and create mobile Internet services that people can use and want to use? I have sought answers mostly from a human factors perspective, but have also taken elements from technology and business infrastructure into consideration. In order to answer the questions raised in service development projects, we have investigated mobile Internet services in the laboratory and in the field. My research has been conducted in various countries on three continents: Asia, Europe and North America. These studies revealed differences in mobile Internet use between countries and between user groups. The studies in this dissertation were conducted between 1998 and 2007 and show how questions and research methods have evolved over time. Good service creation requires that all three factors (technology, business infrastructure and users) are taken into consideration. When using knowledge about users in decision making, it is important to understand that the different phases of the service development cycle require different kinds of information about users.
    It is not enough to know about the users; the knowledge about users has to be transferred into decisions. The service has to be easy to use so that people can use it. This is related to usability. Usability is a very important factor in service adoption, but it is not enough. The service has to have relevant content from the user's perspective. The content is the reason why people want to use the service. In addition to the content and the ease of use, people evaluate the quality of the service based on many other aspects, for example the cost, the availability and the reliability of the system. A good service is worth trying and, after the first experience, it is worth using. These aspects are considered to influence the 'user experience' of the system. In this work I use lexical analysis to evaluate how the words "usability" and "user experience" have been used in mobile HCI conference papers during the past 10 years. The use of both words has increased during the period and reflects the evolution of research questions and methodology over time.