40 research outputs found

    A Pairwise Dataset for GUI Conversion and Retrieval between Android Phones and Tablets

    With the popularity of smartphones and tablets, users have become accustomed to using different devices for different tasks, such as playing games on their phones and watching movies on their tablets. To reach this market, an app is often made available on both smartphones and tablets. However, although an app has similar graphical user interfaces (GUIs) and functionality on phone and tablet, developers typically start from scratch when building a tablet-compatible version of their app, which drives up development costs and wastes existing design resources. Researchers are attempting to employ deep learning in automated GUI development to enhance developers' productivity. Deep learning models rely heavily on high-quality datasets. Several GUI page datasets for phones are publicly available, but none provides pairwise GUIs between phones and tablets. This poses a significant barrier to applying deep learning to automated GUI development. In this paper, we collect and make public the Papt dataset, a pairwise dataset for GUI conversion and retrieval between Android phones and tablets. The dataset contains 10,035 phone-tablet GUI page pairs from 5,593 phone-tablet app pairs. We describe our approach to collecting the pairwise data, provide a statistical analysis of the dataset, and discuss its advantages over other current datasets. Through preliminary experiments on this dataset, we analyse the present challenges of utilising deep learning in automated GUI development and find that our dataset can assist the application of some deep learning models to automatic GUI development tasks. Comment: 10 pages, 9 figures
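
    For illustration, here is a minimal sketch of how such phone-tablet GUI page pairs might be organised and loaded for a conversion or retrieval model. The directory layout and field names are assumptions made for this example, not the actual Papt format.

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class GuiPair:
    """One phone-tablet GUI page pair (hypothetical record layout)."""
    app_id: str
    phone_screenshot: Path   # screenshot of the phone GUI page
    tablet_screenshot: Path  # screenshot of the matching tablet GUI page

def load_pairs(root: str) -> list[GuiPair]:
    """Assume each app folder holds matched phone/<name>.png and tablet/<name>.png files."""
    pairs = []
    for app_dir in Path(root).iterdir():
        if not app_dir.is_dir():
            continue
        for phone_png in sorted((app_dir / "phone").glob("*.png")):
            tablet_png = app_dir / "tablet" / phone_png.name
            if tablet_png.exists():
                pairs.append(GuiPair(app_dir.name, phone_png, tablet_png))
    return pairs
```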

    Improving the Efficiency of Mobile User Interface Development through Semantic and Data-Driven Analyses

    With millions of mobile applications available from Google Play and Apple's App Store, the smartphone has become a necessity in daily life. People access a wide variety of services through mobile applications, with user interfaces (UIs) acting as an important proxy. A well-designed UI makes an application easy, practical, and efficient to use. However, due to rapid application iteration and a shortage of UI designers, developers must design and implement UIs in a short time. As a result, they may overlook or compromise important usability and accessibility factors while developing the user interfaces of mobile applications. Efficient and useful tools are therefore needed to improve the efficiency of user interface development. In this thesis, I propose three techniques to improve the efficiency of designing and developing user interfaces through semantic and data-driven analyses. First, I propose a UI design search engine that helps designers and developers quickly create trendy and practical UI designs by exposing them to UI designs from real applications. I collected a large-scale UI design dataset by automatically exploring UIs from top-downloaded Android applications, and designed an image autoencoder-based UI design engine to enable finer-grained UI design search. Second, while studying how real UIs are implemented, I found that existing applications have a severe accessibility issue: missing labels for image-based buttons. This issue hinders blind users from accessing key functionality on UIs. Because blind users rely on screen readers to read on-screen content, developers need to set appropriate labels for image-based buttons. I therefore propose LabelDroid, which automatically generates labels (i.e., content descriptions) for image-based buttons while developers implement UIs. Finally, since the above techniques all rely on view hierarchy information, which contains the bounds and types of the contained elements, it is essential to generalize them to a broader scope; for example, UIs on design-sharing platforms have no metadata about their elements. To this end, I conducted the first large-scale empirical study evaluating existing object detection methods for detecting elements in UIs. Drawing on the unique characteristics of UI elements and UIs, I propose a hybrid method that boosts the accuracy and precision of detecting elements on user interfaces. Such a fundamental method can benefit many downstream applications, such as UI design search, UI code generation, and UI testing. In conclusion, I propose three techniques to enhance the efficiency of designing and developing user interfaces for mobile applications through semantic and data-driven analyses. These methods could readily generalize to a broader scope, such as the user interfaces of desktop apps and websites. I expect my proposed techniques and this understanding of user interfaces to facilitate the following research
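
    As a rough sketch of the autoencoder-based search idea described above, the snippet below encodes UI screenshots into compact embeddings and retrieves the most similar stored designs by cosine similarity. The network shape, image size, and embedding dimension are illustrative assumptions, not the thesis's actual model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UIEncoder(nn.Module):
    """Toy convolutional encoder mapping a 3x128x128 screenshot to a 256-d embedding."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # -> 64x64
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # -> 32x32
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(), # -> 16x16
        )
        self.fc = nn.Linear(128 * 16 * 16, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # L2-normalised embedding so dot product equals cosine similarity
        return F.normalize(self.fc(self.conv(x).flatten(1)), dim=1)

def search(query: torch.Tensor, gallery: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Return indices of the k gallery embeddings most similar to the query (cosine)."""
    return (gallery @ query).topk(k).indices

# Usage: embed a gallery of designs and a query screenshot, then retrieve look-alikes.
encoder = UIEncoder().eval()
with torch.no_grad():
    gallery = encoder(torch.rand(100, 3, 128, 128))  # placeholder screenshots
    query = encoder(torch.rand(1, 3, 128, 128))[0]
print(search(query, gallery))
```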

    Automating the Identification of UI Design Patterns from Wireframes using Machine Learning

    Undergraduate thesis (TCC) - Universidade Federal de Santa Catarina, Centro Tecnológico, Sistemas de Informação. The popularization of smartphones is changing people's daily lives in many ways, putting internet access in the palm of the hand, anywhere. Accordingly, the demand for applications for these devices has been growing along with this popularization. During the development of an application, it is necessary to design and build the graphical interfaces through which the user interacts with the implemented functions, sending and receiving information. This step is important to an application's success, but performed manually it demands considerable effort. In this context, this work presents a prototype that supports improving the design of Android graphical interfaces created with App Inventor, using UI design patterns. The work consists of developing a model that classifies wireframes against a group of UI design patterns common to most applications. The model, built with ResNet50, achieved 90% accuracy, recall, and F1-score when classifying wireframes into 7 sets of UI design patterns. These predictions can serve as a basis for suggesting improvements to UI designs under development. Such suggestions could help ground the interface in recognized concepts and guidelines, bringing familiarity and robustness, thus contributing to the design quality of the interfaces of apps created with App Inventor and to the student's learning about UI in the educational application.
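
    A minimal PyTorch sketch of the kind of classifier the work describes: an ImageNet-pretrained ResNet50 whose final layer is replaced with a 7-way head for the UI design pattern groups. The image size, batch, and training details are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_PATTERNS = 7  # the seven UI design pattern groups mentioned above

# Start from an ImageNet-pretrained ResNet50 and swap in a 7-way classification head.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_PATTERNS)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on placeholder wireframe images and labels.
images = torch.rand(8, 3, 224, 224)          # a batch of rendered wireframes
labels = torch.randint(0, NUM_PATTERNS, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(loss.item())
```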

    A Survey for Graphic Design Intelligence

    Graphic design is an effective language for visual communication. Using complex compositions of visual elements (e.g., shape, color, font) guided by design principles and aesthetics, design helps produce more visually appealing content. Creating a harmonious design requires carefully selecting and combining different visual elements, which can be challenging and time-consuming. To expedite the design process, emerging AI techniques have been proposed to automate tedious tasks and facilitate human creativity. However, most current works focus only on specific tasks targeting particular scenarios, without a high-level abstraction. This paper aims to provide a systematic overview of graphic design intelligence and to summarize the literature under a taxonomy of representation, understanding, and generation. Specifically, we consider related works for individual visual elements as well as the overall design composition. Furthermore, we highlight some potential directions for future exploration. Comment: 10 pages, 2 figures

    Describing UI Screenshots in Natural Language

    Funding Information: We acknowledge the computational resources provided by the Aalto Science-IT project. We thank Homayun Afrabandpey, Daniel Buschek, Jussi Jokinen, and Jörg Tiedemann for reviewing an earlier draft of this article. This work has been supported by the Horizon 2020 FET program of the European Union through the ERA-NET Cofund funding (grant CHIST-ERA-20-BCI-001), the European Innovation Council Pathfinder program (SYMBIOTIK project), and the Academy of Finland (grants 291556, 318559, 310947). Publisher Copyright: © 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM. Being able to describe any user interface (UI) screenshot in natural language can promote understanding of the main purpose of the UI, yet this currently cannot be accomplished with state-of-the-art captioning systems. We introduce XUI, a novel method inspired by the global precedence effect that creates informative descriptions of UIs, starting with an overview and then providing fine-grained descriptions of the most salient elements. XUI builds upon computational models for topic classification, visual saliency prediction, and natural language generation (NLG). XUI provides descriptions at up to three granularity levels that, together, describe what is in the interface and what the user can do with it. We found that XUI descriptions are highly readable, are perceived to accurately describe the UI, and score similarly to human-generated UI descriptions. XUI is available as open-source software. Peer reviewed
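
    A minimal sketch of the coarse-to-fine idea behind such descriptions: state an overview first (global precedence), then describe the most salient elements. The topic label, saliency scores, and sentence templates below are simple stand-ins; the actual system relies on learned models for topic classification, saliency prediction, and NLG.

```python
from dataclasses import dataclass

@dataclass
class Element:
    name: str        # e.g. "search bar"
    action: str      # e.g. "search for products"
    saliency: float  # assumed output of a saliency predictor; higher = more salient

def describe_ui(topic: str, elements: list[Element], levels: int = 3) -> list[str]:
    """Overview sentence first, then fine-grained sentences for the most salient elements."""
    description = [f"This screen appears to be a {topic} page."]
    ranked = sorted(elements, key=lambda e: e.saliency, reverse=True)
    for element in ranked[: levels - 1]:
        description.append(f"It contains a {element.name} that lets you {element.action}.")
    return description

# Usage with hand-written placeholder inputs.
print(describe_ui("shopping", [
    Element("search bar", "search for products", 0.9),
    Element("cart button", "review selected items", 0.7),
]))
```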

    Deep deformable models for 3D human body

    Deformable models are powerful tools for modelling the 3D shape variation of a class of objects. However, the application and performance of deformable models for the human body are currently restricted by limitations in existing 3D datasets, annotations, and the model formulation itself. In this thesis, we address these issues with the following contributions in 3D human body modelling, monocular reconstruction, and data collection/annotation. Firstly, we propose a deep mesh convolutional network-based deformable model for the 3D human body and demonstrate its merit in the task of monocular human mesh recovery. While outperforming current state-of-the-art models in mesh recovery accuracy, the model is also lightweight and more flexible, as it can be trained end-to-end and fine-tuned for a specific task. The second contribution is a bone-level skinned model of the 3D human mesh, in which bone modelling and identity-specific variation modelling are decoupled. This formulation allows mesh convolutional networks to capture detailed identity-specific variations, while pose variations are explicitly controlled and modelled through linear blend skinning with built-in motion constraints. It not only significantly increases the accuracy of 3D human mesh reconstruction, but also facilitates accurate in-the-wild character animation and retargeting. Finally, we present a large-scale dataset of over 1.3 million 3D human body scans in daily clothing. The dataset contains over 12 hours of 4D recordings at 30 FPS, consisting of 7,566 dynamic sequences of 3D meshes from 4,205 subjects. We propose a fast and accurate sequence registration pipeline that enables markerless motion capture and automatic dense annotation of the raw scans, leading to automatic synthetic image and annotation generation that boosts performance on tasks such as monocular human mesh reconstruction. Open Access
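
    For reference, the standard linear blend skinning (LBS) formulation that such a bone-level model builds on: each posed vertex is a convex combination of the rest-pose vertex transformed by the bones that influence it. The identity-specific corrections learned by the mesh convolutional network are not shown here.

```latex
% v'_i: posed position of vertex i, \bar{v}_i: rest-pose position,
% T_j: rigid transform of bone j, w_{ij}: skinning weight of bone j on vertex i.
\[
    v'_i \;=\; \sum_{j=1}^{J} w_{ij} \, T_j \, \bar{v}_i ,
    \qquad w_{ij} \ge 0 , \qquad \sum_{j=1}^{J} w_{ij} = 1 .
\]
```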

    ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots

    We present a new task and dataset, ScreenQA, for screen content understanding via question answering. Existing screen datasets focus either on structure and component-level understanding, or on much higher-level composite tasks such as navigation and task completion. We attempt to bridge the gap between the two by annotating 80,000+ question-answer pairs over the RICO dataset, in the hope of benchmarking screen reading comprehension capacity
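
    As an illustration of what one such screen question-answer annotation might look like when loaded for training, here is a small hypothetical record; the field names are assumptions for the example, not the released ScreenQA schema.

```python
import json

# Hypothetical record: a RICO screen id, a question, and answers grounded on the screen.
record = {
    "screen_id": "12345",                      # id of the RICO screenshot being asked about
    "question": "What is the departure city?",
    "answers": [
        {"text": "San Francisco", "bounds": [120, 340, 480, 392]},  # answer text and its on-screen box
    ],
}
print(json.dumps(record, indent=2))
```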