167 research outputs found

    Improving routine operation management

    This research project investigates daily operations management in an organisation and offers suggestions for improving its business, on the premise that better daily operations tend to improve the business overall. A SWOT analysis was conducted to identify errors and poorly performing areas in operations. The 3C's theory, which covers competitors, customers and climate, was used to strengthen the results and the research, because a business that wants to improve must compete with its competitors and understand its customers. The research relies on personal observation and informal discussions with co-workers and managers, using a qualitative research method. To conclude, the business can improve by cultivating work efficiency and making maximum use of its equipment.

    Personalized Cinemagraphs using Semantic Understanding and Collaborative Learning

    Cinemagraphs are a compelling way to convey dynamic aspects of a scene. In these media, dynamic and still elements are juxtaposed to create an artistic and narrative experience. Creating a high-quality, aesthetically pleasing cinemagraph requires isolating objects in a semantically meaningful way and then selecting good start times and looping periods for those objects to minimize visual artifacts (such as tearing). To achieve this, we present a new technique that uses object recognition and semantic segmentation as part of an optimization method to automatically create cinemagraphs from videos that are both visually appealing and semantically meaningful. Given a scene with multiple objects, there are many cinemagraphs one could create. Our method evaluates these multiple candidates and presents the best one, as determined by a model trained to predict human preferences in a collaborative way. We demonstrate the effectiveness of our approach with multiple results and a user study. Comment: To appear in ICCV 2017. Total 17 pages including the supplementary material

    Controllable Text-to-Image Generation with GPT-4

    Current text-to-image generation models often struggle to follow textual instructions, especially those requiring spatial reasoning. On the other hand, Large Language Models (LLMs), such as GPT-4, have shown remarkable precision in generating code snippets for sketching out text inputs graphically, e.g., via TikZ. In this work, we introduce Control-GPT to guide diffusion-based text-to-image pipelines with programmatic sketches generated by GPT-4, enhancing their ability to follow instructions. Control-GPT works by querying GPT-4 to write TikZ code, and the generated sketches are used as references alongside the text instructions for diffusion models (e.g., ControlNet) to generate photo-realistic images. One major challenge in training our pipeline is the lack of a dataset containing aligned text, images, and sketches. We address the issue by converting instance masks in existing datasets into polygons to mimic the sketches used at test time. As a result, Control-GPT greatly boosts the controllability of image generation. It establishes a new state of the art in spatial arrangement and object positioning generation and enhances users' control of object positions, sizes, etc., nearly doubling the accuracy of prior models. Our work, as a first attempt, shows the potential for employing LLMs to enhance performance in computer vision tasks.
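
    The mask-to-polygon step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the polygons are extracted, so the convex-hull approach below is an assumption chosen for brevity, and the function names `mask_to_polygon` and `polygon_to_tikz` are hypothetical. It shows only how a binary instance mask might become a TikZ-style outline similar to the sketches used at test time.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def mask_to_polygon(mask: List[List[int]]) -> List[Point]:
    """Approximate a binary instance mask by the convex hull of its pixels.

    Assumption: a convex hull (Andrew's monotone chain) stands in for
    whatever polygon extraction the paper actually uses.
    Points are (x, y) pixel coordinates, returned in hull order.
    """
    points = sorted(
        {(x, y) for y, row in enumerate(mask) for x, value in enumerate(row) if value}
    )
    if len(points) <= 2:
        return points

    def cross(o: Point, a: Point, b: Point) -> int:
        # Z-component of the cross product of vectors OA and OB.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in points:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper: List[Point] = []
    for p in reversed(points):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def polygon_to_tikz(polygon: List[Point]) -> str:
    """Render a polygon as a closed TikZ path.

    Note: image coordinates have y pointing down, TikZ has y pointing up;
    this sketch does not flip the axis.
    """
    path = " -- ".join(f"({x},{y})" for x, y in polygon)
    return f"\\draw {path} -- cycle;"


if __name__ == "__main__":
    toy_mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    print(polygon_to_tikz(mask_to_polygon(toy_mask)))
```

    A convex hull loses concave detail, so a contour-tracing routine would be a closer fit for irregular objects; the hull is used here only to keep the example dependency-free.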

    Scaling Novel Object Detection with Weakly Supervised Detection Transformers

    Weakly supervised object detection (WSOD) enables object detectors to be trained using image-level class labels. However, the practical application of current WSOD models is limited, as they operate at small scales and require extensive training and refinement. We propose the Weakly Supervised Detection Transformer, which enables efficient knowledge transfer from a large-scale pretraining dataset to WSOD finetuning on hundreds of novel objects. We leverage pretrained knowledge to improve the multiple instance learning framework used in WSOD, and experiments show our approach outperforms the state of the art on datasets with twice as many novel classes as previously shown. Comment: CVPR 2022 Workshop on Attention and Transformers in Vision

    A deep active learning system for species identification and counting in camera trap images

    1. A typical camera trap survey may produce millions of images that require slow, expensive manual review. Consequently, critical conservation questions may be answered too slowly to support decision-making. Recent studies demonstrated the potential for computer vision to dramatically increase efficiency in image-based biodiversity surveys; however, the literature has focused on projects with a large set of labeled training images, and hence many projects with a smaller set of labeled images cannot benefit from existing machine learning techniques. Furthermore, even sizable projects have struggled to adopt computer vision methods because classification models overfit to specific image backgrounds (i.e., camera locations). 2. In this paper, we combine the power of machine intelligence and human intelligence via a novel active learning system to minimize the manual work required to train a computer vision model. Furthermore, we utilize object detection models and transfer learning to prevent overfitting to camera locations. To our knowledge, this is the first work to apply an active learning approach to camera trap images. 3. Our proposed scheme can match state-of-the-art accuracy on a 3.2 million image dataset with as few as 14,100 manual labels, which means decreasing manual labeling effort by over 99.5%. Our trained models are also less dependent on background pixels, since they operate only on cropped regions around animals. 4. The proposed active deep learning scheme can significantly reduce the manual labor required to extract information from camera trap images. Automation of information extraction will not only benefit existing camera trap projects, but can also catalyze the deployment of larger camera trap arrays.
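
    The "over 99.5%" figure in this abstract can be checked with a short calculation. This is a minimal sketch that assumes the baseline is one manual label for each of the 3.2 million images; the function name `labeling_reduction` is hypothetical.

```python
def labeling_reduction(total_images: int, manual_labels: int) -> float:
    """Fraction of manual labeling effort avoided.

    Assumption: the baseline is one manual label per image in the dataset.
    """
    if total_images <= 0:
        raise ValueError("total_images must be positive")
    if not 0 <= manual_labels <= total_images:
        raise ValueError("manual_labels must be between 0 and total_images")
    return 1.0 - manual_labels / total_images


if __name__ == "__main__":
    # Figures taken from the abstract: 3.2 million images, 14,100 manual labels.
    reduction = labeling_reduction(3_200_000, 14_100)
    print(f"{reduction:.2%}")  # prints 99.56%
```

    Under that assumption, 14,100 labels are about 0.44% of the dataset, which is consistent with the stated reduction of over 99.5%.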