
    Study of functional outcome of total hip arthroplasty in a series of cases of hip pathologies done in rural population

    Background: The objective of the study was to assess the functional outcome of total hip arthroplasty (THA) performed for a series of hip pathologies in a rural population. Methods: A retrospective study of 50 cases of hip arthritis (38 males and 12 females) treated with uncemented THA at the department of orthopedics, MGM Medical College, Kamothe, Navi Mumbai, with an average follow-up of 2 years. The Harris hip score was used for functional scoring, and postoperative radiographs were assessed using Gruen zones for the femoral component and DeLee and Charnley zones for the acetabular component. All patients were evaluated with the Harris hip score preoperatively and postoperatively at 3 months, 6 months, 12 months, and 2 years. Results: 81% of patients scored 85 points or better, a rating of excellent on the Harris hip score. 90% of patients had little or no pain postoperatively, and walking ability improved, becoming unlimited in 80% of patients. The mean Harris hip score improved from 40 to 80. Results were excellent in 80.5%, good in 13.8%, and fair in 5.7% of patients; no patient had a poor result. Conclusions: THA provided excellent pain relief, adequate stability, and a remarkable range of motion in severely painful, refractory hips, and a significant improvement was seen at the two-year follow-up.
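    A minimal sketch of how follow-up Harris hip scores might be summarized, using the "excellent" cutoff of 85 points stated above. The helper function and the example scores are illustrative only, not the study's actual data or analysis code.

```python
# Minimal sketch: summarizing Harris Hip Score (HHS) follow-up data.
# The >= 85 "excellent" cutoff follows the abstract; the example scores
# below are illustrative, not the study's data.

def summarize_hhs(preop_scores, followup_scores, excellent_cutoff=85):
    """Return mean improvement and the share of hips rated excellent."""
    n = len(preop_scores)
    mean_pre = sum(preop_scores) / n
    mean_post = sum(followup_scores) / n
    excellent = sum(1 for s in followup_scores if s >= excellent_cutoff)
    return {
        "mean_preop": mean_pre,
        "mean_followup": mean_post,
        "mean_improvement": mean_post - mean_pre,
        "pct_excellent": 100.0 * excellent / n,
    }

# Illustrative use (hypothetical scores for three hips):
print(summarize_hhs([38, 42, 40], [86, 79, 91]))
```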

    Visual Programming for Text-to-Image Generation and Evaluation

    As large language models have demonstrated impressive performance in many domains, recent works have adopted language models (LMs) as controllers of visual modules for vision-and-language tasks. While existing work focuses on equipping LMs with visual understanding, we propose two novel interpretable/explainable visual programming frameworks for text-to-image (T2I) generation and evaluation. First, we introduce VPGen, an interpretable step-by-step T2I generation framework that decomposes T2I generation into three steps: object/count generation, layout generation, and image generation. We employ an LM to handle the first two steps (object/count generation and layout generation) by finetuning it on text-layout pairs. Our step-by-step T2I generation framework provides stronger spatial control than end-to-end models, the dominant approach for this task. Furthermore, we leverage the world knowledge of pretrained LMs, overcoming the limitation of previous layout-guided T2I works that can only handle predefined object classes. We demonstrate that VPGen provides better control over object counts, spatial relations, and scales than state-of-the-art T2I generation models. Second, we introduce VPEval, an interpretable and explainable evaluation framework for T2I generation based on visual programming. Unlike previous T2I evaluations with a single scoring model that is accurate in some skills but unreliable in others, VPEval produces evaluation programs that invoke a set of visual modules that are experts in different skills, and also provides visual+textual explanations of the evaluation results. Our analysis shows VPEval provides a more human-correlated evaluation for skill-specific and open-ended prompts than widely used single model-based evaluation. We hope our work encourages future progress on interpretable/explainable generation and evaluation for T2I models. Project website: https://vp-t2i.github.io (18 pages).
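    The three-step decomposition described above can be pictured as a small pipeline. The sketch below is illustrative only: the function names and data shapes are assumptions, not the released VPGen code.

```python
# A minimal sketch of the step-by-step T2I decomposition described for VPGen:
# (1) object/count generation, (2) layout generation, (3) image generation.
# Function names (generate_objects, generate_layout, render_image) and the
# data shapes are hypothetical placeholders, not the actual VPGen API.

from typing import Dict, List, Tuple

def generate_objects(prompt: str) -> List[Tuple[str, int]]:
    # Step 1: an LM finetuned on text-layout pairs predicts objects and counts.
    # Placeholder output for illustration.
    return [("dog", 2), ("frisbee", 1)]

def generate_layout(objects: List[Tuple[str, int]]) -> List[Dict]:
    # Step 2: the same LM assigns a bounding box (x, y, w, h in [0, 1]) per instance.
    layout = []
    for name, count in objects:
        for i in range(count):
            layout.append({"object": name, "box": (0.1 + 0.3 * i, 0.5, 0.25, 0.3)})
    return layout

def render_image(prompt: str, layout: List[Dict]):
    # Step 3: a layout-conditioned image generator renders the final image.
    raise NotImplementedError("plug in a layout-guided image generator here")

prompt = "two dogs chasing a frisbee"
layout = generate_layout(generate_objects(prompt))
print(layout)
```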

    VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning

    Although recent text-to-video (T2V) generation methods have seen significant advancements, most of these works focus on producing short video clips of a single event with a single background (i.e., single-scene videos). Meanwhile, recent large language models (LLMs) have demonstrated their capability in generating layouts and programs to control downstream visual modules such as image generation models. This raises an important question: can we leverage the knowledge embedded in these LLMs for temporally consistent long video generation? In this paper, we propose VideoDirectorGPT, a novel framework for consistent multi-scene video generation that uses the knowledge of LLMs for video content planning and grounded video generation. Specifically, given a single text prompt, we first ask our video planner LLM (GPT-4) to expand it into a 'video plan', which involves generating the scene descriptions, the entities with their respective layouts, the background for each scene, and consistency groupings of the entities and backgrounds. Next, guided by this output from the video planner, our video generator, Layout2Vid, has explicit control over spatial layouts and can maintain temporal consistency of entities/backgrounds across scenes, while being trained only with image-level annotations. Our experiments demonstrate that the VideoDirectorGPT framework substantially improves layout and movement control in both single- and multi-scene video generation and can generate multi-scene videos with visual consistency across scenes, while achieving competitive performance with state-of-the-art methods in open-domain single-scene T2V generation. We also demonstrate that our framework can dynamically control the strength of layout guidance and can generate videos with user-provided images. We hope our framework can inspire future work on better integrating the planning ability of LLMs into consistent long video generation. Project page: https://videodirectorgpt.github.io.
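    The 'video plan' the planner LLM produces can be thought of as a structured object holding scene descriptions, per-scene entity layouts, backgrounds, and consistency groupings. The sketch below uses hypothetical field names for illustration; it is not the paper's actual schema.

```python
# A minimal sketch of the 'video plan' structure the abstract describes:
# scene descriptions, entities with per-scene layouts, backgrounds, and
# consistency groupings. Field names here are hypothetical placeholders.

video_plan = {
    "prompt": "a chef bakes bread and then serves it to a customer",
    "scenes": [
        {
            "description": "a chef kneads dough in a kitchen",
            "background": "kitchen",
            "entities": [{"name": "chef", "box": (0.3, 0.2, 0.3, 0.7)}],
        },
        {
            "description": "the chef hands bread to a customer at the counter",
            "background": "bakery counter",
            "entities": [
                {"name": "chef", "box": (0.1, 0.2, 0.3, 0.7)},
                {"name": "customer", "box": (0.6, 0.2, 0.3, 0.7)},
            ],
        },
    ],
    # Entities in the same group should stay visually consistent across scenes.
    "consistency_groups": {"chef": [0, 1]},
}

for i, scene in enumerate(video_plan["scenes"]):
    print(i, scene["description"], "->", [e["name"] for e in scene["entities"]])
```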

    FixMyPose: Pose Correctional Captioning and Retrieval

    Interest in physical therapy and individual exercises such as yoga/dance has increased alongside the well-being trend. However, such exercises are hard to follow without expert guidance (which is impossible to scale for personalized feedback to every trainee remotely). Thus, automated pose correction systems are needed more than ever, and we introduce a new captioning dataset named FixMyPose to address this need. We collect descriptions of correcting a "current" pose to look like a "target" pose (in both English and Hindi). The collected descriptions have interesting linguistic properties such as egocentric relations to environment objects, analogous references, etc., requiring an understanding of spatial relations and commonsense knowledge about postures. Further, to avoid ML biases, we maintain a balance across characters with diverse demographics, who perform a variety of movements in several interior environments (e.g., homes, offices). From our dataset, we introduce the pose-correctional-captioning task and its reverse target-pose-retrieval task. In the correctional-captioning task, models must generate descriptions of how to move from the current to the target pose image, whereas in the retrieval task, models should select the correct target pose given the initial pose and a correctional description. We present strong cross-attention baseline models (uni/multimodal, RL, multilingual) and show that our baselines are competitive with other models when evaluated on other image-difference datasets. We also propose new task-specific metrics (object-match, body-part-match, direction-match) and conduct human evaluation for more reliable assessment, and we demonstrate a large human-model performance gap, suggesting room for promising future work. To verify the sim-to-real transfer of our FixMyPose dataset, we collect a set of real images and show promising performance on them. AAAI 2021 (18 pages, 16 figures; webpage: https://fixmypose-unc.github.io/).
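    One of the proposed task-specific metrics, body-part-match, could be approximated by simple keyword overlap as sketched below. The keyword list and the overlap formula are assumptions for illustration, not the paper's exact definition.

```python
# A minimal sketch of a body-part-match style metric in the spirit of the
# task-specific metrics the abstract names (object-match, body-part-match,
# direction-match). The keyword list and the formula are illustrative only.

BODY_PARTS = {"arm", "leg", "knee", "elbow", "hip", "shoulder", "hand", "foot", "head", "torso"}

def body_part_match(reference: str, generated: str) -> float:
    """Fraction of body parts mentioned in the reference caption that also
    appear in the generated caption (0.0 if the reference mentions none)."""
    ref_parts = {w for w in reference.lower().split() if w in BODY_PARTS}
    gen_parts = {w for w in generated.lower().split() if w in BODY_PARTS}
    if not ref_parts:
        return 0.0
    return len(ref_parts & gen_parts) / len(ref_parts)

print(body_part_match("raise your left arm and bend the knee",
                      "lift the arm a little higher"))  # -> 0.5
```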

    DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning

    Text-to-image (T2I) generation has seen significant growth over the past few years. Despite this, there has been little work on generating diagrams with T2I models. A diagram is a symbolic/schematic representation that explains information using structurally rich and spatially complex visualizations (e.g., a dense combination of related objects, text labels, directional arrows, connection lines, etc.). Existing state-of-the-art T2I models often fail at diagram generation because they lack fine-grained object layout control when many objects are densely connected via complex relations such as arrows/lines, and they also often fail to render comprehensible text labels. To address this gap, we present DiagrammerGPT, a novel two-stage text-to-diagram generation framework that leverages the layout guidance capabilities of LLMs (e.g., GPT-4) to generate more accurate open-domain, open-platform diagrams. In the first stage, we use LLMs to generate and iteratively refine 'diagram plans' (in a planner-auditor feedback loop) which describe all the entities (objects and text labels), their relationships (arrows or lines), and their bounding box layouts. In the second stage, we use a diagram generator, DiagramGLIGEN, and a text label rendering module to generate diagrams following the diagram plans. To benchmark the text-to-diagram generation task, we introduce AI2D-Caption, a densely annotated diagram dataset built on top of the AI2D dataset. We show quantitatively and qualitatively that our DiagrammerGPT framework produces more accurate diagrams, outperforming existing T2I models. We also provide a comprehensive analysis, including open-domain diagram generation, vector graphic diagram generation in different platforms, human-in-the-loop diagram plan editing, and multimodal planner/auditor LLMs (e.g., GPT-4Vision). We hope our work can inspire further research on diagram generation via T2I models and LLMs. Project page: https://diagrammerGPT.github.io.
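    The first stage, a planner-auditor loop over 'diagram plans' (entities, relationships, and bounding boxes), can be sketched as below. The llm_plan and llm_audit helpers are hypothetical stand-ins for LLM (e.g., GPT-4) prompts, not a real API.

```python
# A minimal sketch of the 'diagram plan' described in the abstract: entities
# (objects and text labels), relationships (arrows or lines), and bounding
# boxes, refined in a planner-auditor loop. llm_plan/llm_audit are stand-ins.

def llm_plan(prompt):
    # Stand-in for the planner LLM producing an initial diagram plan.
    return {
        "entities": [
            {"id": "sun", "type": "object", "box": (0.05, 0.05, 0.2, 0.2)},
            {"id": "earth", "type": "object", "box": (0.6, 0.5, 0.15, 0.15)},
            {"id": "label_orbit", "type": "text", "text": "orbit", "box": (0.45, 0.35, 0.1, 0.05)},
        ],
        "relations": [{"from": "earth", "to": "sun", "kind": "arrow"}],
    }

def llm_audit(prompt, plan):
    # Stand-in for the auditor LLM returning feedback, or None when satisfied.
    return None

def make_diagram_plan(prompt, max_rounds=3):
    plan = llm_plan(prompt)
    for _ in range(max_rounds):
        feedback = llm_audit(prompt, plan)
        if feedback is None:
            break
        plan = llm_plan(prompt + "\nFix: " + feedback)  # re-plan with feedback
    return plan  # handed to the stage-two diagram generator and label renderer

print(make_diagram_plan("diagram of the earth orbiting the sun"))
```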

    Hierarchical Video-Moment Retrieval and Step-Captioning

    There is growing interest in searching for information from large video corpora. Prior works have studied relevant tasks, such as text-based video retrieval, moment retrieval, video summarization, and video captioning in isolation, without an end-to-end setup that can jointly search from video corpora and generate summaries. Such an end-to-end setup would allow for many interesting applications, e.g., a text-based search that finds a relevant video from a video corpus, extracts the most relevant moment from that video, and segments the moment into important steps with captions. To address this, we present the HiREST (HIerarchical REtrieval and STep-captioning) dataset and propose a new benchmark that covers hierarchical information retrieval and visual/textual stepwise summarization from an instructional video corpus. HiREST consists of 3.4K text-video pairs from an instructional video dataset, where 1.1K videos have annotations of moment spans relevant to the text query and a breakdown of each moment into key instruction steps with captions and timestamps (totaling 8.6K step captions). Our hierarchical benchmark consists of video retrieval, moment retrieval, and two novel tasks: moment segmentation and step captioning. In moment segmentation, models break down a video moment into instruction steps and identify start-end boundaries. In step captioning, models generate a textual summary for each step. We also present task-specific and end-to-end joint baseline models as starting points for our new benchmark. While the baseline models show some promising results, there is still large room for future improvement by the community. CVPR 2023 (15 pages; the first two authors contributed equally). Project website: https://hirest-cvpr2023.github.io.
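    The hierarchical search pipeline (video retrieval, moment retrieval, moment segmentation, and step captioning) can be sketched as a chain of stages. Every helper in the sketch below is a hypothetical placeholder rather than the HiREST baseline code.

```python
# A minimal sketch of the hierarchical search pipeline the abstract describes:
# retrieve a video for a text query, localize the relevant moment, segment it
# into steps, and caption each step. All helpers are illustrative stand-ins.

def retrieve_video(query, corpus):
    return corpus[0]  # stand-in: pick the best-matching video

def localize_moment(query, video):
    return (12.0, 95.0)  # stand-in: start/end of the relevant moment in seconds

def segment_steps(video, moment):
    start, end = moment
    return [(start, 40.0), (40.0, end)]  # stand-in: step boundaries

def caption_step(video, span):
    return f"step covering {span[0]:.0f}s-{span[1]:.0f}s"  # stand-in captioner

def search(query, corpus):
    video = retrieve_video(query, corpus)
    moment = localize_moment(query, video)
    return [(span, caption_step(video, span)) for span in segment_steps(video, moment)]

print(search("how to make espresso", ["video_001.mp4", "video_002.mp4"]))
```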

    Knowledge attitude and behavior practices regarding clinical presentation, transmission, preventive measures and management of malaria and dengue among the health care personnel

    Background: According to the WHO, there were an estimated 241 million cases of malaria worldwide in 2020, with an estimated 627,000 malaria deaths. Similarly, the global incidence of dengue has grown dramatically, with about half of the world's population now at risk. The present study assesses the knowledge, attitude, and behaviour practices regarding clinical presentation, transmission, preventive measures, and management of malaria and dengue among health care personnel (HCPs). Methods: This cross-sectional study was carried out in the department of community medicine, MGM Medical College, Indore. Participants were selected from one tribal district (Barwani) and one non-tribal district of the Indore division; districts were chosen from all districts covered under the Indore division by simple random sampling using the chit method. Ethical clearance was obtained from the institutional ethics committee. Results: All HCPs advised eradication of mosquito breeding sites by preventing water stagnation for the prevention of malaria. 75% of ANMs, 90% of lab technicians, and 100% of MOs, malaria inspectors, and MPWs were aware of the biting time of the female Anopheles mosquito. The majority of HCPs were aware of the biting time of the female Aedes mosquito and of the warning signs of dengue infection, and reported advising that drinking water containers (cisterns, tanks) be kept tightly closed. Conclusions: All HCPs were aware of the prominent symptoms of malaria and actively promoted integrated vector control measures in their allocated areas of work.