39 research outputs found

    WebArena: A Realistic Web Environment for Building Autonomous Agents

    With advances in generative AI, there is now potential for autonomous agents to manage daily tasks via natural language commands. However, current agents are primarily created and tested in simplified synthetic environments, leading to a disconnect with real-world scenarios. In this paper, we build an environment for language-guided agents that is highly realistic and reproducible. Specifically, we focus on agents that perform tasks on the web, and create an environment with fully functional websites from four common domains: e-commerce, social forum discussions, collaborative software development, and content management. Our environment is enriched with tools (e.g., a map) and external knowledge bases (e.g., user manuals) to encourage human-like task-solving. Building upon our environment, we release a set of benchmark tasks focusing on evaluating the functional correctness of task completions. The tasks in our benchmark are diverse, long-horizon, and designed to emulate tasks that humans routinely perform on the internet. We experiment with several baseline agents, integrating recent techniques such as reasoning before acting. The results demonstrate that solving complex tasks is challenging: our best GPT-4-based agent only achieves an end-to-end task success rate of 14.41%, significantly lower than the human performance of 78.24%. These results highlight the need for further development of robust agents, that current state-of-the-art large language models are far from perfect performance in these real-life tasks, and that WebArena can be used to measure such progress. Comment: Our code, data, environment reproduction resources, and video demonstrations are publicly available at https://webarena.dev

    Vector competence evaluation of mosquitoes for Tahyna virus PJ01 strain, a new Orthobunyavirus in China

    Introduction: Tahyna virus (TAHV), an arbovirus of the genus Orthobunyavirus, causes human disease but remains understudied worldwide. In this study, a new strain of TAHV was isolated from Aedes sp. mosquitoes collected in Panjin city, Liaoning province. However, the competent vector of TAHV in China is still unknown. Methods: The genome of the newly isolated TAHV was sequenced and phylogenetic analysis was performed. Aedes albopictus and Culex pipiens pallens were orally infected with artificial virus blood meals (1:1 of virus suspension and mouse blood), and the virus was detected in the midgut, ovary, salivary gland and saliva of the mosquitoes. Then, the transmission and dissemination rates, vertical transmission and horizontal transmission of the virus by the mosquitoes were assessed. Results: Phylogenetic analysis revealed that the virus shared high similarity with TAHV, and it was named the TAHV PJ01 strain. After oral infection with virus blood meals, Ae. albopictus tested positive for the virus in all tested tissues, with an extrinsic incubation period of 2 days and a fluctuating increase in transmission and dissemination rates. In contrast, no virus was detected in the saliva of Cx. pipiens pallens. Suckling mice bitten by infectious Ae. albopictus developed obvious neurological symptoms, including inactivity, hind-leg paralysis and difficulty turning over, when the virus titer reached 1.70×10^5 PFU/mL in the brain. Moreover, TAHV was detected in the eggs, larvae and adults of F1 offspring of Ae. albopictus. Discussion: Ae. albopictus is an efficient vector for transmitting TAHV, but Cx. pipiens pallens is not. Ae. albopictus is also a reservoir host that transmits the virus vertically, which further increases the risk of outbreaks. This study has important epidemiological implications for the surveillance of pathogenic viruses in China and for guiding comprehensive vector control strategies to counteract potential outbreaks in the future.

    Contact Mode Guided Motion Planning for Quasidynamic Dexterous Manipulation in 3D

    This paper presents Contact Mode Guided Manipulation Planning (CMGMP) for 3D quasistatic and quasidynamic rigid body motion planning in dexterous manipulation. The CMGMP algorithm generates hybrid motion plans including both continuous state transitions and discrete contact mode switches, without the need for pre-specified contact sequences or pre-designed motion primitives. The key idea is to use automatically enumerated contact modes of environment-object contacts to guide the tree expansions during the search. Contact modes automatically synthesize manipulation primitives, while the sampling-based planning framework sequences those primitives into a coherent plan. We test our algorithm on fourteen 3D manipulation tasks, and validate our models by executing some plans open-loop on a real robot-manipulator system.

    Intelligent Prediction Model of the Thermal and Moisture Comfort of the Skin-Tight Garment

    In order to improve the efficiency and accuracy of predicting the thermal and moisture comfort of skin-tight clothing (also called skin-tight underwear), principal component analysis (PCA) is used to reduce the dimensions of related variables and eliminate the multicollinearity among variables. Then, the optimized variables are used as the input parameters of the coupled intelligent model of the genetic algorithm (GA) and back propagation (BP) neural network, and the thermal and moisture comfort of different tights (tight tops and tight trousers) under different sports conditions is analysed. At the same time, in order to verify the superiority of the genetic algorithm and BP neural network intelligent model, the prediction results of GA-BP, PCA-BP and BP are compared with this model. The results show that principal component analysis (PCA) improves the accuracy and adaptability of the GA-BP neural network in predicting thermal and moisture comfort. The forecasting effect of the PCA-GA-BP neural network is clearly better than that of the GA-BP, PCA-BP and BP models, and it can accurately predict the thermal and moisture comfort of tight-fitting sportswear. The model has better forecasting accuracy and a simpler structure.

    Temperature and Humidity Data Evaluation of Tight Sportswear During Motion Based on Intelligent Modeling

    A Long Short Term Memory (LSTM) neural network structure is proposed that predicts the temperature and humidity of other key parts of the human body from the temperature and humidity data of some parts when wearing tight sportswear, so as to realize temperature and humidity prediction for all key points of the human body. The temperature and humidity of different people wearing tights were collected by DHT sensors. The experimental results show that the proposed LSTM neural network structure has higher prediction accuracy than other algorithms, and the model evaluates the feasibility of temperature and humidity data of tights in a state of motion, which facilitates the study of dynamic thermal and humid comfort and reduces the time cost of analyzing the temperature and humidity distribution and variation patterns during human movement. It will effectively promote the study of temperature and humidity changes when people wear sports tights, provide a theoretical reference for the study of human skin temperature in the field of sports medicine, and provide practical guidance for the application of human skin temperature changes in sports clothing production and in the diagnosis and prevention of sports injuries.

    DataSheet1_Effects of arch support doses on the center of pressure and pressure distribution of running using statistical parametric mapping.pdf

    Insoles with an arch support have been used to address biomechanical risk factors of running. However, the relationship between the dose of support and running biomechanics remains unclear. The purpose of this study was to determine the effects of changing arch support doses on the center of pressure (COP) and pressure mapping using statistical parametric mapping (SPM). Nine arch support variations (3 heights × 3 widths) and a flat insole control were tested on fifteen healthy recreational runners using a 1-m Footscan pressure plate. The medial-lateral COP (COPML) coordinates and the total COP velocity (COPVtotal) were calculated throughout the entirety of stance. One-dimensional and two-dimensional SPM were performed to assess differences between the arch support and control conditions for time series of COP variables and pressure mapping at a pixel level, respectively. Two-way ANOVAs were performed to test the main effect of the arch support height and width, and their interaction on the peak values of the COPVtotal. The results showed that the COPVtotal during the forefoot contact and forefoot push off phases was increased by arch supports, while the COP medial-lateral coordinates remained unchanged. There was a dose-response effect of the arch support height on peak values of the COPVtotal, with a higher support increasing the first and third valleys but decreasing the third peak of the COPVtotal. Meanwhile, a higher arch support height shifted the peak pressure from the medial forefoot and rearfoot to the medial arch. It is concluded that changing arch support doses, primarily the height, systematically altered the COP velocities and peak plantar pressure at a pixel level during running. When assessing subtle modifications in the arch support, the COP velocity was a more sensitive variable than COP coordinates. SPM provides a high-resolution view of pressure comparisons, and is recommended for future insole/footwear investigations to better understand the underlying mechanisms and improve insole design.

    Photoionization cross section measurements of the excited states of cobalt in the near-threshold region

    We present measurements of photoionization cross-sections of the excited states of cobalt using a two-color, two-step resonance ionization technique in conjunction with a molecular beam time-of-flight (TOF) mass spectrometer. The atoms were produced by the laser vaporization of a cobalt rod, coupled with a supersonic gas jet. The absolute photoionization cross-sections at threshold and near-threshold regions (0-1.2 eV) were measured, and the measured values ranged from 4.2±0.7 Mb to 10.5±1.8 Mb. The lifetimes of four odd-parity energy levels are reported for the first time.