YouTube AV 50K: An Annotated Corpus for Comments in Autonomous Vehicles
With one billion monthly viewers, and millions of users discussing and
sharing opinions, comments below YouTube videos are rich sources of data for
opinion mining and sentiment analysis. We introduce the YouTube AV 50K dataset,
a freely available collection of more than 50,000 YouTube comments and
metadata below autonomous vehicle (AV)-related videos. We describe its creation
process, its content and data format, and discuss its possible uses.
In particular, we conduct a case study of the first self-driving car fatality to
evaluate the dataset, and show how we can use this dataset to better understand
public attitudes toward self-driving cars and public reactions to the accident.
Future developments of the dataset are also discussed.
Comment: in Proceedings of the Thirteenth International Joint Symposium on
Artificial Intelligence and Natural Language Processing (iSAI-NLP 2018)
Modification Method of Tooth Profile of Locomotive Traction Gear Based on Rodent Arm Variation
The locomotive traction gear is the key component for power transmission and speed control in the locomotive transmission system, and it plays an important role in determining running speed and load-carrying torque. Because there is currently no universal rule for modifying locomotive gears, this paper considers tooth profile modification by combining an increased contact ratio with the variation of the moment arm of action. Based on the principle of modification and the load direction after modification, the rule governing the change in the moment arm of action is determined, along with the interval range of tooth profile modification. Taking a locomotive traction gear as an example, the results of the proposed method, which combines moment arm variation with an increased contact ratio, are compared through finite element simulation with those of the method based on the traditional empirical formula. The comparison verifies the superiority of the proposed modification theory, which has important theoretical significance for the profile modification of locomotive traction gears.
EgoTaskQA: Understanding Human Tasks in Egocentric Videos
Understanding human tasks through video observations is an essential
capability of intelligent agents. The challenges of such capability lie in the
difficulty of generating a detailed understanding of situated actions, their
effects on object states (i.e., state changes), and their causal dependencies.
These challenges are further aggravated by the natural parallelism from
multi-tasking and partial observations in multi-agent collaboration. Most prior
works leverage action localization or future prediction as an indirect metric
for evaluating such task understanding from videos. To make a direct
evaluation, we introduce the EgoTaskQA benchmark that provides a single home
for the crucial dimensions of task understanding through question-answering on
real-world egocentric videos. We meticulously design questions that target the
understanding of (1) action dependencies and effects, (2) intents and goals,
and (3) agents' beliefs about others. These questions are divided into four
types, including descriptive (what status?), predictive (what will?),
explanatory (what caused?), and counterfactual (what if?) to provide diagnostic
analyses on spatial, temporal, and causal understandings of goal-oriented
tasks. We evaluate state-of-the-art video reasoning models on our benchmark and
show that they lag significantly behind humans in understanding complex
goal-oriented egocentric videos. We hope this effort will drive the vision
community to move onward with goal-oriented video understanding and reasoning.
Comment: Published at NeurIPS Track on Datasets and Benchmarks 202
Towards Benchmarking GUI Compatibility Testing on Mobile Applications
The GUI is a bridge connecting the user and the application. Existing GUI testing tasks
can be categorized into two groups: functionality testing and compatibility
testing. While functionality testing focuses on detecting application
runtime bugs, compatibility testing aims at detecting bugs resulting from
device or platform differences. To automate testing procedures and improve
testing efficiency, previous works have proposed dozens of tools. To evaluate
these tools, in functionality testing, researchers have published testing
benchmarks. In compatibility testing, by contrast, the question "Do
existing methods indeed effectively assist test case replay?" has not been well
answered. To answer this question and advance the related research in GUI
compatibility testing, we propose a benchmark of GUI compatibility testing. In
our experiments, we compare the replay success rate of existing tools. Based on
the experimental results, we summarize the causes that may lead to ineffective
test case replay and propose opportunities for improving the
state of the art.
Cascade-DETR: Delving into High-Quality Universal Object Detection
Object localization in general environments is a fundamental part of vision
systems. While dominating on the COCO benchmark, recent Transformer-based
detection methods are not competitive in diverse domains. Moreover, these
methods still struggle to estimate object bounding boxes accurately in
complex environments.
We introduce Cascade-DETR for high-quality universal object detection. We
jointly tackle the generalization to diverse domains and localization accuracy
by proposing the Cascade Attention layer, which explicitly integrates
object-centric information into the detection decoder by limiting the attention
to the previous box prediction. To further enhance accuracy, we also revisit
the scoring of queries. Instead of relying on classification scores, we predict
the expected IoU of the query, leading to substantially better-calibrated
confidences. Lastly, we introduce a universal object detection benchmark,
UDB10, that contains 10 datasets from diverse domains. While also advancing the
state-of-the-art on COCO, Cascade-DETR substantially improves DETR-based
detectors on all datasets in UDB10, even by over 10 mAP in some cases. The
improvements under stringent quality requirements are even more pronounced. Our
code and models will be released at https://github.com/SysCV/cascade-detr.
Comment: Accepted in ICCV 2023.
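Note on the Cascade-DETR entry above: the abstract scores queries by their expected IoU (intersection over union) with the object box. As a minimal sketch of that underlying quantity only, the snippet below computes the IoU of two axis-aligned boxes. The function name `box_iou` and the (x1, y1, x2, y2) corner format are assumptions made for illustration; this is not the authors' code and does not reproduce their IoU prediction head.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is assumed to be (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    This format is an illustrative assumption, not taken from the paper.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Two 2x2 boxes overlapping in a 1x1 square.
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A detector that predicts this value per query, as the abstract describes, would use it in place of the classification score when ranking detections; how that prediction is trained is not specified in the abstract.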
Mobile Phone Graph Evolution: Findings, Model and Interpretation
What are the features of a mobile phone graph over time? How can these features be modeled? How can the evolutionary graph generation process be interpreted? To answer these challenging questions, we analyze massive who-call-whom networks spanning as long as a year, gathered from the records of two large mobile phone communication networks, both with 2 million users and 2 billion calls. We examine the calling behavior distribution at multiple time scales (e.g., day, week, month, and quarter) and find that the distribution is not only skewed with a heavy tail but also changes across time scales. How can this changing behavior be modeled, and does a distribution exist that fits the multi-scale data well? In this paper, we first define a d-stable distribution and a Multi-scale Distribution Fitting (MsDF) problem. Second, to analyze the observed distributions at different time scales, we propose a framework, ScalePower, which not only fits the multi-scale data distribution very well but also works as a convolutional distribution mixture to explain the generation mechanism behind the multi-scale changes in the distribution. Third, ScalePower can approximate the fit at a large time scale from data at a small time scale. Furthermore, we illustrate the interesting findings from our ScalePower model and large-scale real-life data sets. © 2011 IEEE