Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints
Language is increasingly being used to define rich visual recognition
problems with supporting image collections sourced from the web. Structured
prediction models are used in these tasks to take advantage of correlations
between co-occurring labels and visual input but risk inadvertently encoding
social biases found in web corpora. In this work, we study data and models
associated with multilabel object classification and visual semantic role
labeling. We find that (a) datasets for these tasks contain significant gender
bias and (b) models trained on these datasets further amplify existing bias.
For example, the activity cooking is over 33% more likely to involve females
than males in a training set, and a trained model further amplifies the
disparity to 68% at test time. We propose to inject corpus-level constraints
for calibrating existing structured prediction models and design an algorithm
based on Lagrangian relaxation for collective inference. Our method results in
almost no performance loss for the underlying recognition task but decreases
the magnitude of bias amplification by 47.5% and 40.5% for multilabel
classification and visual semantic role labeling, respectively. Comment: 11 pages, published in EMNLP 201
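The cooking figures quoted above can be read as a gap between the female and male share of an activity's instances. The following is a minimal sketch of that arithmetic under this reading; the function name and the toy counts are hypothetical, chosen only to land near the 33% and 68% magnitudes in the abstract, and are not taken from the paper's data or code.

```python
def gender_gap(female_count: int, male_count: int) -> float:
    """Female share minus male share of an activity's instances, in [-1, 1]."""
    total = female_count + male_count
    if total == 0:
        raise ValueError("activity has no gendered instances")
    return (female_count - male_count) / total


# Hypothetical toy counts for one activity (not the paper's data).
train_gap = gender_gap(female_count=67, male_count=33)      # gap in the training set
predicted_gap = gender_gap(female_count=84, male_count=16)  # gap in model predictions

# Amplification under this reading: how much wider the gap is at test time.
amplification = predicted_gap - train_gap
```

A positive `amplification` means the predictions are more skewed than the training data; a calibration method of the kind the abstract describes would aim to shrink that value.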
Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
Using socially generated "big data" to access information about the collective
states of mind in human societies has become a new paradigm in the
emerging field of computational social science. A natural application of this
would be the prediction of the society's reaction to a new product in the sense
of popularity and adoption rate. However, bridging the gap between "real time
monitoring" and "early predicting" remains a big challenge. Here we report on
an endeavor to build a minimalistic predictive model for the financial success
of movies based on collective activity data of online users. We show that the
popularity of a movie can be predicted well before its release by measuring and
analyzing the activity level of editors and viewers of the corresponding entry
to the movie in Wikipedia, the well-known online encyclopedia. Comment: 13 pages, Including Supporting Information, 7 Figures, Download the
dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi
Knowledge Transfer from Weakly Labeled Audio using Convolutional Neural Network for Sound Events and Scenes
In this work we propose approaches to effectively transfer knowledge from
weakly labeled web audio data. We first describe a convolutional neural network
(CNN) based framework for sound event detection and classification using weakly
labeled audio data. Our model trains efficiently from audio recordings of variable
length; hence, it is well suited for transfer learning. We then propose
methods to learn representations using this model which can be effectively used
for solving the target task. We study both transductive and inductive transfer
learning tasks, showing the effectiveness of our methods for both domain and
task adaptation. We show that the representations learned using the proposed
CNN model generalize well enough to reach human-level accuracy on the ESC-50 sound
events dataset and set state-of-the-art results on it. We further use
them for the acoustic scene classification task and again show that our
proposed approaches are well suited to it. We also show that our
methods help capture semantic meanings and relations.
Moreover, in this process we also set state-of-the-art results on the Audioset dataset,
relying on the balanced training set. Comment: ICASSP 201
Proceedings ICPW'07: 2nd International Conference on the Pragmatic Web, 22-23 Oct. 2007, Tilburg: NL
Visualizing Research Digital Libraries with Open Standards
Large-scale research Digital Libraries (DLs) contain a large array of potentially useful metadata. Yet many popular DLs do not provide a convenient way to navigate the metadata or to visualize classification schema in the user session. For example, in the broad world of Management Information Systems (MIS) research, a high-level overview of MIS topics and their inter-relationships would help a user navigate an MIS DL before zooming in on a specific article. To address this obstacle, this paper describes a prototype, the Technical Report Visualizer System (TRV), which uses a wide variety of open standards to show DL classification metadata in the navigation interface. The system captures MIS article metadata from the Open Archives Initiative (OAI) compliant arXiv e-Print archive at Cornell University. The OAI Protocol for Metadata Harvesting (OAI-PMH) is used to collect the topic metadata, namely the articles' Association for Computing Machinery (ACM) Computing Classification System codes. We display the topic metadata in a Java hyperbolic tree and make use of XML conceptual product and implementation product standards and specifications, such as the Dublin Core and BiblioML bibliographic metadata sets, XML Topic Maps, Xalan and Xerces, to link user navigation activity to the abstracts and full text contents of the articles. We discuss the flexibility and convenience of XML standards and link this effort to related digital library visualization approaches.
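The harvesting step described above can be sketched as follows. This is an illustration only, not the TRV implementation (which the abstract describes as Java and XML based): the base URL, the `cs` set name, the record identifier and the embedded sample response are hypothetical, while the `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def subjects_by_identifier(xml_text):
    """Map each record identifier to its list of dc:subject values."""
    root = ET.fromstring(xml_text)
    result = {}
    for record in root.findall("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        result[identifier] = [el.text for el in record.findall(".//dc:subject", NS)]
    return result


# Hypothetical response body, hand-written for illustration (no network call).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:0001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example article</dc:title>
          <dc:subject>H.3.7 Digital Libraries</dc:subject>
          <dc:subject>H.5.2 User Interfaces</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

request_url = list_records_url("https://example.org/oai", set_spec="cs")
topics = subjects_by_identifier(SAMPLE_RESPONSE)
```

In a real harvest, the response would also carry a `resumptionToken` for paging through large result sets; that handling is omitted here.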
Syntactic and Semantic Analysis and Visualization of Unstructured English Texts
People have complex thoughts, and they often express them in complex natural-language sentences. This complexity may make communication efficient within an audience that shares the same knowledge base. For a different or new audience, however, such compositions become cumbersome to understand and analyze. Analyzing them with syntactic or semantic measures is challenging and forms the base step of natural language processing.
In this dissertation I explore and propose a number of new techniques to analyze and visualize the syntactic and semantic patterns of unstructured English texts.
The syntactic analysis is done through a proposed visualization technique that categorizes and compares English compositions based on their reading complexity metrics. For the semantic analysis I use Latent Semantic Analysis (LSA) to analyze the hidden patterns in complex compositions. I have used this technique to analyze comments from a social visualization web site in order to detect irrelevant ones (e.g., spam). The patterns of collaboration are also studied through statistical analysis.
Word sense disambiguation is used to determine the correct sense of a word in a sentence or composition. Applying a textual similarity measure, built on several word similarity measures and word sense disambiguation, to collaborative text snippets from a social collaborative environment points to a way of untangling the complex hidden patterns of collaboration.
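As a point of reference for the textual similarity measure mentioned above, here is a minimal sketch of a much simpler baseline: cosine similarity over raw word counts. It is an assumption for illustration, not the dissertation's method; it uses neither LSA nor word sense disambiguation, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

A comment that shares few words with the item it is attached to scores near zero, which is the kind of signal a relevance or spam filter could threshold on. Measures based on word senses would additionally match texts that use different words for the same idea.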
A Visual Interactive Analytic Tool for Filtering and Summarizing Large Health Data Sets Coded with Hierarchical Terminologies (VIADS).
BACKGROUND: Vast volumes of data, coded through hierarchical terminologies (e.g., International Classification of Diseases, Tenth Revision-Clinical Modification [ICD10-CM], Medical Subject Headings [MeSH]), are generated routinely in electronic health record systems and medical literature databases. Although graphic representations can help to augment human understanding of such data sets, a graph with hundreds or thousands of nodes challenges human comprehension. To improve comprehension, new tools are needed to extract overviews of such data sets. We aim to develop a visual interactive analytic tool for filtering and summarizing large health data sets coded with hierarchical terminologies (VIADS) as an online, publicly accessible tool. The ultimate goals are to filter and summarize health data sets, extract insights, and compare and highlight the differences between health data sets by using VIADS. The results generated from VIADS can be used as data-driven evidence to help clinicians, clinical researchers, and health care administrators make more informed clinical, research, and administrative decisions. We used the following tools and development environments to develop VIADS: Django, Python, JavaScript, Vis.js, Graph.js, JQuery, Plotly, Chart.js, Unittest, R, and MySQL.
RESULTS: VIADS was developed successfully and the beta version is publicly accessible. In this paper, we introduce the architecture design, development, and functionalities of VIADS. VIADS includes six modules: a user account management module, a data set validation module, a data analytic module, a data visualization module, a terminology module, and a dashboard. Currently, VIADS supports health data sets coded by ICD-9, ICD-10, and MeSH. We also present the visualization improvements provided by VIADS with regard to interactive features (e.g., zoom in and out, customization of graph layout, expanded information of nodes, 3D plots) and efficient use of screen space.
CONCLUSIONS: VIADS meets the design objectives and can be used to filter, summarize, compare, highlight and visualize large health data sets that are coded by hierarchical terminologies, such as ICD-9, ICD-10 and MeSH. Our further usability and utility studies will provide more details about how end users are using VIADS to support their clinical, research or health administrative decision making.
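The abstract does not describe the filtering algorithms VIADS uses. As a rough illustration of what filtering and summarizing a hierarchically coded data set can involve, the sketch below rolls usage counts up a code hierarchy and keeps only nodes that reach a threshold. The function names, the threshold and the counts are hypothetical; the codes follow the ICD-10-CM diabetes block purely as an example.

```python
from collections import defaultdict


def roll_up(parent_of, direct_counts):
    """Add each code's direct count to itself and to all of its ancestors."""
    totals = defaultdict(int)
    for code, count in direct_counts.items():
        node = code
        while node is not None:
            totals[node] += count
            node = parent_of.get(node)  # None once the root has been passed
    return dict(totals)


def filter_by_threshold(totals, threshold):
    """Keep only the codes whose rolled-up count reaches the threshold."""
    return {code: n for code, n in totals.items() if n >= threshold}


# Hypothetical fragment of a hierarchy: child code -> parent code.
PARENT_OF = {
    "E11.9": "E11",
    "E11.65": "E11",
    "E10.9": "E10",
    "E11": "E08-E13",
    "E10": "E08-E13",
    "E08-E13": None,
}

# Hypothetical number of records coded directly with each leaf code.
DIRECT_COUNTS = {"E11.9": 40, "E11.65": 5, "E10.9": 3}

totals = roll_up(PARENT_OF, DIRECT_COUNTS)
overview = filter_by_threshold(totals, threshold=10)
```

Filtering on rolled-up counts keeps the frequently used branches of the hierarchy and drops sparse ones, which reduces the number of nodes a viewer has to take in at once.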