Beyond Classification: Latent User Interests Profiling from Visual Contents Analysis
User preference profiling is an important task in modern online social
networks (OSN). With the proliferation of image-centric social platforms, such
as Pinterest, visual contents have become one of the most informative data
streams for understanding user preferences. Traditional approaches usually
treat visual content analysis as a general classification problem where one or
more labels are assigned to each image. Although such an approach simplifies
the process of image analysis, it misses the rich context and visual cues that
play an important role in people's perception of images. In this paper, we
explore the possibilities of learning a user's latent visual preferences
directly from image contents. We propose a distance metric learning method
based on Deep Convolutional Neural Networks (CNN) to directly extract
similarity information from visual contents and use the derived distance metric
to mine individual users' fine-grained visual preferences. Through our
preliminary experiments using data from 5,790 Pinterest users, we show that
even for the images within the same category, each user possesses distinct and
individually-identifiable visual preferences that are consistent over their
lifetime. Our results underscore the untapped potential of finer-grained visual
preference profiling in understanding users' preferences.
Comment: 2015 IEEE 15th International Conference on Data Mining Workshop
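As a rough illustration of what distance metric learning over image embeddings can look like, the sketch below computes a triplet margin loss on toy vectors. The abstract does not name the loss it uses, so the triplet formulation, the `margin` value, and the two-dimensional embeddings are all assumptions made for illustration, not the authors' method.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that is zero once the positive is closer to the anchor
    than the negative by at least `margin` (hypothetical choice of loss)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(0.0, gap)


# Toy embeddings (hypothetical): the anchor and positive stand for two images
# pinned by the same user, the negative for an image from a different user.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
negative = [3.0, 4.0]

# Positive at distance 1, negative at distance 5: the constraint holds.
loss_satisfied = triplet_margin_loss(anchor, positive, negative)

# Swapping the roles violates the constraint and produces a positive loss.
loss_violated = triplet_margin_loss(anchor, negative, positive)
```

In a trained system the vectors would come from a CNN rather than being written by hand; the loss only shapes the embedding space so that distances reflect similarity.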
Evaluation of the green development efficiency of marine fish culture in China
Green development efficiency (GDE) is an important criterion for measuring the level of green development. GDE considers not only economic development efficiency but also environmental costs. In China, marine fish culture, as one of the pillar industries of mariculture, promotes green development and industrial transformation and upgrading. Based on data from the field surveys of marine fish farmers (2017–2019) and the China Fishery Statistical Yearbook (2018–2020), this study establishes an evaluation index system and uses the super-slack-based measure model (Super-SBM) to evaluate the GDE of marine fish culture. The results show that the average GDE of marine fish culture in China was 0.9529, which indicates an inefficient state. As for culture species, golden pompano (Trachinotus ovatus) and cobia (Rachycentron canadum) were the two species farmed in an efficient state, with a GDE of 1.2107 and 1.0659, respectively. Regarding culture modes, green modes (offshore cage aquaculture, industrial recirculating aquaculture, and engineering pond aquaculture) were in an efficient state, with a GDE of 1.2310, 1.0827, and 1.0401, respectively. Traditional modes (industrial flow-through aquaculture, ordinary cage aquaculture, and ordinary pond aquaculture) were in an inefficient state, with their GDE being 0.9884, 0.8746, and 0.8248, respectively. Green modes have higher GDE than traditional modes. In contrast, the production and culture areas of green modes were less than those of traditional modes because the profits of the same species in green modes were lower than those in traditional modes. The results of this study present an objective assessment of the GDE of marine fish culture in China and provide valuable insights for analyzing the mechanisms to improve the GDE of marine fish culture.
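The efficient and inefficient labels in this abstract follow the usual Super-SBM reading, in which a score of at least 1 marks an efficient unit. The short sketch below applies that threshold to the six mode-level GDE values reported above and compares the group means. The variable and function names are illustrative, and the unweighted mean is an assumption rather than a statistic reported by the study.

```python
# GDE scores per culture mode, as reported in the abstract.
gde_by_mode = {
    "offshore cage": 1.2310,
    "industrial recirculating": 1.0827,
    "engineering pond": 1.0401,
    "industrial flow-through": 0.9884,
    "ordinary cage": 0.8746,
    "ordinary pond": 0.8248,
}

# Grouping of modes into green and traditional, following the abstract.
GREEN_MODES = {"offshore cage", "industrial recirculating", "engineering pond"}


def is_efficient(score):
    """Super-SBM convention: a score of 1 or more means efficient."""
    return score >= 1.0


def mean(values):
    values = list(values)
    return sum(values) / len(values)


green_scores = [s for m, s in gde_by_mode.items() if m in GREEN_MODES]
traditional_scores = [s for m, s in gde_by_mode.items() if m not in GREEN_MODES]

# Unweighted group means (an illustrative summary, not a figure from the study).
green_mean = mean(green_scores)
traditional_mean = mean(traditional_scores)

for mode, score in gde_by_mode.items():
    label = "efficient" if is_efficient(score) else "inefficient"
    print(f"{mode}: {score:.4f} ({label})")
```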
S3-DST: Structured Open-Domain Dialogue Segmentation and State Tracking in the Era of LLMs
The traditional Dialogue State Tracking (DST) problem aims to track user
preferences and intents in user-agent conversations. While sufficient for
task-oriented dialogue systems supporting narrow domain applications, the
advent of Large Language Model (LLM)-based chat systems has introduced many
real-world intricacies in open-domain dialogues. These intricacies manifest in
the form of increased complexity in contextual interactions, extended dialogue
sessions encompassing a diverse array of topics, and more frequent contextual
shifts. To handle these intricacies arising from evolving LLM-based chat
systems, we propose joint dialogue segmentation and state tracking per segment
in open-domain dialogue systems. Assuming a zero-shot setting appropriate to a
true open-domain dialogue system, we propose S3-DST, a structured prompting
technique that harnesses Pre-Analytical Recollection, a novel grounding
mechanism we designed for improving long context tracking. To demonstrate the
efficacy of our proposed approach in joint segmentation and state tracking, we
evaluate S3-DST on a proprietary anonymized open-domain dialogue dataset, as
well as publicly available DST and segmentation datasets. Across all datasets
and settings, S3-DST consistently outperforms the state-of-the-art,
demonstrating its potency and robustness for the next generation of LLM-based chat
systems.
PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers
Powerful large language models have facilitated the development of writing
assistants that promise to significantly improve the quality and efficiency of
composition and communication. However, a barrier to effective assistance is
the lack of personalization in LLM outputs to the author's communication style
and specialized knowledge. In this paper, we address this challenge by
proposing PEARL, a retrieval-augmented LLM writing assistant personalized with
a generation-calibrated retriever. Our retriever is trained to select historic
user-authored documents for prompt augmentation, such that they are likely to
best personalize LLM generations for a user request. We propose two key
novelties for training our retriever: 1) A training data selection method that
identifies user requests likely to benefit from personalization and documents
that provide that benefit; and 2) A scale-calibrating KL-divergence objective
that ensures that our retriever closely tracks the benefit of a document for
personalized generation. We demonstrate the effectiveness of PEARL in
generating personalized workplace social media posts and Reddit comments.
Finally, we showcase the potential of a generation-calibrated retriever to
double as a performance predictor and further improve low-quality generations
via LLM chaining.
Comment: Pre-print, work in progress
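To make the idea of a KL-divergence training signal for a retriever more concrete, the sketch below compares a distribution derived from hypothetical per-document benefit scores with the distribution implied by retriever scores. This is a generic distillation-style KL loss: the abstract does not spell out the scale-calibrating variant, so that part is not reproduced here, and all scores and names are invented for illustration.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Hypothetical scores for three candidate documents of one user request.
# benefit_scores: how much each document helped the LLM's generation.
# retriever_scores: what the retriever currently predicts for each document.
benefit_scores = [2.0, 0.5, -1.0]
retriever_scores = [1.5, 0.7, -0.5]

target = softmax(benefit_scores)
predicted = softmax(retriever_scores)

# The loss shrinks toward zero as the retriever's distribution approaches
# the benefit-derived target distribution.
loss = kl_divergence(target, predicted)
```

Minimizing such a loss pushes the retriever's ranking toward the documents that most improve generation, which is the general intuition behind calibrating a retriever against generation quality.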