Search CORE

9,152 research outputs found

Think Globally, Act Locally: On the Optimal Seeding for Nonsubmodular Influence Maximization

Author: Schoenebeck Grant
Tao Biaoshuai
Yu Fang-Yi
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)
Publication date: 01/01/2019

We study the r-complex contagion influence maximization problem. In the influence maximization problem, one chooses a fixed number of initial seeds in a social network to maximize the spread of their influence. In the r-complex contagion model, each uninfected vertex in the network becomes infected if it has at least r infected neighbors. In this paper, we focus on a random graph model named the stochastic hierarchical blockmodel, which is a special case of the well-studied stochastic blockmodel. When the graph is not exceptionally sparse, in particular, when each edge appears with probability $\omega(n^{-(1+1/r)})$, under certain mild assumptions, we prove that the optimal seeding strategy is to put all the seeds in a single community. This matches the intuition that in a nonsubmodular cascade model placing seeds near each other creates synergy. However, it sharply contrasts with the intuition for submodular cascade models (e.g., the independent cascade model and the linear threshold model) in which nearby seeds tend to erode each other's effects. Finally, we show that this observation yields a polynomial time dynamic programming algorithm which outputs optimal seeds if each edge appears with a probability either in $\omega(n^{-(1+1/r)})$ or in $o(n^{-2})$.
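
The infection rule in this abstract (an uninfected vertex becomes infected once it has at least r infected neighbors) can be illustrated with a minimal sketch. The function name `r_complex_contagion`, the helper `complete_graph`, and the toy graph of two 4-vertex cliques joined by one edge are assumptions made for illustration only; this is not the paper's stochastic hierarchical blockmodel or its dynamic programming algorithm.

```python
from collections import deque


def r_complex_contagion(adjacency, seeds, r):
    """Return the final infected set under the r-complex contagion rule."""
    infected = set(seeds)
    # Number of infected neighbors seen so far for each vertex.
    infected_neighbors = {v: 0 for v in adjacency}
    queue = deque(infected)
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v in infected:
                continue
            infected_neighbors[v] += 1
            # A vertex becomes infected once r of its neighbors are infected.
            if infected_neighbors[v] >= r:
                infected.add(v)
                queue.append(v)
    return infected


def complete_graph(vertices):
    """Hypothetical helper: adjacency sets of a clique on the given vertices."""
    return {v: {u for u in vertices if u != v} for v in vertices}


# Toy graph (an assumption): two 4-vertex cliques joined by the edge 3-4.
adjacency = {**complete_graph(range(4)), **complete_graph(range(4, 8))}
adjacency[3].add(4)
adjacency[4].add(3)

clustered = r_complex_contagion(adjacency, {0, 1}, r=2)  # both seeds in one clique
split = r_complex_contagion(adjacency, {0, 7}, r=2)      # one seed per clique
print(sorted(clustered))  # [0, 1, 2, 3]
print(sorted(split))      # [0, 7]
```

On this toy graph, two seeds in the same clique infect that whole clique, while one seed per clique infects nothing further, which is consistent with the synergy intuition described in the abstract.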

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Optimal Scoring Rule Design

Author: Chen Yiling
Yu Fang-Yi
Publication date: 15/07/2021

This paper introduces an optimization problem for proper scoring rule design. Consider a principal who wants to collect an agent's prediction about an unknown state. The agent can either report his prior prediction or access a costly signal and report the posterior prediction. Given a collection of possible distributions containing the agent's posterior prediction distribution, the principal's objective is to design a bounded scoring rule to maximize the agent's worst-case payoff increment between reporting his posterior prediction and reporting his prior prediction. We study two settings of such optimization for proper scoring rules: static and asymptotic settings. In the static setting, where the agent can access one signal, we propose an efficient algorithm to compute an optimal scoring rule when the collection of distributions is finite. In the asymptotic setting, the agent can adaptively and indefinitely refine his prediction. We first consider a sequence of collections of posterior distributions with vanishing covariance, which emulates general estimators with large samples, and show the optimality of the quadratic scoring rule. Then, when the agent's posterior distribution is a Beta-Bernoulli process, we find that the log scoring rule is optimal. We also prove the optimality of the log scoring rule over a smaller set of functions for categorical distributions with Dirichlet priors.
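
The quadratic and log scoring rules named in this abstract are both proper: an agent maximizes expected score by reporting his true belief. The sketch below checks this numerically for a binary event. The function names, the belief value 0.7, and the grid search are assumptions made for illustration, and the code does not implement the paper's optimal-design algorithm.

```python
import math


def quadratic_score(report, outcome):
    """Quadratic (Brier-type) score; report is the stated P(outcome == 1)."""
    return 1.0 - (outcome - report) ** 2


def log_score(report, outcome):
    """Log score: log of the probability the report assigned to the outcome."""
    return math.log(report if outcome == 1 else 1.0 - report)


def expected_score(score, belief, report):
    """Expected score of a report when the agent's true belief is `belief`."""
    return belief * score(report, 1) + (1.0 - belief) * score(report, 0)


belief = 0.7  # assumed true probability of the event
grid = [i / 100 for i in range(1, 100)]  # candidate reports 0.01 .. 0.99

best_quadratic = max(grid, key=lambda q: expected_score(quadratic_score, belief, q))
best_log = max(grid, key=lambda q: expected_score(log_score, belief, q))
print(best_quadratic, best_log)  # both equal the true belief, 0.7
```

Under either rule, the best report on the grid coincides with the true belief, which is the properness condition that the design problem in the abstract takes as given.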

arXiv.org e-Print Archive

Learning and Strongly Truthful Multi-Task Peer Prediction: A Variational Approach

Author: Schoenebeck Grant
Yu Fang-Yi
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)
Publication date: 30/09/2020

Peer prediction mechanisms incentivize agents to truthfully report their signals even in the absence of verification by comparing agents' reports with those of their peers. In the detail-free multi-task setting, agents respond to multiple independent and identically distributed tasks, and the mechanism does not know the prior distribution of agents' signals. The goal is to provide an $\epsilon$-strongly truthful mechanism where truth-telling rewards agents "strictly" more than any other strategy profile (with $\epsilon$ additive error), and to do so while requiring as few tasks as possible. We design a family of mechanisms with a scoring function that maps a pair of reports to a score. The mechanism is strongly truthful if the scoring function is "prior ideal," and $\epsilon$-strongly truthful as long as the scoring function is sufficiently close to the ideal one. This reduces the above mechanism design problem to a learning problem -- specifically learning an ideal scoring function. We leverage this reduction to obtain the following three results. 1) We show how to derive good bounds on the number of tasks required for different types of priors. Our reduction applies to myriad continuous signal space settings. This is the first peer-prediction mechanism on continuous signals designed for the multi-task setting. 2) We show how to turn a soft-predictor of an agent's signals (given the other agents' signals) into a mechanism. This allows the practical use of machine learning algorithms that give good results even when many agents provide noisy information. 3) For finite signal spaces, we obtain $\epsilon$-strongly truthful mechanisms on any stochastically relevant prior, which is the maximal possible prior. In contrast, prior work only achieves a weaker notion of truthfulness (informed truthfulness) or requires stronger assumptions on the prior. Comment: 39 pages, 1 figure

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Privacy-Preserving and Outsourced Multi-User k-Means Clustering

Author: Bertino Elisa
Liu Dongxi
Rao Fang-Yu
Samanthula Bharath K.
Yi Xun
Publication date: 14/12/2014

Many techniques for privacy-preserving data mining (PPDM) have been investigated over the past decade. Often, the entities involved in the data mining process are end-users or organizations with limited computing and storage resources. As a result, such entities may want to refrain from participating in the PPDM process. To overcome this issue and to take advantage of the many other benefits of cloud computing, outsourcing PPDM tasks to the cloud environment has recently gained special attention. We consider the scenario where n entities outsource their databases (in encrypted format) to the cloud and ask the cloud to perform the clustering task on their combined data in a privacy-preserving manner. We term such a process privacy-preserving and outsourced distributed clustering (PPODC). In this paper, we propose a novel and efficient solution to the PPODC problem based on the k-means clustering algorithm. The main novelty of our solution lies in avoiding the secure division operations required in computing cluster centers altogether through an efficient transformation technique. Our solution builds the clusters securely in an iterative fashion and returns the final cluster centers to all entities when a pre-determined termination condition holds. The proposed solution protects the data confidentiality of all the participating entities under the standard semi-honest model. To the best of our knowledge, ours is the first work to discuss and propose a comprehensive solution to the PPODC problem that incurs negligible cost on the participating entities. We theoretically estimate both the computation and communication costs of the proposed protocol and also demonstrate its practical value through experiments on a real dataset. Comment: 16 pages, 2 figures, 5 tables

arXiv.org e-Print Archive

Crossref

Montclair State University Digital Commons

RMIT Research Repository

Purdue E-Pubs

Development of System Modules for Children’s Games with Vision and Music-Based Interactive Real-Time Feedback Modules - A Design-Based Research Approach

Author: Liu Fang-Yu
Tzeng Sy-Yi
Publication venue: Liverpool John Moores University
Publication date: 31/10/2023

Most past research on young children’s attention focused on the design of multimedia games based on visual stimulation. In contrast, few studies have addressed the development of teaching tools focusing on auditory stimulation. This study aims to develop a real-time interactive digital game with music and eye tracking for young children. The Design-Based Research (DBR) approach was adopted. Melodic tunes and lyrics composed by the researcher constitute the auditory stimulation, paired with visual images, in a game emphasizing interactivity between game content and players. Discussions were held among the members of the development team, during which the game developers and domain experts proposed suggestions to the researcher, who then continuously fine-tuned the game in line with the research objective. Our preliminary findings suggested that DBR, which emphasizes child-centered design, provides a novel and innovative approach to digital game design.

PATT40 (LJMU)

Three dimensional shape comparison of flexible proteins using the local-diameter descriptor

Author: Fang Yi
Liu Yu-Shen
Ramani Karthik
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Techniques for inferring the functions of proteins by comparing their shape similarity have been receiving a lot of attention. Proteins are functional units and their shape flexibility occupies an essential role in various biological processes. Several shape descriptors have demonstrated the capability of protein shape comparison by treating them as rigid bodies. But this may give rise to an incorrect comparison of flexible protein shapes. Results: We introduce an efficient approach for comparing flexible protein shapes by adapting a local diameter (LD) descriptor. The LD descriptor, developed recently to handle skeleton based shape deformations [1], is adapted in this work to capture the invariant properties of shape deformations caused by the motion of the protein backbone. Every sampled point on the protein surface is assigned a value measuring the diameter of the 3D shape in the neighborhood of that point. The LD descriptor is built in the form of a one dimensional histogram from the distribution of the diameter values. The histogram based shape representation reduces the shape comparison problem of the flexible protein to a simple distance calculation between 1D feature vectors. Experimental results indicate how the LD descriptor accurately treats the protein shape deformation. In addition, we use the LD descriptor for protein shape retrieval and compare its effectiveness with that of conventional shape descriptors. A sensitivity-specificity plot shows that the LD descriptor performs much better than the conventional shape descriptors in terms of consistency over a family of proteins and discernibility across families of different proteins. Conclusion: Our study provides an effective technique for comparing the shape of flexible proteins. The experimental results demonstrate the insensitivity of the LD descriptor to protein shape deformation. The proposed method will be potentially useful for molecule retrieval with similar shapes and rapid structure retrieval for proteins. The demos and supplemental materials are available on https://engineering.purdue.edu/PRECISE/LDD.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Purdue E-Pubs

Peer Prediction for Learning Agents

Author: Chen Yiling
Feng Shi
Yu Fang-Yi
Publication date: 27/10/2022

Peer prediction refers to a collection of mechanisms for eliciting information from human agents when direct verification of the obtained information is unavailable. They are designed to have a game-theoretic equilibrium where everyone reveals their private information truthfully. This result holds under the assumption that agents are Bayesian and they each adopt a fixed strategy across all tasks. Human agents, however, are observed in many domains to exhibit learning behavior in sequential settings. In this paper, we explore the dynamics of sequential peer prediction mechanisms when participants are learning agents. We first show that the notion of no regret alone for the agents' learning algorithms cannot guarantee convergence to the truthful strategy. We then focus on a family of learning algorithms where strategy updates only depend on agents' cumulative rewards and prove that agents' strategies in the popular Correlated Agreement (CA) mechanism converge to truthful reporting when they use algorithms from this family. This family of algorithms is not necessarily no-regret, but includes several familiar no-regret learning algorithms (e.g., multiplicative weight update and Follow the Perturbed Leader) as special cases. Simulation of several algorithms in this family as well as the $\epsilon$-greedy algorithm, which is outside of this family, shows convergence to the truthful strategy in the CA mechanism. Comment: 34 pages, 9 figures

arXiv.org e-Print Archive