Catching a Viral Video

Authors: Broxton T, Interian Yannet, Vaver J, Wattenhofer M
Publication venue: USF Scholarship: a digital repository @ Gleeson Library | Geschke Center
Publication date: 01/01/2013

The sharing and re-sharing of videos on social sites, blogs, e-mail, and other means has given rise to the phenomenon of viral videos - videos that become popular through internet sharing. In this paper we seek to better understand viral videos on YouTube by analyzing sharing and its relationship to video popularity using millions of YouTube videos. The socialness of a video is quantified by classifying the referrer sources for video views as social (e.g. an emailed link, Facebook referral) or non-social (e.g. a link from related videos). We find that viewership patterns of highly social videos are very different from those of less social videos. For example, the highly social videos rise to, and fall from, their peak popularity more quickly than less social videos. We also find that not all highly social videos become popular, and not all popular videos are highly social. By using our insights on viral videos we are able to develop a method for ranking blogs and websites on their ability to spread viral videos.
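The referrer classification described in the abstract can be illustrated with a minimal sketch. It assumes that socialness is the fraction of a video's views that arrive through social referrers; the abstract does not state the exact formula, and the referrer labels and the `socialness` function name are hypothetical.

```python
# Hypothetical referrer labels; the abstract only gives "emailed link" and
# "Facebook referral" as social examples and "related videos" as non-social.
SOCIAL_REFERRERS = {"email", "facebook"}


def socialness(view_referrers):
    """Return the fraction of views whose referrer is classified as social.

    Assumption: socialness is a simple ratio of social views to all views.
    """
    if not view_referrers:
        return 0.0
    social_views = sum(1 for r in view_referrers if r in SOCIAL_REFERRERS)
    return social_views / len(view_referrers)


if __name__ == "__main__":
    views = ["email", "facebook", "related_videos", "search"]
    print(socialness(views))
```

Under this assumption, a video whose views come mostly from related-video links would score near 0, and one spread mainly by e-mail and Facebook would score near 1.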

Crossref

University of San Francisco

Recommended from our members

Expert-augmented machine learning.

Authors: Auerbach Andrew, Delgado Elier, Eaton Eric, Friedman Jerome H, Gennatas Efstathios D, Interian Yannet, Luna José Marcio, Pirracchio Romain, Reichmann Lara G, Simone Charles B, Solberg Timothy D, Ungar Lyle H, Valdes Gilmer, van der Laan Mark J
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Machine learning is proving invaluable across disciplines. However, its success is often limited by the quality and quantity of available data, while its adoption is limited by the level of trust afforded by given models. Human vs. machine performance is commonly compared empirically to decide whether a certain task should be performed by a computer or an expert. In reality, the optimal learning strategy may involve combining the complementary strengths of humans and machines. Here, we present expert-augmented machine learning (EAML), an automated method that guides the extraction of expert knowledge and its integration into machine-learned models. We used a large dataset of intensive-care patient data to derive 126 decision rules that predict hospital mortality. Using an online platform, we asked 15 clinicians to assess the relative risk of the subpopulation defined by each rule compared to the total sample. We compared the clinician-assessed risk to the empirical risk and found that, while clinicians agreed with the data in most cases, there were notable exceptions where they overestimated or underestimated the true risk. Studying the rules with the greatest disagreement, we identified problems with the training data, including one miscoded variable and one hidden confounder. Filtering the rules based on the extent of disagreement between clinician-assessed risk and empirical risk, we improved performance on out-of-sample data and were able to train with less data. EAML provides a platform for automated creation of problem-specific priors, which help build robust and dependable machine-learning models in critical applications.
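The rule-filtering step mentioned in the abstract might look roughly like the sketch below. It assumes the disagreement is measured as the absolute difference between the two risk values on a common scale and that rules above a threshold are dropped; the abstract does not specify the actual measure, and the `Rule` class, `filter_rules` function, and `max_disagreement` parameter are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    empirical_risk: float  # risk estimated from the training data
    clinician_risk: float  # risk assessed by clinicians (same scale assumed)


def filter_rules(rules, max_disagreement):
    """Keep rules whose clinician and empirical risks differ by at most the threshold.

    Assumption: disagreement is the absolute difference of the two risks.
    """
    return [
        r for r in rules
        if abs(r.empirical_risk - r.clinician_risk) <= max_disagreement
    ]


if __name__ == "__main__":
    rules = [
        Rule("rule_a", empirical_risk=0.20, clinician_risk=0.25),
        Rule("rule_b", empirical_risk=0.10, clinician_risk=0.60),
    ]
    print([r.name for r in filter_rules(rules, max_disagreement=0.1)])
```

Rules that fail the filter are the ones worth inspecting by hand, since the abstract reports that high-disagreement rules exposed a miscoded variable and a hidden confounder.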

eScholarship - University of California

Approximation algorithm for random MAX-k-SAT

Author: Yannet Interian

We provide a rigorous analysis of a greedy approximation algorithm for the maximum random k-SAT (MAX-R-kSAT) problem. The algorithm assigns variables one at a time in a predefined order. A variable is assigned TRUE if it occurs more often positively than negatively; otherwise, it is assigned FALSE. After each variable assignment, the problem instance is simplified and a new variable is selected. We show that this algorithm gives a 10/9.5-approximation, improving over the 9/8-approximation given by de la Vega and Karpinski [7]. The new approximation ratio is achieved by using a different algorithm than the one proposed in [7], along with a new upper bound on the maximum number of clauses that can be satisfied in a random k-SAT formula [2].
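The greedy procedure described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the predefined order is taken to be 1..n, literals are signed integers, polarity is counted over clauses not yet satisfied, and the function names (`random_ksat`, `greedy_assign`, `count_satisfied`) are hypothetical.

```python
import random


def random_ksat(num_vars, num_clauses, k, rng):
    """Draw clauses with k distinct variables, each negated with probability 1/2."""
    formula = []
    for _ in range(num_clauses):
        variables = rng.sample(range(1, num_vars + 1), k)
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula


def greedy_assign(formula, num_vars):
    """Assign variables 1..n by majority polarity, simplifying after each step."""
    active = [list(c) for c in formula]  # clauses not yet satisfied or falsified
    assignment = {}
    satisfied = 0
    for var in range(1, num_vars + 1):
        pos = sum(1 for c in active if var in c)
        neg = sum(1 for c in active if -var in c)
        value = pos > neg  # TRUE only if strictly more positive occurrences
        assignment[var] = value
        true_lit = var if value else -var
        remaining = []
        for c in active:
            if true_lit in c:
                satisfied += 1  # clause satisfied: remove it from the instance
            else:
                reduced = [lit for lit in c if lit != -true_lit]
                if reduced:  # a clause reduced to empty is falsified and dropped
                    remaining.append(reduced)
        active = remaining
    return assignment, satisfied


def count_satisfied(formula, assignment):
    """Count clauses containing at least one literal made true by the assignment."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in c) for c in formula
    )


if __name__ == "__main__":
    rng = random.Random(0)
    formula = random_ksat(num_vars=50, num_clauses=300, k=3, rng=rng)
    assignment, satisfied = greedy_assign(formula, 50)
    print(satisfied, "of", len(formula), "clauses satisfied")
```

The `count_satisfied` helper is only a cross-check that the running count kept during simplification matches a direct evaluation of the final assignment.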

CiteSeerX

Computing Genomic Midpoints

Authors: Richard Durrett, Yannet Interian

This paper proposes a new algorithm for the genomic median problem that combines greedy and stochastic search. Our computational experiments suggest that for more complex problems our algorithm finds better solutions than previous approaches. In particular, we find an improved midpoint for a human-mouse-rat comparison with 424 markers. In order to understand why such problems are hard, we explore a phase transition in the complexity of the median problem for random data, associated with the emergence of a giant component in the breakpoint graph.

CiteSeerX