
    Influence Maximization in Social Networks: A Survey

    Online social networks have become an important platform for people to communicate, share knowledge and disseminate information. Given the widespread usage of social media, individuals' ideas, preferences and behavior are often influenced by their peers or friends in the social networks they participate in. Over the last decade, the influence maximization (IM) problem has been extensively used to model the diffusion of innovations and ideas. The purpose of IM is to select a set of k seed nodes that can influence the most individuals in the network. In this survey, we present a systematic study of the research on, and future directions of, the IM problem. We review the information diffusion models and analyze a variety of algorithms for the classic IM problem. We propose a taxonomy to help readers understand the key techniques and challenges. We also organize the milestone works in chronological order so that readers can follow the research roadmap of this field. Moreover, we categorize other application-oriented IM studies and examine each of them in turn. Finally, we list a series of open questions as future directions for IM-related research, from which a reader can readily see what should be done next in this field.
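The classic greedy algorithm surveyed above repeatedly adds the seed node with the largest estimated marginal gain in spread, where spread is estimated by Monte Carlo simulation of a diffusion model such as independent cascade. A minimal sketch in Python (the graph layout, propagation probability, and trial count below are illustrative assumptions, not values from the survey):

```python
import random

def estimate_spread(graph, seeds, p=0.1, trials=200):
    """Monte Carlo estimate of expected spread under independent cascade:
    each newly activated node gets one chance to activate each neighbor
    with probability p."""
    total = 0
    for _ in range(trials):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            new = []
            for u in frontier:
                for v in graph.get(u, ()):
                    if v not in active and random.random() < p:
                        active.add(v)
                        new.append(v)
            frontier = new
        total += len(active)
    return total / trials

def greedy_im(graph, k, p=0.1):
    """Select k seeds, each time adding the node with the largest
    marginal gain in estimated spread (the classic greedy scheme)."""
    seeds = []
    for _ in range(k):
        best, best_spread = None, -1.0
        for v in graph:
            if v in seeds:
                continue
            s = estimate_spread(graph, seeds + [v], p)
            if s > best_spread:
                best, best_spread = v, s
        seeds.append(best)
    return seeds
```

Because the spread function is monotone and submodular under independent cascade, this greedy scheme carries a (1 - 1/e) approximation guarantee; many of the faster algorithms reviewed in such surveys (e.g., lazy evaluation or sketch-based estimators) accelerate exactly this loop.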

    Influence maximisation towards target users and minimal diffusion of information based on information needs

    Influence maximisation within social networks is essential to modern business. The Influence Maximisation Problem (IMP) involves selecting a minimal set of influencers that leads to maximum contagion while minimising Diffusion Cost (DC). Previous IMP models do not consider DC when spreading information towards target users. Furthermore, influencer selection for varying information needs was not considered, which leads to influence overlaps and the elimination of weak nodes. This study proposes the Information Diffusion towards Target Users (IDTU) algorithm to enhance influencer selection while minimising DC. IDTU builds on a greedy approach, using graph sketches to improve the selection of influencers that maximise influence spread to a set of target users. Moreover, influencer identification based on specific needs was implemented using a Generalised Additive Model over four fundamental centralities. Experiments employed five social network datasets (Epinions, Wiki-Vote, SlashDot, Facebook and Twitter) from the Stanford data repository. IDTU was evaluated against three greedy and six heuristic benchmark algorithms. It identified all the specified target nodes while lowering DC by up to 79%. In addition, the influence overlap problem was reduced, shrinking the seed set size by up to a factor of six on average. Results showed that selecting top influencers using a combination of metrics is effective in minimising DC and maximising contagion, by up to 77% and 32% respectively. The proposed IDTU maximises information diffusion while minimising DC, and demonstrates a more balanced and nuanced approach to influencer selection. This will be useful for business and social media marketers in leveraging their promotional activities.
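IDTU scores influencers by combining several centrality measures additively. As a rough illustration only (the actual model is an additive model over four centralities fitted on large SNAP graphs; the two centralities, equal weights, and toy graph here are assumptions):

```python
from collections import deque

def degree_centrality(graph):
    """Fraction of other nodes each node is directly connected to."""
    n = len(graph) - 1
    return {v: len(nbrs) / n for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """Reachable-node count divided by total BFS distance from each node."""
    scores = {}
    for s in graph:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[s] = (len(dist) - 1) / total if total else 0.0
    return scores

def rank_influencers(graph, w_deg=0.5, w_clo=0.5):
    """Rank nodes by an additive combination of centrality scores."""
    deg = degree_centrality(graph)
    clo = closeness_centrality(graph)
    score = {v: w_deg * deg[v] + w_clo * clo[v] for v in graph}
    return sorted(score, key=score.get, reverse=True)
```

On a star graph the hub dominates both centralities and ranks first; in the abstract's setting the weights would instead be learned per information need, so that different needs surface different influencers and overlap is reduced.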

    Machine learning in the real world with multiple objectives

    Machine learning (ML) is ubiquitous in many real-world applications. Existing ML systems are based on optimizing a single quality metric such as prediction accuracy. These metrics typically do not fully align with design constraints such as computation, latency, fairness, and acquisition costs that arise in practice. In this thesis, we develop ML methods for optimizing prediction accuracy while accounting for such real-world constraints. In particular, we introduce multi-objective learning in two different setups: resource-efficient prediction and algorithmic fairness in language models. First, we focus on decreasing the test-time computational costs of prediction systems. Budget constraints arise in many machine learning problems. Computational costs limit the usage of many models on small devices such as IoT sensors or mobile phones and increase the energy consumption in cloud computing. We design systems that allow on-the-fly modification of the prediction model for each input sample. These sample-adaptive systems allow us to leverage the wide variability in sample complexity: we learn policies for selecting cheap models for low-complexity instances and using descriptive models only for complex ones. We take a multi-objective approach in which one minimizes the system cost while preserving predictive accuracy. We demonstrate significant speed-ups in computer vision, structured prediction, natural language processing, and deep learning. In the context of fairness, we first demonstrate that a naive application of ML methods runs the risk of amplifying social biases present in data. This danger is particularly acute for methods based on word embeddings, which are increasingly gaining importance in many natural language processing applications of ML. We show that word embeddings trained on Google News articles exhibit female/male gender stereotypes. 
We demonstrate that, geometrically, gender bias is captured by particular directions in the word embedding vector space. To remove bias, we formulate an empirical risk objective with fairness constraints that removes stereotypes from embeddings while maintaining desired associations. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts.
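The geometric intuition, that a gender direction can be estimated from a definitional pair and projected out, fits in a few lines. This sketch shows only the projection step, not the constrained empirical-risk formulation the thesis actually optimizes, and the 3-dimensional toy vectors are invented for illustration:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    norm = dot(u, u) ** 0.5
    return [a / norm for a in u]

def gender_direction(she_vec, he_vec):
    """Estimate a unit bias direction from one definitional word pair."""
    return normalize([a - b for a, b in zip(she_vec, he_vec)])

def debias(word_vec, direction):
    """Remove the component of word_vec along the unit bias direction."""
    proj = dot(word_vec, direction)
    return [a - proj * d for a, d in zip(word_vec, direction)]
```

After this projection a debiased vector has zero component along the estimated direction; deciding which associations to erase and which to keep (she/he should remain gendered, professions should not) is exactly what the thesis's fairness constraints encode beyond this blunt step.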

    PROGRAM REVIEW 1993: Self Study Report Department of Biometry

    The CSRS review team applauds the statistical expertise of the Biometry Department, which began as the Statistical Laboratory in 1957 and culminated in the current academic Department of Biometry. This growth has been highlighted by a significant increase in the number of faculty and staff, the initiation of a Master of Science program, and the provision of graduate assistant stipends. With the presence of seven faculty, the imminent increase from seven to fifteen graduate students, the establishment of statistical consulting with numerous IANR faculty, and the diverse research and teaching expertise of the faculty, the department is poised to provide greater service to the University. Future goals may include: (1) establishment of a statistical department worthy of national recognition by joining the faculties of the Department of Biometry and the Division of Statistics from the Department of Mathematics and Statistics, and (2) the formation of a PhD program in Statistics that encompasses biometry and theoretical statistics. It is apparent that the faculty is capable of conducting statistical research of a more theoretical nature. However, securing research grants as principal investigators or as co-investigators with other UNL faculty is required to fully support those research endeavors. With successful grant activity, a greater portion of research results should be published in statistical journals. Based on discussions, consultations on experimental design and data analysis are appropriate and much appreciated by IANR faculty. While personal consultations have been highly beneficial, the initiation of a Help Desk provides rapid and accurate responses to straightforward statistical questions, thereby freeing the Biometry faculty for personal consultation on more complex statistical issues. The Help Desk also provides valuable, real-world training for graduate students. 
Courses taught as a service to undergraduate and graduate IANR students appear to be appropriate in number and content. With a master's program successfully started, more formal policies for recruitment, selection, advising, and placement should be initiated. Further attention is required to provide space, computers, and advisers for graduate students. Faculty expressed appreciation for the strong support provided to Biometry by the administration, from the Head to the Deans and Vice Chancellor. The team, however, notes several management concerns, including: the lack of faculty meetings; inadequate communication among the Head, faculty, and graduate students; the need for continual curriculum improvement; and the lack of sufficient office and laboratory space. Concern is also raised about the potential over-commitment to international consulting at the expense of performing departmental functions. The team is reluctant to recommend the immediate initiation of a PhD program in the Biometry Department. Establishing a successful master's program before pursuing the doctoral program appears prudent. The merger of the two UNL statistical groups into one department would position UNL for a strong PhD program.

    Three fundamental pillars of decision-centered teamwork

    This thesis introduces a novel paradigm in artificial intelligence: decision-centered teamwork. Decision-centered teamwork is the analysis of agent teams that iteratively take joint decisions to solve complex problems. Although teams of agents have been used to take decisions in many important domains, such as machine learning, crowdsourcing, forecasting systems, and even board games, a general framework for decision-centered teamwork has never been presented in the literature before. I divide decision-centered teamwork into three fundamental challenges: (i) Agent Selection, which consists of selecting a set of agents from an exponential universe of possible teams; (ii) Aggregation of Opinions, which consists of designing methods to aggregate the opinions of different agents into joint team decisions; (iii) Team Assessment, which consists of designing methods to identify whether a team is failing, allowing a “coordinator” to take remedial action. In this thesis, I address all three challenges. For Agent Selection, I introduce novel models of diversity for teams of voting agents. My models rigorously show that teams made of the best agents are not necessarily optimal, and also clarify in which situations diverse teams should be preferred. In particular, I show that diverse teams get stronger as the number of actions increases, by analyzing how the agents' probability distribution over actions changes. This has never been shown before in the ensemble systems literature. I also show that diverse teams have great applicability for design problems, where the objective is to maximize the number of optimal solutions presented for human selection, combining for the first time social choice with number theory. All of these theoretical models and predictions are verified in real systems, such as Computer Go and architectural design. 
In particular, for architectural design I optimize the design of buildings with agent teams not only for cost and project requirements, but also for energy efficiency, making it an essential domain for sustainability. Concerning Aggregation of Opinions, I evaluate classical ranked voting rules from social choice in Computer Go, only to discover that plurality leads to the best results. This happens because real agents tend to have very noisy rankings. Hence, I create a ranking-by-sampling extraction technique, leading to significantly better results with the Borda voting rule. A similar study is also performed in the social networks domain, in the context of influence maximization. Additionally, I study a novel problem in social networks: I assume only a subgraph of the network is initially known, and we must spread influence and learn the graph simultaneously. I analyze a linear combination of two greedy algorithms, which outperforms both of them. This domain has great potential for health applications, as I run experiments on four real-life social networks from the homeless population of Los Angeles, aiming at spreading HIV-prevention information. Finally, with regard to Team Assessment, I develop a domain-independent team assessment methodology for teams of voting agents. My method operates within a machine learning framework and learns a prediction model over the voting patterns of a team, instead of learning over the possible states of the problem. The methodology is tested and verified in Computer Go and ensemble learning.
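The plurality and Borda rules compared in the Computer Go experiments can be stated compactly; the voting profile in the test below is hypothetical, chosen so that the two rules disagree:

```python
from collections import Counter

def plurality(rankings):
    """Winner is the action ranked first by the most agents."""
    return Counter(r[0] for r in rankings).most_common(1)[0][0]

def borda(rankings):
    """Each action earns (m - 1 - position) points in each agent's ranking,
    where m is the number of actions; the highest total wins."""
    m = len(rankings[0])
    scores = Counter()
    for r in rankings:
        for pos, action in enumerate(r):
            scores[action] += m - 1 - pos
    return scores.most_common(1)[0][0]
```

Borda uses the full rankings, so noisy tails drag it down, which matches the thesis's observation that plurality beats naive Borda on real agents, and motivates extracting rankings by sampling rather than trusting the agents' stated orders.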

    Comparative Summarization of Document Collections

    Comparing documents is an important task that helps us understand the differences between them. Examples of document comparison include comparing laws on the same subject matter in different jurisdictions, or comparing the specifications of similar products from different manufacturers. The need for comparison does not stop at individual documents; it extends to large collections of documents, for example comparing an author's writing style early versus late in life, identifying linguistic and lexical patterns of different political ideologies, or discovering commonalities of political arguments across disparate events. Comparing large document collections calls for automated algorithms. Every day a huge volume of documents is produced in social and news media. There has been much research on summarizing an individual document, such as a news article, or a document collection, such as a set of news articles on a related topic or event. However, comparatively summarizing different document collections, or comparative summarization, is an under-explored problem in terms of methodology, datasets, evaluation and applicability in different domains. To address this, this thesis makes three types of contributions to comparative summarization: methodology, datasets and evaluation, and empirical measurements on a range of settings where comparative summarization can be applied. We propose a new formulation of the comparative summarization problem as competing binary classifiers. This formulation helps us develop new unsupervised and supervised methods for comparative summarization. Our methods are based on Maximum Mean Discrepancy (MMD), a metric that measures the distance between two sets of data points (or documents). 
The unsupervised methods incorporate information coverage, information diversity and discriminativeness of the prototypes based on a global model of sentence-sentence similarity, and can be optimized with greedy and gradient methods. We show the efficacy of the approach in summarizing a long-running news topic over time. Our supervised method improves on the unsupervised methods: it can learn the importance of prototypes from surface features (e.g., position, length, presence of cue words) and combine different text feature representations. It meets or exceeds state-of-the-art performance on benchmark datasets. We design new scalable automatic and crowd-sourced extrinsic evaluations of comparative summaries for cases where human-written ground-truth summaries are not available. To evaluate our methods, we develop two new datasets on controversial news topics, CONTROVNEWS2017 and NEWS2019+BIAS, which we use in different experiments. We use CONTROVNEWS2017, which consists of news articles on controversial topics, to evaluate our unsupervised methods in summarizing over time. We use NEWS2019+BIAS, which also consists of news articles on controversial topics, along with media-bias labels, to empirically study the applicability of our methods. Finally, we measure the distinguishability and summarizability of document collections to quantify the applicability of our methods in different domains. We measure these metrics on the newly curated NEWS2019+BIAS dataset, comparing articles over time and across ideological leanings of media outlets. First, we observe that summarizability is proportional to distinguishability, and identify groups of articles that are more or less distinguishable. Second, distinguishability and summarizability both improve when the document representation is chosen to suit the comparison being made, whether over time or across ideological leanings of media outlets. 
We also apply the comparative summarization method to the task of comparing stances in the social media domain.
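MMD, the metric underlying these methods, compares two collections through pairwise kernel similarities: it is near zero when the two sets of document vectors are indistinguishable and grows with their discrepancy. A self-contained sketch with an RBF kernel (the kernel choice, bandwidth, and toy 2-d vectors are illustrative assumptions; the thesis applies MMD to sentence representations):

```python
import math

def rbf_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) similarity between two vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def mmd_squared(X, Y, gamma=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy:
    within-set similarities of X and Y minus twice their cross-set similarity."""
    def mean_kernel(A, B):
        return sum(rbf_kernel(a, b, gamma) for a in A for b in B) / (len(A) * len(B))
    return mean_kernel(X, X) + mean_kernel(Y, Y) - 2 * mean_kernel(X, Y)
```

A comparative summary can then be scored by how well its prototypes keep MMD low against their own collection while staying far, in MMD terms, from the competing collection.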

    Understanding and enabling nutrition and agriculture linkages: development and implementation of home-grown school feeding in Nepal

    Providing nutritionally balanced diets through ecologically sustainable and equitable food systems is one of the most profound challenges facing us today. The current state of food and nutrition security is in many ways a legacy of the green revolution and a neoliberal, market-based political economy. Technocratic and market-based approaches have contributed to creating a highly homogenised food system at the expense of diversity, ecological sustainability and nutritional quality. The origin of agriculture around 10,000 years ago and the processes of domestication provide useful insights into the key drivers of food production that influence policy and programmes even today. More importantly, there is compelling evidence showing how the transition to agriculture adversely impacted human health in a wide range of contexts. The study is an action research project primarily based on the design, implementation and evaluation of ‘Home Grown School Feeding’ in eight districts across the three main agroecological zones of Nepal. It provides important policy and programmatic evidence on enabling decentralised food systems which are nutritionally and ecologically sensitive, as part of a government-led universal food-based safety-net project. Based on an action research inquiry process, the thesis develops concepts and theories through its chapters to contribute to our understanding of food systems and programme design. The intervention creates an effective platform for food system mediation through different pathways. Evidence on intervention governance through a ‘food sovereignty’ lens demonstrates how HGSF interventions can also promote equity in food systems in terms of policies, funding and knowledge. COVID-19 pandemic control measures have contributed to undermining food and nutrition security, with the poorest hit hardest and young children potentially facing life-long consequences. 
Overall, the evidence from the thesis, including the recent COVID-19 crisis, highlights the importance of resilient and context-sensitive food production, and is an emphatic reminder of the need for an integrated public health-nutrition-ecology approach to food systems.

    Personnel Scheduling and Routing for Workers with Multiple Qualifications

    In workforce routing and scheduling there are many applications in which differently skilled workers must perform jobs that occur at different locations, where each job requires a particular combination of skills. In many such applications, a group of workers must be sent out to provide all the skills a job requires. Examples are found in maintenance operations, the construction sector, health care operations, and consultancies. In this thesis, we analyze the combined problem of composing worker groups (teams) and routing these teams under service, fairness, and cost objectives. We develop mathematical optimization models and heuristic solution methods for both an integrated solution and a sequential solution of the teaming and routing subproblems. Computational experiments are conducted to identify the tradeoff between solution quality and computational effort.
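The teaming subproblem, covering a job's required skill combination with few workers, is a set-cover-style problem. A minimal greedy sketch (the worker names, skill sets, and greedy rule are invented for illustration; the thesis develops exact optimization models and richer heuristics):

```python
def greedy_team(workers, required_skills):
    """Greedy set-cover heuristic for the teaming subproblem:
    repeatedly add the worker covering the most still-missing skills."""
    missing = set(required_skills)
    team = []
    while missing:
        best = max(workers, key=lambda w: len(workers[w] & missing))
        gained = workers[best] & missing
        if not gained:
            raise ValueError("required skills cannot be covered")
        team.append(best)
        missing -= gained
    return team
```

In the sequential approach this team would then be handed to a routing step; the integrated approach instead chooses team and route jointly, trading solution quality against computational effort.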