A Bayesian Metric for Network Similarity
Networks of every kind and in numerous fields are omnipresent in today's society (e.g., brain networks, social networks) and are the subject of intense research. It would be of great utility to have a computationally efficient and generally applicable method for assessing the similarity of networks. The field (going back to the 1950s) has not come up with such a method, albeit a few moves in this direction exist, such as Jaccard coefficients, QAP (the quadratic assignment procedure), and more recently Menezes & Roth (2013) and Asta & Shalizi (2014). I present a Bayesian-based metric for assessing the similarity of two networks, possibly of different size, that include nodes and links between nodes. I assume the nodes are labeled, so that the nodes, and the links between nodes, that are shared between the two networks can be identified.

The method calculates similarity as (a monotonic transformation of) the odds that the two observed networks, termed V and W, were produced by random sampling from a single master network, termed G, as opposed to generation by two different but similar networks, termed Gv and Gw.

The simplest form of the method ignores strengths that could be assigned to nodes and links, and considers only nodes and links that are, or are not, shared by the networks. Suppose there are n_V nodes and N_V links only in V, n_W nodes and N_W links only in W, and n_c nodes and N_c links shared between the networks. Thus the number of nodes in V is n_c + n_V and the number in W is n_c + n_W; the number of unique nodes across V and W is n_c + n_V + n_W = n. The number of links in V is N_c + N_V and the number in W is N_c + N_W; the number of unique links across V and W is N_c + N_V + N_W = N.

The single master network, G, is assumed to consist of the union of the nodes and links in the two networks, and therefore has n nodes and N links. The probability that a given shared node will be randomly and independently sampled twice is [(n_V + n_c)/n][(n_W + n_c)/n]. The probability that a given shared link will be randomly and independently sampled twice is [(N_V + N_c)/N][(N_W + N_c)/N].

If there are two generating networks, I assume they each have n nodes and N links. I also assume they are similar, because we would not be comparing dissimilar networks. The degree of similarity is controlled by "tuning" parameters: Gv and Gw are assumed to share αn nodes and βN links. The probability that a given shared node will be sampled twice is then α[(n_V + n_c)/n][(n_W + n_c)/n], and the probability that a given shared link will be sampled twice is β[(N_V + N_c)/N][(N_W + N_c)/N]. The likelihood ratio λ_js for G versus (Gv, Gw) as the generator of a given shared node is then 1/α, and the likelihood ratio ρ_js for a given shared link is then 1/β.

For a non-shared node, say in V, similar reasoning gives a likelihood ratio

λ_kV = [1 − (n_W + n_c)/n] / [1 − α(n_W + n_c)/n],

and for a non-shared link a likelihood ratio

ρ_kV = [1 − (N_W + N_c)/N] / [1 − β(N_W + N_c)/N].

For a non-shared node or link in W, substitute a W subscript for the V subscript in these likelihood ratios.

Computational efficiency is a necessity if the similarity metric is to be applied to large networks. For this reason I do not calculate the exact probabilities for the numbers of shared and non-shared nodes and links that are observed (the combinatoric complexity of such calculations is enormous). Instead I make the simplifying assumption that each node and link contributes the likelihood ratios given above and that the total odds are obtained by multiplying all the likelihood ratios together.
This simplification can perhaps be justified if a similar distortion is produced for both the case of G and the case of (Gv, Gw) as generator. Under this simplifying assumption the overall odds (of the single-generator hypothesis over the two-generator hypothesis) become

φ(1/2) = (λ_js)^(n_c) (λ_kV)^(n_V) (λ_jW)^(n_W) (ρ_js)^(N_c) (ρ_kV)^(N_V) (ρ_jW)^(N_W).

Taking the log of this product converts the calculation to sums and makes computation highly efficient.

This abstract is too short to permit giving the different and more complex results that hold for the several cases in which the nodes and/or links have associated strengths, so I summarize some of the results here. The results for links and nodes are similar, so consider the results for nodes. Let there be just one set of strength values, S_i for the i-th node, normed to sum to 1.0. For generation by either G or (Gv, Gw), assume sampling is made without replacement and proportional to strength. Let Z_iV and Z_iW be the probabilities that node i will be sampled by n_V + n_c samples or by n_W + n_c samples, respectively. The Z's would be difficult to obtain analytically but could be estimated by Monte Carlo sampling. Consider two possibilities for the way that Gv and Gw overlap. In Case A the probability that a node will be shared is simply α, independent of strength. In Case B the probability that a node will be shared is an increasing function of strength, Y_i.

For Case A the likelihood ratio for a shared node i is 1/α. For a node k only in V the likelihood ratio is λ_kV = (1 − Z_kW) / [1 − α(1 − Z_kW)]. For a node only in W, exchange the V and W subscripts. The odds due to nodes are then φ_D = (1/α)^(n_c) Π_k(λ_kV) Π_j(λ_jW).

For Case B the likelihood ratio for a shared node i is 1/Y_i. For a node k only in V the likelihood ratio is λ_kV = (1 − Z_kW) / [1 − Y_k(1 − Z_kW)]. Again switch the V and W subscripts for a node only in W. The odds due to nodes are then φ_D = Π_i(1/Y_i) Π_k(λ_kV) Π_j(λ_jW).

These expressions have analogous forms for links, with different N's, Z's, and Y's, and the overall odds are, as before, the product of the odds for nodes and the odds for links.

The critical difference between Cases A and B is the degree to which evidence based on an observed shared node or link is strength dependent: for Case B this evidence rises as strength decreases. This should raise concerns: however strengths are obtained, there is likely to be measurement noise that reduces the reliability of low strength values. This might argue in favor of using Case A, or, if one preferred Case B, of restricting the Y_i values to lie above a lower bound. The idea would be to let evidence depend most on the nodes (or links) with high strength values.

It should be observed that the existence of a computationally efficient and generally applicable metric for network similarity would allow alignment of non-labeled networks: one would search for the alignment of nodes that maximizes the metric.

I have many relevant publications demonstrating some degree of expertise in Bayesian modeling (e.g., Shiffrin & Chandramouli, in press; Shiffrin, Chandramouli, & Grünwald, 2015; Chandramouli & Shiffrin, 2015; Nelson & Shiffrin, 2013; Cox & Shiffrin, 2012; Shiffrin, Lee, Kim, & Wagenmakers, 2008; Cohen, Shiffrin, Gold, Ross, & Ross, 2007; Denton & Shiffrin; Huber, Shiffrin, Lyle, & Ruys, 2001; Shiffrin & Steyvers, 1997). I note that the present results are, in a vague sense, an extension of the metric proposed for matching memory probes to memory traces given in Cox and Shiffrin (2012) and in the appendix of Nelson and Shiffrin (2013).
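To make the simplest, strength-free form of the metric concrete, here is a minimal sketch in Python. It assumes each network is given as a set of labeled nodes plus a set of links (unordered node pairs); the function name log_odds_similarity and the default tuning values alpha = beta = 0.5 are illustrative assumptions, not part of the abstract.

```python
from math import log

def log_odds_similarity(nodes_v, links_v, nodes_w, links_w, alpha=0.5, beta=0.5):
    """Log-odds that labeled networks V and W were sampled from a single
    master network G rather than from two similar generators (Gv, Gw)."""
    # Shared and non-shared counts: lower case for nodes, upper case for links.
    n_c, n_v, n_w = len(nodes_v & nodes_w), len(nodes_v - nodes_w), len(nodes_w - nodes_v)
    N_c, N_v, N_w = len(links_v & links_w), len(links_v - links_w), len(links_w - links_v)
    n, N = n_c + n_v + n_w, N_c + N_v + N_w   # size of the master network G

    def nonshared_term(count, other_count, total, tune):
        # Contribution of elements present in only one network:
        # count * log([1 - p] / [1 - tune * p]), where p = other_count / total is
        # the probability of being sampled into the other network under G.
        if count == 0:
            return 0.0
        p = other_count / total
        return count * log((1.0 - p) / (1.0 - tune * p))

    # Each shared node contributes a likelihood ratio 1/alpha, each shared link 1/beta.
    log_odds = n_c * log(1.0 / alpha) + N_c * log(1.0 / beta)
    log_odds += nonshared_term(n_v, n_w + n_c, n, alpha)   # nodes only in V
    log_odds += nonshared_term(n_w, n_v + n_c, n, alpha)   # nodes only in W
    log_odds += nonshared_term(N_v, N_w + N_c, N, beta)    # links only in V
    log_odds += nonshared_term(N_w, N_v + N_c, N, beta)    # links only in W
    return log_odds

# Toy usage: two small overlapping networks with labeled nodes.
V_nodes, W_nodes = {"a", "b", "c"}, {"b", "c", "d"}
V_links = {frozenset(("a", "b")), frozenset(("b", "c"))}
W_links = {frozenset(("b", "c")), frozenset(("c", "d"))}
print(log_odds_similarity(V_nodes, V_links, W_nodes, W_links, alpha=0.8, beta=0.8))
```

Working in log space, as the abstract notes, keeps the computation to sums over counts, so the cost is dominated by the set intersections.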
Computational Models of Classical Conditioning: A Qualitative Evaluation and Comparison
Classical conditioning is a fundamental paradigm in the study of learning, and thus in understanding cognitive processes and behaviour, for which we need comprehensive and accurate models. This paper evaluates and compares a collection of influential computational models of classical conditioning by analysing the models both on their own terms and against one another qualitatively. The results will clarify the state of the art in the area and help develop a standard model of classical conditioning.
Experience as a Free Parameter in the Cognitive Modeling of Language
To account for natural variability in cognitive processing, it is standard practice to optimize the parameters of a model to account for behavioral data. However, variability reflecting the information to which one has been exposed is usually ignored. Nevertheless, most language theories assign a large role to an individual's experience with language. We present a new way to fit language-based behavioral data that combines simple learning and processing mechanisms with optimization of the language materials. We demonstrate that benchmark fits on multiple linguistic tasks can be achieved using this method, and argue that one must account not only for the internal parameters of a model but also for the external experience that people receive when theorizing about human behavior.
Five principles for studying people's use of heuristics
Abstract: The fast and frugal heuristics framework assumes that people rely on an adaptive toolbox of simple decision strategies, called heuristics, to make inferences, choices, estimations, and other decisions. Each of these heuristics is tuned to regularities in the structure of the task environment, and each is capable of exploiting the ways in which basic cognitive capacities work. In doing so, heuristics enable adaptive behavior. In this article, we give an overview of the framework and formulate five principles that should guide the study of people's adaptive toolbox. We emphasize that models of heuristics should be (i) precisely defined; (ii) tested comparatively; (iii) studied in line with theories of strategy selection; (iv) evaluated by how well they predict new data; and (v) tested in the real world in addition to the laboratory.

Key words: fast and frugal heuristics; experimental design; model testing

As we write this article, international financial markets are in turmoil. Large banks are going bankrupt almost daily. It is a difficult situation for financial decision makers, regardless of whether they are lay investors trying to make small-scale profits here and there or professionals employed by the finance industry. To safeguard their investments, these decision makers need to be able to foresee uncertain future economic developments, such as which investments are likely to be the safest and which companies are likely to crash next. In times of rapid waves of potentially devastating financial crashes, these informed bets must often be made quickly, with little time for extensive information search or computationally demanding calculations of likely future returns. Lay stock traders in particular have to trust the contents of their memories, relying on incomplete, imperfect …
Impaired learning to dissociate advantageous and disadvantageous risky choices in adolescents
Adolescence is characterized by a surge in maladaptive risk-taking behaviors, but whether and how this relates to developmental changes in experience-based learning is largely unknown. In this preregistered study, we addressed this issue using a novel task that allowed us to separate the learning-driven optimization of risky choice behavior over time from overall risk-taking tendencies. Adolescents (12–17 years old) learned to dissociate advantageous from disadvantageous risky choices less well than adults (20–35 years old), and this impairment was stronger in early than in mid-late adolescents. Computational modeling revealed that adolescents' suboptimal performance was largely due to an inefficiency in core learning and choice processes. Specifically, adolescents used a simpler, suboptimal, expectation-updating process and a more stochastic choice policy. In addition, the modeling results suggested that adolescents, but not adults, overvalued the highest rewards. Finally, an exploratory latent-mixture model analysis indicated that a substantial proportion of the participants in each age group did not engage in experience-based learning but used a gambler's fallacy strategy, stressing the importance of analyzing individual differences. Our results help understand why adolescents tend to make more, and more persistent, maladaptive risky decisions than adults when the values of these decisions have to be learned from experience.
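The "expectation-updating process" and "stochastic choice policy" mentioned above belong to a standard class of reinforcement-learning models. The sketch below shows a generic delta-rule update paired with a softmax choice rule as an illustration of that class; it is not the authors' preregistered model, and the function names and parameter values are hypothetical.

```python
import numpy as np

def softmax_choice_probs(values, inverse_temperature):
    """Higher inverse_temperature -> more deterministic (less stochastic) choices."""
    z = inverse_temperature * np.asarray(values, dtype=float)
    z -= z.max()                      # numerical stability
    p = np.exp(z)
    return p / p.sum()

def delta_rule_update(expectation, reward, learning_rate):
    """Move the chosen option's expected value toward the observed reward."""
    return expectation + learning_rate * (reward - expectation)

# Example: two options, one advantageous on average, learned over 100 trials.
rng = np.random.default_rng(0)
expectations = [0.0, 0.0]
for _ in range(100):
    probs = softmax_choice_probs(expectations, inverse_temperature=3.0)
    choice = rng.choice(2, p=probs)
    reward = rng.normal(1.0 if choice == 0 else 0.2, 1.0)
    expectations[choice] = delta_rule_update(expectations[choice], reward, 0.1)
print(expectations)
```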
Personalized bank campaign using artificial neural networks
Internship report presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics.

Nowadays, high market competition requires banks to focus more on individual customers' behaviors. Specifically, customers prefer a personal relationship with the financial institution and want to receive exclusive offers. Thus, a successful personalized cross-sell and up-sell campaign requires knowing each client's interest in the offer. The aim of this project is to create a model that is able to identify the probability that a customer will buy a product of the bank. The strategic plan is to run a long-term personalized campaign, and the challenge is to create a model that remains accurate over that period. The source datasets consist of 12 data marts, which represent monthly snapshots of the bank's data warehouse between April 2016 and March 2017. They contain 191 original variables covering personal and transactional information, and around 1,400,000 clients each. The selected modeling technique is artificial neural networks, specifically a multilayer perceptron trained with back-propagation. The results showed that the model performs well and that the business can use it to optimize profitability. Despite the good results, the business must monitor the model's outputs to check its performance over time.
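As a rough illustration of the kind of propensity model the report describes, the sketch below trains a multilayer perceptron with back-propagation to score purchase probability. The use of scikit-learn, the synthetic feature matrix, and the hyperparameters are assumptions for illustration, not the report's actual pipeline or data.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# Placeholder data: rows = clients, columns = personal/transactional features,
# y = whether the client bought the offered product.
rng = np.random.default_rng(42)
X = rng.normal(size=(5000, 20))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=5000) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Multilayer perceptron (trained with back-propagation) behind a standard scaler.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=200, random_state=0),
)
model.fit(X_train, y_train)

# Probability that each test client buys the product, used to rank the campaign list.
propensity = model.predict_proba(X_test)[:, 1]
print("Top-decile mean propensity:", np.sort(propensity)[-len(propensity) // 10:].mean())
```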
If God Handed Us the Ground-Truth Theory of Memory, How Would We Recognize It?
What makes a scientific theory convincing? We have many formal models of human memory, but no agreement about which is the right one. If anything, we agree that they are all wrong. By analyzing the properties of successful theories in physics, I propose that we will be convinced by a theory of memory only when it is able to make precise point predictions for individual people's behavior in any new memory task, manipulation, or paradigm we could construct, without refitting parameters to do so or only by estimating its parameters for each individual on an independent standardized battery of tests. Such a theory would not only be able to accurately describe lab-based empirical effects but would also be practically useful. I highlight how some of our current model development and evaluation practices might be holding us back and outline some important empirical steps necessary to evaluate theories by this standard.