9 research outputs found
Disentanglement of Correlated Factors via Hausdorff Factorized Support
A grand goal in deep learning research is to learn representations capable of
generalizing across distribution shifts. Disentanglement is one promising
direction aimed at aligning a model's representations with the underlying
factors generating the data (e.g. color or background). Existing
disentanglement methods, however, rely on an often unrealistic assumption: that
factors are statistically independent. In reality, factors (like object color
and shape) are correlated. To address this limitation, we propose a relaxed
disentanglement criterion - the Hausdorff Factorized Support (HFS) criterion -
that encourages a factorized support, rather than a factorial distribution, by
minimizing a Hausdorff distance. This allows for arbitrary distributions of the
factors over their support, including correlations between them. We show that
the use of HFS consistently facilitates disentanglement and recovery of
ground-truth factors across a variety of correlation settings and benchmarks,
even under severe training correlations and correlation shifts, in some cases
with over +60% relative improvement over existing disentanglement methods. In
addition, we find that leveraging HFS for representation learning can even
facilitate transfer to downstream tasks such as classification under
distribution shifts. We hope our original approach and positive empirical
results inspire further progress on the open problem of robust generalization.
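As a rough illustration of the idea (not the authors' implementation), the sketch below estimates a pairwise factorized-support penalty from a mini-batch of latent codes: for every pair of latent dimensions it measures the Hausdorff distance between the observed joint support and the Cartesian product of the marginal supports. The function name is hypothetical and the naive O(n^3)-per-pair computation is for clarity only.

```python
import numpy as np

def hfs_penalty(z: np.ndarray) -> float:
    """Sketch of a Hausdorff Factorized Support (HFS)-style penalty.

    z: (batch, dims) array of latent codes from a mini-batch.
    For every pair of latent dimensions, compare the observed joint
    support with the Cartesian product of the marginal supports; a
    factorized support keeps this distance small even when the factors
    are correlated in density.
    """
    n, d = z.shape
    penalty = 0.0
    for i in range(d):
        for j in range(i + 1, d):
            joint = np.stack([z[:, i], z[:, j]], axis=1)            # observed joint support
            gi, gj = np.meshgrid(z[:, i], z[:, j], indexing="ij")   # product of marginal supports
            product = np.stack([gi.ravel(), gj.ravel()], axis=1)
            # Directed Hausdorff distance from the product set to the joint set;
            # the reverse direction is zero because every observed pair already
            # lies in the product of its marginals.
            dists = np.linalg.norm(product[:, None, :] - joint[None, :, :], axis=-1)
            penalty += dists.min(axis=1).max()
    return penalty
```

In a training loop such a term would be added, suitably weighted, to the usual reconstruction objective of the encoder being regularized.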
Causal Structure Learning Supervised by Large Language Model
Causal discovery from observational data is pivotal for deciphering complex
relationships. Causal Structure Learning (CSL), which focuses on deriving
causal Directed Acyclic Graphs (DAGs) from data, faces challenges due to vast
DAG spaces and data sparsity. The integration of Large Language Models (LLMs),
recognized for their causal reasoning capabilities, offers a promising
direction to enhance CSL by infusing it with knowledge-based causal inferences.
However, existing approaches utilizing LLMs for CSL have encountered issues,
including unreliable constraints from imperfect LLM inferences and the
computational intensity of full pairwise variable analyses. In response, we
introduce the Iterative LLM Supervised CSL (ILS-CSL) framework. ILS-CSL
innovatively integrates LLM-based causal inference with CSL in an iterative
process, refining the causal DAG using feedback from LLMs. This method not only
utilizes LLM resources more efficiently but also generates more robust and
high-quality structural constraints compared to previous methodologies. Our
comprehensive evaluation across eight real-world datasets demonstrates
ILS-CSL's superior performance, setting a new standard in CSL efficacy and
showcasing its potential to significantly advance the field of causal
discovery. The code is available at
\url{https://github.com/tyMadara/ILS-CSL}.
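The iterative supervision loop can be pictured roughly as follows. This is a schematic sketch, not the released code: run_csl and llm_judge are hypothetical placeholders for a constraint-aware structure learner and an LLM prompt wrapper.

```python
from typing import Callable, Set, Tuple

Edge = Tuple[str, str]

def iterative_llm_supervised_csl(data,
                                 run_csl: Callable[..., Set[Edge]],
                                 llm_judge: Callable[[str, str], str],
                                 max_iters: int = 5) -> Set[Edge]:
    """Schematic loop: learn a DAG, ask the LLM only about the learned edges,
    turn contradictions into forbidden-edge constraints, and re-learn."""
    forbidden: Set[Edge] = set()
    dag: Set[Edge] = set()
    for _ in range(max_iters):
        dag = run_csl(data, forbidden=forbidden)       # data-driven structure learning
        contradicted = {(a, b) for a, b in dag
                        if llm_judge(a, b) in (f"{b}->{a}", "none")}
        if not contradicted:                           # LLM agrees with every edge: stop
            break
        forbidden |= contradicted                      # refine constraints and iterate
    return dag
```

Querying only the edges of the current DAG, rather than all variable pairs, is what keeps the number of LLM calls manageable in this scheme.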
Probabilistic Models of Motor Production
N. Bernstein defined the ability of the central nervous system (CNS) to control the many degrees of freedom of a physical body, with all its redundancy and flexibility, as the main problem in motor control. He pointed out that man-made mechanisms usually have one, sometimes two degrees of freedom (DOF); when the number of DOF increases further, it becomes prohibitively hard to control them. The brain, however, seems to perform such control effortlessly. He suggested how the brain might deal with this: when a motor skill is being acquired, the brain artificially limits the degrees of freedom, leaving only one or two. As the skill level increases, the brain gradually "frees" the previously fixed DOF, applying control when needed and in the directions which have to be corrected, eventually arriving at a control scheme where all the DOF are "free". This approach of reducing the dimensionality of motor control remains relevant even today.
One of the possible solutions to Bernstein's problem is the hypothesis of motor primitives (MPs) - small building blocks that constitute complex movements and facilitate motor learning and task completion. Just as in the visual system, a homogeneous hierarchical architecture built of similar computational elements may be beneficial.
When studying an object as complicated as the brain, it is important to define at which level of detail one works and which questions one aims to answer. David Marr suggested three levels of analysis: 1. computational, analysing which problem the system solves; 2. algorithmic, questioning which representation the system uses and which computations it performs; 3. implementational, finding how such computations are performed by neurons in the brain. In this thesis we stay at the first two levels, seeking the basic representation of motor output.
In this work we present a new model of motor primitives that comprises multiple interacting latent dynamical systems, and give it a full Bayesian treatment. Modelling within the Bayesian framework, in my opinion, must become the new standard in hypothesis testing in neuroscience. Only the Bayesian framework gives us guarantees when dealing with the inevitable plethora of hidden variables and uncertainty.
The special type of coupling of dynamical systems we proposed, based on the Product of Experts, has many natural interpretations in the Bayesian framework. If the dynamical systems run in parallel, it yields Bayesian cue integration. If they are organized hierarchically due to serial coupling, we get hierarchical priors over the dynamics. If one of the dynamical systems represents a sensory state, we arrive at sensory-motor primitives. The compact representation that follows from the variational treatment allows learning of a library of motor primitives. Once primitives are learned separately, a combined motion can be represented as a matrix of coupling values.
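For intuition, a minimal sketch of the parallel case: when each "expert" dynamical system makes a Gaussian prediction about the shared latent state, their Product-of-Experts combination is again Gaussian with summed precisions, which is exactly Bayesian cue integration. The function below is illustrative and not taken from the thesis.

```python
import numpy as np

def product_of_gaussian_experts(means, variances):
    """Combine Gaussian expert predictions as a Product of Experts:
    precisions add, and the mean is the precision-weighted average."""
    precisions = 1.0 / np.asarray(variances, dtype=float)
    combined_var = 1.0 / precisions.sum()
    combined_mean = combined_var * (precisions * np.asarray(means, dtype=float)).sum()
    return combined_mean, combined_var

# e.g. two experts predicting the next latent state
print(product_of_gaussian_experts([0.2, 0.6], [0.1, 0.4]))  # -> (0.28, 0.08)
```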
We performed a set of experiments to compare different models of motor primitives. In a series of 2-alternative forced choice (2AFC) experiments, participants discriminated between natural and synthesised movements, effectively running a graphics Turing test. When available, the Bayesian model score predicted the naturalness of the perceived movements. For simple movements, like walking, Bayesian model comparison and psychophysics tests indicate that one dynamical system is sufficient to describe the data. For more complex movements, like walking and waving, motion can be better represented as a set of coupled dynamical systems. We also experimentally confirmed that the Bayesian treatment of model learning on motion data is superior to a simple point estimate of the latent parameters. Experiments with non-periodic movements show that they do not benefit from more complex latent dynamics, despite having high kinematic complexity.
By having fully Bayesian models, we could quantitatively disentangle the influence of motion dynamics and pose on the perception of naturalness. We confirmed that rich and correct dynamics are more important than the kinematic representation.
There are numerous further directions of research. In the models we devised for multiple body parts, even though the latent dynamics was factorized into a set of interacting systems, the kinematic parts were completely independent. Thus, interaction between the kinematic parts could be mediated only by interactions in the latent dynamics. A more flexible model would also allow dense interaction at the kinematic level.
Another important problem relates to the representation of time in Markov chains. Discrete-time Markov chains are an approximation to continuous dynamics. Since the time step is assumed to be fixed, we face the problem of time-step selection. Time is also not an explicit parameter in Markov chains, which prohibits explicitly optimizing over time as a parameter and reasoning (inference) about it. For example, in optimal control the boundary conditions are usually set at exact time points, which is not an ecological scenario, where time is usually itself a parameter of optimization. Making time an explicit parameter of the dynamics may alleviate this problem.
Sparse Randomized Shortest Paths Routing with Tsallis Divergence Regularization
This work elaborates on the important problem of (1) designing optimal
randomized routing policies for reaching a target node t from a source node s
on a weighted directed graph G and (2) defining distance measures between nodes
interpolating between the least cost (based on optimal movements) and the
commute-cost (based on a random walk on G), depending on a temperature
parameter T. To this end, the randomized shortest path formalism (RSP,
[2,99,124]) is rephrased in terms of Tsallis divergence regularization, instead
of Kullback-Leibler divergence. The main consequence of this change is that the
resulting routing policy (local transition probabilities) becomes sparser when
T decreases, therefore inducing a sparse random walk on G converging to the
least-cost directed acyclic graph when T tends to 0. Experimental comparisons
on node clustering and semi-supervised classification tasks show that the
derived dissimilarity measures based on expected routing costs provide
state-of-the-art results. The sparse RSP is therefore a promising model of
movements on a graph, balancing sparse exploitation and exploration in an
optimal way.
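To see why Tsallis regularization yields sparsity, consider the quadratic (q = 2) special case, in which the regularized choice of local transition probabilities reduces to a sparsemax projection of the negated, temperature-scaled edge costs: expensive edges receive exactly zero probability, whereas Kullback-Leibler regularization would produce a softmax with full support. This is an illustrative sketch with made-up costs, not the paper's algorithm.

```python
import numpy as np

def sparsemax(scores: np.ndarray) -> np.ndarray:
    """Euclidean projection of a score vector onto the probability simplex."""
    z = np.sort(scores)[::-1]                 # scores in decreasing order
    cumsum = np.cumsum(z)
    k = np.arange(1, len(z) + 1)
    support = z * k > cumsum - 1              # positions kept in the support
    k_max = k[support][-1]
    tau = (cumsum[support][-1] - 1) / k_max   # threshold
    return np.maximum(scores - tau, 0.0)

# Local transition probabilities toward the neighbours of a node:
# scores are negated edge costs scaled by a temperature T (illustrative values).
T = 0.5
edge_costs = np.array([1.0, 1.2, 3.0, 5.0])
print(sparsemax(-edge_costs / T))   # expensive edges receive exactly zero mass
```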
Proposal of an architecture for a hybrid expert system and the corresponding knowledge-acquisition methodology
Doctoral thesis - Universidade Federal de Santa Catarina, Centro Tecnológico
A Quantitative and High-Throughput Approach to Gene Regulation in Escherichia coli
Measurements in biology have reached a level of precision that demands quantitative modeling. This is particularly true in the field of gene regulation, where concepts from physics such as thermodynamics have allowed for accurate models to be made.
Many issues remain. DNA sequencing is routine enough to sequence new genomes in days and cheap enough to use deep sequencing for precision measurements, but our ability to interpret the wealth of genomic data is lagging behind, especially in the realm of gene regulation. The primary reason is that we lack any information whatsoever about the basic regulatory details of approximately 65 percent of operons, even in E. coli, the best-understood organism in biology. As a result, we cannot use our hard-won modeling efforts to understand any of these operons.
This work takes steps to address these issues. First we use 30 LacI mutants as a test case to prove that we can make quantitatively accurate models of gene expression and sequence-dependent binding energies of transcription factors and RNA polymerase.
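For concreteness, the kind of thermodynamic model meant here can be sketched as the textbook fold-change for simple repression, with a repressor competing against the non-specific genomic background. The parameter values in the example are illustrative, not fitted results from this work.

```python
import numpy as np

def fold_change_simple_repression(repressors, delta_eps_rd, n_ns=4.6e6):
    """Thermodynamic-model fold-change for simple repression:
        fold-change = 1 / (1 + (R / N_NS) * exp(-delta_eps_rd))
    repressors   : repressor copy number per cell (R)
    delta_eps_rd : repressor-operator binding energy in units of k_B T
    n_ns         : number of non-specific binding sites (~ E. coli genome length)
    """
    return 1.0 / (1.0 + (repressors / n_ns) * np.exp(-delta_eps_rd))

# e.g. 100 repressors per cell and a -15 k_BT operator (illustrative numbers)
print(fold_change_simple_repression(repressors=100, delta_eps_rd=-15.0))
```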
Next we note that much of the quantitative insight available on transcriptional regulation relies on work on only a few model regulatory systems, such as the LacI system considered above. We develop an approach, combining massively parallel reporter assays, mass spectrometry, and information-theoretic modeling, that can be used to dissect bacterial promoters in a systematic and scalable way. We demonstrate that we can uncover a qualitative list of transcription factor binding sites, as well as their associated quantitative details, from both well-studied and previously uncharacterized promoters in E. coli.
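The information-theoretic readout can be illustrated with a small sketch: for every promoter position, compute the mutual information between the base identity in a mutagenized library and the expression bin of each variant; positions with high mutual information flag putative RNA polymerase or transcription factor binding sites. The function name and inputs are hypothetical, not the pipeline used in this work.

```python
import numpy as np

def information_footprint(sequences, bins):
    """Per-position mutual information (in bits) between base identity and
    expression bin, across a library of mutagenized promoter variants.

    sequences : list of equal-length promoter variants (strings over ACGT)
    bins      : array of expression-bin labels from the sorted reporter assay
    """
    seqs = np.array([list(s) for s in sequences])
    levels, e_idx = np.unique(np.asarray(bins), return_inverse=True)
    n, length = seqs.shape
    mi = np.zeros(length)
    for pos in range(length):
        bases, b_idx = np.unique(seqs[:, pos], return_inverse=True)
        joint = np.zeros((len(bases), len(levels)))
        np.add.at(joint, (b_idx, e_idx), 1.0)          # joint counts of (base, bin)
        p = joint / n
        pb = p.sum(axis=1, keepdims=True)              # marginal over bases
        pe = p.sum(axis=0, keepdims=True)              # marginal over bins
        with np.errstate(divide="ignore", invalid="ignore"):
            terms = np.where(p > 0, p * np.log2(p / (pb * pe)), 0.0)
        mi[pos] = terms.sum()
    return mi
```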
Finally we extend the above method to over 100 E. coli promoters using over 12 growth conditions. We show the method recapitulates known regulatory information. Then, we examine regulatory architectures for more than 80 promoters which previously had no known regulation. In many cases, we identify which transcription factors mediate their regulation. The method introduced clears a path for fully characterizing the regulatory genome of E. coli and advances towards the goal of using this method on a wide variety of other organisms including other prokaryotes and eukaryotes such as Drosophila melanogaster.