
    Exploring the topical structure of short text through probability models: from tasks to fundamentals

    Recent technological advances have radically changed the way we communicate. Today's communication has become ubiquitous, and it has fostered the need for information that is easier to create, spread and consume. As a consequence, we have experienced the shortening of text messages in media ranging from electronic mail and instant messaging to microblogging. Moreover, the ubiquity and fast-paced nature of these media have promoted their use for previously unthinkable tasks. For instance, reporting real-world events was classically carried out by news reporters, but nowadays the most interesting events are first disclosed on social networks like Twitter by eyewitnesses through short text messages. As a result, the exploitation of the thematic content of short text has captured the interest of both research and industry. Topic models are a class of probability models that have traditionally been used to explore this thematic content, a.k.a. topics, in regular text. The most popular topic models fall into the sub-class of LVMs (Latent Variable Models), which include several latent variables at the corpus, document and word levels to summarise the topics at each level. However, classical LVM-based topic models struggle to learn semantically meaningful topics in short text because the lack of co-occurring words within a document hampers the estimation of the local latent variables at the document level. To overcome this limitation, pooling and hierarchical Bayesian strategies that leverage contextual information have been essential to improving the quality of topics in short text. In this thesis, we study the problem of learning semantically meaningful and predictive representations of text in two distinct phases:
    • In the first phase, Part I, we investigate the use of LVM-based topic models for the specific task of event detection in Twitter. In this setting, the use of contextual information to pool tweets together comes naturally. Thus, we first extend an existing clustering algorithm for event detection to use the topics learned from pooled tweets. Then, we propose a probability model that integrates topic modelling and clustering to enable the flow of information between both components.
    • In the second phase, Part II and Part III, we challenge the use of local latent variables in LVMs, especially when the context of short messages is not available. First, we study the evaluation of the generalization capabilities of LVMs such as PFA (Poisson Factor Analysis) and propose unbiased estimation methods to approximate it. With the most accurate method, we compare the generalization of chordal models without latent variables to that of PFA topic models on short and regular text collections.
    In summary, we demonstrate that, by integrating clustering and topic modelling, the performance of event detection techniques in Twitter is improved thanks to the interaction between both components. Moreover, we develop several unbiased likelihood estimation methods for assessing the generalization of PFA and empirically validate their accuracy on different document collections. Finally, we show that we can learn chordal models without latent variables from text through Chordalysis, and that they can be a competitive alternative to classical topic models, especially in short text.
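
    As an aside, the pooling strategy described above can be illustrated with a minimal Python sketch: tweets sharing a hashtag are aggregated into pseudo-documents before a standard topic model is fitted. The hashtag-pooling rule, the toy tweets, and the use of scikit-learn's LatentDirichletAllocation are assumptions for illustration; the thesis's joint clustering/topic model and its PFA-based analysis are not reproduced here.

        # Sketch: pool tweets by hashtag into pseudo-documents, then fit a plain
        # topic model. Illustrative only; not the thesis's joint model.
        from collections import defaultdict
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.decomposition import LatentDirichletAllocation

        tweets = [
            "breaking: earthquake felt downtown #quake",
            "strong shaking reported near the coast #quake",
            "our team wins the final #football",
            "what a goal in the last minute #football",
        ]

        # Pool tweets by hashtag so each pseudo-document has enough co-occurring words.
        pools = defaultdict(list)
        for t in tweets:
            tags = [w for w in t.split() if w.startswith("#")] or ["#none"]
            for tag in tags:
                pools[tag].append(t)
        pseudo_docs = [" ".join(ts) for ts in pools.values()]

        # Fit LDA on the pooled pseudo-documents and inspect their topic mixtures.
        X = CountVectorizer(stop_words="english").fit_transform(pseudo_docs)
        lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
        print(lda.transform(X))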

    Vicrostatin – An Anti-Invasive Multi-Integrin Targeting Chimeric Disintegrin with Tumor Anti-Angiogenic and Pro-Apoptotic Activities

    Similar to other integrin-targeting strategies, disintegrins have previously shown good efficacy in animal cancer models, with favorable pharmacological attributes and translational potential. Nonetheless, these polypeptides are notoriously difficult to produce recombinantly because their particular structure requires the correct pairing of multiple disulfide bonds for biological activity. Here, we show that a sequence-engineered disintegrin (called vicrostatin or VCN) can be reliably produced in large quantities directly in the oxidative cytoplasm of Origami B E. coli. Through multiple integrin ligation (i.e., αvβ3, αvβ5, and α5β1), VCN targets both endothelial and cancer cells, significantly inhibiting their motility through a reconstituted basement membrane. Interestingly, in a manner distinct from other integrin ligands but reminiscent of some ECM-derived endogenous anti-angiogenic fragments previously described in the literature, VCN profoundly disrupts the actin cytoskeleton of endothelial cells (ECs), inducing a rapid disassembly of stress fibers and actin reorganization and ultimately interfering with the ECs' ability to invade and form tubes (tubulogenesis). Moreover, we show here for the first time that the addition of a disintegrin to tubulogenic ECs sandwiched in vitro between two Matrigel layers negatively impacts their survival despite the presence of abundant haptotactic cues. A liposomal formulation of VCN (LVCN) was further evaluated in vivo in two animal cancer models with different growth characteristics. Our data demonstrate that LVCN is well tolerated while producing a significant delay in tumor growth and an increase in the survival of treated animals. These results can be partially explained by the potent tumor anti-angiogenic and pro-apoptotic effects induced by LVCN.

    Contact Dynamics Modelling for Robotic Task Simulation

    This thesis presents the theoretical derivations and the implementation of a contact dynamics modelling system based on compliant contact models. The system was designed to be used as a general-purpose modelling tool to support the task planning process for space-based robot manipulator systems. This operational context imposes additional requirements on the contact dynamics modelling system beyond the usual ones of fidelity and accuracy. The system must not only be able to generate accurate and reliable simulation results, but it must do so in a reasonably short period of time, such that an operations engineer can investigate multiple scenarios within a few hours. The system is also designed to be easy to interface with existing simulation facilities. All physical parameters of the contact model can be identified experimentally or obtained by other means, such as analysis or theoretical derivations based on the material properties. Similarly, the numerical parameters can be selected automatically or by using heuristic rules that indicate the range of values ensuring the simulation results are qualitatively correct. The contact dynamics modelling system comprises two contact models. First, a point contact model is proposed to tackle simulations involving bodies with non-conformal surfaces. Since it is based on Hertz theory, the contacting surfaces must be smooth and without discontinuities, i.e., no corners or sharp edges. The point contact model includes normal damping and tangential friction and assumes the contact surface is very small, such that the contact force can be treated as acting through a single point. An expression is given that sets the normal damping as a function of the effective coefficient of restitution. A new seven-parameter friction model is introduced. It is based on a bristle friction model and is adapted to the context of three-dimensional frictional impact modelling by introducing load-dependent bristle stiffness and damping terms and by expressing the bristle deformation in vectorial form. The model features a dwell-time stiction force dependency and is shown to reproduce the dynamic nature of the friction phenomenon. A second contact model, based on the Winkler elastic foundation model, is then proposed to deal with a more general class of geometries. This so-called volumetric contact model is suitable for a broad range of contact geometries, as long as the contact surface can be approximated as flat. A method to deal with objects for which this approximation is not reasonable is also presented. The effect of the contact pressure distribution across the contact surface is accounted for in the form of a rolling resistance torque and a spinning friction torque. It is shown that the contact forces and moments can be expressed in terms of the volumetric properties of the volume of interference between the two bodies, defined as the volume spanned by the intersection of the two undeformed geometries of the colliding bodies. The properties of interest are: the volume of the volume of interference, the position of its centroid, and its inertia tensor taken about the centroid. The analysis also introduces a new way of defining the contact normal; it is shown that the contact normal must correspond to one of the eigenvectors of the inertia tensor. The investigation also examines how Coulomb friction is affected by the relative motion of the objects. The concept of average surface velocity is introduced.
    It accounts for both the relative translational and angular motions of the contacting surfaces. The average surface velocity is then used to find dimensionless factors that relate the friction force and spinning torque caused by Coulomb friction. These factors are labelled the Contensou factors. Also, the radius of gyration of the moment of inertia of the volume of interference about the contact normal is shown to correlate the spinning Coulomb friction torque to the translational Coulomb friction force. A volumetric version of the seven-parameter bristle friction model is then presented. This friction model includes both the tangential friction force and the spinning friction torque, and the Contensou factors are used to control the behaviour of the Coulomb friction. For both contact models, the equations are derived from first principles, and the behaviour of each contact model characteristic was studied and simulated. When available, the simulation results were compared with benchmark results from the literature. Experiments were performed to validate the point contact model using a six degrees-of-freedom manipulator holding a hemispherical payload that was brought into contact with a flat plate. Good correspondence between the simulated and experimental results was obtained.
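
    As a rough illustration of the compliant point-contact idea, the sketch below evaluates a Hertz-type normal force with a Hunt-Crossley-style damping term in which the damping coefficient is tied to an effective coefficient of restitution. This particular damping expression and all numerical values are assumptions; the thesis derives its own damping/restitution relation, and its seven-parameter bristle friction model is not reproduced.

        # Minimal sketch of a compliant (Hertz-type) point contact with a
        # Hunt-Crossley-style damping term. The damping expression and all numbers
        # are assumed for illustration; the thesis derives its own relation and a
        # seven-parameter bristle friction model that is not reproduced here.
        def normal_contact_force(delta, delta_dot, k=1e5, n=1.5, e=0.7, v_impact=1.0):
            """Normal force for penetration delta (m) and penetration rate delta_dot (m/s).

            k        : contact stiffness (depends on geometry and materials)
            n        : Hertz exponent (1.5 for a sphere pressed on a plane)
            e        : effective coefficient of restitution used to set the damping
            v_impact : penetration rate at the start of contact
            """
            if delta <= 0.0:
                return 0.0                                     # bodies separated: no force
            lam = 3.0 * k * (1.0 - e) / (2.0 * v_impact)       # damping tied to restitution
            force = k * delta**n + lam * delta**n * delta_dot  # elastic + damping terms
            return max(force, 0.0)                             # contact can push, never pull

        # Example: 1 mm penetration closing at 0.2 m/s.
        print(normal_contact_force(1e-3, 0.2))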

    Determining Visual Motion in the Deep Learning Era

    Determining visual motion, or optical flow, is a fundamental problem in computer vision and has stimulated continuous research interest over the past few decades. Beyond pure academic pursuit, the progress made in optical flow research also has applications in many fields, including video processing, graphics, robotics and medical applications. Traditionally, optical flow estimation has been formulated as an optimisation problem, often solved by minimising an energy function. The energy function is designed around the brightness constancy assumption, which often fails in real-world scenarios due to lighting changes, shadows and occlusions, causing traditional algorithms to break down. Another weakness of traditional optimisation approaches is their slow runtime: iterative methods are often employed when solving for the optical flow, which can take from a few seconds to a minute. This becomes problematic in real-world applications. The recent surge of deep learning techniques has enabled optical flow estimation to be formulated as a learning problem. Recent papers have shown significant performance improvements over traditional approaches as well as significantly faster runtimes. Despite the recent progress in learning approaches for optical flow, there remain challenging cases where current approaches fail, such as occlusions, featureless regions (the aperture problem), and large motions of small objects. Current methods are also limited by their large consumption of GPU memory. An intermediate representation named the cost volume is often employed, which scales quadratically with the number of pixels. This 4D representation acts as a memory bottleneck for modern optical flow approaches and prevents scaling up to high-resolution images. In this PhD thesis, we show that long-range modelling and sparse representations are important cornerstones for modern optical flow estimation. We first show that regularising the flow prediction with an estimated essential matrix can improve flow prediction performance in mostly rigid scenes, particularly in challenging cases such as featureless regions and motion blur. We then demonstrate that a sparse cost volume can be just as effective as a dense cost volume, with significantly less memory consumption. This brings hope for future optical flow research where image resolutions are further increased. Finally, we show that incorporating a self-attention module to globally aggregate motion features helps improve state-of-the-art flow prediction. Modelling long-range connections is particularly helpful for dealing with occlusions.
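
    The memory argument around cost volumes can be made concrete with a small NumPy sketch: a dense all-pairs cost volume scales quadratically with the number of pixels, whereas keeping only the top-k correlations per pixel scales linearly. The feature shapes and the top-k rule are illustrative assumptions, not the specific sparse representation proposed in the thesis.

        # Sketch: dense vs. sparse (top-k) cost volume for two feature maps.
        # Shapes and the top-k rule are illustrative, not the thesis's representation.
        import numpy as np

        H, W, C, k = 64, 64, 32, 8
        f1 = np.random.randn(H * W, C).astype(np.float32)    # frame-1 features
        f2 = np.random.randn(H * W, C).astype(np.float32)    # frame-2 features

        # Dense cost volume: correlate every pixel of frame 1 with every pixel of
        # frame 2 -> (H*W) x (H*W) entries, quadratic in the number of pixels.
        dense = f1 @ f2.T
        print("dense entries :", dense.size)                 # 4096 * 4096 = 16,777,216

        # Sparse alternative: keep only the k strongest correlations per source pixel.
        idx = np.argpartition(-dense, k, axis=1)[:, :k]      # indices of the top-k costs
        vals = np.take_along_axis(dense, idx, axis=1)        # the corresponding costs
        print("sparse entries:", vals.size)                  # 4096 * 8 = 32,768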

    Characterizing and comparing acoustic representations in convolutional neural networks and the human auditory system

    Auditory processing in the human brain and in contemporary machine hearing systems consists of a cascade of representational transformations that extract and reorganize relevant information to enable task performance. This thesis is concerned with the nature of acoustic representations and the network design and learning principles that support their development. The primary scientific goals are to characterize and compare auditory representations in deep convolutional neural networks (CNNs) and the human auditory pathway. This work prompts several meta-scientific questions about the nature of scientific progress, which are also considered. The introduction reviews what is currently known about the mammalian auditory pathway and introduces the relevant concepts in deep learning. The first article argues that the most pressing philosophical questions at the intersection of artificial and biological intelligence are ultimately concerned with defining the phenomena to be explained and with what constitutes a valid explanation of such phenomena. I highlight relevant theories of scientific explanation which we hope will provide scaffolding for future discussion. Article 2 tests a popular model of auditory cortex based on frequency-specific spectrotemporal modulations. We find that a linear model trained only on BOLD responses to simple dynamic ripples (containing only one fundamental frequency, temporal modulation rate, and spectral scale) can generalize to predict responses to mixtures of two dynamic ripples. Both the third and fourth articles investigate how CNN representations are affected by various aspects of training. The third article characterizes the language specificity of CNN layers and explores the effect of freeze training and random weights. We observed three distinct regions of transferability: (1) the first two layers were entirely transferable between languages, (2) layers 2 to 8 were also highly transferable, but we found some evidence of language specificity, (3) the subsequent fully connected layers were more language specific but could be successfully fine-tuned to the target language. In Article 4, we use similarity analysis to show that the superior performance of freeze training achieved in Article 3 can be largely attributed to representational differences in the penultimate layer: the second fully connected layer. We also analyze the random networks from Article 3, from which we conclude that representational form is doubly constrained by the architecture and by the form of the input and target. To test whether acoustic CNNs learn a representational hierarchy similar to that of the human auditory system, the fifth article presents a similarity analysis comparing the activity of the freeze-trained networks from Article 3 to 7T fMRI activity throughout the human auditory system. We find no evidence of a shared representational hierarchy and instead find that all of our auditory regions were most similar to the first fully connected layer. Finally, the discussion chapter reviews the merits and limitations of a deep learning approach to neuroscience in a model comparison framework.
Together, these works contribute to the nascent enterprise of modeling the auditory system with neural networks and constitute a small step towards a unified science of intelligence that studies the phenomena that are exhibited in both biological and artificial intelligence
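
    The similarity analyses mentioned above can be sketched with a simple representational similarity measure. The snippet below computes linear centered kernel alignment (CKA) between a matrix of CNN-layer activations and a matrix of voxel responses to the same stimuli; the metric and the matrix shapes are illustrative assumptions and may differ from the analysis actually used in the articles.

        # Sketch: linear CKA between CNN-layer activations and fMRI voxel responses
        # to the same stimuli. Metric and shapes are illustrative assumptions.
        import numpy as np

        def linear_cka(X, Y):
            """Linear centered kernel alignment between two (stimuli x features) matrices."""
            X = X - X.mean(axis=0)                      # center each unit / voxel
            Y = Y - Y.mean(axis=0)
            hsic = np.linalg.norm(Y.T @ X, "fro") ** 2  # cross-covariance strength
            return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

        n_stimuli = 200
        layer_acts = np.random.randn(n_stimuli, 512)    # activations of one CNN layer
        voxel_resp = np.random.randn(n_stimuli, 1000)   # responses of one auditory region
        print(linear_cka(layer_acts, voxel_resp))       # 0 = unrelated, 1 = identical geometry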

    Mobile Ad-Hoc Networks

    Being infrastructure-less and without central administration control, wireless ad-hoc networking is playing an increasingly important role in extending the coverage of traditional wireless infrastructure (cellular networks, wireless LANs, etc.). This book presents state-of-the-art techniques and solutions for wireless ad-hoc networks. It focuses on the following topics: vehicular ad-hoc networks, security and caching, TCP in ad-hoc networks, and emerging applications. It aims to provide network engineers and researchers with design guidelines for large-scale wireless ad-hoc networks.

    Seamless design of energy management systems

    The contributions of the research are (a) an infrastructure of data acquisition systems that provides the necessary information for an automated EMS, enabling autonomous distributed state estimation, model validation, simplified protection, and seamless integration of other EMS applications; (b) an object-oriented, interoperable, and unified component model that can be seamlessly integrated with a variety of EMS applications; (c) a distributed dynamic state estimator (DDSE) based on the proposed data acquisition system and the object-oriented, interoperable, and unified component model; (d) a physically-based synchronous machine model, expressed in terms of the actual self and mutual inductances of the synchronous machine windings as a function of rotor position, for the purpose of synchronous machine parameter identification; and (e) a robust and highly efficient algorithm for the optimal power flow (OPF) problem, one of the most important applications of the EMS, based on the validated states and models of the power system provided by the proposed DDSE.
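
    For background on the state-estimation contribution, the snippet below shows the classical static weighted least-squares estimator for a linearized measurement model z = Hx + e; the measurement matrix and values are placeholders, and the proposed distributed dynamic state estimator (DDSE) goes well beyond this static formulation.

        # Sketch: classical static weighted least-squares state estimation for a
        # linearized measurement model z = H x + e. Placeholder numbers; the proposed
        # distributed dynamic state estimator (DDSE) is not reproduced here.
        import numpy as np

        H = np.array([[1.0,  0.0],          # measurement matrix (placeholder)
                      [1.0, -1.0],
                      [0.0,  1.0]])
        R = np.diag([1e-4, 4e-4, 1e-4])     # measurement error covariance
        z = np.array([1.02, 0.05, 0.97])    # measured quantities (placeholder)

        W = np.linalg.inv(R)                # weights = inverse error covariance
        G = H.T @ W @ H                     # gain matrix
        x_hat = np.linalg.solve(G, H.T @ W @ z)
        print(x_hat)                        # estimated state vector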

    A Scalable and Adaptive Network on Chip for Many-Core Architectures

    In this work, a scalable network on chip (NoC) for future many-core architectures is proposed and investigated. It supports different QoS mechanisms to ensure predictable communication. Self-optimization is introduced to adapt the energy footprint and the performance of the network to the communication requirements. A fault tolerance concept makes it possible to deal with permanent errors. Moreover, a template-based automated evaluation and design methodology and a synthesis flow for NoCs are introduced.
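
    For context, the sketch below implements dimension-ordered (XY) routing on a 2D mesh, a common baseline routing scheme for NoCs; it is illustrative only and is not the adaptive, QoS-aware routing investigated in this work.

        # Sketch: dimension-ordered (XY) routing on a 2D mesh NoC. A common baseline
        # only; the proposed NoC uses adaptive, QoS-aware mechanisms not shown here.
        def xy_route(src, dst):
            """Hop-by-hop path from src to dst, routing along X first, then Y."""
            x, y = src
            path = [src]
            while x != dst[0]:                      # move along the X dimension first
                x += 1 if dst[0] > x else -1
                path.append((x, y))
            while y != dst[1]:                      # then along the Y dimension
                y += 1 if dst[1] > y else -1
                path.append((x, y))
            return path

        print(xy_route((0, 0), (2, 3)))
        # [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (2, 3)]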