157 research outputs found

    Exploring the topical structure of short text through probability models : from tasks to fundamentals

    Recent technological advances have radically changed the way we communicate. Communication has become ubiquitous, which has fostered the need for information that is easier to create, spread and consume. As a consequence, text messages have become shorter in media ranging from electronic mail and instant messaging to microblogging. Moreover, the ubiquity and fast-paced nature of these media have promoted their use for previously unthinkable tasks. For instance, real-world events were classically reported by news reporters, but nowadays the most interesting events are first disclosed on social networks like Twitter by eyewitnesses through short text messages. As a result, the exploitation of the thematic content in short text has captured the interest of both research and industry. Topic models are a type of probability model traditionally used to explore this thematic content, a.k.a. topics, in regular text. The most popular topic models fall into the sub-class of LVMs (Latent Variable Models), which include several latent variables at the corpus, document and word levels to summarise the topics at each level. However, classical LVM-based topic models struggle to learn semantically meaningful topics in short text because the lack of co-occurring words within a document hampers the estimation of the local latent variables at the document level. To overcome this limitation, pooling and hierarchical Bayesian strategies that leverage contextual information have been essential to improve the quality of topics in short text. In this thesis, we study the problem of learning semantically meaningful and predictive representations of text in two distinct phases:
    • In the first phase, Part I, we investigate the use of LVM-based topic models for the specific task of event detection in Twitter. In this situation, the use of contextual information to pool tweets together comes naturally. Thus, we first extend an existing clustering algorithm for event detection to use the topics learned from pooled tweets. Then, we propose a probability model that integrates topic modelling and clustering to enable the flow of information between both components.
    • In the second phase, Part II and Part III, we challenge the use of local latent variables in LVMs, especially when the context of short messages is not available. First, we study how to evaluate the generalization capabilities of LVMs like PFA (Poisson Factor Analysis) through the likelihood. Because computing this likelihood is intractable, we propose unbiased estimation methods to approximate it. With the most accurate method, we compare the generalization of chordal models without latent variables to that of PFA topic models in short and regular text collections.
    In summary, we demonstrate that integrating clustering and topic modelling improves the performance of event detection techniques in Twitter, owing to the interaction between both components. Moreover, we develop several unbiased likelihood estimation methods for assessing the generalization of PFA, and we empirically validate their accuracy on different document collections. Finally, we show that chordal models without latent variables can be learned from text through Chordalysis, and that they can be a competitive alternative to classical topic models, especially in short text.
    Postprint (published version)
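
    The pooling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which contextual signal the thesis uses, so grouping by shared hashtag is an assumption made here for illustration, and the function name `pool_by_hashtag` is hypothetical. The point is only that merging short messages into longer pseudo-documents gives a topic model more word co-occurrences per document.

```python
import re
from collections import defaultdict

# Assumption: hashtags serve as the contextual signal used for pooling.
HASHTAG = re.compile(r"#(\w+)")


def pool_by_hashtag(tweets):
    """Merge tweets that share a hashtag into one pseudo-document per hashtag.

    Returns (pools, unpooled): a dict mapping each lower-cased hashtag to the
    concatenated text of its tweets, and a list of tweets with no hashtag.
    """
    grouped = defaultdict(list)
    unpooled = []
    for text in tweets:
        tags = {tag.lower() for tag in HASHTAG.findall(text)}
        if not tags:
            unpooled.append(text)  # no context available, keep as-is
        for tag in tags:
            grouped[tag].append(text)
    pools = {tag: " ".join(texts) for tag, texts in grouped.items()}
    return pools, unpooled


if __name__ == "__main__":
    demo = ["Earthquake downtown #quake", "Aftershock felt #Quake #news", "Good morning"]
    pools, rest = pool_by_hashtag(demo)
    print(pools)
    print(rest)
```

    The pseudo-documents in `pools` would then be passed to a topic model in place of the individual tweets; tweets left in `unpooled` are exactly the context-free case that the second phase of the thesis addresses.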

    Preclinical studies for the gene therapy of leukocyte adhesion deficiency type 1

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Facultad de Medicina, Departamento de Bioquímica. Date of defence: 25-02-2015.
    Leukocyte Adhesion Deficiency Type I (LAD-I) is a primary immunodeficiency caused by mutations in the ITGB2 gene, which encodes the CD18 protein (also known as the β2 subunit). This protein binds to different CD11 subunits to form β2 integrins, which are expressed on the leukocyte membrane and allow leukocytes to adhere firmly to the endothelium as a step prior to extravasation. In LAD-I patients, ITGB2 mutations lead to absent, low or aberrant CD18 expression, which results in absent or low β2 integrin expression on the leukocyte membrane. CD18-deficient leukocytes, especially neutrophils, fail to extravasate from the bloodstream to infected tissues. LAD-I patients therefore suffer from recurrent and severe bacterial infections that can lead to death. In the present doctoral thesis, we initially aimed to characterize a mouse strain carrying a hypomorphic mutation in the itgb2 gene (CD18HYP), to be used later as a LAD-I animal model. These mice displayed low but detectable β2 integrin expression in lymphoid and myeloid cells and showed leukocytosis and defects in neutrophil extravasation in response to different inflammatory stimuli. Surprisingly, CD18HYP bone marrow (BM) was enriched in haematopoietic stem cells (HSCs). Moreover, BM cells from CD18HYP mice showed a higher competitive repopulation ability than WT cells. Through different in vivo experiments, we demonstrated that CD18 deficiency leads to the expansion of HSCs even in the presence of a WT haematopoietic environment, which revealed for the first time the role of CD18 in the regulation of the haematopoietic niche. As a monogenic haematopoietic disorder, LAD-I is a good candidate for ex vivo haematopoietic gene therapy (GT). We therefore generated four self-inactivating lentiviral vectors (LVs) in which the ITGB2 gene is expressed under ubiquitous (UCOE and PGK) or myeloid (Chim and MIM) promoters. These LVs restored CD18 expression and corrected the phenotype in two different in vitro models: a lymphoblastoid cell line derived from a LAD-I patient, and healthy HSCs in which CD18 expression had been knocked down (using an LV expressing a shRNA against the CD18 mRNA). Finally, we carried out GT experiments in CD18HYP mice. GT-treated CD18HYP mice showed peripheral blood and BM cells expressing the human CD18 protein (hCD18), and these cells also showed higher expression of the endogenous murine CD11a subunit (mCD11a) than those of untreated mice. This indicated that ectopic hCD18 expression rescues the expression of murine β2 integrins. hCD18 expression was always higher in myeloid than in lymphoid cells in all treated groups, although LV.Chim.hCD18-treated animals showed the highest ratio between myeloid and lymphoid expression. When BM from these animals was re-transplanted into secondary recipients, hCD18+ cells with increased mCD11a expression were observed, indicating that all hCD18-LVs were able to transduce true long-term repopulating HSCs. The recovery of normal leukocyte counts and the re-establishment of inflammation-mediated neutrophil extravasation supported the therapeutic effect of these LVs. Although similar results were obtained with all the hCD18-LVs, the ideal promoter for the GT of LAD-I would be one that drives high expression of CD18 in the myeloid lineage, the lineage most affected by the deficiency, while also expressing at a low level in other haematopoietic cells. For this reason, and on the basis of these results, we propose the use of LV.Chim.hCD18 for a future LAD-I GT clinical trial.

    Contact Dynamics Modelling for Robotic Task Simulation

    This thesis presents the theoretical derivations and the implementation of a contact dynamics modelling system based on compliant contact models. The system was designed as a general-purpose modelling tool to support the task planning process for space-based robot manipulator systems. This operational context imposes additional requirements on the contact dynamics modelling system beyond the usual ones of fidelity and accuracy. The system must not only generate accurate and reliable simulation results, but must do so in a reasonably short period of time, such that an operations engineer can investigate multiple scenarios within a few hours. The system is easy to interface with existing simulation facilities. All physical parameters of the contact model can be identified experimentally or obtained through analysis or theoretical derivations based on the material properties. Similarly, the numerical parameters can be selected automatically or by using heuristic rules that indicate the range of values for which the simulation results are qualitatively correct. The contact dynamics modelling system comprises two contact models. First, a point contact model is proposed to tackle simulations involving bodies with non-conformal surfaces. Since it is based on Hertz theory, the contacting surfaces must be smooth and without discontinuity, i.e., no corners or sharp edges. The point contact model includes normal damping and tangential friction and assumes that the contact surface is very small, such that the contact force can be taken to act through a point. An expression to set the normal damping as a function of the effective coefficient of restitution is given. A new seven-parameter friction model is introduced. 
The friction model is based on a bristle friction model, and is adapted to the context of 3-dimensional frictional impact modelling with the introduction of load-dependent bristle stiffness and damping terms, and with the expression of the bristle deformation in vectorial form. The model features a dwell-time stiction force dependency and is shown to be able to reproduce the dynamic nature of the friction phenomenon. A second contact model based on the Winkler elastic foundation model is then proposed to deal with a more general class of geometries. This so-called volumetric contact model is suitable for a broad range of contact geometries, as long as the contact surface can be approximated as being flat. A method to deal with objects where this latter approximation is not reasonable is also presented. The effect of the contact pressure distribution across the contact surface is accounted for in the form of the rolling resistance torque and spinning friction torque. It is shown that the contact forces and moments can be expressed in terms of the volumetric properties of the volume of interference between the two bodies, defined as the volume spanned by the intersection of the two undeformed geometries of the colliding bodies. The properties of interest are the size of the volume of interference, the position of its centroid, and its inertia tensor taken about the centroid. The analysis also introduces a new way of defining the contact normal; it is shown that the contact normal must correspond to one of the eigenvectors of the inertia tensor. The investigation also examines how the Coulomb friction is affected by the relative motion of the objects. The concept of average surface velocity is introduced. It accounts for both the relative translational and angular motions of the contacting surfaces. The average surface velocity is then used to find dimensionless factors that relate friction force and spinning torque caused by the Coulomb friction. 
These latter factors are labelled the Contensou factors. Also, the radius of gyration of the moment of inertia of the volume of interference about the contact normal is shown to relate the spinning Coulomb friction torque to the translational Coulomb friction force. A volumetric version of the seven-parameter bristle friction model is then presented. The friction model includes both the tangential friction force and spinning friction torque. The Contensou factors are used to control the behaviour of the Coulomb friction. For both contact models, the equations are derived from first principles, and the behaviour of each contact model characteristic is studied and simulated. When available, benchmark results from the literature were compared with the simulation results. Experiments were performed to validate the point contact model using a six-degrees-of-freedom manipulator holding a half-spherical payload and coming into contact with a flat plate. Good correspondence between the simulated and experimental results was obtained.
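
    As a rough illustration of the volumetric idea, the sketch below computes the volume of interference for a sphere pressed into a flat plane (a spherical cap) and the resulting normal force. Under a Winkler foundation the pressure is proportional to local penetration, so integrating it over the contact patch gives a force proportional to the interference volume. The stiffness `k_v`, the damping factor `a`, and the form of the damping term `(1 + a * v_n)` are assumptions for illustration, not values or expressions taken from the thesis.

```python
import math


def cap_volume(radius, depth):
    """Volume of the spherical cap of height `depth` cut from a sphere."""
    if depth <= 0.0:
        return 0.0  # bodies are not in contact
    depth = min(depth, 2.0 * radius)  # cannot exceed the full sphere
    return math.pi * depth**2 * (3.0 * radius - depth) / 3.0


def normal_force(radius, depth, k_v, a=0.0, v_n=0.0):
    """Normal contact force from the interference volume.

    k_v : assumed volumetric (foundation) stiffness, in N/m^3
    a   : assumed damping factor, in s/m
    v_n : approach speed along the contact normal (positive when closing)
    """
    volume = cap_volume(radius, depth)
    # Elastic part k_v * V follows from integrating Winkler pressure;
    # the damping factor is a hypothetical form. Clamp to avoid adhesion.
    return max(0.0, k_v * volume * (1.0 + a * v_n))


if __name__ == "__main__":
    print(normal_force(radius=0.05, depth=0.001, k_v=1.0e9, a=0.5, v_n=0.1))
```

    For shapes other than a sphere on a plane, the cap formula would be replaced by a geometric routine that returns the volume, centroid and inertia tensor of the intersection.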

    Model Predictive Control Technique of Multilevel Inverter for PV Applications

    Renewable energy sources, such as solar, wind, hydro, and biofuels, continue to gain popularity as alternatives to conventional generation systems. The main unit in a renewable energy system is the power conditioning system (PCS). Higher efficiency, lower component cost, and high reliability are highly desirable for the PCS in order to decrease the levelized cost of energy. This suggests a need for new inverter configurations and control optimization that can meet these requirements. To this end, this dissertation presents a modified multilevel inverter topology for grid-tied photovoltaic (PV) systems that achieves lower cost and higher efficiency compared with existing systems. In addition, this dissertation focuses on model predictive control (MPC), which controls the modified multilevel topology to regulate the power injected into the grid. A major requirement for the PCS is harvesting the maximum power from the PV source. By incorporating MPC, the ability of the maximum power point tracking (MPPT) algorithm to accurately extract the maximum power is improved for a multilevel DC-DC converter. Finally, this control technique is developed for the quasi-Z-source inverter (qZSI) to accurately control the DC-link voltage and input current, and to produce a higher-quality grid-injected current waveform than conventional techniques. This dissertation presents a modified symmetrical and asymmetrical multilevel DC-link inverter (MLDCLI) topology with fewer power switches and gate drivers. In addition, the MPC technique is used to drive the modified, grid-connected MLDCLI. The performance of the proposed topology with finite control set model predictive control (FCS-MPC) is verified by simulation and experiment. 
Moreover, this dissertation introduces predictive control to achieve the maximum power point for grid-tied PV systems, quickening the response by predicting the error before the switching signal is applied to the converter. Using the modified technique ensures that the system operates at the maximum power point, which is more economical. Thus, the proposed MPPT technique can extract more energy than conventional MPPT techniques from the same amount of installed solar panels. In further detail, this dissertation proposes the FCS-MPC technique for the qZSI in PV systems. In order to further improve the performance of the system, FCS-MPC with one-step horizon prediction has been implemented and compared with the classical PI controller. The presented work shows that the proposed control techniques outperform conventional linear controllers for the same application. Finally, a new parallel processing method is presented to reduce the processing time of the MPC.
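
    The one-step FCS-MPC loop described in the abstract can be sketched generically as follows. This is not the dissertation's controller: the single-phase R-L grid filter model, the forward-Euler discretization, the absolute current-error cost, and all parameter values (`R`, `L`, `Ts`, the voltage levels) are assumptions chosen only to show the mechanism of enumerating a finite set of inverter outputs, predicting one step ahead, and applying the candidate with the lowest cost.

```python
def predict_current(i_now, v_inv, v_grid, R, L, Ts):
    """Forward-Euler prediction of the grid current one sample ahead.

    Assumed plant: L * di/dt = v_inv - R * i - v_grid
    """
    return (1.0 - R * Ts / L) * i_now + (Ts / L) * (v_inv - v_grid)


def fcs_mpc_step(i_now, i_ref, v_grid, levels, R=0.5, L=10e-3, Ts=50e-6):
    """Return the inverter output level that minimizes the predicted error.

    levels : finite set of output voltages the multilevel inverter can
             produce (hypothetical values supplied by the caller)
    """
    def cost(v_inv):
        i_pred = predict_current(i_now, v_inv, v_grid, R, L, Ts)
        return abs(i_ref - i_pred)  # assumed cost: current tracking error

    return min(levels, key=cost)


if __name__ == "__main__":
    levels = [-200.0, -100.0, 0.0, 100.0, 200.0]  # assumed five-level output
    i = 0.0
    for _ in range(5):
        v = fcs_mpc_step(i, i_ref=2.0, v_grid=0.0, levels=levels)
        i = predict_current(i, v, 0.0, R=0.5, L=10e-3, Ts=50e-6)
        print(v, round(i, 4))
```

    Because every candidate level is evaluated at each sample, the per-sample cost grows with the number of switching states, which is the kind of computational burden that the parallel processing method mentioned at the end of the abstract aims to reduce.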