    Unsupervised Visual and Textual Information Fusion in Multimedia Retrieval - A Graph-based Point of View

    Multimedia collections are growing in size and diversity more than ever. Effective multimedia retrieval systems are thus critical for accessing these datasets from the end-user perspective and in a scalable way. We are interested in repositories of image/text multimedia objects and we study multimodal information fusion techniques in the context of content-based multimedia information retrieval. We focus on graph-based methods, which have been shown to provide state-of-the-art performance. We particularly examine two such methods: cross-media similarities and random-walk-based scores. From a theoretical viewpoint, we propose a unifying graph-based framework which encompasses the two aforementioned approaches. Our proposal allows us to highlight the core features one should consider when using a graph-based technique for the combination of visual and textual information. We compare cross-media and random-walk-based results using three different real-world datasets. From a practical standpoint, our extended empirical analysis allows us to provide insights and guidelines about the use of graph-based methods for multimodal information fusion in content-based multimedia information retrieval.
    Comment: an extended version of the paper "Visual and Textual Information Fusion in Multimedia Retrieval using Semantic Filtering and Graph based Methods" by J. Ah-Pine, G. Csurka and S. Clinchant, submitted to ACM Transactions on Information Systems.
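
    The abstract mentions random-walk-based scores over a graph that fuses visual and textual similarities. The minimal sketch below is not the authors' formulation; it only illustrates a generic random walk with restart over a linearly fused similarity graph, where the fusion weight alpha, the restart probability and the toy matrices are assumptions made for the example.

```python
import numpy as np

def random_walk_scores(S_visual, S_textual, query_idx, alpha=0.5, restart=0.15, n_iter=50):
    """Generic random walk with restart on a fused similarity graph.

    S_visual, S_textual : (n, n) non-negative similarity matrices (assumed given).
    query_idx           : index of the query object used as the restart node.
    alpha               : weight balancing visual vs. textual edges (assumption).
    """
    S = alpha * S_visual + (1.0 - alpha) * S_textual   # simple linear fusion (assumption)
    P = S / S.sum(axis=1, keepdims=True)               # row-stochastic transition matrix
    r = np.zeros(S.shape[0]); r[query_idx] = 1.0       # restart distribution on the query
    scores = np.full(S.shape[0], 1.0 / S.shape[0])
    for _ in range(n_iter):                            # power iteration until (rough) convergence
        scores = (1 - restart) * scores @ P + restart * r
    return scores                                      # higher score = closer to the query

# toy usage with random similarity matrices
rng = np.random.default_rng(0)
Sv, St = rng.random((5, 5)), rng.random((5, 5))
print(random_walk_scores(Sv, St, query_idx=0))
```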

    Overview of the Relational Analysis approach in Data-Mining and Multi-criteria Decision Making

    In this chapter we introduce a general framework called the Relational Analysis (RA) approach, together with its related contributions and applications in the fields of data analysis, data mining and multi-criteria decision making. This approach was initiated by J.F. Marcotorchino and P. Michaud at the end of the 1970s and has generated many research activities. The aspects of this framework that we focus on here are theoretical: we recall its background and basics, the unifying results it provides, and the modeling contributions it has made possible. The main tasks we are interested in are the ranking aggregation problem, the clustering problem and the block seriation problem. These are combinatorial problems, and the computational aspects of such tasks in the context of the RA methodology are not covered here. However, the list of references given throughout this chapter includes numerous articles that the interested reader can consult to this end.
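
    The Relational Analysis approach encodes partitions or rankings as pairwise relation matrices and aggregates them through a Condorcet-like criterion. The sketch below only illustrates that pairwise encoding and the resulting collective agreement matrix for partitions; the actual RA formulation (including the transitivity constraints of the underlying integer program) is in the referenced literature, and the helper names are ours.

```python
import numpy as np

def pairwise_matrix(partition):
    """Relational coding of a partition: X[i, j] = 1 if objects i and j share a cluster."""
    labels = np.asarray(partition)
    return (labels[:, None] == labels[None, :]).astype(int)

def collective_agreement(partitions):
    """Condorcet-style collective matrix: agreements minus disagreements over all inputs."""
    X = np.array([pairwise_matrix(p) for p in partitions])
    return X.sum(axis=0) - (1 - X).sum(axis=0)   # positive entry: majority votes "same cluster"

# three example partitions of 5 objects
parts = [[0, 0, 1, 1, 2], [0, 0, 0, 1, 1], [0, 1, 1, 1, 2]]
print(collective_agreement(parts))
```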

    Hypergraph Modelization of a Syntactically Annotated English Wikipedia Dump

    Wikipedia, the well-known internet encyclopedia, is nowadays a widely used source of information. To leverage its rich content, parsed versions of Wikipedia have already been proposed. We present an annotated dump of the English Wikipedia. This dump draws upon previously released parsed Wikipedia dumps, but heads in a different direction: we focus on the syntactic characteristics of words. Aside from the classical part-of-speech (PoS) tags and dependency parsing relations, we provide the full constituent parse branch for each word in a succinct way. Additionally, we propose a hypergraph network representation of the extracted linguistic information. The proposed modelization aims to take advantage of the information stored within our parsed Wikipedia dump. We hope that by releasing these resources, researchers from the concerned communities will have a ready-to-experiment Wikipedia corpus with which to compare and distribute their work. We make public our parsed Wikipedia dump as well as the tool (and its source code) used to perform the parse. The hypergraph network and its related metadata are also distributed.
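
    A hypergraph lets a single edge connect more than two nodes, which is convenient when one annotation ties together a word, its PoS tag, its dependency head and its constituent branch. The toy sketch below shows one possible encoding of such annotations as hyperedges; the field names and sample sentence are ours, not the released dump's actual schema.

```python
# Toy hypergraph: each hyperedge is a frozenset of typed nodes linked by one annotation.
# The schema (node naming, edge construction) is illustrative, not the dump's actual format.
sentence = [
    {"token": "cats", "pos": "NNS", "head": "sleep", "constituent": "S/NP"},
    {"token": "sleep", "pos": "VBP", "head": "ROOT", "constituent": "S/VP"},
]

hyperedges = set()
for w in sentence:
    hyperedges.add(frozenset({
        ("token", w["token"]),
        ("pos", w["pos"]),
        ("head", w["head"]),
        ("constituent", w["constituent"]),
    }))

# incidence view: which hyperedges touch a given node
node = ("token", "sleep")
print([sorted(e) for e in hyperedges if node in e])
```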

    Classification ascendante hiérarchique à noyaux et une application aux données textuelles

    The Lance–Williams formula unifies several agglomerative hierarchical clustering (AHC) methods. In this article, we assume that the data are represented in a Euclidean space and we derive a new expression of this formula that uses cosine similarities instead of squared Euclidean distances. Our approach has the following advantages. On the one hand, it naturally extends the classical AHC methods to kernel functions. On the other hand, it allows thresholding techniques to be applied that sparsify the similarity matrix and thereby improve the complexity of AHC. Applying our approach to text clustering tasks shows, on the one hand, that scalability is improved both in memory and in processing time and, on the other hand, that the quality of the results is preserved or even improved.
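
    The abstract proposes kernel-based AHC via cosine similarities. As a rough illustration (not the paper's Lance–Williams reformulation), cosine similarity in a kernel-induced feature space can be computed directly from the Gram matrix as K_ij / sqrt(K_ii * K_jj) and fed to a standard hierarchical clustering routine; the RBF kernel, the average linkage and the toy data below are arbitrary choices.

```python
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X = rng.random((20, 5))                      # toy data standing in for document vectors

K = rbf_kernel(X, gamma=0.5)                 # Gram matrix of an (arbitrary) kernel
d = np.sqrt(np.diag(K))
S = K / np.outer(d, d)                       # cosine similarity in the kernel feature space
D = 1.0 - S                                  # turn similarity into a dissimilarity
np.fill_diagonal(D, 0.0)

Z = linkage(squareform(D, checks=False), method="average")   # AHC on the dissimilarities
print(fcluster(Z, t=3, criterion="maxclust"))                # cut the dendrogram into 3 clusters
```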

    Backprojection for Training Feedforward Neural Networks in the Input and Feature Spaces

    After the tremendous development of neural networks trained by backpropagation, it is a good time to develop other algorithms for training neural networks in order to gain more insight into how they work. In this paper, we propose a new algorithm for training feedforward neural networks which is noticeably faster than backpropagation. This method is based on projection and reconstruction: at every layer, the projected data and reconstructed labels are forced to be similar, and the weights are tuned accordingly, layer by layer. The proposed algorithm can be used in both the input and feature spaces, where it is named backprojection and kernel backprojection, respectively. This algorithm gives insight into networks from a projection-based perspective. Experiments on synthetic datasets show the effectiveness of the proposed method.
    Comment: accepted (to appear) in the International Conference on Image Analysis and Recognition (ICIAR) 2020, Springer.
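
    The abstract describes tuning weights layer by layer so that projected data and reconstructed labels agree. The sketch below is only a loose, generic illustration of layer-wise least-squares fitting of a small feedforward network; it is not the backprojection algorithm from the paper, and every design choice (per-layer targets, activation, update order) is an assumption made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))                    # toy inputs
Y = (X[:, :1] > 0).astype(float)                 # toy binary labels

def relu(a):
    return np.maximum(a, 0.0)

# Illustrative layer-wise fitting: each layer's weights are obtained in closed form by
# least squares against a crude per-layer target; NOT the paper's backprojection update.
dims = [8, 6, 1]
H, weights = X, []
for i, d_out in enumerate(dims[1:]):
    last = (i == len(dims) - 2)
    T = Y if last else Y @ rng.normal(size=(1, d_out))   # hidden-layer targets are an assumption
    W, *_ = np.linalg.lstsq(H, T, rcond=None)            # fit this layer, no gradient descent
    weights.append(W)
    H = (H @ W) if last else relu(H @ W)                 # forward pass feeding the next layer

print("train MSE:", float(np.mean((H - Y) ** 2)))
```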

    Principal Component Analysis Using Structural Similarity Index for Images

    Despite the advances of deep learning in specific tasks using images, the principled assessment of image fidelity and similarity is still a critical ability to develop. As it has been shown that Mean Squared Error (MSE) is insufficient for this task, other measures have been developed, one of the most effective being the Structural Similarity Index (SSIM). Such measures can be used for subspace learning, but existing methods in machine learning, such as Principal Component Analysis (PCA), are based on Euclidean distance or MSE and thus cannot properly capture the structural features of images. In this paper, we define an image structure subspace which discriminates different types of image distortions. We propose Image Structural Component Analysis (ISCA), as well as kernel ISCA, by using SSIM, rather than Euclidean distance, in the formulation of PCA. This paper provides a bridge between image quality assessment and manifold learning, opening a broad new area for future research.
    Comment: paper for the methods named "Image Structural Component Analysis (ISCA)" and "Kernel Image Structural Component Analysis (Kernel ISCA)".
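
    ISCA replaces the Euclidean/MSE geometry of PCA with SSIM. The sketch below is not the ISCA formulation; it only shows a generic way to obtain a low-dimensional embedding that respects SSIM: compute pairwise SSIM dissimilarities and apply classical multidimensional scaling (an eigendecomposition of the double-centred matrix). The toy images and the 1 - SSIM dissimilarity are assumptions.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

rng = np.random.default_rng(0)
imgs = rng.random((10, 32, 32))                          # toy grayscale images in [0, 1)

n = len(imgs)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        s = ssim(imgs[i], imgs[j], data_range=1.0)       # SSIM between two images
        D[i, j] = D[j, i] = 1.0 - s                      # simple SSIM-based dissimilarity

# classical MDS: double-centre the squared dissimilarities and eigendecompose
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
vals, vecs = np.linalg.eigh(B)
order = np.argsort(vals)[::-1][:2]                       # keep the top-2 components
embedding = vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))
print(embedding.shape)                                   # (10, 2)
```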

    Weighted Fisher Discriminant Analysis in the Input and Feature Spaces

    Fisher Discriminant Analysis (FDA) is a subspace learning method which minimizes the intra-class scatter and maximizes the inter-class scatter of the data. Although FDA treats all pairs of classes in the same way, some classes are closer to each other than others. Weighted FDA assigns weights to the pairs of classes to address this shortcoming of FDA. In this paper, we propose a cosine-weighted FDA as well as an automatically weighted FDA in which the weights are found automatically. We also propose a weighted FDA in the feature space to establish a weighted kernel FDA for both the existing and the newly proposed weights. Our experiments on the ORL face recognition dataset show the effectiveness of the proposed weighting schemes.
    Comment: accepted (to appear) in the International Conference on Image Analysis and Recognition (ICIAR) 2020, Springer.
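
    Weighted FDA places a weight on each pair of classes in the between-class scatter. The sketch below shows that generic construction with an arbitrary inverse-distance weighting; the cosine and automatic weighting schemes of the paper are not reproduced here, and the regularisation constant and toy data are assumptions.

```python
import numpy as np
from scipy.linalg import eigh

def weighted_fda(X, y, n_components=2, eps=1e-6):
    """Generic pairwise-weighted FDA; the inverse-distance weights are an arbitrary choice."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    d = X.shape[1]

    Sw = np.zeros((d, d))                                 # within-class scatter
    for c, mu in zip(classes, means):
        Xc = X[y == c] - mu
        Sw += Xc.T @ Xc

    Sb = np.zeros((d, d))                                 # weighted between-class scatter
    for k in range(len(classes)):
        for l in range(k + 1, len(classes)):
            diff = (means[k] - means[l])[:, None]
            w = 1.0 / (np.linalg.norm(diff) ** 2 + eps)   # assumed pairwise weight
            Sb += w * (diff @ diff.T)

    # generalized eigenproblem Sb v = lambda (Sw + eps I) v; top eigenvectors span the subspace
    vals, vecs = eigh(Sb, Sw + eps * np.eye(d))
    return vecs[:, np.argsort(vals)[::-1][:n_components]]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1.0, size=(30, 5)) for m in (0.0, 2.0, 6.0)])
y = np.repeat([0, 1, 2], 30)
W = weighted_fda(X, y)
print((X @ W).shape)                                      # (90, 2)
```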

    Evaluating the Viscoelastic Properties of Tissue from Laser Speckle Fluctuations

    Most pathological conditions, such as atherosclerosis, cancer, and neurodegenerative and orthopedic disorders, are accompanied by alterations in tissue viscoelasticity. Laser Speckle Rheology (LSR) is a novel optical technology that offers invaluable potential for the mechanical assessment of tissue in situ. In LSR, the specimen is illuminated with coherent light and the time constant of speckle fluctuations, τ, is measured using a high-speed camera. Prior work indicates that τ is closely correlated with tissue microstructure and composition. Here, we investigate the relationship between LSR measurements of τ and sample mechanical properties defined by the viscoelastic modulus, G*. Phantom and tissue samples spanning a broad range of viscoelastic properties are evaluated using LSR and conventional mechanical testing. Results demonstrate a strong correlation between τ and |G*| for both phantom (r = 0.79, p < 0.0001) and tissue (r = 0.88, p < 0.0001) specimens, establishing the unique capability of LSR in characterizing tissue viscoelasticity.
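
    In LSR the key measurement is the time constant τ of speckle intensity fluctuations. The sketch below shows a generic way to estimate such a time constant from a stack of frames: compute the temporal autocorrelation and fit a single-exponential decay. This is a simplified illustration, not the paper's processing pipeline; the synthetic AR(1) frame stack and the single-exponential model are assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

# --- toy stand-in for a speckle frame stack (an AR(1) process with a known time constant) ---
rng = np.random.default_rng(0)
n_frames, fps, tau_true = 300, 100.0, 0.05
rho = np.exp(-1.0 / (fps * tau_true))                     # frame-to-frame correlation
frames = [rng.normal(size=(32, 32))]
for _ in range(n_frames - 1):
    frames.append(rho * frames[-1] + np.sqrt(1.0 - rho ** 2) * rng.normal(size=(32, 32)))
I = np.stack(frames).reshape(n_frames, -1)                # (frames, pixels)

# --- temporal autocorrelation of the fluctuations, averaged over pixels and start times ---
lags = np.arange(1, 40)
acf = np.array([np.mean(I[:-k] * I[k:]) for k in lags])   # zero mean, unit variance by construction

# --- fit a single-exponential decay to estimate the speckle time constant tau ---
decay = lambda dt, tau, a: a * np.exp(-dt / tau)
(tau_fit, _), _ = curve_fit(decay, lags / fps, acf, p0=[0.01, 1.0])
print(f"true tau = {tau_true:.3f} s, estimated tau ≈ {tau_fit:.3f} s")
```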

    Population based trends in mortality, morbidity and treatment for very preterm- and very low birth weight infants over 12 years

    BACKGROUND: Over the last two decades, improvements in medical care have been associated with significantly improved survival and outcomes of very preterm (VP, < 32 completed gestational weeks) and very low birth weight (VLBW, < 1500 g) infants. Only a few publications analyse changes in their short-term outcome in a geographically defined area over more than 10 years. We therefore aimed to investigate the net change in the number of VP and VLBW infants leaving the hospital without major complications. METHODS: Our population-based observational cohort study used the Minimal Neonatal Data Set, a database maintained by the Swiss Society of Neonatology that includes information on all VP and VLBW infants. Perinatal characteristics, mortality and morbidity rates and survival free of major complications were analysed and their temporal trends evaluated. RESULTS: In 1996, 2000, 2004, and 2008, a total of 3090 infants were enrolled in the Network Database. Over the same period, the rate of VP and VLBW neonates increased significantly, from 0.87% in 1996 to 1.10% in 2008 (p < 0.001). Overall mortality remained stable at 13%, but survival free of major complications increased from 66.9% to 71.7% (p < 0.01). The percentage of infants receiving a full course of antenatal corticosteroids increased from 67.7% in 1996 to 91.4% in 2008 (p < 0.001). Surfactant was given more frequently (24.8% in 1996 compared to 40.1% in 2008, p < 0.001) and the frequency of mechanical ventilation remained stable at about 43%. However, the use of CPAP therapy increased considerably, from 43% to 73.2% (p < 0.001). Some typical neonatal pathologies such as bronchopulmonary dysplasia, necrotising enterocolitis and intraventricular haemorrhage decreased significantly (p ≤ 0.02), whereas others such as patent ductus arteriosus and respiratory distress syndrome increased (p < 0.001). CONCLUSIONS: Over the 12-year observation period, the number of VP and VLBW infants increased significantly. An unchanged overall mortality rate and an increase in survivors free of major complications resulted in a considerable net gain in infants with a potentially good outcome.

    Clustering cliques for graph-based summarization of the biomedical research literature

    BACKGROUND: Graph-based notions are increasingly used in biomedical data mining and knowledge discovery tasks. In this paper, we present a clique-clustering method to automatically summarize graphs of semantic predications produced from PubMed citations (titles and abstracts). RESULTS: SemRep was used to extract semantic predications from the citations returned by a PubMed search. Cliques were identified from frequently occurring predications with highly connected arguments, filtered by degree centrality. Themes contained in the summary were identified with a hierarchical clustering algorithm based on common arguments shared among cliques. The validity of the clusters in the summaries produced was compared to a Silhouette-generated baseline for cohesion, separation and overall validity. The theme labels were also compared to a reference standard produced with major MeSH headings. CONCLUSIONS: For 11 topics in the testing data set, the overall validity of clusters from the system summary was 10% better than the baseline (43% versus 33%). When compared to the reference standard from MeSH headings, recall, precision and F-score were 0.64, 0.65, and 0.65, respectively.
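
    The method finds cliques in a graph built from semantic predications and then groups cliques that share arguments. The sketch below illustrates that generic pipeline on a toy co-occurrence graph with networkx and scipy: nodes are filtered by degree centrality, maximal cliques are enumerated, pairwise clique similarity is measured by argument overlap (Jaccard), and the cliques are hierarchically clustered. The toy edges, the Jaccard measure, the centrality threshold and the cut level are assumptions, not the paper's exact settings.

```python
import networkx as nx
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster

# toy graph whose nodes stand in for predication arguments
G = nx.Graph([
    ("aspirin", "inflammation"), ("aspirin", "pain"), ("inflammation", "pain"),
    ("statin", "cholesterol"), ("statin", "cardiovascular disease"),
    ("cholesterol", "cardiovascular disease"), ("pain", "cardiovascular disease"),
])

# keep reasonably central nodes, then enumerate maximal cliques
central = {n for n, c in nx.degree_centrality(G).items() if c >= 0.2}
cliques = [set(c) for c in nx.find_cliques(G.subgraph(central)) if len(c) >= 2]

# hierarchical clustering of cliques by argument overlap (Jaccard distance)
n = len(cliques)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        jac = len(cliques[i] & cliques[j]) / len(cliques[i] | cliques[j])
        D[i, j] = D[j, i] = 1.0 - jac

Z = linkage(squareform(D, checks=False), method="average")
labels = fcluster(Z, t=0.8, criterion="distance")          # one theme per cluster of cliques
for lab, cl in zip(labels, cliques):
    print(lab, sorted(cl))
```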