
    Soil-Structure Interaction and ULS Design of Complex Deep Foundations

    In conventional design of deep foundations, some important positive effects arising from the interaction between the bearing elements and the subsoil (Soil-Structure Interaction) are not utilised. These positive effects arise especially when using Combined Pile-Raft Foundations (CPRFs). The application of numerical methods during the design process of such foundations, which is explicitly allowed in Eurocode 7, can account for these effects. This paper presents an approach using numerical methods within the ULS design for complex foundations and discusses case histories where CPRFs are used as foundations for high-rise buildings in Frankfurt am Main. The paper concludes with an introduction to Seasonal Thermal Storage, in which the piles of a deep foundation are used as energy piles to store heat in, or extract heat from, the surrounding subsoil.

    Scaling MLPs: A Tale of Inductive Bias

    In this work we revisit the most fundamental building block in deep learning, the multi-layer perceptron (MLP), and study the limits of its performance on vision tasks. Empirical insights into MLPs are important for multiple reasons. (1) Given the recent narrative "less inductive bias is better", popularized due to transformers eclipsing convolutional models, it is natural to explore the limits of this hypothesis. To that end, MLPs offer an ideal test bed, being completely free of any inductive bias. (2) MLPs have almost exclusively been the main protagonist in the deep learning theory literature due to their mathematical simplicity, serving as a proxy to explain empirical phenomena observed for more complex architectures. Surprisingly, experimental datapoints for MLPs are very difficult to find in the literature, especially when coupled with large pre-training protocols. This discrepancy between practice and theory is worrying: Do MLPs reflect the empirical advances exhibited by practical models? Or do theorists need to rethink the role of MLPs as a proxy? We provide insights into both these aspects. We show that the performance of MLPs drastically improves with scale (93% on CIFAR10, 79% on CIFAR100, 69% on TinyImageNet), highlighting that lack of inductive bias can indeed be compensated. We observe that MLPs mimic the behaviour of their modern counterparts faithfully, with some components in the learning setting however surprisingly exhibiting stronger or unexpected behaviours. Due to their inherent computational efficiency, large pre-training experiments become more accessible for academic researchers. All of our experiments were run on a single GPU

    Random Teachers are Good Teachers

    In this work, we investigate the implicit regularization induced by teacher-student learning dynamics in self-distillation. To isolate its effect, we describe a simple experiment where we consider teachers at random initialization instead of trained teachers. Surprisingly, when distilling a student into such a random teacher, we observe that the resulting model and its representations already possess very interesting characteristics; (1) we observe a strong improvement of the distilled student over its teacher in terms of probing accuracy. (2) The learned representations are data-dependent and transferable between different tasks but deteriorate strongly if trained on random inputs. (3) The student checkpoint contains sparse subnetworks, so-called lottery tickets, and lies on the border of linear basins in the supervised loss landscape. These observations have interesting consequences for several important areas in machine learning: (1) Self-distillation can work solely based on the implicit regularization present in the gradient dynamics without relying on any dark knowledge, (2) self-supervised learning can learn features even in the absence of data augmentation and (3) training dynamics during the early phase of supervised training do not necessarily require label information. Finally, we shed light on an intriguing local property of the loss landscape: the process of feature learning is strongly amplified if the student is initialized closely to the teacher. These results raise interesting questions about the nature of the landscape that have remained unexplored so far. Code is available at https://github.com/safelix/dinopl

    Disentangling Linear Mode-Connectivity

    Linear mode-connectivity (LMC) (or lack thereof) is one of the intriguing characteristics of neural network loss landscapes. While empirically well established, it unfortunately still lacks a proper theoretical understanding. Even worse, although empirical data points abound, a systematic study of when networks exhibit LMC is largely missing in the literature. In this work we aim to close this gap. We explore how LMC is affected by three factors: (1) architecture (sparsity, weight-sharing), (2) training strategy (optimization setup) as well as (3) the underlying dataset. We place particular emphasis on minimal but non-trivial settings, removing as much unnecessary complexity as possible. We believe that our insights can guide future theoretical works on uncovering the inner workings of LMC.
    Comment: 9 pages, 5 figures
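
    To make the notion of LMC concrete, the following is a minimal sketch of one common way to quantify it: evaluate the loss along the straight line between two parameter vectors and record how far it rises above the linear interpolation of the endpoint losses. This is not taken from the paper; the names `interpolate`, `loss_barrier` and `toy_loss`, and the two-parameter toy loss itself, are assumptions chosen purely for illustration.

```python
from typing import Callable, List, Sequence

Params = Sequence[float]


def interpolate(a: Params, b: Params, alpha: float) -> List[float]:
    """Point on the straight line from a (alpha=0) to b (alpha=1)."""
    return [(1.0 - alpha) * x + alpha * y for x, y in zip(a, b)]


def loss_barrier(
    loss: Callable[[Params], float], a: Params, b: Params, steps: int = 11
) -> float:
    """Largest excess of the loss on the path over the interpolated endpoint losses.

    A value close to zero suggests the two solutions are linearly connected.
    """
    loss_a, loss_b = loss(a), loss(b)
    worst = 0.0
    for i in range(steps):
        alpha = i / (steps - 1)
        baseline = (1.0 - alpha) * loss_a + alpha * loss_b
        gap = loss(interpolate(a, b, alpha)) - baseline
        worst = max(worst, gap)
    return worst


def toy_loss(theta: Params) -> float:
    """Hypothetical loss with two separate minima at (1, 0) and (-1, 0)."""
    return (theta[0] ** 2 - 1.0) ** 2 + theta[1] ** 2


# The two minima are separated by a bump at the origin, so a barrier appears.
barrier_between_minima = loss_barrier(toy_loss, [1.0, 0.0], [-1.0, 0.0])
# Interpolating a solution with itself never leaves the minimum.
barrier_same_point = loss_barrier(toy_loss, [1.0, 0.0], [1.0, 0.0])
```

    For real networks the parameter vectors would be the flattened weights of two trained models and the loss would be computed on held-out data; the grid resolution `steps` is a free choice.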

    Science in Russia: Factors of Modernization and Resources for Development

    One of the most strategic resources of a country is its science and technology complex. To be productive, scientists need excellent conditions for doing research. Consequently, they choose places where they can work efficiently, and some even leave their home country to do research under better conditions. The study describes the key indicators of efficient research activity. To characterize the science of a country, we distinguish two groups: indicators of scientific and technological capabilities and indicators for assessing the impact of scientific productivity. We use statistical data from reports published in Russia from 2003 to 2014. We carried out a comparative analysis of performance indicators of research activity in some European and Asian OECD member countries. This provides an overview of the main development trends and the state of world science. Using data analysis presented in the Science Watch, Web of Science and Scopus databases, the OECD STAN database, ANBERD, and data published by the National Science Foundation and the RAND Corporation, we identify key indicators of research activity in the world. We point out the negative tendency of emigration of Russian scientists and highlight the main reasons for this process. In conclusion, we outline the main reasons for the crisis in Russian science.

    CLIP-Guided Vision-Language Pre-training for Question Answering in 3D Scenes

    Training models to apply linguistic knowledge and visual concepts from 2D images to 3D world understanding is a promising direction that researchers have only recently started to explore. In this work, we design a novel 3D pre-training Vision-Language method that helps a model learn semantically meaningful and transferable 3D scene point cloud representations. We inject the representational power of the popular CLIP model into our 3D encoder by aligning the encoded 3D scene features with the corresponding 2D image and text embeddings produced by CLIP. To assess our model's 3D world reasoning capability, we evaluate it on the downstream task of 3D Visual Question Answering. Experimental quantitative and qualitative results show that our pre-training method outperforms state-of-the-art works in this task and leads to an interpretable representation of 3D scene features.
    Comment: CVPRW 2023. Code will be made publicly available: https://github.com/AlexDelitzas/3D-VQ

    Multi-CLIP: Contrastive Vision-Language Pre-training for Question Answering tasks in 3D Scenes

    Training models to apply common-sense linguistic knowledge and visual concepts from 2D images to 3D scene understanding is a promising direction that researchers have only recently started to explore. However, it remains understudied whether 2D distilled knowledge can provide useful representations for downstream 3D vision-language tasks such as 3D question answering. In this paper, we propose a novel 3D pre-training Vision-Language method, namely Multi-CLIP, that enables a model to learn language-grounded and transferable 3D scene point cloud representations. We leverage the representational power of the CLIP model by maximizing the agreement between the encoded 3D scene features and the corresponding 2D multi-view image and text embeddings in the CLIP space via a contrastive objective. To validate our approach, we consider the challenging downstream tasks of 3D Visual Question Answering (3D-VQA) and 3D Situated Question Answering (3D-SQA). To this end, we develop novel multi-modal transformer-based architectures and we demonstrate how our pre-training method can benefit their performance. Quantitative and qualitative experimental results show that Multi-CLIP outperforms state-of-the-art works across the downstream tasks of 3D-VQA and 3D-SQA and leads to a well-structured 3D scene feature space.
    Comment: The first two authors contributed equally
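
    The abstract above does not spell out the contrastive objective. As a rough illustration only, the sketch below uses an InfoNCE-style loss, which is an assumption here and not necessarily the formulation used in Multi-CLIP: each 3D scene feature should be most similar to its own target embedding and less similar to the targets of other scenes in the batch. The function names, the temperature value and the two-dimensional toy vectors are all hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def normalize(v: Vector) -> List[float]:
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(
    scene_feats: Sequence[Vector],
    target_embeds: Sequence[Vector],
    temperature: float = 0.07,
) -> float:
    """Mean cross-entropy of matching scene i to target i among all targets."""
    scenes = [normalize(v) for v in scene_feats]
    targets = [normalize(v) for v in target_embeds]
    total = 0.0
    for i, scene in enumerate(scenes):
        logits = [
            sum(a * b for a, b in zip(scene, target)) / temperature
            for target in targets
        ]
        # Numerically stable log-sum-exp over the batch of targets.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(scenes)


# Toy batch: two scene features and their (hypothetical) target embeddings.
scene_batch = [[1.0, 0.0], [0.0, 1.0]]
aligned_targets = [[2.0, 0.0], [0.0, 3.0]]      # same directions as the scenes
mismatched_targets = [[0.0, 3.0], [2.0, 0.0]]   # pairs swapped

aligned_loss = info_nce(scene_batch, aligned_targets)
mismatched_loss = info_nce(scene_batch, mismatched_targets)
```

    In this toy batch the aligned pairing yields a much smaller loss than the swapped pairing, which is the behaviour a contrastive alignment objective is meant to reward. A practical version would operate on batched tensors and would typically sum such terms for the image and text targets.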