368 research outputs found

    Comparative aspects of pegmatitic and pneumatolytic evolution in Cornish granites

    Get PDF
    Imperial Users only

    Raman fingerprint of semi-metal WTe2 from bulk to monolayer

    Get PDF
    Tungsten ditelluride (WTe2), a layered transition-metal dichalcogenide (TMD), has recently demonstrated an extremely large magnetoresistance effect, which is unique among TMDs. This fascinating feature seems to be correlated with its special electronic structure. Here, we report the observation of 6 Raman peaks, corresponding to the A_2^4, A_1^9, A_1^8, A_1^6, A_1^5 and A_1^2 phonons, out of the 33 Raman-active modes predicted for WTe2. This provides direct evidence to distinguish the space group of WTe2 from that of other TMDs. Moreover, the Raman evolution of WTe2 from bulk to monolayer is clearly revealed. Interestingly, the A_2^4 mode, centered at ~109.8 cm^-1, is forbidden in the monolayer, which may be attributable to the transition of the point group from C2v (bulk) to C2h (monolayer). Our work characterizes all observed Raman peaks in bulk and few-layer samples and provides a route to studying the physical properties of two-dimensional WTe2.
    Comment: 19 pages, 4 figures and 2 tables

    Postglacial sea-level change: novel insights from physical and statistical modelling

    Get PDF
    Developing accurate projections of future sea-level change is a key challenge for the entire science community under the current warming climate. Because modern instrumental sea-level observations only extend back to the 19th-20th centuries, projections based on them capture only short-term effects, leaving physical processes that dominate over longer timescales underestimated. An essential step towards accurate and robust long-term sea-level projections is therefore to investigate the physical processes that shape the spatio-temporal evolution of sea-level change over centennial to millennial timescales. Because palaeo sea-level observations are sometimes scarce and often noisy, the mechanisms of sea-level change over geological timescales are still not well understood, with many outstanding questions to be resolved. This thesis develops novel physical and statistical models to better understand the mechanisms behind postglacial sea-level change, focusing on three outstanding problems that matter not only for postglacial sea-level change but also for understanding past ice-sheet dynamics and palaeoclimate change.

    Firstly, a statistical framework is developed to invert the sources of meltwater pulse 1A, the largest and most rapid global sea-level rise event of the last deglaciation, with sophisticated treatment of the uncertainties associated with sea-level reconstructions and geophysical modelling. The results suggest contributions from North America, 12.0 m (5.6-15.4 m; 95% probability), Scandinavia, 4.6 m (3.2-6.4 m), and Antarctica, 1.3 m (0-5.9 m), giving a total global mean sea-level rise of 17.9 m (15.7-20.2 m) in 500 years.

    Secondly, the missing ice problem (the distinctive imbalance between observed global mean sea-level rise and the reconstructed amount of ice-sheet melt) is revisited by including an extra physical process, sediment isostatic adjustment (SIA), which has not previously been considered in this problem. In particular, the thesis investigates the impact of SIA on local relative sea-level (RSL) variation across the Great Barrier Reef (GBR), the world's largest mixed carbonate-siliciclastic sediment system. Based on a Bayesian calibration method, SIA can contribute up to 1.1 m of relative sea-level rise on the outer shelf of the southern central GBR from 28 ka to present. Because the SIA-induced RSL rise is unrelated to ice mass loss, failing to correct for this signal leads to systematic overestimation of grounded ice volume. Incorporating the SIA process therefore reduces the global grounded ice volume estimate for the Last Glacial Maximum (LGM), which can help to mitigate the missing ice problem.

    Lastly, robust global barystatic sea-level maps with minimal dependence on the detailed geometry of past ice-sheet change are reconstructed. Estimating such maps requires physical simulation of relative sea level for thousands of different ice histories, which is computationally prohibitive. To address this, the thesis develops a statistical emulator which mimics the behaviour of a physics-based model and is computationally much cheaper to evaluate. The results highlight the Seychelles as an exceptionally good place to map barystatic sea level throughout the last deglaciation, because RSL at this location departs only slightly from global barystatic sea level, with minor dependence on the assumed ice history.

    Together, these physical and statistical models provide powerful tools for gaining novel insights into the mechanisms of postglacial sea-level change, and hence have the potential to deliver more robust, accurate and trustworthy sea-level change projections.
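
    As a rough illustration of the emulator idea described above, the sketch below fits a Gaussian-process surrogate to a handful of runs of a stand-in "physics model" and then predicts, with uncertainty, at new parameter settings. The Gaussian-process form, the three-parameter toy ice history and the rsl_physics_model placeholder are assumptions made here for illustration, not the thesis's actual implementation.

        # Minimal emulator sketch (assumed setup): in a real application,
        # rsl_physics_model would be an expensive glacial isostatic adjustment run.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, ConstantKernel

        def rsl_physics_model(ice_params):
            # Hypothetical placeholder: maps three ice-history parameters to
            # relative sea level at one site and time.
            return 0.8 * ice_params[0] - 0.3 * ice_params[1] ** 2 + 0.1 * ice_params[2]

        rng = np.random.default_rng(0)
        X_train = rng.uniform(-1.0, 1.0, size=(50, 3))               # sampled ice histories
        y_train = np.array([rsl_physics_model(x) for x in X_train])  # "expensive" runs

        emulator = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(),
                                            normalize_y=True)
        emulator.fit(X_train, y_train)

        # The emulator now replaces the simulator for thousands of cheap evaluations.
        X_new = rng.uniform(-1.0, 1.0, size=(5, 3))
        mean, std = emulator.predict(X_new, return_std=True)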

    An Open Source Data Contamination Report for Large Language Models

    Full text link
    Data contamination in model evaluation has become increasingly prevalent with the growing popularity of large language models. It allows models to "cheat" via memorisation instead of displaying true capabilities. Contamination analysis has therefore become a crucial part of reliable model evaluation to validate results. However, existing contamination analysis is usually conducted internally by large language model developers and often lacks transparency and completeness. This paper presents an extensive data contamination report for over 15 popular large language models across six popular multiple-choice QA benchmarks. We also introduce an open-source pipeline that enables the community to perform contamination analysis on customised data and models. Our experiments reveal varying contamination levels ranging from 1% to 45% across benchmarks, with the degree of contamination increasing rapidly over time. Performance analysis of large language models indicates that data contamination does not necessarily lead to higher metrics: while significant accuracy boosts of up to 14% and 7% are observed on the contaminated C-Eval and Hellaswag benchmarks, only a minimal increase is noted on contaminated MMLU. We also find that larger models appear to gain a greater advantage than smaller models on contaminated test sets.
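
    The paper's own pipeline is the open-source one mentioned above; purely as an illustration of the kind of check such pipelines perform, the sketch below flags a benchmark item as potentially contaminated when a large share of its word n-grams also appear in a training corpus. The n-gram length and the flagging threshold are arbitrary assumptions, not the paper's settings.

        # Illustrative n-gram overlap contamination check (assumed parameters).
        def ngrams(text, n=8):
            tokens = text.lower().split()
            return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

        def contamination_rate(benchmark_items, corpus_docs, n=8, overlap_threshold=0.5):
            corpus_ngrams = set()
            for doc in corpus_docs:
                corpus_ngrams |= ngrams(doc, n)
            flagged = 0
            for item in benchmark_items:
                item_ngrams = ngrams(item, n)
                if not item_ngrams:
                    continue
                overlap = len(item_ngrams & corpus_ngrams) / len(item_ngrams)
                if overlap >= overlap_threshold:
                    flagged += 1            # item heavily overlaps the corpus
            return flagged / max(len(benchmark_items), 1)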

    LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction

    Full text link
    Data contamination in evaluation is becoming increasingly prevalent with the emergence of language models pre-trained on very large, automatically crawled corpora. This problem leads to significant challenges in accurately assessing model capabilities and generalisation. In this paper, we propose LatestEval, an automatic method that leverages the most recent texts to create uncontaminated reading comprehension evaluations. LatestEval avoids data contamination by only using texts published within a recent time window, ensuring no overlap with the training corpora of pre-trained language models. We develop the LatestEval automated pipeline to 1) gather the latest texts; 2) identify key information; and 3) construct questions targeting that information while removing the existing answers from the context. This encourages models to infer the answers themselves from the remaining context, rather than simply copy-pasting. Our experiments demonstrate that language models exhibit negligible memorisation behaviour on LatestEval, as opposed to previous benchmarks, suggesting a significantly reduced risk of data contamination and leading to a more robust evaluation. Data and code are publicly available at: https://github.com/liyucheng09/LatestEval
    Comment: AAAI 2024
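
    The released pipeline lives at the repository linked above; the toy sketch below is not that code, only an illustration of the general recipe of filtering texts by publication date and turning a removed key span into a question. The cutoff date and the "longest sentence" heuristic are assumptions made purely for illustration.

        # Toy sketch of building a contamination-free test item from a recent text
        # (assumed cutoff date and key-information heuristic).
        from datetime import date
        import re

        def make_item(doc_text, doc_date, cutoff=date(2023, 9, 1)):
            if doc_date <= cutoff:
                return None                                   # model may have seen this text
            sentences = re.split(r"(?<=[.!?])\s+", doc_text.strip())
            if len(sentences) < 2:
                return None
            answer_sentence = max(sentences, key=len)         # crude proxy for key information
            context = " ".join(s for s in sentences if s != answer_sentence)
            question = "Which key piece of information has been removed from the passage?"
            return {"context": context, "question": question, "reference": answer_sentence}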

    Evaluating Large Language Models for Generalization and Robustness via Data Compression

    Full text link
    Existing methods for evaluating large language models face challenges such as data contamination, sensitivity to prompts, and the high cost of benchmark creation. To address this, we propose a lossless-data-compression-based evaluation approach that tests how models' predictive abilities generalise after their training cutoff. Specifically, we collect comprehensive test data spanning 83 months from 2017 to 2023 and split the data into training and testing periods according to each model's training data cutoff. We measure: 1) compression performance on the testing period as a measure of generalisation on unseen data; and 2) the performance gap between the training and testing periods as a measure of robustness. Our experiments test 14 representative large language models of various sizes on sources including Wikipedia, news articles, code, arXiv papers, and multi-modal data. We find that the compression performance of many models degrades significantly after their cutoff date, but models such as Mistral and Llama-2 demonstrate a good balance between performance and robustness. Results also suggest that models struggle to generalise on news and code data, but work especially well on arXiv papers. We also find that the context size and the tokenization implementation have a substantial impact on overall compression performance.
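
    The abstract does not spell out the compression metric or tooling; one common way to realise a lossless-compression-based evaluation is to use the model's negative log-likelihood as the (arithmetic-coding) code length, as in the sketch below. The gpt2 model and the bits-per-character metric are assumptions for illustration, not necessarily the paper's choices.

        # Sketch: a causal LM's negative log-likelihood, in bits, lower-bounds the
        # size an arithmetic coder driven by that LM would need to encode the text.
        import math
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        model_name = "gpt2"                                   # assumed; any causal LM works
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForCausalLM.from_pretrained(model_name).eval()

        def bits_per_character(text):
            ids = tokenizer(text, return_tensors="pt").input_ids
            with torch.no_grad():
                out = model(ids, labels=ids)                  # mean cross-entropy (nats/token)
            total_nats = out.loss.item() * (ids.shape[1] - 1)
            return total_nats / math.log(2) / len(text)

        # Lower is better; comparing pre- vs post-cutoff texts gives the generalisation gap.
        print(bits_per_character("Compression tracks how well a model predicts unseen text."))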

    Compressing Context to Enhance Inference Efficiency of Large Language Models

    Full text link
    Large language models (LLMs) have achieved remarkable performance across various tasks. However, they face challenges in managing long documents and extended conversations, due to significantly increased computational requirements, in both memory and inference time, and potential context truncation when the input exceeds the LLM's fixed context length. This paper proposes a method called Selective Context that enhances the inference efficiency of LLMs by identifying and pruning redundancy in the input context to make the input more compact. We test our approach on common data sources requiring long-context processing: arXiv papers, news articles, and long conversations, on the tasks of summarisation, question answering, and response generation. Experimental results show that Selective Context significantly reduces memory cost and generation latency while maintaining performance comparable to that achieved with the full context. Specifically, we achieve a 50% reduction in context cost, resulting in a 36% reduction in inference memory usage and a 32% reduction in inference time, while observing only a minor drop of 0.023 in BERTScore and 0.038 in faithfulness on four downstream applications, indicating that our method strikes a good balance between efficiency and performance.
    Comment: EMNLP 2023. arXiv admin note: substantial text overlap with arXiv:2304.12102; text overlap with arXiv:2303.11076 by other authors
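
    The Selective Context implementation itself accompanies the paper; as a rough sketch of the underlying idea of pruning low-information content, the code below scores each token by its surprisal under a small causal LM and keeps only the most informative half. The token-level granularity, the gpt2 scorer and the 50% keep ratio are assumptions for illustration only.

        # Sketch of surprisal-based context pruning (assumed granularity and ratio).
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained("gpt2")
        model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

        def prune_context(text, keep_ratio=0.5):
            ids = tokenizer(text, return_tensors="pt").input_ids
            with torch.no_grad():
                logits = model(ids).logits
            log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
            targets = ids[0, 1:]
            self_info = -log_probs[torch.arange(targets.numel()), targets]  # per-token surprisal
            k = max(1, int(keep_ratio * targets.numel()))
            keep = torch.topk(self_info, k).indices.sort().values           # keep most surprising
            return tokenizer.decode(targets[keep])

        print(prune_context("The quick brown fox jumps over the lazy dog near the quiet river."))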

    Metaphor Detection via Explicit Basic Meanings Modelling

    Full text link
    One noticeable trend in metaphor detection is the embrace of linguistic theories such as the metaphor identification procedure (MIP) for model architecture design. While MIP clearly states that the metaphoricity of a lexical unit is determined by the contrast between its contextual meaning and its basic meaning, existing work does not strictly follow this principle and typically uses the aggregated meaning to approximate the basic meaning of target words. In this paper, we propose a novel metaphor detection method which models the basic meaning of a word based on literal annotations from the training set, and then compares this with the contextual meaning in a target sentence to identify metaphors. Empirical results show that our method significantly outperforms the state-of-the-art method by 1.0% in F1 score. Moreover, our performance even reaches the theoretical upper bound on the VUA18 benchmark for targets with basic annotations, which demonstrates the importance of modelling basic meanings for metaphor detection.
    Comment: ACL 2023
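
    No code is given in the abstract; the sketch below only illustrates the basic-meaning vs. contextual-meaning contrast at the heart of the approach, representing a word's basic meaning as the mean contextual embedding over literal example sentences and flagging a target use as metaphoric when the similarity drops below a threshold. The bert-base-uncased encoder, the 0.6 threshold and the example sentences are assumptions for illustration, not the paper's model.

        # Sketch: contrast a word's contextual embedding with a "basic meaning"
        # vector built from literal occurrences (assumed encoder and threshold).
        import torch
        from transformers import AutoModel, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        encoder = AutoModel.from_pretrained("bert-base-uncased").eval()

        def word_embedding(sentence, word):
            enc = tokenizer(sentence, return_tensors="pt")
            with torch.no_grad():
                hidden = encoder(**enc).last_hidden_state[0]
            word_ids = tokenizer(word, add_special_tokens=False).input_ids
            tokens = enc.input_ids[0].tolist()
            for i in range(len(tokens) - len(word_ids) + 1):
                if tokens[i:i + len(word_ids)] == word_ids:
                    return hidden[i:i + len(word_ids)].mean(dim=0)
            raise ValueError(f"{word!r} not found in sentence")

        def is_metaphoric(target_sentence, word, literal_sentences, threshold=0.6):
            basic = torch.stack([word_embedding(s, word) for s in literal_sentences]).mean(dim=0)
            contextual = word_embedding(target_sentence, word)
            return torch.cosine_similarity(basic, contextual, dim=0).item() < threshold

        print(is_metaphoric("She attacked every weakness in his argument.", "attacked",
                            ["The army attacked the fort at dawn.", "Wolves attacked the herd."]))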

    On Robustness and Generalization of ML-Based Congestion Predictors to Valid and Imperceptible Perturbations

    Full text link
    There is substantial interest in the use of machine learning (ML)-based techniques throughout the electronic computer-aided design (CAD) flow, particularly methods based on deep learning. However, while deep learning methods have achieved state-of-the-art performance in several applications, recent work has demonstrated that neural networks are generally vulnerable to small, carefully chosen perturbations of their input (e.g. a single pixel change in an image). In this work, we investigate robustness in the context of ML-based EDA tools, particularly for congestion prediction. As far as we are aware, we are the first to explore this concept in the context of ML-based EDA. We first describe a novel notion of imperceptibility designed specifically for VLSI layout problems defined on netlists and cell placements. Our definition of imperceptibility is characterized by a guarantee that a perturbation to a layout will not alter its global routing. We then demonstrate that state-of-the-art CNN- and GNN-based congestion models exhibit brittleness to imperceptible perturbations. Namely, we show that shifting a small number of cells (e.g. 1%-5% of cells) in a way that is guaranteed to leave a measure of global congestion unaffected can drastically change the models' predictions (e.g. adversarially shifting 1% of the design by 0.001% of the layout space results in a predicted decrease in congestion of up to 90%, even though the perturbation implies no change in actual congestion). In other words, the quality of a predictor can be made arbitrarily poor (i.e. it can be made to predict that a design is "congestion-free") for an arbitrary input layout. Next, we describe a simple technique to train predictors that improves robustness to these perturbations. Our work indicates that CAD engineers should be cautious when integrating neural-network-based mechanisms in EDA flows, to ensure robust and high-quality results.
    Comment: 7 pages, 7 figures
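
    The paper's perturbation construction and trained models are not reproduced here; the sketch below only shows the shape of such a robustness check: jitter a small fraction of cells by amounts that keep each cell inside its global-routing grid cell (so per-gcell occupancy, a crude proxy for global routing, is unchanged) and compare the predictor's output before and after. The gcell pitch, perturbation budget and predict_congestion stand-in are all hypothetical.

        # Hypothetical robustness check: tiny, gcell-preserving shifts vs. a
        # coordinate-sensitive congestion "predictor" (a stand-in for a CNN/GNN).
        import numpy as np

        GCELL = 10.0                                           # assumed global-routing grid pitch

        def predict_congestion(cell_xy):
            # Stand-in predictor: sensitive to exact coordinates, unlike true
            # global routing, so its output can drift under the perturbation.
            hist, _, _ = np.histogram2d(cell_xy[:, 0], cell_xy[:, 1], bins=20)
            return float(hist.std())

        def imperceptible_perturbation(cell_xy, frac=0.02, eps=1e-3, seed=0):
            rng = np.random.default_rng(seed)
            perturbed = cell_xy.copy()
            idx = rng.choice(len(cell_xy), size=max(1, int(frac * len(cell_xy))), replace=False)
            for i in idx:
                lo = np.floor(cell_xy[i] / GCELL) * GCELL      # gcell lower corner
                hi = lo + GCELL
                perturbed[i] = np.clip(cell_xy[i] + rng.uniform(-eps, eps, 2), lo, hi - 1e-9)
            return perturbed

        cells = np.random.default_rng(1).uniform(0, 100, size=(1000, 2))
        before = predict_congestion(cells)
        after = predict_congestion(imperceptible_perturbation(cells))
        print(f"relative change in predicted congestion: {abs(after - before) / before:.2%}")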