
    Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why?

    The use of pretrained embeddings has become widespread in modern e-commerce machine learning (ML) systems. In practice, however, we have encountered several key issues when using pretrained embeddings in a real-world production system, many of which cannot be fully explained by current knowledge. Unfortunately, we find that there is a lack of a thorough understanding of how pretrained embeddings work, especially their intrinsic properties and interactions with downstream tasks. Consequently, it becomes challenging to make interactive and scalable decisions regarding the use of pretrained embeddings in practice. Our investigation leads to two significant discoveries about using pretrained embeddings in e-commerce applications. Firstly, we find that the design of the pretraining and downstream models, particularly how they encode and decode information via embedding vectors, can have a profound impact. Secondly, we establish a principled perspective of pretrained embeddings via the lens of kernel analysis, which can be used to evaluate their predictability, interactively and scalably. These findings help to address the practical challenges we faced and offer valuable guidance for successful adoption of pretrained embeddings in real-world production. Our conclusions are backed by solid theoretical reasoning, benchmark experiments, as well as online testing.

    A framework for experimental-data-driven assessment of Magnetized Liner Inertial Fusion stagnation image metrics

    A variety of spherical crystal x-ray imager (SCXI) diagnostics have been developed and fielded on Magnetized Liner Inertial Fusion (MagLIF) experiments at the Sandia National Laboratories Z-facility. These different imaging modalities provide detailed insight into different physical phenomena such as mix of liner material into the hot fuel, cold liner emission, or reduce the impact of liner opacity. However, several practical considerations ranging from the lack of a consistent spatial fiducial for registration to different point-spread-functions and tuning crystals or using filters to highlight specific spectral regions make it difficult to develop broadly applicable metrics to compare experiments across our stagnation image database without making significant unverified assumptions. We leverage experimental data for a model-free assessment of sensitivities to instrumentation-based features for any specified image metric. In particular, we utilize a database of historical and recent MagLIF data including $N_{\text{scans}} = 139$ image plate scans gathered across $N_{\text{exp}} = 67$ different experiments to assess the impact of a variety of features in the experimental observations arising from uncertainties in registration as well as discrepancies in signal-to-noise ratio and instrument resolution. We choose a wavelet-based image metric known as the Mallat Scattering Transform for the study and highlight how alternate metric choices could also be studied. In particular, we demonstrate a capability to understand and mitigate the impact of signal-to-noise, image registration, and resolution difference between images. This is achieved by utilizing multiple scans of the same image plate, sampling random translations and rotations, and applying instrument specific point-spread-functions found by ray tracing to high-resolution datasets, augmenting our data in an effectively model-free fashion. Comment: 17 pages, 14 figures

    Accelerated Sparse Recovery via Gradient Descent with Nonlinear Conjugate Gradient Momentum

    This paper applies an idea of adaptive momentum for the nonlinear conjugate gradient to accelerate optimization problems in sparse recovery. Specifically, we consider two types of minimization problems: a (single) differentiable function and the sum of a non-smooth function and a differentiable function. In the first case, we adopt a fixed step size to avoid the traditional line search and establish the convergence analysis of the proposed algorithm for a quadratic problem. This acceleration is further incorporated with an operator splitting technique to deal with the non-smooth function in the second case. We use the convex $\ell_1$ and the nonconvex $\ell_1-\ell_2$ functionals as two case studies to demonstrate the efficiency of the proposed approaches over traditional methods.
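
The "sum of a non-smooth function and a differentiable function" setting can be illustrated with the standard forward-backward (proximal gradient) iteration for the convex $\ell_1$ case. The sketch below is a plain baseline without momentum, not the accelerated method the abstract proposes; the function names `soft_threshold` and `ista` are hypothetical, and the fixed step size is assumed to be at most the reciprocal of the largest eigenvalue of $A^\top A$.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied componentwise."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def ista(A, b, lam, step, iters=500):
    """Minimise 0.5 * ||Ax - b||^2 + lam * ||x||_1 by forward-backward splitting.

    A is a list of rows, b a list of floats. The step size is fixed
    (no line search) and is assumed small enough for convergence.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - b
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # Gradient of the smooth part: A^T r
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Forward (gradient) step followed by backward (proximal) step
        x = soft_threshold([x[j] - step * g[j] for j in range(n)], step * lam)
    return x
```

With `A` equal to the identity, the minimiser is simply the soft-thresholded data vector, which makes a convenient sanity check.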

    Boosting the Cycle Counting Power of Graph Neural Networks with I$^2$-GNNs

    Message Passing Neural Networks (MPNNs) are a widely used class of Graph Neural Networks (GNNs). The limited representational power of MPNNs inspires the study of provably powerful GNN architectures. However, knowing one model is more powerful than another gives little insight about what functions they can or cannot express. It is still unclear whether these models are able to approximate specific functions such as counting certain graph substructures, which is essential for applications in biology, chemistry and social network analysis. Motivated by this, we propose to study the counting power of Subgraph MPNNs, a recent and popular class of powerful GNN models that extract rooted subgraphs for each node, assign the root node a unique identifier and encode the root node's representation within its rooted subgraph. Specifically, we prove that Subgraph MPNNs fail to count more-than-4-cycles at node level, implying that node representations cannot correctly encode the surrounding substructures like ring systems with more than four atoms. To overcome this limitation, we propose I$^2$-GNNs to extend Subgraph MPNNs by assigning different identifiers for the root node and its neighbors in each subgraph. I$^2$-GNNs' discriminative power is shown to be strictly stronger than Subgraph MPNNs and partially stronger than the 3-WL test. More importantly, I$^2$-GNNs are proven capable of counting all 3, 4, 5 and 6-cycles, covering common substructures like benzene rings in organic chemistry, while still keeping linear complexity. To the best of our knowledge, it is the first linear-time GNN model that can count 6-cycles with theoretical guarantees. We validate its counting power in cycle counting tasks and demonstrate its competitive performance in molecular prediction benchmarks.
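
To make "counting cycles at node level" concrete, the sketch below computes, for the simplest case of 3-cycles, how many triangles pass through each node using the diagonal of the cubed adjacency matrix. This is only a way to produce ground-truth targets for such a task, not the I$^2$-GNN model; the helper names `matmul` and `triangles_per_node` are hypothetical, and the graph is assumed to be simple and undirected.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def triangles_per_node(adj):
    """Number of 3-cycles through each node of a simple undirected graph.

    (A^3)[i][i] counts closed walks of length 3 starting at node i; every
    triangle through i is traversed in two directions, hence the division by 2.
    """
    A3 = matmul(matmul(adj, adj), adj)
    return [A3[i][i] // 2 for i in range(len(adj))]


# Complete graph on 4 nodes: every node lies on 3 triangles.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
counts = triangles_per_node(K4)
total_triangles = sum(counts) // 3  # each triangle is counted at its 3 nodes
```

A 6-ring such as the carbon skeleton of benzene yields zero for every node here, which is why longer cycles need separate counting formulas.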

    Procedure-Aware Pretraining for Instructional Video Understanding

    Our goal is to learn a video representation that is useful for downstream procedure understanding tasks in instructional videos. Due to the small amount of available annotations, a key challenge in procedure understanding is to be able to extract from unlabeled videos the procedural knowledge such as the identity of the task (e.g., 'make latte'), its steps (e.g., 'pour milk'), or the potential next steps given partial progress in its execution. Our main insight is that instructional videos depict sequences of steps that repeat between instances of the same or different tasks, and that this structure can be well represented by a Procedural Knowledge Graph (PKG), where nodes are discrete steps and edges connect steps that occur sequentially in the instructional activities. This graph can then be used to generate pseudo labels to train a video representation that encodes the procedural knowledge in a more accessible form to generalize to multiple procedure understanding tasks. We build a PKG by combining information from a text-based procedural knowledge database and an unlabeled instructional video corpus and then use it to generate training pseudo labels with four novel pre-training objectives. We call this PKG-based pre-training procedure and the resulting model Paprika, Procedure-Aware PRe-training for Instructional Knowledge Acquisition. We evaluate Paprika on COIN and CrossTask for procedure understanding tasks such as task recognition, step recognition, and step forecasting. Paprika yields a video representation that improves over the state of the art: up to 11.23% gains in accuracy in 12 evaluation settings. Implementation is available at https://github.com/salesforce/paprika. Comment: CVPR 202

    Random Young towers and quenched decay of correlations for predominantly expanding multimodal circle maps

    In this paper, we study the random dynamical system $f_\omega^n$ generated by a family of maps $\{f_{\omega_0}: \mathbb S^1 \to \mathbb S^1\}_{\omega \in [-\varepsilon,\varepsilon]}$, $f_{\omega_0}(x) = \alpha \xi (x+\omega_0) +a\ (\mathrm{mod }\ 1)$, where $\xi: \mathbb S^1 \to \mathbb R$ is a non-degenerated map, $a\in \mathbb S^1$, $\alpha,\varepsilon>0$. Fixing a constant $c\in (0,1)$, we show that for $\alpha$ sufficiently large and $\varepsilon > \alpha^{-1+c}$, the random dynamical system $f_\omega^n$ presents a random Young tower structure and quenched decay of correlations. Comment: 38 pages, 0 figures

    Model Diagnostics meets Forecast Evaluation: Goodness-of-Fit, Calibration, and Related Topics

    Principled forecast evaluation and model diagnostics are vital in fitting probabilistic models and forecasting outcomes of interest. A common principle is that fitted or predicted distributions ought to be calibrated, ideally in the sense that the outcome is indistinguishable from a random draw from the posited distribution. Much of this thesis is centered on calibration properties of various types of forecasts. In the first part of the thesis, a simple algorithm for exact multinomial goodness-of-fit tests is proposed. The algorithm computes exact $p$-values based on various test statistics, such as the log-likelihood ratio and Pearson's chi-square. A thorough analysis shows improvement on extant methods. However, the runtime of the algorithm grows exponentially in the number of categories and hence its use is limited. In the second part, a framework rooted in probability theory is developed, which gives rise to hierarchies of calibration, and applies to both predictive distributions and stand-alone point forecasts. Based on a general notion of conditional T-calibration, the thesis introduces population versions of T-reliability diagrams and revisits a score decomposition into measures of miscalibration, discrimination, and uncertainty. Stable and efficient estimators of T-reliability diagrams and score components arise via nonparametric isotonic regression and the pool-adjacent-violators algorithm. For in-sample model diagnostics, a universal coefficient of determination is introduced that nests and reinterprets the classical $R^2$ in least squares regression. In the third part, probabilistic top lists are proposed as a novel type of prediction in classification, which bridges the gap between single-class predictions and predictive distributions. The probabilistic top list functional is elicited by strictly consistent evaluation metrics, based on symmetric proper scoring rules, which admit comparison of various types of predictions.
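
The pool-adjacent-violators algorithm mentioned above can be sketched in a few lines. The version below is a generic textbook formulation of least-squares isotonic regression, not the thesis's own estimator; the function name `pava` and the optional weights argument are assumptions made for illustration. Applied to outcomes sorted by forecast value, the fitted values give the recalibrated curve of a reliability diagram.

```python
def pava(y, w=None):
    """Nondecreasing least-squares fit to the sequence y (optional weights w)."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []  # each block holds [weighted mean, total weight, length]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool adjacent blocks for as long as they violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the pooled blocks back to one fitted value per observation
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


# Binary outcomes ordered by increasing forecast probability
outcomes = [0, 1, 0, 1, 1]
calibrated = pava(outcomes)
```

Each pooled block is replaced by its weighted mean, so the fit is piecewise constant and the number of bins is chosen by the data rather than fixed in advance.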

    A Decision Support System for Economic Viability and Environmental Impact Assessment of Vertical Farms

    Vertical farming (VF) is the practice of growing crops or animals using the vertical dimension via multi-tier racks or vertically inclined surfaces. In this thesis, I focus on the emerging industry of plant-specific VF. Vertical plant farming (VPF) is a promising and relatively novel practice that can be conducted in buildings with environmental control and artificial lighting. However, the nascent sector has experienced challenges in economic viability, standardisation, and environmental sustainability. Practitioners and academics call for a comprehensive financial analysis of VPF, but efforts are stifled by a lack of valid and available data. A review of economic estimation and horticultural software identifies a need for a decision support system (DSS) that facilitates risk-empowered business planning for vertical farmers. This thesis proposes an open-source DSS framework to evaluate business sustainability through financial risk and environmental impact assessments. Data from the literature, alongside lessons learned from industry practitioners, would be centralised in the proposed DSS using imprecise data techniques. These techniques have been applied in engineering but are seldom used in financial forecasting. This could benefit complex sectors which only have scarce data to predict business viability. To begin the execution of the DSS framework, VPF practitioners were interviewed using a mixed-methods approach. Learnings from over 19 shuttered and operational VPF projects provide insights into the barriers inhibiting scalability and identifying risks to form a risk taxonomy. Labour was the most commonly reported top challenge. Therefore, research was conducted to explore lean principles to improve productivity. A probabilistic model representing a spectrum of variables and their associated uncertainty was built according to the DSS framework to evaluate the financial risk for VF projects. 
This enabled flexible computation without precise production or financial data to improve economic estimation accuracy. The model assessed two VPF cases (one in the UK and another in Japan), demonstrating the first risk and uncertainty quantification of VPF business models in the literature. The results highlighted measures to improve economic viability, as well as the viability of the UK and Japan cases. An environmental impact assessment model was also developed, allowing VPF operators to evaluate their carbon footprint compared to traditional agriculture using life-cycle assessment. I explore strategies for net-zero carbon production through sensitivity analysis. Renewable energies, especially solar, geothermal, and tidal power, show promise for reducing the carbon emissions of indoor VPF. Results show that renewably-powered VPF can reduce carbon emissions compared to field-based agriculture when considering the land-use change. The drivers for DSS adoption have been researched, showing a pathway of compliance and design thinking to overcome the ‘problem of implementation’ and enable commercialisation. Further work is suggested to standardise VF equipment, collect benchmarking data, and characterise risks. This work will reduce risk and uncertainty and accelerate the sector’s emergence.

    The effects of dairy foods intakes on weight change and fracture risk during critical life stages in women

    Menopause and pregnancy are crucial events in women’s lives because women experience a series of physical and psychological changes at these stages. One of the most critical challenges is excessive weight gain during both of these stages, which could contribute to various adverse health events in their later lives. In addition to weight gain, another critical health concern that women face is fragility-related fractures. The rate of fragility fractures begins rising in women during their 40s and increases to the end of life. Fractures result in impaired mobility and hospitalization, which can decrease the life quality of women significantly. Identification of modifiable dietary risk factors for excessive weight gain and fracture risk is crucial. The objectives of this dissertation are to estimate the independent effects of total dairy and individual dairy foods (e.g., yogurt, milk, and cheese), alone and in combination with overall diet patterns, physical activity, and other lifestyle factors, on three outcomes among women: weight change during the menopausal transition, weight retention after pregnancy, and risk of fragility-related fractures throughout mid-life and older adult years. Data from two prospective studies of nurses were used: Nurses’ Health Study I (NHS I) and Nurses’ Health Study II (NHS II). NHS II was used for both weight change analyses, while NHS I was used for the fracture analyses. The first specific aim for the analysis of weight change during the menopausal transition was to investigate the effects of total dairy, yogurt, milk, and cheese intakes on menopausal weight change (N = 35,177) and risk of obesity (N = 38,892) among women in NHS II. Weights were self-reported in biennial questionnaires. Diet was assessed with food frequency questionnaires (FFQ) every 4 years. Generalized estimating equations were used to assess the adjusted mean weight change using repeated measures of weight change. 
Cox proportional hazards models were used to estimate risk of obesity, controlling for confounding. The second specific aim related to the postpartum weight change analyses and was to investigate the effects of total dairy, yogurt, milk, and cheese intakes on postpartum weight retention (N = 18,366) and risk of postpartum obesity (N = 17,126) among women in the NHS II. Generalized linear models were used to assess postpartum weight change as continuous outcomes and multivariable models with a Poisson distribution were used to estimate risk of postpartum obesity. The third specific aim was for the fragility fracture analyses and included investigating the effects of total dairy, yogurt, milk, and cheese on fragility fractures of the hip, wrist, and vertebrae in women ages 40 years and older in NHS I. In total, 99,072 women were included. Fractures at the wrist and hip were self-reported. For vertebral fractures, we relied on medical record confirmed cases. Proportional hazards models were used to estimate risk of first fracture (including wrist, hip, or vertebral fractures). Results associated with the first aim suggested that more than 2 servings per week (s/w) of yogurt led to consistently less weight gain than that observed in women consuming less than 1 serving per month (s/m) throughout the menopausal transition. Further, this same yogurt intake was associated with a 31% reduced obesity risk (95% CI: 0.64 - 0.74) after adjusting for potential confounders and baseline body mass index (BMI). Higher total dairy intake was also associated with less obesity risk, but the effect was somewhat weaker than that for yogurt. There was a U-shaped relation between milk consumption and obesity risk during perimenopause. Moderate (0.5 s/d -< 1 s/d vs. < 0.5 s/d) milk consumption reduced obesity risk by 17% (95% CI: 0.78 - 0.89), while higher milk (≥1 s/d vs. < 0.5 s/d) consumption led to a marginally statistically significant 6% higher obesity risk. 
Cheese intake was not associated with obesity risk in perimenopausal women. In the postpartum weight retention analyses, women who consumed moderate amounts of yogurt (1 s/m -< 2 s/w) and higher amounts of yogurt (≥ 2 s/w) had 0.38 lb and 0.63 lb less postpartum weight retention, respectively, than those who rarely consumed yogurt (< 1 s/m). Moderate and higher cheese intakes were associated with 0.30 lb and 0.64 lb less postpartum weight retention, respectively, than lower cheese intake (< 2 s/w). In the obesity analysis, moderate (1 s/m -< 2 s/w) and higher yogurt (≥ 2 s/w) intakes were associated with 20% (95% CI: 0.69 - 0.93) and 16% (95% CI: 0.69 - 1.02) reduced risks of postpartum obesity, but the association was weakened by adjusting for pre-pregnancy BMI. Women with higher levels of activity and higher yogurt intakes had a 39% (95% CI: 0.50 - 0.74) lower risk of obesity. Higher Alternative Healthy Eating Index 2010 (AHEI) scores alone were associated with a statistically significantly lower obesity risk. Results from our fracture analyses found that women who consumed more than 2 s/d of total dairy had a 19% (95% CI: 0.67 - 0.98) lower fracture risk than those who consumed less than 1 s/w. In terms of individual dairy products, 2 s/d of milk were associated with a 14% (95% CI: 0.77 - 0.95) reduction in fracture risk compared with lower milk consumption (<1 s/w). Higher cheese (≥ 1 s/d vs. < 1 s/w) intake was associated with a non-statistically significant 9% (95% CI: 0.81 - 1.02) reduction in fracture risk. No association was found between yogurt consumption and fracture. In stratified analysis, the intakes of calcium, vitamin D, and protein from non-dietary sources did not modify the inverse association between total dairy or milk intake and fracture risk. 
In summary, the findings of this dissertation suggested that greater yogurt consumption was inversely associated with weight change during the menopausal transition and after pregnancy, while intakes of total dairy and milk had beneficial effects on the risk of fragility fractures among women ages 40 years and older.

    Multiscale structural optimisation with concurrent coupling between scales

    A robust three-dimensional multiscale topology optimisation framework with concurrent coupling between scales is presented. Concurrent coupling ensures that only the microscale data required to evaluate the macroscale model during each iteration of optimisation is collected and results in considerable computational savings. This represents the principal novelty of the framework and permits a previously intractable number of design variables to be used in the parametrisation of the microscale geometry, which in turn enables access to a greater range of mechanical point properties during optimisation. Additionally, the microscale data collected during optimisation is stored in a re-usable database, further reducing the computational expense of subsequent iterations or entirely new optimisation problems. Application of this methodology enables structures with precise functionally-graded mechanical properties over two scales to be derived, which satisfy one or multiple functional objectives. For all applications of the framework presented within this thesis, only a small fraction of the microstructure database is required to derive the optimised multiscale solutions, which demonstrates a significant reduction in the computational expense of optimisation in comparison to contemporary sequential frameworks. The derivation and integration of novel additive manufacturing constraints for open-walled microstructures within the concurrently coupled multiscale topology optimisation framework are also presented. Problematic fabrication features are discouraged through the application of an augmented projection filter and two relaxed binary integral constraints, which prohibit the formation of unsupported members, isolated assemblies of overhanging members and slender members during optimisation. 
Through the application of these constraints, it is possible to derive self-supporting, hierarchical structures with varying topology, suitable for fabrication through additive manufacturing processes. Open Access