Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why?
The use of pretrained embeddings has become widespread in modern e-commerce
machine learning (ML) systems. In practice, however, we have encountered
several key issues when using pretrained embeddings in a real-world production
system, many of which cannot be fully explained by current knowledge.
Unfortunately, we find that there is a lack of a thorough understanding of how
pre-trained embeddings work, especially their intrinsic properties and
interactions with downstream tasks. Consequently, it becomes challenging to
make interactive and scalable decisions regarding the use of pre-trained
embeddings in practice.
Our investigation leads to two significant discoveries about using pretrained
embeddings in e-commerce applications. Firstly, we find that the design of the
pretraining and downstream models, particularly how they encode and decode
information via embedding vectors, can have a profound impact. Secondly, we
establish a principled perspective of pre-trained embeddings via the lens of
kernel analysis, which can be used to evaluate their predictability,
interactively and scalably. These findings help to address the practical
challenges we faced and offer valuable guidance for successful adoption of
pretrained embeddings in real-world production. Our conclusions are backed by
solid theoretical reasoning, benchmark experiments, and online testing.
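To make the kernel view of embeddings concrete, here is a minimal sketch that builds the linear kernel (Gram matrix) of a set of embedding vectors and scores how well it lines up with downstream labels. The score used, kernel-target alignment, is a standard quantity chosen here for illustration; it is an assumption and not necessarily the diagnostic the authors use. The function names and toy data are hypothetical.

```python
import math


def gram_matrix(embeddings):
    """Linear kernel: K[i][j] is the dot product of embeddings i and j."""
    return [[sum(a * b for a, b in zip(u, v)) for v in embeddings]
            for u in embeddings]


def kernel_target_alignment(embeddings, labels):
    """Cosine similarity between the Gram matrix and the label matrix y y^T.

    Labels are assumed to be +1 / -1. Values near 1 suggest the embedding
    geometry already separates the classes; values near 0 suggest it does not.
    """
    kernel = gram_matrix(embeddings)
    n = len(labels)
    inner = sum(kernel[i][j] * labels[i] * labels[j]
                for i in range(n) for j in range(n))
    kernel_norm = math.sqrt(sum(k * k for row in kernel for k in row))
    target_norm = math.sqrt(sum((labels[i] * labels[j]) ** 2
                                for i in range(n) for j in range(n)))
    return inner / (kernel_norm * target_norm)


# Hypothetical one-dimensional embeddings for four items.
labels = [1, 1, -1, -1]
aligned = [[1.0], [1.0], [-1.0], [-1.0]]      # geometry matches the labels
misaligned = [[1.0], [-1.0], [1.0], [-1.0]]   # geometry ignores the labels
print(kernel_target_alignment(aligned, labels))
print(kernel_target_alignment(misaligned, labels))
```

A score like this needs only the embeddings and a labelled sample, with no downstream model training, which is one way a kernel-based check could stay cheap enough to run interactively.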
A framework for experimental-data-driven assessment of Magnetized Liner Inertial Fusion stagnation image metrics
A variety of spherical crystal x-ray imager (SCXI) diagnostics have been
developed and fielded on Magnetized Liner Inertial Fusion (MagLIF) experiments
at the Sandia National Laboratories Z-facility. These different imaging
modalities provide detailed insight into different physical phenomena such as
mix of liner material into the hot fuel, cold liner emission, or reduced impact
of liner opacity. However, several practical considerations ranging from the
lack of a consistent spatial fiducial for registration to different
point-spread-functions and tuning crystals or using filters to highlight
specific spectral regions make it difficult to develop broadly applicable
metrics to compare experiments across our stagnation image database without
making significant unverified assumptions. We leverage experimental data for a
model-free assessment of sensitivities to instrumentation-based features for
any specified image metric. In particular, we utilize a database of historical
and recent MagLIF data including image plate scans
gathered across different experiments to assess the
impact of a variety of features in the experimental observations arising from
uncertainties in registration as well as discrepancies in signal-to-noise ratio
and instrument resolution. We choose a wavelet-based image metric known as the
Mallat Scattering Transform for the study and highlight how alternate metric
choices could also be studied. In particular, we demonstrate a capability to
understand and mitigate the impact of signal-to-noise, image registration, and
resolution difference between images. This is achieved by utilizing multiple
scans of the same image plate, sampling random translations and rotations, and
applying instrument specific point-spread-functions found by ray tracing to
high-resolution datasets, augmenting our data in an effectively model-free
fashion.
Comment: 17 pages, 14 figures
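The registration part of the augmentation described above (sampling random translations and rotations of an image) can be sketched as follows. This is a rough illustration under stated assumptions: images are plain nested lists, resampling is nearest-neighbour, and the shift and angle ranges are hypothetical defaults. It does not model the multiple-scan or point-spread-function steps, and it is not the authors' code.

```python
import math
import random


def rigid_transform(image, angle_deg, shift_rows, shift_cols, fill=0.0):
    """Rotate about the image centre, then translate (nearest-neighbour)."""
    n_rows, n_cols = len(image), len(image[0])
    centre_r, centre_c = (n_rows - 1) / 2.0, (n_cols - 1) / 2.0
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[fill] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            # Inverse mapping: undo the shift, then undo the rotation.
            dr = r - shift_rows - centre_r
            dc = c - shift_cols - centre_c
            src_r = round(centre_r + cos_a * dr + sin_a * dc)
            src_c = round(centre_c - sin_a * dr + cos_a * dc)
            if 0 <= src_r < n_rows and 0 <= src_c < n_cols:
                out[r][c] = image[src_r][src_c]
    return out


def sample_augmentations(image, n_samples, max_shift=2, max_angle_deg=5.0, seed=0):
    """Draw random rigid transforms to mimic registration uncertainty."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        angle = rng.uniform(-max_angle_deg, max_angle_deg)
        shift_r = rng.randint(-max_shift, max_shift)
        shift_c = rng.randint(-max_shift, max_shift)
        samples.append(rigid_transform(image, angle, shift_r, shift_c))
    return samples


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
augmented = sample_augmentations(image, n_samples=4)
```

Evaluating an image metric on every member of `augmented` and looking at the spread gives an empirical sensitivity to registration error, which is the spirit of the model-free assessment in the abstract.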
Accelerated Sparse Recovery via Gradient Descent with Nonlinear Conjugate Gradient Momentum
This paper applies an idea of adaptive momentum for the nonlinear conjugate
gradient to accelerate optimization problems in sparse recovery. Specifically,
we consider two types of minimization problems: a (single) differentiable
function and the sum of a non-smooth function and a differentiable function. In
the first case, we adopt a fixed step size to avoid the traditional line search
and establish the convergence analysis of the proposed algorithm for a
quadratic problem. This acceleration is further incorporated with an operator
splitting technique to deal with the non-smooth function in the second case. We
use the convex and the nonconvex functionals as two
case studies to demonstrate the efficiency of the proposed approaches over
traditional methods.
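For orientation, the second problem class (a differentiable term plus a non-smooth term) is traditionally handled by proximal gradient descent, an operator-splitting scheme. The sketch below shows that plain baseline only; it does not include the conjugate-gradient momentum the paper proposes. It assumes, for illustration, a least-squares data term and an l1 penalty, and all names are hypothetical.

```python
import math


def soft_threshold(vector, threshold):
    """Proximal operator of threshold * ||x||_1, applied element-wise."""
    return [math.copysign(max(abs(x) - threshold, 0.0), x) for x in vector]


def proximal_gradient(matrix, target, penalty, step, n_iters):
    """Minimise 0.5 * ||A x - b||^2 + penalty * ||x||_1 with a fixed step.

    The step should not exceed 1 / L, where L is the largest eigenvalue
    of A^T A; no line search is performed.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    x = [0.0] * n_cols
    for _ in range(n_iters):
        # Gradient of the smooth part: A^T (A x - b).
        residual = [sum(matrix[i][j] * x[j] for j in range(n_cols)) - target[i]
                    for i in range(n_rows)]
        gradient = [sum(matrix[i][j] * residual[i] for i in range(n_rows))
                    for j in range(n_cols)]
        # Gradient step on the smooth part, then prox step on the l1 part.
        x = soft_threshold([x[j] - step * gradient[j] for j in range(n_cols)],
                           step * penalty)
    return x


identity = [[1.0, 0.0], [0.0, 1.0]]
solution = proximal_gradient(identity, [3.0, -0.5], penalty=1.0, step=1.0, n_iters=5)
```

With an identity matrix the minimiser is simply the soft-thresholded target, so the small entry is set to zero; that sparsifying effect is what makes the l1 term useful in sparse recovery.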
Boosting the Cycle Counting Power of Graph Neural Networks with I-GNNs
Message Passing Neural Networks (MPNNs) are a widely used class of Graph
Neural Networks (GNNs). The limited representational power of MPNNs inspires
the study of provably powerful GNN architectures. However, knowing one model is
more powerful than another gives little insight about what functions they can
or cannot express. It is still unclear whether these models are able to
approximate specific functions such as counting certain graph substructures,
which is essential for applications in biology, chemistry and social network
analysis. Motivated by this, we propose to study the counting power of Subgraph
MPNNs, a recent and popular class of powerful GNN models that extract rooted
subgraphs for each node, assign the root node a unique identifier and encode
the root node's representation within its rooted subgraph. Specifically, we
prove that Subgraph MPNNs fail to count more-than-4-cycles at node level,
implying that node representations cannot correctly encode the surrounding
substructures like ring systems with more than four atoms. To overcome this
limitation, we propose I-GNNs to extend Subgraph MPNNs by assigning
different identifiers for the root node and its neighbors in each subgraph.
I-GNNs' discriminative power is shown to be strictly stronger than Subgraph
MPNNs and partially stronger than the 3-WL test. More importantly, I-GNNs
are proven capable of counting all 3, 4, 5 and 6-cycles, covering common
substructures like benzene rings in organic chemistry, while still keeping
linear complexity. To the best of our knowledge, it is the first linear-time
GNN model that can count 6-cycles with theoretical guarantees. We validate its
counting power in cycle counting tasks and demonstrate its competitive
performance in molecular prediction benchmarks.
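To clarify what "counting cycles at node level" means as a target, the sketch below computes, by brute force, how many simple cycles of a given length pass through a chosen node. It is a ground-truth labeller for small graphs, not a GNN and not the authors' architecture; the function name and the adjacency-dictionary format are assumptions.

```python
def count_cycles_through(adjacency, root, length):
    """Number of simple cycles with `length` edges that contain `root`.

    `adjacency` maps each node to the set of its neighbours in an
    undirected graph. Runtime grows exponentially with `length`.
    """
    if length < 3:
        return 0

    def extend(node, edges_used, visited):
        # A path with length - 1 edges closes into a cycle if its end
        # point is adjacent to the root.
        if edges_used == length - 1:
            return 1 if root in adjacency[node] else 0
        total = 0
        for neighbour in adjacency[node]:
            if neighbour not in visited:
                total += extend(neighbour, edges_used + 1, visited | {neighbour})
        return total

    # Every cycle is traversed once in each direction.
    return extend(root, 0, {root}) // 2


# A six-membered ring such as the carbon skeleton of benzene.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
print(count_cycles_through(ring, 0, 6))  # 1
print(count_cycles_through(ring, 0, 3))  # 0
```

Labels of this kind are what a cycle-counting benchmark asks a model to regress for every node.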
Procedure-Aware Pretraining for Instructional Video Understanding
Our goal is to learn a video representation that is useful for downstream
procedure understanding tasks in instructional videos. Due to the small amount
of available annotations, a key challenge in procedure understanding is to be
able to extract from unlabeled videos the procedural knowledge such as the
identity of the task (e.g., 'make latte'), its steps (e.g., 'pour milk'), or
the potential next steps given partial progress in its execution. Our main
insight is that instructional videos depict sequences of steps that repeat
between instances of the same or different tasks, and that this structure can
be well represented by a Procedural Knowledge Graph (PKG), where nodes are
discrete steps and edges connect steps that occur sequentially in the
instructional activities. This graph can then be used to generate pseudo labels
to train a video representation that encodes the procedural knowledge in a more
accessible form to generalize to multiple procedure understanding tasks. We
build a PKG by combining information from a text-based procedural knowledge
database and an unlabeled instructional video corpus and then use it to
generate training pseudo labels with four novel pre-training objectives. We
call this PKG-based pre-training procedure and the resulting model Paprika,
Procedure-Aware PRe-training for Instructional Knowledge Acquisition. We
evaluate Paprika on COIN and CrossTask for procedure understanding tasks such
as task recognition, step recognition, and step forecasting. Paprika yields a
video representation that improves over the state of the art: up to 11.23%
gains in accuracy in 12 evaluation settings. Implementation is available at
https://github.com/salesforce/paprika.
Comment: CVPR 202
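The graph structure described above (nodes are steps, edges link steps that occur one after another) can be illustrated with a toy sketch. It builds the graph from hypothetical, already-discretised step sequences and reads off candidate next steps as pseudo labels. The actual Paprika pipeline, which combines a text knowledge base with unlabeled video, is not reproduced here; the step names and function names are invented for illustration.

```python
from collections import defaultdict


def build_step_graph(step_sequences):
    """Count how often one step is directly followed by another."""
    edges = defaultdict(lambda: defaultdict(int))
    for sequence in step_sequences:
        for current, following in zip(sequence, sequence[1:]):
            edges[current][following] += 1
    return {step: dict(successors) for step, successors in edges.items()}


def next_step_pseudo_labels(graph, step):
    """Successors of `step`, most frequent first (ties broken by name)."""
    successors = graph.get(step, {})
    return sorted(successors, key=lambda name: (-successors[name], name))


# Hypothetical step sequences from three instructional videos.
sequences = [
    ["steam milk", "pour milk", "serve"],
    ["brew espresso", "pour milk", "serve"],
    ["brew espresso", "add sugar"],
]
graph = build_step_graph(sequences)
print(next_step_pseudo_labels(graph, "brew espresso"))
```

Because "pour milk" appears in two different sequences, it becomes a shared node, which is how the graph captures steps that repeat across instances or tasks.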
Random Young towers and quenched decay of correlations for predominantly expanding multimodal circle maps
In this paper, we study the random dynamical system $f_\omega^n$ generated by
a family of maps $f_{\omega_0}(x) = \alpha \xi (x+\omega_0) + a\ (\mathrm{mod}\ 1)$,
where $\xi: \mathbb S^1 \to \mathbb R$, $a\in \mathbb S^1$, and $\alpha,\varepsilon>0$.
For a constant $c\in (0,1)$, $\alpha$ sufficiently large, and
$\varepsilon > \alpha^{-1+c}$, the system $f_\omega^n$
presents a random Young tower structure and quenched decay of correlations.
Comment: 38 pages, 0 figures
Model Diagnostics meets Forecast Evaluation: Goodness-of-Fit, Calibration, and Related Topics
Principled forecast evaluation and model diagnostics are vital in fitting probabilistic models and forecasting outcomes of interest. A common principle is that fitted or predicted distributions ought to be calibrated, ideally in the sense that the outcome is indistinguishable from a random draw from the posited distribution. Much of this thesis is centered on calibration properties of various types of forecasts.
In the first part of the thesis, a simple algorithm for exact multinomial goodness-of-fit tests is proposed. The algorithm computes exact p-values based on various test statistics, such as the log-likelihood ratio and Pearson's chi-square. A thorough analysis shows improvement on extant methods. However, the runtime of the algorithm grows exponentially in the number of categories and hence its use is limited.
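As a point of reference, an exact multinomial p-value can be obtained by brute force: enumerate every possible count vector, and add up the null probabilities of those whose test statistic is at least as extreme as the observed one. The sketch below does exactly that with Pearson's chi-square. It is a naive baseline written for illustration, not the algorithm proposed in the thesis, and the function names are hypothetical.

```python
from math import factorial, prod


def count_vectors(total, n_categories):
    """Yield every tuple of non-negative counts summing to `total`."""
    if n_categories == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in count_vectors(total - first, n_categories - 1):
            yield (first,) + rest


def multinomial_pmf(counts, probs):
    """Probability of `counts` under a multinomial with cell probabilities `probs`."""
    coefficient = factorial(sum(counts))
    for c in counts:
        coefficient //= factorial(c)
    return coefficient * prod(p ** c for p, c in zip(probs, counts))


def pearson_statistic(counts, probs):
    """Pearson's chi-square statistic; assumes every probability is positive."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def exact_p_value(observed, probs, statistic=pearson_statistic, tol=1e-12):
    """Sum the null probabilities of outcomes at least as extreme as `observed`."""
    threshold = statistic(observed, probs) - tol
    return sum(multinomial_pmf(x, probs)
               for x in count_vectors(sum(observed), len(probs))
               if statistic(x, probs) >= threshold)


print(exact_p_value((2, 0), (0.5, 0.5)))  # 0.5
```

The number of count vectors grows combinatorially with the number of categories, which illustrates the runtime limitation the abstract mentions for exact tests.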
In the second part, a framework rooted in probability theory is developed, which gives rise to hierarchies of calibration, and applies to both predictive distributions and stand-alone point forecasts. Based on a general notion of conditional T-calibration, the thesis introduces population versions of T-reliability diagrams and revisits a score decomposition into measures of miscalibration, discrimination, and uncertainty. Stable and efficient estimators of T-reliability diagrams and score components arise via nonparametric isotonic regression and the pool-adjacent-violators algorithm. For in-sample model diagnostics, a universal coefficient of determination is introduced that nests and reinterprets the classical $R^2$ in least squares regression.
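The pool-adjacent-violators (PAV) algorithm named above is short enough to sketch. Given outcomes ordered by their forecast value, it returns the non-decreasing sequence closest to them in least squares, which serves as the recalibrated forecast in a reliability diagram. This is a generic textbook version for the mean functional, written under the assumption of positive weights; it is not code from the thesis.

```python
def pool_adjacent_violators(values, weights=None):
    """Least-squares non-decreasing fit to `values` (in the given order)."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block holds [weighted mean, total weight, item count]
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, weight_b, count_b = blocks.pop()
            mean_a, weight_a, count_a = blocks.pop()
            total = weight_a + weight_b
            pooled = (mean_a * weight_a + mean_b * weight_b) / total
            blocks.append([pooled, total, count_a + count_b])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Binary outcomes sorted by increasing forecast probability.
outcomes = [0, 1, 0, 1, 1]
print(pool_adjacent_violators(outcomes))  # [0.0, 0.5, 0.5, 1.0, 1.0]
```

Plotting the fitted values against the original forecasts gives the reliability curve; deviations from the diagonal indicate miscalibration.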
In the third part, probabilistic top lists are proposed as a novel type of prediction in classification, which bridges the gap between single-class predictions and predictive distributions. The probabilistic top list functional is elicited by strictly consistent evaluation metrics, based on symmetric proper scoring rules, which admit comparison of various types of predictions.
A Decision Support System for Economic Viability and Environmental Impact Assessment of Vertical Farms
Vertical farming (VF) is the practice of growing crops or animals using the vertical dimension via multi-tier racks or vertically inclined surfaces. In this thesis, I focus on the emerging industry of plant-specific VF. Vertical plant farming (VPF) is a promising and relatively novel practice that can be conducted in buildings with environmental control and artificial lighting. However, the nascent sector has experienced challenges in economic viability, standardisation, and environmental sustainability. Practitioners and academics call for a comprehensive financial analysis of VPF, but efforts are stifled by a lack of valid and available data.
A review of economic estimation and horticultural software identifies a need for a decision support system (DSS) that facilitates risk-empowered business planning for vertical farmers. This thesis proposes an open-source DSS framework to evaluate business sustainability through financial risk and environmental impact assessments. Data from the literature, alongside lessons learned from industry practitioners, would be centralised in the proposed DSS using imprecise data techniques. These techniques have been applied in engineering but are seldom used in financial forecasting. This could benefit complex sectors which only have scarce data to predict business viability.
To begin the execution of the DSS framework, VPF practitioners were interviewed using a mixed-methods approach. Learnings from over 19 shuttered and operational VPF projects provide insights into the barriers inhibiting scalability and identifying risks to form a risk taxonomy. Labour was the most commonly reported top challenge. Therefore, research was conducted to explore lean principles to improve productivity.
A probabilistic model representing a spectrum of variables and their associated uncertainty was built according to the DSS framework to evaluate the financial risk for VF projects. This enabled flexible computation without precise production or financial data to improve economic estimation accuracy. The model assessed two VPF cases (one in the UK and another in Japan), demonstrating the first risk and uncertainty quantification of VPF business models in the literature. The results highlighted measures to improve economic viability, as well as the viability of the UK and Japan cases.
The environmental impact assessment model was developed, allowing VPF operators to evaluate their carbon footprint compared to traditional agriculture using life-cycle assessment. I explore strategies for net-zero carbon production through sensitivity analysis. Renewable energies, especially solar, geothermal, and tidal power, show promise for reducing the carbon emissions of indoor VPF. Results show that renewably-powered VPF can reduce carbon emissions compared to field-based agriculture when considering the land-use change.
The drivers for DSS adoption have been researched, showing a pathway of compliance and design thinking to overcome the ‘problem of implementation’ and enable commercialisation. Further work is suggested to standardise VF equipment, collect benchmarking data, and characterise risks. This work will reduce risk and uncertainty and accelerate the sector’s emergence.
The effects of dairy foods intakes on weight change and fracture risk during critical life stages in women
Menopause and pregnancy are crucial events in women’s lives because women experience a series of physical and psychological changes at these stages. One of the most critical challenges is excessive weight gain during both of these stages, which could contribute to various adverse health events later in life. In addition to weight gain, another critical health concern that women face is fragility-related fractures. The rate of fragility fractures begins rising in women during their 40s and increases to the end of life. Fractures result in impaired mobility and hospitalization, which can significantly decrease women’s quality of life. Identification of modifiable dietary risk factors for excessive weight gain and fracture risk is crucial. The objectives of this dissertation are to estimate the independent effects of total dairy and individual dairy foods (e.g., yogurt, milk, and cheese), alone and in combination with overall diet patterns, physical activity, and other lifestyle factors, on three outcomes among women: weight change during the menopausal transition, weight retention after pregnancy, and risk of fragility-related fractures throughout mid-life and older adult years.
Data from two prospective studies of nurses were used: Nurses’ Health Study I (NHS I) and Nurses’ Health Study II (NHS II). NHS II was used for both weight change analyses, while NHS I was used for the fracture analyses. The first specific aim, for the analysis of weight change during the menopausal transition, was to investigate the effects of total dairy, yogurt, milk, and cheese intakes on menopausal weight change (N = 35,177) and risk of obesity (N = 38,892) among women in NHS II. Weights were self-reported in biennial questionnaires. Diet was assessed with food frequency questionnaires (FFQ) every 4 years. Generalized estimating equations were used to assess the adjusted mean weight change using repeated measures of weight change. Cox proportional hazards models were used to estimate risk of obesity, controlling for confounding.
The second specific aim, for the postpartum weight change analyses, was to investigate the effects of total dairy, yogurt, milk, and cheese intakes on postpartum weight retention (N = 18,366) and risk of postpartum obesity (N = 17,126) among women in NHS II. Generalized linear models were used to assess postpartum weight change as a continuous outcome, and multivariable models with a Poisson distribution were used to estimate risk of postpartum obesity.
The third specific aim, for the fragility fracture analyses, was to investigate the effects of total dairy, yogurt, milk, and cheese on fragility fractures of the hip, wrist, and vertebrae in women ages 40 years and older in NHS I. In total, 99,072 women were included. Fractures at the wrist and hip were self-reported. For vertebral fractures, we relied on medical record confirmed cases. Proportional hazards models were used to estimate risk of first fracture (including wrist, hip, or vertebral fractures).
Results associated with the first aim suggested that more than 2 servings per week (s/w) of yogurt led to consistently less weight gain than that observed in women consuming less than 1 serving per month (s/m) throughout the menopausal transition. Further, this same yogurt intake was associated with a 31% reduced obesity risk (95% CI: 0.64 - 0.74) after adjusting for potential confounders and baseline body mass index (BMI). Higher total dairy intake was also associated with less obesity risk, but the effect was somewhat weaker than that for yogurt. There was a U-shaped relation between milk consumption and obesity risk during perimenopause. Moderate milk consumption (0.5 to < 1 serving per day [s/d] vs. < 0.5 s/d) reduced obesity risk by 17% (95% CI: 0.78 - 0.89), while higher milk consumption (≥ 1 s/d vs. < 0.5 s/d) led to a marginally statistically significant 6% higher obesity risk. Cheese intake was not associated with obesity risk in perimenopausal women.
In the postpartum weight retention analyses, women who consumed moderate amounts of yogurt (1 s/m to < 2 s/w) and higher amounts of yogurt (≥ 2 s/w) retained 0.38 lb and 0.63 lb less postpartum weight, respectively, than those who rarely consumed yogurt (< 1 s/m). Moderate and higher cheese intakes were associated with 0.30 lb and 0.64 lb less postpartum weight retention, respectively, than lower cheese intake (< 2 s/w). In the obesity analysis, moderate (1 s/m to < 2 s/w) and higher (≥ 2 s/w) yogurt intakes were associated with 20% (95% CI: 0.69 - 0.93) and 16% (95% CI: 0.69 - 1.02) reduced risks of postpartum obesity, but the association was weakened by adjusting for pre-pregnancy BMI. Women with higher levels of activity and higher yogurt intakes had a 39% (95% CI: 0.50 - 0.74) lower risk of obesity. Higher Alternative Healthy Eating Index 2010 (AHEI) scores alone were associated with a statistically significantly lower obesity risk.
Results from our fracture analyses found that women who consumed more than 2 s/d of total dairy had a 19% (95% CI: 0.67 - 0.98) lower fracture risk than those who consumed less than 1 s/w. In terms of individual dairy products, 2 s/d of milk were associated with a 14% (95% CI: 0.77 - 0.95) reduction in fracture risk compared with lower milk consumption (<1 s/w). Higher cheese (≥ 1 s/d vs. < 1 s/w) intake was associated with a non-statistically significant 9% (95% CI: 0.81 - 1.02) reduction in fracture risk. No association was found between yogurt consumption and fracture. In stratified analysis, the intakes of calcium, vitamin D, and protein from non-dietary sources did not modify the inverse association between total dairy or milk intake and fracture risk.
In summary, the findings of this dissertation suggest that greater yogurt consumption was inversely associated with weight change during the menopausal transition and after pregnancy, while intakes of total dairy and milk had beneficial effects on the risk of fragility fractures among women ages 40 years and older.
Multiscale structural optimisation with concurrent coupling between scales
A robust three-dimensional multiscale topology optimisation framework with concurrent coupling between scales is presented. Concurrent coupling ensures that only the microscale data required to evaluate the macroscale model during each iteration of optimisation is collected and results in considerable computational savings. This represents the principal novelty of the framework and permits a previously intractable number of design variables to be used in the parametrisation of the microscale geometry, which in turn enables accessibility to a greater range of mechanical point properties during optimisation. Additionally, the microscale data collected during optimisation is stored in a re-usable database, further reducing the computational expense of subsequent iterations or entirely new optimisation problems. Application of this methodology enables structures with precise functionally-graded mechanical properties over two-scales to be derived, which satisfy one or multiple functional objectives. For all applications of the framework presented within this thesis, only a small fraction of the microstructure database is required to derive the optimised multiscale solutions, which demonstrates a significant reduction in the computational expense of optimisation in comparison to contemporary sequential frameworks.
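The re-usable microscale database can be pictured as a cache keyed by the microscale design variables: the expensive microscale analysis runs only for parameter sets the macroscale optimiser actually requests, and stored results are reused afterwards. The sketch below is a loose illustration of that idea only. The class, the rounding-based key, and the placeholder `toy_homogenisation` function are all hypothetical and stand in for a real microscale solver.

```python
class MicrostructureDatabase:
    """Lazily filled store of microscale results, keyed by design variables."""

    def __init__(self, solve_microscale, digits=3):
        self._solve = solve_microscale  # expensive microscale analysis
        self._digits = digits           # rounding used to build lookup keys
        self._store = {}
        self.solver_calls = 0

    def properties(self, design_variables):
        """Return stored properties, running the solver only on a miss."""
        key = tuple(round(v, self._digits) for v in design_variables)
        if key not in self._store:
            self._store[key] = self._solve(key)
            self.solver_calls += 1
        return self._store[key]


def toy_homogenisation(parameters):
    """Placeholder for a microscale solve; returns a made-up scalar property."""
    return sum(p ** 3 for p in parameters)


database = MicrostructureDatabase(toy_homogenisation)
database.properties((0.5, 0.25))   # first request runs the solver
database.properties((0.5, 0.25))   # second request is served from the store
print(database.solver_calls)       # 1
```

Because the store outlives a single optimisation run, a later problem that visits the same region of the design space starts with those entries already populated.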
The derivation and integration of novel additive manufacturing constraints for open-walled microstructures within the concurrently coupled multiscale topology optimisation framework is also presented. Problematic fabrication features are discouraged through the application of an augmented projection filter and two relaxed binary integral constraints, which prohibit the formation of unsupported members, isolated assemblies of overhanging members and slender members during optimisation. Through the application of these constraints, it is possible to derive self-supporting, hierarchical structures with varying topology, suitable for fabrication through additive manufacturing processes.
Open Access