Crack-Net: Prediction of Crack Propagation in Composites
Computational solid mechanics has become an indispensable approach in
engineering, and numerical investigation of fracture in composites is essential
as composites are widely used in structural applications. Crack evolution in
composites is the bridge to elucidate the relationship between the
microstructure and fracture performance, but crack-based finite element methods
are computationally expensive and time-consuming, limiting their application in
computation-intensive scenarios. Here we propose a deep learning framework
called Crack-Net, which incorporates the relationship between crack evolution
and stress response to predict the fracture process in composites. Trained on a
high-precision fracture development dataset generated using the phase field
method, Crack-Net demonstrates a remarkable capability to accurately forecast
the long-term evolution of crack growth patterns and the stress-strain curve
for a given composite design. Crack-Net captures the essential principle of
crack growth, which enables it to handle more complex microstructures such as
binary co-continuous structures. Moreover, transfer learning is adopted to
further improve the generalization ability of Crack-Net for composite materials
with reinforcements of different strengths. The proposed Crack-Net holds great
promise for practical applications in engineering and materials science, in
which accurate and efficient fracture prediction is crucial for optimizing
material performance and microstructural design.
One Neuron Saved Is One Neuron Earned: On Parametric Efficiency of Quadratic Networks
Inspired by neuronal diversity in the biological neural system, a plethora of
studies proposed to design novel types of artificial neurons and introduce
neuronal diversity into artificial neural networks. The recently proposed quadratic
neuron, which replaces the inner-product operation in conventional neurons with
a quadratic one, has achieved great success in many essential tasks. Despite
the promising results of quadratic neurons, there is still an unresolved issue:
Is the superior performance of quadratic networks simply due to the
increased parameters or due to the intrinsic expressive capability? Without
clarifying this issue, the performance of quadratic networks remains open to
doubt. Additionally, resolving this issue reduces to finding killer
applications of quadratic networks. In this paper, with theoretical and
empirical studies, we show that quadratic networks enjoy parametric efficiency,
thereby confirming that the superior performance of quadratic networks is due
to the intrinsic expressive capability. This intrinsic expressive ability arises
because quadratic neurons can easily represent nonlinear interactions, which
is hard for conventional neurons. Theoretically, we derive the approximation
efficiency of the quadratic network over conventional ones in terms of real
space and manifolds. Moreover, from the perspective of the Barron space, we
demonstrate that there exists a functional space whose functions can be
approximated by quadratic networks in a dimension-free error, but the
approximation error of conventional networks is dependent on dimensions.
Empirically, experimental results on synthetic data, classic benchmarks, and
real-world applications show that quadratic models broadly enjoy parametric
efficiency, and the gain of efficiency depends on the task. Comment: We have shared our code at
https://github.com/asdvfghg/quadratic_efficienc
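The contrast between a conventional and a quadratic neuron can be illustrated with a minimal sketch. The abstract does not state the exact quadratic form, so the parameterization below, (wr·x + br)(wg·x + bg) + wb·(x⊙x) + c, is an assumption based on one common formulation, and all function names are hypothetical. The sketch shows a single quadratic neuron representing the interaction term x1·x2 exactly, which a single inner-product neuron cannot do, and compares raw parameter counts per neuron.

```python
def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def conventional_neuron(x, w, b):
    """Inner product plus bias (activation omitted for clarity)."""
    return dot(w, x) + b


def quadratic_neuron(x, wr, br, wg, bg, wb, c):
    """ASSUMED form: (wr.x + br) * (wg.x + bg) + wb.(x * x) + c."""
    squared = [xi * xi for xi in x]
    return (dot(wr, x) + br) * (dot(wg, x) + bg) + dot(wb, squared) + c


def product_neuron(x):
    """One quadratic neuron configured to output x[0] * x[1] for 2-D input."""
    return quadratic_neuron(
        x,
        wr=[1.0, 0.0], br=0.0,   # first factor picks out x[0]
        wg=[0.0, 1.0], bg=0.0,   # second factor picks out x[1]
        wb=[0.0, 0.0], c=0.0,    # squared term and offset switched off
    )


def param_count(d):
    """Parameters per neuron for input dimension d under the assumed form."""
    return {"conventional": d + 1, "quadratic": 3 * d + 3}


print(product_neuron([3.0, -2.0]))  # -6.0
print(param_count(2))               # {'conventional': 3, 'quadratic': 9}
```

Under this assumed form a quadratic neuron carries roughly three times the parameters of a conventional one, which is why the question of parametric efficiency raised in the abstract matters.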
Independent Position and Attitude Control on Multirotor Aerial Platforms
Multirotor aerial platforms have attracted growing attention in industry and academia for their mechanical simplicity, agility, and ability to take off and land vertically (VTOL). A conventional multirotor has underactuated dynamics and cannot be fully controlled in 6 degrees of freedom (DoF). In fact, only its three-dimensional position and yaw angle, called the flat outputs, can be controlled independently. However, in certain applications, such as perching on a vertical wall or flying in a narrow space, the non-flat outputs, namely the roll and pitch angles, are specified independently of the position requirements at particular times. These tasks require independent control of position and attitude, at least partially and at certain instants, and are generally challenging for multirotor platforms. This dissertation addresses this issue in two ways. First, an algorithm is designed for conventional quadcopter platforms to generate trajectories for tasks with requirements on both position and attitude. It is formulated as an optimization problem and converted into a series of convex problems to solve. Constraints on dynamics, space limitations, inputs, and states are explicitly included. The algorithm is verified numerically on the task of a quadcopter perching at a specified location on a vertical wall. Second, a fully actuated multirotor aerial platform is proposed. Commercial quadcopters and passive hinges are used to generate tiltable thrust vectors during flight. A salient feature of this platform is its mechanical simplicity, as it does not require additional actuators to control the directions of the thrust vectors. A controller for the proposed multirotor platform is designed to enable independent control of position and attitude. The proposed multirotor platform is overactuated, which gives 2 DoF of redundancy in the inputs.
A new controller is proposed, under which the input allocation scheme searches within this redundancy for smaller thrust forces required to hover at different attitudes. The range of achievable attitudes is enlarged under this new scheme compared with the previously proposed controller, under the same thrust saturation limit for the platform actuators. These controllers are validated in both simulation and experiments, and are demonstrated by the proposed multirotor aerial platform hovering at non-horizontal attitudes or tracking independent position and attitude trajectories simultaneously.
Deep ReLU Networks Have Surprisingly Simple Polytopes
A ReLU network is a piecewise linear function over polytopes. Figuring out
the properties of such polytopes is of fundamental importance for the research
and development of neural networks. So far, both theoretical and empirical
studies of polytopes have stopped at counting their number, which is
far from a complete characterization of polytopes. To upgrade the
characterization to a new level, here we propose to study the shapes of
polytopes via the number of simplices obtained by triangulating the polytope.
Then, by computing and analyzing the histogram of simplices across polytopes,
we find that a ReLU network has relatively simple polytopes under both
initialization and gradient descent, although these polytopes theoretically can
be rather diverse and complicated. This finding can be appreciated as a novel
implicit bias. Next, we use a nontrivial combinatorial derivation to
theoretically explain why adding depth does not create a more complicated
polytope by bounding the average number of faces of polytopes with a function
of the dimensionality. Our results concretely reveal what kind of simple
functions a network learns and its space partition property. Also, by
characterizing the shape of polytopes, the number of simplices can be leveraged
for other problems, e.g., serving as a generic functional complexity
measure to explain the power of popular shortcut networks such as ResNet and
analyzing the impact of different regularization strategies on a network's
space partition.
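The idea of describing polytope shape by the number of simplices can be sketched on a toy case. The code below is an illustration under strong simplifying assumptions, not the paper's procedure: it handles only a single hidden layer of ReLU neurons on a 2-D input box, where each neuron contributes one line and the linear regions are convex polygons. A convex polygon with k vertices triangulates into k - 2 triangles, so the histogram is built from vertex counts. All names and the toy weights are hypothetical.

```python
from collections import Counter


def clip(poly, w, b, sign):
    """Keep the part of convex polygon `poly` where sign * (w.x + b) >= 0."""
    out = []
    n = len(poly)
    for i in range(n):
        p, q = poly[i], poly[(i + 1) % n]
        fp = sign * (w[0] * p[0] + w[1] * p[1] + b)
        fq = sign * (w[0] * q[0] + w[1] * q[1] + b)
        if fp >= 0:
            out.append(p)
        # Edge strictly crosses the line: add the intersection point.
        if (fp > 0 and fq < 0) or (fp < 0 and fq > 0):
            t = fp / (fp - fq)
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out


def area(poly):
    """Polygon area by the shoelace formula."""
    s = 0.0
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def linear_regions(neurons, box=1.0, eps=1e-12):
    """Split the square [-box, box]^2 along each neuron's line w.x + b = 0."""
    regions = [[(-box, -box), (box, -box), (box, box), (-box, box)]]
    for w, b in neurons:
        split = []
        for poly in regions:
            for sign in (1.0, -1.0):
                piece = clip(poly, w, b, sign)
                # Drop degenerate pieces (points, segments, zero area).
                if len(piece) >= 3 and area(piece) > eps:
                    split.append(piece)
        regions = split
    return regions


def simplex_histogram(regions):
    """Map simplex count (k - 2 for a convex k-gon) to number of regions."""
    return Counter(len(poly) - 2 for poly in regions)


# Hypothetical toy layer: three neurons given as ((w1, w2), b).
toy_neurons = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, -1.0), 0.0)]
hist = simplex_histogram(linear_regions(toy_neurons))
print(dict(hist))
```

For these toy weights the box splits into four triangles and two quadrilaterals, so every region needs at most two simplices. Deeper networks and higher input dimensions would require a general polytope triangulation rather than this vertex count.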