Solver#
SolverSAP is the SAP-native timestepper. It requires a
SapModel with at least one articulated joint DOF, a
populated SapContacts buffer, a
SapState input, a
SapState output, a
SapControl object, and a positive timestep.
Timestep Pipeline#
SAP Warp follows the formulation from An Unconstrained Convex Formulation of Compliant Contact and uses Drake’s implementation as a reference. At a high level, each timestep computes free motion, assembles contact, actuation, drive, and limit terms, solves the velocity-level SAP problem, and integrates the resulting state.
Presets#
contact_preset_variant should be read as a discretization choice, not as a
different solver. The same timestep is always organized around a free-motion
velocity, a dynamics metric, contact Jacobians, smooth contact potentials,
actuation terms, drive terms, and limit terms. A preset chooses how those
objects are represented.
For preset \(P\), the velocity solve can be written as
The next configuration is then
The three built-in presets are coherent choices for \((v_P^*,A_P,J_{i,P},w_{i,P},\Phi_P)\):
Preset |
Use |
Main defaults |
|---|---|---|
|
Fast mixed-precision default for large replicated worlds. |
fp32 free motion, fp64 contact solve, fp32 linear solve, fp32 body-inertia contact weights, witness points, midpoint integration. |
|
High-precision ablation of the approximate path. |
fp64 boundary pose, fp64 free motion, fp64 contact solve, fp64 linear solve, fp64 body-inertia contact weights, witness points, midpoint integration. |
|
Drake-style rigid-contact reference mapping. |
fp64 boundary pose, fp64 free motion, fp64 contact solve, fp64 linear
solve, diagonal Delassus contact weights, contact midpoints,
|
The approximate family is
with approx32 and approx64 differing primarily in floating-point
realization. The Drake-style family is
This distinction is useful in experiments. Compare approx32 to
approx64 to measure precision sensitivity,
and compare approx64 to drake to measure the effect of the contact
metric, contact point, and integration rule,
Explicit constructor keyword arguments override preset values after the preset
is expanded, which makes one-factor experiments possible. For example,
contact_preset_variant="approx64" with
contact_weight_mode="diag_delassus" changes only the contact-weight model
while leaving the approximate-family contact point and integration map intact.
Preset Modes#
Each preset is a named bundle of mode variants. The modes can be read directly against the objective above.
contact_weight_modeThe effective contact weight controls the metric scale used by the smooth contact potential. Through the chain rule, a single contact contributes
\[\nabla\ell_i(v)=J_i^T g_i(J_i v;w_i), \qquad \nabla^2\ell_i(v)=J_i^T G_i(J_i v;w_i)J_i.\]body_inertiaestimates \(w_i\) from the two contacted bodies. For contact-frame directions \(C_i=\{c_{t1},c_{t2},c_n\}\), witness \(x_b\), center of mass \(o_b\), body rotation \(R_b\), mass \(m_b\), and diagonal body inertia \(I_b\),\[\omega_b(c) = \frac{1}{m_b} + \left(R_b^T((x_b-o_b)\times c)\right)^T I_b^{-1} \left(R_b^T((x_b-o_b)\times c)\right), \qquad w_i^{\mathrm{body}} = \max\left( \frac{1}{3} \sum_{c\in C_i} \sum_b \omega_b(c), 10^{-12} \right).\]diag_delassusestimates a contact-space mobility from the assembled dynamics matrix and contact Jacobian. The current implementation uses the diagonal dynamics inverse:\[\widehat W_i = J_i\,\mathrm{diag}(A)^{-1}J_i^T, \qquad w_i^{\mathrm{delassus}} = \max\left(\frac{\|\widehat W_i\|_F}{3},10^{-12}\right).\]contact_point_modeThe collision pipeline supplies two witnesses \(x_0,x_1\), a normal, margins, and material data.
witness_pointuses those witnesses directly:\[J_i^{\mathrm{wit}}v = R_{WC}^T\left[V_1(x_1;v)-V_0(x_0;v)\right].\]contact_midpointfirst computes a stiffness-weighted point\[p_C=\alpha_0x_0+\alpha_1x_1, \qquad \alpha_0=\frac{k_0}{k_0+k_1}, \qquad \alpha_1=\frac{k_1}{k_0+k_1},\]and evaluates both body velocities at \(p_C\):
\[J_i^{\mathrm{mid}}v = R_{WC}^T\left[V_1(p_C;v)-V_0(p_C;v)\right].\]position_integrationThe position map runs after the velocity minimization.
midpointuses the incoming and solved SAP velocities,\[\bar v = \frac{v_n+v_{n+1}}{2}, \qquad q_{n+1}=\Phi_{\mathrm{mid}}(q_n,\bar v,h),\]while
sap_euleruses the solved velocity:\[q_{n+1}=\Phi_{\mathrm{sap}}(q_n,v_{n+1},h).\]- Precision Modes
free_motion_solve_precision,contact_solve_precision,contact_linear_solve_precision,sap_contact_weight_precision, anduse_f64_boundary_posecontrol where fp32 or fp64 arrays are used. A value computed through fp32 and then stored in fp64 should be understood as\[x^{(32\rightarrow64)} = \mathrm{cast}_{64}(\mathrm{fl}_{32}(x)).\]The following fp64 contact solve uses this projected value; it does not reconstruct the precision that was discarded earlier. This is why
approx64is the right comparison point when isolating the effect ofapprox32’s mixed-precision path.
Line Search#
line_search_variant is separate from contact_preset_variant. It controls
how the contact solve accepts a Newton step after the preset has chosen the
contact weights, contact point mode, precision path, and integration mode.
The common one-dimensional problem is
The line search evaluates the SAP objective along \(\varphi_k\), including the nonlinear contact, drive, and limit regularizers at the trial velocity.
monotone_decayConservative default. It tests \(\alpha=1,\frac{1}{2},\frac{1}{4},\ldots\) and accepts the first candidate satisfying
\[\varphi_k(\alpha) \le \varphi_k(0) + 10^{-14} + 10^{-12}|\varphi_k(0)|.\]It is the cheapest and most predictable path because it needs only trial costs and a fixed geometric decay.
armijo_decayArmijo-style backtracking controlled by
armijo_c,rho, andline_search_relative_slop. It first probes \(\alpha_{\max}=1/\rho\) and can accept that over-relaxed step if the line derivative is still negative or only slightly positive within the slop. If not, it tests \(1,\rho,\rho^2,\ldots\). The sufficient-decrease test is\[\varphi_k(\alpha) \le \varphi_k(0) + c\,\alpha\,\varphi_k'(0).\]exact_rootRoot-search variant. It brackets the stationary point on \(0\le\alpha\le1.5\) and solves
\[\varphi_k'(\alpha) = \frac{d}{d\alpha}\ell(v_k+\alpha d_k) = 0.\]It uses Newton updates when they stay inside the bracket and bisection when they do not. If the default maximum line-search count is still
40, the solver raises it to100for this variant.
Convergence controls include max_iterations, optimality_abs_tol,
optimality_rel_tol, cost_abs_tol, cost_rel_tol,
line_search_max_iterations, and line_search_relative_slop. The solver
also exposes last_line_search_iterations for profiling accepted steps.
Objective Summary#
SAP advances in two stages. First, free motion predicts the unconstrained
velocity v_star. Second, the solver minimizes a convex velocity objective
that keeps the solution close to free motion in the dynamics metric while
adding smooth terms for contact, friction, actuation, joint drives, and limits.
Solver Configuration and Presets#
The solver surface is SolverSAP. It requires a
SapModel with articulated joint DOFs and does not
support gradient states. benchmark.py constructs it directly from the
loaded scene:
solver = SolverSAP(
loaded.sap_model,
max_rigid_contact=max_rigid_contact,
contact_tau_d=scene_default_shape_tau,
**simulation_solver_kwargs,
)
Pipeline#
step() performs four stages:
Convert public boundary state/control arrays into the solver’s SAP ordering when needed.
Run free motion through
SapFreeMotion.Build contact Jacobians and per-environment dynamics matrices through
SapContactJacobian.Solve the stage-2 SAP velocity problem through
SapContactSolveand integratestate_out.
Collision generation is separate. The caller is expected to run
collide() and pass the
resulting contacts into step().
Blocked Cholesky#
The blocked Cholesky kernels used for the Newton direction are based on NVIDIA Warp’s tile blocked-Cholesky example code and adapted for SAP Warp’s batched, masked, and multi-right-hand-side contact solve paths. The surrounding solver assembly, precision switches, and contact objective are SAP Warp code.
Contact Presets#
contact_preset_variant chooses a coherent numerical discretization of the
same SAP timestep. It is best read as a small experimental design: each preset
selects the precision used by the free-motion and contact stages, the formula
used for the effective contact weight, the point at which the contact Jacobian
is evaluated, and the map that integrates the solved velocity back to
generalized position.
All presets still assemble a stage-2 velocity objective of the form
where \(P\) denotes the selected preset. The minimizer is then integrated by the preset’s position map:
Thus approx32, approx64, and drake are not different contact laws.
They are different approximations to the same discrete variational problem. The
important question is which quantities in \(\ell_P\) are evaluated cheaply,
which are evaluated in reference precision, and which geometric point defines
the relative contact velocity.
Preset |
Intent |
Precision path |
Contact modes |
Integration |
|---|---|---|---|---|
|
Fast mixed-precision default for large batches. |
f32 boundary pose, fp32 free motion, fp64 contact solve, fp32 linear solve, fp32 contact weights. |
|
|
|
Precision ablation of the approximate path. |
f64 boundary pose, fp64 free motion, fp64 contact solve, fp64 linear solve, fp64 contact weights. |
|
|
|
Drake-style reference mapping for the rigid-contact path. |
f64 boundary pose, fp64 free motion, fp64 contact solve, fp64 linear solve, fp64 contact weights. |
|
|
If omitted, SolverSAP uses approx32.
approx_32/approx-32 and approx_64/approx-64 are normalized to
approx32 and approx64.
Preset Matrix#
The table above is compact. The following matrix is the more useful way to read the implementation. Each row names a mathematical object in the objective and shows how the presets choose it.
Object |
|
|
|
|---|---|---|---|
Boundary pose used during contact preparation |
Public pose converted through the f32 path. |
Boundary pose promoted to f64. |
Boundary pose promoted to f64. |
Free-motion solve \(v^*_P\) |
fp32 linear algebra. |
fp64 linear algebra. |
fp64 linear algebra. |
Contact objective buffers |
fp64 scalar objective and derivative buffers. |
fp64 scalar objective and derivative buffers. |
fp64 scalar objective and derivative buffers. |
Newton linear solve \(H_k d_k=-\nabla\ell(v_k)\) |
fp32 blocked Cholesky path. |
fp64 blocked Cholesky path. |
fp64 blocked Cholesky path. |
Effective contact weight \(w_{i,P}\) |
fp32 body-inertia estimate. |
fp64 body-inertia estimate. |
fp64 diagonal-Delassus estimate. |
Contact Jacobian \(J_{i,P}\) |
Relative velocity at the two collision witnesses. |
Relative velocity at the two collision witnesses. |
Relative velocity at a stiffness-weighted contact midpoint. |
Position map \(\Phi_P\) |
Midpoint velocity integration. |
Midpoint velocity integration. |
SAP Euler integration. |
This layout gives the presets a useful interpretation:
approx32Optimizes throughput and memory traffic while keeping the contact objective itself in fp64. It is the default because large replicated worlds are often limited by the free-motion path, contact-weight preparation, and the contact Newton linear solve.
approx64Keeps the approximate geometry and weight model but removes the deliberate fp32 choices. In practice it is the preset to use when a scene behaves differently under
approx32and you want to know whether the difference is primarily numerical precision or the approximate contact model.drakeUses the Drake-style rigid-contact discretization: f64 throughout the SAP path, a Delassus-based effective weight, midpoint contact geometry, and SAP Euler position integration. This is the preset to use when comparing against a reference SAP formulation or when stiff contact stacks are more important than maximum throughput.
The contact solve accepts explicit keyword overrides after the preset has been
expanded. For example, using contact_preset_variant="approx64" and then
setting contact_weight_mode="diag_delassus" creates a single-mode
experiment: precision and integration stay approximate-family, while the weight
model changes.
Approx32#
approx32 is the fast approximate member of the family. It chooses the
body-inertia contact weight, witness-point contact Jacobian, midpoint position
integration, fp32 free motion, fp32 contact weights, and an fp32 Newton linear
solve. The scalar contact objective still runs in fp64, so the contact
potential, gradient, and Hessian contributions are evaluated in the same scalar
precision as the other presets once the approximate data have been assembled.
For a contact \(i\), let \(C_i=\{c_{t1},c_{t2},c_n\}\) be the orthonormal contact frame in world coordinates. For body \(b\), contact witness \(x_b\), center of mass \(o_b\), world rotation \(R_b\), mass \(m_b\), and diagonal body inertia \(I_b\), the directional body mobility used by the implementation is
The body_inertia weight averages these directional mobilities over the two
tangent axes and the normal axis:
In approx32 this expression is evaluated through the fp32 path before being
stored in the fp64 contact buffers. The approximation is intentionally local:
it uses the inertia of the bodies that own the two shapes and the contact
lever arms, but it does not apply the articulated dynamics matrix
\(A_P^{-1}\) to the full contact Jacobian. This saves the per-contact
matrix traversal used by the Delassus path.
The contact Jacobian is evaluated at the witness points produced by collision:
where \(R_{WC}\) maps world velocity into the contact frame and
\(V_b(x;v)\) is the world velocity of body \(b\) at point \(x\).
When the two witnesses are separated by a small positive gap, this measures
relative velocity at the actual pair of geometric witnesses rather than first
collapsing them to a common point. That is why the preset is called
approx: the solve is still a SAP solve, but the contact metric and contact
geometry are less globally coupled than the Drake-style path.
After the velocity solve, position integration uses a midpoint velocity:
For scalar joints this is the familiar \(q_{n+1}=q_n+h\bar v\). For quaternion joints the implementation applies the corresponding midpoint angular increment and normalizes the result. This choice is useful for the approximate presets because it damps some visual discrepancy between the explicit state \(q_n\) and the solved velocity \(v_{n+1}\) when the free-motion path is intentionally lightweight.
Approx64#
approx64 keeps the same mathematical approximations as approx32 but
changes the numerical precision of the path:
The important difference is that boundary pose preparation, free motion,
contact-weight evaluation, the stage-2 scalar buffers, and the blocked Newton
linear solve all use fp64. This makes approx64 the cleanest diagnostic
preset. If
is small for a scene, then the approximate-family choices
body_inertia/witness_point/midpoint dominate the behavior relative
to drake. If \(\Delta v_{\mathrm{round}}\) is large, the scene is
sensitive to fp32 free motion, fp32 contact weights, or the fp32 Newton linear
solve. The preset is therefore useful even when it is not the intended runtime
default: it separates floating-point sensitivity from modeling sensitivity.
Because approx64 still uses witness points, its contact Jacobian has the
same geometric support as approx32. Because it still uses body-inertia
weights, its effective contact metric remains local to the contacted bodies.
The preset should therefore not be interpreted as “Drake in fp64”; it is the
high-precision version of the approximate implementation.
Drake#
drake selects the reference-style rigid-contact path. It keeps the solver
data in fp64, evaluates the contact Jacobian at a single contact point, computes
the effective weight from a Delassus-style matrix, and integrates position with
the SAP Euler map.
For each contact, the implementation forms a diagonal Delassus estimate from the assembled per-environment dynamics matrix:
The use of \(\mathrm{diag}(A_P)^{-1}\) is an implementation choice: it
captures the scale of the contact-space mobility while avoiding a dense solve
for every candidate contact. Compared with body_inertia, this weight sees
the contact Jacobian and the assembled generalized dynamics metric, so the
regularization in \(\ell_i\) is tied more directly to the velocity unknowns
used by the Newton system.
The contact point is a stiffness-weighted midpoint of the two collision witnesses:
with \(\alpha_0=\alpha_1=\frac{1}{2}\) when the denominator is zero. The Jacobian then measures both body velocities at \(p_C\):
This is closer to the usual point-contact abstraction: the normal gap still comes from the collision witnesses and margins, but the velocity constraint is represented at one shared contact point. For stiff contacts this tends to make the local contact model and the generalized dynamics metric agree more closely.
Finally, SAP Euler integration advances positions using the solved velocity:
which reduces to \(q_{n+1}=q_n+h\,v_{n+1}\) for scalar coordinates and uses the corresponding SAP quaternion update for rotational coordinates. The result is a preset whose contact geometry, weight metric, and integration rule match the intended Drake-style SAP discretization more closely than the approximate presets.
Mode Variants#
Each preset is assembled from four independent mode families. Reading them individually is useful when you override one constructor keyword at a time.
Contact Weight Mode#
The effective contact weight \(w_i\) scales the smooth contact potential for contact \(i\). In the Newton system it appears through the contact gradient and Hessian terms
body_inertia is local and cheap. It computes \(w_i\) from body masses,
body-frame diagonal inertias, contact-frame directions, and witness lever arms,
using the average formula shown in the approx32 section. It is selected by
approx32 and approx64.
diag_delassus is dynamics-aware. It first assembles the contact Jacobian
and the per-environment dynamics matrix, then estimates a contact-space
Delassus matrix with \(\mathrm{diag}(A)^{-1}\):
It is selected by drake.
Contact Point Mode#
contact_point_mode selects where the relative velocity in \(J_i v\) is
measured. The collision record always stores two witness points \(x_0\) and
\(x_1\), a normal, two margins, and material data. The mode only changes
the velocity point used by the Jacobian.
witness_point evaluates the velocity of each body at its own witness:
contact_midpoint first computes \(p_C\) and evaluates both velocities
there:
The midpoint mode is paired with the Delassus weight in drake because both
choices make the contact term depend on a single contact-space velocity
measured against the assembled generalized dynamics.
Position Integration Mode#
position_integration is applied after the velocity solve. It does not
change the minimization problem already solved for \(v_{n+1}\), but it
does change the next configuration used by collision and free motion.
midpoint uses the average of the incoming SAP velocity and the solved SAP
velocity:
sap_euler uses the solved velocity directly:
Precision Modes#
The precision knobs choose where fp32/fp64 arrays enter the preset. They do not rename the physical model; they change the floating-point realization of the same equations.
Knob |
Mathematical object |
What changes |
|---|---|---|
|
\(v^*_P\) and the dynamics data prepared by free motion. |
Precision of the linear algebra used before contacts are applied. |
|
\(v\), \(\nabla \ell\), \(H\) |
Scalar precision used by the stage-2 Newton solve. |
|
\(H d = -\nabla \ell\) |
Precision of the blocked Cholesky linear solve for the Newton direction. |
|
\(w_i\) |
Precision used to compute the effective contact weight. |
|
\(w_i\) |
Formula used for the scalar effective weight in contact regularization. |
|
\(J_i\) |
Point where the contact Jacobian measures relative velocity. |
|
\(q_{n+1}\) |
Formula used to integrate positions after \(v_{n+1}\) is solved. |
Precision Knobs#
Each preset can be overridden with explicit constructor kwargs. These are the same knobs described above, exposed directly for controlled comparisons:
free_motion_solve_precisionfp32/f32orfp64/f64for the free-motion solve that prepares \(v^*_P\).contact_solve_precisionScalar dtype used by the stage-2 contact objective buffers.
contact_linear_solve_precisionLinear solve dtype used by the blocked Cholesky path for \(H_k d_k=-\nabla\ell(v_k)\).
sap_contact_weight_precisionPrecision used while computing \(w_i\).
use_f64_boundary_poseConvert boundary poses to f64 for contact preparation.
When a value is computed through fp32 and later stored in an fp64 buffer, it is useful to think of a projection
The subsequent fp64 contact solve cannot recover information lost by the
earlier projection. This is the main distinction between approx32 and
approx64.
Contact Capacity#
max_rigid_contact is the per-world contact capacity used by
SolverSAP and its env-local contact buffers.
benchmark.py reads
simulation.max_rigid_contact as this per-env cap, then sizes the flat
SapCollisionPipeline contact buffer as
simulation.max_rigid_contact * num_worlds.
If generated contacts exceed capacity, the extra contacts are dropped and
solver.last_truncated_contact_count records the truncated count reported by
SapContactJacobian.
Line Search#
line_search_variant is independent from contact_preset_variant. A scene
can use contact_preset_variant="drake" with line_search_variant="armijo_decay",
or an approximate preset with exact_root. The line search controls how the
contact solve accepts a Newton direction after the stage-2 objective has been
assembled.
At Newton iteration \(k\), define the one-dimensional objective
The Newton direction is computed from
so a valid active Newton environment should have \(\varphi_k'(0)=\nabla\ell(v_k)^Td_k<0\). The line search chooses \(\alpha\) and commits
The implementation evaluates the SAP cost along this line, not only a quadratic model. The dynamics term is updated from precomputed line coefficients,
while contact, drive, and limit regularizers are recomputed at the trial velocity. This matters for contact because the SAP regularizers are smooth but nonlinear in \(J_i(v_k+\alpha d_k)\).
monotone_decayConservative default. It tries the sequence
\[\alpha_j = 2^{-j}, \qquad j=0,1,\ldots,\]and accepts the first candidate whose trial cost is monotone up to the fixed device-side relaxation
\[\varphi_k(\alpha_j) \le \varphi_k(0) + 10^{-14} + 10^{-12}\,|\varphi_k(0)|.\]If the step falls below \(10^{-8}\) before satisfying the test, the line search reports a failure status for that environment. This variant does not use the directional slope in its acceptance rule, which makes it simple and cheap. It is the runtime default because it is stable for batched scenes and has predictable work: one fused kernel evaluates the candidate cost, accepts the step, and writes the committed velocity.
armijo_decayArmijo-style backtracking with an over-relaxed first probe.
rhomust lie in(0, 1)andarmijo_ccontrols sufficient decrease. The first trial is\[\alpha_{\max} = \frac{1}{\rho},\]so with the default
rho=0.8the solver first tests \(\alpha_{\max}=1.25\). If the derivative at that point is still negative, or only positive within the line-search slop, the over-relaxed step is accepted. Otherwise the candidate sequence becomes\[\alpha_{\max},\; 1,\; \rho,\; \rho^2,\ldots .\]A candidate is accepted using the Armijo condition
\[\varphi_k(\alpha) \le \varphi_k(0) + c\,\alpha\,\varphi_k'(0),\]together with a finite-difference slope check between consecutive candidate costs. If both the current and previous candidates satisfy Armijo near the bracketed turn in the line, the previous larger step can be selected. The
line_search_relative_slopparameter defaults to \(1000\,\epsilon_{\mathrm{dtype}}\) and is scaled by the local cost magnitude before these near-flat tests are applied.exact_rootExact-root search path. If
line_search_max_iterationsis left at the constructor default40, the solver raises it to100for this variant. It searches for the stationary point of \(\varphi_k\) along the Newton direction:\[\varphi_k'(\alpha) = \frac{d}{d\alpha}\ell(v_k + \alpha d_k) = 0.\]The search interval is
\[0 \le \alpha \le \alpha_{\max}, \qquad \alpha_{\max}=1.5.\]If \(\varphi_k'(\alpha_{\max})\le 0\), the cost is still decreasing at the upper end and the solver accepts \(\alpha_{\max}\). Otherwise it brackets the root between \(0\) and \(\alpha_{\max}\) and starts from the Newton estimate
\[\alpha_0 = \mathrm{clip} \left( \frac{-\varphi_k'(0)}{\varphi_k''(\alpha_{\max})}, 0, \alpha_{\max} \right).\]Each iteration tries a Newton update, but falls back to bisection whenever the Newton point leaves the bracket, becomes non-finite, or is judged too slow. The normalized derivative tolerance is \(10^{-8}\). This variant evaluates the trial derivative and second derivative, including contact Hessian terms, so it is more expensive than the decay variants. It is useful for solver comparisons and stiff scenes where the accepted step length is important enough to justify the additional work.
Convergence and diagnostic controls:
line_search_max_iterationsMaximum number of candidate line-search iterations.
exact_rootpromotes the constructor default from40to100.armijo_candrhoArmijo sufficient-decrease coefficient and geometric decay factor for
armijo_decay.line_search_relative_slopNear-flat tolerance used by the Armijo-style path. If unset, it is \(1000\,\epsilon_{\mathrm{dtype}}\).
cost_abs_tolandcost_rel_tolOuter Newton cost-convergence tolerances. If unset,
monotone_decayuses0.0and5.0e-3;armijo_decayandexact_rootuse1.0e-30and1.0e-15.exact_rootalso uses these values to accept a unit step when the initial directional decrease is already below the cost tolerance.last_line_search_iterationsThe solver records the total accepted line-search iterations from the last step for profiling and comparisons.