OTOM Logo

OTOM CFD WebGPU

Advanced Simulation, 3D Raymarching, & AI Orchestration

Table of Contents

1. System Overview & Architecture

OTOM CFD is a high-performance computational fluid dynamics solver running entirely in the browser via WebGPU. It is heavily inspired by the methodologies popularized by the WaterLily.jl Julia package, specifically bringing Cartesian-grid, boundary-data immersion methods to massive parallel compute shaders.

The software supports both 2D and 3D incompressible Navier-Stokes simulations, a dedicated 2D compressible Euler engine (HLLC Riemann solver) for shock-bearing high-speed flows, multi-phase flows, and high-fidelity volumetric raymarching—all orchestrated through an interactive GUI and an integrated LLM-powered AI assistant.

2. Comprehensive Parameter Guide

The simulation is driven by a uniform buffer passed to the GPU every frame. Parameters are split between physical and visual properties.

ParameterDomainDescription / Physical Meaning
Engine Solver2DIncompressible (projection Navier-Stokes, low speed) or Compressible (Euler/HLLC, shocks & high speed). See §4.
Grid Resolution2D / 3DNumber of computational cells (e.g., 910x512 in 2D, 64³ to 256³ in 3D). Dictates spatial frequency limits.
Flow Mode2D / 3DVelocity enforces a constant inlet speed. Pressure drives flow via boundary pressure gradients (\(\Delta P\)).
Flow Speed (\(U_\infty\))2D / 3DInlet velocity in grid cells per second (or m/s if SI units enabled).
Lattice dt (\(\Delta t\))2D / 3DTime advanced per sub-step. Governs the CFL condition (\(CFL = U \frac{\Delta t}{\Delta x}\)).
Physics Steps2D / 3DNumber of compute passes executed per visual render frame. Increases stability for high-speed flows.
Jacobi Itr2D / 3DIterations of the Poisson pressure solver. Higher values ensure stricter incompressibility (\(\nabla \cdot \mathbf{u} = 0\)).
Viscosity (\(\nu\))2D / 3DKinematic viscosity. Defines fluid thickness. Air ~0.015, Water ~0.001.
Visual ParameterDomainDescription
View Mode2D / 3DSelects the Eulerian field to display: Vorticity, Velocity, Pressure, Phase, or Schlieren (Pressure Gradient).
Gain / VScale2D / 3DMultiplicative factor applied before colormapping. Adjusts the sensitivity of the visualizer.
Isoline Density2D / 3DFrequency of contour lines superimposed on the field to visualize gradients.
Arrow Density2D / 3DSpacing of the vector field arrows. In 3D, this samples a specific 2D slice defined by Slice Depth.

3. Core Numerics & Theory Solver

The core fluid solver uses a staggered Cartesian grid with a fractional-step (Projection) method.

Advection & Projection

1. Advection: The velocity field is self-advected using a semi-Lagrangian back-tracing scheme. To reduce numerical dissipation, we utilize high-order filtering.
2. Divergence: The divergence of the intermediate velocity field (\(\mathbf{u}^*\)) is calculated.
3. Pressure Poisson: A Jacobi iterative solver resolves the pressure field needed to counteract the divergence.
4. Gradient Subtraction: The velocity is corrected: \(\mathbf{u}^{n+1} = \mathbf{u}^* - \Delta t \nabla p\).

Boundary Data Immersion Method (BDIM)

Instead of conforming a mesh to an obstacle, OTOM CFD uses Brinkman Penalization. Solid objects are defined by exact analytical Signed Distance Fields (SDFs). The velocity inside the SDF (\(d \le 0\)) is forced to match the object's velocity, allowing for arbitrary, moving, and overlapping geometries at near-zero computational cost.

4. Compressible Flow Solver Euler / HLLC

Alongside the default incompressible projection solver, OTOM CFD provides a dedicated compressible engine (set Engine Solver = Compressible) for high-speed, transonic, and supersonic flows where density variation and shock waves are physically significant. Instead of enforcing incompressibility, it solves the 2D compressible Euler equations in conservative finite-volume form, so it naturally captures shocks, expansion fans, and contact discontinuities.

Governing Equations

The conserved state stored in every cell is \(\mathbf{Q} = [\rho,\ \rho u,\ \rho v,\ E]^T\) (density, x/y momentum, total energy), evolved by the hyperbolic conservation law:

$$\frac{\partial \mathbf{Q}}{\partial t} + \frac{\partial \mathbf{F}(\mathbf{Q})}{\partial x} + \frac{\partial \mathbf{G}(\mathbf{Q})}{\partial y} = 0$$

The system is closed by the ideal-gas equation of state, where \(\gamma\) is the ratio of specific heats and \(a\) the local speed of sound:

$$p = (\gamma - 1)\left(E - \tfrac{1}{2}\rho(u^2 + v^2)\right), \qquad a = \sqrt{\frac{\gamma p}{\rho}}$$

Godunov Finite-Volume Scheme & HLLC Riemann Solver

Each cell is advanced by the net flux across its four faces. The flux at every interface is obtained by solving a local Riemann problem with the HLLC (Harten–Lax–van Leer–Contact) approximate solver, which restores the central contact/shear wave that the simpler HLL solver smears out—essential for clean slip lines and wakes:

$$\mathbf{Q}^{n+1}_{i,j} = \mathbf{Q}^{n}_{i,j} - \frac{\Delta t}{\Delta x}\!\left(\mathbf{F}_{i+\frac12} - \mathbf{F}_{i-\frac12}\right) - \frac{\Delta t}{\Delta y}\!\left(\mathbf{G}_{j+\frac12} - \mathbf{G}_{j-\frac12}\right)$$

The full stencil—four face fluxes plus the conservative update—is fused into a single WebGPU compute pass for robustness and throughput. The scheme is first-order in space and time; a positivity guard floors density and internal energy each step to guarantee \(\rho>0\) and \(p>0\), preventing the NaN blow-up that otherwise occurs across strong shocks.

Boundary & Obstacle Treatment

  • Inlet (left): Dirichlet free-stream state fixed by the user's Inlet Mach, Inlet Density, and Inlet Pressure.
  • Outlet (right): Zero-gradient (transmissive) extrapolation.
  • Top / bottom: Inviscid slip walls (\(v = 0\)).
  • Obstacles: Treated as inviscid reflecting slip walls. The wall-face flux carries pressure only, \(\mathbf{F}_{wall} = [\,0,\ p\,n_x,\ p\,n_y,\ 0\,]^T\), which produces detached bow shocks ahead of blunt bodies and oblique shocks off wedges.

Stability, Parameters & Visualization

The explicit time step is bounded by the acoustic CFL condition \(\Delta t < \Delta x / (|u| + a)\). If a strong shock destabilizes the field, lower Lattice dt or raise Physics Steps. The engine exposes four parameters (active only in Compressible mode):

ParameterRangePhysical Meaning
Gamma (\(\gamma\))1.1 – 2.0Ratio of specific heats (1.4 for diatomic air).
Inlet Mach0 – 5.0Free-stream Mach number \(M = u/a\). Values > 1 are supersonic.
Inlet Density0.1 – 5.0Free-stream density \(\rho_\infty\).
Inlet Pressure0.1 – 5.0Free-stream static pressure \(p_\infty\).

Recommended view modes: Schlieren (density gradient \(|\nabla \rho|\), the classic shock-visualization technique), Velocity (rendered as Mach number), or Pressure. Lagrangian particles and streamlines are advected directly by the reconstructed velocity \(\mathbf{u} = (\rho u,\ \rho v)/\rho\), so they stream through and bend around shocks just like the field itself.

5. Turbulence & Fidelity LES

Large Eddy Simulation (LES)

The Smagorinsky model calculates sub-grid scale (SGS) stresses. It derives an "eddy viscosity" \(\nu_t\) from the resolved strain rate tensor \(\bar{S}_{ij}\), dissipating energy at the smallest resolvable scales.

$$\nu_{t} = (C_s \Delta)^2 \sqrt{2 \bar{S}_{ij} \bar{S}_{ij}}$$

Vorticity Confinement

To counteract the inherent diffusion of the semi-Lagrangian advection, Vorticity Confinement applies a body force that pushes fluid towards local vorticity maxima, keeping vortices sharp and persistent.

$$\mathbf{f}_{conf} = \epsilon \Delta x (\mathbf{N} \times \omega), \quad \mathbf{N} = \frac{\nabla |\omega|}{|\nabla |\omega||}$$

6. Multi-Phase & Control Volumes

OTOM CFD supports multi-phase simulations (e.g., air and water interaction) by advecting a scalar phase field \(\phi \in [0,1]\).

7. 3D Simulation & Volumetric Rendering WebGPU 3D

The 3D solver extends the 2D logic into a texture_3d environment. Rendering is achieved via a single-pass Raymarching fragment shader.

Physically Based Raymarching

The camera casts rays through the domain bounding box. The shader performs two primary operations per ray:

  1. Sphere Tracing: Finds intersections with 3D solid obstacles using the get_sdf3d function. If hit, it calculates surface normals and applies PBR lighting (Metallic/Roughness/Ambient).
  2. Volume Accumulation: If the ray misses the solid (or before it hits), it steps through the 3D texture, sampling the chosen field (e.g., Vorticity). Color and alpha are accumulated via alpha blending (\(C = C_{src}\alpha + C_{dst}(1-\alpha)\)).

Advanced 3D Features

  • Isosurface Mode: Instead of fuzzy clouds, the volume renderer searches for a specific threshold value, calculates the 3D gradient normal of the fluid field, and renders it as a solid, shaded surface complete with volumetric soft shadows.
  • Cutting Planes: Users can define a plane (\(\mathbf{n} \cdot \mathbf{x} - d = 0\)) to clip the volume or strictly render a 2D cross-section slice inside the 3D domain.

8. Lagrangian Particles & Trails

Both 2D and 3D modes support massless Lagrangian particles advected by the velocity field via an RK1 or Euler step.

9. Integrated AI Agent Architecture LLM

OTOM CFD includes a built-in CFDChatbot that allows users to alter the simulation using natural language. The architecture works as follows:

Function Calling & Parameter Mapping

The agent communicates with the Gemini API using strict JSON schemas (Function Calling). When a user asks "Make the fluid thicker" or "Add a large spinning wing," the LLM maps intent to two primary functions:

  • update_settings: Maps semantic concepts to internal variables (e.g., "thicker" \(\rightarrow\) viscosity: 0.05, "slower" \(\rightarrow\) inletVel: 5.0).
  • set_obstacles / modify_existing_obstacles: Translates geometry requests into SDF parameters (e.g., obsType: 'Engineering', engShape: 'NACA 0012', angle: 15, scale: 2.0).

The Chatbot class intercepts these tool calls and updates the Lil-GUI controllers, which in turn flush the updated data to the WebGPU uniform and storage buffers.

10. Aerodynamics & Analytics

The software calculates Drag (\(C_d\)) and Lift (\(C_l\)) coefficients by integrating pressure and viscous shear forces over the surface of the obstacles.

The Reynolds number is calculated dynamically based on the largest obstacle's chord length \(L\):

$$Re = \frac{U_\infty L}{\nu}$$

Analytics can be exported as structured text or baked into PDF reports with canvas screenshots using the GUI tools.

11. AI Surrogate Models Experimental · ML

The AI tab (brain icon in the left sidebar, directly below Demo) provides a machine-learning workflow for training and running neural-network surrogate models that approximate the physics solver. A surrogate learns the solver's behaviour from data and, once trained, can advance the flow with a single network evaluation instead of the full iterative solve. This is offered as an experimental research feature: surrogates trade physical fidelity for speed, and are most interesting in 3D where the iterative pressure solve is the dominant cost.

Concept: Learning the Per-Step Residual

Rather than predicting the next flow field directly, the networks are trained to predict the residual — the change in the field over one solver step — which the simulation then integrates:

$$ \mathbf{q}_{t+\Delta t} = \mathbf{q}_{t} + \mathcal{N}_\theta(\mathbf{q}_{t}, \, \text{mask}, \, Re) $$

where \( \mathbf{q} \) is the stacked velocity/pressure state, \( \text{mask} \) is the obstacle occupancy, \( Re \) is the (normalised) Reynolds number, and \( \mathcal{N}_\theta \) is the trained network. Predicting the residual keeps the targets small and well-conditioned and improves step-to-step stability. All inputs and outputs are normalised internally by fixed reference scales so the network operates on \(\mathcal{O}(1)\) values.

Resolution lock: a network's weights encode the physical scale of the grid it trained on, so each model is bound to a single grid resolution. Applying a model automatically sets the simulation to that grid. To run at a different resolution you train a separate model — resolution is part of a model's identity, not a runtime knob.

2D Training — Train New Model

Trains a compact U-Net (three encoder/decoder levels with skip connections, ~tens of thousands of parameters) on the 2D incompressible solver using TensorFlow.js. The data pipeline drives the real GPU solver headlessly across randomised parameters, captures (state, next-state) pairs near quasi-steady state, and converts them to normalised residual targets. Controls:

  • Model name — identifier used for saving / loading.
  • Obstacle — Circle (the 2D v1 shape).
  • Re min / Re max, U_in min / U_in max — the Reynolds-number and inlet-velocity ranges sampled during data collection.
  • Samples — number of randomised simulation runs; each contributes several training pairs.
  • Epochs — passes over the collected dataset (Adam optimiser, mean-squared-error loss).
  • Status — live progress: Collecting…, then Epoch n — loss …, then Saved ….

2D inference uses a CPU round-trip (read the field off the GPU, evaluate the network in TensorFlow.js, write the result back). At small 2D grids the data volume is tiny, so this stays real-time.

3D Training — Train 3D Model

Trains a small fully-convolutional 3D CNN (a flat stack of 3×3×3 convolutions, 8 → 16 → 16 channels, plus a 1×1×1 output convolution — deliberately not a U-Net so the GPU inference path needs only one kernel type). The data pipeline runs the real 3D incompressible solver headlessly, reads back the velocity and pressure volumes, and bakes the obstacle's occupancy mask as an additional input channel. Controls:

  • Model name.
  • Grid (³) — cubic resolution: 32, 64, 128, 256, 512 (the same set the 3D solver offers, so an applied model always lands on a valid sim resolution).
  • Obstacle — a dropdown: choose a single shape (Sphere, Box, Cylinder, Wing) or All to train one model that generalises across every shape. Multi-shape generalisation is possible because the obstacle's signed-distance / occupancy mask is supplied to the network as a dedicated input channel.
  • Re / U_in ranges, Samples, Epochs, Status — as in the 2D trainer (3D uses tiny batch sizes because the volumes are large).

Input channels (6): \(v_x, v_y, v_z\), pressure, obstacle occupancy (SDF), normalised Reynolds number. Output channels (4): residual \(\Delta v_x, \Delta v_y, \Delta v_z, \Delta p\).

⚠ Memory & cost at large grids. Cost scales with the number of cells, which grows as the cube of the resolution. Grids 16–64 are practical; 128 is heavy; 256 and 512 are extreme and may exhaust GPU or CPU memory during data collection, dataset assembly, or training (a single 256³ activation tensor is on the order of 1 GB; 512³ is ~8× larger). These options are provided for completeness and experimentation — for a first run prefer Grid 16–32, one shape, ~8–12 Samples, ~10 Epochs, and increase only once it trains successfully.

3D Inference — GPU-Resident Forward Pass

Unlike the 2D path, the trained 3D network runs entirely on the GPU. Its weights are uploaded to GPU buffers and the forward pass executes as a chain of WebGPU compute shaders operating directly on the simulation's velocity/pressure volumes:

$$ \text{pack} \;\rightarrow\; \text{conv3d} \times 4 \;\rightarrow\; \text{write-back} $$

The pack stage gathers the velocity, pressure, and obstacle-mask channels into a packed input buffer; each conv3d stage performs one zero-padded ("same") convolution with ReLU; the write-back stage integrates the predicted residual and writes the new velocity and pressure back to the volume textures (zeroing velocity inside solid cells). Because there is no CPU round-trip, the surrogate can in principle outrun the iterative solver — whose 3D cost is dominated by the many Jacobi pressure-Poisson iterations per step. The advantage is largest at higher resolutions; at small grids the physics solver is already inexpensive, so the surrogate may show little or no speed-up.

Correctness self-test. When a 3D model is applied, a one-off check runs the GPU forward pass and compares it against the reference TensorFlow.js evaluation on the identical input, logging the maximum absolute difference to the browser console. A small value (≈\(10^{-3}\) or below) confirms the GPU kernels reproduce the trained network.

Managing & Applying Models — My Models / Apply Model

My Models lists every saved surrogate with its parameter ranges and final training loss; each row has a trash icon to delete it. Models (weights plus a metadata record) persist in the browser through IndexedDB, so they survive page reloads.

Apply Model selects an Active Model and toggles Enable AI Solver. Enabling a model swaps the surrogate in for the physics solver: a 2D model switches the simulation to 2D incompressible at the model's grid; a 3D model switches to 3D incompressible at the model's grid (open the 3D tab to view the result). Disabling the toggle — or selecting None — instantly restores the real physics solver.

Scope, Accuracy & Practical Tips

  • Experimental: surrogates approximate the solver and are not a substitute for it when accuracy matters. A small network trained on few samples can drift or become unstable over many steps; if the field degrades, disable the AI Solver to recover.
  • Improving quality: increase Samples (raises the achievable accuracy ceiling), increase Epochs (reaches that ceiling), and keep the Re/U_in ranges as narrow as your use-case allows (less variety means the network has less to generalise over). Watch the loss in Status: if it is still falling at the last epoch, add epochs; if it has plateaued, add samples or accept the limit.
  • Speed comparison: to judge whether the surrogate is actually faster, watch the FPS readout in the Performance folder with the AI Solver on versus off at the same grid.
  • Supported obstacles (3D training): Sphere, Box, Cylinder, Wing (the analytical primitives whose occupancy can be computed directly). STL meshes and extruded 2D profiles are not yet covered by the 3D trainer.

12. Academic References & Theoretical Foundations

The numerical methods and boundary immersion techniques driving this WebGPU solver heavily draw upon academic research published in top-tier Elsevier journals. For further reading on the mathematics of WaterLily-style models, consult the following literature:

Weymouth, G. D., & Yue, D. K. P. (2011).
Boundary data immersion method for Cartesian-grid simulations of fluid-body interaction problems.
Journal of Computational Physics, 230(22), 6233-6247. [Elsevier]
Establishes the BDIM framework used for obstacle rendering and interaction without conformal meshing.
Maertens, A. P., & Weymouth, G. D. (2015).
Accurate Cartesian-grid simulations of near-body flows at intermediate Reynolds numbers.
Computer Methods in Applied Mechanics and Engineering, 283, 106-129. [Elsevier]
Expands on projection methods and pressure Poisson solvers over non-conformal grids, foundational for the Jacobi iterations used herein.
Laizet, S., & Lamballais, E. (2009).
High-order compact schemes for incompressible flows: A simple and efficient method with quasi-spectral accuracy.
Journal of Computational Physics, 228(15), 5989-6015. [Elsevier]
Provides context on high-order spatial discretizations necessary for maintaining stability in Cartesian solvers at high Reynolds numbers.
Toro, E. F., Spruce, M., & Speares, W. (1994).
Restoration of the contact surface in the HLL-Riemann solver.
Shock Waves, 4(1), 25-34. [Springer]
Introduces the HLLC approximate Riemann solver used by the compressible Euler engine to capture shocks, contact surfaces, and shear waves.
Antoniadis, A. F., Drikakis, D., Farmakis, P. S., et al. (2022).
UCNS3D: An open-source high-order finite-volume unstructured CFD solver.
Computer Physics Communications, 279, 108453. [Elsevier]
High-order finite-volume methodology (MUSCL/WENO reconstruction with Riemann solvers) that informs the compressible engine's flux-reconstruction roadmap.
Smagorinsky, J. (1963).
General circulation experiments with the primitive equations.
Monthly Weather Review (Historical context; modern implementations reviewed widely in Elsevier's Computers & Fluids).
The basis for the Large Eddy Simulation (LES) sub-grid scale modeling implemented in the software's turbulence settings.