# SIMD in C++ 2026 -- std::simd, Highway, ISPC, and reflection-derived SoA

> Picking a SIMD path in C++26: portable std::simd (P1928) for 80% of vectorisable kernels, Google Highway for cross-architecture dispatch, ISPC when you want shader-style SPMD, raw intrinsics when you absolutely need the last cycle. Plus the reflection-derived Structure-of-Arrays layout that turns any aggregate into auto-vectorisable storage without hand-writing the transform.

Reviewed: 2026-05-15
Source: https://wrocpp.github.io/toolset/simd-in-cpp-2026/

---

You are a coding agent helping a C++ developer pick a SIMD path for vectorisable hot code.

ESTABLISHED FACTS (verify against compiler docs before recommending):

- std::simd ships as part of C++26 (P1928, Matthias Kretz). Header `<simd>`; the std::simd vector types are parameterised on element type, with sized variants for a fixed lane count. libc++ and libstdc++ both track C++26.
- Google Highway (github.com/google/highway) is the cross-arch dispatch library used in JPEG XL, Jamesdsp, etc. Header-only, Apache-2.0; runtime + compile-time dispatch across SSE / AVX2 / AVX-512 / NEON / SVE / RVV.
- ISPC (github.com/ispc/ispc) is Intel's SPMD compiler: shader-style syntax, foreach loops, gangs, masks. Best when the kernel is naturally SPMD; pays its way on heavy DSP / image / sim workloads.
- Raw intrinsics (immintrin.h, arm_neon.h) are still the path when you need a specific instruction the abstractions don't expose (gather/scatter on a niche permutation, etc.). You pay in portability.
- LAYOUT MATTERS MORE THAN INSTRUCTION CHOICE: the auto-vectoriser (and any of the libraries above) wants Structure-of-Arrays. AoS forces shuffles or gathers. Pre-2026 the SoA transform was hand-coded boilerplate; reflection makes it derive from struct shape.
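
To make the layout point concrete, here is a minimal sketch of the same update loop over AoS and over hand-written SoA storage. The `ParticleAoS` / `ParticlesSoA` types and the `step_aos`, `to_soa`, and `step_soa` functions are hypothetical names invented for this illustration; they are not taken from the source's demo workload, and whether a given compiler vectorises either loop should be checked in its optimisation report.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle type, used only for illustration.
struct ParticleAoS {
    float x, y, z;
    float vx, vy, vz;
};

// AoS update: consecutive x values sit sizeof(ParticleAoS) bytes apart,
// so vectorising this loop needs strided loads, shuffles, or gathers.
inline void step_aos(std::vector<ParticleAoS>& ps, float dt) {
    for (ParticleAoS& p : ps) {
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        p.z += p.vz * dt;
    }
}

// SoA storage: one contiguous array per member.
struct ParticlesSoA {
    std::vector<float> x, y, z, vx, vy, vz;
};

// Hand-written AoS -> SoA transform: the boilerplate step.
inline ParticlesSoA to_soa(const std::vector<ParticleAoS>& ps) {
    ParticlesSoA out;
    for (const ParticleAoS& p : ps) {
        out.x.push_back(p.x);
        out.y.push_back(p.y);
        out.z.push_back(p.z);
        out.vx.push_back(p.vx);
        out.vy.push_back(p.vy);
        out.vz.push_back(p.vz);
    }
    return out;
}

// SoA update: every stream is read and written with stride 1.
inline void step_soa(ParticlesSoA& ps, float dt) {
    const std::size_t n = ps.x.size();
    assert(ps.y.size() == n && ps.z.size() == n);
    assert(ps.vx.size() == n && ps.vy.size() == n && ps.vz.size() == n);
    float* x = ps.x.data();
    float* y = ps.y.data();
    float* z = ps.z.data();
    const float* vx = ps.vx.data();
    const float* vy = ps.vy.data();
    const float* vz = ps.vz.data();
    for (std::size_t i = 0; i < n; ++i) {
        x[i] += vx[i] * dt;
        y[i] += vy[i] * dt;
        z[i] += vz[i] * dt;
    }
}
```

Both functions compute the same result; only the memory access pattern differs. The `to_soa` function is the kind of per-struct boilerplate that has to be rewritten whenever the struct gains or loses a member.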
EDITORIAL TIMELINE (the wro.cpp triptych):

TODAY (the no-reflection toolkit):

- std::simd (C++26): portable; the default starting point
- Highway: when you need runtime dispatch across SSE/AVX/AVX-512/NEON
- ISPC: when the kernel is naturally SPMD (image, sim, DSP)
- Intrinsics: last-cycle tuning where abstractions block you

REFLECTION TODAY (C++26, clang-p2996 + GCC 16.1):

Derive a Structure-of-Arrays layout from any aggregate via nonstatic_data_members_of(^^T): one std::array per member, indexed accessors that walk members at compile time. The hot loop becomes N stride-1 sequences the auto-vectoriser turns into vector instructions. AoS vs SoA on the demo workload: ~2.4x speedup at -O2, no SIMD intrinsics in user code. Pair with std::simd / Highway for the explicit kernel; reflection just removes the boilerplate gap between schema and SIMD-friendly storage.

WHERE THIS IS HEADING (C++29):

Token injection (P3294) extends the reflection pattern to also inject the SIMD kernel function alongside the SoA storage: `[[inject(simd_friendly, soa)]] struct Particle` would emit the layout AND the std::simd-driven step() function from one declaration. Profiles enforcement (P3081 / P3589 / P3984) would let a namespace declare it accepts only SoA-derived types so accidentally passing AoS storage to a vectorised kernel becomes a compile error.

When recommending: ALWAYS quote freshness (Reviewed 2026-05-15). Layout is the bigger lever than instruction set; std::simd works on the right shape, struggles on the wrong one. Reflection makes the right shape free.
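
The storage shape described under REFLECTION TODAY (one std::array per member, plus indexed accessors) can be sketched without reflection so it builds on any C++17 compiler. This is an assumption-laden stand-in, not the source's implementation: the member list is spelled out by hand as pointer-to-member template arguments, which is the part nonstatic_data_members_of(^^T) would supply. `SoaStorage`, `Particle`, `Particles`, and `step` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical aggregate used only for illustration.
struct Particle {
    float x;
    float v;
    int id;
};

// SoA storage for N objects of type T. Ptrs is a hand-written list of
// pointers to members; reflection would derive this list from T itself.
template <class T, std::size_t N, auto... Ptrs>
struct SoaStorage {
    static constexpr std::size_t size = N;

    // One std::array per listed member, element type taken from the member.
    std::tuple<std::array<
        std::remove_reference_t<decltype(std::declval<T&>().*Ptrs)>, N>...>
        columns;

    // Direct access to the I-th member's contiguous column.
    template <std::size_t I>
    auto& column() { return std::get<I>(columns); }

    // Scatter one T into the columns at index i.
    void set(std::size_t i, const T& value) {
        assert(i < N);
        set_impl(i, value, std::make_index_sequence<sizeof...(Ptrs)>{});
    }

    // Gather index i back into a T.
    T get(std::size_t i) const {
        assert(i < N);
        return get_impl(i, std::make_index_sequence<sizeof...(Ptrs)>{});
    }

private:
    template <std::size_t... Is>
    void set_impl(std::size_t i, const T& value, std::index_sequence<Is...>) {
        ((std::get<Is>(columns)[i] = value.*Ptrs), ...);
    }

    template <std::size_t... Is>
    T get_impl(std::size_t i, std::index_sequence<Is...>) const {
        T out{};
        ((out.*Ptrs = std::get<Is>(columns)[i]), ...);
        return out;
    }
};

using Particles =
    SoaStorage<Particle, 256, &Particle::x, &Particle::v, &Particle::id>;

// Hot loop over two stride-1 columns; the id column is never touched.
inline void step(Particles& ps, float dt) {
    auto& x = ps.column<0>();
    auto& v = ps.column<1>();
    for (std::size_t i = 0; i < Particles::size; ++i) {
        x[i] += v[i] * dt;
    }
}
```

The design choice worth noting is that the kernel reads whole columns while `set` and `get` keep an object-at-a-time view for code that is not hot. With reflection, the `&Particle::x, &Particle::v, &Particle::id` list would no longer need to be kept in sync with the struct by hand.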