Daily Digest

Representations measured, dendrites delayed, neurodynamics evolved

2026-05-15 · 4 synopses

A measurement-flavored day. Three of today’s items propose new ways to characterize what a model has internalized, or how internal representations evolve under self-supervision. The interpretability conversation is shifting from “what does a unit do” to “how do entire representational geometries match across models and brains.”

research

Partial soft-matching distance: a representational similarity metric that admits ambiguity

Standard RSA collapses partial correspondence between units into a single scalar. This proposal keeps the matching uncertainty visible, and surfaces structure that the scalar version was averaging away.

Representational Similarity Analysis is the field’s workhorse for comparing internal representations across models, layers, and brain regions. Its weakness is also its strength: it reduces a representation to a single distance matrix and asks how close two such matrices are. Soft, partial, many-to-many correspondences between units get crushed into the average.
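The comparison described above can be sketched in a few lines of NumPy. This is one common RSA variant, not any specific paper's pipeline: the function names `rdm` and `rsa_score`, the correlation-distance matrix, and the Pearson comparison of upper triangles are all illustrative assumptions.

```python
import numpy as np

def rdm(acts: np.ndarray) -> np.ndarray:
    """Dissimilarity matrix over stimuli: 1 - Pearson correlation.

    acts has shape (n_stimuli, n_units); the result is (n_stimuli, n_stimuli).
    """
    return 1.0 - np.corrcoef(acts)

def rsa_score(acts_a: np.ndarray, acts_b: np.ndarray) -> float:
    """Correlate the upper triangles of the two dissimilarity matrices.

    The two representations must share stimuli but may differ in unit count.
    """
    iu = np.triu_indices(acts_a.shape[0], k=1)
    return float(np.corrcoef(rdm(acts_a)[iu], rdm(acts_b)[iu])[0, 1])
```

Shuffling the units of a representation leaves its dissimilarity matrix unchanged, so the score carries no information about which unit corresponds to which. That is the collapse the synopsis refers to.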

This paper proposes a soft-matching distance that retains the matching itself as part of the metric. Two representations whose units partially overlap return a low distance and a high-rank matching tensor that says where the overlap is. The two come bundled; the practitioner sees both.
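To illustrate the idea of returning a distance together with the matching, here is a minimal sketch based on generic entropic optimal transport (Sinkhorn iterations). It is not the paper's algorithm: the function name `soft_match`, the cost `1 - correlation`, the uniform marginals, and the parameters `eps` and `n_iter` are assumptions, and the partial-matching relaxation is omitted entirely.

```python
import numpy as np

def soft_match(X, Y, eps=0.05, n_iter=500):
    """Return (distance, plan) for two representations over the same stimuli.

    X is (n_stimuli, p), Y is (n_stimuli, q). plan is a (p, q) soft matching
    whose rows sum to 1/p; distance is the plan-weighted average cost.
    """
    # z-score each unit's responses so the inner product is a Pearson correlation
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    corr = Xz.T @ Yz / X.shape[0]      # (p, q) unit-by-unit correlations
    cost = 1.0 - corr                  # 0 for identical tuning, up to 2

    p, q = cost.shape
    a = np.full(p, 1.0 / p)            # uniform mass on units of X (assumption)
    b = np.full(q, 1.0 / q)            # uniform mass on units of Y (assumption)
    K = np.exp(-cost / eps)
    u = np.ones(p)
    for _ in range(n_iter):            # Sinkhorn scaling updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]
    return float((plan * cost).sum()), plan
```

The scalar answers "how similar"; the plan answers "which units carry the similarity." A caller would inspect `plan` for rows with concentrated versus diffuse mass to see where the correspondence is sharp and where it is ambiguous.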

In a re-analysis of the Kornblith et al. CKA benchmark, the new metric agrees with CKA on the gross structure (transformers cluster together, ConvNets cluster together) but diverges sharply on the inter-architecture comparisons that have been the most contested. The authors argue, with cautious phrasing, that some prior “models are converging” claims may have been measurement artifacts of metric collapse. The disagreement is worth taking seriously precisely because the new metric was not designed to disagree.

Sources: Partial Soft-Matching Distance (arXiv:2603.x)

research

Sparse axonal and dendritic delays let SNNs compete on temporal benchmarks without recurrence

By learning per-synapse propagation delays — sparse, but trainable — feed-forward SNNs match recurrent counterparts on speech and gesture benchmarks at a fraction of the parameter count.

The standard recipe for temporal processing in spiking networks has been recurrence: build a memory by feeding state back into the network. This paper argues recurrence is a heavier hammer than needed when the underlying biology already supplies a cheaper one — variable conduction delays along axons and dendritic compartments.

The model trains a single scalar delay per active synapse, with a sparsity-inducing prior that prunes most delays back toward zero. The result is a structurally feed-forward network that still encodes temporal context, because different paths through the network arrive at downstream neurons with different latencies.
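A toy example shows why per-synapse delays give a feed-forward layer temporal context. The sketch below uses fixed, hand-set integer delays and a simple leaky integrate-and-fire neuron; the learned delays and sparsity prior from the paper are not modeled, and the names `delayed_currents` and `lif`, along with the threshold and leak values, are assumptions.

```python
import numpy as np

def delayed_currents(spikes, weights, delays):
    """Input current per output neuron when each synapse has its own delay.

    spikes: (T, n_in) array of 0/1; weights, delays: (n_in, n_out);
    delays are non-negative integers measured in time steps.
    """
    T, n_in = spikes.shape
    n_out = weights.shape[1]
    current = np.zeros((T, n_out))
    for i in range(n_in):
        for j in range(n_out):
            d = int(delays[i, j])
            if weights[i, j] == 0.0 or d >= T:
                continue
            # a spike emitted at step t arrives at step t + d
            current[d:, j] += weights[i, j] * spikes[:T - d, i]
    return current

def lif(current, threshold=1.0, leak=0.5):
    """Leaky integrate-and-fire with reset to zero; no recurrent connections."""
    T, n_out = current.shape
    v = np.zeros(n_out)
    out = np.zeros((T, n_out), dtype=int)
    for t in range(T):
        v = leak * v + current[t]
        fired = v >= threshold
        out[t] = fired
        v[fired] = 0.0
    return out

# Two inputs spike three steps apart; each alone is sub-threshold.
spikes = np.zeros((10, 2))
spikes[2, 0] = 1
spikes[5, 1] = 1
w = np.array([[0.6], [0.6]])

no_delay = lif(delayed_currents(spikes, w, np.zeros((2, 1), dtype=int)))
with_delay = lif(delayed_currents(spikes, w, np.array([[3], [0]])))
print(no_delay.sum(), with_delay[:, 0].nonzero()[0])  # 0 spikes vs. one spike at step 5
```

With zero delays the first input has leaked away before the second arrives. A three-step delay on the first synapse makes the two spikes coincide, so the neuron detects the temporal pattern with no feedback path.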

On the SHD spoken-digit benchmark and the DVS128-Gesture dataset, the delay-augmented feed-forward model matches or exceeds recurrent SNNs at roughly one-third the parameter count and with a substantially shorter training horizon. The biological resonance — that real axonal conduction is heterogeneous and that this heterogeneity is widely suspected to contribute to temporal processing — is acknowledged but not overplayed.

Sources: Sparse Axonal and Dendritic Delays (arXiv:2603.x)

research

Self-supervised evolutionary learning evolves neurodynamic identity

A combined evolutionary-and-self-supervised training regime produces small recurrent networks whose hidden-state trajectories cluster by input class without ever being told what the classes are.

The setup is unusual enough to flag plainly: small recurrent networks are trained with no labels, only a contrastive self-supervision objective on their own hidden-state trajectories. An outer evolutionary loop varies the network topology and hyperparameters across generations. Fitness is the discriminability of the resulting trajectory clusters.
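The synopsis does not say which contrastive objective is used. As an illustration only, the sketch below implements InfoNCE, one common choice, over summary embeddings of hidden-state trajectories. The function name `info_nce`, the `temperature` value, and the idea that each trajectory yields two views (for example, from two perturbed copies of the same input) are assumptions, and the outer evolutionary loop is not shown.

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE loss for paired embeddings of shape (batch, dim).

    Row k of z_a and row k of z_b are two views of the same trajectory
    (the positive pair); every other row in the batch acts as a negative.
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature                   # cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))            # positives on the diagonal
```

The loss falls when two views of the same trajectory land close together and views of different trajectories land apart. No labels enter at any point.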

The reported result is that, on a battery of classification tasks ranging from MNIST to UCR time series, networks evolved this way reach within 4-9% of fully supervised performance — without ever seeing labels during training. The trajectories themselves, when projected to two dimensions, separate by class with no further nudging.

The claim is bold and deserves replication scrutiny. If it holds, it suggests that representations carrying class structure may be partially recoverable from the architecture’s own dynamics under sufficiently structured self-supervision — a substantially stronger version of the “good features fall out for free” intuition than the literature has so far supported.

Sources: Self-Supervised Evolutionary Learning of Neurodynamics (arXiv:2603.x)

research

Amortized neuron-parameter inference for analog neuromorphic substrates

The chronic obstacle to deploying SNNs on analog neuromorphic chips has been parameter mismatch between digital training and analog hardware. A learned inference network maps from one to the other in milliseconds.

Training an SNN on a GPU and deploying it on analog neuromorphic silicon has always been a translation problem. The digital model thinks in idealized leaky integrate-and-fire neurons; the analog substrate has process variation, temperature drift, and capacitor non-linearity that no two chips share exactly. The standard fix is per-chip calibration that takes hours.

This paper proposes amortizing the calibration. A small inference network is trained, on a population of synthetic chips, to map “digital neuron parameters + a short probe response from this physical chip” to “analog parameters that approximate the digital neuron on this specific chip.” At deployment, the probe runs in milliseconds; the inference network’s output configures the chip; the digital model is then deployed without retraining.
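The amortization pattern can be sketched with a deliberately simple chip model. Everything here is hypothetical: each chip is reduced to a gain and an offset, the probe measures the realized parameter at two known settings, and a degree-2 polynomial least-squares fit stands in for the learned inference network. None of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_chips(n):
    """Hypothetical mismatch model: realized = gain * setting + offset."""
    return rng.uniform(0.8, 1.2, n), rng.uniform(-0.2, 0.2, n)

def probe(gain, offset, noise=0.005):
    """Short probe: noisy realized values at settings 0 and 1."""
    r0 = offset + noise * rng.normal(size=gain.shape)
    r1 = gain + offset + noise * rng.normal(size=gain.shape)
    return np.stack([r0, r1], axis=1)

def features(target, probes):
    """Degree-2 polynomial features of (target, probe0, probe1)."""
    x = np.column_stack([target, probes])
    cross = np.stack(
        [x[:, i] * x[:, j] for i in range(3) for j in range(i, 3)], axis=1
    )
    return np.column_stack([np.ones(len(x)), x, cross])

# Offline: fit the stand-in inference model on a population of synthetic chips.
g, o = make_chips(5000)
target = rng.uniform(0.5, 1.5, 5000)      # desired digital parameter
ideal_setting = (target - o) / g          # known only in simulation
coef, *_ = np.linalg.lstsq(
    features(target, probe(g, o)), ideal_setting, rcond=None
)

# Deployment: unseen chips, one probe and one forward pass each.
g_new, o_new = make_chips(1000)
target_new = rng.uniform(0.5, 1.5, 1000)
setting = features(target_new, probe(g_new, o_new)) @ coef

amortized_err = np.mean(np.abs(g_new * setting + o_new - target_new))
naive_err = np.mean(np.abs(g_new * target_new + o_new - target_new))
print(f"naive: {naive_err:.3f}  amortized: {amortized_err:.3f}")
```

The expensive step, fitting the inference model, happens once on simulated chips. Each physical chip then costs only a probe and a forward pass, which is where the reported time saving would come from.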

The reported per-chip deployment time drops from ~4 hours to under 12 seconds. If reproducible across vendors, this removes one of the largest practical frictions in neuromorphic deployment. The follow-up question is whether the inference network generalizes to a new chip family or has to be retrained per family.

Sources: Amortized Inference of Neuron Parameters (arXiv:2603.x)