Spatial null models¶
The different spatially-constrained null model implementations are the crux of our manuscript. Many of them already had easy-to-use Python interfaces (e.g., Burt-2020 and Moran), while others were easy to translate from open-source MATLAB or R code (e.g., Váša).
Because #reasons, some of the null models were translated / implemented in our lab’s catch-all utility package netneurotools (which you will find is used quite heavily throughout our analysis scripts), while other models (namely, Burt-2018) were implemented in a little helper package developed for this project (parspin).
Here, we provide a brief code snippet highlighting the implementation of each null model used.
We quote relevant text from the manuscript describing the null models.
Note
When brain is used in the code snippets below it refers to a vector of shape (N,) representing the original brain map that is being tested.
When distance is used in the code snippets below it refers to an array of shape (N, N) representing a geodesic distance matrix.
The n_perm variable is the number of permutations / spins / surrogates to be generated (1,000 for simulations, 10,000 for empirical datasets).
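However the nulls below are generated, significance is assessed the same way: the empirical statistic is compared against the distribution of statistics computed from the null maps. As a hypothetical two-sided example for a map correspondence test, assuming a second map brain2 and an (N, n_perm) array nulls of null versions of brain (both names are illustrative, not from our scripts):

import numpy as np
from scipy import stats

# brain2 is a second (hypothetical) brain map; nulls has shape (N, n_perm)
r_emp = stats.pearsonr(brain, brain2)[0]
r_null = np.array([stats.pearsonr(nulls[:, i], brain2)[0]
                   for i in range(n_perm)])
# non-parametric two-sided p-value from the null distribution
p_value = (1 + np.sum(np.abs(r_null) >= np.abs(r_emp))) / (1 + n_perm)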
Naive models¶
Parametric¶
Although the exact implementation of the parametric method varies based on the statistical test employed, all implementations share a reliance on standard null distributions. For example, when examining correlation values, the parametric method relies on the Student’s t-distribution; when examining z-statistics, this method uses the standard normal distribution.
from scipy import stats
# map correspondence: correlate two brain maps directly
rcorr, p_value = stats.pearsonr(brain1, brain2)
# partition specificity: test whether values within one network differ from
# zero (`network` is a boolean mask selecting parcels in a given partition)
tstat, p_value = stats.ttest_1samp(brain[network], 0)
Non-parametric¶
“The naive non-parametric approach uses a random permutation (i.e., reshuffling) of the data to construct a null distribution, destroying its inherent spatial structure. Each vertex or parcel is uniquely reassigned the value of another vertex or parcel for every permutation.”
import numpy as np
rs = np.random.default_rng(1234)
spins = np.column_stack([rs.permutation(len(brain)) for _ in range(n_perm)])
nulls = brain[spins]
Spatial permutation models¶
Vázquez-Rodríguez (Source code)¶
“The Vázquez-Rodríguez method, which serves as a direct adaptation of the original framework from Alexander-Bloch et al., 2018, NeuroImage but applied to parcellated brain data, was first used in Vázquez-Rodríguez et al., 2019, PNAS. In this adaptation, vertex coordinates are replaced with those of parcel centroids. That is, a rotation is applied to the coordinates for the center-of-mass of each parcel, and parcels are reassigned the value of the closest rotated parcel (i.e., that with the minimum Euclidean distance). If the medial wall is rotated into a region of cortex the value of the nearest parcel is assigned instead, ensuring that all parcels have values for every rotation. This method for handling the medial wall consequently permits the duplicate reassignment of parcel values for every rotation, such that some parcel values may not be present in a given rotation and others may appear more than once. Note that the exact method used to define parcel centroids may impact the performance of this model.”
Vertex data¶
from netneurotools import freesurfer, stats
coords, hemi = freesurfer.get_fsaverage_coords('fsaverage5', 'sphere')
spins = stats.gen_spinsamples(coords, hemi, n_rotate=n_perm, seed=1234)
nulls = brain[spins]
Parcellated data¶
from netneurotools import datasets, freesurfer, stats
annot = datasets.fetch_cammoun2012('fsaverage5')['scale500']
coords, hemi = freesurfer.find_parcel_centroids(lhannot=annot.lh,
rhannot=annot.rh,
version='fsaverage5',
surf='sphere',
method='surface')
spins = stats.gen_spinsamples(coords, hemi, n_rotate=n_perm, seed=1234)
nulls = brain[spins]
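To make the rotation-and-reassignment procedure concrete, here is a minimal sketch of a single spin, assuming the coords centroid coordinates from the snippet above; it ignores the medial wall handling and the matched left/right rotations that netneurotools manages internally:

import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import special_ortho_group

rot = special_ortho_group.rvs(3, random_state=1234)  # random 3D rotation matrix
rotated = coords @ rot                               # rotate parcel centroids
spin = cdist(coords, rotated).argmin(axis=1)         # closest rotated parcel
null = brain[spin]                                   # may contain duplicates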
Baum (Source code)¶
“Used initially in Baum et al., 2020, PNAS, this method projects parcellated brain data to a high-resolution surface mesh, assigning identical values to all the vertices within a given parcel. The projected mesh is subjected to the original spatial permutation reassignment procedure (Alexander-Bloch et al., 2018, NeuroImage) and re-parcellated by taking the modal (i.e., the most common) value of the vertices in each parcel. When the rotated medial wall completely subsumes a cortical parcel that region is assigned a value of NaN and is removed from subsequent analyses. Notably, this method can result in duplicate assignment of parcel values in each permutation.”
import numpy as np
from netneurotools import datasets, freesurfer
annot = datasets.fetch_cammoun2012('fsaverage5')['scale500']
spins = freesurfer.spin_parcels(lhannot=annot.lh, rhannot=annot.rh,
                                version='fsaverage5', n_rotate=n_perm,
                                seed=1234)
nulls = brain[spins]
nulls[spins == -1] = np.nan
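As a rough sketch of the projection and modal re-parcellation described above, assuming hypothetical variables labels (each vertex's parcel ID, 1 through N) and rotation (the spun vertex indices for one permutation):

import numpy as np
from scipy import stats

vertex_data = brain[labels - 1]   # project each parcel's value to its vertices
rotated = vertex_data[rotation]   # apply one spatial rotation of the vertices
# re-parcellate by taking the modal value of the rotated vertices per parcel
null = np.array([stats.mode(rotated[labels == p]).mode
                 for p in range(1, len(brain) + 1)])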
Cornblath (Source code)¶
“In this method, implemented by Cornblath et al., Commun Bio, parcellated data are projected to a high-resolution spherical surface mesh, rotated, and re-parcellated by taking the average (i.e., the arithmetic mean) of the vertices in each parcel. When the rotated medial wall completely subsumes a cortical parcel that region is assigned a value of NaN and is removed from subsequent analyses. Because the data are re-parcellated the likelihood of duplicated assignments is very low (though not exactly zero); however, the distribution of re-parcellated values will be slightly different than the original data distribution.”
from netneurotools import datasets, freesurfer
annot = datasets.fetch_cammoun2012('fsaverage5')['scale500']
nulls = freesurfer.spin_data(brain, lhannot=annot.lh, rhannot=annot.rh,
                             version='fsaverage5', n_rotate=n_perm,
                             seed=1234)
Váša (Source code)¶
“The first known application of spatial permutations to parcellated data, the Váša method (Váša et al., 2018, Cereb Cortex) attempted to resolve one of the primary drawbacks of the Alexander-Bloch method: duplicate reassignment of values. That is, this method was created so as to yield a “perfect” permutation of the original data for every rotation. Similar to the Vázquez-Rodríguez method, parcel centroids are used instead of vertex coordinates. In order to avoid duplicate reassignments, parcels are iteratively assigned by (1) finding the closest rotated parcel to each original parcel, and (2) assigning the most distant pair of parcels. This two-step process is then repeated for all remaining unassigned parcels until each has been reassigned. Parcels are reassigned without consideration for the medial wall or its rotated location. Note that the exact method used to define parcel centroids may impact the performance of this model.”
from netneurotools import datasets, freesurfer, stats
annot = datasets.fetch_cammoun2012('fsaverage5')['scale500']
coords, hemi = freesurfer.find_parcel_centroids(lhannot=annot.lh,
rhannot=annot.rh,
version='fsaverage5',
surf='sphere',
method='surface')
spins = stats.gen_spinsamples(coords, hemi, method='vasa',
                              n_rotate=n_perm, seed=1234)
nulls = brain[spins]
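The two-step assignment itself can be sketched in a few lines, reusing coords and a rotated copy of the centroid coordinates (as in the Vázquez-Rodríguez sketch above); this mirrors the description quoted above rather than the exact library internals:

import numpy as np
from scipy.spatial.distance import cdist

dist = cdist(coords, rotated)        # original-to-rotated centroid distances
spin = np.zeros(len(coords), dtype=int)
rows, cols = list(range(len(coords))), list(range(len(coords)))
while rows:
    sub = dist[np.ix_(rows, cols)]
    best = sub.min(axis=1)           # (1) closest rotated parcel per original
    r = int(best.argmax())           # (2) lock in the most distant pair first
    c = int(sub[r].argmin())
    spin[rows[r]] = cols[c]
    del rows[r], cols[c]
null = brain[spin]                   # a "perfect" permutation of the data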
Hungarian (Source code)¶
“Similar to the Váša method, the Hungarian method attempts to uniquely reassign each parcel for every rotation. Instead of using an iterative process, however, which can result in globally sub-optimal assignments, this method uses the Hungarian algorithm to solve a linear sum assignment problem (Kuhn, 1955, Nav Res Logist Q). This method attempts to uniquely reassign each parcel such that the global reassignment cost is minimized, where cost is quantified as the distance between the original and rotated parcel centroid coordinates. The medial wall is ignored in all rotations and the optimal reassignment is determined without consideration for its location. Note that the exact method used to define parcel centroids may impact the performance of this model.”
from netneurotools import datasets, freesurfer, stats
annot = datasets.fetch_cammoun2012('fsaverage5')['scale500']
coords, hemi = freesurfer.find_parcel_centroids(lhannot=annot.lh,
rhannot=annot.rh,
version='fsaverage5',
surf='sphere',
method='surface')
spins = stats.gen_spinsamples(coords, hemi, method='hungarian',
                              n_rotate=n_perm, seed=1234)
nulls = brain[spins]
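Because the core of this method is a standard linear sum assignment, a minimal sketch needs only scipy, again assuming coords and a rotated copy of the centroid coordinates as above:

from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

cost = cdist(coords, rotated)             # original-to-rotated centroid distances
rows, cols = linear_sum_assignment(cost)  # unique, globally optimal assignment
null = brain[cols]                        # every parcel is reassigned exactly once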
Parameterized data models¶
Burt-2018 (Source code)¶
The Burt-2018 model must be run separately on each hemisphere; we show the reference call below assuming data from a single hemisphere.
Note also that the brain data provided to the Burt-2018 method must be positive.
“Described in Burt et al., 2018, Nat Neuro, this framework uses a spatial autoregressive model of the form $y = \rho W y$ to generate surrogate data. Here, $y$ refers to a Box-Cox transformed, mean-centered brain feature of interest (i.e., a brain map), $W$ is a weight matrix (derived from $D$, a matrix of the distance between brain regions, and $d_0$, a spatial autocorrelation factor), and $\rho$ is a spatial lag parameter. The parameters $\rho$ and $d_0$ are derived from the data via a least-squares optimization procedure and their estimates $\hat{\rho}$ and $\hat{d}_0$ are used to generate surrogate brain maps according to $y_{surr} = (I - \hat{\rho} W[\hat{d}_0])^{-1} u$, where $u \sim \mathcal{N}(0, 1)$ is a vector of random Gaussian noise. Rank-ordered values in the $y_{surr}$ map are replaced with corresponding values from the original $y$.”
from parspin import burt
nulls = burt.batch_surrogates(brain, distance, seed=1234)
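For intuition, the generative step of the quoted equation can be sketched directly, assuming hypothetical estimates rho_hat and w_hat (the weight matrix built from the fitted $\hat{d}_0$) are already in hand; parspin handles the actual parameter fitting:

import numpy as np

rng = np.random.default_rng(1234)
u = rng.standard_normal(len(brain))  # u ~ N(0, 1)
# rho_hat and w_hat are hypothetical fitted values, not computed here
y_surr = np.linalg.solve(np.eye(len(brain)) - rho_hat * w_hat, u)
# rank-ordered surrogate values are swapped for values from the original map
y_surr = np.sort(brain)[y_surr.argsort().argsort()]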
Burt-2020¶
The Burt-2020 model must be run separately on each hemisphere; we show the reference call below assuming data from a single hemisphere.
“Two years after introducing their spatial autoregressive method, Burt et al., 2020, NeuroImage proposed a novel model to generate surrogate data using variogram estimation. The method operates in two main steps: (1) randomly permute the values in a given brain map, and (2) smooth and re-scale the permuted values to reintroduce spatial autocorrelation characteristic of the original, non-permuted data. Reintroduction of spatial autocorrelation onto the permuted data is achieved via the transformation $y = |\beta|^{1/2} x' + |\alpha|^{1/2} z$, where $x'$ is the permuted data, $z \sim \mathcal{N}(0, 1)$ is a vector of random Gaussian noise, and $\alpha$ and $\beta$ are estimated via a least-squares optimization between variograms of the original and permuted data. When applied to empirical data, rank-ordered values in the surrogate map are replaced with corresponding values from the original brain map; surrogate maps generated from simulated data use the raw values of $y$.”
Vertex data (Source code)¶
import numpy as np
from brainsmash import mapgen
# brainsmash expects the distance matrix sorted row-wise plus the sort index
index = np.argsort(distance, axis=-1)
distance = np.sort(distance, axis=-1)
nulls = mapgen.Sampled(brain, distance, index, seed=1234)(n_perm).T
Parcellated data (Source code)¶
from brainsmash import mapgen
nulls = mapgen.Base(brain, distance, seed=1234)(n_perm).T
Note: for analyses of empirical data in our manuscript we also provided the resample=True parameter.
Surrogate maps for simulated data were always generated with resample=False (the default).
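The two-step logic of the quoted equation is easy to sketch, assuming hypothetical alpha and beta values already fit to the variograms (brainsmash performs this fit, and the smoothing of the permuted map, internally):

import numpy as np

rng = np.random.default_rng(1234)
x_perm = rng.permutation(brain)      # step (1): permute the brain map
z = rng.standard_normal(len(brain))  # z ~ N(0, 1)
# step (2): re-scale to reintroduce the original spatial autocorrelation;
# alpha and beta are hypothetical values from the variogram fit
y = np.abs(beta) ** 0.5 * x_perm + np.abs(alpha) ** 0.5 * z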
Moran (Source code)¶
The Moran model must be run separately on each hemisphere; we show the reference call below assuming data from a single hemisphere.
“Originally developed in the ecology literature (Dray et al., 2011, Geogr Anal; Wagner et al., 2015, Methods Ecol Evol), Moran spectral randomization (MSR) has only been recently applied to neuroimaging data (Paquola et al., 2020, PLoS Biol, vos de Wael et al., Comm Bio, Royer et al., 2020, NeuroImage). Similar to the other parameterized data methods, MSR principally relies on a spatially-informed weight matrix W, usually taking the form of an inverse distance matrix between brain regions. However, rather than using W to estimate parameters via a least-squares approach, MSR uses an eigendecomposition of W to compute spatial eigenvectors that provide an estimate of autocorrelation. These eigenvectors are then used to impose a similar spatial structure on random, normally distributed surrogate data.”
import numpy as np
from brainspace import null_models
# convert the geodesic distance matrix to an inverse-distance weight matrix
np.fill_diagonal(distance, 1)  # avoid dividing by zero on the diagonal
distance **= -1
mrs = null_models.MoranRandomization(joint=True, n_rep=n_perm, tol=1e-6,
                                     random_state=1234)
nulls = mrs.fit(distance).randomize(brain).T
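For intuition, the "singleton" flavor of MSR can be sketched in a few lines. This is a simplified illustration only (it omits the eigenvalue filtering and tolerance handling that BrainSpace performs), assuming weights is the symmetric inverse-distance matrix built above:

import numpy as np

def msr_singleton(brain, weights, n_surr, seed=1234):
    n = len(brain)
    rng = np.random.default_rng(seed)
    # Moran eigenvector maps: eigenvectors of the doubly-centered weight matrix
    H = np.eye(n) - 1 / n  # centering matrix
    evecs = np.linalg.eigh(H @ weights @ H)[1]
    # loadings of the mean-centered data onto each spatial eigenvector
    loads = evecs.T @ (brain - brain.mean())
    # flipping loading signs at random preserves the squared loadings, and thus
    # the spatial autocorrelation spectrum, in every surrogate map
    signs = rng.choice([-1.0, 1.0], size=(n_surr, n))
    return evecs @ (signs * loads).T + brain.mean()  # shape (N, n_surr)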