A sequential generalization of a one-sample cluster-based permutation test
(as described by [4]) or of TFCE (as described by [6]).
Distributes Type I error over multiple, sequential analyses of the data (at
interim sample sizes specified in look_times, never to exceed n_max)
using a permutation-based adaptation of the alpha-spending procedure introduced
by Lan and DeMets [1]. This allows data collection to be terminated before
n_max is reached if there is enough evidence to reject the null hypothesis
at an interim analysis, without inflating the false positive rate. This provides
a principled way to determine sample size and can result in substantial
efficiency gains over a fixed-sample design (i.e. it can achieve the same
statistical power with a smaller expected sample size) [2, 3].
Parameters:
X (array, shape (n_observations, p[, q][, r])) – The data to be analyzed. The first dimension of the array is the number of
observations; remaining dimensions comprise the size of a single observation.
Observations must appear in the order in which they were collected.
Note that the last dimension of X should correspond to the dimension
represented in the adjacency parameter (e.g., spectral data should be
provided as (observations, frequencies, channels/vertices)).
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order.
Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the
null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the
mean of the data is greater than 0 (upper tailed test). If tail is 0,
the alternative hypothesis is that the mean of the data is different
than 0 (two tailed test). If tail is -1, the alternative hypothesis
is that the mean of the data is less than 0 (lower tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This
defines a monotonically increasing function such that f(0) = 0 and
f(n_max) = alpha, determining how Type I error is distributed over
sequential analyses. See [2, 3] for details and provided spending functions
in niseq.spending_functions module.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default
verbosity level. See the logging documentation and
mne.verbose() for details. Should only be passed as a keyword
argument.
threshold (float | dict | None, default: None) – The so-called “cluster forming threshold” in the form of a test statistic
(note: this is not an alpha level / “p-value”).
If numeric, vertices with data values more extreme than threshold will
be used to form clusters. If None, threshold will be chosen
automatically to correspond to a p-value of 0.05 for the given number of
observations (only valid when using default statistic). If threshold is
a dict (with keys 'start' and 'step') then threshold-free
cluster enhancement (TFCE) will be used (see TFCE example and [6]).
n_permutations (int, default: 1024) – Number of permutations.
stat_fun (callable | None) – Function called to calculate the test statistic. Must accept 1D-array as
input and return a 1D array. If None (the default), uses
mne.stats.ttest_1samp_no_p.
adjacency (scipy.sparse.spmatrix | None | False) – Defines adjacency between locations in the data, where “locations” can be
spatial vertices, frequency bins, time points, etc. For spatial vertices,
see: mne.channels.find_ch_adjacency(). If False, assumes
no adjacency (each location is treated as independent and unconnected).
If None, a regular lattice adjacency is assumed, connecting
each location to its neighbor(s) along the last dimension
of X (or the last two dimensions if X is 2D).
If adjacency is a matrix, it is assumed to be symmetric (only the
upper triangular half is used) and must be square with dimension equal to
X.shape[-1] (for 2D data) or X.shape[-1]*X.shape[-2]
(for 3D data) or (optionally)
X.shape[-1]*X.shape[-2]*X.shape[-3]
(for 4D data). The function mne.stats.combine_adjacency may be useful for 4D data.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set
to the number of CPU cores. Requires the joblib package.
None (default) is a marker for ‘unset’ that will be interpreted
as n_jobs=1 (sequential execution) unless the call is performed under
a joblib.parallel_backend() context manager that sets another
value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default),
the seed will be obtained from the operating system
(see RandomState for details), meaning it will most
likely produce different output every time this function or method is run.
To achieve reproducible results, pass a value here to explicitly initialize
the RNG with a defined state.
max_step (int) – Maximum distance between samples along the second axis of X to be
considered adjacent (typically the second axis is the “time” dimension).
Only used when adjacency has shape (n_vertices, n_vertices), that is,
when adjacency is only specified for sensors (e.g., via
mne.channels.find_ch_adjacency()), and not via sensors and
further dimensions such as time points (e.g., via an additional call of
mne.stats.combine_adjacency()).
exclude (bool array or None) – Mask to apply to the data to exclude certain points from clustering
(e.g., medial wall vertices). Should be the same shape as X.
If None, no points are excluded.
t_power (float) – Power to raise the statistical values (usually t-values) by before
summing (sign will be retained). Note that t_power=0 will give a
count of locations in each cluster, t_power=1 will weight each location
by its statistical score.
out_type ('mask' | 'indices') – Output format of clusters within a list.
If 'mask', returns a list of boolean arrays,
each with the same shape as the input data (or slices if the shape is 1D
and adjacency is None), with True values indicating locations that are
part of a cluster. If 'indices', returns a list of tuple of ndarray,
where each ndarray contains the indices of locations that together form the
given cluster along the given dimension. Note that for large datasets,
'indices' may use far less memory than 'mask'.
Default is 'indices'.
check_disjoint (bool) – Whether to check if the connectivity matrix can be separated into disjoint
sets before clustering. This may lead to faster clustering, especially if
the second dimension of X (usually the “time” dimension) is large.
Returns:
looks (dict) – Dictionary containing results of each look at the data, indexed by the
values provided in look_times. Each entry of the dictionary is a tuple
that contains:
obs : array, shape (p[, q][, r])
Statistic observed for all variables.
clusters : list
List type defined by out_type above.
cluster_pv : array
P-value for each cluster.
H0 : array, shape (n_permutations,)
Max cluster level stats observed under permutation.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These
can be compared to adj_alphas to determine on which looks, if any, one
can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the
false positive rate across multiple, sequential analyses. All p-values
should be compared to the adjusted alpha for the look at which they were
computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
Notes
The number and timing of looks at the data need not be planned in advance
(other than n_max), but it is important to include all looks that have
already occurred in look_times each time you analyze the data, to ensure that
valid adjusted significance thresholds are computed. In your final analysis,
look_times should contain the ordered sample sizes of all looks at the data
that occurred during the study.
When reporting results, you should minimally include the sample sizes at each
look, the minimum p-values at each look, the adjusted significance thresholds
for each look (to which the p-values are compared), and the value of the
alpha-spending function at each look. See [3] for further recommendations.
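A minimal usage sketch with simulated data. The one-sample variant's exact import path is not shown above; sequential_cluster_test_1samp under niseq.cluster_test is assumed here, mirroring the _corr variant documented below:

import numpy as np
from niseq.cluster_test import sequential_cluster_test_1samp  # assumed name/path

rng = np.random.default_rng(0)
n_max = 60
X = rng.normal(0.1, 1.0, size=(n_max, 40, 32))  # observations x times x channels, in collection order

# Final analysis: look_times lists every look that occurred, in order.
looks, ps, adj_alphas, spending = sequential_cluster_test_1samp(
    X, look_times=[20, 40, 60], n_max=n_max, alpha=0.05, tail=0, seed=0
)

# A look rejects the null if its smallest cluster p-value is at or below its adjusted alpha.
for n, p, a in zip([20, 40, 60], ps, adj_alphas):
    print(f"n={n}: min p={p:.3f}, adjusted alpha={a:.4f}, reject={p <= a}")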
References
niseq.cluster_test.sequential_cluster_test_corr(X, y, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)
A sequential cluster test for correlations.
A sequential generalization of a cluster-based permutation test
(as described by [4]) or of TFCE (as described by [6]) for testing a
relationship between X and a continuous variable y. Uses Pearson
correlation by default (or its z-transform if using TFCE), but the test
statistic can be modified.
Distributes Type I error over multiple, sequential analyses of the data (at
interim sample sizes specified in look_times, never to exceed n_max)
using a permutation-based adaptation of the alpha-spending procedure introduced
by Lan and DeMets [1]. This allows data collection to be terminated before
n_max is reached if there is enough evidence to reject the null hypothesis
at an interim analysis, without inflating the false positive rate. This provides
a principled way to determine sample size and can result in substantial
efficiency gains over a fixed-sample design (i.e. it can achieve the same
statistical power with a smaller expected sample size) [2, 3].
Parameters:
X (array, shape (n_observations, p[, q][, r])) – The data to be analyzed. The first dimension of the array is the number of
observations; remaining dimensions comprise the size of a single observation.
Observations must appear in the order in which they were collected.
Note that the last dimension of X should correspond to the dimension
represented in the adjacency parameter (e.g., spectral data should be
provided as (observations, frequencies, channels/vertices)).
y (array, shape (n_observations,)) – Value of dependent variable associated with each observation in X.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order.
Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the
null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the
correlation is greater than 0 (upper tailed test). If tail is 0,
the alternative hypothesis is that the correlation is different
than 0 (two tailed test). If tail is -1, the alternative hypothesis
is that the correlation is less than 0 (lower tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This
defines a monotonically increasing function such that f(0) = 0 and
f(n_max) = alpha, determining how Type I error is distributed over
sequential analyses. See [2, 3] for details and provided spending functions
in niseq.spending_functions module.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default
verbosity level. See the logging documentation and
mne.verbose() for details. Should only be passed as a keyword
argument.
threshold (float | dict | None, default: None) – The so-called “cluster forming threshold” in the form of a test statistic
(note: this is not an alpha level / “p-value”).
If numeric, vertices with data values more extreme than threshold will
be used to form clusters. If None, threshold will be chosen
automatically to correspond to a p-value of 0.05 for the given number of
observations (only valid when using default statistic). If threshold is
a dict (with keys 'start' and 'step') then threshold-free
cluster enhancement (TFCE) will be used (see TFCE example and [6]).
n_permutations (int, default: 1024) – Number of permutations.
stat_fun (callable() | None, default: None) – Function called to calculate the test statistic. Must accept 1D-array as
input and return a 1D array. If None (the default), computes Pearson
correlation.
adjacency (scipy.sparse.spmatrix | None | False) – Defines adjacency between locations in the data, where “locations” can be
spatial vertices, frequency bins, time points, etc. For spatial vertices,
see: mne.channels.find_ch_adjacency(). If False, assumes
no adjacency (each location is treated as independent and unconnected).
If None, a regular lattice adjacency is assumed, connecting
each location to its neighbor(s) along the last dimension
of X (or the last two dimensions if X is 2D).
If adjacency is a matrix, it is assumed to be symmetric (only the
upper triangular half is used) and must be square with dimension equal to
X.shape[-1] (for 2D data) or X.shape[-1]*X.shape[-2]
(for 3D data) or (optionally)
X.shape[-1]*X.shape[-2]*X.shape[-3]
(for 4D data). The function mne.stats.combine_adjacency may be useful for 4D data.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set
to the number of CPU cores. Requires the joblib package.
None (default) is a marker for ‘unset’ that will be interpreted
as n_jobs=1 (sequential execution) unless the call is performed under
a joblib.parallel_backend() context manager that sets another
value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default),
the seed will be obtained from the operating system
(see RandomState for details), meaning it will most
likely produce different output every time this function or method is run.
To achieve reproducible results, pass a value here to explicitly initialize
the RNG with a defined state.
max_step (int) – Maximum distance between samples along the second axis of X to be
considered adjacent (typically the second axis is the “time” dimension).
Only used when adjacency has shape (n_vertices, n_vertices), that is,
when adjacency is only specified for sensors (e.g., via
mne.channels.find_ch_adjacency()), and not via sensors and
further dimensions such as time points (e.g., via an additional call of
mne.stats.combine_adjacency()).
exclude (bool array or None) – Mask to apply to the data to exclude certain points from clustering
(e.g., medial wall vertices). Should be the same shape as X.
If None, no points are excluded.
t_power (float) – Power to raise the statistical values (usually F-values) by before
summing (sign will be retained). Note that t_power=0 will give a
count of locations in each cluster, t_power=1 will weight each location
by its statistical score.
out_type ('mask' | 'indices') – Output format of clusters within a list.
If 'mask', returns a list of boolean arrays,
each with the same shape as the input data (or slices if the shape is 1D
and adjacency is None), with True values indicating locations that are
part of a cluster. If 'indices', returns a list of tuple of ndarray,
where each ndarray contains the indices of locations that together form the
given cluster along the given dimension. Note that for large datasets,
'indices' may use far less memory than 'mask'.
Default is 'indices'.
check_disjoint (bool) – Whether to check if the connectivity matrix can be separated into disjoint
sets before clustering. This may lead to faster clustering, especially if
the second dimension of X (usually the “time” dimension) is large.
Returns:
looks (dict) – Dictionary containing results of each look at the data, indexed by the
values provided in look_times. Each entry of the dictionary is a tuple
that contains:
obs : array, shape (p[, q][, r])
Statistic observed for all variables.
clusters : list
List type defined by out_type above.
cluster_pv : array
P-value for each cluster.
H0 : array, shape (n_permutations,)
Max cluster level stats observed under permutation.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These
can be compared to adj_alphas to determine on which looks, if any, one
can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the
false positive rate across multiple, sequential analyses. All p-values
should be compared to the adjusted alpha for the look at which they were
computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
Notes
The number and timing of looks at the data need not be planned in advance
(other than n_max), but it is important to include all looks that have
already occurred in look_times each time you analyze the data, to ensure that
valid adjusted significance thresholds are computed. In your final analysis,
look_times should contain the ordered sample sizes of all looks at the data
that occurred during the study.
When reporting results, you should minimally include the sample sizes at each
look, the minimum p-values at each look, the adjusted significance thresholds
for each look (to which the p-values are compared), and the value of the
alpha-spending function at each look. See [3] for further recommendations.
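For example, using the signature shown above with simulated data:

import numpy as np
from niseq.cluster_test import sequential_cluster_test_corr

rng = np.random.default_rng(1)
n_max = 80
X = rng.normal(size=(n_max, 50, 20))   # observations x times x channels, in collection order
y = rng.normal(size=n_max)             # continuous variable, one value per observation

looks, ps, adj_alphas, spending = sequential_cluster_test_corr(
    X, y, look_times=[40, 80], n_max=n_max, alpha=0.05, tail=0, seed=1
)
rejected = ps <= adj_alphas            # per-look rejection decisions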
A sequential generalization of an independent-sample cluster-based
permutation test (as described by [4]) or of TFCE (as described by [6]).
Distributes Type I error over multiple, sequential analyses of the data (at
interim sample sizes specified in look_times, never to exceed n_max)
using a permutation-based adaptation of the alpha-spending procedure introduced
by Lan and DeMets [1]. This allows data collection to be terminated before
n_max is reached if there is enough evidence to reject the null hypothesis
at an interim analysis, without inflating the false positive rate. This provides
a principled way to determine sample size and can result in substantial
efficiency gains over a fixed-sample design (i.e. it can achieve the same
statistical power with a smaller expected sample size) [2, 3].
Parameters:
X (array, shape (n_observations, p[, q][, r])) – The data to be analyzed. The first dimension of the array is the number of
observations; remaining dimensions comprise the size of a single observation.
Observations must appear in the order in which they were collected.
Note that the last dimension of X should correspond to the dimension
represented in the adjacency parameter (e.g., spectral data should be
provided as (observations, frequencies, channels/vertices)).
labels (array, shape (n_observations,)) – Condition label associated with each observation in X.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order.
Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the
null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)
tail (-1 or 0 or 1, default: 0) – If tail is 1, the statistic is thresholded above threshold.
If tail is -1, the statistic is thresholded below threshold.
If tail is 0, the statistic is thresholded on both sides of
the distribution.
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This
defines a monotonically increasing function such that f(0) = 0 and
f(n_max) = alpha, determining how Type I error is distributed over
sequential analyses. See [2, 3] for details and provided spending functions
in niseq.spending_functions module.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default
verbosity level. See the logging documentation and
mne.verbose() for details. Should only be passed as a keyword
argument.
threshold (float | dict | None, default: None) – The so-called “cluster forming threshold” in the form of a test statistic
(note: this is not an alpha level / “p-value”).
If numeric, vertices with data values more extreme than threshold will
be used to form clusters. If None, threshold will be chosen
automatically to correspond to a p-value of 0.05 for the given number of
observations (only valid when using default statistic). If threshold is
a dict (with keys 'start' and 'step') then threshold-free
cluster enhancement (TFCE) will be used (see TFCE example and [6]).
n_permutations (int, default: 1024) – Number of permutations.
stat_fun (callable | None) – Function called to calculate the test statistic. Must accept 1D-array as
input and return a 1D array. If None (the default), uses
mne.stats.f_oneway.
adjacency (scipy.sparse.spmatrix | None | False) – Defines adjacency between locations in the data, where “locations” can be
spatial vertices, frequency bins, time points, etc. For spatial vertices,
see: mne.channels.find_ch_adjacency(). If False, assumes
no adjacency (each location is treated as independent and unconnected).
If None, a regular lattice adjacency is assumed, connecting
each location to its neighbor(s) along the last dimension
of X (or the last two dimensions if X is 2D).
If adjacency is a matrix, it is assumed to be symmetric (only the
upper triangular half is used) and must be square with dimension equal to
X.shape[-1] (for 2D data) or X.shape[-1]*X.shape[-2]
(for 3D data) or (optionally)
X.shape[-1]*X.shape[-2]*X.shape[-3]
(for 4D data). The function mne.stats.combine_adjacency may be useful for 4D data.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set
to the number of CPU cores. Requires the joblib package.
None (default) is a marker for ‘unset’ that will be interpreted
as n_jobs=1 (sequential execution) unless the call is performed under
a joblib.parallel_backend() context manager that sets another
value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default),
the seed will be obtained from the operating system
(see RandomState for details), meaning it will most
likely produce different output every time this function or method is run.
To achieve reproducible results, pass a value here to explicitly initialize
the RNG with a defined state.
max_step (int) – Maximum distance between samples along the second axis of X to be
considered adjacent (typically the second axis is the “time” dimension).
Only used when adjacency has shape (n_vertices, n_vertices), that is,
when adjacency is only specified for sensors (e.g., via
mne.channels.find_ch_adjacency()), and not via sensors and
further dimensions such as time points (e.g., via an additional call of
mne.stats.combine_adjacency()).
exclude (bool array or None) – Mask to apply to the data to exclude certain points from clustering
(e.g., medial wall vertices). Should be the same shape as X.
If None, no points are excluded.
t_power (float) – Power to raise the statistical values (usually F-values) by before
summing (sign will be retained). Note that t_power=0 will give a
count of locations in each cluster, t_power=1 will weight each location
by its statistical score.
out_type ('mask' | 'indices') – Output format of clusters within a list.
If 'mask', returns a list of boolean arrays,
each with the same shape as the input data (or slices if the shape is 1D
and adjacency is None), with True values indicating locations that are
part of a cluster. If 'indices', returns a list of tuple of ndarray,
where each ndarray contains the indices of locations that together form the
given cluster along the given dimension. Note that for large datasets,
'indices' may use far less memory than 'mask'.
Default is 'indices'.
check_disjoint (bool) – Whether to check if the connectivity matrix can be separated into disjoint
sets before clustering. This may lead to faster clustering, especially if
the second dimension of X (usually the “time” dimension) is large.
Returns:
looks (dict) – Dictionary containing results of each look at the data, indexed by the
values provided in look_times. Each entry of the dictionary is a tuple
that contains:
obs : array, shape (p[, q][, r])
Statistic observed for all variables.
clusters : list
List type defined by out_type above.
cluster_pv : array
P-value for each cluster.
H0 : array, shape (n_permutations,)
Max cluster level stats observed under permutation.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These
can be compared to adj_alphas to determine on which looks, if any, one
can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the
false positive rate across multiple, sequential analyses. All p-values
should be compared to the adjusted alpha for the look at which they were
computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
Notes
The number and timing of looks at the data need not be planned in advance
(other than n_max), but it is important to include all looks that have
already occurred in look_times each time you analyze the data, to ensure that
valid adjusted significance thresholds are computed. In your final analysis,
look_times should contain the ordered sample sizes of all looks at the data
that occurred during the study.
When reporting results, you should minimally include the sample sizes at each
look, the minimum p-values at each look, the adjusted significance thresholds
for each look (to which the p-values are compared), and the value of the
alpha-spending function at each look. See [3] for further recommendations.
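A sketch of the independent-sample variant using TFCE. The function name sequential_cluster_test_indep is an assumption (only the _corr variant's full path is shown above); the threshold dict follows the documented 'start'/'step' convention:

import numpy as np
from niseq.cluster_test import sequential_cluster_test_indep  # assumed name/path

rng = np.random.default_rng(2)
n_max = 100
X = rng.normal(size=(n_max, 60, 16))        # observations x times x channels
labels = rng.integers(0, 2, size=n_max)     # two-group condition labels, in collection order

looks, ps, adj_alphas, spending = sequential_cluster_test_indep(
    X, labels, look_times=[50, 100], n_max=n_max, alpha=0.05,
    threshold=dict(start=0, step=0.2),      # dict with 'start' and 'step' requests TFCE
    seed=2,
)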
One-sample sequential permutation test with max-type correction.
This is a sequential generalization of
mne.stats.permutations.permutation_t_test.
Uses max-type correction for multiple comparisons [4].
Distributes Type I error over multiple, sequential analyses of the data (at
interim sample sizes specified in look_times, never to exceed n_max)
using a permutation-based adaptation of the alpha-spending procedure introduced
by Lan and DeMets [1]. This allows data collection to be terminated before
n_max is reached if there is enough evidence to reject the null hypothesis
at an interim analysis, without inflating the false positive rate. This provides
a principled way to determine sample size and can result in substantial
efficiency gains over a fixed-sample design (i.e. it can achieve the same
statistical power with a smaller expected sample size) [2, 3].
Parameters:
X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of
observations; remaining dimensions comprise the size of a single observation.
Observations must appear in the order in which they were collected.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order.
Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the
null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the
mean of the data is greater than 0 (upper tailed test). If tail is 0,
the alternative hypothesis is that the mean of the data is different
than 0 (two tailed test). If tail is -1, the alternative hypothesis
is that the mean of the data is less than 0 (lower tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This
defines a monotonically increasing function such that f(0) = 0 and
f(n_max) = alpha, determining how Type I error is distributed over
sequential analyses. See [2, 3] for details and provided spending functions
in niseq.spending_functions module.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default
verbosity level. See the logging documentation and
mne.verbose() for details. Should only be passed as a keyword
argument.
n_permutations (int, default: 1024) – Number of permutations.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set
to the number of CPU cores. Requires the joblib package.
None (default) is a marker for ‘unset’ that will be interpreted
as n_jobs=1 (sequential execution) unless the call is performed under
a joblib.parallel_backend() context manager that sets another
value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default),
the seed will be obtained from the operating system
(see RandomState for details), meaning it will most
likely produce different output every time this function or method is run.
To achieve reproducible results, pass a value here to explicitly initialize
the RNG with a defined state.
Returns:
looks (dict) – Dictionary containing results of each look at the data, indexed by the
values provided in look_times. Each entry of the dictionary is a tuple
that contains:
obs : array of shape (p[, q][, r])
Test statistic observed for all variables.
p_values : array of shape (p[, q][, r])
P-values for all the tests (a.k.a. variables).
H0 : array of shape (n_permutations,)
Max test statistics obtained by permutations.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These
can be compared to adj_alphas to determine on which looks, if any, one
can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the
false positive rate across multiple, sequential analyses. All p-values
should be compared to the adjusted alpha for the look at which they were
computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
Notes
The number and timing of looks at the data need not be planned in advance
(other than n_max), but it is important to include all looks that have
already occurred in look_times each time you analyze the data, to ensure that
valid adjusted significance thresholds are computed. In your final analysis,
look_times should contain the ordered sample sizes of all looks at the data
that occurred during the study.
When reporting results, you should minimally include the sample sizes at each
look, the minimum p-values at each look, the adjusted significance thresholds
for each look (to which the p-values are compared), and the value of the
alpha-spending function at each look. See [3] for further recommendations.
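A minimal sketch. The one-sample test's exact import path is not shown above; sequential_permutation_t_test_1samp under niseq.max_test is assumed here, mirroring the _corr variant documented below:

import numpy as np
from niseq.max_test import sequential_permutation_t_test_1samp  # assumed name/path

rng = np.random.default_rng(3)
n_max = 50
X = rng.normal(0.2, 1.0, size=(n_max, 30))   # observations x variables, in collection order

looks, ps, adj_alphas, spending = sequential_permutation_t_test_1samp(
    X, look_times=[25, 50], n_max=n_max, alpha=0.05, tail=0, seed=3
)
obs, p_values, H0 = looks[25]                # per-variable results at the first look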
References
niseq.max_test.sequential_permutation_test_corr(X, y, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)
A sequential permutation test for correlations with a max-type correction.
Tests for a relationship between X and a continuous independent variable
y. Uses Pearson correlation by default, but the test statistic can be
modified.
Uses max-type correction for multiple comparisons [4].
Distributes Type I error over multiple, sequential analyses of the data (at
interim sample sizes specified in look_times, never to exceed n_max)
using a permutation-based adaptation of the alpha-spending procedure introduced
by Lan and DeMets [1]. This allows data collection to be terminated before
n_max is reached if there is enough evidence to reject the null hypothesis
at an interim analysis, without inflating the false positive rate. This provides
a principled way to determine sample size and can result in substantial
efficiency gains over a fixed-sample design (i.e. it can achieve the same
statistical power with a smaller expected sample size) [2, 3].
Parameters:
X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of
observations; remaining dimensions comprise the size of a single observation.
Observations must appear in the order in which they were collected.
y (array, shape (n_observations,)) – Value of dependent variable associated with each observation in X.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order.
Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the
null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the
correlation is greater than 0 (upper tailed test). If tail is 0,
the alternative hypothesis is that the correlation is different
than 0 (two tailed test). If tail is -1, the alternative hypothesis
is that the correlation is less than 0 (lower tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This
defines a monotonically increasing function such that f(0) = 0 and
f(n_max) = alpha, determining how Type I error is distributed over
sequential analyses. See [2, 3] for details and provided spending functions
in niseq.spending_functions module.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default
verbosity level. See the logging documentation and
mne.verbose() for details. Should only be passed as a keyword
argument.
n_permutations (int, default: 1024) – Number of permutations.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set
to the number of CPU cores. Requires the joblib package.
None (default) is a marker for ‘unset’ that will be interpreted
as n_jobs=1 (sequential execution) unless the call is performed under
a joblib.parallel_backend() context manager that sets another
value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default),
the seed will be obtained from the operating system
(see RandomState for details), meaning it will most
likely produce different output every time this function or method is run.
To achieve reproducible results, pass a value here to explicitly initialize
the RNG with a defined state.
Returns:
looks (dict) – Dictionary containing results of each look at the data, indexed by the
values provided in look_times. Each entry of the dictionary is a tuple
that contains:
obs : array of shape (p[, q][, r])
Test statistic observed for all variables.
p_values : array of shape (p[, q][, r])
P-values for all the tests (a.k.a. variables).
H0 : array of shape (n_permutations,)
Max test statistics obtained by permutations.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These
can be compared to adj_alphas to determine on which looks, if any, one
can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the
false positive rate across multiple, sequential analyses. All p-values
should be compared to the adjusted alpha for the look at which they were
computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
Notes
The number and timing of looks at the data need not be planned in advance
(other than n_max), but it is important to include all looks that have
already occurred in look_times each time you analyze the data, to ensure that
valid adjusted significance thresholds are computed. In your final analysis,
look_times should contain the ordered sample sizes of all looks at the data
that occurred during the study.
When reporting results, you should minimally include the sample sizes at each
look, the minimum p-values at each look, the adjusted significance thresholds
for each look (to which the p-values are compared), and the value of the
alpha-spending function at each look. See [3] for further recommendations.
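For example, using the signature shown above with a one-sided alternative:

import numpy as np
from niseq.max_test import sequential_permutation_test_corr

rng = np.random.default_rng(4)
n_max = 60
X = rng.normal(size=(n_max, 64))     # observations x channels, in collection order
y = rng.normal(size=n_max)           # continuous variable, one value per observation

looks, ps, adj_alphas, spending = sequential_permutation_test_corr(
    X, y, look_times=[30, 60], n_max=n_max, alpha=0.05, tail=1, seed=4
)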
Independent-sample sequential permutation test with max-type correction.
By default, this is a sequential generalization of an independent-sample
max-t procedure for two groups, or of a max-F procedure for more than two
groups.
Uses max-type correction for multiple comparisons [4].
Distributes Type I error over multiple, sequential analyses of the data (at
interim sample sizes specified in look_times, never to exceed n_max)
using a permutation-based adaptation of the alpha-spending procedure introduced
by Lan and DeMets [1]. This allows data collection to be terminated before
n_max is reached if there is enough evidence to reject the null hypothesis
at an interim analysis, without inflating the false positive rate. This provides
a principled way to determine sample size and can result in substantial
efficiency gains over a fixed-sample design (i.e. it can achieve the same
statistical power with a smaller expected sample size) [2, 3].
Parameters:
X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of
observations; remaining dimensions comprise the size of a single observation.
Observations must appear in the order in which they were collected.
labels (array, shape (n_observations,)) – Condition label associated with each observation in X.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order.
Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the
null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the
mean of the data is greater than 0 (upper tailed test). If tail is 0,
the alternative hypothesis is that the mean of the data is different
than 0 (two tailed test). If tail is -1, the alternative hypothesis
is that the mean of the data is less than 0 (lower tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This
defines a monotonically increasing function such that f(0) = 0 and
f(n_max) = alpha, determining how Type I error is distributed over
sequential analyses. See [2, 3] for details and provided spending functions
in niseq.spending_functions module.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default
verbosity level. See the logging documentation and
mne.verbose() for details. Should only be passed as a keyword
argument.
n_permutations (int, default: 1024) – Number of permutations.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set
to the number of CPU cores. Requires the joblib package.
None (default) is a marker for ‘unset’ that will be interpreted
as n_jobs=1 (sequential execution) unless the call is performed under
a joblib.parallel_backend() context manager that sets another
value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default),
the seed will be obtained from the operating system
(see RandomState for details), meaning it will most
likely produce different output every time this function or method is run.
To achieve reproducible results, pass a value here to explicitly initialize
the RNG with a defined state.
Returns:
looks (dict) – Dictionary containing results of each look at the data, indexed by the
values provided in look_times. Each entry of the dictionary is a tuple
that contains:
obs : array of shape (p[, q][, r])
Test statistic observed for all variables.
p_values : array of shape (p[, q][, r])
P-values for all the tests (a.k.a. variables).
H0 : array of shape (n_permutations,)
Max test statistics obtained by permutations.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These
can be compared to adj_alphas to determine on which looks, if any, one
can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the
false positive rate across multiple, sequential analyses. All p-values
should be compared to the adjusted alpha for the look at which they were
computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
Notes
The number and timing of looks at the data need not be planned in advance
(other than n_max), but it is important to include all looks that have
already occurred in look_times each time you analyze the data, to ensure that
valid adjusted significance thresholds are computed. In your final analysis,
look_times should contain the ordered sample sizes of all looks at the data
that occurred during the study.
When reporting results, you should minimally include the sample sizes at each
look, the minimum p-values at each look, the adjusted significance thresholds
for each look (to which the p-values are compared), and the value of the
alpha-spending function at each look. See [3] for further recommendations.
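A sketch of the independent-sample variant, printing the per-look quantities the Notes above ask you to report. The name sequential_permutation_test_indep is an assumption (only the _corr variant's full path is shown above):

import numpy as np
from niseq.max_test import sequential_permutation_test_indep  # assumed name/path

rng = np.random.default_rng(5)
n_max = 90
X = rng.normal(size=(n_max, 40))            # observations x variables, in collection order
labels = rng.integers(0, 3, size=n_max)     # three groups -> max-F statistic by default

look_times = [45, 90]
looks, ps, adj_alphas, spending = sequential_permutation_test_indep(
    X, labels, look_times=look_times, n_max=n_max, alpha=0.05, seed=5
)
for n, p, a, s in zip(look_times, ps, adj_alphas, spending):
    print(f"n={n}: min p={p:.3f}, adjusted alpha={a:.4f}, alpha spent={s:.4f}")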
The O’Brien-Fleming spending function.
A common choice for clinical trials or other confirmatory research, this
spending function is conservative for early analyses, saving more power
for later in the study.
A piecewise spending function for adjusting the maximum sample size.
A piecewise spending function to be used when adjusting your maximum sample
size. That is, the old spending function old_spending_func is used up until
break_n, the interim sample size at which you decided to change the
maximum sample size. After that, a linear function is used that goes from
old_spending_func(break_n) to (new_max_n, alpha).
This is useful if, for instance, (1) you accidentally collect more data than
your original max_n, requiring you to adjust your spending function,
(2) a conditional power analysis encourages you to change your sample
size to achieve a desired Type II error rate, or (3) you can no longer
collect your original max_n for practical reasons.
If max_n is adjusted multiple times, you can create piecewise spending
functions recursively.
old_spending_func (instance of SpendingFunction) – An initialized instance of a SpendingFunction subclass with
old_spending_func.max_n greater than break_n.
break_n (int) – The interim sample size at which the researcher decides to adjust their
maximum sample size.
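A sketch of adjusting the maximum sample size mid-study. The class names (PiecewiseSpendingFunction, LinearSpendingFunction) and the constructor arguments (max_n, alpha, new_max_n) are assumptions based only on the parameters documented above; check niseq.spending_functions for the exact signatures:

from niseq.spending_functions import LinearSpendingFunction, PiecewiseSpendingFunction  # assumed names

# Original plan: stop at max_n = 60, spending alpha linearly across looks (assumed constructor).
old_func = LinearSpendingFunction(max_n=60, alpha=0.05)

# At an interim n of 40 we decide to extend the study to a new maximum of 90 (assumed constructor).
new_func = PiecewiseSpendingFunction(
    old_spending_func=old_func, break_n=40, new_max_n=90, alpha=0.05
)
# Pass new_func as spending_func (with n_max=90) to all subsequent sequential analyses.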
The Pocock spending function.
A very common spending function that spends your alpha budget somewhat
liberally (compared to e.g. the O’Brien-Fleming function) at the
beginning of the study; consequently, you have more power early on in
exchange for a sharper penalty as you approach the maximum sample size.
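To use one of these spending functions instead of the default, instantiate it and pass it via spending_func. The class name OBrienFlemingSpendingFunction and the (max_n, alpha) constructor below are assumptions; check niseq.spending_functions for the exact names:

from niseq.spending_functions import OBrienFlemingSpendingFunction  # assumed name

# Conservative early spending: most of the alpha budget is saved for later looks.
spend = OBrienFlemingSpendingFunction(max_n=80, alpha=0.05)          # assumed constructor
# e.g. sequential_cluster_test_1samp(X, look_times=[40, 80], n_max=80, spending_func=spend)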
Computes the predictive power non-parametrically using the Bayesian
bootstrap. Optionally, you can condition on the current data to get
conditional power, which is useful for adaptive designs. Only valid for
one-sample (or paired-sample) tests.
Statistics computed from resamples using the Bayesian bootstrap, as opposed
to the frequentist bootstrap, can be interpreted as draws from the posterior
distribution with an uninformative prior [1]. Thus, results here can be
conveniently interpreted as the Bayesian predictive power. As recommended by
[2] (not in English) and helpfully restated by [3] (in English), resampling
weights are drawn from Dirichlet(alpha = 4) for a better approximation.
This functionality is experimental. It is the best catch-all way to do a
power analysis for permutation tests I can think of, and similar resampling
approaches to estimating power have been used in the literature (e.g. by
[4]); however, it should be noted that the neuroimaging literature has not
converged upon a standardized approach to performing power analyses.
The Bayesian bootstrap approach used here incorporates uncertainty about
the effect size into the power estimate, which is handy since uncertainty
about the true effect size is considerable following a small pilot study, or
even a typical psychology/neuroimaging sample size, as pointed out by [5].
Parameters:
X (array, shape (n_observations, p[, q][, r])) – The data from which to resample. X should contain the observations
for one group or paired differences. The first dimension of the array
is the number of observations; remaining dimensions comprise the size of
a single observation. See documentation for user-input test_func for
more details.
test_func (function) – The one-sample sequential test you want to run a power analysis for.
Must accept look_times, n_max, alpha, and verbose
arguments and return a tuple of results whose middle two elements are the
p-values for each look and the adjusted alphas, respectively. This could be
any user-facing function from niseq that ends in _1samp.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order.
Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the
null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)
conditional (bool, default: False) – If True, performs a conditional power analysis; that is, computes
the probability of a design rejecting the null hypothesis given that the
data in X has already been collected and is included in the
analysis, as in an adaptive design. If False (default), performs a
prospective power analysis (e.g. if you’re using pilot data or data from
another study to inform sample size planning for a study that hasn’t
begun data collection).
n_simulations (int, default: 1024) – Number of bootstrap resamples/simulations to perform.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default),
the seed will be obtained from the operating system
(see RandomState for details), meaning it will most
likely produce different output every time this function or method is run.
To achieve reproducible results, pass a value here to explicitly initialize
the RNG with a defined state.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set
to the number of CPU cores. Requires the joblib package.
None (default) is a marker for ‘unset’ that will be interpreted
as n_jobs=1 (sequential execution) unless the call is performed under
a joblib.parallel_backend() context manager that sets another
value for n_jobs.
**test_func_kwargs – You may pass any additional keyword arguments to be forwarded to test_func.
Returns:
res – A results dictionary with keys:
'uncorr_instantaneous_power' : list of float
The power of a fixed-sample statistical test performed at each look.
'rejection_probability' : list of float
The probability that a sequential test rejects the null hypothesis
(for the first time) at each look time.
'cumulative_power' : list of float
The power of a sequential test to reject the null hypothesis by each
look time. res['cumulative_power'][-1] is the power of the full
sequential procedure.
'uncorr_cumulative_power' : list of float
Cumulative power if the rejection threshold at each look was not
corrected using alpha-spending (as it should be).
'n_expected' : float
The expected sample size for the sequential procedure.
'n_simulations' : int
The number of bootstrap resamples used.
'n_orig_data' : int
The sample size of the original data X, i.e. X.shape[0].
'conditional' : bool
Whether the power analysis that was run was conditional (True)
or prospective (False).
'test_func' : str
Name of the sequential test function used.
'test_func_kwargs' : dict
A record of the arguments passed to the test function, including
look_times and n_max.
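A minimal sketch of a prospective power analysis from pilot data. The import paths below are assumptions (neither the power function's name nor its module is shown above, and the _1samp test function is likewise assumed); only the documented arguments are used:

import numpy as np
from niseq.power import bootstrap_predictive_power_1samp          # assumed name/path
from niseq.max_test import sequential_permutation_t_test_1samp    # assumed _1samp test function

rng = np.random.default_rng(6)
pilot = rng.normal(0.3, 1.0, size=(20, 30))   # pilot observations (or paired differences)

res = bootstrap_predictive_power_1samp(
    pilot, test_func=sequential_permutation_t_test_1samp,
    look_times=[40, 80], n_max=80, alpha=0.05,
    conditional=False, n_simulations=200, seed=6,
)
print(res['cumulative_power'][-1])   # power of the full sequential procedure
print(res['n_expected'])             # expected sample size under early stopping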
Given a permutation null distribution for a corresponding sequence of look
times and an alpha spending function, computes the adjusted significance
thresholds required to control the false positive rate across all looks.
This isn’t meant to be accessed directly by users, but it can be used
together with generate_permutation_dist to create new sequential tests
if you’re confident you know what you’re doing.
Parameters:
H0 (array of shape (n_permutations, n_looks)) – The joint permutation null distribution of the test statistic across
look times.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order.
Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the
null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the
mean of the data is greater than 0 (upper tailed test). If tail is 0,
the alternative hypothesis is that the mean of the data is different
than 0 (two tailed test). If tail is -1, the alternative hypothesis
is that the mean of the data is less than 0 (lower tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This
defines a monotonically increasing function such that f(0) = 0 and
f(n_max) = alpha, determining how Type I error is distributed over
sequential analyses. See [2, 3] for details and provided spending functions
in niseq.spending_functions module.
Returns:
spending (array of shape (n_looks,)) – The value of the alpha spending function at each sample size
in look_times.
adj_alphas (array of shape (n_looks,)) – The adjusted significance threshold against which to compare p-values
at each sample size in look_times.
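An illustrative call with a made-up null distribution (the module that exposes find_thresholds is not shown above, so the import path is an assumption):

import numpy as np
from niseq import find_thresholds   # assumed import path; see the niseq source

rng = np.random.default_rng(7)
look_times, n_max = [25, 50], 50
H0 = rng.normal(size=(1024, len(look_times)))   # fake joint null: one max stat per permutation per look

spending, adj_alphas = find_thresholds(
    H0, look_times=look_times, n_max=n_max, alpha=0.05, tail=0
)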
This function computes the test statistic and its permutation distribution
at each look time. It isn’t meant for users to access directly for ordinary
use, though it can be used in combination with find_thresholds to
construct new sequential tests if you’re confident you know what you’re
doing. You’ll want to read the source code carefully to make sure your
statistic function is compatible.
Parameters:
X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of
observations; remaining dimensions comprise the size of a single observation.
Observations must appear in the order in which they were collected.
labels (array of shape (n_observations,) | None) – Either condition labels for each observation in X, a continuous
dependent variable to correlate with X, or None. In the latter case,
a one-sample (sign flip) permutation scheme will be used, otherwise an
independent sample (label shuffle) permutation scheme is used.
n_permutations (int, default: 1024) – Number of permutations.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default),
the seed will be obtained from the operating system
(see RandomState for details), meaning it will most
likely produce different output every time this function or method is run.
To achieve reproducible results, pass a value here to explicitly initialize
the RNG with a defined state.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set
to the number of CPU cores. Requires the joblib package.
None (default) is a marker for ‘unset’ that will be interpreted
as n_jobs=1 (sequential execution) unless the call is performed under
a joblib.parallel_backend() context manager that sets another
value for n_jobs.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default
verbosity level. See the logging documentation and
mne.verbose() for details. Should only be passed as a keyword
argument.
statistic (callable(), default: _get_cluster_stats_samples) – The test statistic to compute on the data, e.g. a cluster statistic or
a max-t statistic. The last value statistic returns must be the
omnibus test statistic (e.g. the max-t or the cluster size); any other
values you return will be passed through in the obs dictionary.
**statistic_kwargs – You may pass arbitrary arguments to the statistic function.
Returns:
obs (dict) – The output of statistic indexed by look time in look_times.
H0 (array of shape (n_permutations, n_looks)) – The joint permutation null distribution of the test statistic across
look times.
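Putting the two internals together to sketch a custom sequential test. The import paths are assumptions, and the parameter list above does not show a look_times argument for generate_permutation_dist even though its return values are indexed by look time, so read the source (as advised above) before relying on this exact call:

import numpy as np
from niseq import generate_permutation_dist, find_thresholds   # assumed import paths

rng = np.random.default_rng(8)
X = rng.normal(size=(60, 30))                 # observations x variables, in collection order
look_times, n_max = [30, 60], 60

# Null distribution of the default statistic at each look (sign-flip scheme, since labels=None).
obs, H0 = generate_permutation_dist(
    X, labels=None, look_times=look_times, n_permutations=1024, seed=8   # look_times assumed
)
spending, adj_alphas = find_thresholds(H0, look_times=look_times, n_max=n_max, alpha=0.05)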