Welcome to niseq’s documentation!
Below is the API documentation for the niseq Python package. Check out our GitHub repository and example notebooks for more information.
Cluster-level Tests
- niseq.cluster_test.sequential_cluster_test_1samp(X, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)
A sequential one-sample cluster test.
A sequential generalization of a one-sample cluster-based permutation test (as described by [4]) or of TFCE (as described by [6]).
Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e., it can achieve the same statistical power with a smaller expected sample size) [2, 3].
- Parameters
X (array, shape (n_observations, p[, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected. Note that the last dimension of X should correspond to the dimension represented in the adjacency parameter (e.g., spectral data should be provided as (observations, frequencies, channels/vertices)).
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e., at n_max).
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the mean of the data is greater than 0 (upper-tailed test). If tail is 0, the alternative hypothesis is that the mean of the data is different from 0 (two-tailed test). If tail is -1, the alternative hypothesis is that the mean of the data is less than 0 (lower-tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details and the niseq.spending_functions module for the provided spending functions.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.
threshold (float | dict | None, default: None) – The so-called “cluster forming threshold” in the form of a test statistic (note: this is not an alpha level / “p-value”). If numeric, vertices with data values more extreme than threshold will be used to form clusters. If None, threshold will be chosen automatically to correspond to a p-value of 0.05 for the given number of observations (only valid when using the default statistic). If threshold is a dict (with keys 'start' and 'step'), then threshold-free cluster enhancement (TFCE) will be used (see TFCE example and [6]).
n_permutations (int, default: 1024) – Number of permutations.
stat_fun (callable | None) – Function called to calculate the test statistic. Must accept a 1D array as input and return a 1D array. If None (the default), uses mne.stats.ttest_1samp_no_p.
adjacency (scipy.sparse.spmatrix | None | False) – Defines adjacency between locations in the data, where “locations” can be spatial vertices, frequency bins, time points, etc. For spatial vertices, see mne.channels.find_ch_adjacency(). If False, assumes no adjacency (each location is treated as independent and unconnected). If None, a regular lattice adjacency is assumed, connecting each location to its neighbor(s) along the last dimension of X (or the last two dimensions if X is 2D). If adjacency is a matrix, it is assumed to be symmetric (only the upper triangular half is used) and must be square with dimension equal to X.shape[-1] (for 2D data) or X.shape[-1] * X.shape[-2] (for 3D data) or (optionally) X.shape[-1] * X.shape[-2] * X.shape[-3] (for 4D data). The function mne.stats.combine_adjacency may be useful for 4D data.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.
max_step (int) – Maximum distance between samples along the second axis of X to be considered adjacent (typically the second axis is the “time” dimension). Only used when adjacency has shape (n_vertices, n_vertices), that is, when adjacency is only specified for sensors (e.g., via mne.channels.find_ch_adjacency()), and not via sensors and further dimensions such as time points (e.g., via an additional call of mne.stats.combine_adjacency()).
exclude (bool array or None) – Mask to apply to the data to exclude certain points from clustering (e.g., medial wall vertices). Should be the same shape as X. If None, no points are excluded.
t_power (float) – Power to raise the statistical values (usually t-values) by before summing (sign will be retained). Note that t_power=0 will give a count of locations in each cluster, while t_power=1 will weight each location by its statistical score.
out_type ('mask' | 'indices') – Output format of clusters within a list. If 'mask', returns a list of boolean arrays, each with the same shape as the input data (or slices if the shape is 1D and adjacency is None), with True values indicating locations that are part of a cluster. If 'indices', returns a list of tuple of ndarray, where each ndarray contains the indices of locations that together form the given cluster along the given dimension. Note that for large datasets, 'indices' may use far less memory than 'mask'. Default is 'indices'.
check_disjoint (bool) – Whether to check if the connectivity matrix can be separated into disjoint sets before clustering. This may lead to faster clustering, especially if the second dimension of X (usually the “time” dimension) is large.
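As a rough illustration of the t_power parameter described above, the sketch below computes a signed cluster-level statistic from the statistics of the locations in one cluster. The helper name cluster_stat and the example values are hypothetical and are not part of niseq or MNE; the packaged implementation may differ in detail.

```python
import math


def cluster_stat(stats, t_power=1.0):
    """Sum sign(s) * |s| ** t_power over the locations of one cluster.

    Hypothetical helper for illustration only.
    """
    return sum(math.copysign(abs(s) ** t_power, s) for s in stats)


# Hypothetical t-values of three adjacent locations forming one cluster.
cluster = [2.5, 3.0, 2.1]

n_locations = cluster_stat(cluster, t_power=0)  # every location counts as 1
cluster_mass = cluster_stat(cluster, t_power=1)  # locations weighted by t-value
```

With t_power=0 the result depends only on the extent of the cluster, whereas t_power=1 also rewards clusters with larger statistics.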
- Returns
looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:
- obs (array, shape (p[, q][, r])) – Statistic observed for all variables.
- clusters (list) – List type defined by out_type above.
- cluster_pv (array) – P-value for each cluster.
- H0 (array, shape (n_permutations,)) – Max cluster-level stats observed under permutation.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
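The constraints on the spending function, f(0) = 0 and f(n_max) = alpha, can be illustrated with a linear function. The sketch below is a plain-Python assumption of what a linear spending function looks like; linear_spending is a hypothetical helper, not the package's LinearSpendingFunction class, and the look times are made up. The adjusted thresholds in adj_alphas are computed by the package from the permutation distribution and are not reproduced here.

```python
def linear_spending(n, n_max, alpha=0.05):
    """Cumulative Type I error that may be spent by sample size n.

    Hypothetical helper: increases linearly from 0 at n = 0 to alpha at n_max.
    """
    if not 0 <= n <= n_max:
        raise ValueError("n must lie between 0 and n_max")
    return alpha * n / n_max


look_times = [20, 40, 60]  # hypothetical interim sample sizes
n_max = 60

# Value of the spending function at each look (cumulative, not per-look).
spending = [linear_spending(n, n_max) for n in look_times]
```

Because the function is monotonically increasing, later looks are always allowed to have spent at least as much Type I error as earlier ones, and the full alpha is only available at n_max.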
Notes
The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.
When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.
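The minimal reporting set listed above can be collected into a small plain-text table. The sketch below assumes the four quantities are available as equal-length sequences; report_lines is a hypothetical helper and all numbers are invented for illustration, not results from any study.

```python
def report_lines(look_times, ps, adj_alphas, spending):
    """Format one line per look with the quantities recommended for reporting."""
    if not (len(look_times) == len(ps) == len(adj_alphas) == len(spending)):
        raise ValueError("all inputs must have one entry per look")
    lines = [f"{'n':>5} {'min p':>8} {'adj. alpha':>11} {'alpha spent':>12}"]
    for n, p, a, s in zip(look_times, ps, adj_alphas, spending):
        lines.append(f"{n:>5d} {p:>8.4f} {a:>11.4f} {s:>12.4f}")
    return lines


# Hypothetical values standing in for the outputs of a sequential test.
look_times = [20, 40, 60]
ps = [0.21, 0.04, 0.012]
adj_alphas = [0.015, 0.02, 0.03]
spending = [0.0167, 0.0333, 0.05]

for line in report_lines(look_times, ps, adj_alphas, spending):
    print(line)
```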
References
- [1] Lan, K. K. G., & DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika, 70(3), 659-663.
- [2] Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701-710.
- [3] Lakens, D., Pahlke, F., & Wassmer, G. (2021). Group Sequential Designs: A Tutorial. https://doi.org/10.31234/osf.io/x4azm
- [4] Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods, 164(1), 177-190.
- [5] Sassenhagen, J., & Draschkow, D. (2019). Cluster-based permutation tests of MEG/EEG data do not establish significance of effect latency or location. Psychophysiology, 56(6), e13335. doi:10.1111/psyp.13335
- [6] Smith, S. M., & Nichols, T. E. (2009). Threshold-free cluster enhancement: addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage, 44(1), 83-98. doi:10.1016/j.neuroimage.2008.03.061
- niseq.cluster_test.sequential_cluster_test_corr(X, y, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)
A sequential cluster test for correlations.
A sequential generalization of a cluster-based permutation test (as described by [4]) or of TFCE (as described by [6]) for testing a relationship between X and a continuous variable y. Uses Pearson correlation by default (or its z-transform if using TFCE), but the test statistic can be modified.
Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e., it can achieve the same statistical power with a smaller expected sample size) [2, 3].
- Parameters
X (array, shape (n_observations, p[, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected. Note that the last dimension of X should correspond to the dimension represented in the adjacency parameter (e.g., spectral data should be provided as (observations, frequencies, channels/vertices)).
y (array, shape (n_observations,)) – Value of the dependent variable associated with each observation in X.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e., at n_max).
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the correlation is greater than 0 (upper-tailed test). If tail is 0, the alternative hypothesis is that the correlation is different from 0 (two-tailed test). If tail is -1, the alternative hypothesis is that the correlation is less than 0 (lower-tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details and the niseq.spending_functions module for the provided spending functions.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.
threshold (float | dict | None, default: None) – The so-called “cluster forming threshold” in the form of a test statistic (note: this is not an alpha level / “p-value”). If numeric, vertices with data values more extreme than threshold will be used to form clusters. If None, threshold will be chosen automatically to correspond to a p-value of 0.05 for the given number of observations (only valid when using the default statistic). If threshold is a dict (with keys 'start' and 'step'), then threshold-free cluster enhancement (TFCE) will be used (see TFCE example and [6]).
n_permutations (int, default: 1024) – Number of permutations.
stat_fun (callable | None, default: None) – Function called to calculate the test statistic. Must accept a 1D array as input and return a 1D array. If None (the default), computes Pearson correlation.
adjacency (scipy.sparse.spmatrix | None | False) – Defines adjacency between locations in the data, where “locations” can be spatial vertices, frequency bins, time points, etc. For spatial vertices, see mne.channels.find_ch_adjacency(). If False, assumes no adjacency (each location is treated as independent and unconnected). If None, a regular lattice adjacency is assumed, connecting each location to its neighbor(s) along the last dimension of X (or the last two dimensions if X is 2D). If adjacency is a matrix, it is assumed to be symmetric (only the upper triangular half is used) and must be square with dimension equal to X.shape[-1] (for 2D data) or X.shape[-1] * X.shape[-2] (for 3D data) or (optionally) X.shape[-1] * X.shape[-2] * X.shape[-3] (for 4D data). The function mne.stats.combine_adjacency may be useful for 4D data.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.
max_step (int) – Maximum distance between samples along the second axis of X to be considered adjacent (typically the second axis is the “time” dimension). Only used when adjacency has shape (n_vertices, n_vertices), that is, when adjacency is only specified for sensors (e.g., via mne.channels.find_ch_adjacency()), and not via sensors and further dimensions such as time points (e.g., via an additional call of mne.stats.combine_adjacency()).
exclude (bool array or None) – Mask to apply to the data to exclude certain points from clustering (e.g., medial wall vertices). Should be the same shape as X. If None, no points are excluded.
t_power (float) – Power to raise the statistical values (by default, correlation coefficients) by before summing (sign will be retained). Note that t_power=0 will give a count of locations in each cluster, while t_power=1 will weight each location by its statistical score.
out_type ('mask' | 'indices') – Output format of clusters within a list. If 'mask', returns a list of boolean arrays, each with the same shape as the input data (or slices if the shape is 1D and adjacency is None), with True values indicating locations that are part of a cluster. If 'indices', returns a list of tuple of ndarray, where each ndarray contains the indices of locations that together form the given cluster along the given dimension. Note that for large datasets, 'indices' may use far less memory than 'mask'. Default is 'indices'.
check_disjoint (bool) – Whether to check if the connectivity matrix can be separated into disjoint sets before clustering. This may lead to faster clustering, especially if the second dimension of X (usually the “time” dimension) is large.
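To make the default statistic concrete, the sketch below computes a Pearson correlation between one location of the data and y in plain Python, together with a z-transform. It assumes that the z-transform mentioned above is the Fisher transformation (atanh), which this page does not state explicitly; pearson_r and fisher_z are hypothetical helpers and the data are invented.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher z-transform of a correlation (assumed form of the z-transform)."""
    return math.atanh(r)


# Hypothetical data: one location of X across four observations, and y.
x_location = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]

r = pearson_r(x_location, y)
z = fisher_z(r)
```

The z-transform stretches correlations near -1 and 1, so it is unbounded whereas r itself is confined to the interval from -1 to 1.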
- Returns
looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:
- obs (array, shape (p[, q][, r])) – Statistic observed for all variables.
- clusters (list) – List type defined by out_type above.
- cluster_pv (array) – P-value for each cluster.
- H0 (array, shape (n_permutations,)) – Max cluster-level stats observed under permutation.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
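Comparing ps to adj_alphas look by look can be sketched as follows. The helper first_rejection is hypothetical, the values are invented, and the use of a strict inequality (p < adjusted alpha) is an assumption rather than something this page specifies.

```python
def first_rejection(look_times, ps, adj_alphas):
    """Return the sample size of the first look at which p < adjusted alpha.

    Returns None if the null hypothesis is not rejected at any look.
    """
    for n, p, adj_alpha in zip(look_times, ps, adj_alphas):
        if p < adj_alpha:
            return n
    return None


# Hypothetical outputs of a sequential test with three looks.
look_times = [20, 40, 60]
ps = [0.21, 0.018, 0.004]
adj_alphas = [0.015, 0.02, 0.03]

stop_at = first_rejection(look_times, ps, adj_alphas)
```

Each p-value is compared only to the threshold of its own look, which is why the three sequences are iterated in parallel rather than compared against a single alpha.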
Notes
The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.
When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.
References
- [1] Lan, K. K. G., & DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika, 70(3), 659-663.
- [2] Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701-710.
- [3] Lakens, D., Pahlke, F., & Wassmer, G. (2021). Group Sequential Designs: A Tutorial. https://doi.org/10.31234/osf.io/x4azm
- [4] Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods, 164(1), 177-190.
- [5] Sassenhagen, J., & Draschkow, D. (2019). Cluster-based permutation tests of MEG/EEG data do not establish significance of effect latency or location. Psychophysiology, 56(6), e13335. doi:10.1111/psyp.13335
- [6] Smith, S. M., & Nichols, T. E. (2009). Threshold-free cluster enhancement: addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage, 44(1), 83-98. doi:10.1016/j.neuroimage.2008.03.061
- niseq.cluster_test.sequential_cluster_test_indep(X, labels, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)
A sequential independent-sample cluster test.
A sequential generalization of an independent-sample cluster-based permutation test (as described by [4]) or of TFCE (as described by [6]).
Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e., it can achieve the same statistical power with a smaller expected sample size) [2, 3].
- Parameters
X (array, shape (n_observations, p[, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected. Note that the last dimension of X should correspond to the dimension represented in the adjacency parameter (e.g., spectral data should be provided as (observations, frequencies, channels/vertices)).
labels (array, shape (n_observations,)) – Condition label associated with each observation in X.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e., at n_max).
tail (-1 or 0 or 1, default: 0) – If tail is 1, the statistic is thresholded above threshold. If tail is -1, the statistic is thresholded below threshold. If tail is 0, the statistic is thresholded on both sides of the distribution.
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details and the niseq.spending_functions module for the provided spending functions.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.
threshold (float | dict | None, default: None) – The so-called “cluster forming threshold” in the form of a test statistic (note: this is not an alpha level / “p-value”). If numeric, vertices with data values more extreme than threshold will be used to form clusters. If None, threshold will be chosen automatically to correspond to a p-value of 0.05 for the given number of observations (only valid when using the default statistic). If threshold is a dict (with keys 'start' and 'step'), then threshold-free cluster enhancement (TFCE) will be used (see TFCE example and [6]).
n_permutations (int, default: 1024) – Number of permutations.
stat_fun (callable | None) – Function called to calculate the test statistic. Must accept a 1D array as input and return a 1D array. If None (the default), uses mne.stats.f_oneway.
adjacency (scipy.sparse.spmatrix | None | False) – Defines adjacency between locations in the data, where “locations” can be spatial vertices, frequency bins, time points, etc. For spatial vertices, see mne.channels.find_ch_adjacency(). If False, assumes no adjacency (each location is treated as independent and unconnected). If None, a regular lattice adjacency is assumed, connecting each location to its neighbor(s) along the last dimension of X (or the last two dimensions if X is 2D). If adjacency is a matrix, it is assumed to be symmetric (only the upper triangular half is used) and must be square with dimension equal to X.shape[-1] (for 2D data) or X.shape[-1] * X.shape[-2] (for 3D data) or (optionally) X.shape[-1] * X.shape[-2] * X.shape[-3] (for 4D data). The function mne.stats.combine_adjacency may be useful for 4D data.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.
max_step (int) – Maximum distance between samples along the second axis of X to be considered adjacent (typically the second axis is the “time” dimension). Only used when adjacency has shape (n_vertices, n_vertices), that is, when adjacency is only specified for sensors (e.g., via mne.channels.find_ch_adjacency()), and not via sensors and further dimensions such as time points (e.g., via an additional call of mne.stats.combine_adjacency()).
exclude (bool array or None) – Mask to apply to the data to exclude certain points from clustering (e.g., medial wall vertices). Should be the same shape as X. If None, no points are excluded.
t_power (float) – Power to raise the statistical values (usually F-values) by before summing (sign will be retained). Note that t_power=0 will give a count of locations in each cluster, while t_power=1 will weight each location by its statistical score.
out_type ('mask' | 'indices') – Output format of clusters within a list. If 'mask', returns a list of boolean arrays, each with the same shape as the input data (or slices if the shape is 1D and adjacency is None), with True values indicating locations that are part of a cluster. If 'indices', returns a list of tuple of ndarray, where each ndarray contains the indices of locations that together form the given cluster along the given dimension. Note that for large datasets, 'indices' may use far less memory than 'mask'. Default is 'indices'.
check_disjoint (bool) – Whether to check if the connectivity matrix can be separated into disjoint sets before clustering. This may lead to faster clustering, especially if the second dimension of X (usually the “time” dimension) is large.
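For intuition about the default statistic, the sketch below computes a one-way F statistic for a single location after splitting its observations by condition label. It is a plain-Python illustration of the textbook formula, not the mne.stats.f_oneway implementation; f_statistic is a hypothetical helper and the values and labels are invented.

```python
def f_statistic(*groups):
    """One-way ANOVA F statistic for two or more groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical values at one location, with one condition label per observation.
values = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0]
labels = ["a", "a", "a", "b", "b", "b"]

# Split the observations into groups according to their labels.
groups = {}
for value, label in zip(values, labels):
    groups.setdefault(label, []).append(value)

f_value = f_statistic(*groups.values())
```

In a permutation test of this kind, the labels would typically be shuffled across observations and the statistic recomputed to build the null distribution.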
- Returns
looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:
- obs (array, shape (p[, q][, r])) – Statistic observed for all variables.
- clusters (list) – List type defined by out_type above.
- cluster_pv (array) – P-value for each cluster.
- H0 (array, shape (n_permutations,)) – Max cluster-level stats observed under permutation.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
Notes
The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.
When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.
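The bookkeeping described in these notes, where every analysis passes all looks taken so far, might be sketched as below. add_look is a hypothetical helper, the sample sizes are invented, and the requirement that looks be strictly increasing is an assumption based on look_times being given "in order".

```python
def add_look(previous_looks, current_n, n_max):
    """Return the look_times list to use for the analysis at current_n."""
    if current_n > n_max:
        raise ValueError("a look cannot exceed n_max")
    if previous_looks and current_n <= previous_looks[-1]:
        raise ValueError("looks must be in increasing order of sample size")
    return list(previous_looks) + [current_n]


n_max = 80
look_times = []
history = []
for n in (25, 50, 80):  # hypothetical sample sizes at which the data are examined
    look_times = add_look(look_times, n, n_max)
    # At this point one would analyze the first n observations,
    # passing the full look_times list collected so far.
    history.append(list(look_times))
```

Keeping the earlier looks in the list, rather than passing only the current sample size, is what lets the thresholds account for the Type I error already spent.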
References
- [1] Lan, K. K. G., & DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika, 70(3), 659-663.
- [2] Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701-710.
- [3] Lakens, D., Pahlke, F., & Wassmer, G. (2021). Group Sequential Designs: A Tutorial. https://doi.org/10.31234/osf.io/x4azm
- [4] Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods, 164(1), 177-190.
- [5] Sassenhagen, J., & Draschkow, D. (2019). Cluster-based permutation tests of MEG/EEG data do not establish significance of effect latency or location. Psychophysiology, 56(6), e13335. doi:10.1111/psyp.13335
- [6] Smith, S. M., & Nichols, T. E. (2009). Threshold-free cluster enhancement: addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage, 44(1), 83-98. doi:10.1016/j.neuroimage.2008.03.061
Max-type Tests
- niseq.max_test.sequential_permutation_t_test_1samp(X, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)
One-sample sequential permutation test with max-type correction.
This is a sequential generalization of mne.stats.permutations.permutation_t_test. Uses max-type correction for multiple comparisons [4].
Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e., it can achieve the same statistical power with a smaller expected sample size) [2, 3].
- Parameters
X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e., at n_max).
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the mean of the data is greater than 0 (upper-tailed test). If tail is 0, the alternative hypothesis is that the mean of the data is different from 0 (two-tailed test). If tail is -1, the alternative hypothesis is that the mean of the data is less than 0 (lower-tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details and the niseq.spending_functions module for the provided spending functions.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.
n_permutations (int, default: 1024) – Number of permutations.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.
- Returns
looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:
    obs (array, shape (p[, q][, r])) – Test statistic observed for all variables.
    p_values (array, shape (p[, q][, r])) – P-values for all the tests (a.k.a. variables).
    H0 (array, shape (n_permutations,)) – Max test statistics obtained by permutations.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
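As a minimal sketch of the comparison between ps and adj_alphas described above, the first look at which the null hypothesis can be rejected could be found as follows. This helper is not part of niseq: the name first_rejection and all numbers are made up for illustration, and it assumes that a p-value equal to the adjusted threshold counts as a rejection, which may differ from the package's convention.

```python
def first_rejection(look_times, ps, adj_alphas):
    """Return the first look time whose p-value is at or below its
    adjusted threshold, or None if no look rejects the null."""
    for n, p, adj_alpha in zip(look_times, ps, adj_alphas):
        if p <= adj_alpha:
            return n
    return None


# Hypothetical values standing in for the outputs of a sequential test.
look_times = [20, 40, 60]
ps = [0.030, 0.012, 0.004]
adj_alphas = [0.0167, 0.0150, 0.0140]

stop_at = first_rejection(look_times, ps, adj_alphas)
print(stop_at)
```

Each p-value is compared only to the threshold of its own look, as the description of adj_alphas requires.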
Notes
The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.
When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.
References
- 1
Gordon Lan, K. K., & DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika, 70(3), 659-663.
- 2
Lakens, D. (2014). Performing high‐powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701-710.
- 3
Lakens, D., Pahlke, F., & Wassmer, G. (2021). Group Sequential Designs: A Tutorial. https://doi.org/10.31234/osf.io/x4azm
- 4
Thomas E. Nichols and Andrew P. Holmes. Nonparametric permutation tests for functional neuroimaging: a primer with examples. Human Brain Mapping, 15(1):1–25, 2002. doi:10.1002/hbm.1058.
- niseq.max_test.sequential_permutation_test_corr(X, y, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)¶
A sequential permutation test for correlations with a max-type correction.
Tests for a relationship between X and a continuous independent variable y. By default, uses Pearson correlation, but the test statistic can be modified.
Uses max-type correction for multiple comparisons [4].
Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e. it can achieve the same statistical power with a smaller expected sample size) [2, 3].
- Parameters
X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected.
y (array, shape (n_observations,)) – Value of dependent variable associated with each observation in X.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max).
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the correlation is greater than 0 (upper-tailed test). If tail is 0, the alternative hypothesis is that the correlation is different from 0 (two-tailed test). If tail is -1, the alternative hypothesis is that the correlation is less than 0 (lower-tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details and the niseq.spending_functions module for the provided spending functions.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.
n_permutations (int, default: 1024) – Number of permutations.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.
- Returns
looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:
    obs (array, shape (p[, q][, r])) – Test statistic observed for all variables.
    p_values (array, shape (p[, q][, r])) – P-values for all the tests (a.k.a. variables).
    H0 (array, shape (n_permutations,)) – Max test statistics obtained by permutations.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
Notes
The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.
When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.
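The reporting recommendation above could be followed with a small helper like the one below. This is only a sketch: format_look_report is a hypothetical name that does not exist in niseq, and the numbers stand in for the ps, adj_alphas, and spending arrays returned by the test.

```python
def format_look_report(look_times, ps, adj_alphas, spending):
    """Build a plain-text table with one row per look at the data."""
    header = f"{'n':>5} {'min p':>8} {'adj. alpha':>11} {'spent':>8}"
    rows = [header]
    for n, p, adj_alpha, spent in zip(look_times, ps, adj_alphas, spending):
        rows.append(f"{n:>5d} {p:>8.4f} {adj_alpha:>11.4f} {spent:>8.4f}")
    return "\n".join(rows)


# Hypothetical results from a study with two looks.
report = format_look_report(
    look_times=[20, 40],
    ps=[0.030, 0.012],
    adj_alphas=[0.0167, 0.0150],
    spending=[0.025, 0.050],
)
print(report)
```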
References
- 1
Gordon Lan, K. K., & DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika, 70(3), 659-663.
- 2
Lakens, D. (2014). Performing high‐powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701-710.
- 3
Lakens, D., Pahlke, F., & Wassmer, G. (2021). Group Sequential Designs: A Tutorial. https://doi.org/10.31234/osf.io/x4azm
- 4
Thomas E. Nichols and Andrew P. Holmes. Nonparametric permutation tests for functional neuroimaging: a primer with examples. Human Brain Mapping, 15(1):1–25, 2002. doi:10.1002/hbm.1058.
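To illustrate what a max-type correction means for a correlation test at a single look, here is a standard-library sketch. It is not niseq's implementation: the function names are hypothetical, the test is two-tailed only, and the sequential alpha-spending step is left out entirely. Each permutation shuffles y, records the largest absolute correlation over all variables, and every variable's p-value is then computed against that distribution of maxima.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def max_corr_permutation_test(X, y, n_permutations=1024, seed=0):
    """Two-tailed permutation test with max-|r| correction (single look)."""
    rng = random.Random(seed)
    columns = list(zip(*X))  # one tuple of observations per variable
    obs = [pearson_r(col, y) for col in columns]
    y_perm = list(y)
    h0 = []
    for _ in range(n_permutations):
        rng.shuffle(y_perm)  # break any X-y relationship
        h0.append(max(abs(pearson_r(col, y_perm)) for col in columns))
    # The observed statistic is counted as one extra permutation.
    p_values = [
        (1 + sum(h >= abs(r) for h in h0)) / (n_permutations + 1) for r in obs
    ]
    return obs, p_values, h0


# Simulated data: variable 0 tracks y, variable 1 is pure noise.
data_rng = random.Random(1)
y = [data_rng.gauss(0, 1) for _ in range(30)]
X = [[yi + data_rng.gauss(0, 0.5), data_rng.gauss(0, 1)] for yi in y]
obs, p_values, h0 = max_corr_permutation_test(X, y)
```

Because every variable is compared against the same distribution of maxima, the family-wise error rate is controlled across variables without a separate correction step.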
- niseq.max_test.sequential_permutation_test_indep(X, labels, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)¶
Independent-sample sequential permutation test with max-type correction.
By default, this is a sequential generalization of an independent-sample max-t procedure if there are two groups, and of a max-F procedure if there are more than two.
Uses max-type correction for multiple comparisons [4].
Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e. it can achieve the same statistical power with a smaller expected sample size) [2, 3].
- Parameters
X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected.
labels (array, shape (n_observations,)) – Condition label associated with each observation in X.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max).
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the mean of the data is greater than 0 (upper-tailed test). If tail is 0, the alternative hypothesis is that the mean of the data is different from 0 (two-tailed test). If tail is -1, the alternative hypothesis is that the mean of the data is less than 0 (lower-tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details and the niseq.spending_functions module for the provided spending functions.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.
n_permutations (int, default: 1024) – Number of permutations.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.
- Returns
looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:
    obs (array, shape (p[, q][, r])) – Test statistic observed for all variables.
    p_values (array, shape (p[, q][, r])) – P-values for all the tests (a.k.a. variables).
    H0 (array, shape (n_permutations,)) – Max test statistics obtained by permutations.
ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.
adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.
spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.
Notes
The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.
When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.
References
- 1
Gordon Lan, K. K., & DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika, 70(3), 659-663.
- 2
Lakens, D. (2014). Performing high‐powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701-710.
- 3
Lakens, D., Pahlke, F., & Wassmer, G. (2021). Group Sequential Designs: A Tutorial. https://doi.org/10.31234/osf.io/x4azm
- 4
Thomas E. Nichols and Andrew P. Holmes. Nonparametric permutation tests for functional neuroimaging: a primer with examples. Human Brain Mapping, 15(1):1–25, 2002. doi:10.1002/hbm.1058.
Alpha Spending Functions¶
- class niseq.spending_functions.LinearSpendingFunction(alpha, max_n)¶
The linear spending function. This is the simplest possible spending function, which distributes Type I error rate allowance evenly over time.
- __call__(n)¶
- Parameters
n (int) – An interim sample size
- Returns
spending – The value of the spending function at n
- Return type
float
- __init__(alpha, max_n)¶
- Parameters
alpha (float) – Desired false-positive rate after all sequential tests.
max_n (int) – The sample size at which data collection will be terminated, regardless of whether the null hypothesis has been rejected.
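A linear spending function is simple enough to write out by hand. The sketch below is a plain function rather than the class documented here, and the name linear_spending and the default values are assumptions chosen for illustration. It satisfies f(0) = 0 and f(max_n) = alpha, with the allowance growing in proportion to the sample size.

```python
def linear_spending(n, alpha=0.05, max_n=100):
    """Cumulative Type I error that may be spent by sample size n."""
    if not 0 <= n <= max_n:
        raise ValueError("n must lie between 0 and max_n")
    return alpha * n / max_n


# Three equally spaced looks each unlock one third of alpha.
spent = [linear_spending(n, alpha=0.05, max_n=60) for n in (20, 40, 60)]
print(spent)
```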
- class niseq.spending_functions.OBrienFlemingSpendingFunction(alpha, max_n)¶
The O’Brien-Fleming spending function. A common choice for clinical trials or other confirmatory research, this spending function is conservative for early analyses, saving more power for later in the study.
- __call__(n)¶
- Parameters
n (int) – An interim sample size
- Returns
spending – The value of the spending function at n
- Return type
float
- __init__(alpha, max_n)¶
- Parameters
alpha (float) – Desired false-positive rate after all sequential tests.
max_n (int) – The sample size at which data collection will be terminated, regardless of whether the null hypothesis has been rejected.
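For intuition, the widely used Lan-DeMets approximation to the O'Brien-Fleming boundary is f(t) = 2 - 2Φ(z_{α/2} / √t), where t = n / max_n and Φ is the standard normal CDF. The sketch below implements that textbook formula under the assumption that it resembles what this class computes; the package's own parameterization may differ, and the function name is hypothetical.

```python
from statistics import NormalDist


def obrien_fleming_spending(n, alpha=0.05, max_n=100):
    """Lan-DeMets O'Brien-Fleming-type cumulative alpha spent at n."""
    if n <= 0:
        return 0.0
    t = n / max_n  # information fraction
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)
    return 2 * (1 - std_normal.cdf(z / t ** 0.5))


# Very little alpha is available early; all of it is available at max_n.
early = obrien_fleming_spending(25)
halfway = obrien_fleming_spending(50)
final = obrien_fleming_spending(100)
```

Halfway through data collection, this function has released far less than half of alpha, which is the conservatism for early analyses described above.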
- class niseq.spending_functions.PiecewiseSpendingFunction(old_spending_func, break_n, new_max_n)¶
A piecewise spending function for adjusting the maximum sample size. The old spending function old_spending_func is used up until break_n, the interim sample size at which you decided to change the maximum sample size. After that, a linear function is used that goes from (break_n, old_spending_func(break_n)) to (new_max_n, alpha).
This is useful if, for instance, (1) you accidentally collect more data than your original max_n, requiring you to adjust your spending function, (2) a conditional power analysis encourages you to change your sample size to achieve a desired Type II error rate, or (3) you can no longer collect your original max_n for practical reasons.
If max_n is adjusted multiple times, you can create piecewise spending functions recursively.
- __call__(n)¶
- Parameters
n (int) – An interim sample size
- Returns
spending – The value of the spending function at n
- Return type
float
- __init__(old_spending_func, break_n, new_max_n)¶
- Parameters
old_spending_func (instance of SpendingFunction) – An initialized instance of a SpendingFunction subclass with old_spending_func.max_n greater than break_n.
break_n (int) – The interim sample size at which the researcher decides to adjust their maximum sample size.
new_max_n (int) – The new maximum sample size.
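The piecewise construction can be sketched with a closure. This is an illustration, not the class itself: make_piecewise_spending is a hypothetical name, alpha is passed explicitly here (the class presumably reads it from old_spending_func), and the old function is a hand-written linear one.

```python
def make_piecewise_spending(old_spending, break_n, new_max_n, alpha):
    """Follow old_spending up to break_n, then spend the remaining
    alpha linearly between break_n and new_max_n."""
    spent_at_break = old_spending(break_n)
    slope = (alpha - spent_at_break) / (new_max_n - break_n)

    def spending(n):
        if n <= break_n:
            return old_spending(n)
        return spent_at_break + slope * (n - break_n)

    return spending


def old_linear(n):
    """Linear spending function for the original design (max_n = 60)."""
    return 0.05 * n / 60


# At n = 40 the maximum sample size is raised from 60 to 100.
new_spending = make_piecewise_spending(old_linear, break_n=40, new_max_n=100, alpha=0.05)
```

Because the returned function is itself a valid spending function, it can be passed back in as old_spending if the maximum sample size changes again, which mirrors the recursive use mentioned above.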
- class niseq.spending_functions.PocockSpendingFunction(alpha, max_n)¶
The Pocock spending function. A very common spending function that spends your alpha budget somewhat liberally (compared to e.g. the O’Brien-Fleming function) at the beginning of the study; consequently, you have more power early on in exchange for a sharper penalty as you approach the maximum sample size.
- __call__(n)¶
- Parameters
n (int) – An interim sample size
- Returns
spending – The value of the spending function at n
- Return type
float
- __init__(alpha, max_n)¶
- Parameters
alpha (float) – Desired false-positive rate after all sequential tests.
max_n (int) – The sample size at which data collection will be terminated, regardless of whether the null hypothesis has been rejected.
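The usual Lan-DeMets Pocock-type spending function is f(t) = α ln(1 + (e - 1)t) with t = n / max_n. The sketch below implements that textbook formula, assuming it is close to what this class computes; the function name is hypothetical and the package's exact form may differ.

```python
import math


def pocock_spending(n, alpha=0.05, max_n=100):
    """Lan-DeMets Pocock-type cumulative alpha spent at sample size n."""
    t = n / max_n  # information fraction
    return alpha * math.log(1 + (math.e - 1) * t)


# More than half of alpha is already available halfway through.
halfway = pocock_spending(50)
final = pocock_spending(100)
```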
- class niseq.spending_functions.SpendingFunction(alpha, max_n)¶
An abstract base class for spending functions
- abstract __call__(n)¶
- Parameters
n (int) – An interim sample size
- Returns
spending – The value of the spending function at n
- Return type
float
- __init__(alpha, max_n)¶
- Parameters
alpha (float) – Desired false-positive rate after all sequential tests.
max_n (int) – The sample size at which data collection will be terminated, regardless of whether the null hypothesis has been rejected.
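A custom spending function would presumably subclass this base class and implement __call__. Since the internals of the real class are not shown here, the sketch below re-creates a stand-in base class (SpendingFunctionSketch, a hypothetical name) and assumes that alpha and max_n are stored as attributes. The example subclass is a power-family function, alpha * (n / max_n) ** rho, which is not one of the functions provided by the package.

```python
from abc import ABC, abstractmethod


class SpendingFunctionSketch(ABC):
    """Stand-in for the abstract base class (assumed attribute names)."""

    def __init__(self, alpha, max_n):
        self.alpha = alpha
        self.max_n = max_n

    @abstractmethod
    def __call__(self, n):
        """Return the cumulative alpha spent at interim sample size n."""


class PowerSpending(SpendingFunctionSketch):
    """Hypothetical power-family spending function."""

    def __init__(self, alpha, max_n, rho=2.0):
        super().__init__(alpha, max_n)
        self.rho = rho

    def __call__(self, n):
        return self.alpha * (n / self.max_n) ** self.rho


quadratic = PowerSpending(alpha=0.05, max_n=100, rho=2.0)
print(quadratic(50), quadratic(100))
```

Any such function should start at 0 and reach alpha at max_n, as required of spending_func in the test functions.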
Power Analysis¶
- niseq.power.bootstrap.bootstrap_predictive_power_1samp(X, test_func, look_times, n_max, alpha=0.05, conditional=False, n_simulations=1024, seed=None, n_jobs=None, **test_func_kwargs)¶
Predictive power analysis via Bayesian bootstrap
Computes the predictive power non-parametrically using the Bayesian bootstrap. Optionally, you can condition on the current data to get conditional power, which is useful for adaptive designs. Only valid for one-sample (or paired-sample) tests.
Statistics computed from resamples using the Bayesian bootstrap, as opposed to the frequentist bootstrap, can be interpreted as draws from the posterior distribution with an uninformative prior [1]. Thus, results here can be conveniently interpreted as the Bayesian predictive power. As recommended by [2] (not in English) and helpfully restated by [3] (in English), resampling weights are drawn from Dirichlet(alpha = 4) for a better approximation.
This functionality is experimental. It is the best catch-all way to do a power analysis for permutation tests I can think of, and similar resampling approaches to estimating power have been used in the literature (e.g. by [4]); however, it should be noted that the neuroimaging literature has not converged upon a standardized approach to performing power analyses. The Bayesian bootstrap approach used here incorporates uncertainty about the effect size into the power estimate, which is handy since uncertainty about the true effect size is considerable following a small pilot study, or even a typical psychology/neuroimaging sample size, as pointed out by [5].
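The weight-drawing step of the Bayesian bootstrap can be sketched with the standard library. Dirichlet weights are obtained by normalizing independent Gamma draws, here with shape 4 to match the Dirichlet(alpha = 4) choice described above. The function names and data are hypothetical, and how niseq turns such weights into simulated datasets is not shown; this example only applies them to a weighted mean.

```python
import random


def bayesian_bootstrap_weights(n_obs, concentration=4.0, rng=None):
    """Draw one weight vector from a symmetric Dirichlet distribution."""
    rng = rng or random.Random()
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(n_obs)]
    total = sum(draws)
    return [d / total for d in draws]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical pilot data (e.g. paired differences for six participants).
data = [0.8, 1.1, -0.2, 0.5, 1.4, 0.3]
rng = random.Random(0)
posterior_means = [
    weighted_mean(data, bayesian_bootstrap_weights(len(data), rng=rng))
    for _ in range(2000)
]
```

The spread of posterior_means reflects uncertainty about the true effect, which is the uncertainty the power estimate is meant to incorporate.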
- Parameters
X (array, shape (n_observations, p[, q][, r])) – The data from which to resample. X should contain the observations for one group or paired differences. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. See the documentation for the user-supplied test_func for more details.
test_func (function) – The one-sample sequential test you want to run a power analysis for. Must accept look_times, n_max, alpha, and verbose arguments and return results, the middle two of which are the p-values for each look and the adjusted alphas, respectively. This could be any user-facing function from niseq that ends in _1samp.
look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Not to exceed n_max.
n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max).
conditional (bool, default: False) – If True, performs a conditional power analysis; that is, computes the probability of a design rejecting the null hypothesis given that the data in X has already been collected and is included in the analysis, as in an adaptive design. If False (default), performs a prospective power analysis (e.g. if you’re using pilot data or data from another study to inform sample size planning for a study that hasn’t begun data collection).
n_simulations (int, default: 1024) – Number of bootstrap resamples/simulations to perform.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.
**test_func_kwargs – You may input any arguments you’d like to be passed to test_func.
- Returns
res – A results dictionary with keys:
    'uncorr_instantaneous_power' (list of float) – The power of a fixed-sample statistical test performed at each look.
    'rejection_probability' (list of float) – The probability that a sequential test rejects the null hypothesis (for the first time) at each look time.
    'cumulative_power' (list of float) – The power of a sequential test to reject the null hypothesis by each look time. res['cumulative_power'][-1] is the power of the full sequential procedure.
    'uncorr_cumulative_power' (list of float) – Cumulative power if the rejection threshold at each look were not corrected using alpha-spending (as it should be).
    'n_expected' (float) – The expected sample size for the sequential procedure.
    'n_simulations' (int) – The number of bootstrap resamples used.
    'n_orig_data' (int) – The sample size of the original data X, i.e. X.shape[0].
    'conditional' (bool) – Whether the power analysis that was run was conditional (True) or prospective (False).
    'test_func' (str) – Name of the sequential test function used.
    'test_func_kwargs' (dict) – A record of the arguments passed to the test function, including look_times and n_max.
- Return type
dict
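The relationship between several of these result keys can be made concrete with a short calculation. The sketch below assumes that data collection stops at the first rejection and otherwise continues to the final look, and that the final look equals n_max; the helper name and the probabilities are made up, and this is not how the package necessarily computes its estimates.

```python
from itertools import accumulate


def summarize_power(look_times, rejection_probability):
    """Derive cumulative power and expected sample size from the
    probability of first rejecting the null at each look."""
    cumulative_power = list(accumulate(rejection_probability))
    # Stopping early at look k happens with probability rejection_probability[k].
    p_stop_early = sum(rejection_probability[:-1])
    n_expected = sum(
        n * p for n, p in zip(look_times[:-1], rejection_probability[:-1])
    )
    # Everyone who has not stopped early reaches the final look.
    n_expected += look_times[-1] * (1 - p_stop_early)
    return cumulative_power, n_expected


cumulative_power, n_expected = summarize_power([20, 40, 60], [0.30, 0.35, 0.20])
```

With these hypothetical numbers the full procedure has a power of 0.85, while the expected sample size is well below the maximum of 60.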
Notes
The significance level of the test used is specified in test_func, and thus can be modified by passing an argument to test_func using **test_func_kwargs.
References
- 1
Rubin, D. B. (1981). The bayesian bootstrap. The annals of statistics, 130-134.
- 2
Tu, D. & Zheng, Z. (1987). The Edgeworth expansion for the random weighting method. Chinese J. Appl. Probability and Statist., 3, 340-347.
- 3
Shao, J., & Tu, D. (2012). The jackknife and bootstrap. Springer Science & Business Media.
- 4
Ruzzoli, M., Torralba, M., Fernández, L. M., & Soto-Faraco, S. (2019). The relevance of alpha phase in human perception. Cortex, 120, 249-268.
- 5
Lakens, D., & Evers, E. R. (2014). Sailing from the seas of chaos into the corridor of stability: Practical recommendations to increase the informational value of studies. Perspectives on psychological science, 9(3), 278-292.
Backend Functions¶
- niseq._permutation.find_thresholds(H0, look_times, max_n, alpha=0.05, tail=0, spending_func=None)¶
Given a permutation null distribution for a corresponding sequence of look times and an alpha spending function, computes the adjusted significance thresholds required to control the false positive rate across all looks.
This isn’t meant to be accessed directly by users, but it can be used together with generate_permutation_dist to create new sequential tests if you’re confident you know what you’re doing.
- Parameters
H0 (array of shape (n_permutations, n_looks)) – The joint permutation null distribution of the test statistic across look times.
look_times (list of int) – Sample sizes at which statistical test is applied to the data, in order. Not to exceed max_n.
n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.
alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max).
tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the mean of the data is greater than 0 (upper-tailed test). If tail is 0, the alternative hypothesis is that the mean of the data is different from 0 (two-tailed test). If tail is -1, the alternative hypothesis is that the mean of the data is less than 0 (lower-tailed test).
spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details and the niseq.spending_functions module for the provided spending functions.
- Returns
spending (array of shape (n_looks,)) – The value of the alpha spending function at each sample size in look_times.
adj_alphas (array of shape (n_looks,)) – The adjusted significance threshold against which to compare p-values at each sample size in look_times.
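One way to picture how adjusted thresholds can be derived from a joint permutation distribution is sketched below. It is an illustration of the general alpha-spending idea for an upper-tailed statistic, not a transcription of this function: the name spending_thresholds, the simulated null distribution, and the tie handling are all assumptions. At each look, the sketch lets just enough not-yet-rejected permutations cross the threshold to use up the cumulative allowance, then reports the marginal tail probability of that threshold as the adjusted alpha.

```python
import math
import random


def spending_thresholds(H0, spending):
    """H0: list of permutations, each a list with one statistic per look.
    spending: cumulative alpha allowed at each look (each value < 1)."""
    n_perm = len(H0)
    rejected = set()  # permutations that crossed a threshold at an earlier look
    thresholds, adj_alphas = [], []
    for k, spent in enumerate(spending):
        n_allowed = math.floor(spent * n_perm) - len(rejected)
        alive = sorted(
            (H0[i][k] for i in range(n_perm) if i not in rejected), reverse=True
        )
        # Strict ">" below means ties can only make the procedure conservative.
        thr = alive[n_allowed] if n_allowed > 0 else math.inf
        rejected |= {
            i for i in range(n_perm) if i not in rejected and H0[i][k] > thr
        }
        thresholds.append(thr)
        adj_alphas.append(sum(row[k] > thr for row in H0) / n_perm)
    return thresholds, adj_alphas, len(rejected) / n_perm


# Simulated joint null: statistics are correlated across three looks.
rng = random.Random(0)
H0 = []
for _ in range(1000):
    total, row = 0.0, []
    for k in range(1, 4):
        total += rng.gauss(0, 1)
        row.append(total / math.sqrt(k))
    H0.append(row)

spending = [0.05 / 3, 0.10 / 3, 0.05]  # linear spending at three equal looks
thresholds, adj_alphas, overall_rate = spending_thresholds(H0, spending)
```

In this sketch the share of null permutations rejected at any look never exceeds the final value of the spending function, which is the property the adjusted thresholds are meant to guarantee.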
- niseq._permutation.generate_permutation_dist(X, labels, look_times, n_permutations=1024, seed=None, n_jobs=None, statistic=<function _get_cluster_stats_samples>, verbose=True, **statistic_kwargs)¶
This function computes the test statistic and its permutation distribution at each look time. It isn’t meant for users to access directly for ordinary use, though it can be used in combination with find_thresholds to construct new sequential tests if you’re confident you know what you’re doing. You’ll want to read the source code carefully to make sure your statistic function is compatible.
- Parameters
X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected.
labels (array of shape (n_observations,) | None) – Either condition labels for each observation in X, a continuous dependent variable to correlate with X, or None. If None, a one-sample (sign flip) permutation scheme is used; otherwise an independent-sample (label shuffle) permutation scheme is used.
n_permutations (int, default: 1024) – Number of permutations.
seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.
n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.
verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.
statistic (callable(), default: _get_cluster_stats_samples) – The test statistic to compute on the data, e.g. a cluster statistic or a max-t statistic. The last value statistic returns must be the omnibus test statistic (e.g. the max-t or the cluster size), though you can return whatever other stuff you want, which will be passed through the obs dictionary.
**statistic_kwargs – You may pass arbitrary arguments to the statistic function.
- Returns
obs (dict) – The output of statistic indexed by look time in look_times.
H0 (array of shape (n_permutations, n_looks)) – The joint permutation null distribution of the test statistic across look times.
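To make the idea of a joint null distribution across looks concrete, here is a standard-library sketch of the one-sample (sign flip) case. It is not the package's code: the function names are hypothetical, the statistic is a simple max-|t| rather than a cluster statistic, and reusing one set of sign flips across all looks within a permutation is an assumption about what "joint" means here.

```python
import random
from statistics import mean, stdev


def max_abs_t(rows):
    """Largest absolute one-sample t statistic over all variables."""
    n = len(rows)
    return max(abs(mean(col)) / (stdev(col) / n ** 0.5) for col in zip(*rows))


def sign_flip_null(X, look_times, n_permutations=200, seed=0):
    """Observed statistics and joint sign-flip null across looks."""
    rng = random.Random(seed)
    n_total = max(look_times)
    H0 = []
    for _ in range(n_permutations):
        # One sign per observation, shared by every look in this permutation.
        signs = [rng.choice((-1, 1)) for _ in range(n_total)]
        flipped = [[s * v for v in row] for s, row in zip(signs, X)]
        H0.append([max_abs_t(flipped[:n]) for n in look_times])
    obs = {n: max_abs_t(X[:n]) for n in look_times}
    return obs, H0


# Simulated data: 30 observations of 3 variables with a true mean of 0.5.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.5, 1) for _ in range(3)] for _ in range(30)]
obs, H0 = sign_flip_null(X, look_times=[10, 20, 30])
```

Each row of H0 holds one permutation's statistic at every look, which is the (n_permutations, n_looks) layout that find_thresholds expects.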