.. _stats_decon2004:
******************************************************
**Some 3dDeconvolve Notes**
******************************************************
.. contents::
   :local:
Overview
++++++++
These are notes by Bob Cox, from the great ``3dDeconvolve`` Upgrades
of 2004 and 2007. This page contains notes on various features added
in that happy time.
Sections below will contain comments on underlying algorithms, option
usage, and other words of wisdom that might be useful.
The current help page for ``3dDeconvolve`` is
`here. `__
The original 2004 notes page is
`here. `__
The original 2007 notes page is
`here. `__
``3dDeconvolve`` Upgrades: Summer 2004
+++++++++++++++++++++++++++++++++++++++++++
.. _stats_decon2004_svd:
SVD for X-matrix pseudoinverse
==============================
Gaussian elimination on the normal equations of X has been replaced by
using the SVD to compute the X matrix pseudoinverse. This can be
disabled using the ``-nosvd`` option. If the SVD solution is on, then
all-zero ``-stim_file`` functions will NOT be removed from the
analysis (unlike the ``-nosvd`` analysis).
The reason for not removing the all-zero regressors is so that GLTs
and other scripts that require knowing where various results are in
the output can still function.
* One plausible case that can give all-zero regressors: a task with
"correct" and "incorrect" results, where you analyze these cases
separately, but some subjects are so good that they don't have any
incorrect responses.
The SVD solution will set the coefficient and statistics for an
all-zero regressor to zero.
If two identical (nonzero) regressors are input, the program will
complain but continue the analysis. In this case, each one would get
half the coefficient, which only seems fair. However, your
interpretation of such cases should be made with caution.
For those of you who aren't mathematicians: the SVD solution basically
creates the orthogonal principal components of the columns of the X
matrix (the baseline and stimulus regressors), and then uses those to
solve the linear system of equations in each voxel.
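For the curious, the idea can be sketched numerically (a NumPy illustration, not AFNI's internal code). Note how the all-zero column simply receives a zero coefficient:

```python
import numpy as np

# Toy X matrix: 8 time points, 3 regressors; column 2 is all zero
# (e.g., a response class that never occurred for this subject).
X = np.column_stack([np.ones(8),                  # constant baseline
                     np.arange(8, dtype=float),   # linear drift
                     np.zeros(8)])                # all-zero regressor
y = 2.0 + 0.5 * np.arange(8)                      # noiseless "voxel" data

# SVD-based pseudoinverse solution: beta = pinv(X) @ y
beta = np.linalg.pinv(X) @ y
print(beta)   # ~ [2.0, 0.5, 0.0] -- the all-zero column gets beta = 0
```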
.. _stats_decon2004_xmat_condition:
X-matrix condition numbers
==========================
The X matrix condition number is now computed and printed. This can be
disabled using the ``-nocond`` option. As a rough guide, if the matrix
condition number is about 10^p, then round off errors will cause about p
decimal places of accuracy to be lost in the calculation of the
regression parameter estimates. In double precision, a condition
number more than 10^7 would be worrying. In single precision, more than
1000 would be cause for concern. Note that if Gaussian elimination is
used, then the effective condition number is squared (twice as bad in
terms of lost decimal places); this is why the SVD solution was
implemented.
* The condition number is the ratio of the largest to the smallest
singular value of X. If Gaussian elimination is used (``-nosvd``; see
:ref:`here `), then this ratio is squared.
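This squaring is easy to see numerically (a NumPy sketch of a general linear-algebra fact, not AFNI code): the normal-equations matrix X\ :sup:`T`\X has the square of X's condition number.

```python
import numpy as np

# A mildly ill-conditioned toy design: constant, linear, quadratic drift.
t = np.arange(100, dtype=float)
X = np.column_stack([np.ones_like(t), t, t**2])

c_X   = np.linalg.cond(X)         # ratio of extreme singular values of X
c_XtX = np.linalg.cond(X.T @ X)   # what Gaussian elimination effectively sees

# cond(X'X) = cond(X)^2, so solving the normal equations loses roughly
# twice as many decimal places as working with X directly via the SVD.
print(c_XtX / c_X**2)   # ~ 1.0
```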
.. comment: this factoid no longer applies at all, because we don't
build+distribute 3dDeconvolve_f anymore
Use of ``3dDeconvolve_f`` (single precision program) now requires
"informed consent" from the user, indicated by putting the option
"-OK" first on the command line. This is because roundoff error can
cause big errors in single precision if the matrix condition number
is over 1000.
.. _stats_decon2004_xjpeg:
Option: ``-xjpeg``
=====================
The new ``-xjpeg filename`` option will save a JPEG image of the
columns of the regression matrix X into the given file.
.. list-table::
:header-rows: 1
:width: 40%
* - X-matrix from Bootcamp example
* - .. image:: media/X.jpg
:width: 100%
:align: center
* Each column is scaled separately, from white=minimum to black=maximum.
* Environment variable ``AFNI_XJPEG_COLOR`` determines the colors of
the lines drawn between the columns.
* The color format is ``rgbi:rf/gf/bf``, where each rf, gf and bf is
a number between 0.0 and 1.0 (inclusive).
* For example, yellow would be ``rgbi:1.0/1.0/0.0``.
* As a special case, if this value is the string ``none`` or ``NONE``,
then these lines will not be drawn.
* See here for getting color codes: https://rgbcolorpicker.com/0-1
(ignore the alpha value "a").
* Environment variable ``AFNI_XJPEG_IMXY`` determines the size of the
image saved via the ``-xjpeg`` option to ``3dDeconvolve``.
* It should be in the format **AxB**:
* ``A`` is the number of pixels the image is to be wide
(across the matrix rows)
* ``B`` is the number of pixels high (down the columns)
* For example:
.. code-block:: bash
setenv AFNI_XJPEG_IMXY 768x1024
* This example sets the x-size (horizontal) to 768 pixels and the y-size
(vertical) to 1024 pixels. These values are the default, by the way.
* If the first value ``A`` is negative and less than -1, its absolute value
is the number of pixels across PER COLUMN.
* If the second value ``B`` is negative, its absolute value is the number of
pixels down PER ROW.
* Usually there are many fewer columns than rows.
.. _stats_decon2004_warnings:
Warnings
========
``3dDeconvolve`` now checks for duplicate ``-stim_file`` names, and
duplicate matrix columns. Only warning messages are printed -- these
are not fatal errors (at least, if the SVD solution is on).
.. _stats_decon2004_mat_inputs:
Matrix Inputs
=============
Matrix inputs for the ``-glt`` option can now use a notation like
``30@0`` to indicate that 30 0s in a row are to be placed on the
line. For example, if you have 10 runs catenated together, and you
used ``-polort 2``, then there are 30 baseline parameters to skip
(usually) when specifying each GLT row.
The following is a sample matrix file with 34 entries per row:
+------+---+----+---+----+
| 30@0 | 1 | -1 | 0 | 0 |
+------+---+----+---+----+
| 30@0 | 0 | 0 | 1 | -1 |
+------+---+----+---+----+
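There is very little machinery behind this shorthand; a hypothetical helper (plain Python, not part of AFNI; ``expand_glt_row`` is a made-up name) shows the idea:

```python
# Hypothetical helper (not part of AFNI) that expands the "N@V" shorthand
# used in -glt matrix files: "30@0" -> thirty 0s in a row.
def expand_glt_row(line):
    row = []
    for tok in line.split():
        if "@" in tok:
            n, v = tok.split("@")
            row.extend([float(v)] * int(n))
        else:
            row.append(float(tok))
    return row

row = expand_glt_row("30@0 1 -1 0 0")
print(len(row))   # 34 entries, matching the sample matrix above
```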
.. _stats_decon2004_gltsym:
Option: ``-gltsym``
======================
The new ``-gltsym gltname`` option lets you describe the rows of a GLT
matrix using a symbolic notation.
* Each stimulus is symbolized by its ``-stim_label`` option.
* Each line in the ``gltname`` file corresponds to a row in the GLT
matrix.
* On each line should be a set of stimulus symbols, which can take the
following forms (using the label ``Stim`` as the exemplar):
.. list-table::
:widths: 20 80
:align: left
* - ``Stim``
- put +1 in the matrix row for each lag of ``Stim``
* - ``+Stim``
- put +1 in the matrix row for each lag of ``Stim`` (same as above)
* - ``-Stim``
- put -1 in the matrix row for each lag of ``Stim``
* - ``Stim[2..7]``
- put +1 in the matrix for lags 2..7 of ``Stim``
* - ``3*Stim[2..7]``
- put +3 in the matrix for lags 2..7 of ``Stim``
* - ``Stim[[2..4]]``
- put +1 in the matrix for lags 2..4 of ``Stim`` in 3 successive
rows of the matrix, as in:
+---+---+---+---+---+---+---+---+
| 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
\.\.\. whereas ``Stim[2..4]`` would yield one matrix row
+---+---+---+---+---+---+---+---+
| 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
There can be no spaces or ``*`` characters in the stimulus symbols;
each set of stimulus symbols on a row should be separated by one or
more spaces. For example, consider the two multi-lag regressors entered
with the options below:
.. code-block:: bash
-stim_label 1 Ear -stim_minlag 1 0 -stim_maxlag 1 5 \
-stim_label 2 Wax -stim_minlag 2 2 -stim_maxlag 2 7
This could have a GLT matrix row specified by:
.. code-block:: none
+Ear[2..5] -Wax[4..7]
This would translate into a matrix row like the following (zeros for the baseline):
.. code-block:: none
0 0 1 1 1 1 0 0 -1 -1 -1 -1
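The bookkeeping behind that translation can be sketched in plain Python (``lag_row`` is a hypothetical helper, not AFNI's parser). Recall that ``Ear`` occupies lags 0..5 and ``Wax`` occupies lags 2..7, 6 columns each:

```python
# Hypothetical sketch of how "+Ear[2..5] -Wax[4..7]" maps onto matrix
# columns, given the lag layout from the options above.
def lag_row(stim_lags, selections):
    """stim_lags: {label: (minlag, maxlag)} in column order;
    selections: list of (label, weight, lo_lag, hi_lag)."""
    row = []
    for label, (lo_all, hi_all) in stim_lags.items():
        col = [0] * (hi_all - lo_all + 1)
        for sel_label, w, lo, hi in selections:
            if sel_label == label:
                for lag in range(lo, hi + 1):
                    col[lag - lo_all] += w
        row.extend(col)
    return row

row = lag_row({"Ear": (0, 5), "Wax": (2, 7)},
              [("Ear", +1, 2, 5), ("Wax", -1, 4, 7)])
print(row)   # [0, 0, 1, 1, 1, 1, 0, 0, -1, -1, -1, -1]
```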
Some comments:
* With ``-gltsym``, you do not have to specify the number of rows on
the command line -- the program will determine that from the file.
* You can embed comment lines in the file -- these are lines that
start with the characters ``#`` or ``//``.
* If you want to access the polynomial baseline parameters for some
bizarre reason, you can use the symbolic name ``Ort``; otherwise,
the GLT matrix elements corresponding to these parameters will all
be set to 0, as in the example above.
* A GLT can be expressed directly on the command line with an option
of the form:
.. code-block::
-gltsym 'SYM: +Ear[2..5] -Wax[4..7]'
where the ``SYM:`` that starts the string indicates that the rest of
the string should be used to define the 1 row matrix. It is
important that this string be enclosed in forward single quotes, as
shown. If you want to have multiple rows specified, use the ``\``
character to mark the end of each row, as in:
.. code-block::
-gltsym 'SYM: +Ear[2..5] \ -Wax[4..7]'
* You probably want to use the ``-glt_label`` option with ``-gltsym``,
as with ``-glt``.
* If you want to have the matrices generated by ``-gltsym`` printed to
the screen, you can set environment variable ``AFNI_GLTSYM_PRINT``
to ``YES``.
.. _stats_decon2004_Legendre:
Legendre Polynomials
====================
Polynomial baseline functions now default to Legendre polynomials,
which are more pleasantly behaved than the older power baseline
functions. If you need the old power functions, you must use the
``-nolegendre`` option; this should only be the case if you use the
baseline parameter estimates for some purpose.
For each block of contiguous data, the time range from first to last
is scaled to the interval ``[-1,1]``. The standard Legendre
polynomials P\ :sub:`n`\ (x) are then entered as baseline regressors,
for ``n=0,1,...``
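As a sketch (plain Python, not AFNI code), the baseline columns for one run of contiguous data could be built like this, using the standard Bonnet recurrence for P\ :sub:`n`\ (x):

```python
# Sketch: Legendre baseline regressors for a single run.
def legendre_regressors(n_timepoints, degree):
    # scale time so the first point maps to -1 and the last to +1
    x = [2.0 * i / (n_timepoints - 1) - 1.0 for i in range(n_timepoints)]
    # Bonnet recurrence: (n+1) P_{n+1}(x) = (2n+1) x P_n(x) - n P_{n-1}(x)
    cols = [[1.0] * n_timepoints, x[:]]
    for n in range(1, degree):
        cols.append([((2 * n + 1) * xi * cols[n][i] - n * cols[n - 1][i]) / (n + 1)
                     for i, xi in enumerate(x)])
    return cols[: degree + 1]

cols = legendre_regressors(101, 3)   # P_0 .. P_3 sampled at 101 points
print(cols[2][50])   # P_2(0) = (3*0^2 - 1)/2 = -0.5 at the run midpoint
```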
.. _stats_decon2004_cbucket:
Option: ``-cbucket``
========================
You can save ONLY the estimated parameters (AKA regression coefficients) for
each voxel into a dataset with the new ``-cbucket cprefix`` option. This may be
useful if you want to do some calculations with these estimates; you won't have
to extract them from the various statistics that are stored in the output of the
``-bucket bprefix`` option.
.. _stats_decon2004_xsave:
Option: ``-xsave``
===================
In combination with the old ``-bucket bprefix`` option, the new
``-xsave`` option saves the X matrix (and some other information) into
file ``bprefix.xsave``. Use this option when you first run
``3dDeconvolve``, if you think you might want to run some extra GLTs
later, using the ``-xrestore`` option (below) -- this is usually much
faster than running the whole analysis over from scratch.
.. _stats_decon2004_xrestore:
Option: ``-xrestore``
========================
The new ``-xrestore filename.xsave`` option will read the ``-xsave``
file and allow you to carry out extra GLTs after the first
``3dDeconvolve`` run. When you use ``-xrestore``, the only other
options that have effect are ``-glt``, ``-glt_label``, ``-gltsym``,
``-num_glt``, ``-fout``, ``-tout``, ``-rout``, ``-quiet``, and
``-bucket``. All other options on the command line will be ignored
(silently). The original time series dataset (from ``-input``) is
named in the ``-xsave`` file, and must be present for ``-xrestore`` to
work. If the parameter estimates were saved in the original
``-bucket`` or ``-cbucket`` dataset, they will also be read;
otherwise, the estimates will be re-computed from the voxel time
series as needed. The new output sub-bricks from the new ``-glt``
options will be stored as follows:
* If no ``-bucket`` option is given in the ``-xrestore`` run, the new
sub-bricks will be appended to the end of the original ``-bucket``
dataset.
* If ``-bucket bbb`` is given in the ``-xrestore`` run, the new
sub-bricks will be stored in a dataset with prefix "bbb", which will be
created if necessary; if "bbb" already exists, the new sub-bricks will
be appended to this dataset.
.. _stats_decon2004_input:
Option: ``-input``
====================
The ``-input`` option now allows input of multiple 3D+time datasets, as in:
.. code-block:: none
-input fred+orig ethel+orig lucy+orig ricky+orig
Each command line argument after ``-input`` that does NOT start with a
``-`` character is taken to be a new dataset. These datasets will be
catenated together in time (internally) to form one big dataset. Other
notes:
* You must still provide regressors that are the full length of the catenated
imaging runs; the program will NOT catenate files for the ``-input1D``,
``-stim_file``, or ``-censor`` options.
* If this capability is used, the ``-concat`` option will be ignored, and the
program will use time breakpoints corresponding to the start of each dataset
from the command line.
.. _stats_decon2004_progress:
Progress Meter
==============
Unless you use the ``-quiet`` option, ``3dDeconvolve`` now prints a
"progress meter" while it runs. When it is done, this will look as
below, where each digit is printed as each 2% of the voxels is completed.
.. code-block::
++ voxel loop:0123456789.0123456789.0123456789.0123456789.0123456789.
.. _stats_decon2004_stim_times:
Option: ``-stim_times``
============================
Direct input of stimulus timing, plus generation of a response model,
with the new ``-stim_times`` option:
.. code-block::
-stim_times k tname rtype
k and tname
-----------
``k`` is the stimulus index (from 1 to the ``-num_stimts`` value).
``tname`` is the name of the file that contains the stimulus times (in units
of seconds, as in the TR of the ``-input`` file). There are two formats for
this file.
1. A single column of numbers, in which case each time is relative to the start
of the first imaging run ("global times").
2. If there are ``R`` runs catenated together (either directly on the
command line, or as represented in the ``-concat`` option), the
second format is to give the times within each run separately. In
this format, the input file tname would have ``R`` rows, one per
run; the times for each run take up one row. For example, with R=2:
.. code-block::
12.3 19.8 23.7 29.2 39.8 52.7 66.6
21.8 32.7 41.9 55.5
These times will be converted to global times by the program, by
adding the time offset for each imaging run.
N.B.: The times are relative to the start of the data time series
as input to ``3dDeconvolve``. If the first few points of each
imaging run have been cut off, then the actual stimulus times must
be adjusted correspondingly (e.g., if 2 time points were excised
with TR=2.5, then the actual stimulus times should be reduced by
5.0 before being input to ``3dDeconvolve``).
3. When using the multi-row input style, you may have the situation
where the particular class of stimulus does not occur at all in a
given imaging run. To encode this, the corresponding row of the
timing file should consist of a single ``*`` character; for
example, if there are 4 imaging runs but the kth stimulus only
occurs in runs 2 and 4, then the ``tname`` file would look
something like this:
.. code-block::
*
3.2 7.9 18.2 21.3
*
8.3 17.5 22.2
4. In the situation where you are using multi-row input AND there is
at most one actual stimulus per run, then you might think that the
correct input would be something like:
.. code-block::
*
*
30
*
**However, this will be confused with the 1 column format, which
means global times, and so this is wrong. Instead, you can put an
extra \* on one line with an actual stimulus, and then things will
work OK:**
.. code-block::
*
*
30 *
*
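The run-offset bookkeeping behind format 2 can be sketched as follows (plain Python, not AFNI code; equal-length runs and the TR value are assumptions of this sketch):

```python
# Sketch: convert per-run ("local") stimulus times to global times
# by adding each run's starting offset.
def local_to_global(rows, n_per_run, tr):
    out = []
    for run, row in enumerate(rows):
        offset = run * n_per_run * tr   # start time of this run
        out.extend(offset + t for t in row)
    return out

rows = [[12.3, 19.8], [], [8.3, 17.5]]   # middle run has no events ('*')
print(local_to_global(rows, n_per_run=100, tr=2.5))
```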
rtype
-----
.. comment:
This allows you to play the game R-Type originally released in
arcades back in
1987. `See here. `__.
This is not to be confused with the ``Type R`` which is the
performance editions of certain Honda models. `See
here. `__.
All joking aside,
``rtype`` specifies the type of response model that is to follow each
stimulus. The following formats for ``rtype`` are recognized, **along
with the more modern types described in the current version of the
program help** `here
`__.
1. ``'GAM'`` is the response function :math:`h_G(t;b,c) =
(t/(bc))^b\,\exp(b-t/c)` for the Cohen parameters :math:`b=8.6,
c=0.547`. This function peaks at the value 1 at :math:`t=bc`, and
is the same as the output of ``waver -GAM``. See `here for waver
`__.
.. list-table::
:widths: 50 50
:header-rows: 1
* - ``GAM`` output from ``-xjpeg``
- ``GAM`` output from ``1dplot``
* - .. image:: media/GAM_x.jpg
:width: 50%
:align: center
- .. image:: media/GAM_1d.jpg
:width: 90%
:align: center
Plot generated with:
.. code-block:: bash
3dDeconvolve \
-nodata 200 1.0 -num_stimts 1 -polort -1 -xjpeg gam_x.jpg \
-local_times -x1D stdout: \
-stim_times 1 '1D: 10 60 110 170' 'GAM' \
| 1dplot -THICK -one -stdin -xlabel Time -jpg GAM_1d.jpg \
-DAFNI_1DPLOT_COLOR_01=red
----
|
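The ``GAM`` formula above is easy to check numerically (plain Python, not AFNI code): at t = b·c the function equals 1, its peak value.

```python
import math

# h_G(t; b, c) = (t/(b*c))**b * exp(b - t/c), from the formula above,
# with the Cohen defaults b = 8.6, c = 0.547.
def gam(t, b=8.6, c=0.547):
    return (t / (b * c)) ** b * math.exp(b - t / c) if t > 0 else 0.0

peak_t = 8.6 * 0.547                 # the peak is at t = b*c
print(round(gam(peak_t), 6))         # 1.0
```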
2. ``'GAM(b,c)'`` is the same response function as above, but where you give the
'b' and 'c' values explicitly. The ``GAM`` response models have 1 regression
parameter per voxel (the amplitude of the response).
.. list-table::
:widths: 50 50
:header-rows: 1
* - ``GAM(b,c)`` output from ``-xjpeg``
- ``GAM(b,c)`` output from ``1dplot``
* - .. image:: media/GAMbc_x.jpg
:width: 50%
:align: center
- .. image:: media/GAMbc_1d.jpg
:width: 90%
:align: center
Plot generated with:
.. code-block:: bash
3dDeconvolve \
-nodata 200 1.0 -num_stimts 1 -polort -1 -xjpeg GAMbc_x.jpg \
-local_times -x1D stdout: \
-stim_times 1 '1D: 10 60 110 170' 'GAM(10,2)' \
| 1dplot -THICK -one -stdin -xlabel Time -jpg GAMbc_1d.jpg \
-DAFNI_1DPLOT_COLOR_01=red
----
|
3. ``'SPMG2'`` is the SPM gamma variate regression model, which has 2 regression
parameters per voxel. The basis functions are:
* h\ :sub:`SPM,1`\(t) = exp(-t) [ t\ :sup:`5`\/120 - t\ :sup:`15`\/(6*15!) ]
* h\ :sub:`SPM,2`\(t) = d/dt [ h\ :sub:`SPM,1`\(t) ]
.. list-table::
:widths: 50 50
:header-rows: 1
* - ``SPMG2`` output from ``-xjpeg``
- ``SPMG2`` output from ``1dplot``
* - .. image:: media/SPMG2_x.jpg
:width: 50%
:align: center
- .. image:: media/SPMG2_1d.jpg
:width: 90%
:align: center
Plot generated with:
.. code-block:: bash
3dDeconvolve \
-nodata 200 1.0 -num_stimts 1 -polort -1 -xjpeg SPMG2_x.jpg \
-local_times -x1D stdout: \
-stim_times 1 '1D: 10 60 110 170' 'SPMG2' \
| 1dplot -THICK -one -stdin -xlabel Time -jpg SPMG2_1d.jpg
----
4. ``'TENT(b,c,n)'`` is a tent function deconvolution model, ranging between
times ``s+b`` and ``s+c`` after each stimulus time ``s``, with n basis
functions (and n regression parameters per voxel).
* A 'tent' function is just the colloquial term for a 'linear B-spline';
that is, tent(x) = max( 0 , 1-\|x\| ).
* A 'tent' function model for the hemodynamic response function is
the same as modeling the HRF as a continuous piecewise linear
function. Here, the input 'n' is the number of straight-line
pieces.
.. list-table::
:widths: 50 50
:header-rows: 1
* - ``TENT(b,c,n)`` output from ``-xjpeg``
- ``TENT(b,c,n)`` output from ``1dplot``
* - .. image:: media/TENT_x.jpg
:width: 50%
:align: center
- .. image:: media/TENT_1d.jpg
:width: 90%
:align: center
Plot generated with:
.. code-block:: bash
3dDeconvolve \
-nodata 200 1.0 -num_stimts 1 -polort -1 -xjpeg TENT_x.jpg \
-local_times -x1D stdout: \
-stim_times 1 '1D: 10 60 110 170' 'TENT(3,30,3)' \
| 1dplot -thick -one -stdin -xlabel Time -jpg TENT_1d.jpg
----
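The ``TENT`` bookkeeping above can be sketched in plain Python (not AFNI code; the layout of n tents centered on evenly spaced knots between b and c is the standard reading of the option):

```python
# tent(x) = max(0, 1 - |x|), the linear B-spline from the notes above.
def tent(x):
    return max(0.0, 1.0 - abs(x))

# Values of the n TENT(b,c,n) basis functions at time t after a stimulus,
# assuming knots evenly spaced between b and c.
def tent_basis(t, b, c, n):
    dt = (c - b) / (n - 1)                     # knot spacing
    return [tent((t - (b + q * dt)) / dt) for q in range(n)]

# At a knot, exactly one tent equals 1 and the others are 0, so each
# beta estimate is just the HRF value at its knot.
print(tent_basis(16.5, 3, 30, 3))   # [0.0, 1.0, 0.0]
```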
5. ``'CSPLIN(b,c,n)'`` is a cubic spline deconvolution model; similar to the
``TENT`` model, but where smooth cubic splines replace the non-smooth tent
functions.
.. list-table::
:widths: 50 50
:header-rows: 1
* - ``CSPLIN(b,c,n)`` output from ``-xjpeg``
- ``CSPLIN(b,c,n)`` output from ``1dplot``
* - .. image:: media/CSPLIN_x.jpg
:width: 50%
:align: center
- .. image:: media/CSPLIN_1d.jpg
:width: 90%
:align: center
Plot generated with:
.. code-block:: bash
3dDeconvolve \
-nodata 200 1.0 -num_stimts 1 -polort -1 -xjpeg CSPLIN_x.jpg \
-local_times -x1D stdout: \
-stim_times 1 '1D: 10 60 110 170' 'CSPLIN(1,30,4)' \
| 1dplot -thick -one -stdin -xlabel Time -jpg CSPLIN_1d.jpg
----
6. ``'SIN(b,c,n)'`` is a sin() function deconvolution model, ranging between
times s+b and s+c after each stimulus time s, with n basis functions (and n
regression parameters per voxel). The qth basis function, for q=1..n, is h\
:sub:`SIN,q`\(t) = sin(qπ(t-b)/(c-b)).
.. list-table::
:widths: 50 50
:header-rows: 1
* - ``SIN(b,c,n)`` output from ``-xjpeg``
- ``SIN(b,c,n)`` output from ``1dplot``
* - .. image:: media/SIN_x.jpg
:width: 50%
:align: center
- .. image:: media/SIN_1d.jpg
:width: 90%
:align: center
Plot generated with:
.. code-block:: bash
3dDeconvolve \
-nodata 200 1.0 -num_stimts 1 -polort -1 -xjpeg SIN_x.jpg \
-local_times -x1D stdout: \
-stim_times 1 '1D: 10 60 110 170' 'SIN(1,30,2)' \
| 1dplot -thick -one -stdin -xlabel Time -jpg SIN_1d.jpg
----
7. ``'POLY(b,c,n)'`` is a polynomial function deconvolution model,
ranging between times s+b and s+c after each stimulus time s, with
n basis functions (and n regression parameters per voxel). The qth
basis function, for q=1..n, is h\ :sub:`POLY,q`\(t) = P\
:sub:`q`\(2(t-b)/(c-b)-1) where P\ :sub:`q`\(x) is the qth Legendre
polynomial.
.. list-table::
:widths: 50 50
:header-rows: 1
* - ``POLY(b,c,n)`` output from ``-xjpeg``
- ``POLY(b,c,n)`` output from ``1dplot``
* - .. image:: media/POLY_x.jpg
:width: 50%
:align: center
- .. image:: media/POLY_1d.jpg
:width: 90%
:align: center
Plot generated with:
.. code-block:: bash
3dDeconvolve \
-nodata 200 1.0 -num_stimts 1 -polort -1 -xjpeg POLY_x.jpg \
-local_times -x1D stdout: \
-stim_times 1 '1D: 10 60 110 170' 'POLY(1,30,3)' \
| 1dplot -thick -one -stdin -xlabel Time -jpg POLY_1d.jpg
----
8. ``'BLOCK(d,p)'`` is a block stimulus of duration ``d`` starting at
each stimulus time.
* The basis block response function is the convolution of a gamma
variate response function with a 'tophat' function: :math:`H(t) =
\int_0^{min(t,d)} h(t-s) ds`, where :math:`h(t) =
(t/4)^4\,\exp(4-t)`; :math:`h(t)` peaks at :math:`t=4`, with
:math:`h(4)=1`, whereas :math:`H(t)` peaks at
:math:`t=d/(1-\exp(-d/4))`. Note that the peak value of
:math:`H(t)` depends on 'd'; call this peak value
:math:`H_{peak}(d)`.
* ``'BLOCK(d)'`` means that the response function to a stimulus at
time s is :math:`H(t-s)` for :math:`t=s..s+d+15`.
* ``'BLOCK(d,p)'`` means that the response function to a stimulus
at time *s* is :math:`p\cdot H(t-s)/H_{peak}(d)` for
:math:`t=s..s+d+15`. That is, the response is rescaled so that
the peak value of the entire block is 'p' rather than
:math:`H_{peak}(d)`. For most purposes, the best value would be
:math:`p=1`.
* ``'BLOCK'`` is a 1 parameter model (the amplitude).
.. list-table::
:widths: 50 50
:header-rows: 1
* - ``BLOCK(d,p)`` output from ``-xjpeg``
- ``BLOCK(d,p)`` output from ``1dplot``
* - .. image:: media/BLOCK_x.jpg
:width: 50%
:align: center
- .. image:: media/BLOCK_1d.jpg
:width: 90%
:align: center
Plot generated with:
.. code-block:: bash
3dDeconvolve \
-nodata 200 1.0 -num_stimts 1 -polort -1 -xjpeg BLOCK_x.jpg \
-local_times -x1D stdout: \
-stim_times 1 '1D: 10 60 110 170' 'BLOCK(20,1)' \
| 1dplot -thick -one -stdin -xlabel Time -jpg BLOCK_1d.jpg \
-DAFNI_1DPLOT_COLOR_01=red
----
|
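The stated peak location of :math:`H(t)` can be verified numerically (plain Python with midpoint-rule integration; a sketch, not AFNI code):

```python
import math

# h(t) = (t/4)^4 * exp(4 - t), the gamma variate from the BLOCK notes above.
def h(t):
    return (t / 4.0) ** 4 * math.exp(4.0 - t) if t > 0 else 0.0

# H(t) = integral of h(u) du over [max(0, t-d), t], via the midpoint rule.
def H(t, d, dt=0.01):
    lo = max(0.0, t - d)
    n = int(round((t - lo) / dt))
    return sum(h(lo + (k + 0.5) * dt) for k in range(n)) * dt

d = 10.0
t_formula = d / (1.0 - math.exp(-d / 4.0))     # stated peak, ~ 10.894
ts = [10.0 + 0.01 * i for i in range(200)]     # grid search t in [10, 12)
t_numeric = max(ts, key=lambda t: H(t, d))
print(abs(t_numeric - t_formula) < 0.05)       # True
```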
9. ``'EXPR(b,c) exp1 exp2 ...'`` is a set of user-defined basis
functions, ranging between times s+b and s+c after each stimulus
time s. The expressions are given using the syntax of ``3dcalc``,
and can use the symbolic variables:
* ``'t'`` = time from stimulus
* ``'x'`` = t scaled to range from 0 to 1 over the b..c interval
* ``'z'`` = t scaled to range from -1 to 1 over the b..c interval
* An example, which is equivalent to ``'SIN(0,35,3)'``, is ``'EXPR(0,35)
sin(PI*x) sin(2*PI*x) sin(3*PI*x)'``. Expressions are separated by blanks,
and must not contain whitespace themselves. An expression must use at least
one of the symbols 't', 'x', or 'z', unless the entire expression is the
single character "1".
----
The basis functions defined above are not normalized in any particular way. The
``-basis_normall`` option can be used to specify that each basis function be
scaled so that its peak absolute value is a constant. For example
``-basis_normall 1`` will scale each function to have amplitude 1. Note that
this scaling is actually done on a very fine grid over the entire domain of t
values for the function, and so the exact peak value may not be reached on any
given point in the actual FMRI time series.
* Note that it is the basis function that is normalized, *not* the convolution
of the basis function with the stimulus timing!
* The ``-basis_normall`` option must be given *before* any ``-stim_times``
options to which you want it applied!
If you use a ``-iresp`` option to output the hemodynamic (impulse) response
function corresponding to a ``-stim_times`` option, this function will be
sampled at the rate given by the new ``-TR_times`` dt option. The default value
is the TR of the input dataset, but you may wish to plot it at a higher time
resolution. (The same remarks apply to the ``-sresp`` option.)
Since the parameters in most models do not correspond directly to amplitudes of
the response, care must be taken when using GLTs with these.
* The parameters for ``GAM``, ``TENT``, ``CSPLIN``, and ``BLOCK`` do correspond
directly to FMRI signal change amplitudes.
* **I NEED TO THINK THIS THROUGH SOME MORE** (Says Bob)
Next to be implemented (someday): an option to compute areas under the curve
from the basis-function derived HRFs.
-----
More changes are on the way - RWCox - 22 Sep 2004 - Bilbo and Frodo Baggins'
birthday!
-----
The ``-nodata`` option now works with the ``-stim_times`` option.
* However, since ``-stim_times`` needs to know the number of time points (NT)
and the time spacing (TR), you have to supply these values after the
``-nodata`` option if you are using ``-stim_times``.
* For example: ``-nodata 114 2.5`` to indicate 114 points in time with a spacing
of 2.5 s.
.. _stats_decon2007:
``3dDeconvolve`` Updates: 2007
+++++++++++++++++++++++++++++++++++++++
.. _stats_decon2007_small:
Small changes to the defaults, new options, *etc*.
===================================================
* ``-nobout`` and ``-full_first`` are now the defaults. These changes
mean that if you *want* the :math:`\beta` weights for the baseline
parameters in the output ``-bucket`` dataset, you have to
specify ``-bout`` on the command line. If you *want* the full-model
statistics to appear last in the dataset, you have to specify
``-nofull_first`` on the command line.
|
* Even if you do not give the ``-fout`` option on the command line
(indicating you do *not* want *F*-statistics for various hypotheses
to be calculated), the program will still compute the full model
*F*-statistics. If you don't want that for some reason, you have to
use the new ``-nofullf_atall`` option.
|
* If you do not give a ``-bucket`` option on the command line, then
the program will act as if you had given ``-bucket Decon``. (This is
known as the "Ah need a bucket" change, with apologies to KFC.)
|
* The program now *always* outputs (to a file) the regression matrix
**X**, even if you don't give a ``-x1D`` option. The default
filename will be the same as the ``-bucket`` prefix, with the suffix
``.x1D`` added.
* The matrix file format has been slightly altered to store column
labels in XML-style comments in the header. (Previously, the
matrix was just written out as an array of unlabeled numbers.)
These labels will be useful in an upcoming regression matrix
analysis program being planned by Ziad Saad. They are also useful
in the new program ``3dSynthesize`` (cf. *infra*).
|
* ``3dDeconvolve`` used to fail with the ``-nodata`` option combined with
``-stim_times``. This crash should be a thing of the past.
* When using ``-nodata``, the program needs to know the length of
the (non-existent) imaging data (number of TRs) and it also needs
to know the TR. The simplest and best way to specify these values
is to put them immediately after the ``-nodata`` option; for
example ``-nodata 300 2.5`` to indicate 300 time points with
TR=2.5 s.
* If you don't do the above, then if you use ``-nlast``, that value
(+1) will be used as the number of TRs. If you don't give the TR
in some way, then the default ``-nodata`` TR is 1.0 s. This TR is
unimportant if you only use ``-stim_file``, but is crucial if you
use ``-stim_times`` with ``-nodata`` or with ``-input1D``.
|
* New option ``-float`` (or ``-datum float``) can be used to make all
the output datasets be stored in floating point format. In the past,
only scaled shorts were possible, and the limited (16-bit) precision
of these sometimes caused problems. Shorts are still the default,
but at some point in the future I may change the default to floats --
if/when this happens, the option ``-short`` can be used if you like
the more compact format.
|
* The program now reports when ``-stim_times`` time values are out of
the time span of the dataset. These are not fatal errors, but can
help alert you to potential problems with your timing files. (This
problem is known as the PSFB syndrome -- it's not as bad as the Mike
Beauchamp syndrome, but try to avoid it.)
|
* The labels for the ``-bucket`` output dataset sub-bricks have been
changed slightly to be more consistent and readable (e.g., ``Tstat``
instead of ``t-st`` to indicate a *t*-statistic).
|
* ``3dDeconvolve`` now computes a recommended ``-polort`` value (1
degree for every 150 s of continuous imaging). If your input value
is less than this, a non-fatal WARNING message is printed. If you
use ``-polort A``, then the program will automatically choose the
polynomial degree to use for detrending (AKA high pass filtering).
|
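A sketch of that rule of thumb (plain Python; the exact rounding ``3dDeconvolve`` uses is an assumption of this sketch, not a quote of its formula):

```python
import math

# Recommended -polort: roughly 1 degree per 150 s of continuous imaging.
# The floor() and the +1 for the constant/linear terms are assumptions
# of this sketch, not 3dDeconvolve's exact internal formula.
def recommended_polort(run_duration_s):
    return 1 + math.floor(run_duration_s / 150.0)

print(recommended_polort(300.0))   # 3
print(recommended_polort(120.0))   # 1
```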
* A new ``CSPLIN()`` model for ``-stim_times`` is now available. This function
is a drop-in replacement for ``TENT()``, with the same 3 arguments. The basis
functions are cardinal cubic splines, rather than cardinal linear splines.
``CSPLIN()`` will produce smoother looking HRF curves, if plotted with
``-TR_times`` less than the dataset TR. (As always, if you are going to change
your analysis methodology, run some data the old way and the new way, then
compare the results to make sure you understand what is happening!)
.. _stats_decon2007_goforit:
Detection and processing of matrix errors; new ``-GOFORIT`` option
==================================================================
* ``3dDeconvolve`` now makes several more checks for "bad things" in the
regression matrix.
* Besides checking the full matrix condition number, it also checks several
sub-matrices: the signal sub-model, the baseline sub-model, the ``-polort``
sub-model, and the ``-stim_base`` sub-model.
* Each check is printed out and labeled as to how good the program "thinks" it
is. Potentially bad values are flagged with ``** BEWARE **``.
* **N.B.**: ``3dDeconvolve``'s condition number is *not* exactly the same as
that computed by Matlab. ``3dDeconvolve`` first scales the matrix columns to
have L\ :sup:`2`\-norm = 1, and then computes the condition number from the
ratio of the extreme singular values of *that* matrix. This method prevents
the pathology of saying that the matrix diag(1,10\ :sup:`-6`\) is
ill-conditioned.
* Other "bad things" that the program checks for include duplicate stimulus
filenames, duplicate regression matrix columns, and all zero matrix columns.
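The column-scaled condition number described above can be sketched with NumPy (not AFNI's implementation), including the diag(1, 10\ :sup:`-6`\) example:

```python
import numpy as np

# Condition number after scaling every column to unit L2 norm,
# as described in the note above (a sketch, not AFNI's code).
def scaled_cond(X):
    return np.linalg.cond(X / np.linalg.norm(X, axis=0))

X = np.diag([1.0, 1e-6])
print(np.linalg.cond(X))   # ~ 1e6: "ill-conditioned" by the naive measure
print(scaled_cond(X))      # 1.0: the scaling removes the false alarm
```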
|
* If "bad things" are detected in the matrix (each will be flagged in the text
printout with a warning message containing the symbols '!!'), then
3dDeconvolve will not carry out the regression analysis. However, if you give
the command line option ``-GOFORIT``, then the program will proceed with the
analysis. I *strongly* recommend that you **understand** the reason for the
problem(s), and don't just blindly use ``-GOFORIT`` all the time.
|
* To help disentangle the ``ERROR`` and ``WARNING`` messages (if any) from the
rest of the text output, they are now also output to a file named
``3dDeconvolve.err``.
.. _stats_decon2007_censor:
New option for censoring time points
====================================
* The ``-CENSORTR`` option lets you specify on the command line time points to
be removed from the analysis. It is followed by a list of strings; each string
is of one of the following forms:
.. list-table::
:widths: 20 80
:align: left
* - ``37``
- remove global time index #37
* - ``2:37``
- remove time index #37 in run #2
* - ``37..47``
- remove global time indexes #37-47
* - ``37-47``
- same as above
* - ``2:37..47``
- remove time indexes #37-47 in run #2
* - ``'*:0-2'``
- remove time indexes #0-2 in all runs
* Time indexes within each run start at 0.
* Run indexes start at 1 (just to be confusing, and also to be compatible with
  ``afni_proc.py``).
* Multiple ``-CENSORTR`` options may be used, or multiple ``-CENSORTR`` strings
can be given at once, separated by spaces or commas.
* **N.B.**: Under the above rules, ``2:37,47`` means index #37 in run #2 and
then global time index #47; it does not mean index #37 in run #2 and then
index #47 in run #2. To help catch this possible misuse, the program will
print a warning message if you use some ``-CENSORTR`` strings with run numbers
and some without run numbers.
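Since the censor-string syntax is easy to mis-read, here is a small Python sketch of the expansion rules (a hypothetical re-implementation for illustration, not the program's own parser):

```python
def parse_censortr(spec, n_runs):
    # Expand one -CENSORTR string into a set of (run, index) pairs.
    # run=None means a global time index; '*' means every run.
    # Illustrative sketch only -- not 3dDeconvolve's own parser.
    out = set()
    for tok in spec.replace(',', ' ').split():
        run_s, _, idx_s = tok.rpartition(':')
        lo_s, _, hi_s = idx_s.replace('..', '-').partition('-')
        lo, hi = int(lo_s), int(hi_s) if hi_s else int(lo_s)
        if run_s == '':
            runs = [None]                       # global index
        elif run_s == '*':
            runs = range(1, n_runs + 1)         # all runs (runs start at 1)
        else:
            runs = [int(run_s)]
        out.update((r, i) for r in runs for i in range(lo, hi + 1))
    return out

# '2:37,47' means index 37 in run 2 plus GLOBAL index 47 -- the pitfall
# flagged above:
# parse_censortr('2:37,47', n_runs=3) -> {(2, 37), (None, 47)}
```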
.. _stats_decon2007_3dSynthesize:
New program ``3dSynthesize`` for creating 3D+time datasets
==========================================================
* This program combines the :math:`\beta` weights stored in the ``-cbucket`` output from
``3dDeconvolve``, and the regression matrix time series stored in the ``-x1D``
output, to produce model fit time series datasets. ``3dDeconvolve`` itself has
the ``-fitts`` option to produce the full model fit in each voxel.
``3dSynthesize`` can be used to produce model fits from subsets of the full
model.
|
* In the examples below, suppose that ``fred+orig`` is the output from
``-cbucket`` and that ``fred.x1D`` is the output from ``-x1D``. Also suppose
that there were two stimulus classes, given labels ``Face`` and ``House`` in
``3dDeconvolve`` using ``-stim_label`` options.
* Baseline sub-model:
.. code-block:: bash
3dSynthesize \
-cbucket fred+orig -matrix fred.x1D \
-select baseline -prefix fred_baseline
  For example, you could subtract ``fred_baseline+orig`` from the FMRI data
time series, using ``3dcalc``, to get a signal+noise dataset with no
baseline. This combination of programs would be one way to detrend a
multi-run dataset in a logically consistent fashion.
* Baseline plus ``Face`` stimulus sub-model (but not the ``House`` stimulus):
.. code-block:: bash
3dSynthesize \
-cbucket fred+orig -matrix fred.x1D \
     -select baseline Face -prefix fred_Face
* Baseline plus ``House`` stimulus sub-model (but not the ``Face`` stimulus):
.. code-block:: bash
3dSynthesize \
-cbucket fred+orig -matrix fred.x1D \
     -select baseline House -prefix fred_House
* In general, if you want to "Double Plot" the resulting dataset on top of the
original time series dataset (with the ``Dataset #N`` plugin), you'll need the
baseline model component so that the ``3dSynthesize`` output is on the same
magnitude scale for graphing.
* For further details, see the ``-help`` output from ``3dSynthesize``:
available `here
`__.
|
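Conceptually, what ``3dSynthesize`` computes at each voxel is just the matrix-vector product of the selected regressor columns with the corresponding :math:`\beta` weights. A toy NumPy sketch (the matrix, labels, and numbers here are all made up for illustration):

```python
import numpy as np

def synthesize(X, beta, select):
    # Fit from a subset of regressors: sum over the selected columns j
    # of beta_j * X[:, j].  Conceptual analogue of 3dSynthesize -select.
    sel = list(select)
    return X[:, sel] @ beta[sel]

# Toy matrix: column 0 = baseline, column 1 = 'Face', column 2 = 'House'
X = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 1, 1]], dtype=float)
beta = np.array([10.0, 2.0, -1.0])

baseline_fit = synthesize(X, beta, [0])     # like -select baseline
face_fit     = synthesize(X, beta, [0, 1])  # like -select baseline Face
```

The real program does this voxel-by-voxel, taking the :math:`\beta` weights from the ``-cbucket`` dataset and the columns from the ``-x1D`` matrix file.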
* **[25 Jun 2007] Censoring**
* ``3dDeconvolve`` and ``3dSynthesize`` have been modified to work when the
  ``3dDeconvolve`` run used a time point censoring option (i.e., ``-censor``
and/or ``-CENSORTR``). The matrix files output by ``3dDeconvolve`` (which
files are now renamed to end in ``.xmat.1D``) have information about which
time points were censored. ``3dSynthesize`` can use that information to
generate sub-bricks to fill in those time points which are missing in the
actual matrix. The options are:
.. list-table::
:widths: 20 80
:align: left
* - ``-cenfill zero``
     - fill censored time points with zeros [the new default]
* - ``-cenfill nbhr``
- fill censored time points with the average of their non-censored time
neighbors
* - ``-cenfill none``
     - don't put sub-bricks in for censored time points [what the program
used to do]
Another option is to use the new ``-x1D_uncensored filename`` option in
``3dDeconvolve`` to output an uncensored version of the regression matrix,
  then use that matrix as the input to ``3dSynthesize``. Then the model fit
that you choose will be computed at all the time points.
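The two fill modes can be sketched in a few lines of Python (an illustrative analogue of ``-cenfill``, not the program's code):

```python
import numpy as np

def cenfill(fit, censored, mode='zero'):
    # Re-expand a fitted time series onto the full time grid.
    # 'censored' holds the removed time indexes; the modes mirror
    # 3dSynthesize's -cenfill choices (a sketch, not its actual code).
    n = len(fit) + len(censored)
    keep = [i for i in range(n) if i not in censored]
    full = np.zeros(n)
    full[keep] = fit
    if mode == 'nbhr':
        for i in censored:
            nb = [full[j] for j in (i - 1, i + 1)
                  if 0 <= j < n and j not in censored]
            full[i] = np.mean(nb) if nb else 0.0
    return full

# cenfill([1, 2, 4], {2}, 'zero') -> [1, 2, 0, 4]
# cenfill([1, 2, 4], {2}, 'nbhr') -> [1, 2, 3, 4]
```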
.. _stats_decon2007_amp_mod:
Amplitude Modulated FMRI regression analysis
++++++++++++++++++++++++++++++++++++++++++++
Analysis of event-related FMRI data when the amplitude of each event's BOLD
response might depend on some externally observed data.
Conceptual Introduction
=======================
When carrying out an FMRI experiment, the stimuli/tasks are grouped
into classes. Within each class, the FMRI-measurable brain activity is
presumed to be the same for each repetition of the task. This crude
approximation is necessary since FMRI datasets are themselves crude,
with low temporal resolution and a low contrast-to-noise ratio
(*i.e.*, the BOLD signal change is not very big). Therefore multiple
measurements of the "same" response are needed to build up decent
statistics.
In many experiments, with each individual stimulus/task a separate
measurement of subject behavior is taken; for example, reaction time,
galvanic skin response, emotional valence, pain level perception, et
cetera. It is sometimes desirable to incorporate this Amplitude
Modulation (**AM**) information into the FMRI data analysis.
Binary AM
=========
If the AM were binary in nature ("on" and "off"), one method of
carrying out the analysis would be to split the tasks into two
classes, and analyze these stimulus classes separately (*i.e.*, with
two ``-stim_times`` options). The statistical test for activation
ignoring the AM would then be a 2 DOF F-test, which could be carried
out in ``3dDeconvolve`` by using a 2 row GLT. The contrast between the
two conditions ("on−off") could be carried out with a 1 row GLT. For
example:
.. code-block:: bash
3dDeconvolve ... \
-stim_times 1 regressor_on.1D 'BLOCK(1,1)' -stim_label 1 'On' \
-stim_times 2 regressor_off.1D 'BLOCK(1,1)' -stim_label 2 'Off' \
-gltsym 'SYM: On \ Off' -glt_label 1 'On+Off' \
-gltsym 'SYM: On -Off' -glt_label 2 'On-Off' ...
(A realistic ``3dDeconvolve`` command line would, of course, have more
options to specify the input and output filenames, etc.) The above
example assumes that each case ("on" and "off") is being analyzed with
simple (fixed-shape) regression -- short 1-second blocks of activity.
Nothing more will be said here about binary AM, since it is just a
standard application of ``3dDeconvolve``; the only (small) difference
is that the stimulus class to which each individual stimulus is
assigned is determined during the FMRI data acquisition itself, rather
than determined by the investigator before the imaging session.
Continuous AM
=============
More complex is the case where the AM measurement values fall onto a
continuous (or finely graded discrete) scale. One form of analysis is
then to construct two regressors: the first being the standard
constant-amplitude-for-all-events-in-the-same-class time series, and
the second having the amplitude for each event modulated by that
event's AM value (or some function of the AM value). To make these two
regressors be orthogonal, it is best to make the modulation be
proportional to the difference between each event's AM value and the
mean AM value for that stimulus class.
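To see why mean-centering gives orthogonality, consider the per-event weight vectors (a small NumPy check with made-up amplitudes; when every event uses the same response shape, this orthogonality of the event weights carries over to the two regressor columns):

```python
import numpy as np

amps = np.array([1.0, 2.0, 1.0, 2.0, 1.0, 2.0])  # made-up AM values
constant  = np.ones_like(amps)     # equal weight (1) for every event
modulated = amps - amps.mean()     # each amplitude minus the class mean

# Subtracting the mean makes the two weight vectors orthogonal:
print(constant @ modulated)        # 0.0
```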
The new ``-stim_times_AM2`` option is designed to make this type of
analysis easy. The **'AM'** in the option suffix indicates that
amplitude modulation for each time is expected in the input timing
file. The **'2'** indicates that 2 regressors will be generated from 1
stimulus timing file.
The stimulus timing file for ``-stim_times_AM2`` has a slightly
different format than the stimulus timing file for the standard
``-stim_times`` option. Each stimulus time in the ``_AM2`` file must
have an amplitude "married" to it. For example:
.. code-block::
10*5 30*3 50*2 70*7 90*-3
This indicates that the stimuli at times 10, 30, 50, 70, and 90 have
amplitudes of 5, 3, 2, 7, and -3 (respectively). Note that if a
stimulus time is given without an amplitude, the amplitude will be
taken to be zero and ``3dDeconvolve`` will print a warning
message. (**N.B.**: the '*' separator can also be the 'x' character,
if that is more convenient.)
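For concreteness, here is a Python sketch of how one line of such a "married" file breaks apart (an illustrative parser, not AFNI's own):

```python
def parse_married(line):
    # Split one line of a married timing file into (times, amplitudes).
    # Either '*' or 'x' may join a time to its amplitude; a bare time
    # gets amplitude 0, as described above. (Sketch, not AFNI's parser.)
    times, amps = [], []
    for tok in line.split():
        t, _, a = tok.replace('x', '*').partition('*')
        times.append(float(t))
        amps.append(float(a) if a else 0.0)
    return times, amps

# parse_married('10*5 30*3 50*2 70*7 90*-3')
# -> ([10.0, 30.0, 50.0, 70.0, 90.0], [5.0, 3.0, 2.0, 7.0, -3.0])
```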
The program ``1dMarry`` can be used to "glue" two .1D formatted files
together to produce a file appropriate for ``-stim_times_AM2``. With
the ``-divorce`` option, it can also split up a "married" file into 2
separate files -- one with the times and one with the
amplitudes. These features make it relatively straightforward to run
a standard ``3dDeconvolve`` analysis with ``-stim_times`` and also the
new ``-stim_times_AM2`` type of analysis.
The same response models available with the standard ``-stim_times``
option are also usable with ``-stim_times_AM2``. Two regression matrix
columns will be generated for ``_AM2`` for each one column specified
by the response model (e.g., ``'BLOCK(1,1)'`` generates 1 column
normally, and 2 columns when used with ``_AM2``). The first column
will be created by giving equal weight (1) to each event in the
stimulus timing file. The second column will have each event weighted
by the difference between its individual amplitude and the mean of all
amplitudes in the timing file. The significance of the output
:math:`\beta` weight for this second column (e.g., given by using
the ``-tout`` option) can be used to map regions that are (linearly)
sensitive to the amplitude information. The significance of the
combined :math:`\beta` weights for the two columns (e.g., given by
using the ``-fout`` option) can be used to map regions that are
sensitive to the stimulus class as a whole.
It can be useful and enlightening to plot the columns of the
regression matrix that correspond to the equal-weight and
variable-weight model time series generated by
``-stim_times_AM2``. For this purpose, program ``1dplot`` can be
applied to subsets of the .x1D file output by ``3dDeconvolve``.
It is possible to use the option ``-stim_times_AM1`` if you want to
just generate a single regression model where each event is simply
scaled by its associated amplitude. There will be no separation of the
model into the constant and varying components. I do not recommend
this, for reasons given below, but the option is available. (If you
can think of a good reason to use this option for analysis of FMRI
time series, please let me know!)
Deconvolution with Amplitude Modulation
=======================================
It is also legal to use a deconvolution model (*e.g.*, ``'TENT()'``)
with ``-stim_times_AM2``. However, you must realize that the program
will compute a separate HRF shape for the AM component of the response
from the mean component. It is not possible to specify that the AM
component has the same shape as the mean component, and just has a
different response amplitude -- that would be a nonlinear regression
problem, and ``3dDeconvolve`` isn't that flexible. Also, at present,
the ``-iresp`` option will not output the HRF for the AM component of
a ``-stim_times_AM2`` deconvolution model. Nor have I actually tried
using AM deconvolution myself on real data. If you are going to try to
do this, you should (a) understand what you are doing, and (b) consult
with someone here.
Why Use 2 Regressors?
=====================
One user asked the following question: *"Can't I just use the AM
weighted-regressor in the model by itself? Why do you have to include
the standard (mean amplitude) regressor in the full model to
investigate the effect of event amplitude values?"* In other words,
why not use ``-stim_times_AM1``?
The reasoning behind separating the regressor columns into 2 classes
(mean activation and AM-varying activation) is
* to allow for voxels where the amplitude doesn't affect the result, and
* to allow for a cleaner interpretation; in voxels where both
regressors have significant weight, you can use the coefficient
(:math:`\beta` weight) of the first regressor as the mean
activation level, and the coefficient of the second as the
dependence of the activation level on the amplitude.
A numerical example might help elucidate:
Suppose that you have 6 events, to be simple, and that the amplitudes
for these events are ``{1, 2, 1, 2, 1, 2}``, with mean=1.5. Now
suppose you have a voxel that IS active with the task, but whose
activity is not dependent on the amplitude at all. Say its activation
level with each task is 6, so the "activity vector" (*i.e.*, the BOLD
response amplitude for each event) is ``{6, 6, 6, 6, 6, 6}``. This
vector is highly correlated with the AM vector ``{1, 2, 1, 2, 1, 2}``
(uncentered cc ≈ 0.9487), so you will get a positive activation result at this
voxel when using a single regressor. You can't tell from the
regression if this voxel is sensitive to the amplitude modulation or
not.
But if you use 2 regressors, they would be proportional to ``{1, 1, 1,
1, 1, 1}`` and ``{-0.5, +0.5, -0.5, +0.5, -0.5, +0.5}`` (the
differences of each event amplitude from the mean of 1.5). The first
regression vector is perfectly correlated with the "activity vector"
``{6, 6, 6, 6, 6, 6}`` and the second regression vector is not
correlated with the activity at all. So you would get an activation
result saying "this voxel was activated by the task, but doesn't care
about the amplitude". You cannot make such a dissection without using
2 regressors.
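The arithmetic of this example is easy to verify (a NumPy sketch; the "uncentered correlation" is the cosine of the angle between the two vectors):

```python
import numpy as np

activity = np.array([6.0] * 6)                 # active, but amplitude-blind
am       = np.array([1.0, 2.0, 1.0, 2.0, 1.0, 2.0])

# Single AM-scaled regressor: the uncentered correlation (cosine) is high
cos = activity @ am / (np.linalg.norm(activity) * np.linalg.norm(am))
print(round(cos, 4))                           # ~0.9487

# Two regressors: constant weight plus mean-centered amplitudes
const    = np.ones(6)
centered = am - am.mean()                      # {-0.5, +0.5, ...}
X = np.column_stack([const, centered])
b, *_ = np.linalg.lstsq(X, activity, rcond=None)
# b is approximately [6, 0]: all the fit goes to the constant regressor,
# and the amplitude-dependence coefficient is zero.
```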
Even if you don't care at all about such non-AM-dependent voxels, you
must still include them if you think this may be a significant effect
in the data. You have to model the data as it presents itself. In a
sense, the constant-activation model is like the baseline model
(*e.g.*, ``-polort`` stuff), in that it must be included in the fit
since it does occur, but you are free to ignore it as you
will. Interpreting the results is your problem.
What about Losing Degrees of Freedom?
=====================================
If you are concerned about losing degrees of freedom, since you will
be adding regressors but not data, then I would run the analysis
twice: once with the mean regressors only, and once with the mean
and the variable regressors. Then decide if the maps from the mean
regressors in the two cases differ markedly. My guess is that they
will not, if you have a decent number of events in each case (30+). If
they do not differ too much, then you are safe to use the double
regressor (``AM2``) analysis. If they do differ a lot (*e.g.*, you
lose a lot of mean regressor activation when you set the F-statistic
p-values the same), then you probably can't use the double regressor
analysis. But it is easy enough to try.
You can open two AFNI controllers, and view the single and double
regressor analyses side-by-side. You can set the threshold sliders to
be locked together in p-value (using ``Edit Environment`` on variable
``AFNI_THRESH_LOCK``). This should help you decide very quickly if the
two results look the same or not -- *same*, that is, from the viewpoint
of interpreting the results. The maps will of course not be identical,
since they will have been calculated with different models.
Caveats and Issues
==================
One problem with the above idea is that one may not wish to assume
that the FMRI signal is any particular function of the event amplitude
values. I don't know at this time how to deal with this issue in the
context of linear regression. For example, the "linear in event
amplitude" model could be extended to allow for a quadratic term
(``-stim_times_AM3``?), but it is highly unclear that this would be
useful. Some sort of combination of regression analysis with a mutual
information measurement might be needed (to quantify if the BOLD
response is "predictable" from the AM information), but I don't fully
know how to formulate this idea mathematically.