NAME

Statistics::SDT - Signal detection theory (SDT) measures of sensitivity and response-bias

SYNOPSIS

The following is based on example data from Stanislaw & Todorov (1999), and Alexander (2006), with which the module's results agree.

 use Statistics::SDT 0.05;

 my $sdt = Statistics::SDT->new(
  correction => 1,
  precision_s => 2,
 );

 $sdt->init(
  hits => 50,
  signal_trials => 50, # or misses => 0,
  false_alarms => 17,
  noise_trials => 25, # or correct_rejections => 8
 ); # alternatively, pass these to 'new', and/or update any of them via a hashref given as the 2nd argument to the following methods

 printf("Hit rate = %s\n",            $sdt->rate('h') );          # .99
 printf("False-alarm rate = %s\n",    $sdt->rate('f') );          # .68
 printf("Miss rate = %s\n",           $sdt->rate('m') );          # .00
 printf("Correct-rej'n rate = %s\n",  $sdt->rate('c') );          # .32
 printf("Sensitivity d' = %s\n",      $sdt->sens('d') );          # 1.86
 printf("Sensitivity Ad' = %s\n",     $sdt->sens('Ad') );         # 0.91
 printf("Sensitivity A' = %s\n",      $sdt->sens('A') );          # 0.82
 printf("Bias beta = %s\n",           $sdt->bias('b') );          # 0.07
 printf("Bias logbeta = %s\n",        $sdt->bias('log') );        # -2.60
 printf("Bias c = %s\n",              $sdt->bias('c') );          # -1.40
 printf("Bias Griers B'' = %s\n",     $sdt->bias('g') );          # -0.91
 printf("Criterion k = %s\n",         $sdt->crit() );             # -0.47
 printf("Hit rate via d & c = %s\n",  $sdt->dc2hr() );            # .99
 printf("FAR via d & c = %s\n",       $sdt->dc2far() );           # .68
 printf("LogBeta via d & c = %s\n",   $sdt->dc2logbeta() );       # -2.60

 # If the number of alternatives is greater than 2, there are two method options:
 printf("JAlex. d_fc = %.2f\n", $sdt->sens('f' => {hr => .866, states => 3, correction => 0, method => 'alexander'})); # 2.00
 printf("JSmith d_fc = %.2f\n", $sdt->sens('f' => {hr => .866, states => 3, correction => 0, method => 'smith'})); # 2.05

DESCRIPTION

Signal Detection Theory (SDT) measures of sensitivity and response-bias, e.g., d', A', c. For any particular analysis, you go through the stages of (1) creating the SDT object (see new), (2) initialising the object with relevant data (see init), and then (3) calling the statistic you want, with any statistic-specific arguments.

KEY NAMED PARAMS

The following named parameters need to be given as a hash or hash-reference: either to the new constructor method, init, or into each measure-function. To calculate the hit-rate, you need to feed the (i) count of hits and signal_trials, or (ii) the counts of hits and misses, or (iii) the count of signal_trials and misses. To calculate the false-alarm-rate, you need to feed (i) the count of false_alarms and noise_trials, or (ii) the count of false_alarms and correct_rejections, or (iii) the count of noise_trials and correct_rejections. Or you supply the hit-rate and false-alarm-rate. Or see dc2hr and dc2far if you already have the measures, and want to get back to the rates.
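The three ways of arriving at each rate can be sketched in plain Perl. This is only an illustration of the arithmetic, not the module's own code; the helper name rate_from_counts and its argument keys (count, complement, trials) are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::SDT): derive a rate from any
# two of: the count (hits or false_alarms), its complement (misses or
# correct_rejections), and the number of trials (signal_trials or noise_trials).
sub rate_from_counts {
    my (%arg) = @_;
    my ($count, $other, $trials) = @arg{qw(count complement trials)};
    my $given = grep { defined } ($count, $other, $trials);
    die "need at least two of count, complement, trials\n" if $given < 2;
    $trials = $count + $other  if !defined $trials;
    $count  = $trials - $other if !defined $count;
    die "trials must be positive\n" unless $trials > 0;
    return $count / $trials;
}

# false_alarms and noise_trials
printf("%.2f\n", rate_from_counts(count => 17, trials => 25));      # 0.68
# false_alarms and correct_rejections
printf("%.2f\n", rate_from_counts(count => 17, complement => 8));   # 0.68
# noise_trials and correct_rejections
printf("%.2f\n", rate_from_counts(trials => 25, complement => 8));  # 0.68
```

No correction for extreme rates is applied in this sketch; see the correction parameter.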

hits

The number of hits.

false_alarms

The number of false alarms.

signal_trials

The number of signal trials. The hit-rate is derived by dividing the number of hits by the number of signal trials.

noise_trials

The number of noise trials. The false-alarm-rate is derived by dividing the number of false-alarms by the number of noise trials.

states

The number of response states (also called "alternatives" or "options"). Default = 2 (for the classic signal-detection situation of discriminating between signal+noise and noise-only). If the number of alternatives is greater than 2, sens estimates d' by a forced-choice method: Smith's (1982) by default, or Alexander's if method is set to alexander; see forced_choice.

correction

Indicate whether or not to perform a correction on the number of hits and false-alarms when the hit-rate or false-alarm-rate equals 0 or 1 (due, e.g., to strong inducements against false-alarms, or easy discrimination between signals and noise). This is relevant to all functions that make use of the inverse phi function (all except aprime option with sens, and the griers option with bias). As ndtri must die with an error if given 0 or 1, there is a default correction.

If correction = 0, no correction is performed to calculation of rates. This should only be used when (1) using the parametric measures and the rates will never be at the extremes of 0 and 1; or (2) using only the nonparametric measures (aprime and griers).

If correction = 1 (default), extreme rates (of 0 and 1) are corrected: 0 is replaced with 0.5 / n; 1 is replaced with (n - 0.5) / n, where n = number of signal or noise trials. This is the most common method of handling extreme rates (Stanislaw and Todorov, 1999) but it might bias sensitivity measures and not be as satisfactory as the loglinear transformation applied to all hits and false-alarms, as follows.

If correction > 1, the loglinear transformation is applied to all values: 0.5 is added to both the number of hits and false-alarms, and 1 is added to the number of signal and noise trials.

If correction is undefined: To avoid errors thrown by the ndtri function, any rate that equals 1 or 0 will be corrected as if correction equals 1.
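The correction rules can be sketched as follows. This is a minimal illustration of the arithmetic described for the correction parameter, not the module's implementation; the function name corrected_rate is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::SDT): apply the documented
# correction rules to a count and its number of trials.
sub corrected_rate {
    my ($count, $trials, $correction) = @_;
    $correction = 1 unless defined $correction;    # undefined behaves like 1
    if ($correction > 1) {
        # loglinear transformation, applied to all values
        return ($count + 0.5) / ($trials + 1);
    }
    my $rate = $count / $trials;
    if ($correction == 1) {
        # only the extreme rates are replaced
        return 0.5 / $trials             if $rate == 0;
        return ($trials - 0.5) / $trials if $rate == 1;
    }
    return $rate;                                  # correction = 0: left as-is
}

printf("%.2f\n", corrected_rate(50, 50, 1));   # 0.99
printf("%.2f\n", corrected_rate(50, 50, 0));   # 1.00
printf("%.4f\n", corrected_rate(50, 50, 2));   # 0.9902
printf("%.2f\n", corrected_rate(17, 25, 1));   # 0.68
```

With the SYNOPSIS data (50 hits in 50 signal trials), correction = 1 gives the hit-rate of .99 reported there.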

precision_s

Precision (n decimal places) of any of the statistics. Default = 0, which actually means that you get all decimal bits possible.

method

Method for estimating d' when number of states/alternatives is greater than 2. Default value is smith; otherwise alexander; see forced_choice for application and description of these methods.

hr

The hit-rate. Instead of passing the number of hits and signal trials, give the hit-rate directly - but, if doing so, ensure the rate does not equal zero or 1 in order to avoid errors thrown by the inverse-phi function (which will be given as "ndtri domain error").

far

This is the false-alarm-rate. Instead of passing the number of false alarms and noise trials, give the false-alarm-rate directly - but, if doing so, ensure the rate does not equal zero or 1 in order to avoid errors thrown by the inverse-phi function (which will be given as "ndtri domain error").

METHODS

new

Creates the class object that holds the values of the parameters, as above, and accesses the following methods, without having to resubmit all the values.

As well as holding the values of the parameters submitted to it, the class-object returned by new will hold two arguments, hr, the hit-rate, and far, the false-alarm-rate. You can supply the hit-rate and false-alarm-rate themselves, but ensure that they do not equal zero or 1 in order to avoid errors thrown by the inverse-phi function. The calculation of the hit-rate and false-alarm-rate by the module corrects for this limitation; correction can only be done by supplying the relevant counts, not just the rate - see the notes on the correction parameter, above.

init

 $sdt->init(
    hits => integer,
    misses => ?integer,
    false_alarms => integer,
    correct_rejections => ?integer,
    signal_trials => integer (>= hits), # or will be calculated from hits and misses
    noise_trials => integer (>= false_alarms), # or will be calculated from false_alarms and correct_rejections
    hr => probability 0 - 1,
    far => probability 0 - 1,
    correction => 0|1|2 (default = 1),
    states => integer >= 2 (default = 2),
    precision_s => integer (default = 0),
    method => undef|smith|alexander (default = undef)
 )

Instead of sending the number of hits, signal-trials, etc., with every call to the measure-functions, or creating a new class object for every set of data, initialise the class object with these values, as named parameters, key => value pairs. This method is called by new in case you pass the values to it in construction. The hit-rates and false-alarm rates are always calculated anew from the hits and signal trials, and the false-alarms and noise trials, respectively; unless you send a value for one or the other, or both (as hr and far) in a call to init.

Each init replaces the values only of those attributes that you pass to it - any values set in previous inits are retained for those attributes that you do not set in a call to init. If this is not what you want, and you actually want everything reset, first use clear.

Optionally, the method also initialises any values you give it for states, correction, precision_s and method. If you have already set these values, and you do not set them again in another call to init, the previous values will be retained.

clear

 $sdt->clear()

Sets all attributes to undef: hits, false_alarms, signal_trials, noise_trials, hr, far, states, correction, and method.

rate

 $sdt->rate('hr|far|mr|crr') # scalar string to return the indicated rate
 $sdt->rate(hr => 'prob.', far => 'prob.', mr => 'prob.', crr => 'prob.') # one or more key => value pairs to set the rate
 $sdt->rate('h' => {signal_trials => integer, hits => integer}) # or misses instead of hits
 $sdt->rate('f' => {noise_trials => integer, false_alarms => integer}) # or correct_rejections instead of false_alarms
 $sdt->rate('m' => {signal_trials => integer, misses => integer})  # or hits instead of misses
 $sdt->rate('c' => {noise_trials => integer, correct_rejections => integer})  # or false_alarms instead of correct_rejections

Generic method to get or set any rate.

To get a rate, pass only a string that indicates the rate: hit, false-alarm, miss, correct-rejection: only checks the first letter, so any passable abbreviation will do. The rate is returned to the precision indicated by the present value of precision_s, if anything.

To set a rate, either give the actual probability as key => value pairs, or send a hashref giving sufficient info to calculate the rate (if this has not already been sent to init or one of the measure-methods).

Also performs any required or requested corrections, depending on the present value of correction.

Unless the values of the rates are directly given, then they will be calculated from the presently sent counts and trial-numbers, or whatever has been cached of these values. For the hit-rate, there must be a value for hits and signal_trials, and for the false_alarm_rate, there must be a value for false_alarms and noise_trials. If these values are not sent, they will be taken from any prior value, unless this has been cleared or never existed - in which case expect a croak.

sens

 $s = $sdt->sens('dprime|forcedchoice|area|aprime') # based on values of the measure variables already inited or otherwise set 
 $s = $sdt->sens('dprime' => { signal_trials => integer}) # update any of the measure variables

Alias: sensitivity, discriminability

Get one of the sensitivity measures, as indicated by the first argument string, optionally updating any of the measure variables and options with a subsequent hashref. The measures are as follows, accessed by giving the name (or at least its first two letters) as the first argument.

dprime

Returns the index of sensitivity, or discrimination, d' (d prime), found by subtracting the z-score that corresponds to the false-alarm rate (far) from the z-score that corresponds to the hit rate (hr):

          d' = phi^-1(hr) - phi^-1(far)

In this way, sensitivity is measured in standard deviation units, larger positive values indicating greater sensitivity. If both the hit-rate and false-alarm-rate are either 0 or 1, then sensitivity returns 0. A value of 0 indicates no sensitivity to the presence of the signal, i.e., it cannot be discriminated from noise. Values less than 0 indicate a lack of sensitivity that might result from a consistent, state-specific "mix-up" or inhibition of responses.
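As a rough check of the d' formula, here is a self-contained sketch in core Perl. The module itself takes phi and inverse phi from Math::Cephes (ndtr and ndtri); the phi and inv_phi subs below are stand-ins written for this example (an Abramowitz and Stegun approximation to erf, inverted by bisection), and dprime is a hypothetical name.

```perl
use strict;
use warnings;

# Stand-in for ndtr: standard normal CDF via the Abramowitz & Stegun
# 7.1.26 approximation to erf (absolute error around 1e-7).
sub phi {
    my ($z) = @_;
    my $x = abs($z) / sqrt(2);
    my $t = 1 / (1 + 0.3275911 * $x);
    my $poly = ((((1.061405429 * $t - 1.453152027) * $t + 1.421413741) * $t
        - 0.284496736) * $t + 0.254829592) * $t;
    my $erf = 1 - $poly * exp(-$x * $x);
    return $z >= 0 ? 0.5 * (1 + $erf) : 0.5 * (1 - $erf);
}

# Stand-in for ndtri: invert phi by bisection on [-10, 10].
sub inv_phi {
    my ($p) = @_;
    die "inv_phi domain error\n" if $p <= 0 || $p >= 1;
    my ($lo, $hi) = (-10, 10);
    for (1 .. 60) {
        my $mid = ($lo + $hi) / 2;
        if (phi($mid) < $p) { $lo = $mid } else { $hi = $mid }
    }
    return ($lo + $hi) / 2;
}

# d' = phi^-1(hr) - phi^-1(far)
sub dprime {
    my ($hr, $far) = @_;
    return inv_phi($hr) - inv_phi($far);
}

printf("d' = %.2f\n", dprime(0.99, 0.68));   # d' = 1.86
```

The rates used are the corrected SYNOPSIS rates (.99 and .68). The special case of both rates being 0 or 1 is not handled here; inv_phi simply dies on such input.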

If there are more than two states (not only noise and signal-plus-noise), then d' will be estimated by the following.

forced_choice

An estimate of d' based on the percent correct in a forced-choice task with any number of alternatives. This method is automatically called via sensitivity if the value of states is greater than 2. Only for this condition is it not necessary to calculate the false-alarm rate; the hit-rate is formed, as usual, as the count of hits divided by signal_trials.

At least two methods are available to estimate d' when states > 2; accordingly, there is the option - set either in init or sensitivity or otherwise - for method: its default value is smith (this is the method cited by Stanislaw & Todorov (1999)); otherwise, you can use the more generally applicable alexander method:

Smith (1982) method: satisfies "the 2% bound for all M [states] and all percentiles and, except for M = 3 or 4, satisfies a 1% error bound". The specific algorithm used depends on number of states:

For n states < 12:

          d' = K_M * log( ((n - 1) * hr) / (1 - hr) )

where

          K_M = 0.86 - 0.085 * log(n - 1).

If n >= 12,

          d' = A + B * phi^-1(hr)

where

          A = (-4 + sqrt(16 + 25 * log(n - 1))) / 3

and

          B = sqrt( (log(n - 1) + 2) / (log(n - 1) + 1) )
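The n < 12 branch of Smith's method can be sketched in a few lines of core Perl. This assumes log is the natural logarithm (as Perl's log is), which reproduces the SYNOPSIS value of 2.05; the function name smith_dprime is hypothetical, and the n >= 12 branch is left out because it needs an inverse phi function.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::SDT): Smith's (1982)
# estimate of d' for fewer than 12 alternatives.
sub smith_dprime {
    my ($hr, $n) = @_;
    die "this sketch covers only 2 < n < 12\n" unless $n > 2 && $n < 12;
    die "hr must be between 0 and 1 (exclusive)\n" unless $hr > 0 && $hr < 1;
    my $k_m = 0.86 - 0.085 * log($n - 1);               # natural log assumed
    return $k_m * log((($n - 1) * $hr) / (1 - $hr));
}

printf("d_fc = %.2f\n", smith_dprime(0.866, 3));        # d_fc = 2.05
```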

Alexander (2006/1990) method: "gives values of d' with an error of less than 2% (mostly less than 1%) from those obtained by integration for the range d' = 0 (or 1% correct for n [states] > 1000) to 75% correct and an error of less than 4% up to 95% correct for n up to at least 10000, and slightly greater maximum errors for n = 100000. This approximation is comparable to the accuracy of Elliott's table (0.02 in proportion correct) but can be used for any n." (Elliott's table being that in Swets, 1964, pp. 682-683). The estimation is offered by:

          d' = ( phi^-1(hr) - phi^-1(1/n) ) / A_n

where n is the number of states (or alternatives, alphabet-size, etc.), and A_n is estimated by:

          A_n = 1 - 1 / (1.93 + 4.75 * log10(n) + 0.63 * [log10(n)]^2)
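A sketch of the Alexander estimate in core Perl follows. It takes A_n as 1 - 1 / (1.93 + 4.75 * log10(n) + 0.63 * [log10(n)]^2), which is an assumption on my part about the intended formula; it is the form that reproduces the SYNOPSIS value of 2.00 for hr = .866 and 3 states. The phi and inv_phi subs are stand-ins for the Math::Cephes functions ndtr and ndtri, and alexander_dprime is a hypothetical name.

```perl
use strict;
use warnings;

# Stand-in for ndtr: standard normal CDF (Abramowitz & Stegun 7.1.26).
sub phi {
    my ($z) = @_;
    my $x = abs($z) / sqrt(2);
    my $t = 1 / (1 + 0.3275911 * $x);
    my $poly = ((((1.061405429 * $t - 1.453152027) * $t + 1.421413741) * $t
        - 0.284496736) * $t + 0.254829592) * $t;
    my $erf = 1 - $poly * exp(-$x * $x);
    return $z >= 0 ? 0.5 * (1 + $erf) : 0.5 * (1 - $erf);
}

# Stand-in for ndtri: invert phi by bisection.
sub inv_phi {
    my ($p) = @_;
    die "inv_phi domain error\n" if $p <= 0 || $p >= 1;
    my ($lo, $hi) = (-10, 10);
    for (1 .. 60) {
        my $mid = ($lo + $hi) / 2;
        if (phi($mid) < $p) { $lo = $mid } else { $hi = $mid }
    }
    return ($lo + $hi) / 2;
}

# Hypothetical helper: Alexander's approximation to d' for n alternatives.
sub alexander_dprime {
    my ($hr, $n) = @_;
    my $lg  = log($n) / log(10);                                # log10(n)
    my $a_n = 1 - 1 / (1.93 + 4.75 * $lg + 0.63 * $lg**2);      # assumed form
    return (inv_phi($hr) - inv_phi(1 / $n)) / $a_n;
}

printf("d_fc = %.2f\n", alexander_dprime(0.866, 3));            # d_fc = 2.00
```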

aprime

Returns the nonparametric index of sensitivity, A'.

Ranges from 0 to 1. Values greater than 0.5 indicate positive discrimination (1 = perfect performance); values less than 0.5 indicate a failure of discrimination (perhaps due to consistent "mix-up" or inhibition of state-specific responses); and a value of 0.5 indicates no sensitivity to the presence of the signal, i.e., it cannot be discriminated from noise.
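The formula for A' is not given above. The sketch below uses the form commonly cited from Stanislaw & Todorov (1999), on the assumption that this is what the module computes; it reproduces the SYNOPSIS value of 0.82. The function name aprime_sketch is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::SDT):
# A' = 0.5 + sign(H - F) * ((H - F)^2 + |H - F|) / (4 * max(H, F) - 4 * H * F)
sub aprime_sketch {
    my ($h, $f) = @_;
    return 0.5 if $h == $f;                 # no discrimination
    my $diff = $h - $f;
    my $max  = $h > $f ? $h : $f;
    my $sign = $diff > 0 ? 1 : -1;
    return 0.5 + $sign * ($diff**2 + abs($diff)) / (4 * $max - 4 * $h * $f);
}

printf("A' = %.2f\n", aprime_sketch(0.99, 0.68));   # A' = 0.82
```

Swapping the two rates mirrors the result around 0.5, in line with the interpretation of values below 0.5 given above.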

adprime

Returns Ad', the area under the receiver-operator-characteristic (ROC) curve, equalling the proportion of correct responses for the task as a two-alternative forced-choice task.

If both the hit-rate and false-alarm-rate are either 0 or 1, then sensitivity with this argument returns 0.5.

bias

 $b = $sdt->bias('likelihood|loglikelihood|decision|griers') # based on values of the measure variables already inited or otherwise set 
 $b = $sdt->bias('likelihood' => { signal_trials => integer}) # update any of the measure variables

Get one of the decision/response-bias measures, as indicated below, by the first argument string, optionally updating any of the measure variables and options with a subsequent hashref (as given by example for signal_trials, above).

With a yes response indicating that the decision variable exceeds the criterion, and a no response indicating that the decision variable is less than the criterion, the measures indicate if there is a bias toward the yes response, and so a liberal/low criterion, or a bias toward the no response, and so a conservative/high criterion.

The measures are as follows, accessed by giving the name (or at least its first two letters) as the first argument to bias.

beta (or) likelihood_bias

Returns the beta measure of response bias, based on the ratio of the likelihood the decision variable obtains a certain value on signal trials, to the likelihood that it obtains the value on noise trials.

          beta = exp( ( phi^-1(far)^2 - phi^-1(hr)^2 ) / 2 )

Values less than 1 indicate a bias toward the yes response, values greater than 1 indicate a bias toward the no response, and the value of 1 indicates no bias toward yes or no.

log_likelihood_bias

Returns the natural logarithm of the likelihood bias, beta.

The value is not bounded to a fixed range (the SYNOPSIS example gives -2.60), with values less than 0 indicating a bias toward the yes response, values greater than 0 indicating a bias toward the no response, and a value of 0 indicating no response bias.

c (or) decision_bias

Implements the c parametric measure of response bias. The value, measured in standard deviation units, gives the position of the decision criterion relative to the neutral point where the signal and noise distributions cross over; at that point there is no response bias and c = 0. The value is not bounded to a fixed range (the SYNOPSIS example gives c = -1.40).

Values less than 0 indicate a bias toward the yes response; values greater than 0 indicate a bias toward the no response; and a value of 0 indicates no response bias.
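The three parametric bias measures described so far (beta, log beta and c) can be sketched together in core Perl. The expression for c is not spelled out above; the sketch assumes c = -(phi^-1(hr) + phi^-1(far)) / 2, which follows from the relations given under dc2hr and dc2far. The phi and inv_phi subs are stand-ins for the Math::Cephes functions ndtr and ndtri, and parametric_bias is a hypothetical name.

```perl
use strict;
use warnings;

# Stand-in for ndtr: standard normal CDF (Abramowitz & Stegun 7.1.26).
sub phi {
    my ($z) = @_;
    my $x = abs($z) / sqrt(2);
    my $t = 1 / (1 + 0.3275911 * $x);
    my $poly = ((((1.061405429 * $t - 1.453152027) * $t + 1.421413741) * $t
        - 0.284496736) * $t + 0.254829592) * $t;
    my $erf = 1 - $poly * exp(-$x * $x);
    return $z >= 0 ? 0.5 * (1 + $erf) : 0.5 * (1 - $erf);
}

# Stand-in for ndtri: invert phi by bisection.
sub inv_phi {
    my ($p) = @_;
    die "inv_phi domain error\n" if $p <= 0 || $p >= 1;
    my ($lo, $hi) = (-10, 10);
    for (1 .. 60) {
        my $mid = ($lo + $hi) / 2;
        if (phi($mid) < $p) { $lo = $mid } else { $hi = $mid }
    }
    return ($lo + $hi) / 2;
}

# Hypothetical helper: returns (beta, log beta, c) for the given rates.
sub parametric_bias {
    my ($hr, $far) = @_;
    my ($zh, $zf) = (inv_phi($hr), inv_phi($far));
    my $logbeta = ($zf**2 - $zh**2) / 2;
    my $c       = -($zh + $zf) / 2;          # assumed form, see lead-in
    return (exp($logbeta), $logbeta, $c);
}

my ($beta, $logbeta, $c) = parametric_bias(0.99, 0.68);
printf("beta = %.2f, logbeta = %.2f, c = %.2f\n", $beta, $logbeta, $c);
# beta = 0.07, logbeta = -2.60, c = -1.40
```

All three values agree in sign with the interpretation above: the SYNOPSIS data show a bias toward the yes response.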

griers_bias

Implements Grier's B'' nonparametric measure of response bias.

Ranges from -1 to +1, with values less than 0 indicating a bias toward the yes response, values greater than 0 indicating a bias toward the no response, and a value of 0 indicating no response bias.
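The formula for B'' is not given above. The sketch below uses the form commonly cited from Stanislaw & Todorov (1999), on the assumption that this is what the module computes; it reproduces the SYNOPSIS value of -0.91. The function name griers_sketch is hypothetical, and returning 0 when both rates are extreme is a choice made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::SDT):
# B'' = sign(H - F) * (H(1 - H) - F(1 - F)) / (H(1 - H) + F(1 - F))
sub griers_sketch {
    my ($h, $f) = @_;
    my $vh = $h * (1 - $h);
    my $vf = $f * (1 - $f);
    return 0 if $vh + $vf == 0;     # both rates extreme: assumption of this sketch
    my $sign = $h >= $f ? 1 : -1;
    return $sign * ($vh - $vf) / ($vh + $vf);
}

printf("B'' = %.2f\n", griers_sketch(0.99, 0.68));   # B'' = -0.91
```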

criterion

 $sdt->criterion() # assume d' and c can be calculated from already inited param values
 $sdt->criterion(d => float, c => float)

Alias: dc2k, crit

Returns the value of the criterion for given values of sensitivity d' and bias c, viz.: k = d' / 2 + c.

dc2hr

 $sdt->dc2hr() # assume d' and c can be calculated from already inited param values
 $sdt->dc2hr(d => float, c => float)

Returns the hit-rate estimated from given values of sensitivity d' and bias c, viz.: hr = phi(d' / 2 - c).

dc2far

 $sdt->dc2far() # assume d' and c can be calculated from already inited param values
 $sdt->dc2far(d => float, c => float)

Returns the false-alarm-rate estimated from given values of sensitivity d' and bias c, viz.: far = phi(-d' / 2 - c).

dc2logbeta

 $sdt->dc2logbeta() # assume d' and c can be calculated from already inited param values
 $sdt->dc2logbeta(d => float, c => float)

Returns the log-likelihood (beta) bias estimated from given values of sensitivity d' and bias c, viz.: logbeta = d' * c.
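The four conversions from d' and c (criterion, dc2hr, dc2far and dc2logbeta) can be checked with a short core-Perl sketch, starting from the rounded SYNOPSIS values d' = 1.86 and c = -1.40. The phi sub is a stand-in for the Math::Cephes function ndtr, and from_dc is a hypothetical name.

```perl
use strict;
use warnings;

# Stand-in for ndtr: standard normal CDF (Abramowitz & Stegun 7.1.26).
sub phi {
    my ($z) = @_;
    my $x = abs($z) / sqrt(2);
    my $t = 1 / (1 + 0.3275911 * $x);
    my $poly = ((((1.061405429 * $t - 1.453152027) * $t + 1.421413741) * $t
        - 0.284496736) * $t + 0.254829592) * $t;
    my $erf = 1 - $poly * exp(-$x * $x);
    return $z >= 0 ? 0.5 * (1 + $erf) : 0.5 * (1 - $erf);
}

# Hypothetical helper: the four quantities recoverable from d' and c.
sub from_dc {
    my ($d, $c) = @_;
    return (
        k       => $d / 2 + $c,          # criterion
        hr      => phi($d / 2 - $c),     # hit-rate
        far     => phi(-$d / 2 - $c),    # false-alarm-rate
        logbeta => $d * $c,              # log-likelihood bias
    );
}

my %r = from_dc(1.86, -1.40);
printf("k = %.2f, hr = %.2f, far = %.2f, logbeta = %.2f\n",
    @r{qw(k hr far logbeta)});
# k = -0.47, hr = 0.99, far = 0.68, logbeta = -2.60
```

The output matches the corresponding lines of the SYNOPSIS.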

REFERENCES

Alexander, J. R. M. (2006). An approximation to d' for n-alternative forced choice. From http://eprints.utas.edu.au/475/.

Lee, M. D. (2008). BayesSDT: Software for Bayesian inference with signal detection theory. Behavior Research Methods, 40, 450-456.

Smith, J. E. K. (1982). Simple algorithms for M-alternative forced-choice calculations. Perception and Psychophysics, 31, 95-96.

Stanislaw, H., & Todorov, N. (1999). Calculation of signal detection theory measures. Behavior Research Methods, Instruments, and Computers, 31, 137-149.

Swets, J. A. (1964). Signal detection and recognition by human observers. New York, NY, US: Wiley.

SEE ALSO

Math::Cephes : The present module imports/depends upon the ndtr (phi) and ndtri (inverse phi) functions from this package.

Statistics::ROC : Receiver-operator characteristic curves.

LIMITATIONS/TODO

Expects descriptive counts, not raw observations, confidence ratings; this limits the measures that can be implemented: methods load and unload are reserved to implement handling of data lists.

Perl's params modules do not seem to effect the required validation of parameters needed for each measure; the present work-around is obsessive-compulsive, while not exhaustive of all wayward possibilities, and requires optimisation and extension. It is presently quite possible to suffer an inelegant death should anything too unusual, or impoverished of details, be attempted in the life of the module.

REVISION HISTORY

See Changes file in installation dist.

AUTHOR/LICENSE

rgarton AT cpan DOT org

This program is free software. It may be used, redistributed and/or modified under the same terms as Perl-5.6.1 (or later) (see http://www.perl.com/perl/misc/Artistic.html).

Disclaimer

To the maximum extent permitted by applicable law, the author of this module disclaims all warranties, either express or implied, including but not limited to implied warranties of merchantability and fitness for a particular purpose, with regard to the software and the accompanying documentation.
