Statistics::SDT - Signal detection theory (SDT) measures of sensitivity and response-bias
The following is based on example data from Stanislaw & Todorov (1999) and Alexander (2006), with which the module's results agree.
    use Statistics::SDT 0.05;

    my $sdt = Statistics::SDT->new(
        correction  => 1,
        precision_s => 2,
    );

    $sdt->init(
        hits          => 50,
        signal_trials => 50, # or misses => 0,
        false_alarms  => 17,
        noise_trials  => 25, # or correct_rejections => 8
    ); # or init these into 'new' &/or update any of their values as 2nd arg. hashrefs in calling the following methods

    printf("Hit rate = %s\n", $sdt->rate('h') ); # .99
    printf("False-alarm rate = %s\n", $sdt->rate('f') ); # .68
    printf("Miss rate = %s\n", $sdt->rate('m') ); # .00
    printf("Correct-rej'n rate = %s\n", $sdt->rate('c') ); # .32
    printf("Sensitivity d' = %s\n", $sdt->sens('d') ); # 1.86
    printf("Sensitivity Ad' = %s\n", $sdt->sens('Ad') ); # 0.91
    printf("Sensitivity A' = %s\n", $sdt->sens('A') ); # 0.82
    printf("Bias beta = %s\n", $sdt->bias('b') ); # 0.07
    printf("Bias logbeta = %s\n", $sdt->bias('log') ); # -2.60
    printf("Bias c = %s\n", $sdt->bias('c') ); # -1.40
    printf("Bias Griers B'' = %s\n", $sdt->bias('g') ); # -0.91
    printf("Criterion k = %s\n", $sdt->crit() ); # -0.47
    printf("Hit rate via d & c = %s\n", $sdt->dc2hr() ); # .99
    printf("FAR via d & c = %s\n", $sdt->dc2far() ); # .68
    printf("LogBeta via d & c = %s\n", $sdt->dc2logbeta() ); # -2.60

    # If the number of alternatives is greater than 2, there are two method options:
    printf("JAlex. d_fc = %.2f\n", $sdt->sens('f' => {hr => .866, states => 3, correction => 0, method => 'alexander'})); # 2.00
    printf("JSmith d_fc = %.2f\n", $sdt->sens('f' => {hr => .866, states => 3, correction => 0, method => 'smith'})); # 2.05
Signal Detection Theory (SDT) measures of sensitivity and response-bias, e.g., d', A', c. For any particular analysis, you go through the stages of (1) creating the SDT object (see new), (2) initialising the object with relevant data (see init), and then (3) calling the statistic you want, with any statistic-specific arguments.
The following named parameters need to be given as a hash or hash-reference: either to the new constructor method, init, or into each measure-function. To calculate the hit-rate, you need to feed the (i) count of hits and signal_trials, or (ii) the counts of hits and misses, or (iii) the count of signal_trials and misses. To calculate the false-alarm-rate, you need to feed (i) the count of false_alarms and noise_trials, or (ii) the count of false_alarms and correct_rejections, or (iii) the count of noise_trials and correct_rejections. Or you supply the hit-rate and false-alarm-rate. Or see dc2hr and dc2far if you already have the measures, and want to get back to the rates.
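The alternative ways of supplying counts come down to the same arithmetic, given that signal_trials = hits + misses and noise_trials = false_alarms + correct_rejections. The following is a minimal sketch in plain Perl, independent of the module; the helper name rate_from_counts and its yes/no/trials keys are hypothetical, and no correction for extreme rates is applied.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::SDT): derive a rate from any
# two of three counts. For the hit-rate, yes = hits, no = misses,
# trials = signal_trials; for the false-alarm-rate, yes = false_alarms,
# no = correct_rejections, trials = noise_trials.
sub rate_from_counts {
    my (%arg) = @_;
    my ($yes, $no, $n) = @arg{qw(yes no trials)};
    my $given = grep { defined } $yes, $no, $n;
    die "need at least two of yes, no, trials\n" if $given < 2;
    $n   = $yes + $no unless defined $n;     # total from the two counts
    $yes = $n - $no   unless defined $yes;   # count from total and remainder
    die "no trials\n" if $n == 0;
    return $yes / $n;
}

my $hr  = rate_from_counts(yes => 50, trials => 50);   # hits, signal_trials
my $far = rate_from_counts(yes => 17, no => 8);        # false_alarms, correct_rejections
printf "hr = %.2f, far = %.2f\n", $hr, $far;           # hr = 1.00, far = 0.68
```

The uncorrected hit-rate here is 1; the .99 shown in the example data arises only after the correction for extreme rates.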
hits : The number of hits.
false_alarms : The number of false alarms.
signal_trials : The number of signal trials. The hit-rate is derived by dividing the number of hits by the number of signal trials.
noise_trials : The number of noise trials. The false-alarm-rate is derived by dividing the number of false-alarms by the number of noise trials.
states : The number of response states, or "alternatives", "options", etc. Default = 2 (for the classic signal-detection situation of discriminating between signal+noise and noise-only). If the number of alternatives is greater than 2 when calling sens, d' is estimated by Smith's (1982) method by default, or by Alexander's method if this is requested via method; see forced_choice.
correction : Indicate whether or not to perform a correction on the number of hits and false-alarms when the hit-rate or false-alarm-rate equals 0 or 1 (due, e.g., to strong inducements against false-alarms, or easy discrimination between signals and noise). This is relevant to all functions that make use of the inverse phi function (all except the aprime option with sens, and the griers option with bias). As ndtri must die with an error if given 0 or 1, there is a default correction.
If correction = 0, no correction is performed in calculating the rates. This should only be used when (1) using the parametric measures and the rates will never be at the extremes of 0 and 1; or (2) using only the nonparametric measures (aprime and griers).
If correction = 1 (default), extreme rates (of 0 and 1) are corrected: 0 is replaced with 0.5 / n; 1 is replaced with (n - 0.5) / n, where n = number of signal or noise trials. This is the most common method of handling extreme rates (Stanislaw and Todorov, 1999) but it might bias sensitivity measures and not be as satisfactory as the loglinear transformation applied to all hits and false-alarms, as follows.
If correction > 1, the loglinear transformation is applied to all values: 0.5 is added to both the number of hits and false-alarms, and 1 is added to the number of signal and noise trials.
If correction is undefined: To avoid errors thrown by the ndtri function, any rates that equal 1 or 0 will be corrected as if correction equals 1.
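The correction rules can be sketched in plain Perl as follows. This is an illustration of the rules as described here, not the module's own code; the helper name corrected_rate is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::SDT): a rate from a count and
# a number of trials, under the three settings of "correction".
sub corrected_rate {
    my ($count, $n, $correction) = @_;
    $correction = 1 unless defined $correction;   # undefined is treated as 1
    if ($correction > 1) {                        # loglinear: applies to all values
        return ($count + 0.5) / ($n + 1);
    }
    my $rate = $count / $n;
    if ($correction == 1) {                       # only extreme rates are replaced
        return 0.5 / $n        if $rate == 0;
        return ($n - 0.5) / $n if $rate == 1;
    }
    return $rate;                                 # correction = 0: rate left as is
}

printf "%.2f\n", corrected_rate(50, 50, 1);   # 0.99
printf "%.2f\n", corrected_rate(17, 25, 1);   # 0.68
printf "%.4f\n", corrected_rate(50, 50, 2);   # 0.9902
printf "%.4f\n", corrected_rate(17, 25, 2);   # 0.6731
```

With the example data (50 hits in 50 signal trials), the default correction gives the hit-rate of .99 rather than 1, which is what keeps the inverse phi function within its domain.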
precision_s : Precision (n decimal places) of any of the statistics. Default = 0, which means that no rounding is applied: values are returned with all the decimal places available.
method : Method for estimating d' when the number of states/alternatives is greater than 2. Default value is smith; otherwise alexander; see forced_choice for application and description of these methods.
hr : The hit-rate. Instead of passing the number of hits and signal trials, give the hit-rate directly; but, if doing so, ensure the rate does not equal 0 or 1 in order to avoid errors thrown by the inverse-phi function (which will be given as "ndtri domain error").
far : The false-alarm-rate. Instead of passing the number of false alarms and noise trials, give the false-alarm-rate directly; but, if doing so, ensure the rate does not equal 0 or 1 in order to avoid errors thrown by the inverse-phi function (which will be given as "ndtri domain error").
Creates the class object that holds the values of the parameters, as above, and accesses the following methods, without having to resubmit all the values.
As well as holding the values of the parameters submitted to it, the class-object returned by new will hold two arguments, hr, the hit-rate, and far, the false-alarm-rate. You can supply the hit-rate and false-alarm-rate themselves, but ensure that they do not equal 0 or 1 in order to avoid errors thrown by the inverse-phi function. The calculation of the hit-rate and false-alarm-rate by the module corrects for this limitation; correction can only be done by supplying the relevant counts, not just the rate - see the notes on the correction parameter, above.
    $sdt->init(
        hits => integer,
        misses => ?integer,
        false_alarms => integer,
        correct_rejections => ?integer,
        signal_trials => integer (>= hits), # or will be calculated from hits and misses
        noise_trials => integer (>= false_alarms), # or will be calculated from false_alarms and correct_rejections
        hr => probability 0 - 1,
        far => probability 0 - 1,
        correction => 0|1|2 (default = 1),
        states => integer >= 2 (default = 2),
        precision_s => integer (default = 0),
        method => undef|smith|alexander (default = undef)
    )
Instead of sending the number of hits, signal-trials, etc., with every call to the measure-functions, or creating a new class object for every set of data, initialise the class object with these values, as named parameters, key => value pairs. This method is called by new if you pass the values to it at construction. The hit-rate and false-alarm-rate are always calculated anew from the hits and signal trials, and the false-alarms and noise trials, respectively, unless you send a value for one or the other, or both (as hr and far), in a call to init.
Each init replaces the values only of those attributes that you pass to it; any values set in previous calls to init are retained for those attributes that you do not set. If this is not what you want, and you actually want everything reset, first use clear.
Optionally, the method also initialises any values you give it for states, correction, precision_s and method. If you have already set these values, and you do not set them in another call to init, the previous values will be retained.
$sdt->clear()
Sets all attributes to undef: hits, false_alarms, signal_trials, noise_trials, hr, far, states, correction, and method.
    $sdt->rate('hr|far|mr|crr') # scalar string to return the indicated rate
    $sdt->rate(hr => 'prob.', far => 'prob.', mr => 'prob.', crr => 'prob.') # one or more key => value pairs to set the rate
    $sdt->rate('h' => {signal_trials => integer, hits => integer}) # or misses instead of hits
    $sdt->rate('f' => {noise_trials => integer, false_alarms => integer}) # or correct_rejections instead of false_alarms
    $sdt->rate('m' => {signal_trials => integer, misses => integer}) # or hits instead of misses
    $sdt->rate('c' => {noise_trials => integer, correct_rejections => integer}) # or false_alarms instead of correct_rejections
Generic method to get or set any rate.
To get a rate, pass only a string that indicates the rate: hit, false-alarm, miss or correct-rejection. Only the first letter is checked, so any abbreviation beginning with h, f, m or c will do. The rate is returned to the precision indicated by the present value of precision_s, if any.
To set a rate, either give the actual probability as key => value pairs, or send a hashref giving sufficient info to calculate the rate (if this has not already been sent to init or one of the measure-methods).
Also performs any required or requested corrections, depending on the present value of correction.
Unless the values of the rates are directly given, they will be calculated from the presently sent counts and trial-numbers, or whatever has been cached of these values. For the hit-rate, there must be a value for hits and signal_trials, and for the false-alarm-rate, there must be a value for false_alarms and noise_trials. If these values are not sent, they will be taken from any prior value, unless this has been cleared or never existed - in which case expect a croak.
    $s = $sdt->sens('dprime|forcedchoice|area|aprime') # based on values of the measure variables already inited or otherwise set
    $s = $sdt->sens('dprime' => { signal_trials => integer}) # update any of the measure variables
Alias: sensitivity, discriminability
Get one of the sensitivity measures, as indicated by the first argument string, optionally updating any of the measure variables and options with a subsequent hashref. The measures are as follows, accessed by giving the name (or at least its first two letters) as the first argument.
Returns the index of sensitivity, or discrimination, d' (d prime), found by subtracting the z-score that corresponds to the false-alarm rate (far) from the z-score that corresponds to the hit rate (hr):
d' = phi^{–1}(hr) – phi^{–1}(far)
In this way, sensitivity is measured in standard deviation units, larger positive values indicating greater sensitivity. If both the hit-rate and false-alarm-rate are either 0 or 1, then sensitivity returns 0. A value of 0 indicates no sensitivity to the presence of the signal, i.e., it cannot be discriminated from noise. Values less than 0 indicate a lack of sensitivity that might result from a consistent, state-specific "mix-up" or inhibition of responses.
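As a worked check of the d' formula, the following sketch reproduces the 1.86 of the example data from the corrected rates (hr = .99, far = .68). It is independent of the module: phi is built on POSIX::erfc (assumed available, i.e. Perl 5.22 or later), and inv_phi is a simple bisection standing in for ndtri, both hypothetical helpers for illustration only.

```perl
use strict;
use warnings;
use POSIX ();

# Standard normal cumulative distribution function, phi(z).
sub phi { return 0.5 * POSIX::erfc(-$_[0] / sqrt(2)) }

# Inverse of phi by bisection (a stand-in for ndtri, for illustration).
sub inv_phi {
    my ($p) = @_;
    die "inv_phi domain error\n" if $p <= 0 || $p >= 1;
    my ($lo, $hi) = (-10, 10);
    for (1 .. 100) {
        my $mid = ($lo + $hi) / 2;
        if (phi($mid) < $p) { $lo = $mid } else { $hi = $mid }
    }
    return ($lo + $hi) / 2;
}

my ($hr, $far) = (0.99, 0.68);
my $d = inv_phi($hr) - inv_phi($far);   # z(hr) minus z(far)
printf "d' = %.2f\n", $d;               # d' = 1.86
```

Like ndtri, this stand-in refuses rates of exactly 0 or 1, which is why the correction parameter matters for this measure.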
If there are more than two states (not only noise and signal-plus-noise), then d' will be estimated by the following.
An estimate of d' based on the percent correct in a forced-choice task with any number of alternatives. This method is automatically called via sensitivity if the value of states is greater than 2. Only for this condition is it not necessary to calculate the false-alarm rate; the hit-rate is formed, as usual, as the count of hits divided by signal_trials.
At least two methods are available to estimate d' when states > 2; accordingly, there is the option method, set in init, in sensitivity or otherwise. Its default value is smith (this is the method cited by Stanislaw & Todorov (1999)); otherwise, you can use the more generally applicable alexander method:
Smith (1982) method: satisfies "the 2% bound for all M [states] and all percentiles and, except for M = 3 or 4, satisfies a 1% error bound". The specific algorithm used depends on the number of states, n (Smith's M):
If n < 12,
d' = K_{M}.log( ( (n– 1).hr ) / ( 1 – hr ) )
where
K_{M} = .86 – .085 . log(n – 1).
If n >= 12,
d' = A + B . phi^{–1}(hr)
where
A = (–4 + sqrt(16 + 25 . log(n – 1))) / 3
and
B = sqrt( (log(n – 1) + 2) / (log(n – 1) + 1) )
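For fewer than 12 states, Smith's estimate needs only natural logarithms, so it can be sketched without any inverse phi function. The helper name smith_dprime is hypothetical, and only the n < 12 branch is covered; with hr = .866 and 3 states it gives the 2.05 shown in the example data.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::SDT): Smith (1982) estimate
# of d' for 3 to 11 alternatives. Perl's log() is the natural logarithm.
sub smith_dprime {
    my ($hr, $n) = @_;
    die "this sketch covers only 3 <= n < 12\n" if $n < 3 || $n >= 12;
    die "hr must lie between 0 and 1\n" if $hr <= 0 || $hr >= 1;
    my $k = 0.86 - 0.085 * log($n - 1);                 # K_M
    return $k * log((($n - 1) * $hr) / (1 - $hr));
}

printf "%.2f\n", smith_dprime(0.866, 3);   # 2.05
```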
Alexander (2006/1990) method: "gives values of d' with an error of less than 2% (mostly less than 1%) from those obtained by integration for the range d' = 0 (or 1% correct for n [states] > 1000) to 75% correct and an error of less than 4% up to 95% correct for n up to at least 10000, and slightly greater maximum errors for n = 100000. This approximation is comparable to the accuracy of Elliott's table (0.02 in proportion correct) but can be used for any n." (Elliott's table being that in Swets, 1964, pp. 682-683). The estimation is offered by:
d' = ( phi^{–1}(hr) – phi^{–1}(1/n) ) / An
where n is the number of states (or alternatives, alphabet-size, etc.), and An is estimated by:
An = 1 – 1 / (1.93 + 4.75.log_{10}(n) + .63.[log_{10}(n)]^{2})
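The Alexander estimate can be checked against the example data (hr = .866, 3 states, expected d' = 2.00) with the sketch below. It takes An = 1 – 1 / (1.93 + 4.75.log_{10}(n) + .63.[log_{10}(n)]^{2}), the form that reproduces that value and that gives An of about .707 for n = 2, in line with the two-alternative result d' = sqrt(2).phi^{–1}(hr). The helpers are hypothetical: phi relies on POSIX::erfc (assumed available, Perl 5.22 or later) and inv_phi is a bisection standing in for ndtri.

```perl
use strict;
use warnings;
use POSIX ();

# Standard normal cumulative distribution function, phi(z).
sub phi { return 0.5 * POSIX::erfc(-$_[0] / sqrt(2)) }

# Inverse of phi by bisection (a stand-in for ndtri, for illustration).
sub inv_phi {
    my ($p) = @_;
    die "inv_phi domain error\n" if $p <= 0 || $p >= 1;
    my ($lo, $hi) = (-10, 10);
    for (1 .. 100) {
        my $mid = ($lo + $hi) / 2;
        if (phi($mid) < $p) { $lo = $mid } else { $hi = $mid }
    }
    return ($lo + $hi) / 2;
}

# Hypothetical helper: Alexander's approximation to d' for n alternatives.
sub alexander_dprime {
    my ($hr, $n) = @_;
    my $l  = log($n) / log(10);                              # log10(n)
    my $an = 1 - 1 / (1.93 + 4.75 * $l + 0.63 * $l**2);      # An
    return (inv_phi($hr) - inv_phi(1 / $n)) / $an;
}

printf "%.2f\n", alexander_dprime(0.866, 3);   # 2.00
```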
Returns the nonparametric index of sensitivity, A'.
Ranges from 0 to 1. Values greater than 0.5 indicate positive discrimination (1 = perfect performance); values less than 0.5 indicate a failure of discrimination (perhaps due to consistent "mix-up" or inhibition of state-specific responses); and a value of 0.5 indicates no sensitivity to the presence of the signal, i.e., it cannot be discriminated from noise.
Returns Ad', the area under the receiver-operator-characteristic (ROC) curve, equalling the proportion of correct responses for the task as a two-alternative forced-choice task.
If both the hit-rate and false-alarm-rate are either 0 or 1, then sensitivity with this argument returns 0.5.
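Neither A' nor Ad' is given as a formula here, so the sketch below uses the forms stated by Stanislaw & Todorov (1999): A' = .5 + sign(H – F).[(H – F)^{2} + |H – F|] / [4.max(H, F) – 4.H.F], and Ad' = phi(d' / sqrt(2)). That the module follows exactly these forms is an assumption, although they do reproduce the 0.82 and 0.91 of the example data. The helper names are hypothetical, and phi relies on POSIX::erfc (assumed available, Perl 5.22 or later).

```perl
use strict;
use warnings;
use POSIX ();

# Standard normal cumulative distribution function, phi(z).
sub phi { return 0.5 * POSIX::erfc(-$_[0] / sqrt(2)) }

# Hypothetical helper: nonparametric A' from the hit- and false-alarm-rates.
sub aprime {
    my ($h, $f) = @_;
    return 0.5 if $h == $f;                       # no discrimination
    my $diff = $h - $f;
    my $max  = $h > $f ? $h : $f;
    my $term = ($diff**2 + abs($diff)) / (4 * $max - 4 * $h * $f);
    return $diff > 0 ? 0.5 + $term : 0.5 - $term;
}

# Hypothetical helper: area under the ROC curve from a given d'.
sub area { return phi($_[0] / sqrt(2)) }

printf "A'  = %.2f\n", aprime(0.99, 0.68);   # A'  = 0.82
printf "Ad' = %.2f\n", area(1.86);           # Ad' = 0.91
```

A d' of 0 gives an area of 0.5, matching the value returned when both rates sit at the same extreme.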
    $b = $sdt->bias('likelihood|loglikelihood|decision|griers') # based on values of the measure variables already inited or otherwise set
    $b = $sdt->bias('likelihood' => { signal_trials => integer}) # update any of the measure variables
Get one of the decision/response-bias measures, as indicated below, by the first argument string, optionally updating any of the measure variables and options with a subsequent hashref (as given by example for signal_trials, above).
With a yes response indicating that the decision variable exceeds the criterion, and a no response indicating that the decision variable is less than the criterion, the measures indicate if there is a bias toward the yes response, and so a liberal/low criterion, or a bias toward the no response, and so a conservative/high criterion.
The measures are as follows, accessed by giving the name (or at least its first two letters) as the first argument to bias.
Returns the beta measure of response bias, based on the ratio of the likelihood the decision variable obtains a certain value on signal trials, to the likelihood that it obtains the value on noise trials.
beta = exp( ( (phi^{–1}(far)^{2} – phi^{–1}(hr)^{2}) ) / 2 )
Values less than 1 indicate a bias toward the yes response, values greater than 1 indicate a bias toward the no response, and the value of 1 indicates no bias toward yes or no.
Returns the natural logarithm of the likelihood bias, beta.
Values less than 0 indicate a bias toward the yes response, values greater than 0 indicate a bias toward the no response, and a value of 0 indicates no response bias. The measure is not bounded: the example data above give -2.60.
Implements the c parametric measure of response bias. Its value, in standard deviation units, gives the position of the decision criterion relative to the neutral point where the signal and noise distributions cross over, at which there is no response bias and c = 0. The measure is not bounded: the example data above give -1.40.
Values less than 0 indicate a bias toward the yes response; values greater than 0 indicate a bias toward the no response; and a value of 0 indicates no response bias.
Implements Grier's B'' nonparametric measure of response bias.
Ranges from -1 to +1, with values less than 0 indicating a bias toward the yes response, values greater than 0 indicating a bias toward the no response, and a value of 0 indicating no response bias.
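The four bias measures can be checked against the example data (hr = .99, far = .68) with the sketch below. Beta follows the formula given above; for c and B'' no formula is given here, so the sketch assumes the usual forms from Stanislaw & Todorov (1999), c = –(phi^{–1}(hr) + phi^{–1}(far)) / 2 and B'' = sign(H – F).[H(1 – H) – F(1 – F)] / [H(1 – H) + F(1 – F)]. The helpers are hypothetical: phi relies on POSIX::erfc (assumed available, Perl 5.22 or later) and inv_phi is a bisection standing in for ndtri.

```perl
use strict;
use warnings;
use POSIX ();

# Standard normal cumulative distribution function, phi(z).
sub phi { return 0.5 * POSIX::erfc(-$_[0] / sqrt(2)) }

# Inverse of phi by bisection (a stand-in for ndtri, for illustration).
sub inv_phi {
    my ($p) = @_;
    die "inv_phi domain error\n" if $p <= 0 || $p >= 1;
    my ($lo, $hi) = (-10, 10);
    for (1 .. 100) {
        my $mid = ($lo + $hi) / 2;
        if (phi($mid) < $p) { $lo = $mid } else { $hi = $mid }
    }
    return ($lo + $hi) / 2;
}

my ($hr, $far) = (0.99, 0.68);
my ($zh, $zf)  = (inv_phi($hr), inv_phi($far));

my $logbeta = ($zf**2 - $zh**2) / 2;   # natural log of beta
my $beta    = exp($logbeta);
my $c       = -($zh + $zf) / 2;        # assumed standard form of c

# Grier's B'' (assumed standard form); 0 when both rates are 0 or 1.
my ($vh, $vf) = ($hr * (1 - $hr), $far * (1 - $far));
my $griers = ($vh + $vf) == 0
    ? 0
    : ($hr <=> $far) * ($vh - $vf) / ($vh + $vf);

printf "beta = %.2f, logbeta = %.2f, c = %.2f, B'' = %.2f\n",
    $beta, $logbeta, $c, $griers;
# beta = 0.07, logbeta = -2.60, c = -1.40, B'' = -0.91
```

All four values fall on the yes side of neutral (beta below 1, the others below 0), as expected for data with a false-alarm rate of .68.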
    $sdt->criterion() # assume d' and c can be calculated from already inited param values
    $sdt->criterion(d => float, c => float)
Alias: dc2k, crit
Returns the value of the criterion for given values of sensitivity d' and bias c, viz.: k = d' / 2 + c.
    $sdt->dc2hr() # assume d' and c can be calculated from already inited param values
    $sdt->dc2hr(d => float, c => float)
Returns the hit-rate estimated from given values of sensitivity d' and bias c, viz.: hr = phi(d' / 2 - c).
    $sdt->dc2far() # assume d' and c can be calculated from already inited param values
    $sdt->dc2far(d => float, c => float)
Returns the false-alarm-rate estimated from given values of sensitivity d' and bias c, viz.: far = phi(-d' / 2 - c).
    $sdt->dc2logbeta() # assume d' and c can be calculated from already inited param values
    $sdt->dc2logbeta(d => float, c => float)
Returns the log-likelihood bias (the natural logarithm of beta) estimated from given values of sensitivity d' and bias c, viz.: log(beta) = d' . c.
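The four conversions from d' and c can be checked together. The sketch below applies the formulas as given to the rounded example values d' = 1.86 and c = -1.40 and recovers the rates and measures of the example data. It is independent of the module; phi is a hypothetical helper built on POSIX::erfc (assumed available, Perl 5.22 or later).

```perl
use strict;
use warnings;
use POSIX ();

# Standard normal cumulative distribution function, phi(z).
sub phi { return 0.5 * POSIX::erfc(-$_[0] / sqrt(2)) }

my ($d, $c) = (1.86, -1.40);

my $k       = $d / 2 + $c;          # criterion
my $hr      = phi($d / 2 - $c);     # hit-rate
my $far     = phi(-$d / 2 - $c);    # false-alarm-rate
my $logbeta = $d * $c;              # natural log of beta

printf "k = %.2f, hr = %.2f, far = %.2f, logbeta = %.2f\n",
    $k, $hr, $far, $logbeta;
# k = -0.47, hr = 0.99, far = 0.68, logbeta = -2.60
```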
Alexander, J. R. M. (2006). An approximation to d' for n-alternative forced choice. From http://eprints.utas.edu.au/475/.
Lee, M. D. (2008). BayesSDT: Software for Bayesian inference with signal detection theory. Behavior Research Methods, 40, 450-456.
Smith, J. E. K. (1982). Simple algorithms for M-alternative forced-choice calculations. Perception and Psychophysics, 31, 95-96.
Stanislaw, H., & Todorov, N. (1999). Calculation of signal detection theory measures. Behavior Research Methods, Instruments, and Computers, 31, 137-149.
Swets, J. A. (1964). Signal detection and recognition by human observers. New York, NY, US: Wiley.
Math::Cephes : The present module imports/depends upon the ndtr (phi) and ndtri (inverse phi) functions from this package.
Statistics::ROC : Receiver-operator characteristic curves.
Expects descriptive counts, not raw observations or confidence ratings; this limits the measures that can be implemented. The methods load and unload are reserved to implement handling of data lists.
Perl's params modules do not seem to effect the required validation of parameters needed for each measure; the present work-around is obsessive-compulsive, while not exhaustive of all wayward possibilities, and requires optimisation and extension. It is presently quite possible to suffer an inelegant death should anything too unusual, or impoverished of details, be attempted in the life of the module.
See Changes file in installation dist.
rgarton AT cpan DOT org
This program is free software. It may be used, redistributed and/or modified under the same terms as Perl-5.6.1 (or later) (see http://www.perl.com/perl/misc/Artistic.html).
To the maximum extent permitted by applicable law, the author of this module disclaims all warranties, either express or implied, including but not limited to implied warranties of merchantability and fitness for a particular purpose, with regard to the software and the accompanying documentation.
This ends documentation for a Perl implementation of signal detection theory measures of sensitivity and bias.