Missing algorithm step #2

Closed
opened 2018-08-23 14:57:47 +00:00 by mih · 8 comments
mih commented 2018-08-23 14:57:47 +00:00 (Migrated from github.com)

> Velocity and acceleration data were appropriately adjusted to compensate for the time shift introduced by the filters.

This is stated in Nyström et al., 2010, but I cannot see it being done in the code at all.
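For reference, a causal pass of a Savitzky-Golay FIR kernel delays its output by (window - 1) / 2 samples, so the compensation the paper describes amounts to shifting the filtered traces back by that amount. A minimal sketch of the problem and the shift (illustrative parameters, not the project's actual code or settings):

```python
import numpy as np
from scipy.signal import savgol_coeffs, lfilter

# illustrative parameters, not the project's settings
window, order = 19, 2
coeffs = savgol_coeffs(window, order)

t = np.arange(200, dtype=float)
signal = 0.5 * t + 0.001 * t ** 2  # smooth test signal; an order-2 SG filter reproduces it exactly

# lfilter applies the FIR kernel causally, so the smoothed output
# lags the input by (window - 1) // 2 samples
delayed = lfilter(coeffs, [1.0], signal)
delay = (window - 1) // 2

# compensate by discarding the first `delay` samples of the filtered trace,
# re-aligning the smoothed signal with the raw one
aligned = delayed[delay:]
```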

mih commented 2018-08-23 15:24:58 +00:00 (Migrated from github.com)

@AdinaWagner does that ring a bell re the other work of that group that you have read?

adswa commented 2018-08-23 15:32:02 +00:00 (Migrated from github.com)

@mih no, never came across this in all other studies I read. Will check in their book tomorrow

mih commented 2018-08-23 15:37:11 +00:00 (Migrated from github.com)

The solution is likely to use https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.savgol_coeffs.html#scipy.signal.savgol_coeffs
and perform the filtering via the `filtfilt()` function to get a zero-lag filter.
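A minimal sketch of that suggestion (parameters are illustrative, not the project's settings). Note that `scipy.signal.savgol_filter` applies a single centered pass and may be the simpler option:

```python
import numpy as np
from scipy.signal import savgol_coeffs, savgol_filter, filtfilt

window, order = 19, 2            # illustrative, not the project's settings
coeffs = savgol_coeffs(window, order)

t = np.arange(300, dtype=float)
x = 0.5 * t + 0.002 * t ** 2     # smooth test signal

# forward-backward filtering cancels the phase delay of the FIR kernel
zero_lag = filtfilt(coeffs, [1.0], x)

# a single centered SG pass, for comparison: also delay-free
centered = savgol_filter(x, window, order)
```

One caveat with the `filtfilt()` route: it runs the smoother twice (forward and backward), so the effective frequency response is the square of a single SG pass; `savgol_filter` applies the kernel once, centered, with no delay.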

adswa commented 2018-08-23 15:45:01 +00:00 (Migrated from github.com)

Marcus Nyström published the Matlab code for the algorithm on his homepage. It must have happened recently: I remember searching for, and not finding, any available code online when I read the thesis: dev.humlab.lu.se/www-transfer/people/marcus-nystrom/EventDetector1.1.zip

The readme states that corrections were introduced in June 2018. Will check out what they did later today and get back if I find something interesting.

adswa commented 2018-08-23 15:58:33 +00:00 (Migrated from github.com)

"Because the ONH employs a Savitzky–Golay filter with a window length for our data of 19ms, and given the fact that this filter would delay the resulting smoothed position and velocity calculation by 9 ms and that this filter delay was never corrected in the original code, the algorithm could not be intended to classify the raw position signal. (Although Nyström & Holmqvist, 2010 , claim that the delay is removed, this is not correct." (found here: https://link.springer.com/article/10.3758%2Fs13428-018-1050-7)

Here is a discussion and a Matlab solution of the issue: https://digital.library.txstate.edu/bitstream/handle/10877/6874/DiscussionOfTheFilterDelayIssueWithTheONH.pdf?sequence=8&isAllowed=y

"Because the ONH employs a Savitzky–Golay filter with a window length for our data of 19ms, and given the fact that this filter would delay the resulting smoothed position and velocity calculation by 9 ms and that this filter delay was never corrected in the original code, the algorithm could not be intended to classify the raw position signal. (Although Nyström & Holmqvist, 2010 , claim that the delay is removed, this is not correct." (found here: https://link.springer.com/article/10.3758%2Fs13428-018-1050-7) here is a discussion and matlab solution of the issue: https://digital.library.txstate.edu/bitstream/handle/10877/6874/DiscussionOfTheFilterDelayIssueWithTheONH.pdf?sequence=8&isAllowed=y
adswa commented 2018-08-23 19:57:03 +00:00 (Migrated from github.com)

I've read this very recent paper from Friedman et al: https://link.springer.com/article/10.3758%2Fs13428-018-1050-7
It reads like an extensive protocol of our own combined WTFs when we looked at the algorithm and its results. I'll highlight the authors' main points; they might reaffirm suspicions of flaws or be helpful in other ways.

The objective of the paper was to improve the Nyström & Holmqvist (2010) algorithm after it produced dissatisfying results. They assessed the errors the algorithm made via human ratings of eye tracking data from 20 participants (26-second recordings of reading, tracked with an Eyelink 1000), wrote an improved algorithm (in Matlab, can be found here: https://digital.library.txstate.edu/bitstream/handle/10877/6874/MNH_Code.zip?sequence=19&isAllowed=y), and compared both algorithms.
The most frequent errors of the Nyström-Holmqvist algorithm were:

  • the delay introduced by the uncorrected SG filter; they suggest a correction in Matlab using the `sgolayfilt` function - if anyone has a better Matlab understanding than me, this information could be helpful.
  • per 26-second recording, on average 10 fixations were not detected (given that our recordings are much longer, this would amount to a lot of missing fixations...). The authors explain this, as @mih did on Wednesday, with fixations being discarded if they contain any artifact.
  • 50% of all errors were saccades starting too early; an additional 30% were saccades ending too late. The authors suggest this is due to the classification being based on local minima. I'm not sure this error can be detected by the current unit tests. The human eye tracking data the authors use to "test" the algorithms' performance show a slight drop / "undershoot" before and after the saccade (fig. 2) - does the artificial data have the same properties?
  • A nominally infrequent error (1%) was saccades not being detected at all. This stemmed from "total failures" in 8.3% of subjects, in which large numbers of saccades were missed. The cause is the adaptive thresholds relying on means and standard deviations of velocity data during fixation blocks, and those velocity distributions were found to be extremely skewed. Highly skewed distributions led either to unreasonably high saccade peak velocity detection thresholds that prevented small saccades from being detected, or to unreasonably small thresholds, for which saccades could not be detected when local velocity noise exceeded them.

Some major changes Friedman et al. implemented were:

  • to drop almost all adaptive parameters. They claim adaptiveness has no proven advantages. To me that seems a bit drastic. However, as a sensible step, it could be useful to check whether there is high skew in the velocities in our data that could affect threshold setting and lead to "total failure" situations. A possible fix could then be to use the median instead of the mean, etc.
  • allowing fixation periods to start or end with artifacts - something @mih has, I think, already thought of
  • in saccade start and end detection, they add "a variable number of samples" after or before, respectively, depending on the point where saccades pass the subthreshold and the local minima.
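The skew check and median-based fix floated above could look something like this. This is a hypothetical sketch: the simulated velocity values and the factor of 6 are made up for illustration, not the algorithm's actual parameters:

```python
import numpy as np
from scipy.stats import skew

# simulate a skewed "fixation velocity" sample: mostly low noise
# plus a heavy artifactual high-velocity tail (all values made up)
rng = np.random.default_rng(0)
velocities = np.concatenate([
    rng.normal(10.0, 2.0, 950),     # deg/s, typical fixation noise
    rng.exponential(200.0, 50),     # artifactual heavy tail
])

# mean/SD-based threshold: blown up by the skewed tail
mean_thresh = velocities.mean() + 6 * velocities.std()

# median/MAD-based threshold: robust to the skew
# (1.4826 scales the MAD to be comparable to an SD under normality)
mad = np.median(np.abs(velocities - np.median(velocities)))
robust_thresh = np.median(velocities) + 6 * 1.4826 * mad
```

Under these simulated conditions, the mean/SD threshold ends up far above the robust one, which is the "total failure" mode Friedman et al. describe.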
mih commented 2018-08-24 05:35:34 +00:00 (Migrated from github.com)

Thx @AdinaWagner. I went through the description and here are the differences that I found:

  • spikes in the [..] data were filtered using the Stampe (1993) filter. [We were informed by Dr. Holmqvist
    (personal communication) that the ONH data were also filtered by the Stampe filter.]
  • Use of the conv function (convolution), as in `conv(X, g(:,1), 'same')`, where X is the signal and g are
    the filter coefficients, would have removed the delay.
  • the ONH filter width was 19 samples (i.e., 19 ms), but MNH changes this to 7 ms. Both algorithms
    used the same filter order—that is, 2.
  • physical velocity threshold: 1,500 deg/s, or the acceleration signal was greater than 100,000 deg/s².
  • both saccade thresholds are fixed values (that is a NO-GO for us IMHO)
  • the fixation/artifact related differences mentioned by @AdinaWagner

Both algorithms cannot deal with pursuit events.
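The conv-based fix quoted in the list has a direct numpy analogue. A sketch (illustrative parameters, not the project's code) showing that a centered convolution with the SG kernel matches scipy's delay-free `savgol_filter` away from the edges:

```python
import numpy as np
from scipy.signal import savgol_coeffs, savgol_filter

window, order = 19, 2            # illustrative settings
coeffs = savgol_coeffs(window, order)

t = np.arange(300, dtype=float)
x = 0.5 * t + 0.001 * t ** 2     # smooth test signal

# numpy analogue of Matlab's conv(X, g(:,1), 'same'):
# mode='same' centers the kernel on each sample, so no delay is introduced
smoothed_conv = np.convolve(x, coeffs, mode='same')

# scipy's one-pass centered SG filter, for comparison
smoothed_sg = savgol_filter(x, window, order)
```

The two only differ near the edges, where `savgol_filter` (with its default `mode='interp'`) uses a polynomial fit instead of zero-padded convolution.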

mih commented 2018-08-24 09:11:49 +00:00 (Migrated from github.com)

Turns out, the python function does adjust for the time shift!

Reference
studyforrest/data-eyemovementlabels#2