Forkcasting
As clear as a puddle of mud

MechanicalTurk Andon

Stop your MTurk experiment in an emergency

This originally appeared on my Medium on 2018-Jun-30.

Sometimes the shit hits the fan. You need to stop the HIT going to more workers and you need to stop it now. This is what an Andon is for in manufacturing. Cancel, stop, abort, or otherwise halt immediately.

This happened to me tonight. I set up an academic experiment as a HIT, piloted the test with friends and family, trialled it manually in the sandbox, created a CloudWatch dashboard to monitor my experiment, and finally deployed it to one person as a shake-down run. Things didn’t go to plan.

First, I noticed that my metrics were odd. They indicated that more than one person had accepted the HIT. I checked my HIT dashboard and found no-one had been paid. But the workers kept coming in at a rate of about one every 10 minutes.

Then the emails started. “There is no way to submit the HIT.” Workers were completing the full experiment, but were unable to claim their rightful payment. I needed to stop workers from taking the HIT as fast as possible.

Unfortunately, I had used the CLI to create a HIT which uses an ExternalQuestion, so it did not show up on my dashboard, and the delete-hit command only works if the HIT is not in progress.

I was unsure how to stop the issue. As a stop-gap measure I made sure that new MTurk workers would waste as little time on my HIT as possible by putting up a ‘HIT BROKEN’ landing page on the experiment. Not great, but definitely a step forwards.

After some digging, I found that I could expire the HIT immediately by setting its expiration to a time in the past, which effectively sets its lifetime to zero. This stops any new workers from accepting the HIT. The script below calls the relevant MTurk API, provided you have linked an AWS account to your MTurk requester account.

#!/bin/bash

# Safer shell scripting -- bail on errors, unset vars, etc.
set -euf

# Read options, -h HIT_ID
while getopts "h:" OPT
do
  case "${OPT}" in
    h) HIT="${OPTARG}";;
    \?) printf "Error.\n"; exit 1;;
  esac
done

# Bit of input validation
if [ -z "${HIT+x}" ]
then
    printf -- "-h HIT required.\n";
    exit 2;
fi

ENDPOINT="https://mturk-requester.us-east-1.amazonaws.com"
export AWS_DEFAULT_REGION="us-east-1"

printf "AWS Access Key ID> "
read -r AWS_ACCESS_KEY_ID

# -s keeps the secret key from being echoed to the terminal.
printf "AWS Secret key> "
read -rs AWS_SECRET_ACCESS_KEY
printf "\n"

# Set up an exit trap to clear the credentials.
onExit() {
    unset AWS_ACCESS_KEY_ID
    unset AWS_SECRET_ACCESS_KEY
}
trap onExit EXIT

export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY

# Expire the HIT
aws mturk update-expiration-for-hit \
  --hit-id "${HIT}" \
  --expire-at "2018-01-01T00:00:00+00:00" \
  --endpoint-url="${ENDPOINT}"

# Display the HIT details for human check.
aws mturk get-hit \
  --hit-id "${HIT}" \
  --endpoint-url="${ENDPOINT}"

The shell script above was tested on a Linux box. Call it using:

% ./andon-cord -h HIT_ID

The AWS CLI list-hits command can show your HIT_ID.
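
If you don't have the HIT_ID to hand, something like the following can dig it out. This is a minimal sketch rather than part of the original script: the function name list_hits is made up for illustration, and it assumes your AWS credentials are already available to the CLI (for example via environment variables) and that you want the production endpoint rather than the sandbox.

```shell
#!/bin/sh
# Sketch: print one line per HIT so you can find the HIT_ID to expire.
# Assumes AWS credentials are already configured for the AWS CLI.
set -eu

# Production endpoint, as used in the andon-cord script. Override via the
# ENDPOINT environment variable if you need the sandbox instead.
ENDPOINT="${ENDPOINT:-https://mturk-requester.us-east-1.amazonaws.com}"
export AWS_DEFAULT_REGION="us-east-1"

# list_hits is a hypothetical helper name, not an AWS command.
# Columns: HIT ID, status, assignments accepted but not yet submitted, title.
list_hits() {
  aws mturk list-hits \
    --endpoint-url "${ENDPOINT}" \
    --query 'HITs[].[HITId,HITStatus,NumberOfAssignmentsPending,Title]' \
    --output text
}
```

Calling list_hits prints the HITs on your account, and the first column is the value to pass to andon-cord with -h. The pending-assignments column is a rough indication of how many workers are still mid-HIT.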

I still need to get all of the affected workers and pay them somehow.

There’s a general lesson that I should’ve applied. Always have an ‘oh shit!’ plan, whether it’s going back to a previous version of the system or just turning it off. This has only cost me ~25 USD, but other companies have lost a lot more by ignoring this lesson.