Introducing Typingpool, My Software for Easy Audio Transcription


Today I’m releasing Typingpool, software that makes audio transcriptions easier and cheaper.

Typingpool chops your audio into small bits and routes them to the labor marketplace Mechanical Turk, where workers transcribe the bits in parallel. This produces transcripts much faster than any lone transcriber for as little as one-eighth of what you pay a transcription service. Better still, workers keep 91 percent of the money you spend.

At the end of the process you have an interactive transcript that can be opened in your web browser, with audio embedded every paragraph or so. Having audio right next to the corresponding text greatly eases double-checking and correction. No conventional transcript is this interactive or easy to fact check.

You use Typingpool through a series of command-line programs, distributed as a Ruby gem. For the non-geek, Typingpool can be a pain to install, and if you’ve never used command-line programs it will take extra effort to learn. But if you create many transcripts Typingpool can save you a great deal of time and money. And while you have to pay the workers who handle your audio on Mechanical Turk, Typingpool is completely free.

Typingpool runs on Mac OS X and Linux.

Background

Typingpool builds on techniques outlined by Andy Baio in a popular 2008 blog post, which showed how audio could be divided and uploaded to Mechanical Turk for “cheap, easy audio transcription.” I used Andy’s techniques to quickly transcribe hours and hours of interviews conducted for my book on side projects, The 20% Doctrine. (Thanks to my editor Debbie Stier for pointing me at Mechanical Turk!)

Unlike the process in Andy’s post, the process I settled on is highly automated. Andy’s process works great for occasional transcription jobs, whereas mine is designed for people who frequently need to make transcripts. Instead of manually editing my audio, laboriously creating Excel spreadsheets, and copy/pasting text into a transcript as Andy did, I outsourced these tasks to software. Instead of clicking around on the Mechanical Turk website, I sent it jobs automatically, through the API.

To automate, I ended up writing a whole library of Ruby code, after simpler approaches failed. This code became Typingpool. I’ve been using and developing it since Feb. 2011.

How it works

Here’s the high-level view: You point Typingpool at some audio files. Typingpool converts the files to mp3 format, merges them together, and chops them into 1-minute chunks (adjustable). Then you tell Typingpool how much you want to pay to transcribe each chunk and which template you want to use to create worker assignments (several templates are included, all customizable). Typingpool uploads your audio and assignments to Amazon’s servers, where they are immediately made available to workers. As assignments are returned, you can use Typingpool to approve or reject each one. As you approve more and more assignments, your transcript grows until it is complete.
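To make the chunking step concrete, here is a minimal Ruby sketch that lists the chunk file names a recording of a given length would produce. It is not Typingpool’s own code: the helper name chunk_names is made up for illustration, and the NAME.MM.SS.mp3 naming pattern is only inferred from the example file names shown later in this post.

```ruby
# Sketch only: lists chunk file names for a recording of a given length,
# assuming the "NAME.MM.SS.mp3" pattern seen in this post's examples.
# chunk_names is a hypothetical helper, not part of Typingpool.
def chunk_names(project, total_seconds, chunk_seconds = 60)
  names = []
  offset = 0
  while offset < total_seconds
    # Each chunk is labeled with the minute and second where it starts.
    minutes, seconds = offset.divmod(60)
    names << format('%s.%02d.%02d.mp3', project, minutes, seconds)
    offset += chunk_seconds
  end
  names
end

# A 2.5-minute recording split into 1-minute chunks yields three files.
puts chunk_names('Chad Interview', 150)
```

The third argument stands in for the adjustable chunk length mentioned above.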

You can cancel a transcription job at any time. You can provide a list of unusual words in the audio so transcribers are more accurate. You can re-assign chunks that have expired or been rejected. You can set deadlines for how long each worker may take on an assignment, when assignments are pulled from Mechanical Turk, and how long you have to review an assignment before it is auto-approved.

You interact with Typingpool through a collection of command-line programs: tp-make, tp-assign, tp-review, and tp-finish. You may sometimes need to use the program tp-collect. Another program, tp-config, is used only when installing Typingpool. A simple config file controls defaults (it’s at ~/.typingpool and in YAML format) and a cache file keeps network connections to a minimum.

Output

The final output of Typingpool is a folder on your computer containing a transcript file. The transcript file is HTML (a web page you can open in your browser) with audio chunks embedded alongside each associated transcript chunk.

The project folder also includes supporting files, including a CSV data file used to store raw transcript chunks, Amazon Mechanical Turk HIT information, and other metadata; JavaScript code that swaps in Flash players on browsers that don’t support mp3 files in audio tags; the original audio files and the audio chunks generated from them; and a CSS file.

The folder is laid out like so:

  Chad Interview/
      -> transcript.html | transcript_in_progress.html
      -> audio/
          -> chunks/
              -> Chad Interview.00.00.mp3
              -> Chad Interview.01.00.mp3
              -> ... [snip]
          -> originals/
              -> chad1.WMA
              -> chad2.WMA
      -> data/
          -> assignment.csv
          -> id.txt
          -> subtitle.txt
      -> etc/
          -> audio-compat.js
          -> transcript.css
          ->  About these files - readme.txt
          -> player/
              -> audio-player.js
              -> license.txt
              -> player.swf
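If you want to inspect a project folder from a script, the layout above is regular enough to walk with a few lines of Ruby. The sketch below is only an illustration: project_summary is a made-up helper, and treating the presence of transcript.html as “finished” is my reading of the layout, not documented behavior.

```ruby
# Sketch only: summarizes a project folder using the layout shown above.
# project_summary is a hypothetical helper, and reading transcript.html
# as "finished" is an assumption made for illustration.
def project_summary(dir)
  status =
    if File.exist?(File.join(dir, 'transcript.html'))
      :finished
    elsif File.exist?(File.join(dir, 'transcript_in_progress.html'))
      :in_progress
    else
      :no_transcript
    end

  # Count the mp3 chunks under audio/chunks, if that folder exists.
  chunk_dir = File.join(dir, 'audio', 'chunks')
  chunks = Dir.exist?(chunk_dir) ? Dir.entries(chunk_dir).count { |f| f.end_with?('.mp3') } : 0

  { status: status, chunks: chunks }
end

# Pass a project folder on the command line to print its summary.
p project_summary(ARGV[0]) if ARGV[0]
```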

Details

For further details, keep reading below.

Usage

A typical workflow will use the bundled scripts in this order:

tp-make -> tp-assign -> [wait] -> tp-review -> tp-finish

tp-review may be called repeatedly, until transcripts for all audio chunks have been processed. Similarly, tp-assign may be called repeatedly, for example to re-assign chunks rejected using tp-review, or to re-assign chunks that have expired.

(An alternate workflow would go like this:

  tp-make -> [manually upload data/assignment.csv to Amazon RUI] ->
    [wait] -> [approve/reject assignments via RUI] -> tp-collect ->
    tp-finish

)

Example:

  tp-make 'Chad Interview' chad1.WMA chad2.WMA --unusual 'Hack Day,  
    Yahoo' --subtitle 'Phone interview re Yahoo Hack Day'
  
     # => Converting chad1.WMA to mp3
     # => Converting chad2.WMA to mp3
     # => Merging audio
     # => Splitting audio into uniform bits
     # => Uploading Chad Interview.00.00.mp3 to
            ryantate42.s3.amazonaws.com as Chad
            Interview.00.00.33ca7f2cceba9f8031bf4fb7c3f819f4.LHFJEM.mp3
     # => Uploading Chad Interview.01.00.mp3 to
            ryantate42.s3.amazonaws.com as Chad
            Interview.01.00.33ca7f2cceba9f8031bf4fb7c3f819f4.XMWNYW.mp3
     # => Uploading Chad Interview.02.00.mp3 to
            ryantate42.s3.amazonaws.com as Chad
            Interview.02.00.33ca7f2cceba9f8031bf4fb7c3f819f4.FNEIWN.mp3
     # => ... [snip]
     # => Done. Project at:
     # => /Users/ryantate/Desktop/Transcripts/Chad Interview
  
  
  tp-assign 'Chad Interview' interview/nameless --reward 1.00
    --deadline 90m --approval 6h --lifetime 2d

     # => Figuring out what needs to be assigned
     # => 85 assignments total
     # => 85 assignments to assign
     # => Deleting old assignment HTML from ryantate42.s3.amazonaws.com
     # => Uploading assignment HTML to ryantate42.s3.amazonaws.com
     # => Assigning
     # => Assigned 85 transcription jobs for $85
     # => Remaining balance: $115.00
  
  [Wait...]
  
  
  tp-review 'Chad Interview'
  
     # => Gathering submissions from Amazon
     # => Matching submissions with local projects
     # => 
     # => Transcript for: https://ryantate42.s3.amazonaws.com/
            Chad%20Interview.29.00.263d492275a81afb005c8231d8d8afdb.
             UEMOCN.mp3
     # => Project: Chad Interview: Phone interview re Yahoo Hack Day
     # => Submitted at: 2012-08-11 17:00:36 -0700 by A9S0AOAI8HO9P
     # => 
     # =>   Chad: ... so it had sort of some geek history. And the
     # =>   weather was really bad. But it was an indoor event,
     # =>   right? So people were staying indoors. And like very
     # =>   early... And there was all this really expensive gear
     # =>   that the BBC had. Like these cameras that guys were like
     # =>   riding around on and stuff, huge sound stage, bigger than
     # =>   the one we had in Sunnyvale.
     # =>   
     # =>   Two hours into the event, we heard this big lightning
     # =>   strike, because we were up on a hill in London. And all
     # =>   the lights went out and the roof opened up in the
     # =>   building. What we didn't know is the fire supression
     # =>   system in that building which got blown up by the
     # =>   lightning during a fire would cause the roof to open
     # =>   up. So we had all these geeks with equipment and all this
     # =>   BBC equipment and it was literally raining on them.
     # =>  
     # => (A)pprove, (R)eject, (Q)uit, [(S)kip]? (1/20) 
     
   a
      
     # => Approved. Chad Interview transcript updated.
     # => 
     # => Transcript for: https://ryantate42.s3.amazonaws.com/
            Chad%20Interview.30.00.263d492275a81afb005c8231d8d8afdb.
            RXNKRN.mp3
     # => Project: Chad Interview: Phone interview re Yahoo Hack Day
     # => Submitted at: 2012-08-11 17:00:58 -0700 by A9S0AOAI8HO9P
     # => 
     # =>   Blah blah blah blah okay I am done typing byeeeeeeee
     # => 
     # => (A)pprove, (R)eject, (Q)uit, [(S)kip]? (2/20) 
  
  r
  
     # => Rejection reason, for worker: 
  
  There's no transcription at all, just nonsense

     # => Rejected
     # => 
     # => Transcript for...
     # => ... [snip]

  
  tp-finish 'Chad Interview'
  
     # => Removing from Amazon
     # =>   Collecting all results
     # =>   Removing HIT 2GKMIKMN9U8PNHKK58NXL3SU4TCBSN (Reviewable)
     # =>   Removing from data/assignment.csv
     # =>   Removing from local cache  
     # =>   Removing HIT 2CFX2Q45UUKQ2HXZU8SNV8OG6CQBTC (Assignable)
     # =>   Removing from data/assignment.csv
     # =>   Removing from local cache
     # =>   Removing HIT 294EZZ2MIKMNNDP1LAU8WWWXOEI7O0...
     # =>   ... [snip]
     # =>   Removing Chad Interview.00.00.
              263d492275a81afb005c8231d8d8afdb.ORSENE.html from 
              ryantate42.s3.amazonaws.com
     # =>   Removing Chad Interview.01.00...
     # =>   ... [snip]
     # =>   Removing Chad Interview.00.00.
              263d492275a81afb005c8231d8d8afdb.RNTVLN.mp3 from
              ryantate42.s3.amazonaws.com
     # =>   Removing Chad Interview.01.00....
     # =>   ... [snip]

If you have additional questions, they may be answered in the section “Usage: Additional details” below, or in one of the links in the “More” section, also below.
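The time arguments in the tp-assign example above (90m, 6h, 2d) are compact durations. As a rough illustration of what they mean, the Ruby sketch below converts such strings to seconds. It is not Typingpool’s parser, which may accept other forms; duration_to_seconds and UNIT_SECONDS are names made up for this example, and only the three units that appear in this post are handled.

```ruby
# Sketch only: converts strings such as "90m", "6h", or "2d" into seconds.
# Typingpool's own parser may accept other forms; this handles only the
# units that appear in this post. Both names here are hypothetical.
UNIT_SECONDS = { 'm' => 60, 'h' => 3600, 'd' => 86_400 }.freeze

def duration_to_seconds(text)
  match = /\A(\d+)([mhd])\z/.match(text)
  raise ArgumentError, "unrecognized duration: #{text.inspect}" unless match
  match[1].to_i * UNIT_SECONDS.fetch(match[2])
end

%w[90m 6h 2d].each { |t| puts "#{t} = #{duration_to_seconds(t)} seconds" }
```

So in that example a worker gets 5,400 seconds per assignment, you get 21,600 seconds to review each submission, and unclaimed assignments are pulled after 172,800 seconds.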

Installation

There are four broad steps to install Typingpool:

  1. Configure your Amazon account.
  2. Install prerequisites (rvm, audio tools, Ruby, and perhaps a package manager).
  3. Install the Typingpool gem.
  4. Run tp-config.

1. Configure your Amazon account

The below assumes you’ve already got an account on Amazon.com that you want to begin using for Mechanical Turk and Typingpool. It walks you through signing up for Amazon Mechanical Turk and Amazon S3, then obtaining the security credentials you’ll use to configure Typingpool later.

Visit requester.mturk.com. Click “Create an Account” in the top right corner. If prompted, sign in with your usual Amazon account. Fill out the “User Registration” page, using your own name for “Company Name” (unless you personally have your own named business entity).

After you’ve created your account, you’ll want to put some money in it to cover the cost of your first Mechanical Turk assignment. If you don’t have plans to immediately use Typingpool/Mechanical Turk, you can skip this step.

To fund your first assignment, click “Account Settings” in the top right corner. Then click “Prepay for Mechanical Turk HITs” under the section “Your Balance.” You’ll be prompted to enter an amount. Put in just enough to cover your first assignment:

minutes of audio
-times-
how much you’re willing to pay per minute (default $0.75)
-times-
1.1 (to cover the 10% Amazon surcharge).

So for 60 minutes of audio at $0.75, you’d put in $49.50.
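If you’d rather let a script do the arithmetic, the same calculation looks like this in Ruby. The helper names are made up for this example, and the 10% surcharge is simply the figure quoted above, so check it against Amazon’s current fees before relying on it.

```ruby
# Sketch only: the prepay arithmetic from the text above. Helper names are
# hypothetical, and the 10% surcharge is the figure quoted in this post.
def prepay_amount(minutes, rate_per_minute = 0.75, surcharge = 0.10)
  (minutes * rate_per_minute * (1 + surcharge)).round(2)
end

# Share of your total spend that reaches workers, as a whole percentage.
def worker_share_percent(surcharge = 0.10)
  (100 / (1 + surcharge)).round
end

puts format('Prepay $%.2f for 60 minutes', prepay_amount(60))
puts "Workers keep about #{worker_share_percent} percent"
```

The second helper shows where the 91 percent figure at the top of this post comes from: with a 10% surcharge on top, workers receive 1/1.1 of what you spend.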

There’s no automated way to get your prepaid balance refunded, so only pre-pay what you plan to immediately use, at least at first.

After entering the amount, click “Continue to Amazon Payments.” You might be prompted to log in with your Amazon account again. You’ll be prompted to add a credit card and to confirm the amount of the pre-pay.

Next you’ll need to sign up for Amazon S3, or Simple Storage Service. This is the website Typingpool uses to make your audio files available to transcribers around the world.

Visit aws.amazon.com/s3. Click “Sign Up” near the top right corner. You may be prompted to sign in. Use your usual Amazon account.

Fill out the contact information form.

Enter your credit card details. Amazon S3 will charge small amounts when Typingpool uses it to temporarily host your audio. These fees are a tiny fraction of what you spend on transcription. For example, I paid 1.08 cents to host an hour-long mp3 on S3 for three days versus $49.50 for the workers and Mechanical Turk fees.

At this stage, you may be prompted for a phone number to verify your identity. The call is automated and painless.

Your account will either be confirmed or Amazon will promise to email you when it’s confirmed. In my experience, Amazon may “forget” to send this email even when your S3 account is up and running.

Next you will need to obtain Amazon S3 security credentials for use with Typingpool. Go to aws.amazon.com and open the “My Account” menu near the top right corner. When the menu opens, select “Security Credentials” at the bottom.

Scroll down to the “Access Credentials” section, which contains a subsection called “Access Keys,” which contains a subsection called “Your Access Keys.” There should already be one key; if not, click “Create a new Access Key.”

Copy the letters under “Access Key ID,” then paste them somewhere you can get to later, like a text file or Word doc. Click “Show” under “Secret Access Key,” then copy the secret access key, then paste it under your access key. You’ll use both these keys later to set up Typingpool.

2. Install prerequisites

In this section you’re going to install the various bits of external software Typingpool needs in order to run. There are quite a few such bits!

You’ll need rvm, the Ruby Version Manager, since Typingpool needs a newer version of Ruby than presently ships with any Mac or Ubuntu Linux system.

You’ll need audio tools like ffmpeg and other miscellaneous prerequisites (xml and zip libraries, etc.).

You’ll need a Ruby of version 1.9.2 or better.

If you’re installing on a Mac, you’re also going to install a general package manager called Homebrew (or MacPorts, if you prefer; see the instructions in parentheses). The package manager will let you more easily install the audio programs Typingpool needs.

Don’t worry, I’ve tested this process on something like nine different system configurations and will walk you through it. Find your operating system below to begin.


Mac OS X 10.8 Mountain Lion and
Mac OS X 10.7 Lion
  • Make sure your Mac is up to date: Select Apple menu/Software Update. Install any updates.
  • Install Xcode from the Mac App Store: Open the Mac App Store, search for Xcode, and click Install. Finish the install by launching Xcode and installing device component support when prompted.
  • Install Xcode command line tools: Launch Xcode, select Xcode/Preferences from the menus, select the Downloads tab, and click Install next to Command Line Tools.
  • Install rvm (Ruby Version Manager): Launch the Terminal (click on the magnifying glass on the top right of your screen, type Terminal, and select the Terminal application). Type
    curl -L https://get.rvm.io | bash -s stable

    and hit the Return key. Respond to any prompts in the affirmative.

  • Close and re-open the Terminal window to load rvm: Close the current Terminal window by clicking the red button in the top left corner of the window. Open a new Terminal window by selecting the Shell menu and then “New Window.”
  • Install Homebrew: In the Terminal, type
    ruby -e "$(curl -fsSkL raw.github.com/mxcl/homebrew/go)"

    and hit Return. Respond to any prompts in the affirmative. Leave the Terminal window open.

    (Or, install MacPorts: Go to http://www.macports.org/install.php, select the installer for your OS X version, download it, and launch it. Respond to any prompts in the affirmative.)

  • Install dependencies: In the Terminal, type
    brew tap homebrew/dupes

    and hit Return. Then type

    brew install autoconf automake openssl libksba ffmpeg mp3splt mp3wrap apple-gcc42

    and hit Return. This can easily take 45 minutes since many basic Unix tools must be compiled. Afterward, leave the Terminal window open.

    (Or, for MacPorts: In the Terminal, type

    sudo port install gettext apple-gcc42 openssl curl-ca-bundle ffmpeg mp3splt mp3wrap libxslt libxml2 libiconv zlib

    and hit Return. You will eventually be prompted to install Java SE; respond in the affirmative. If you wait too long to respond to the Java SE prompt, you may need to re-type the line above. The whole process can easily take 1 hour 45 minutes since a whole universe of basic Unix tools must be compiled.)

  • Install and select Ruby: In the Terminal, type
    rvm install 1.9.3 --with-opt-dir=/usr/local/opt

    and hit Return. Then type

    rvm use 1.9.3 --default

    and hit Return. Leave the Terminal window open.

Mac OS X 10.6 Snow Leopard
  • Make sure your Mac is up to date: Select Apple menu/Software Update. Install any updates.
  • Install Xcode: You’ll need to do this from your original system DVD from Apple. Insert the OS X Install DVD into your Mac, open the DVD icon from your Desktop (if needed), open the Other Installs folder, double click on Xcode.mpkg. Follow the prompts to install Xcode with command line tools.
  • Update Xcode: Select Apple menu/Software Update. There should be an update to bring Xcode to version 3.2.6.
  • Install rvm (Ruby Version Manager): Launch the Terminal (click on the magnifying glass on the top right of your screen, type Terminal, and select the Terminal application). Type
    curl -L https://get.rvm.io | bash -s stable

    and hit the Return key. Respond to any prompts in the affirmative. Leave the Terminal window open.

  • Close and re-open the Terminal window to load rvm: Close the current Terminal window by clicking the red button in the top left corner of the window. Open a new Terminal window by selecting the Shell menu and then “New Window.”
  • Install Homebrew: In the Terminal, type
    ruby -e "$(curl -fsSkL raw.github.com/mxcl/homebrew/go)"

    and hit Return. Respond to any prompts in the affirmative. Leave the Terminal window open.

    (Or, install MacPorts: Go to http://www.macports.org/install.php, select the installer for your OS X version, download it, and launch it. Respond to any prompts in the affirmative.)

  • Install dependencies: In the Terminal, type
    brew install ffmpeg mp3splt mp3wrap

    and hit Return. Leave the Terminal window open.

    (Or, for MacPorts: In the Terminal, type

    sudo port install ffmpeg mp3splt mp3wrap

    and hit Return.)

  • Install and select Ruby: In the Terminal, type
    rvm install 1.9.3 --disable-binary

    and hit Return. Then type

    rvm use 1.9.3 --default

    and hit Return. Leave the Terminal window open.

Ubuntu Linux 12.10 Quantal Quetzal
and Ubuntu Linux 12.04 Precise Pangolin
and Ubuntu Linux 10.04 Lucid Lynx

Special notes for the desktop and server versions of these operating systems, and for 10.04 Lucid Lynx, are given in parentheses.

  • Update packages: From the terminal, type
    sudo apt-get update
    sudo apt-get upgrade
  • Install dependencies: From the terminal, type
    sudo apt-get install ffmpeg mp3splt mp3wrap curl libssl-dev libxml2-dev libxslt-dev libavcodec-extra-53 zlib1g-dev build-essential libyaml-dev libreadline-dev

    (For 10.04 Lucid Lynx, type

    sudo apt-get install ffmpeg mp3splt mp3wrap curl libssl-dev libxml2-dev libxslt-dev libavcodec-extra-52 zlib1g-dev build-essential libyaml-dev libreadline-dev

    )

  • Install rvm and Ruby: From the terminal, type
    curl -L https://get.rvm.io | bash -s stable --ruby
  • Configure the Terminal app for rvm (Ubuntu Desktop only): Select Edit/Profile preferences, select Title and Command, check “Run command as login shell.”
  • Update your current shell: From the terminal, type
    source ~/.rvm/scripts/rvm

    (Ubuntu Desktop users may skip this step and simply open a new tab within the Terminal app.)

  • Select Ruby: From the terminal, type
    rvm use 2.0 --default
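Whichever of the systems above you are on, it can be reassuring to confirm the prerequisites before moving on to the gem itself. The Ruby sketch below checks that the running Ruby meets the 1.9.2 minimum and looks for the three audio tools on your PATH. It is my own illustration, not something Typingpool ships; the helper names are made up, and the tool list is just the audio tools named in the install commands above.

```ruby
require 'rubygems' # provides Gem::Version; already loaded on modern Rubies

# True when the given version string is at least the minimum.
def ruby_new_enough?(version = RUBY_VERSION, minimum = '1.9.2')
  Gem::Version.new(version) >= Gem::Version.new(minimum)
end

# Returns the names that have no executable file in any of path_dirs.
def missing_tools(names, path_dirs = ENV.fetch('PATH', '').split(File::PATH_SEPARATOR))
  names.reject do |name|
    path_dirs.any? do |dir|
      candidate = File.join(dir, name)
      File.file?(candidate) && File.executable?(candidate)
    end
  end
end

puts "Ruby #{RUBY_VERSION} new enough? #{ruby_new_enough?}"
missing = missing_tools(%w[ffmpeg mp3splt mp3wrap])
puts missing.empty? ? 'All audio tools found' : "Missing: #{missing.join(', ')}"
```

Run it in the same Terminal window you used for the steps above, so that rvm’s Ruby is the one being checked.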

3. Install the Typingpool gem

From a terminal window (or the Terminal app on Mac), type

gem install typingpool

and hit Return.

4. Run tp-config

From a terminal window (or the Terminal application on Mac), type

tp-config

and hit Return. When prompted for your Amazon Access Key ID, paste the first string you copied into a file at the end of step 1. When prompted for your Amazon Secret Access Key, paste the second string you copied into a file at the end of step 1. At all other prompts, it is fine to just hit Return to accept the default.

Frequently Anticipated Questions

Why won’t it work on Windows?

Typingpool relies heavily on some Unix-specific audio tools, including ffmpeg. Mac OS X and Linux are both essentially Unix systems that can run these tools natively. Windows is not.

Why not just use a transcription service?

You should absolutely use a transcription service if you’re comfortable with the cost and turnaround time. There is even one that farms the work out to Mechanical Turk, making it somewhat more affordable.

Typingpool is a good alternative if you are looking for lower costs and/or faster turnaround. By chopping the file into chunks that can be transcribed in parallel, you can get a finished transcript faster. By dealing with end workers yourself, you can get a lower price.

You also end up with audio embedded every paragraph or so in your transcript, a nifty feature that makes it very easy to double check quotes. As far as I know, no transcription service offers that.

Isn’t the quality poor?

The quality is lower than with a professional transcription service, and that’s precisely the point. Typingpool is a “worse is better” solution that works well in certain cases. It is cheap and fast, but rough and provisional. It’s a great tradeoff if you want to skim and search the transcript and then easily double-check the text, thanks to the embedded audio every paragraph or so. This works well if you’re trying to remember a quote for a work of journalism.

Conversely, you shouldn’t use Typingpool to produce a transcript of your multi-billion-dollar corporation’s quarterly earnings call, or to create a legal document to submit to a court, at least not without double checking every single word against the audio.

Does Typingpool help exploit workers?

Like any tool, Typingpool could probably be used that way. Please don’t use it that way! Typingpool defaults to paying $0.75 a minute, and I often offer $1.00/minute, which produces transcripts very quickly, tends to attract better workers and is still roughly half the best rate I’ve seen for high-speed professional transcription. I have had success completing transcripts at lower rates, and Baio four years ago was able to find plenty of workers at $0.40/minute. But lower prices generally translate to slower transcription and lower quality.

Also, bear in mind that just because you pay Mechanical Turk workers a lower rate through Typingpool than you’d pay a service doesn’t mean the workers are actually getting paid less or are worse off. The money you pay to a professional transcription service with high accuracy will typically pay for multiple people to work on each section, and will also support the overhead of managers who delegate the work. With Typingpool you are doing the management work yourself, paying workers directly, and accepting lower accuracy. This helps explain why the professional transcription jobs I have come across on Mechanical Turk have always paid a fraction of the rate I offer directly to workers.

What sort of work should I reject?

Up to you. You should take a look at Amazon’s Mechanical Turk requester best practices guide. My own philosophy is that I generally only reject assignments when it looks like someone failed to put in much work. You’ll sometimes find people submit empty assignments, or nonsensical assignments. They are hoping you will fail to review the assignment before the (mandatory) auto-approval deadline and then they will get paid.

When someone submits very bad work, for example because they seem to have issues with idiomatic English, I will often approve the work and then block them so they can no longer submit future work. Typingpool is careful to always show you the worker ID associated with an assignment, but you must go onto the Mechanical Turk web interface to ban people.

Usage: Additional details

  • When you want to preview your assignments, run tp-assign with the --sandbox option and with --qualify 'rejection_rate < 100' (to make sure you qualify to view your own HITs). Then visit http://workersandbox.mturk.com and find your assignments (a search for “mp3” works if you left mp3 set as a keyword in your config file). When you are done previewing, run tp-finish with the name/path of your project and the --sandbox option.
  • When you assign your transcription jobs via tp-assign, you must supply a template name or relative path as the second argument. In the example above, the named template is “interview/nameless.”

    The template “interview/nameless” is a great general-purpose template. It instructs the transcriber not to worry about the names of the speakers, and instead to use labels like “male 1,” “male 2,” etc. This allows the transcriber to work quickly and usually results in a viable transcript, since you can consult your memory or the original audio to figure out who is who.

    To find what other templates are available, navigate to the directory where typingpool is installed (`gem which typingpool`) and then go into lib/typingpool/templates/assignment and its subdirectories. Anything that ends in ‘.html.erb’ is an available template. You may also create your own templates in the directory listed in the “templates” param of your config file.

    The templates interview, interview/phone, and interview/noisy require you to have passed the names of two voices to tp-make when you created your project. The first voice should be the name (and optional title) of the interviewer, and the second the name (and title) of the interviewee, like so:

    tp-make 'Chad Interview' chad1.WMA chad2.WMA --voice 'Ryan, hack reporter'
       --voice 'Chad, a software engineer' --unusual 'Hack Day,  
        Yahoo' --subtitle 'Phone interview re Yahoo Hack Day'
    
  • When you’ve rejected some submissions in tp-review and need to re-assign these chunks to be transcribed, simply re-run tp-assign with the name (or path) of your project. You may select the same template, reward, deadlines, etc., or pick new ones. tp-assign will be careful not to re-assign chunks for which you have approved a transcript, or which are pending on Mechanical Turk.
  • When some chunks previously assigned via tp-assign have expired without attracting submissions, simply re-run tp-assign as described above to re-assign these chunks. Consider increasing the dollar amount specified in your --reward argument.
  • When some chunks previously assigned via tp-assign have been submitted by workers but not approved or rejected in time for the approval deadline (assign/approval in your config file or --approval as passed to tp-assign), Mechanical Turk has automatically approved these submissions for you and you’ll need to run tp-collect to collect them. (Yes, it’s silly you need to run a whole different script instead of just calling tp-review as usual. I’ll fix this in a future version.)
  • When you want to cancel outstanding assignments, simply run tp-finish with the name of your project. If your assignments have already attracted submissions, you may be prompted to run tp-review first.
  • When tp-make, tp-assign, or tp-finish tells you it failed an upload, deletion, or Amazon command, simply re-run the script with the same arguments to re-attempt the upload, deletion or Amazon command. Typingpool carefully records which network operations it is attempting and which network operations have completed. It can robustly handle network errors, including uncaught exceptions.
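The template lookup described in the list above (any file ending in .html.erb under the templates directory is an available template) can be sketched in a few lines of Ruby. This is an illustration rather than Typingpool’s own lookup code: template_names is a made-up helper, and you supply the directory yourself, for example the lib/typingpool/templates/assignment folder inside the installed gem.

```ruby
require 'pathname'

# Sketch only: lists template names under a templates directory, where a
# template is any file ending in ".html.erb". template_names is a
# hypothetical helper, not part of Typingpool.
def template_names(root)
  suffix = '.html.erb'
  base = Pathname.new(root)
  Dir.glob(File.join(root, '**', "*#{suffix}")).map do |path|
    # Strip the directory prefix and the suffix, leaving e.g. "interview/nameless".
    Pathname.new(path).relative_path_from(base).to_s.chomp(suffix)
  end.sort
end

# Pass the templates directory on the command line to list its templates.
puts template_names(ARGV[0]) if ARGV[0]
```

The names it prints are in the same form as the second argument to tp-assign, such as interview/nameless.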

Maintenance

  • Cache: If the cache file grows too large, you’ll need to delete it manually. It may be safely deleted as long as no Typingpool scripts are running. Its location is specified in the ‘cache’ param in the config file. (The config file is at ~/.typingpool and the cache, by default, is at ~/.typingpool.cache.)

    Typingpool takes no steps to limit the size of the cache file. It prunes the cache of project-specific entries when you run tp-finish on a project, but the cache may grow large if you work on many active projects in parallel, or if you fail to run tp-finish on projects when you are done with them.

  • tp-finish: You should run tp-finish PROJECT each time you finish a project, where PROJECT may be either the project name or path. Assuming you have no submissions pending or awaiting approval, this clears all traces of the project from Amazon Mechanical Turk, from Amazon S3 or your SFTP server, and from the local cache. This will keep your local cache from ballooning in size and will minimize your S3 charges or SFTP disk usage. It will also help Typingpool scripts run faster by reducing the number of HITs you have on Amazon Mechanical Turk; many Typingpool operations involve iterating through all of your HITs.
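Since Typingpool does not limit the cache size itself, a small script can tell you when it is time to delete the file. The Ruby sketch below reads the ‘cache’ param from the config file and reports the size of the file it points to. It is only an illustration with made-up helper names; it assumes the config is a plain YAML mapping and falls back to the default cache location mentioned above when no ‘cache’ param is set.

```ruby
require 'yaml'

# Sketch only: reads the 'cache' param from a Typingpool-style YAML config
# and reports the cache file's size. Helper names are hypothetical.
DEFAULT_CACHE = '~/.typingpool.cache' # default location named in the text

def cache_path(config_file)
  config = File.exist?(config_file) ? YAML.load_file(config_file) : {}
  config = {} unless config.is_a?(Hash) # an empty file does not load as a Hash
  File.expand_path(config['cache'] || DEFAULT_CACHE)
end

# Size of the cache in bytes, or nil when there is no cache file yet.
def cache_size(config_file)
  path = cache_path(config_file)
  File.exist?(path) ? File.size(path) : nil
end

# Pass the config file path on the command line, e.g. ~/.typingpool
if ARGV[0]
  size = cache_size(ARGV[0])
  puts size ? "Cache is #{size} bytes" : 'No cache file found'
end
```

As noted above, only delete the cache file when no Typingpool scripts are running.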

Problems?

If you hit any issues, or think you’ve found a bug, please feel free to email me: ryantate@ryantate.com

Source

If you’re a Ruby user, you can obtain the Typingpool source code by typing on the command line:

gem install typingpool

The source code is also available on GitHub: https://github.com/ryantate/typingpool

Typingpool is distributed under the MIT license.

More

  • Run any script with the --help option for further details on how to run the script (e.g. `tp-make --help`).
  • See the docs for Typingpool::Config (`ri Typingpool::Config`) for details of the config file format.
  • See Amazon’s Mechanical Turk documentation for guides and overviews on how Mechanical Turk works.
  • See the documentation on ffmpeg (`man ffmpeg`) and related libraries for clues as to how to make Typingpool support additional file formats. Typingpool can work with any file format that ffmpeg can convert to mp3 (libmp3lame). You may need to install a lib via your package manager to enable this.
  • For an overview of the concepts on which Typingpool is built, see Andy Baio’s guide to using Mechanical Turk for transcription.

My favorite fall meal (so far)

Leg of Pork with Cider and Cream

I’ve made the below meal twice so far this fall and it’s fantastically autumnal. Also, fairly forgiving to prepare. I basically stumbled across the main course flipping through one of my go-to cookbooks after aimlessly picking up a handsome chunk of pork shoulder from the butcher. Words like “with Cider and Cream” tend to jump out at me.

  1. Shoulder of Pork with Cider and Cream
    American Cookery, James Beard.
    Online recipe (photo)

    Basting seems to be out of fashion right now, but it really works here; you’ll be able to taste the apple cider in the finished roast, and the juice that doesn’t stick to the meat or (blackened) to the pan will end up flavoring the cream gravy. The apple flavor complements the nutmeg/ginger/salt rub very well.

    Notes:

    • You’ll notice the recipe is technically for a “leg of pork;” Beard says later in the chapter to treat shoulder “in the same fashion as leg of pork.”
    • My copy of the cookbook (1972) calls for an internal temp of 165. You’ll notice the one on the website calls for an internal temp of 145. Between you and me, an internal temp of 130 when removing from the oven is probably ideal, assuming high-quality meat (if the meat is cheap/factory farmed, go to 145). You’ll get an extra 5-10 degrees in the center after resting.
    • The recipe calls for a ~10 pound roast, a whole shoulder or leg. I used a partial shoulder about half that weight each time. That ran about $40 at my fancy schmancy butcher, but you’ll obviously get a lot of mileage out of that much meat.
    • He does a thing where you flame the roast with applejack. I forgot to do this the first time and honestly I’m not sure it made any difference at all. The second time I forgot to remove my instant-read thermometers before flaming, so now they look like this. Anyway, don’t go out and buy a bottle of applejack for this recipe.
    • If you do a half recipe you’ll likely end up with some blackened apple cider on the bottom of your pan (there are fewer fat drippings to absorb heat and keep the cider from steaming and reducing). Don’t panic, everything is fine. The black bits will stick and stay out of the drippings you use for the gravy, and you can get them off with some Bon Ami or Comet after an overnight soak. (To minimize this, go heavier on the basting, and baste more frequently, early in the cooking, to cool off the bottom of the pan.)
    • Notice how the gravy involves pan juices, heavy cream, butter, and egg yolks? To pour on your fat-laden shoulder roast? Ha ha, delicious heart disease. Anyway, you can skip the whole last part of the gravy recipe, where you stir in the yolk(s) and remaining cream. I did this by accident the first time and frankly I thought the gravy was better. It’s, uh, just a little heavy with the yolks and extra cream in there.
    • If you do a half roast (5 lbs), don’t forget to cut the gravy recipe in half too! 



  2. Buttermilk mashed potatoes
    The Zuni Cafe Cookbook, Judy Rodgers.
    Online recipe 

    This is one of the top two or three standout recipes in this fantastic cookbook, along with the famous Zuni Roast Chicken with Bread Salad, the sublime rosemary roast potatoes, and the wonderful polenta, hanger steak, short ribs, oxtail, and brasato recipes (among others!) You should buy this cookbook! “The Practice of Salting Early” section alone is worth the cover price. If you are a meat eater, it will change your life. Or should, at least.

    Notes:

    • The online recipe lists milk and cream/half-and-half. Do milk or cream or half-and-half, 2-3 tablespoons total.
    • Rodgers recommends that the milk/cream/half-and-half (but not the buttermilk) be hot when mixing in. I achieve this by putting it in a pan set on the stovetop (not a burner) as the roast cooks in the oven. You’ll want to add an extra tablespoon or so if you do it this way in case it reduces.
    • Rodgers recommends the butter be just melted. You can microwave or do as with the milk in the bullet above.
    • Rodgers says to serve immediately “or keep warm, covered, in a double boiler, for up to 30 minutes.” I think it’s actually superior after 10-30 minutes in the double boiler because it will be warmer than if you serve it straightaway. Also, it greatly reduces the stress of timing the different courses since now you have a broader window for serving. If you do the double boiler, it’s easiest to reserve the water you boiled the potatoes in, since it will already be hot.
    • Don’t slack on the whipping, especially after the last addition (butter). Get the potatoes nice and fluffy! (The potatoes in the image above are underwhipped.) And make sure there is enough salt in there.
    • Did you buy the cookbook yet? Go buy it! The roast chicken and bread salad are just staggering.



  3. Fennel Baked in Stock and Tomato Sauce
    Adapted from How To Cook Everything, Mark Bittman.
    Online recipe

    The strong anise flavor of the fennel stands up well to all the fatty flavor in the other dishes, as does the tomato sauce. 

    The original Bittman recipe is just “baked in stock,” but I was short on stock, so I replaced around half the required liquid with juice from my whole canned tomatoes (think San Marzano or Muir Glen). The rest was stock and/or water with wine and/or dry vermouth. The recipe is typical of Bittman: simple, easy, absolutely delicious (buy his cookbook too, if you haven’t).

    Notes:

    • Replace half the stock with tomato sauce, such as the juice from a can of San Marzano whole tomatoes.
    • Pretty sure I skipped the Parmesan. You’re, uh, probably getting enough dairy in the other two dishes.

Eat with a hearty, casual red table wine (Côtes du Rhône, Sangiovese, Syrah, etc.) or a nice beer. Happy autumn!

It’s a brutal blend of fascism, corporatism, capitalism, and Stalinism brought together in a very special place. And we made that happen. We unleashed our corporations. We exported our jobs, and we chose not to export our values. We wanted it to be that way; if we did not want it to be that way we would do something. We would at least know. But we do not. Our silence is our consent.

Mike Daisey, Jan. 20, 2011; Berkeley, California.

Eventually, the ebook versions of my review may cost more than the operating system.

Reblogged from Siracusa Said So

Preeminent Mac OS X reviewer John Siracusa on the curious relationship between technologies and stories about those technologies, Hypercritical podcast, episode 78, 18th minute (via siracusasaidso). I wonder how much of Apple’s market cap is attributable to the storytelling abilities of Steve Jobs.

"When the woman saw herself represented visually on the wall behind her usual puesto the morning after Dobler struck, she began attempting to wash it off." -Intersections

We need to focus on humans, on how humans care about doing programming or operating the application of the machines. We are the masters. They are the slaves… For the time being anyway, until the age of Terminator.

Yukihiro Matsumoto, inventor of the Ruby programming language, enemy of robot collaborators.

Reblogged from Things I Ate That I Love

emilygould:

The Saddest Shelf In The Library

Fuck that! “Philip and Alex’s Guide To Web Publishing” changed my life. You can read it here, though it’s been heavily revised since the original (per Philip Greenspun’s very practical philosophy of what a book should be), so anyone with a library this cool should check out a copy (and then somehow transport yourself to 1998, if at all possible, for context).

In all seriousness, Greenspun set a bar and a vision for long-form web writing that has been sadly marginalized. There’s something very touching, 13 years on, about the “Philip and Alex’s” chapters in which he argues for the web as an accessible form of education. This is a book that can remind those of us writing online what the hell we’re working toward.