I won't say that Tuesday was the best day in my life, but I did have a very amusing experience.  I interviewed the new middle school principal on Friday and needed to transcribe our 40-minute interview.  Because of an arm injury, typing has been painful for me lately.  I learned of a way for my computer to transcribe it from my sound file to text.  I wouldn't have to type.  This could be a godsend.  I was excited!

The Mac has dictation built into its operating system.  An open-source piece of software called Soundflower can listen to your sound file and send the sound directly to the computer's dictation function.  So your sound file replaces your computer's microphone and voila! a transcript!  No typing need be done, except for minor edits.

Well, not exactly...

Science fiction has given us unrealistic expectations of voice recognition.  Star Trek, Star Wars, 2001: A Space Odyssey... they all had computers or robots that could speak perfect English and understand anyone who spoke to them, no matter what their accent and, in some cases, language.  Technology has made great strides, but maybe not as great as we expect.

Now to be fair, I have used the Mac's built-in dictation before, and when I speak slowly and distinctly it does seem to get most of it right.  But when I played it the audio file of my interview, it freaked out.  The first words I saw were "Rosetta locking vibrators are grabbing".  That certainly got my attention, because I would never talk to a school principal about vibrators, and I had no idea you could lock them, let alone use them for translating something.

It's like those automated phone system conversations that are so perky and friendly:

Phone: If you want to speak to an operator say 'Operator'

Me: Operator

Phone: I did not understand what you said.  Please try again. If you want to speak to an operator say 'Operator'

Me: Operator

Phone: I did not understand what you said.  Please try again. If you want to speak to an operator say 'Operator'

Me: Operator, you stupid twit.  Operator.  Operator!  Human Being!  Anyone!

Phone: I did not understand what you said.  Please try again...

So I set up the computer to transcribe the interview and went away for 45 minutes to do something else.  When I came back, here is a sample of what I found:

Unedited: Actual Words Spoken vs. What The Computer Thought It Heard

Actual Words Spoken:

Born and raised in Groton.

I grew up in the Summer Hill area, which is just outside of Locke and right next door to Groton.  I attended Groton Schools, K-12, class and 95, and never left.

I left for college.  I attended SUNY Cortland, got my undergrad and masters in my CAS at SUNY Cortland.

What The Computer Thought It Heard:

Rosetta locking vibrators are grabbing by always been a grad schools Katie did you fall river driving class and 95 man never left night night wife I love her college you attended SUNY Cortland got my undergrad and masters in my CAS;

Actual Words Spoken:

Your family... are you married?

Yes I'm married to my wife Beth. She's also a true blue Grotonite, born and raised.  She attended Cortland as well.  We left so to speak but she's a 2005 graduate of SUNY Cortland, and most recently Syracuse University as a school counselor.

We do not have children yet. We just celebrated our second anniversary this past July 26.

What The Computer Thought It Heard:

and your family get married yes I'm married to my wife that she's also a true blue gratinee born and raised attended Cortland as well so get with we left Sorsby but she's a 2005 graduate of St. Portland for some sushi sugar city as a school counselor: do not get no we do not Chosea and we just already our second anniversary of this past July 26

I tried editing the transcription, but finally gave up and transcribed it manually the old-fashioned way, by listening and typing.  It was painful, but it was worth it.

Can you imagine if Hollywood wrote technology as it really is?

Mr. Spock: Computer, calculate the amount of power we will need to reach the Rigelian System.

Computer: Rosetta locking vibrators are grabbing.

Spock: That is not logical.

Computer: Katie did you fall river driving class and 95.

Spock: Are you saying we must travel 95 light-years?  By car?

Computer: There's no burgers just reminds it's time for somebody else last time for new Fressia.

I have long held that cell phones are a technology behind their time.  We used to have reliable telephones that always had a clear, audible connection.  Now we can barely understand what people are saying, what with bad connections and noise and dead transmission spots.

Transcription software is the same.  You can get it to work under certain not-real-life circumstances, but if you want to have a conversation with a device that claims to have voice recognition, well... all I can say is watch out for your Rosetta locking vibrators... they may be grabbing.

And for those readers who are professional transcribers... don't worry.  Your job isn't going away any time soon!
