http://www.extrahop.com/post/blog/programming-by-voice-staying-productive-without-harming-yourself
One of the reasons I love working at ExtraHop is the lack
of meetings and abundance of uninterrupted development time. However, I
quickly found after starting that I was unaccustomed to coding for such
long periods. A few weeks after I started at ExtraHop, I began to
develop discomfort in my wrists and forearms. I have had intermittent
trouble with this in the past, but limiting my computer usage at home in
the evenings had always been enough to resolve it. This time,
however, was different.
As a very recent college graduate, I was concerned that my
daily work activities could be causing permanent injury. I started
looking into ergonomic keyboards and mice, hoping to find a cure-all
solution. As you might have guessed, I did not find a magical solution,
and my situation worsened with each passing week.
While the discomfort was frustrating, I was much more
concerned that the injury was preventing me from being able to quickly
and easily create and communicate at work and at home.
An Introduction to a Solution
After trying and abandoning several other solutions, a coworker of mine at ExtraHop showed me a PyCon talk by Tavis Rudd, a developer who programs
by using his voice. At first, I was skeptical that this solution would
be reliable and productive. However, after watching the video, I was
convinced that voice input was a compelling option for programmers. Rudd
suffered from a similar injury, and he had gone through all of the same
investigations that I had, finally determining that a fancy keyboard
wasn’t enough to fix it.
That night, I scoured the Internet for people who
programmed by voice, looking for tips or tutorials. They were few and
far between, and many people claimed that it was impossible. Not easily
deterred, I started to piece together a toolkit that would allow me to
program by voice on a Linux machine.
Configuration: The Hard Part
It was immediately clear that Dragon NaturallySpeaking was
the only option for dictation software. It was miles ahead of
other products in voice recognition, but it only ran on Windows or Mac.
Unfortunately, I never managed to run Dragon NaturallySpeaking under
Wine, so I settled for running it in a Windows VM and proxying the
commands to the Linux host.
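To make the proxying idea concrete, here is a minimal sketch of what the Linux side of such a bridge could look like. It is not the code from my repository: the JSON message format, the port number, and the function names are all assumptions made up for illustration, and it assumes `xdotool` is installed on the host to replay keystrokes.

```python
import json
import socketserver
import subprocess


def build_xdotool_args(message):
    """Translate one decoded message into an xdotool argument list.

    The message shapes below are hypothetical, invented for this sketch.
    """
    if not isinstance(message, dict):
        raise ValueError("message must be a JSON object")
    kind = message.get("type")
    if kind == "key":
        # e.g. {"type": "key", "keys": "ctrl+s"}
        return ["xdotool", "key", message["keys"]]
    if kind == "text":
        # e.g. {"type": "text", "text": "hello world"}
        return ["xdotool", "type", message["text"]]
    raise ValueError("unknown message type: %r" % (kind,))


class CommandHandler(socketserver.StreamRequestHandler):
    """Reads one JSON message per line and replays it on the Linux desktop."""

    def handle(self):
        for line in self.rfile:
            line = line.strip()
            if not line:
                continue
            try:
                args = build_xdotool_args(json.loads(line.decode("utf-8")))
            except (ValueError, KeyError):
                continue  # this sketch silently skips malformed messages
            subprocess.call(args)


def serve(host="127.0.0.1", port=8240):
    """Listen for messages from the VM (host and port are assumptions).

    Pass the address of the VM's host-only network interface as `host`
    so the server is reachable from the VM but not from the wider network.
    """
    with socketserver.TCPServer((host, port), CommandHandler) as server:
        server.serve_forever()
```

The Windows side would then only need to open a TCP connection and write one JSON line per recognized command. Keeping the translation step in its own function means the mapping can be checked without a running X session.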
In this post, I will leave out some of the configuration steps that I
went through. You can find detailed instructions on how to
get everything up and running on my GitHub repo.
If you are following along with the instructions, you
should now be able to send dictation and the example command to your
Linux host, but that will not get you very far with programming. I ended
up spending most of the next two weeks writing grammars. The majority
of the process was:
- Attempt to perform a task (programming, switching windows, etc.).
- Write a command that would let me do this by voice.
- Test that command and add related commands.
- Repeat.
The process was slow going, but I am hopeful that the
repository I linked will help you avoid starting from scratch. Even
after using this for about a month, I am still tweaking my commands a
couple times a day. Tavis Rudd claims to have over 2000 custom commands,
which means that I must still have a long way to go.
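If you have never seen a command grammar, it may help to think of it as a table from spoken phrases to keystrokes, plus a rule for matching phrases inside an utterance. The sketch below is plain Python and only illustrates that idea. It does not use the API of Dragon or of any grammar library, and the phrases and key names in it are made-up examples, not the commands from my repository.

```python
# Hypothetical phrase table: spoken phrase -> keystroke to send.
COMMANDS = {
    "save file": "ctrl+s",
    "next window": "alt+Tab",
    "next workspace": "ctrl+alt+Right",
    "slap": "Return",
}


def parse_utterance(utterance, commands=COMMANDS):
    """Split recognized words into ("key", ...) and ("text", ...) actions.

    At each position the longest matching phrase wins; words that match
    no phrase fall through as plain dictated text.
    """
    words = utterance.lower().split()
    max_len = max((len(phrase.split()) for phrase in commands), default=0)
    actions = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in commands:
                actions.append(("key", commands[phrase]))
                i += n
                break
        else:
            # No phrase starts here, so treat the word as dictation.
            actions.append(("text", words[i]))
            i += 1
    return actions
```

For example, `parse_utterance("hello save file slap")` yields one text action for "hello" followed by two key actions. Each pass through the attempt, write, and test loop described above amounts to adding rows to a table like this and then checking whether the recognizer reliably hears the phrase you picked.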
The Results
As Rudd explained in his talk, the microphone is a
critical link in this setup. A good microphone that hears only you will
make a big difference in both accuracy and speed of recognition. I
really like the Yeti from Blue that I am using, but I can generally only use it if the office is mostly quiet.
With the commands I have created so far, I can switch between windows, navigate the web (with the help of Vimium),
switch between workspaces, and, most importantly, I can program in
Python and Go with decent speed. It is not quite as fast as programming
with a keyboard, but it is surprisingly efficient once you learn the
commands.
The grammars I have shared in the GitHub repository mentioned above
are specific to what I need in my workflow. I recommend that you use
them as a starting point, while keeping in mind that the computer may
recognize words differently for you than it does for me. These grammars
are also specific to the languages I use most often. Please don’t
hesitate to write grammars for your own favorite languages. And finally, look
for my .vimrc file in my dotfiles repository to find the custom shortcuts that the voice commands trigger.
Coding by voice is not perfect, but it has reached a point where it
is a practical option. Don’t continue suffering through wrist and arm
discomfort when there is an alternative. Feel free to send me a pull
request and we can continue making voice programming better for
everyone.