Bring the power of the Linux command line into your
application development process.
As a novice software developer, the one thing I look for when choosing
a programming language is this: is there a library that allows me to interface
with the system to accomplish a task? If Python didn't have Flask, I
might choose a different language to write a web application. For this
same reason, I've begun to develop many, admittedly small, applications with
Bash. Although Python, for example, has many modules to import and extend
functionality, Bash can call on thousands of commands that perform a variety of
tasks, including string manipulation, mathematical computation, encryption
and database operations. In this article, I take a look at these features and
how to use them easily within a Bash application.
Reusable Code Snippets
Bash provides three features that I've found particularly useful when
creating reusable code: aliases, functions and command substitution. An alias
is a command-line shortcut for a long command. Here's an example:
alias getloadavg='cat /proc/loadavg'
The alias for this example is getloadavg. Once defined, it can be
executed like any other Linux command. In this instance, the alias will dump
the contents of the /proc/loadavg file. Something to keep in mind is that this
is a static command alias. No matter how many times it is executed, it
always will dump the contents of the same file. If there is a need to vary the
way a command is executed (by passing arguments, for instance), you can
create a function. A function in Bash works the same way as a function
in any other language: arguments are evaluated, and commands within the
function are executed. Here's an example function:
getfilecontent() {
    # Quote "$1" so file names containing spaces are handled correctly
    if [ -f "$1" ]; then
        cat "$1"
    else
        echo "usage: getfilecontent "
    fi
}
This function declaration defines the function name as getfilecontent. The
if/else statement checks whether the file specified as the first function
argument ($1) exists and is a regular file. If it does, the contents of the
file are output. If not, usage text is displayed. Because of the incorporation
of the argument, the output of this function will vary based on the argument
provided.
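To illustrate how arguments change a function's behavior, here is a minimal sketch of a variant that accepts any number of files. The name showfiles, the error messages and the exit codes are my own assumptions for illustration, not part of the original function:

```shell
# Hypothetical variant of getfilecontent: accepts any number of files and
# reports problems through stderr and the exit status.
showfiles() {
    if [ "$#" -eq 0 ]; then
        echo "usage: showfiles FILE..." >&2
        return 2
    fi
    local status=0 file
    for file in "$@"; do
        if [ -f "$file" ]; then
            cat -- "$file"
        else
            echo "showfiles: $file: not a regular file" >&2
            status=1
        fi
    done
    return "$status"
}
```

Returning a non-zero status lets a caller test the result with if or &&, which a plain usage message cannot do.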
The final feature I want to cover is command substitution. This is
a mechanism for capturing the output of a command so it can be reused. Because
this feature is so versatile, it appears in several of the examples later in
this article. This first one assigns the output to a variable:
LOADAVG="$(cat /proc/loadavg)"
The syntax for command substitution is $(command), where "command" is the
command to be executed. In this example, the LOADAVG variable will have the
contents of the /proc/loadavg file stored in it. At this point, the
variable can be evaluated, manipulated or simply echoed to the console.
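As a rough sketch of what "manipulated" can mean here, the captured text can be split into separate variables, and a substitution can also be embedded directly in a string. The sample value and the variable names one, five, fifteen and rest are assumptions made for illustration:

```shell
# Sample text in the /proc/loadavg format (made-up values); on Linux this
# could instead be captured with LOADAVG="$(cat /proc/loadavg)".
LOADAVG="0.42 0.36 0.30 1/123 4567"

# Split the captured line into named variables; "rest" absorbs extra fields.
read -r one five fifteen rest <<EOF
$LOADAVG
EOF
echo "1-minute load: $one"
echo "15-minute load: $fifteen"

# Command substitution also can be embedded directly inside a string.
echo "Report generated on $(date +%Y-%m-%d) for host $(uname -n)"
```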
Text Manipulation
If there is one feature that sets scripting on UNIX apart from other
environments, it is the robust ability to process text. Although
many text processing mechanisms are available when scripting in Linux, here
I'm looking at grep, awk, sed and variable-based operations. The grep
command allows for searching through text whether in a file or piped from
another command. Here's a grep example:
alias searchdate='grep "[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"'
The alias created here will search through data for a date in the YYYY-MM-DD
format. As with the grep command itself, text can be provided either as piped
data or as a file path following the command. As the example shows, search
syntax for the grep command includes the use of regular expressions (or
regex).
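The sketch below shows both ways of supplying text. It is written as a function rather than an alias, because Bash does not expand aliases in non-interactive scripts by default. The function body, the -E interval syntax and the sample text are my own choices for illustration:

```shell
# Hypothetical function form of the searchdate alias. The pattern is passed
# with -e so that extra options or file names can follow it.
searchdate() {
    grep -E -e '[0-9]{4}-[0-9]{2}-[0-9]{2}' "$@"
}

# Piped input: only the line containing a date is printed.
printf 'no date here\nbackup finished 2023-04-01\n' | searchdate

# -o prints just the matching text instead of the whole line.
printf 'released 2023-04-01 and patched 2023-05-15\n' | searchdate -o
```

A file path works the same way, for example searchdate /var/log/syslog, assuming such a file exists on the system.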
When processing lines of text for the purpose of pulling out
delimited fields, awk is the easiest tool for the job. You can use awk to
create verbose output of the /proc/loadavg file:
awk '{ printf("1-minute: %s\n5-minute: %s\n15-minute: %s\n",$1,$2,$3); }' /proc/loadavg
For the purpose of this example, let's examine the structure of the
/proc/loadavg file. It is a single-line file, and there are typically five
space-delimited fields, although this example uses only the first three
fields. Much like Bash function arguments, fields in awk are referenced as
variables named by their position in the line ($1 is the first field and so
on). In this example, the first three fields are referenced as
arguments to the printf statement. The printf statement will display three
lines, and each line will contain a description of the data and the data
itself. Note that each %s is substituted with the corresponding parameter
to the printf function.
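Fields do not have to be separated by spaces. As a small sketch, awk's -F option sets a different delimiter. The function name userlist and the sample records (in the style of /etc/passwd) are made up for illustration:

```shell
# Sample colon-delimited records in the style of /etc/passwd (made-up data).
records='alice:x:1001:1001:Alice:/home/alice:/bin/bash
bob:x:1002:1002:Bob:/home/bob:/bin/sh'

# Hypothetical helper: -F sets the field separator, so $1 is the login name
# and $7 is the login shell.
userlist() {
    awk -F: '{ printf("%s uses %s\n", $1, $7) }' "$@"
}

printf '%s\n' "$records" | userlist
```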
Among all of the commands available for text processing on Linux, sed may
be considered the Swiss army knife. Like grep, sed uses regex. The specific
operation I'm looking at here involves regex substitution. For an accurate
comparison, let's re-create the previous awk example using sed:
sed 's/^\([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\).*$/1-minute: \1\n5-minute: \2\n15-minute: \3/g' /proc/loadavg
Since this is a long example, I'm going to separate this into smaller parts. As
I mentioned, this example uses regex substitution, which follows this
syntax: s/search/replace/g. The "s" begins the definition of the
substitution statement. The "search" value defines the text pattern you want
to search for, and the "replace" value defines what you want to replace the
search value with. The "g" at the end is a flag that denotes global
substitution, meaning every match on a line is replaced rather than only the
first, and is one of many flags available with the substitute statement. The
search pattern in this example is:
^\([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\).*$
The caret (^) at the beginning of the string denotes the beginning of a line of
text being processed, and the dollar sign ($) at the end of the string denotes
the end of a line of text. Four things are being searched for within
this example. The first three items are:
\([0-9]\+\.[0-9]\+\)
This entire string is enclosed with escaped parentheses, which makes the
value within available for use in the replace value. Just like the grep
example, the [0-9] will match a single numeric character. When followed by
an escaped plus sign, it will match one or more numeric characters. The
escaped period will match a single period. When you put this whole
expression together, you get a pattern for a decimal number.
The fourth item in the search value is simply a period followed by an
asterisk. The period will match any character, and the asterisk will match
zero or more of whatever preceded it. The replace value of the example is:
1-minute: \1\n5-minute: \2\n15-minute: \3
This is largely composed of plain text; however, it contains four unique
special items. There are newline characters, which are represented by a
backslash followed by "n" (\n). The other three items are backslashes followed
by a number. This number corresponds to the patterns in the search value
surrounded by parentheses. Backslash-1 (\1) is the first pattern in
parentheses, backslash-2 (\2) is the second and so on. The output of this sed
command will be exactly the same as the awk command from earlier.
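The escaped plus sign and the \n in the replace value are GNU sed extensions. As a hedged alternative, the sketch below uses the -E option (extended regex, accepted by both GNU and BSD sed) and a backslash followed by a literal newline in the replace value. The function name loadavg_report and the sample input are assumptions:

```shell
# Hypothetical helper: reads loadavg-style text on stdin. With -E, the
# parentheses and plus signs no longer need to be escaped. The continuation
# lines start in the first column so no extra spaces end up in the output.
loadavg_report() {
    sed -E 's/^([0-9]+\.[0-9]+) ([0-9]+\.[0-9]+) ([0-9]+\.[0-9]+).*$/1-minute: \1\
5-minute: \2\
15-minute: \3/'
}

# Made-up sample values in the /proc/loadavg format.
echo "0.42 0.36 0.30 1/123 4567" | loadavg_report
```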
The final mechanism for string
manipulation that I want to discuss involves using Bash variables to
manipulate strings. Although this is much less powerful than traditional
regex, it provides a number of ways to manipulate text. Here are a few
examples using Bash variables:
MYTEXT="my example string"
echo "String Length: ${#MYTEXT}"
echo "First 5 Characters: ${MYTEXT:0:5}"
echo "Remove \"example\": ${MYTEXT/ example/}"
The variable named MYTEXT is the sample string this example works with. The
first echo command shows how to determine the length of a string
variable. The second echo command will return the first five characters of
the string. This substring syntax involves the beginning character index
(in this case, zero) and the length of the substring (in this case, five).
The third echo command removes the word "example" along with a leading
space.
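A few more expansions of the same kind are sketched below. They trim text from either end of a value, replace every occurrence of a pattern and supply a default. The path and the variable names FILEPATH and UNSET_VARIABLE are made up for illustration:

```shell
FILEPATH="/var/log/app/server.log"

echo "File name: ${FILEPATH##*/}"        # strip the longest prefix matching */
echo "Directory: ${FILEPATH%/*}"         # strip the shortest suffix matching /*
echo "Extension: ${FILEPATH##*.}"        # keep the text after the last period
echo "Replace all: ${FILEPATH//\//_}"    # replace every / with _
echo "Default: ${UNSET_VARIABLE:-none}"  # fall back when unset or empty
```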
Mathematical Computation
Although text processing might be what makes Bash scripting great, the need to
do mathematics still exists. Basic math problems can be evaluated using
bc, awk or Bash arithmetic expansion. The bc command has the
ability to evaluate math problems via an interactive console interface and
piped input. For the purpose of this article, let's look at
evaluating piped data. Consider the following:
pow() {
    # Both the base and the exponent are required
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: pow "
    else
        echo "$1^$2" | bc
    fi
}
This example shows creating an implementation of the pow function from
C++. The function requires two arguments. The result of the function will
be the first number raised to the power of the second number. The math
statement of "$1^$2" is piped into the bc command for calculation.
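One detail worth knowing is that bc truncates division results to whole numbers unless its scale variable is set. The following sketch assumes a hypothetical helper named divide with an optional third argument for the number of decimal places:

```shell
# Hypothetical helper: divide two numbers with a chosen number of decimals.
divide() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: divide DIVIDEND DIVISOR [SCALE]" >&2
        return 2
    fi
    # scale sets how many digits bc keeps after the decimal point
    echo "scale=${3:-2}; $1 / $2" | bc
}

divide 10 4    # prints 2.50
divide 7 2 0   # prints 3 (scale 0 truncates to an integer)
```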
Although awk does provide the ability to do basic math calculation, the
ability for awk to iterate through lines of text makes it especially useful
for creating summary data. For instance, if you want to calculate the total
size of all files within a folder, you might use something like this:
foldersize() {
    # Quote "$1" so folder names containing spaces are handled correctly
    if [ -d "$1" ]; then
        ls -alRF "$1/" | grep '^-' | awk 'BEGIN {tot=0} { tot=tot+$5 } END { print tot }'
    else
        echo "$1: folder does not exist"
    fi
}
This function will do a recursive long-listing for all entries underneath
the folder supplied as an argument. It then will search for all lines
beginning with a dash (this will select all regular files). The final step is
to use awk to iterate through the output and calculate the combined size of
all files.
Here is how the awk statement breaks down. Before processing
of the piped data begins, the BEGIN block sets a variable named tot to zero.
Then for each line, the next block is executed. This block will add to tot the
value of the fifth field in each line, which is the file size. Finally,
after the piped data has been processed, the END block then will print the
value of tot.
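The same BEGIN/per-line/END structure can produce more than a single total. As a sketch, the hypothetical sizesummary function below reads "size name" pairs (an input format I am assuming for illustration) and reports a count, a total and the largest entry:

```shell
# Hypothetical helper: reads "size name" pairs on stdin and prints a summary.
sizesummary() {
    awk '
        { total += $1; count++ }                  # runs for every line
        $1 > max { max = $1; biggest = $2 }       # runs only when a new maximum is seen
        END {
            if (count > 0)
                printf("files: %d\ntotal: %d\nlargest: %s (%d)\n", count, total, biggest, max)
        }'
}

# Made-up sample data.
printf '120 a.txt\n4096 b.bin\n300 c.log\n' | sizesummary
```

Unset awk variables start out as zero, so the BEGIN block can be omitted when a counter only needs that default.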
The other way to perform basic math is through arithmetic
expansion. Its syntax looks similar to that of command substitution.
Let's rewrite the earlier pow function using arithmetic expansion:
pow() {
    # Both the base and the exponent are required
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: pow "
    else
        echo "$(($1**$2))"
    fi
}
The syntax for arithmetic expansion is $((expression)), where expression is a
mathematical expression. Bash also accepts the older $[expression] form, but
that form is deprecated. Notice that instead of using the caret
operator for exponents, this example uses a double-asterisk. Although there are
differences and limitations to this method of calculation (it works with
integers only, for example), the syntax can be more intuitive than piping data
to the bc command.
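The integer-only limitation is easy to see with a short sketch. The values and the variable name count are arbitrary:

```shell
# Arithmetic expansion works on integers only.
echo "7 / 2 = $((7 / 2))"       # division truncates: prints 3
echo "7 % 2 = $((7 % 2))"       # remainder: prints 1
echo "2 ** 10 = $((2 ** 10))"   # exponentiation: prints 1024

# Variables can be referenced without a leading $ inside the expression.
count=5
count=$((count + 1))
echo "count = $count"           # prints 6
```

When fractional results matter, piping the expression to bc remains the better fit.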
Cryptography
The ability to perform cryptographic operations on data may be necessary
depending on the needs of an application. If a string needs to be hashed,
a file needs to be encrypted, or data needs to be base64-encoded, this
all can be accomplished using the openssl command. Although openssl provides a
large set of ciphers, hashing algorithms and other functions, I cover only
a few here.
The first example shows encrypting a file using the blowfish cipher:
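bf-enc() {
    if [ -f "$1" ] && [ -n "$2" ]; then
        cat "$1" | openssl enc -blowfish -pass "pass:$2" > "$1.enc"
    else
        echo "usage: bf-enc "
    fi
}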
This function requires two arguments: a file to encrypt and the password to
use to encrypt it. After running, this script produces a file named the same
as your original but with ".enc" appended to the name.
Once you have the data encrypted, you need a function to decrypt it. Here's
the decryption function:
bf-dec() {
    if [ -f "$1" ] && [ -n "$2" ]; then
        cat "$1" | openssl enc -d -blowfish -pass "pass:$2" > "${1%%.enc}"
    else
        echo "usage: bf-dec "
    fi
}
The syntax for the decryption function is almost identical to the
encryption function with the addition of "-d" to decrypt the piped data and
the syntax to remove ".enc" from the end of the decrypted filename.
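Here is a rough round-trip sketch of the same encrypt/decrypt pattern. Note that it swaps in AES-256-CBC with the -pbkdf2 option in place of Blowfish, on the assumption that the installed openssl is recent enough to support -pbkdf2. The function names aes_enc and aes_dec and the file names are hypothetical:

```shell
# Hypothetical helpers: same idea as the Blowfish functions, but using
# AES-256-CBC and reading/writing files with -in and -out instead of pipes.
aes_enc() {
    openssl enc -aes-256-cbc -pbkdf2 -salt -pass "pass:$2" -in "$1" -out "$1.enc"
}
aes_dec() {
    openssl enc -d -aes-256-cbc -pbkdf2 -pass "pass:$2" -in "$1" -out "${1%.enc}"
}

# Round trip on a temporary file: encrypt, delete the original, decrypt.
workdir=$(mktemp -d)
printf 'secret data\n' > "$workdir/notes.txt"
if aes_enc "$workdir/notes.txt" "example-password"; then
    rm "$workdir/notes.txt"
    aes_dec "$workdir/notes.txt.enc" "example-password" && cat "$workdir/notes.txt"
fi
rm -rf "$workdir"
```

Keep in mind that a password passed on the command line can be visible to other users in the process list while the command runs.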
Another piece of functionality provided by openssl is the ability to create
hashes. Although files may be hashed using openssl, I'm going to focus on
hashing strings here. Let's make a function to create an MD5 hash of a string:
md5hash() {
    if [ -z "$1" ]; then
        echo "usage: md5hash "
    else
        # printf avoids hashing the trailing newline that echo would add
        printf '%s' "$1" | openssl dgst -md5 | sed 's/^.*= //g'
    fi
}
This function will take the string argument provided to the function and
generate an MD5 hash of that string. The sed statement at the end of the
command will strip off text that openssl puts at the beginning of the
command output, so that the only text returned by the function is the hash
itself.
The way that you would validate a hash (as opposed to decrypting
it) is to create a new hash and compare it to the old hash. If the hashes
match, the original strings will match.
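That validation step can be sketched as follows. The helper names strhash and verifyhash, the choice of SHA-256 and the sample strings are assumptions made for illustration:

```shell
# Hypothetical helper: hash a string with the digest named in $1 (for
# example md5 or sha256). printf is used so no trailing newline is hashed.
strhash() {
    printf '%s' "$2" | openssl dgst "-$1" | sed 's/^.*= //'
}

# Hypothetical helper: succeeds when hashing $2 again reproduces hash $3.
verifyhash() {
    [ "$(strhash "$1" "$2")" = "$3" ]
}

stored=$(strhash sha256 "correct horse")
if verifyhash sha256 "correct horse" "$stored"; then
    echo "hashes match"
fi
if ! verifyhash sha256 "wrong guess" "$stored"; then
    echo "hashes differ"
fi
```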
I also want to discuss the
ability to create a base64-encoded string of data. One particular
application that I have found this useful for is creating an HTTP basic
authentication header string (this contains username:password). Here is a
function that accomplishes this:
basicauth() {
    if [ -z "$1" ]; then
        echo "usage: basicauth "
    else
        # printf keeps a trailing newline out of the encoded credentials
        printf '%s' "$1:$(read -r -s -p "Enter password: " pass ; printf '%s' "$pass")" | openssl enc -base64
    fi
}
This function will take the user name provided as the first function
argument and the password provided by user input through command
substitution and use openssl to base64-encode the string. This string
then can be added to an HTTP authorization header field.
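To show where the encoded string ends up, here is a minimal non-interactive sketch that prints a complete header line. The function name authheader and the sample credentials are hypothetical. The -A option asks openssl to emit the base64 text on a single line:

```shell
# Hypothetical helper: takes the user name and password as arguments and
# prints a complete Authorization header. Passing a password as an argument
# is for demonstration only, since it can be visible in the process list.
authheader() {
    printf 'Authorization: Basic %s\n' \
        "$(printf '%s' "$1:$2" | openssl enc -base64 -A)"
}

authheader user pass   # prints Authorization: Basic dXNlcjpwYXNz
```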
Database Operations
An application is only as useful as the data that sits behind it. Although
there are command-line tools to interact with database server software,
here I focus on the SQLite file-based database. Something that can be
difficult when moving an application from one computer to another is that
depending on the version of SQLite, the executable may be named differently
(typically either sqlite or sqlite3). Using command substitution, you can
create a simple way of calling sqlite regardless of its name:
$(ls /usr/bin/sqlite* | grep 'sqlite[0-9]*$' | head -n1)
This will return the full file path of the sqlite executable available on a
system, provided it is installed in /usr/bin.
Consider an application that, upon first execution, creates an
empty database. If this syntax is used to invoke the sqlite binary,
the empty database always will be created using the correct version of
sqlite on that system.
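When the executable might live outside /usr/bin, the shell's command -v built-in can search the PATH instead. The sketch below uses a hypothetical helper named firstcommand, which is my own illustration rather than part of the approach above:

```shell
# Hypothetical helper: print the path of the first candidate command that
# exists in PATH, or fail if none of them is available.
firstcommand() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    echo "firstcommand: none of: $* found in PATH" >&2
    return 1
}

# Prefer sqlite3 and fall back to sqlite.
if SQLITE=$(firstcommand sqlite3 sqlite); then
    echo "using $SQLITE"
fi
```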
Here's an example of how to create a new database
with a table for personal information:
$(ls /usr/bin/sqlite* | grep 'sqlite[0-9]*$' | head -n1) test.db "CREATE TABLE people(fname text, lname text, age int)"
This will create a database file named test.db and will create the people
table as described. This same syntax could be used to perform any SQL
operations that SQLite provides, including SELECT, INSERT, DELETE, DROP and
many more.
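As a sketch of those other operations, the hypothetical peopledemo function below creates the same people table in a temporary database, inserts two rows and queries them. The sample names and ages are made up, and the -separator option assumes the sqlite3 command-line shell:

```shell
# Hypothetical demo: create, populate and query the people table in a
# temporary database file, then clean up.
peopledemo() {
    local sqlite db
    sqlite=$(command -v sqlite3 || command -v sqlite) || return 1
    db=$(mktemp) || return 1
    "$sqlite" "$db" "CREATE TABLE people(fname text, lname text, age int)"
    "$sqlite" "$db" "INSERT INTO people VALUES('Ada', 'Lovelace', 36)"
    "$sqlite" "$db" "INSERT INTO people VALUES('Alan', 'Turing', 41)"
    # Print matching rows as comma-separated values
    "$sqlite" -separator ',' "$db" "SELECT fname, lname FROM people WHERE age > 40"
    rm -f "$db"
}

peopledemo || echo "sqlite is not installed"
```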
This article barely scrapes the surface of commands available to develop
console applications on Linux. There are a number of great resources for
learning more in-depth scripting techniques, whether in Bash, awk, sed or
any other console-based toolset. See the Resources section for links to
more helpful information.
Resources