Working with Unix

As new Insider's Guide classes are no longer being offered, this site is not currently being updated. Please refer to NCBI's E-utilities documentation for more up-to-date information.

Commands and arguments

Once you have access to a Unix terminal, you will interact with Unix by typing commands: instructions that tell the computer to do something. Many commands let you modify their behavior by adding arguments, which either provide data for the command to use as input or set options for the command.

For example:

esearch -db pubmed -query "seasonal affective disorder"

In the line above, esearch is the command, and it has two arguments: -db and -query. These arguments provide input to the esearch command, specifying that we should be searching the database “pubmed”, and that our search query should be “seasonal affective disorder”. (For more about the esearch command, see our esearch documentation page.)

einfo -dbs

In the line above, we are executing the command einfo with one argument: -dbs. Rather than specifying input, the -dbs argument sets an option, telling einfo what mode it should use. (For more about the einfo command, see our einfo documentation page.)

Different commands accept different arguments. Some arguments are required for certain commands, while others are optional. Using a particular command with a particular set of arguments lets you customize your instructions to the computer.
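The same distinction between arguments that supply input and arguments that set options can be tried out with standard Unix tools. The following is a minimal sketch using printf, sort, and head, which are stand-ins chosen here because they are available on nearly every Unix system; they are not EDirect commands, and the file name sample.txt and its contents are made up for illustration.

```shell
# Create a small sample file (sample.txt is a hypothetical name; its
# contents are just three database names used as example text).
printf 'pubmed\nnlmcatalog\nmesh\n' > sample.txt

# "sort" is the command; "sample.txt" is an argument that supplies input.
sorted=$(sort sample.txt)
echo "$sorted"

# "-r" is an argument that sets an option: sort in reverse order.
reversed=$(sort -r sample.txt)
echo "$reversed"

# "-n 1" is an option that takes a value: show only the first line.
first=$(head -n 1 sample.txt)
echo "$first"

# Clean up the sample file.
rm sample.txt
```

Running sort with and without -r on the same file is a quick way to see how a single argument changes what a command does.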

Combining commands together

As mentioned earlier, one of the strengths of Unix is the ability to take the output of one command and use it as the input for another. Unix accomplishes this using the “|” character (pronounced “pipe”; on most US keyboards it is located above the Enter key, on the same key as the backslash). To send the output of one command into another, simply connect them with a “|”:

cat pmids.csv | epost -db pubmed | efetch -format xml

In the line above, we are executing the command cat on the file “pmids.csv”. We then pipe the output of that command into our next command, epost. Finally, we pipe the output of the epost command into a third command, efetch. (For more information on these commands, see our cat, epost, and efetch documentation pages.)
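If EDirect is not installed yet, the same piping pattern can be practiced with standard tools. This is a sketch only: sort and uniq stand in for epost and efetch, and the file name sample_ids.txt and the numbers in it are placeholders invented for the example, not real PMIDs.

```shell
# Create a small input file (hypothetical name, placeholder values).
printf '222\n111\n222\n' > sample_ids.txt

# cat prints the file, sort orders the lines, and uniq drops the
# adjacent duplicates. Each "|" hands one command's output to the next.
unique_ids=$(cat sample_ids.txt | sort | uniq)
echo "$unique_ids"

# Clean up the sample file.
rm sample_ids.txt
```

Each command in the chain only sees the output of the command immediately before it, which is why the order of the commands matters.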

As you build more elaborate scripts, you may find that your string of commands gets rather long. A long series of commands on a single line can be difficult to read, so it may be more convenient to write it on multiple lines. Using a backslash (“\”) at the end of a line, we can indicate to Unix that our command or series of commands continues on the next line. This lets us reformat our previous command as:

cat pmids.csv | \
epost -db pubmed | \
efetch -format xml

The backslash tells Unix that the command or series of commands is not yet complete. If you press Enter on a line that ends in a backslash, Unix will not execute the command, but will advance to the next line and allow you to finish typing your commands. The backslash must be the very last character on the line; if a space follows it, the line is not continued.
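One way to convince yourself that a continued command behaves the same as its single-line form is to run both and compare the results. The sketch below does this with printf, sort, and uniq as stand-ins for the EDirect commands; the variable names are made up for the example.

```shell
# The pipeline written on a single line.
one_line=$(printf 'b\na\nb\n' | sort | uniq)

# The same pipeline split across lines. Each backslash is the last
# character on its line, so the shell keeps reading.
multi_line=$(printf 'b\na\nb\n' | \
sort | \
uniq)

# Compare the two results.
if [ "$one_line" = "$multi_line" ]; then
    echo "Both forms produce the same output."
fi
```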

Why didn’t it work?

Once you start working with Unix, you may encounter situations where a series of commands does not work the way you expect (or fails to work at all). Don’t worry; this is quite common! With Unix, every detail matters. Unix is case-sensitive, and it is also sensitive to spacing and spelling. If you misplace a space, misspell a command, or incorrectly capitalize an argument, your script will most likely fail. Additionally, Unix is not always particularly forthcoming about why your script failed. In many cases, a script will fail and give no indication of what went wrong.
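When a command fails silently, one thing that may help is its exit status, which the shell stores in the special variable $? (0 means success, anything else means failure). The sketch below uses grep, a standard text-search tool that is not part of EDirect, to show a capitalization mistake that produces no output and no error message; the variable names are made up for the example.

```shell
# grep is case-sensitive: searching for "PubMed" finds nothing in the
# text "pubmed". No output and no error message appear, but the exit
# status is non-zero.
status=0
match=$(printf 'pubmed\n' | grep 'PubMed') || status=$?
echo "wrong capitalization: exit status $status, output '$match'"

# With the capitalization corrected, grep finds the line and succeeds.
status=0
match=$(printf 'pubmed\n' | grep 'pubmed') || status=$?
echo "correct capitalization: exit status $status, output '$match'"
```

Checking the exit status after each step of a longer script can narrow down which command was the one that failed.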

Successfully using Unix may take some effort. Pay attention to the details. Be willing to experiment. Test early and test often. Above all, have patience!

Last Reviewed: July 30, 2021

The Insider's Guide to Accessing NLM Data
