21.3.1. Command Line Tutorial (Bash)¶
In High Energy Physics we need to work a lot with the command line. This has multiple reasons, one is that it’s a lot of work to create graphical user interfaces but the most important one is that it is more efficient to work with once you get used to it.
The good thing is that this is not Belle II specific so there are very good tutorials out there we can just use.
For this tutorial we want to focus on the Bash shell. This is the most popular
command line interpreter and can be considered the standard so we stick with it.
There is also zsh
which behaves almost the same but has some advanced features
so you are welcome to try it if you want after this tutorial.
There’s another type of command line interpreter called the C shell (csh
, or
the improved version tcsh
) which tended to be popular among scientists in
the nineties. However, in contrast to the Bash it has severe drawbacks when
writing scripts (for more details you can look here).
The C shell is still around in High Energy Physics but support for it fading out: Many experiments start removing support to use their software with it. Also Belle II is planning remove support in the near future.
So if you already know C shell you probably should still continue with this tutorial. And If you’re new to command lines and shells you should definitely not learn C shell. If your supervisor uses it that should not be your problem 😉.
The only thing you need to be able to follow this lecture is to have a Bash available on your system. For macOS and Linux this is basically always the case but for Windows you need to install it first.
Installation on Windows
Luckily with recent Windows versions it has become exceedingly easy to install a bash and use it. We recommend to follow the Ubuntu Instructions and install the latest long term supported Linux (Ubuntu 20.04). You might also want to follow the tutorial to run graphical applications on that page but that is optional for now.
If you’re interested in a more technical description of the Windows Subsystem for Linux (WSL) please refer to the Microsoft documentation
We also strongly recommend that you install the Windows Terminal as it makes working with the terminal much easier on Windows and gives you basically the same features you would get on macOS or Linux.
And since the folks at Software Carpentry have already prepared a very nice introduction to the The Unix Shell. We would like you to go there and go through the introduction and then come back here when you are done.
After this introduction you know should have a basic understanding of the shell. One thing we need that was not covered in the introduction above and that is the use of environment variables. So let’s expand a bit on variables in general.
Shell Variables¶
You already learned about normal variables when learning about loops: values
can be assigned to names and we can obtain the value by putting a $
in front
of the name. In the previous tutorial this was only used for loop variables and
command line arguments ($1
, $2
, …).
This concept can also be extend to user defined variables: you can very easily define your own variables:
myvariable="Some value"
echo "I defined myvariable to ${myvariable}"
Warning
You cannot have any spaces between the name, the equal sign, and the value.
This can be very helpful when writing scripts as you can assign repeated values or command line arguments to readable names. Now the values of these variables are “local” to the current shell: if you run a program it will not see variables defined in this way.
Exercise
Run the two lines above. Then write a shell script to just print the name
of the variable $myvariable
.
Solution
Create a file print_myvariable.sh
that just contains one line:
echo "The value of myvariable is ${myvariable}"
and run it with bash print_myvariable.sh
.
The variable should be empty so the output should just be
The value of myvariable is
You can tell the shell to export your variables to all programs you call with
the export
statement. It looks basically the same as the normal variable
definition.
export myvariable="Some value"
echo "I defined myvariable to ${myvariable}"
Exercise
Run the two lines above. Then execute the shell script from the previous exercise again.
Solution
Now the script should show the value you assigned so the output should be
The value of myvariable is Some value
Exported variables are called environment variables and by convention they
should be always be in capital letters, so in the example above we should have
called it MYVARIABLE
or maybe MY_VARIABLE
.
Note
It is technically impossible for sub process to modify the environment variables of its parent process. So if you execute a script or run a program it cannot modify the environment variables in your shell.
In the example above, if the script would modify
the value of $myvariable
to something else this would not have any
affect on your the value in your current shell.
Environment Variables¶
As we said, any exported variable is technically an environment variable but
some names have special meaning. The most important one is the variable
$PATH
Exercise
Print the value of $PATH
in your shell.
Solution
You should see a number of directory names separated by colon, for example
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
The exact value depends on your operating system and might be different.
The $PATH
variable determines where the shell looks for programs to execute.
If you tell the shell to execute any program it will go through all directories
in this list one by one and look for a program by that name. If it finds one it
will execute it, otherwise it will complain that it cannot find it.
We can find out in which the shell found a certain program by using which
.
This command will print the full path to the program that would be called.
Exercise
Find out in which directory the ls
program is located.
Solution
Running which ls
should produce
/bin/ls
You can modify this $PATH
to look for programs in additional directories,
for example to first look for programs in bin
in your home directory you
could use:
export PATH=~/bin:$PATH
Question
Why do we have $PATH
in the value of the variable assignment?
Solution
We want to add a directory to the existing $PATH
, not fully replace the
value. Otherwise the shell would only look in our home directory for
programs.
There are other important variables that affect the behavior of the shell. The most important ones are:
-
PATH
¶ Determines where to look for executables.
-
LD_LIBRARY_PATH
¶ Similar to
PATH
this determines where to look for shared libraries which might be needed by the executables.
-
PYTHONPATH
¶ Similar to
PATH
this determines where the Python scripting language will look for additional modules.
-
LC_ALL
¶ Change the language settings in your shell. This goes together with a large list of “locale” variables all starting with
LC_
to change how numbers, dates, or times are formatted and how letters are sorted.LC_ALL
allows to set all of them at once.For example to change everything to German we could use
export LC_ALL=de_DE.utf8
You can find out which locales are available on your system by running
locale -a
and you can see your current settings by running justlocale
.Note
These days you should always choose a locale ending in
utf8
to have support for all characters.
-
EDITOR
¶ Lets you set your preferred editor to start when a program needs a text editor. Can be set to the executable of any editor you would like to use by default.
Modifying Environment Variables in the Shell
Now to be able to use software not installed in the default locations, like for
example the Belle II software, we need to change at least PATH
and
LD_LIBRARY_PATH
but usually also set a few others.
Now we already discussed above that executing a script cannot modify the environment variables of our current shell but it would be very inconvenient if everyone had to copy paste instructions what to set there all the time.
Luckily there is a way to modify the environment in our shell, it’s called “sourcing” a script. It behaves almost like executing a script but all the commands affect the current shell:
source myenvvars.sh
This will read the script myenvvars.sh
and execute all the commands it finds
in there in the current terminal. It is exactly equivalent to copy pasting every
single line into the terminal one by one and hitting return.
Warning
While in many cases this looks almost identical to executing a script there
can be very big differences. For example if the script contains an exit
command it will close your current terminal and not just stop executing the
script itself.
You should only use sourcing if you really need to modify the current shell.
There’s also a short version which gets used very often but can be very
misleading: The source command can be replaced by a single .
, so the above
could also be written as:
. myenvvars.sh
Note
There needs to be a space between the .
and the script name. We
recommend to use source
wherever possible as it is much clearer to
understand and avoids mistakes.
Key points
variables in bash can be created by simply writing
name=value
to make them available to called programs they need to be exported via
export name=value
executed scripts cannot affect variables in the main shell
exported variables are called environment variables
there are a few important environment variables like
PATH
variables in the current shell can be modified by sourcing a script.
Further reading¶
Bash has a lot of features and it might take a long time before you feel fully “at home” in the command line. As with many other tools, it might feel very clumsy at first and it takes some effort to get a feeling for its true power. We could only show you some very basic features, but there is much more to be discovered. The more you know, the more the command line will become an integral part of life for you (after some time you might be surprised with how many commands you remember).
Exercise
Search for “most useful bash commands” in your browser. Write down a couple of commands that you might need in the future.
Danger
NEVER use a bash command you do not understand.
As you might have seen, bash commands are a very raw way of interacting
with your machine. You cannot count on being prompted for confirmation if you
do something dangerous (the rm
command just deletes, it doesn’t ask for
confirmation and there is no trash).
So always make sure that you know what you’re doing.
See also
We have started to compile a reading list for git on confluence. Please take a look (and help us extend it if you can recommend other tutorials)!
Stuck? We can help!
If you get stuck or have any questions to the online book material, the #starterkit-workshop channel in our chat is full of nice people who will provide fast help.
Refer to Collaborative Tools. for other places to get help if you have specific or detailed questions about your own analysis.
Improving things!
If you know how to do it, we recommend you to report bugs and other requests
with JIRA. Make sure to use the
documentation-training
component of the Belle II Software
project.
If you just want to give very quick feedback, use the last box “Quick feedback”.
Please make sure to be as precise as possible to make it easier for us to fix things! So for example:
typos (where?)
missing bits of information (what?)
bugs (what did you do? what goes wrong?)
too hard exercises (which one?)
etc.
If you are familiar with git and want to create your first pull request for the software, take a look at How to contribute. We’d be happy to have you on the team!
Quick feedback!
Author of this lesson
Martin Ritter