Unix architecture and file system


In this section, we'll build an intuitive understanding of the Unix operating system (OS) and its file system. We'll then work in a Unix command line interface (CLI; exactly what a CLI is will be covered in the sections below), using some basic Unix commands, several of which let us navigate the file system. We'll also look at how Unix commands can perform different tasks depending on the options you give them.
You'll have to be able to work within a Unix environment. If you're using a Mac or Linux computer, you should be good to go, as both use Unix-like operating systems. If you're using a Windows computer, the easiest way to start a Unix terminal is MobaXterm. A better option for Windows users, however, is Ubuntu WSL. But as this installation requires some knowledge of Unix commands and the file system, you're better off completing this section first and then coming back to install Ubuntu WSL. For more details on how to start a Unix terminal, see the section Course Preparation. Just so you know, some Unix commands and options that are present in Linux and in the BSD-derived Unix used in Mac computers aren't present in MobaXterm. For instance, the man command, which outputs manuals for Unix commands, is not present. You can, however, just google 'man' followed by the command name to get the same result.

The Unix Operating System (OS)

Figure 1.1: Unix Architecture

An operating system (OS) is what ultimately connects the user with the hardware of a computer. The user interacts with the hardware through applications, which interact with the kernel, which in turn interacts with the hardware. It sounds complicated, but it isn't, as we'll see when we go through what these terms mean.

Applications are programs or groups of programs that are designed to allow the user to perform a set of coordinated tasks. For example, the programs in an office suite and your browser are applications. Unless you've worked in a command line interface (discussed below), using these kinds of applications is how you've been working with your computer until now. It is important to distinguish programs from applications, as they're not necessarily the same. All applications are programs, but many programs run 'behind the scenes' in an OS, and these are not applications. Such programs are called 'services' on a Windows OS and 'daemons' on any Unix OS.

The kernel is the main component of an OS. It translates inputs originating from applications into instructions for the hardware, which then does some work, and translates outputs coming from the hardware back to the user. You can think of a kernel as a collection of standardized, general-purpose, low-level services that users can't interact with directly. To generalize, when we use the term input we mean signals or data originating from the user or from applications. Conversely, output is signals or data originating from the hardware. For example, remember those warning messages you keep getting when you're running low on disk space? That's an output signal coming from your hardware that's being translated by your kernel for you!

User interfaces

In general, there are two types of user interfaces: graphical user interfaces (GUI) and command line interfaces (CLI). They are like special applications that allow the user to interact with the OS. A user interface is a space from which users can interact with a machine, and the GUI is the kind you're almost certainly familiar with, even if you haven't heard the term. It is what allows you to navigate through your computer by clicking on different icons. A command line interface is a user interface in the form of a command line, and a command line is just what it sounds like: a line in which you can type commands. From here, we can interact with the hardware of our computer by issuing text-based commands that are passed on to the kernel. In this course we'll sometimes use the term terminal, which we treat as synonymous with command line interface in Unix.

Unix shells are command line interpreters

For beginners, what a shell is can be quite confusing, and you might find conflicting answers when googling it. This is because the meaning differs depending on whether you're using Unix or Windows. In Unix, the shell is a command line interpreter: it interprets the commands you type and passes the resulting requests on to the kernel. The output returning from the kernel is then handled by the shell and displayed in your terminal. The confusion with Windows is that terminal and shell are often treated as one and the same. For example, the name PowerShell is commonly used both for the shell and for the window it runs in.
Within the Unix OS, there are different shells, and in later sections we'll take a quick look at what these shells are and how to switch between them. But for now, just know that the default shell we'll be working with is called Bash (Bourne Again Shell).


The Unix File system

Intuitively, the Unix file system can be thought of as a rooted tree from which branches originate. These branches are termed directories (or folders), where the topmost directory is termed root and is denoted with /. The symbol / followed by a sequence of directory names tells you where you are on the tree. Some of these directories are predefined (the directories within the green box in Figure 1.2), while others are user-defined (the directories within the blue box in Figure 1.2). In general, predefined directories contain important files and programs that carry out the inner functions of your computer, so meddling around here without knowing what you're doing is a bad idea.

Figure 1.2: Unix File Structure

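The predefined directory /home is of special interest to you, as this is where most user-defined directories live. Each user has a personal home directory, typically /home/<username> on Linux and /Users/<username> on a Mac, and the symbol ~ denotes the home directory of the current user.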
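
As a minimal sketch, assuming a Bash-like shell, you can check what ~ stands for on your own machine. The shell expands ~ to the path stored in the HOME variable, so both lines below should print the same path. The variable name home_from_tilde is made up for this illustration.

```shell
# The shell replaces an unquoted ~ with the current user's home directory.
home_from_tilde=$(echo ~)
echo "$home_from_tilde"

# The same path is stored in the HOME environment variable.
echo "$HOME"
```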

When you start working in the Unix file system, you'll notice that file names are shown in different colours in the terminal. This is simply one of the ways in which file types are distinguished when files are listed. But as these colours can differ between operating systems (and it is possible to change them), it's best not to assume that a colour represents a particular file type when working on a new computer. The table below shows one of the most common colour schemes.

Color                          Description
White                          Ordinary/regular file
Blue/dark blue                 Directory
Green                          Executable file
Sky blue                       Linked file
Yellow with black background   Device
Pink                           Graphic image file
Red                            Archive file or Zip file
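
Because colours vary between systems, a colour-independent check can be handy. The sketch below assumes a POSIX-style ls and uses its -F option, which appends a character to each name to mark its type. It runs in a throwaway directory, and all file and directory names are made up for the illustration.

```shell
# Work in a temporary directory so nothing on your system is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"

mkdir results                      # a directory
touch notes.txt                    # a regular file
touch run.sh && chmod +x run.sh    # an executable file
ln -s notes.txt shortcut           # a symbolic link (linked file)

# -F appends / to directories, * to executables and @ to symbolic links.
listing=$(ls -F)
echo "$listing"
```

Regular files such as notes.txt are listed without any extra character.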

Introduction to Terminal and Unix Commands

A Unix command can consist of three parts: the command itself, command line options and command line arguments. In general, Unix command names are abbreviations, words made shorter to make life easier for you, the user. The notation <something> is used to indicate that 'something' is a required argument, which could for example be a file, a directory or a URL. The notation [something] is used to indicate that 'something' is optional, and is often used for command line options. The idea behind Unix commands (part of what's called the 'Unix philosophy') is that every command should do one thing, and do it well. This of course isn't entirely true, as the functionality of a Unix command can differ depending on the command line options given. However, compared to the more complex programs you've been using until now (browsers, Word, video games and so forth), Unix commands truly are simple. Think of Unix commands as small programs that each perform one small function, and of larger programs in Unix as combinations of many Unix commands.
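
To make the three parts concrete, here is a small sketch using ls with the root directory as the argument. It assumes a POSIX-style ls. The variable name root_itself is only there to capture the output for this example.

```shell
# Command only: list the contents of the current directory.
ls

# Command + argument: list the contents of the root directory.
ls /

# Command + option + argument: -d changes what ls does, so it now lists
# the directory itself instead of its contents.
root_itself=$(ls -d /)
echo "$root_itself"
```

The same command, ls, behaves differently once the option -d is added, which is what the [OPTION] slot in the table below refers to.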

Unix Command                           Acronym translation       Description
who or whoami                          -                         Tells you who the current user is. The 'who' command is not present in MobaXterm, but you can use 'whoami' instead.
man <COMMAND>                          Manual                    A very useful command. Used on another Unix command, it gives you a manual of how to use that command. This command is not present in MobaXterm, but you can instead use Google to find command manuals.
cal [OPTION]                           Calendar                  Gives you the current date in calendar form.
date [OPTION]                          -                         Gives you the current date.
pwd [OPTION]                           Print working directory   Where are you? Shows the current directory.
ls [OPTION] [DIRECTORY]                List segments             Shows the files in the given directory, or in the current directory if no path is given.
cd [OPTION] <DIRECTORY>                Change directory          Moves you to the specified directory. You can type 'cd ..' to move one directory up. The '..' simply means the parent directory, one level closer to the root.
mkdir [OPTION] <DIRECTORY>             Make directory            Makes the specified directory.
rmdir [OPTION] <DIRECTORY>             Remove directory          Removes the specified directory if it's empty. rmdir has no recursive option; to remove a non-empty directory, use 'rm -r <DIRECTORY>' instead.
ln [OPTION] <TARGET> <LINK_NAME>       Link                      Can be used to create links (shortcuts) to files and directories.
Figure 1.3: MobaXterm CLI

Let's begin by starting a terminal from which we can issue some of these commands. If you're using MobaXterm, simply click on the icon and then on 'Start local terminal'. It should look like figure 1.3. You may also be using Ubuntu WSL (Windows Subsystem for Linux), in which case it will look like figure 1.4.

Figure 1.4: Ubuntu WSL CLI

If you're on a Mac computer, a program called 'Terminal' should be installed on your system. Locate it and run it. The display will be different from MobaXterm and Ubuntu WSL, but the setup will be the same.

In all 3 cases, the terminals have the same structure. They consist of a prompt and a command line. As illustrated in figure 1.3 and figure 1.4, the prompt is the text that precedes the command line, and what it contains can vary between computers. The presence of the prompt indicates that the system is ready to carry out commands. The command line is the space in which you type your commands. You'll notice that right after you've entered a command, the prompt disappears while your system is busy and not yet ready for another command.

Now that we've started a Unix terminal, let's start off by using the Unix commands cd and ls to get familiarized with how to navigate through a file system.

Navigating through the file system with cd and ls from current working directory, ~ and /

There are 3 starting points from which you can navigate through the file system: your current working directory, your home directory (~) and the root directory (/). It's a good idea to consider where your files of interest are located with respect to ~ and / as well as your current working directory.

When you start your Unix terminal, your current working directory is by default your home directory. This should be indicated with ~ in the command prompt.

Prompt$ ls

will display files and directories in your current working directory. Sometimes, there are hidden files or directories, which are those whose names start with a dot (.).

Prompt$ ls -a

will display all files and directories (including the hidden ones) in your current working directory. The option -a stands for all.
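
The difference can be tried out safely in a throwaway directory. This is a small sketch in which the file names visible.txt and .hidden.txt are invented for the example.

```shell
# Create a temporary directory and move into it.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# One ordinary file and one hidden file (its name starts with a dot).
touch visible.txt .hidden.txt

# Plain ls skips names that start with a dot.
plain=$(ls)
echo "$plain"

# ls -a shows everything, including . (this directory) and .. (its parent).
all=$(ls -a)
echo "$all"
```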

From your current working directory you can move to other directories. For example, if you're currently in your home directory, ~, and one of its subdirectories is called 'workwork',

Prompt$ cd workwork

will change your current directory to ~/workwork. If there had been a directory within workwork called 'morework',

Prompt$ cd workwork/morework

would change your current directory to ~/workwork/morework. This way it's possible to change to faraway directories. It is also possible to change directories in a backwards fashion. If you're currently in the directory ~/workwork/morework/somuchwork,

Prompt$ cd ../../

would change your directory to ~/workwork.
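
The walk-through above can be replayed in a sandbox. The sketch below builds the same directory chain inside a temporary directory instead of your home directory, and uses mkdir -p (explained in the hints of Exercise 2) to create the whole chain in one go.

```shell
# Build workwork/morework/somuchwork inside a temporary directory.
start_dir=$(mktemp -d)
cd "$start_dir"
mkdir -p workwork/morework/somuchwork

# Move down three levels with a single relative path.
cd workwork/morework/somuchwork
pwd

# Each .. goes one level up, so ../../ lands in workwork.
cd ../../
here=$(pwd)
echo "$here"
```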

In this sort of navigation, you start from your current working directory and use what's termed a relative path. This just means that you don't need to specify the full path (starting from the root directory) of where you want to go, as part of this path is already provided by your current working directory. You can, however, also navigate by using absolute paths, which start at the root directory, /. A path starting with ~ behaves the same way, since the shell expands ~ to the absolute path of your home directory. In figure 1.6, all three ways of navigation are illustrated. Essentially, you can navigate from your home and root directories by typing,

Prompt$ cd ~/filepath
Prompt$ cd /filepath 
Figure 1.6: Navigating from current, home or root directory: The blue, green, and orange paths indicate navigation from the current directory, home directory and root directory respectively. In the bottom part of the figure, the commands that would result in the corresponding navigation are shown.
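
As a rough sketch of the difference, the commands below reach the same directory once with a relative path and once with an absolute path. Everything happens inside a temporary directory, and the variable names (base, via_relative, via_absolute) are invented for the example. A path starting with ~/ would work like the absolute one, only starting from your home directory.

```shell
# base holds the absolute path of a temporary sandbox directory.
base=$(mktemp -d)
mkdir -p "$base/workwork/morework"

# Relative path: only works because we are already in workwork.
cd "$base/workwork"
cd morework
via_relative=$(pwd)

# Absolute path: works from anywhere, here from the root directory.
cd /
cd "$base/workwork/morework"
via_absolute=$(pwd)

echo "$via_relative"
echo "$via_absolute"
```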

Tips and keyboard shortcuts

In Unix, there are a lot of tricks and keyboard shortcuts which can be very handy.

  • Tab key

The 'tab key' is useful if you're working with directories or files that have long names. Basically, when you press 'tab', the shell will fill in file or directory names for you. For example, the file Pseudomonas_Aeruginosa_16SrRNA.gb has a pretty long name, and it would be annoying to type the whole name every time. If you type

Prompt$ cat P

and then press 'tab', the result would be

Prompt$ cat Pseudomonas_Aeruginosa_16SrRNA.gb

If there had been more than one file starting with character 'P', however, the shell wouldn't have been able to guess the filename. But this just means you have to be a little more specific by filling in more characters before pressing tab.

  • Arrow keys

By using the up and down arrow keys you can go back to commands executed earlier. This is useful if you've executed a long command but made a small error so that it doesn't work. The arrow keys allow you to go back to your previous command, correct the error and execute the command again.

  • Ctrl-U and Ctrl-Y

If you're in the middle of typing a long command and you forget how a command option works, you don't want to delete the entire command line just so you can test that option. You could copy the entire command line and paste it back once you've figured out how the option works, but there's an easier way. In Bash, Ctrl-U cuts everything from the cursor back to the start of the line and stores it, and once you've figured out how the option works, Ctrl-Y pastes the stored text back in.

Exercise 1: Getting Familiar with Unix Commands

1. Get an idea of where you are by using pwd or ls.

Prompt$ pwd 

will output the current working directory.

Prompt$ ls 

will list the directories and files in the current working directory. Using ls to navigate through a file system is generally easier than using pwd.

2. Try moving around the file system using cd.

Prompt$ cd <DIRECTORY>

will change your current working directory to the specified directory. You can move to faraway directories if you provide a longer filepath. If you're using Ubuntu WSL, you can find your desktop in the directory /mnt/c/Users/<your user name>/Desktop.

3. Go to your desktop and make a directory which will be your workplace using mkdir. You can call it whatever you like.

Prompt$ mkdir <DIRECTORY>

will create DIRECTORY. If you specify a filepath, mkdir filepath/DIRECTORY, you can create directories in other places as well.

4. Try making and then removing a directory using rmdir.

Prompt$ rmdir <DIRECTORY>

will remove DIRECTORY. Just as with mkdir you can remove directories in other places by specifying a filepath.
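
Note that rmdir only removes empty directories. The sketch below shows what happens otherwise and how rm -r handles a non-empty directory. It runs in a temporary sandbox, and the names empty_dir and full_dir are made up for the example. Be careful with rm -r on real data, as there is no undo.

```shell
# Set up one empty and one non-empty directory in a sandbox.
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir empty_dir full_dir
touch full_dir/file.txt

# rmdir succeeds on the empty directory.
rmdir empty_dir

# rmdir refuses to remove a directory that still contains something.
if rmdir full_dir 2>/dev/null; then
    echo "full_dir removed"
else
    echo "rmdir refused: full_dir is not empty"
fi

# rm -r removes the directory together with everything inside it.
rm -r full_dir

# -A lists all entries except . and .., so this is empty if both are gone.
remaining=$(ls -A)
echo "Left in sandbox: '$remaining'"
```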

5. Figure out the current date and user with date and whoami.

Prompt$ date

will output the current date.

Prompt$ whoami

will output the current user.

Exercise 2: Navigating through the file system

In this exercise we're gonna make a bunch of directories and navigate through them. This will be important in the later sessions when you need to move files between distant directories.

Figure 1.7: Ex2 Branch of directories
  1. Make a branch of directories like the one shown in figure 1.7. (Hint 1 and Hint 2)
  2. Move to AB
  3. Go from AB to A7
  4. Go from A7 to A5
  5. Go from A5 to B5
  6. Go from B5 to B7
  7. Go from B7 to AB
  8. Create a link (shortcut) from AB to B7 called ABtoB7. (Hint 3)
  9. While in B7, list the directories in B1. (Hint 4)
  10. While in B7, list the directories in A1.

Hint 1: mkdir can create multiple directories in one line.
Hint 2: Directories within directories (subdirectories) can be made by specifying the path to where you want to make the directory. For example, if you wanted to make a directory called 'Hello' within the directory My/Name/is/Bob/ (located in your current working directory), you would use the command mkdir My/Name/is/Bob/Hello. But this only works if the directory My/Name/is/Bob/ already exists! If it doesn't, you will get an error message. To solve this, you can instead use mkdir -p My/Name/is/Bob/Hello. The -p is an option for mkdir and is short for 'parents'; it creates any missing parent directories along the way.
Hint 3: Use the command ln with the command line option -s.
Hint 4: ls can be used to list directories or files in other directories than your current directory. But this requires that you specify the path to the directory. In this case, there are 3 file paths that you could make.
You can make a backwards file path using ../. For example, typing ls ../ would list the files/directories in the parent directory, and ls ../../ would list the files/directories in the parent of the parent directory.
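
The ideas from the hints can be tried out together in a sandbox before applying them to the exercise. This is only a sketch: the directory names (red, green, apple, seed) and the link name to_apple are invented and deliberately differ from the ones in figure 1.7.

```shell
# Work in a temporary directory.
sandbox=$(mktemp -d)
cd "$sandbox"

# Hint 1: one mkdir call can take several directory names.
mkdir red green

# Hint 2: -p creates missing parent directories along the way.
mkdir -p red/apple/seed

# Hint 3: ln -s <TARGET> <LINK_NAME> creates a symbolic link, here a
# link inside green that points at red/apple.
ln -s "$sandbox/red/apple" green/to_apple

# Hint 4: from inside red/apple, ../../ is the sandbox itself.
cd red/apple
siblings=$(ls ../../)
echo "$siblings"

# Listing through the link shows the contents of red/apple.
through_link=$(ls ../../green/to_apple/)
echo "$through_link"
```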