Unix answers

Latest revision as of 15:46, 19 November 2025

Possible Answers to UNIX Exercises

1. Use any plain-text editor (nano, vim, emacs, Notepad++, TextEdit in plain text mode, etc.) to create mycommands.txt and paste all commands you run. Make sure it is clear which exercise each section belongs to.

2. First list the files in the directory.

ls

3. Copy ex1.acc to myfile.acc.

cp ex1.acc myfile.acc

4. Look at the content of both files to ensure they are identical.

cat ex1.acc
cat myfile.acc
diff ex1.acc myfile.acc
# or:
md5sum ex1.acc myfile.acc
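
For files too long to compare by eye, the exit status of a comparison tool is more reliable than reading the output of cat. The following is a minimal sketch that runs in a scratch directory, with made-up file contents standing in for ex1.acc:

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample content standing in for ex1.acc.
printf 'AB000114\nAB000115\n' > ex1.acc
cp ex1.acc myfile.acc

# cmp -s prints nothing; its exit status is 0 only when the files are identical.
if cmp -s ex1.acc myfile.acc; then
    same="identical"
else
    same="different"
fi
echo "ex1.acc and myfile.acc are $same"
```

diff behaves the same way: no output and exit status 0 mean the files match.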

5. Copy ex1.dat to myfile.acc (overwrite).

cp ex1.dat myfile.acc

6. Check that the content of myfile.acc changed.

diff ex1.dat myfile.acc
# or view:
head myfile.acc

7. Delete myfile.acc.

rm myfile.acc

8. Make a directory test and move the three files into it.

mkdir test
mv ex1.acc ex1.dat orphans.sp test/

9. Make a directory data and move the three files to that instead.

mkdir data
mv test/* data/

10. Remove test directory.

rmdir test

11. Change directory to data and confirm that you succeeded. Then go back.

cd data
pwd
ls
cd -
# or:
cd ~
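
The two ways of going back are not always equivalent: cd - returns to the previous directory, while cd ~ always goes to the home directory. A small sketch, using a temporary directory as a stand-in for data:

```shell
# Remember where we started.
start=$(pwd)

# Hypothetical stand-in for the data directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# cd - switches to the previous directory and prints its name;
# the name is discarded here to keep the output short.
cd - > /dev/null || exit 1
back=$(pwd)

echo "started in $start, now in $back"
```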

12. Make three nested directories, one inside the other like Russian dolls, with newtest as the outermost (here newtest/one/two).

mkdir -p newtest/one/two

13. Move the data directory to the innermost newtest directory.

mv data newtest/one/two/

14. Confirm the files are inside newtest/one/two/data.

ls newtest/one/two/data
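
Steps 12 to 14 can be tried end to end without touching real data. The sketch below assumes a scratch directory and a single empty placeholder file in data:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# -p creates every missing parent directory in one call.
mkdir -p newtest/one/two

# Stand-in for the data directory, holding one hypothetical file.
mkdir data
touch data/ex1.acc

# Moving a directory moves everything inside it as well.
mv data newtest/one/two/

# find lists the whole tree, one path per line.
tree_listing=$(find newtest | sort)
echo "$tree_listing"
```

The listing should show the file under newtest/one/two/data, and data should no longer exist at the top level.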

15. Copy the three files back to your home directory. Run the command from your home directory; the final . means the current directory.

cp newtest/one/two/data/* .

16. Remove all newtest directories and their contents with a single command. Note that rm -r deletes recursively without asking for confirmation, so check the path first.

rm -r newtest

17. Count the lines in ex1.acc and ex1.dat.

wc -l ex1.acc ex1.dat

18. Concatenate ex1.acc and ex1.dat into ex1.tot.

cat ex1.acc ex1.dat > ex1.tot

19. Merge (paste) ex1.acc and ex1.dat side by side into ex1.tot, overwriting the previous version.

paste ex1.acc ex1.dat > ex1.tot

Verify:

head ex1.acc
head ex1.dat
head ex1.tot
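
The difference between cat and paste shows up in the line counts. A minimal sketch with made-up three-line files (the real ex1.acc and ex1.dat are longer):

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-ins for the exercise files.
printf 'ID1\nID2\nID3\n' > ex1.acc
printf '1.5\t-2.0\n0.3\t4.1\n-7.2\t0.9\n' > ex1.dat

cat ex1.acc ex1.dat > cat.tot      # files stacked: 3 + 3 lines
paste ex1.acc ex1.dat > paste.tot  # files joined line by line: 3 lines

# tr strips the padding some wc implementations add.
cat_lines=$(wc -l < cat.tot | tr -d ' ')
paste_lines=$(wc -l < paste.tot | tr -d ' ')
echo "cat: $cat_lines lines, paste: $paste_lines lines"
```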

20. Extract columns 1 and 5 from ex1.tot into ex1.res.

cut -f1,5 ex1.tot > ex1.res
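
cut splits on tab characters by default, which is what paste produces. A small sketch on invented rows (one ID followed by four numbers):

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical rows: ID plus four numeric columns, tab-separated.
printf 'ID1\t1\t2\t3\t4\nID2\t5\t6\t7\t8\n' > ex1.tot

# Keep fields 1 and 5 only.
cut -f1,5 ex1.tot > ex1.res

first_line=$(head -n 1 ex1.res)
echo "$first_line"
```

For files with another separator, the -d option sets the delimiter, for example cut -d, -f1,5 for comma-separated data.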

21. Find the three lines in ex1.res with the highest values in column 2.

sort -k2,2nr ex1.res | head -3
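
The n in the key matters: without it the values are compared as text. A sketch on four invented rows illustrates the difference:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-column data: ID and value, tab-separated.
printf 'A\t2\nB\t10\nC\t-3\nD\t7\n' > ex1.res

# Numeric, reversed, on field 2 only: the order is 10, 7, 2, -3.
top_numeric=$(sort -k2,2nr ex1.res | head -n 1 | cut -f1)

# Without n the key is compared as text, so "7" ranks above "10".
# LC_ALL=C makes the text ordering independent of the locale.
top_text=$(LC_ALL=C sort -k2,2r ex1.res | head -n 1 | cut -f1)

echo "numeric top: $top_numeric, text top: $top_text"
```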

22. Count GenBank accessions in orphans.sp (should be 85).

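The pattern below matches one or two capital letters followed by five or six digits. The -o option prints every match on its own line, so wc -l counts matches rather than lines; if the total differs from 85, count matching lines with grep -cE and the same pattern.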

grep -Eo "[A-Z]{1,2}[0-9]{5,6}" orphans.sp | wc -l
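
Counting matches and counting lines give different totals as soon as one line holds two accessions. The sketch below uses three invented lines, not the real orphans.sp:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: line 1 has two accessions, line 2 none, line 3 one.
printf 'AB000114.CDS.1 links to Z12345\nno accession here\nX98765 alone\n' > sample.sp

pattern='[A-Z]{1,2}[0-9]{5,6}'

# -o prints each match on its own line, so this counts matches.
match_count=$(grep -Eo "$pattern" sample.sp | wc -l | tr -d ' ')

# -c counts lines that contain at least one match.
line_count=$(grep -Ec "$pattern" sample.sp)

echo "matches: $match_count, matching lines: $line_count"
```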

23. Count the human genes with SwissProt IDs, then count how many of those are hypothetical.

grep "_HUMAN" orphans.sp | wc -l
grep "_HUMAN" orphans.sp | grep -i "HYPOTHETICAL" | wc -l

24. Count the rat genes, then count how many of those are precursors.

grep "_RAT" orphans.sp | wc -l
grep "_RAT" orphans.sp | grep -i PRECURSOR | wc -l
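
The same filter-then-count pattern can be written with grep -c, which counts matching lines directly instead of piping to wc -l. A sketch on four invented entries (the descriptions are placeholders, not taken from orphans.sp):

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical entries in the SwissProt ID style GENE_ORGANISM.
printf 'PARG_HUMAN HYPOTHETICAL PROTEIN\nTF1A_MOUSE FACTOR\nABCD_HUMAN kinase\nINS_RAT INSULIN PRECURSOR\n' > sample.sp

# First filter selects the organism.
human=$(grep -c "_HUMAN" sample.sp)

# Second filter narrows the result; -i ignores case, -c counts the lines.
human_hypo=$(grep "_HUMAN" sample.sp | grep -ci "hypothetical")

echo "human: $human, hypothetical: $human_hypo"
```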

25. Split ex1.res into positive and negative values.

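Lines containing a minus sign hold negative values; all other lines are treated as positive. grep can read the file directly, so no cat is needed: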

grep "-" ex1.res > ex1.neg
grep -v "-" ex1.res > ex1.pos
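
Matching on the minus character would also catch a hyphen in an ID or a value written like 1e-05. An alternative sketch, not part of the original answer, compares column 2 numerically with awk; it assumes tab- or space-separated columns and counts zero as positive:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-column data: ID and value.
printf 'ID1\t-2.0\nID2\t4.1\nID3\t0.9\nID4\t-7.2\n' > ex1.res

# Print lines whose second field is below zero, then the rest.
awk '$2 < 0' ex1.res > ex1.neg
awk '$2 >= 0' ex1.res > ex1.pos

neg_lines=$(wc -l < ex1.neg | tr -d ' ')
pos_lines=$(wc -l < ex1.pos | tr -d ' ')
echo "negative: $neg_lines, positive: $pos_lines"
```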

26. Arithmetic directly in the shell.

echo $(( (356+51)*123 - 12765 ))
echo $(( ((356+51)*123 - 12765) / 56 ))
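
Shell arithmetic works on integers only, so a division drops any remainder. The numbers above happen to divide evenly; the sketch below shows the truncation on a different pair and one way to get decimals with awk:

```shell
# The two expressions from the exercise, stored in variables.
total=$(( (356+51)*123 - 12765 ))
quotient=$(( total / 56 ))

# Integer division drops the remainder: 7 / 2 gives 3, not 3.5.
truncated=$(( 7 / 2 ))

# awk does floating-point arithmetic; LC_ALL=C keeps the decimal point a dot.
decimal=$(LC_ALL=C awk 'BEGIN { printf "%.2f", 7 / 2 }')

echo "total=$total quotient=$quotient truncated=$truncated decimal=$decimal"
```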