{ "cells": [ { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "# Manipulating files and directories with the command line" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## mv: moving and renaming files\n", "\n", "Since we're currently doing a command line tutorial, let's go into that directory and see what is there.\n", "\n", " cd command_line_recitation\n", " ls\n", "\n", "We see that we have a directory called `sequences`, as well as a FASTA file named `some sequence.fasta`. This file name has the annoying space in it. We would like to rename it something without a space, say `some_sequence.fasta`. To do this, we us the **`mv`** command, short for \"move\". We enter `mv`, followed by the name of the file we want to rename, and then its new name. \n", "\n", " mv some sequence.fasta some_sequence.fasta\n", "\n", "Uh-oh! That gave us some strange output, talking about the usage of `mv`. This is because the space in the file `some sequence.fasta` was interpreted as a gap between arguments of the `mv` command. To specify that the space is part of the file name, we need to use an **escape character**. The escape character for macOS or Linux is `\\`. With Windows, you can use a caret `^` as an escape character or you can enclose the file name with a space in single quotes. The space following the escape character is not considered as an argument separator. To change the name do the following.\n", "\n", "- macOS or Linux: \n", " `mv some\\ sequence.fasta some_sequence.fasta` \n", " or \n", " `mv 'some sequence.fasta' some_sequence.fasta` \n", "\n", "- Windows: \n", " `mv 'some sequence.fasta' some_sequence.fasta`\n", "\n", "You will also work with some special directories, such as the one for you team's repo. These directories are managed by Git, that will synchronize them to your github repo. As you synchronize them according to the version of the files you have, they are called \"under version control\".\n", "When you are working on files under version control, you should precede the `mv` command with `git`. That way, Git will keep track of the naming changes you made. So, you would do this:\n", "\n", "- macOS or Linux: \n", " `git mv some\\ sequence.fasta some_sequence.fasta` \n", " or \n", " `git mv 'some sequence.fasta' some_sequence.fasta` \n", " \n", "- Windows: \n", " `git mv 'some sequence.fasta' some_sequence.fasta`\n", "\n", "Now, we probably want this file in the `sequences` directory. We can also move files into directories (without changing their file names) using the `mv` command.\n", "\n", " mv some_sequence.fasta sequences/\n", "\n", "The trailing slash is not necessary, but I always include it out of habit to remind myself that I am moving a file to a directory. (Again, if these files were under version control, you would precede the above command with `git`.)\n", "\n", "Now let's go into the `sequences` directory and see what we have.\n", "\n", " cd sequences\n", " ls\n", "\n", "We see that `some_sequence.fasta` is there, along with other FASTA files." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Word to the wise: **NO SPACES**\n", "\n", "Look at what is in the directory using `ls`.\n", "\n", " ls\n", "\n", "The directory `command_line_recitation/` has some files that will help us through this lesson. Note that there are no spaces in the directory name. **In general, you avoid spaces in directory and file names**, even though your operating system often has them in there. Trust me on this, they can make things a total mess, especially on the command line, since a space also separates commands. Really. **NO SPACES.**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploring file content\n", "\n", "We would like to see what is in the sequence files. Bash offers various ways to display the content of files. We'll look at the genome of the dengue virus in the file `dengue.fasta`. There are lots of ways to do it. We'll start with `less`. It got its name because it is more feature-rich than `more`, which was used to look at files before `less` came to be. (\"`less` is `more`,\" get it?) It allows using the arrow up and arrow down keys traverse up or down by line. It also allows scrolling by touchpad or mouse. Since it doesn't require the whole file to be read before displaying the top content, it's ideal for larger files.\n", "\n", "- macOS or Linux:\n", " `less dengue.fasta`\n", "\n", "- Windows:\n", " `more dengue.fasta`\n", " \n", "It also supports searching initiated by \"/\" followed by the query: `/AAAA`.\n", "You can move down by a number of lines by \"`:`\" followed by the number of lines: `:40`.\n", "\n", "To show line numbers, type `-N`. Other useful commands are `shift+g` will go to the end of the file and `gg` to the beginning. To exit `less` or `more`, hit `q`.\n", "\n", "We'll now look at several other ways to look at files. Just substitute them for `less` in the above command.\n", "\n", "### cat\n", "`cat` prints the entire file to the standard output (terminal). This is especially useful if the files are very small. Windows users, use `!type` instead of `cat`.\n", "\n", "### head\n", "`head` just prints the top lines of the file to the standard output. The default can be changed:\n", "\n", " head -5 \n", "\n", "This will print the first 5 lines to the standard output. Windows users, note alternative command: `gc myfile.txt -head 5`.\n", "\n", "### tail\n", "Like `head`, but for the last lines of the file. Windows users, note alternative command: `gc myfile.txt -tail 5`.\n", "\n", "You will notice that `less` does not show the text in the terminal once you exit, whereas `cat`, `head`, and `tail` do. This might be useful if you need something from the file for your next command. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Copying files and directories: cp\n", "\n", "If you want to retain a copy of the folder/file in the original folder you can use the copy command `cp`. It works straightforwardly with files. Applied to directories it requires a **flag**: `cp -r`, meaning \"recursive.\" A **flag** typically begins with a hyphen (`-`) and gives the command some extra directions on how you want to do things. In this case, we are telling `cp` to work recursively.\n", "\n", "Let's have a look at the `cp` command in action.\n", "\n", " cp dengue.fasta copy_of_dengue.fasta\n", "\n", "Maybe we want a copy of the entire `sequences` directory. To do that, we will `cd` one directory up to the `command_line_recitation` directory.\n", "\n", " cd ..\n", "\n", "We went up one directory using `..`. This is an example of a **relative path**. The current directory is \"`.`\", \"`../..`\" is two directories up, \"`../../..`\" is three directories up, and so on. This is very very useful when navigating directory structures. Now let's try copying an entire directory with the `-r` flag.\n", "\n", " cp -r sequences copy_of_sequences\n", "\n", "We can also rename directories with the `mv` command. Let's rename `copy_of_sequences` to `sequences_copy`. This is silly, but illustrates how things work.\n", "\n", " mv copy_of_sequences sequences_copy\n", " \n", "One more thing... `rsync` is a generally better version of `cp` and it's available on most distributions and OSes. Knowing that `cp` exists is necessary, using it everyday, not so much.\n", "`rsync -avzP` is a great setup where your files will be copied faster (thanks to compression), while maintaining all their properties and showing you progress." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Removing files and directories with rm\n", "\n", "Yes, some of the things we just did are silly. We have no need for having a copy of a given sequence or a copy of the whole sequences directory. We can clean things up by deleting them. First, let's get rid of our copy of the dengue sequence. Let's `cd` into the sequences directory and make sure it's there.\n", "\n", " cd sequences\n", " ls\n", "\n", "Now let's remove the file and verify it is gone.\n", "\n", " rm copy_of_dengue.fasta\n", " ls\n", "\n", "And poof! It's gone! And I mean gone. It is pretty much irrecoverable. **Warning**: `rm` is a wrecking ball. It will destroy any files you have that do not have restrictive permissions. This is so important, I will say it again.\n", "\n", "
\n", "\n", "**rm is unforgiving.**\n", " \n", "
\n", "\n", "Therefore, I always like to use the `-i` flag, which means that `rm` will ask me if I'm sure before deletion.\n", "\n", " rm -i some_sequence.fasta\n", "\n", "You will get a prompt. Answer \"`n`\" if you do not want to delete it.\n", "\n", "Now, cd into the higher directory `cd ../`, and let's use `rm` to remove an entire directory. To do this, we need to use the `-r` flag.\n", "\n", " rm -r sequences_copy\n", " \n", "In the same way, copying onto a previously existing file with `cp` or `rsync` is also irreversible, so be very careful when using them too!\n", "`cp -i` also exists and, if you are unsure about what you might be doing, choose this over `rsync`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "rm -
\n", "\n", "*Copyright note: In addition to the copyright shown below, this recitation was developed based on materials from Axel Müller.*" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 4 }