Using the shell to create a directory is no different from using a file explorer.
If you open the current directory using your operating system's graphical file explorer,
the thesis directory will appear there too.
While they are two different ways of interacting with the files,
the files and directories themselves are the same.

Complicated names of files and directories can make your life painful
when working on the command line. Here we provide a few useful
tips for the names of your files.

1. Don't use whitespace.

   Whitespace can make a name more meaningful,
   but since whitespace is used to break arguments on the command line,
   it is better to avoid it in names of files and directories.
   You can use - or _ instead of whitespace.

2. Don't begin the name with - (dash).

   Commands treat names starting with - as options.

3. Stick with letters, numbers, . (period or 'full stop'), - (dash) and _ (underscore).

   Many other characters have special meanings on the command line.
   We will learn about some of these during this lesson.
   There are special characters that can cause your command to not work as
   expected and can even result in data loss.

If you need to refer to names of files or directories that have whitespace
or another non-alphanumeric character, you should surround the name in quotes ("").

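The naming tips above can be tried out safely. The following is a minimal sketch, assuming a throwaway directory created with mktemp -d; the names "north pacific gyre" and -draft.txt are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# The quotes make the shell pass the whole name as one argument.
# Without them, mkdir would receive three separate names.
mkdir "north pacific gyre"
ls -F

# A name that starts with a dash looks like an option to most commands.
# Putting ./ in front makes it unambiguous that this is a file name.
touch ./-draft.txt
rm ./-draft.txt
```

If you remove the quotes from the mkdir line you should end up with three directories instead of one, which is exactly the surprise the first tip warns about.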
When we say, "nano is a text editor," we really do mean "text": it can
only work with plain character data, not tables, images, or any other
human-friendly media. We use it in examples because it is one of the
least complex text editors. However, because of this trait, it may
not be powerful enough or flexible enough for the work you need to do
after this workshop. On Unix systems (such as Linux and Mac OS X),
many programmers use Emacs or
Vim (both of which require more time to learn),
or a graphical editor such as
Gedit. On Windows, you may wish to
use Notepad++. Windows also has a built-in
editor called notepad that can be run from the command line in the same
way as nano for the purposes of this lesson.

No matter what editor you use, you will need to know where it searches
for and saves files. If you start it from the shell, it will (probably)
use your current working directory as its default location. If you use
your computer's start menu, it may want to save files in your desktop or
documents directory instead. You can change this by navigating to
another directory the first time you "Save As..."

The Control key is also called the "Ctrl" key. There are various ways
in which using the Control key may be described. For example, you may
see an instruction to press the Control key and, while holding it down,
press the X key, described as any of:

Control-X
Control+X
Ctrl-X
Ctrl+X
^X
C-x

In nano, along the bottom of the screen you'll see ^G Get Help ^O WriteOut.
This means that you can use Control-G to get help and Control-O to save your
file.

We have seen how to create text files using the nano editor.
Now, try the following command in your home directory:

$ cd                  # go to your home directory
$ touch my_file.txt

What did the touch command do?
When you look at your home directory using the GUI file explorer,
does the file show up?

Use ls -l to inspect the files. How large is my_file.txt?

When might you want to create a file this way?

When you inspect the file with ls -l, note that the size of
my_file.txt is 0 bytes. In other words, it contains no data.
If you open my_file.txt using your text editor it is blank.

Some programs do not generate output files themselves, but
instead require that empty files have already been generated.
When the program is run, it searches for an existing file to
populate with its output. The touch command allows you to
efficiently generate a blank text file to be used by such
programs.

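To check the size for yourself without touching your home directory, here is a small sketch that assumes a scratch directory created with mktemp -d rather than the home directory used in the exercise.

```shell
# Use a scratch directory instead of the real home directory.
workdir=$(mktemp -d)
cd "$workdir"

# touch creates the file if it does not exist yet.
touch my_file.txt

# The size column of the long listing shows 0.
ls -l my_file.txt

# wc -c counts the bytes in the file.
wc -c < my_file.txt
```

Running touch on a file that already exists does not empty it; it only updates the file's timestamp.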
The Unix shell doesn't have a trash bin that we can recover deleted
files from (though most graphical interfaces to Unix do). Instead,
when we delete files, they are unhooked from the file system so that
their storage space on disk can be recycled. Tools for finding and
recovering deleted files do exist, but there's no guarantee they'll
work in any particular situation, since the computer may recycle the
file's disk space right away.

What happens when we type rm -i thesis/quotations.txt?
Why would we want this protection when using rm?

rm: remove regular file 'thesis/quotations.txt'?

The -i option will prompt before every removal.
The Unix shell doesn't have a trash bin, so all the files removed will disappear forever.
By using the -i flag, we have the chance to check that we are deleting only the files that we want to remove.

Removing the files in a directory recursively can be a very dangerous
operation. If we're concerned about what we might be deleting we can
add the "interactive" flag -i to rm, which will ask us for confirmation
before each step:

$ rm -r -i thesis
rm: descend into directory ‘thesis’? y
rm: remove regular file ‘thesis/draft.txt’? y
rm: remove directory ‘thesis’? y

This removes everything in the directory, then the directory itself, asking
at each step for you to confirm the deletion.

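Answering anything other than y at the prompt keeps the file. The sketch below assumes a scratch copy of the thesis directory and pipes the answer n into rm instead of typing it, purely so the example can run unattended; in normal use you would type the answer yourself.

```shell
# Build a scratch thesis directory with one file in it.
workdir=$(mktemp -d)
cd "$workdir"
mkdir thesis
touch thesis/draft.txt

# rm -i reads the answer from standard input.
# Here "n" is piped in, so rm asks and then leaves the file alone.
echo n | rm -i thesis/draft.txt

# draft.txt is still listed.
ls thesis
```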

After running the following commands,
Jamie realizes that she put the files sucrose.dat and maltose.dat
into the wrong folder:

$ ls -F
 analyzed/ raw/
$ ls -F analyzed
fructose.dat glucose.dat maltose.dat sucrose.dat
$ cd raw/

Fill in the blanks to move these files to the current folder
(i.e., the one she is currently in):

$ mv ___/sucrose.dat ___/maltose.dat ___

$ mv ../analyzed/sucrose.dat ../analyzed/maltose.dat .

Recall that .. refers to the parent directory (i.e. one above the current directory)
and that . refers to the current directory.

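Jamie's situation can be rebuilt in a scratch directory to see the relative paths at work. This is a sketch under the assumption that empty placeholder files are good enough stand-ins for her data.

```shell
# Recreate the layout from the exercise with empty placeholder files.
workdir=$(mktemp -d)
cd "$workdir"
mkdir analyzed raw
touch analyzed/fructose.dat analyzed/glucose.dat
touch analyzed/maltose.dat analyzed/sucrose.dat

cd raw
# ".." is the parent of raw, so ../analyzed is the sibling folder.
# "." is raw itself, the destination of the move.
mv ../analyzed/sucrose.dat ../analyzed/maltose.dat .

# raw now holds the two moved files; analyzed keeps the other two.
ls
ls ../analyzed
```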
You may have noticed that all of Nelle's files' names are "something dot
something", and in this part of the lesson, we always used the extension
.txt. This is just a convention: we can call a file mythesis or
almost anything else we want. However, most people use two-part names
most of the time to help them (and their programs) tell different kinds
of files apart. The second part of such a name is called the
filename extension, and indicates
what type of data the file holds: .txt signals a plain text file, .pdf
indicates a PDF document, .cfg is a configuration file full of parameters
for some program or other, .png is a PNG image, and so on.

This is just a convention, albeit an important one. Files contain
bytes: it's up to us and our programs to interpret those bytes
according to the rules for plain text files, PDF documents, configuration
files, images, and so on.

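A quick way to convince yourself that the extension is only part of the name is to copy a file under a different extension and compare the bytes. The sketch below assumes a scratch directory and two made-up file names, notes.txt and notes.mp3.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A small plain text file.
echo "statistical tests to run" > notes.txt

# Copy it under a name with a misleading extension.
cp notes.txt notes.mp3

# cmp exits successfully only if the two files are byte-for-byte identical.
cmp notes.txt notes.mp3 && echo "same bytes, different name"

# The "mp3" still reads as plain text.
cat notes.mp3
```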
Naming a PNG image of a whale as whale.mp3 doesn't somehow
magically turn it into a recording of whalesong, though it might
cause the operating system to try to open it with a music player
when someone double-clicks it.

Suppose that you created a .txt file in your current directory to contain a list of the
statistical tests you will need to do to analyze your data, and named it statstics.txt.
After creating and saving this file you realize you misspelled the filename! You want to
correct the mistake. Which of the following commands could you use to do so?

cp statstics.txt statistics.txt
mv statstics.txt statistics.txt
mv statstics.txt .
cp statstics.txt .

What is the output of the closing ls command in the sequence shown below?

$ pwd
/Users/jamie/data
$ ls
proteins.dat
$ mkdir recombine
$ mv proteins.dat recombine/
$ cp recombine/proteins.dat ../proteins-saved.dat
$ ls

1. proteins-saved.dat recombine
2. recombine
3. proteins.dat recombine
4. proteins-saved.dat

We start in the /Users/jamie/data directory, and create a new folder called recombine.
The mv command then moves the file proteins.dat into the new folder (recombine).
The cp command makes a copy of the file we just moved. The tricky part here is where the file was
copied to. Recall that .. means "go up a level", so the copied file is now in /Users/jamie.
Notice that .. is interpreted with respect to the current working
directory, not with respect to the location of the file being copied.
So, the only thing that will show using ls (in /Users/jamie/data) is the recombine folder.

Option 1 (proteins-saved.dat recombine): no, proteins-saved.dat is located at /Users/jamie.
Option 2 (recombine): yes.
Option 3 (proteins.dat recombine): no, proteins.dat is located at /Users/jamie/data/recombine.
Option 4 (proteins-saved.dat): no, proteins-saved.dat is located at /Users/jamie.

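The sequence can be replayed in a scratch location to confirm where each file ends up. In this sketch the variable base is a hypothetical stand-in for /Users/jamie, and proteins.dat is an empty placeholder.

```shell
# base plays the role of /Users/jamie.
base=$(mktemp -d)
mkdir "$base/data"
cd "$base/data"
touch proteins.dat

mkdir recombine
mv proteins.dat recombine/

# ".." is resolved from the current directory (data), not from recombine,
# so the copy lands one level above data.
cp recombine/proteins.dat ../proteins-saved.dat

# Only recombine is left in data.
ls

# The saved copy sits next to the data directory.
ls ..
```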

Jamie is working on a project and she sees that her files aren't very well
organized:

$ ls -F
analyzed/ fructose.dat raw/ sucrose.dat

The fructose.dat and sucrose.dat files contain output from her data
analysis. What command(s) covered in this lesson does she need to run so that the commands below will
produce the output shown?

$ ls -F
analyzed/ raw/
$ ls analyzed
fructose.dat sucrose.dat

mv *.dat analyzed

Jamie needs to move her files fructose.dat and sucrose.dat to the analyzed directory.
The shell will expand *.dat to match all .dat files in the current directory.
The mv command then moves the list of .dat files to the analyzed directory.

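Because the shell expands the wildcard before mv runs, putting echo in front of the command is a cheap way to preview what will happen. The sketch below assumes a scratch directory with empty placeholder files named as in the exercise.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir analyzed raw
touch fructose.dat sucrose.dat

# echo prints the command line after the shell has expanded *.dat.
# prints: mv fructose.dat sucrose.dat analyzed
echo mv *.dat analyzed

# The real move receives the same expanded list of names.
mv *.dat analyzed

ls -F
ls analyzed
```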

For this exercise, you can test the commands in the data-shell/data directory.

In the example below, what does cp do when given several filenames and a directory name?

$ mkdir backup
$ cp amino-acids.txt animals.txt backup/

In the example below, what does cp do when given three or more file names?

$ ls -F
amino-acids.txt animals.txt backup/ elements/ morse.txt pdb/ planets.txt salmon.txt sunspot.txt
$ cp amino-acids.txt animals.txt morse.txt

If given more than one file name followed by a directory name (i.e. the destination directory must
be the last argument), cp copies the files to the named directory.

If given three file names, cp throws an error because it is expecting a directory
name as the last argument.

cp: target ‘morse.txt’ is not a directory

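Both cases can be reproduced without the data-shell files. This sketch assumes a scratch directory with three empty placeholder files; the exact wording of the error message may differ between systems, so the example only checks that cp reports failure.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch amino-acids.txt animals.txt morse.txt
mkdir backup

# Several sources followed by a directory: each file is copied into it.
cp amino-acids.txt animals.txt backup/
ls backup

# Several sources with a regular file as the last argument:
# cp refuses and exits with a non-zero status.
if ! cp amino-acids.txt animals.txt morse.txt; then
    echo "cp failed because morse.txt is not a directory"
fi
```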

You're starting a new experiment, and would like to duplicate the file
structure from your previous experiment without the data files so you can
add new data.

Assume that the file structure is in a folder called 2016-05-18-data,
which contains a data folder that in turn contains folders named raw and
processed that contain data files. The goal is to copy the file structure
of the 2016-05-18-data folder into a folder called 2016-05-20-data and
remove the data files from the directory you just created.

Which of the following sets of commands would achieve this objective?
What would the other commands do?

$ cp -r 2016-05-18-data/ 2016-05-20-data/
$ rm 2016-05-20-data/data/raw/*
$ rm 2016-05-20-data/data/processed/*

$ rm 2016-05-20-data/data/raw/*
$ rm 2016-05-20-data/data/processed/*
$ cp -r 2016-05-18-data/ 2016-05-20-data/

$ cp -r 2016-05-18-data/ 2016-05-20-data/
$ rm -r -i 2016-05-20-data/

The first set of commands achieves this objective.
First, cp -r makes a recursive copy of the data folder.
Then two rm commands remove all files in the specified directories.
The shell expands the * wildcard to match all files and subdirectories.

The second set of commands is in the wrong order:
it attempts to delete files which haven't yet been copied,
and only then runs the recursive copy command which would copy them.

The third set of commands would achieve the objective, but in a time-consuming way:
the first command copies the directory recursively, but the second command deletes
interactively, prompting for confirmation for each file and directory.
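
The copy-then-clean approach can be tried on a miniature version of the experiment. This sketch assumes the layout described in the exercise (a data folder holding raw and processed) and uses two made-up placeholder files, run1.dat and run1.csv.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for the previous experiment.
mkdir -p 2016-05-18-data/data/raw 2016-05-18-data/data/processed
touch 2016-05-18-data/data/raw/run1.dat
touch 2016-05-18-data/data/processed/run1.csv

# Copy everything, then empty the two data folders in the copy only.
cp -r 2016-05-18-data 2016-05-20-data
rm 2016-05-20-data/data/raw/*
rm 2016-05-20-data/data/processed/*

# The new tree has the same folders but no data files.
ls -R 2016-05-20-data
```

The original experiment is left untouched, since both rm commands name paths inside the new folder.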