\documentclass{article} \usepackage{alltt} \usepackage{fullpage} \usepackage{moreverb} % \usepackage{hyperref} \usepackage{hevea} \ifhevea\@def@charset{UTF-8}\fi \input{local} \fulltrue %\newcommand{\NT}[1]{\(\langle\)\textit{#1}\(\rangle\)} \newcommand{\NT}[1]{\textit{#1}} \newcommand{\ARG}[1]{\texttt{\textit{#1}}} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{document} \ifhevea\begin{rawhtml}
\end{rawhtml}\fi \ifhevea\else\bigskip\fi% \ifdraft% \begin{center}% {\Huge \ifhevea\red\fi DraftDraftDraftDraft}% \end{center}% \ifhevea\else \bigskip \fi \fi \ifhevea\begin{rawhtml}
\end{rawhtml}% \else \thispagestyle{empty} \fi% \SNIP{About Unison}{about}% \iftextversion \section*{Unison File Synchronizer %% \\ %% \ONEURL{http://www.cis.upenn.edu/\home{bcpierce}/unison} \\ Version \unisonversion } \else% \ifhevea\else \vspace*{2in} \fi% \begin{center}% \Huge{\ifhevea\black\else\bf \fi Unison File Synchronizer}% %% \ifhevea \\ \else \\[2ex] \fi %% \large %% \ONEURL{http://www.cis.upenn.edu/\home{bcpierce}/unison} \ifhevea \\ \else \\[2ex] \fi% \huge {\ifhevea\black\else\bf \fi User Manual and Reference Guide}% \ifhevea \\ \else \\[6ex] \fi% \LARGE% Version \unisonversion \\[4ex] % % \today % \large Copyright 1998-2023, Benjamin C. Pierce \end{center}% \fi% % % \ifhevea\begin{rawhtml}
\end{rawhtml}\fi \ifhevea\else\newpage\fi \TABLEOFCONTENTS \ifhevea\else\newpage\fi \SECTION{Overview}{overview}{ } \input{short} \ifhevea\else\bigskip\fi % \begin{quote} % {\bf\ifhevea\red\fi Warning:} The current implementation of Unison is % considered beta-test software. It is in daily use by quite a few % people, but there are still undoubtedly some bugs. If you choose to % use it to synchronize important data, please pay careful attention % to what it is doing! Also, the installation/setup procedure is not % yet as smooth as we want it to be. % \end{quote} \SECTION{Preface}{intro}{ } \TOPSUBSECTION{People}{people} \URL{http://www.cis.upenn.edu/\home{bcpierce}/}{Benjamin Pierce} leads the Unison project. % The current version of Unison was designed and implemented by \URL{http://www.research.att.com/\home{trevor}/}{Trevor Jim}, \URL{http://www.cis.upenn.edu/\home{bcpierce}/}{Benjamin Pierce}, and \URL{http://www.pps.jussieu.fr/\home{vouillon}/}{J\'{e}r\^{o}me Vouillon}, with \URL{http://alan.petitepomme.net/}{Alan Schmitt}, {Malo Denielou}, \URL{http://www.brics.dk/\home{zheyang}/}{Zhe Yang}, Sylvain Gommier, and Matthieu Goulay. % The Mac user interface was started by Trevor Jim and enormously improved by Ben Willmore. % Our implementation of the \URL{http://samba.org/rsync/}{rsync} protocol was built by \URL{http://www.eecs.harvard.edu/\home{nr}/}{Norman Ramsey} and Sylvain Gommier. It is based on \URL{http://samba.anu.edu.au/\home{tridge}/}{Andrew Tridgell}'s \URL{http://samba.anu.edu.au/\home{tridge}/phd\_thesis.pdf}{thesis work} and inspired by his \URL{http://samba.org/rsync/}{rsync} utility. % \finish{Our low-level fingerprinting implementation uses an algorithm % by Michael Rabin and incorporates some coding tricks from Andrei % Broder and Mike Burrows.} % The mirroring and merging functionality was implemented by Sylvain Roy, improved by Malo Denielou, and improved yet further by St\'ephane Lescuyer. 
%
\URL{http://wwwfun.kurims.kyoto-u.ac.jp/\home{garrigue}/}{Jacques Garrigue}
contributed the original Gtk version of the user interface; the Gtk2
version was built by Stephen Tse.
%
Sundar Balasubramaniam helped build a prototype implementation of
an earlier synchronizer in Java.
\URL{http://www.cis.upenn.edu/\home{ishin}/}{Insik Shin} and
\URL{http://www.cis.upenn.edu/\home{lee}/}{Insup Lee} contributed design
ideas to this implementation.
\URL{http://research.microsoft.com/\home{fournet}/}{Cedric Fournet}
contributed to an even earlier prototype.

\TOPSUBSECTION{Obtaining Unison}{obtaining}

\paragraph{Source code}

Unison is primarily distributed as source code, which contains
instructions in {\tt INSTALL.md}:
\begin{quote}
\ONEURL{https://github.com/bcpierce00/unison}
\end{quote}

\paragraph{Binaries}

The Unison wiki contains information about builds done as part of
Continuous Integration and other sources of binaries; read the entire
wiki at:
\begin{quote}
\ONEURL{https://github.com/bcpierce00/unison/wiki}
\end{quote}

\TOPSUBSECTION{Community, Maintenance, and Development}{development}

Many people use and contribute to Unison. This community has two main homes.

\paragraph{Mailing lists}

Most discussion is appropriate on one of the mailing lists:
\begin{quote}
\ONEURL{https://github.com/bcpierce00/unison/wiki/Mailing-Lists}
\end{quote}

\paragraph{Reporting Bugs}

Bug reports and feature requests may be made after reading the guidelines:
\begin{quote}
\ONEURL{https://github.com/bcpierce00/unison/wiki/Reporting-Bugs-and-Feature-Requests}
\end{quote}

Help improving Unison is welcome; see {\tt CONTRIBUTING.md} in the sources.

\TOPSUBSECTION{Copying}{copying}

This file is part of Unison.

Unison is free software: you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation, either version 3 of the License, or (at your option)
any later version.
Unison is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. The GNU General Public License can be found at \ONEURL{http://www.gnu.org/licenses}. A copy is also included in the Unison source distribution in the file {\tt COPYING}. \TOPSUBSECTION{Acknowledgements}{ack} Work on Unison has been supported by the National Science Foundation under grants CCR-9701826 and ITR-0113226, {\em Principles and Practice of Synchronization}, and by University of Pennsylvania's Institute for Research in Cognitive Science (IRCS). \SECTION{Upgrading}{upgrading}{upgrading} (This section is perhaps misplaced, but is early because it is far better to have at least skimmed it than to not know it exists.) Before upgrading, it is a good idea to run the {\em old} version one last time, to make sure all your replicas are completely synchronized. A new version of Unison will sometimes introduce a different format for the archive files used to remember information about the previous state of the replicas. In this case, the old archive will be ignored (not deleted --- if you roll back to the previous version of Unison, you will find the old archives intact), which means that any differences between the replicas will show up as conflicts that need to be resolved manually. As of version 2.52, Unison has a degree of backward and forward compatibility. This means three things. First, it is possible for local and remote machines to run a different version of Unison. Second, it is possible for local and remote machines to run a version (same or different) of Unison built with a different version of OCaml compiler (this has been problematic historically). Lastly, it is possible to upgrade Unison on the local machine (compiled with any OCaml version) and keep the existing archive. 
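
As a quick check before upgrading, it can help to note which versions are
currently installed on both machines. The following is only a sketch: it
assumes {\tt ssh} access to the other machine, and {\tt remotehostname} is a
placeholder.
\begin{verbatim}
   unison -version                      # on the local machine
   ssh remotehostname unison -version   # on the remote machine
\end{verbatim}
In versions that report it, the output also names the OCaml compiler used
for the build, which matters for the compatibility rules described in this
section.
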
{\bf Upgrading from Unison 2.48 or 2.51}: If the version interoperability
requirements are followed, Unison 2.52 up to 2.53.8 can upgrade an
archive created by Unison 2.48 to 2.51. To avoid rebuilding archive files
when upgrading from a version older than 2.52, you must install version
2.52 or newer (but no newer than 2.53.8) built with the same OCaml version
as your previous version of Unison, and then run it at least once on each
root. Doing so will upgrade the archive file. Upgrading directly to a
version newer than 2.53.8 is not supported; upgrade first to a version
between 2.52 and 2.53.8 if you want to keep the archives.

After upgrading the archive, you are free to swap the Unison 2.52 or newer
executable for one compiled with a different version of OCaml. The archive
file is no longer dependent on the compiler version.

\SUBSECTION{Version interoperability}{interoperability}

To ensure interoperability with different Unison versions on local and
remote machines, and to upgrade from an earlier version {\em without
rebuilding the archive files}, follow these guidelines. Upgrading from an
incompatible version, while possible and normal, will require fully
scanning both roots, which can be time-consuming with big replicas.

{\bf Unison 2.52 and newer} are compatible with:
\begin{itemize}
\item {\em Unison 2.52 or newer} (for as long as backwards compatibility is
  maintained in the newer versions). You do not have to pay any attention
  to OCaml compiler versions.
\end{itemize}

Additionally, {\bf Unison 2.52 up to 2.53.8 (included)} are compatible with:
\begin{itemize}
\item {\em Unison 2.51} if both versions are compiled with the same OCaml
  compiler version (you can see which compiler version was used by running
  {\tt unison -version}).
\item {\em Unison 2.48} if both versions are compiled with the same OCaml
  compiler version. See special notes below.
\end{itemize}

\vspace{1em}
\noindent
{\bf Interoperability matrix} for quick reference:

\vspace{1em}
\begin{tabular}{r||c|c|c}
Client versions & \multicolumn{3}{c}{Server versions} \\
 & newer than 2.53.8 & 2.52 to 2.53.8 & older than 2.52 \\
\hline \hline
newer than 2.53.8 & full interop & full interop & {\bf no interop} \\
\hline
2.52 to 2.53.8 & full interop & full interop & {\em see below} \\
\hline
older than 2.52 & {\bf no interop} & {\em see below} & {\em see below} \\
\end{tabular}

\vspace{1em}
\begin{tabular}{r||c|c|c}
Client versions & \multicolumn{3}{c}{Server versions} \\
 & 2.52 to 2.53.8 & 2.51 & 2.48 \\
\hline \hline
2.52 to 2.53.8 & full interop & same OCaml version & same OCaml version \\
\hline
2.51 & same OCaml version & full interop & no interop \\
\hline
2.48 & same OCaml version* & no interop & full interop \\
\end{tabular}

\vspace{2em}
\noindent
{\it Special notes for Unison 2.48:}
\begin{itemize}
\item Unison 2.48 does not show which OCaml compiler was used to compile
  it. If you do not have the option of re-compiling the 2.48 version, you
  have two alternatives. First (and most likely to succeed), find the
  version of the OCaml compiler in the package repository from which you
  installed Unison 2.48, then use Unison 2.52 compiled with that version.
  Second, you can just try Unison 2.52 executables compiled with different
  OCaml versions and see which one works with your copy of Unison 2.48.
\item When running Unison 2.48 on the client machine with Unison 2.52 or
  newer on the server machine, you have to do some additional
  configuration. The Unison executable name on the server must start with
  \verb|unison-2.48| (plain \verb|unison-2.48| is fine, as are
  \verb|unison-2.48.exe| and \verb|unison-2.48+ocaml-4.05|). If using a TCP
  socket connection to the server then you're all set!
If using {\tt ssh} then you have to add one of the following options to your profile or as a command-line argument on the client machine: \verb|-addversionno|; see \sectionref{remote}{Remote Usage}, or \verb|-servercmd|; see \sectionref{rshmeth}{Remote Shell Method}. \end{itemize} \SECTION{Tutorial}{tutorial}{tutorial} %\finish{Put a pointer somewhere in here to the typical profile in the % reference section.} \SUBSECTION{Preliminaries}{prelim} Unison can be used with either of two user interfaces: \begin{enumerate} \item a textual interface and \item a graphical interface \end{enumerate} The textual interface is more convenient for running from scripts and works on dumb terminals; the graphical interface is better for most interactive use. For this tutorial, you can use either. If you are running Unison from the command line, just typing {\tt unison} will select either the text or the graphical interface, depending on which has been selected as default when the executable you are running was built. You can force the text interface even if graphical is the default by adding {\tt -ui text}. The other command-line arguments to both versions are identical. The graphical version can also be run directly by clicking on its icon. For this tutorial, we assume that you're starting it from the command line. Unison can synchronize files and directories on a single machine, or between two machines on a network. (The same program runs on both machines; the only difference is which one is responsible for displaying the user interface.) If you're only interested in a single-machine setup, then let's call that machine the \CLIENT{}. If you're synchronizing two machines, let's call them \CLIENT{} and \SERVER. \SUBSECTION{Local Usage}{local} Let's get the client machine set up first and see how to synchronize two directories on a single machine. Ensure that unison is installed on your system. 
Create a small test directory {\tt a.tmp} containing a couple of files
and/or subdirectories, e.g.,
\begin{verbatim}
       mkdir a.tmp
       touch a.tmp/a a.tmp/b
       mkdir a.tmp/d
       touch a.tmp/d/f
\end{verbatim}
Copy this directory to {\tt b.tmp}:
\begin{verbatim}
       cp -r a.tmp b.tmp
\end{verbatim}

Now try synchronizing {\tt a.tmp} and {\tt b.tmp}. (Since they are
identical, synchronizing them won't propagate any changes, but Unison
will remember the current state of both directories so that it will be
able to tell next time what has changed.) Type:
\begin{verbatim}
       unison a.tmp b.tmp
\end{verbatim}
(You may need to add \verb|-ui text|, depending on how your unison binary
was built.)

\begin{textui}
You should see a message notifying you that all the files are actually
equal and then get returned to the command line.
\end{textui}

\begin{tkui}
You should get a big empty window with a message at the bottom
notifying you that all files are identical. Choose the Exit item from
the File menu to get back to the command line.
\end{tkui}

Next, make some changes in {\tt a.tmp} and/or {\tt b.tmp}. For example:
\begin{verbatim}
       rm a.tmp/a
       echo "Hello" > a.tmp/b
       echo "Hello" > b.tmp/b
       date > b.tmp/c
       echo "Hi there" > a.tmp/d/h
       echo "Hello there" > b.tmp/d/h
\end{verbatim}
Run Unison again:
\begin{verbatim}
       unison a.tmp b.tmp
\end{verbatim}

This time, the user interface will display only the files that have
changed. If a file has been modified in just one replica, then it will
be displayed with an arrow indicating the direction that the change
needs to be propagated. For example,
\begin{verbatim}
                 <---  new file   c  [f]
\end{verbatim}
\noindent
indicates that the file {\tt c} has been modified only in the second
replica, and that the default action is therefore to propagate the new
version to the first replica. To {\bf f}ollow Unison's recommendation,
press the ``f'' at the prompt.
If both replicas are modified and their contents are different, then the changes are in conflict: \texttt{<-?->} is displayed to indicate that Unison needs guidance on which replica should override the other. \begin{verbatim} new file <-?-> new file d/h [] \end{verbatim} By default, neither version will be propagated and both replicas will remain as they are. If both replicas have been modified but their new contents are the same (as with the file {\tt b}), then no propagation is necessary and nothing is shown. Unison simply notes that the file is up to date. These display conventions are used by both versions of the user interface. The only difference lies in the way in which Unison's default actions are either accepted or overridden by the user. \begin{textui} The status of each modified file is displayed, in turn. When the copies of a file in the two replicas are not identical, the user interface will ask for instructions as to how to propagate the change. If some default action is indicated (by an arrow), you can simply press Return to go on to the next changed file. If you want to do something different with this file, press ``\verb|<|'' or ``\verb|>|'' to force the change to be propagated from right to left or from left to right, or else press ``\verb|/|'' to skip this file and leave both replicas alone. When it reaches the end of the list of modified files, Unison will ask you one more time whether it should proceed with the updates that have been selected. When Unison stops to wait for input from the user, pressing ``\verb|?|'' will always give a list of possible responses and their meanings. \end{textui} \begin{tkui} The main window shows all the files that have been modified in either {\tt a.tmp} or {\tt b.tmp}. To override a default action (or to select an action in the case when there is no default), first select the file, either by clicking on its name or by using the up- and down-arrow keys. 
Then press either the left-arrow or ``\verb|<|'' key (to cause the version in b.tmp to propagate to a.tmp) or the right-arrow or ``\verb|>|'' key (which makes the a.tmp version override b.tmp). Every keyboard command can also be invoked from the menus at the top of the user interface. (Conversely, each menu item is annotated with its keyboard equivalent, if it has one.) When you are satisfied with the directions for the propagation of changes as shown in the main window, click the ``Go'' button to set them in motion. A check sign will be displayed next to each filename when the file has been dealt with. \end{tkui} \SUBSECTION{Remote Usage}{remote} Next, we'll get Unison set up to synchronize replicas on two different machines. NB: Unison has not been designed to run with elevated privileges (e.g. setuid), and it has not been audited for that environment. Therefore Unison should be run with the userid of the owner of the files to be synchronized, and should never be run setuid or similar. (Problems encountered when running setuid etc. must be reproduced without setuid before being reported as bugs.) Follow the instructions in the Installation section to download or build an executable version of Unison on the server machine, and install it somewhere on your search path. (It doesn't matter whether you install the textual or graphical version, since the copy of Unison on the server doesn't need to display any user interface at all. The major benefit of installing the textual version is that it doesn't have any external dependencies required by the GUI executable.) It is important that the version of Unison installed on the server machine is the same as the version of Unison on the client machine. But some flexibility on the version of Unison at the client side can be achieved by using the \verb|-addversionno| option; see \sectionref{prefs}{Preferences}. Now there is a decision to be made. 
Unison provides three methods for communicating between the client and the
server:
\begin{itemize}
\item {\em Remote shell method}: To use this method, you must have some way
  of invoking remote commands on the server from the client's command line,
  using a facility such as \verb|ssh|. This method is more convenient (since
  there is no need to manually start a ``unison server'' process on the
  server) and also more secure (assuming you are using \verb|ssh|).
\item {\em TCP socket method}: This method requires only that you can get
  TCP packets from the client to the server and back. It is insecure and
  should not be used.
\item {\em Unix socket method}: This method only works within a single
  machine. It is similar to the TCP sockets method, but it is possible to
  configure it securely.
\end{itemize}

Decide which of these you want to try, and continue with
\sectionref{rshmeth}{Remote Shell Method} or
\sectionref{socketmeth}{Socket Method}, as appropriate.

\SUBSECTION{Remote Shell Method}{rshmeth}

The standard remote shell facility on Unix systems is \verb|ssh|.

Running \verb|ssh| requires some coordination between the client and
server machines to establish that the client is allowed to invoke commands
on the server; please refer to the \verb|ssh| documentation for
information on how to set this up.

First, test that we can invoke Unison on the server from the client.
Typing
\begin{alltt}
        ssh \NT{remotehostname} unison -version
\end{alltt}
should print the same version information as running
\begin{verbatim}
        unison -version
\end{verbatim}
locally on the client. If remote execution fails, then either something is
wrong with your ssh setup (e.g., ``permission denied'') or else the search
path that's being used when executing commands on the server doesn't
contain the \verb|unison| executable (e.g., ``command not found'').

Create a test directory {\tt a.tmp} in your home directory on the client
machine.
Test that the local unison client can start and connect to the remote server. Type \begin{alltt} unison -testServer a.tmp ssh://\NT{remotehostname}/a.tmp \end{alltt} Now cd to your home directory and type: \begin{verbatim} unison a.tmp ssh://remotehostname/a.tmp \end{verbatim} The result should be that the entire directory {\tt a.tmp} is propagated from the client to your home directory on the server. After finishing the first synchronization, change a few files and try synchronizing again. You should see similar results as in the local case. If your user name on the server is not the same as on the client, you need to specify it on the command line: \begin{verbatim} unison a.tmp ssh://username@remotehostname/a.tmp \end{verbatim} \noindent {\it Notes:} \begin{itemize} \item If you want to put \verb|a.tmp| some place other than your home directory on the remote host, you can give an absolute path for it by adding an extra slash between \verb|remotehostname| and the beginning of the path: \begin{verbatim} unison a.tmp ssh://remotehostname//absolute/path/to/a.tmp \end{verbatim} \item You can give an explicit path for the \verb|unison| executable on the server by using the command-line option \showtt{-servercmd /full/path/name/of/unison} or adding \showtt{servercmd=/full/path/name/of/unison} to your profile (see \sectionref{profile}{Profiles}). Similarly, you can specify an explicit path for the \verb|ssh| program using the \showtt{-sshcmd} option. Extra arguments can be passed to \verb|ssh| by setting the \verb|-sshargs| preference. \item By leveraging \showtt{-sshcmd} and \showtt{-sshargs}, you can effectively use any remote shell program, not just \verb|ssh|; just remember that the roots are still specified with \verb|ssh| as the protocol, that is, they have to start with \showtt{ssh://}. \end{itemize} \SUBSECTION{Socket Method}{socketmeth} To run Unison over a socket connection, you must start a Unison daemon process on the server. 
This process runs continuously, waiting for connections over a given socket from client machines running Unison and processing their requests in turn. Since the socket method is not used by many people, its functionality is rather limited. For example, the server can only deal with one client at a time. Note that the Unison daemon process is always started with a command-line argument; not from a profile. \SUBSUBSECTION{TCP Sockets}{socket-tcp} \begin{quote} {\bf\ifhevea\red\fi Warning:} The TCP socket method is insecure: not only are the texts of your changes transmitted over the network in unprotected form, it is also possible for anyone in the world to connect to the server process and read out the contents of your filesystem! (Of course, to do this they must understand the protocol that Unison uses to communicate between client and server, but all they need for this is a copy of the Unison sources.) The socket method is provided only for expert users with specific needs; everyone else should use the \verb|ssh| method. \end{quote} To start the daemon for connections over a TCP socket, type \begin{verbatim} unison -socket NNNN \end{verbatim} on the server machine, where {\tt NNNN} is the TCP port number that the daemon should listen on for connections from clients. ({\tt NNNN} can be any large number that is not being used by some other program; if \texttt{NNNN} is already in use, Unison will exit with an error message.) Create a test directory {\tt a.tmp} in your home directory on the client machine. Now type: \begin{alltt} unison a.tmp socket://\NT{remotehostname}:NNNN/a.tmp \end{alltt} Note that paths specified by the client will be interpreted relative to the directory in which you start the server process; this behavior is different from the ssh case, where the path is relative to your home directory on the server. 
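
To make the difference concrete, here is a small sketch. The directory
{\tt /srv/sync} and the port {\tt 5000} are arbitrary choices made for
illustration, not defaults. On the server:
\begin{verbatim}
   cd /srv/sync
   unison -socket 5000
\end{verbatim}
On the client:
\begin{verbatim}
   unison a.tmp socket://remotehostname:5000/a.tmp
\end{verbatim}
Here the remote root {\tt a.tmp} refers to {\tt /srv/sync/a.tmp} on the
server, not to {\tt a.tmp} in your home directory there.
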
%
The result should be that the entire directory {\tt a.tmp} is propagated
from the client to the server (\texttt{a.tmp} will be created on the
server in the directory that the server was started from).
%
After finishing the first synchronization, change a few files and try
synchronizing again. You should see results similar to those in the local
case.

By default Unison will listen for incoming connections on all interfaces.
If you want to limit this to certain interfaces or addresses then you can
use the {\tt -listen} command-line argument, specifying a host name or
an IP address to listen on. {\tt -listen} can be given multiple times to
listen on several addresses.

\SUBSUBSECTION{Unix Domain Sockets}{socket-unix}

To start the daemon for connections over a Unix domain socket, type
\begin{verbatim}
       unison -socket PPPP
\end{verbatim}
where {\tt PPPP} is the path to a Unix socket that the daemon should open
for connections from clients. ({\tt PPPP} can be any absolute or relative
path the server process has access to but it must not exist yet; the
socket is created at that path when the daemon process is started.)
You are responsible for securing access to the socket path. For example,
this can be done by controlling the permissions of the socket's parent
directory, or by ensuring a restrictive {\tt umask} value when starting
Unison.

Clients can connect to a server over a Unix domain socket by specifying
the absolute or relative path to the socket, instead of a server address
and port number:
\begin{alltt}
       unison a.tmp socket://\textrm{\{}\NT{path/to/unix/socket}\textrm{\}}/a.tmp
\end{alltt}
(The socket path is enclosed in curly braces.)

Note that Unix domain sockets are local sockets (they exist in the
filesystem namespace). A Unix domain socket can still be used remotely by
forwarding access to it by other means, for example with the {\tt spiped}
secure pipe daemon.
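
As an illustration of securing the socket path, the following sketch keeps
the socket in a directory that only its owner can enter, and also starts
the daemon with a restrictive {\tt umask}. The directory name
\verb|$HOME/.unison-sock| and the socket name {\tt sock} are hypothetical;
any path the server process can create will do.
\begin{verbatim}
   mkdir -m 700 "$HOME/.unison-sock"
   (umask 077; unison -socket "$HOME/.unison-sock/sock")
\end{verbatim}
A client run by the same user on the same machine could then connect with
\begin{verbatim}
   unison a.tmp "socket://{$HOME/.unison-sock/sock}/a.tmp"
\end{verbatim}
The double quotes let the shell expand \verb|$HOME| while passing the curly
braces through to Unison unchanged.
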
\SUBSECTION{Using Unison for All Your Files}{usingit} Once you are comfortable with the basic operation of Unison, you may find yourself wanting to use it regularly to synchronize your commonly used files. There are several possible ways of going about this: \begin{enumerate} \item Synchronize your whole home directory, using the Ignore facility (see \sectionref{ignore}{Ignoring Paths}) to avoid synchronizing temporary files and things that only belong on one host. \item Create a subdirectory called {\tt shared} (or {\tt current}, or whatever) in your home directory on each host, and put all the files you want to synchronize into this directory. \item Make your home directory the root of the synchronization, but tell Unison to synchronize only some of the files and subdirectories within it on any given run. This can be accomplished by using the {\tt -path} switch on the command line: \begin{alltt} unison /home/\NT{username} ssh://\NT{remotehost}//home/\NT{username} -path shared \end{alltt} The {\tt -path} option can be used as many times as needed, to synchronize several files or subdirectories: \begin{alltt} unison /home/\NT{username} ssh://\NT{remotehost}//home/\NT{username} \verb|\| -path shared \verb|\| -path pub \verb|\| -path .netscape/bookmarks.html \end{alltt} These \verb|-path| arguments can also be put in your preference file. See \sectionref{prefs}{Preferences} for an example. \end{enumerate} Most people find that they only need to maintain a profile (or profiles) on one of the hosts that they synchronize, since Unison is always initiated from this host. (For example, if you're synchronizing a laptop with a fileserver, you'll probably always run Unison on the laptop.) This is a bit different from the usual situation with asymmetric mirroring programs like \verb|rdist|, where the mirroring operation typically needs to be initiated from the machine with the most recent changes. 
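
As a sketch of what such a profile might contain, the following is roughly
equivalent to the multi-path command line shown above. The profile name
{\tt home.prf}, the user name, and the host name are placeholders, and the
file is assumed to live in the \verb|.unison| directory in your home
directory on the host where you start Unison.
\begin{verbatim}
   # home.prf (hypothetical profile name)
   root = /home/username
   root = ssh://remotehost//home/username
   path = shared
   path = pub
   path = .netscape/bookmarks.html
\end{verbatim}
With this file in place, running {\tt unison home} on that host should
synchronize the three listed paths without any further arguments.
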
\sectionref{profile}{Profiles} covers the syntax of Unison profiles, together with some sample profiles. \SUBSECTION{Using Unison to Synchronize More Than Two Machines}{usingmultiple} Unison is designed for synchronizing pairs of replicas. However, it is possible to use it to keep larger groups of machines in sync by performing multiple pairwise synchronizations. If you need to do this, the most reliable way to set things up is to organize the machines into a ``star topology,'' with one machine designated as the ``hub'' and the rest as ``spokes,'' and with each spoke machine synchronizing only with the hub. The big advantage of the star topology is that it eliminates the possibility of confusing ``spurious conflicts'' arising from the fact that a separate archive is maintained by Unison for every pair of hosts that it synchronizes. \SUBSECTION{Going Further}{further} On-line documentation for the various features of Unison can be obtained either by typing \begin{verbatim} unison -doc topics \end{verbatim} \noindent at the command line, or by selecting the Help menu in the graphical user interface. \iftextversion The same information is also available in a typeset User's Manual (HTML or PostScript format) through \ONEURL{https://github.com/bcpierce00/unison/wiki}. \else The on-line information and the printed manual are essentially identical. \fi If you use Unison regularly, you should subscribe to one of the mailing lists, to receive announcements of new versions. See \sectionref{obtaining}{Obtaining Unison}. \SECTION{Basic Concepts}{basics}{basics} To understand how Unison works, it is necessary to discuss a few straightforward concepts. These concepts are developed more rigorously and at more length in a number of papers, available at \ONEURL{http://www.cis.upenn.edu/\home{bcpierce}/papers}. But the informal presentation here should be enough for most users. 
\SUBSECTION{Roots}{roots}

A replica's {\em root} tells Unison where to find a set of files to be
synchronized, either on the local machine or on a remote host. For example,
\begin{alltt}
      \NT{relative/path/of/root}
\end{alltt}
\noindent
specifies a local root relative to the directory where Unison is started,
while
\begin{alltt}
      /\NT{absolute/path/of/root}
\end{alltt}
\noindent
specifies a root relative to the top of the local filesystem, independent
of where Unison is running. Remote roots can begin with \verb|ssh://| to
indicate that the remote server should be started with ssh:
\begin{alltt}
      ssh://\NT{remotehost}//\NT{absolute/path/of/root}
      ssh://\NT{user}@\NT{remotehost}/\NT{relative/path/of/root}
\end{alltt}

If the remote server is already running (in the socket mode), then the
syntax
\begin{alltt}
      socket://\NT{remotehost}:\NT{portnum}//\NT{absolute/path/of/root}
      socket://\NT{remotehost}:\NT{portnum}/\NT{relative/path/of/root}
      socket://[\NT{IPv6literal}]:\NT{portnum}/\NT{path}
\end{alltt}
\noindent
is used to specify the hostname and the port that the client Unison should
use to contact it. Syntax
\begin{alltt}
      socket://\textrm{\{}\NT{path/of/socket}\textrm{\}}//\NT{absolute/path/of/root}
      socket://\textrm{\{}\NT{path/of/socket}\textrm{\}}/\NT{relative/path/of/root}
\end{alltt}
\noindent
is used to specify the Unix domain socket the client Unison should use to
contact the server.

The syntax for roots is based on that of URIs (described in RFC 2396).
The full grammar is:
\begin{alltt}
  \NT{replica} ::= [\NT{protocol}:]//[\NT{user}@][\NT{host}][:\NT{port}][/\NT{path}]
           |  \NT{path}

  \NT{protocol} ::= file
            |  socket
            |  ssh

  \NT{user} ::= [-\_a-zA-Z0-9\%@]+

  \NT{host} ::= [-\_a-zA-Z0-9.]+
        |  \textrm{\textbackslash}[ [a-f0-9:.]+ \NT{zone}? \textrm{\textbackslash}]     IPv6 literals (no future format).
        |  \textrm{\{} [\^{}\textrm{\}}]+ \textrm{\}}                  For Unix domain sockets only.

  \NT{zone} ::= \%[-\_a-zA-Z0-9~\%.]+

  \NT{port} ::= [0-9]+
\end{alltt}

When \verb|path| is given without any protocol prefix, the protocol is
assumed to be \verb|file:|. Under Windows, it is possible to synchronize
with a remote directory using the \verb|file:| protocol over the Windows
Network Neighborhood. For example,
\begin{verbatim}
       unison foo //host/drive/bar
\end{verbatim}
\noindent
synchronizes the local directory \verb|foo| with the directory
\verb|drive:\bar| on the machine \verb|host|, provided that \verb|host| is
accessible via Network Neighborhood. When the \verb|file:| protocol is used
in this way, there is no need for a Unison server to be running on the
remote host. However, running Unison this way is only a good idea if the
remote host is reached by a very fast network connection, since the full
contents of every file in the remote replica will have to be transferred
to the local machine to detect updates.

The names of roots are {\em canonized} by Unison before it uses them to
compute the names of the corresponding archive files, so
{\tt //saul//home/bcpierce/common} and {\tt //saul.cis.upenn.edu/common}
will be recognized as the same replica under different names.

\SUBSECTION{Paths}{paths}

A {\em path} refers to a point {\em within} a set of files being
synchronized; it is specified relative to the root of the replica.

Formally, a path is just a sequence of names, separated by \verb|/|. Note
that the path separator character is always a forward slash, no matter
what operating system Unison is running on. Forward slashes are converted
to backslashes as necessary when paths are converted to filenames in the
local filesystem on a particular host.
%
(For example, suppose that we run Unison on a Windows system, synchronizing
the local root \verb|c:\pierce| with the root
\verb|ssh://saul.cis.upenn.edu/home/bcpierce| on a Unix server.
Then the path \verb|current/todo.txt| refers to the file \verb|c:\pierce\current\todo.txt| on the client and \verb|/home/bcpierce/current/todo.txt| on the server.) The empty path (i.e., the empty sequence of names) denotes the whole replica. Unison displays the empty path as ``\verb|[root]|.'' If \verb|p| is a path and \verb|q| is a path beginning with \verb|p|, then \verb|q| is said to be a {\em descendant} of \verb|p|. (Each path is also a descendant of itself.) \SUBSECTION{What is an Update?}{updates} The {\em contents} of a path \verb|p| in a particular replica could be a file, a directory, a symbolic link, or absent (if \verb|p| does not refer to anything at all in that replica). More specifically: \begin{itemize} \item If \verb|p| refers to an ordinary file, then the contents of \verb|p| are the actual contents of this file (a string of bytes) plus the current permission bits of the file. \item If \verb|p| refers to a symbolic link, then the contents of \verb|p| are just the string specifying where the link points. \item If \verb|p| refers to a directory, then the contents of \verb|p| are just the token ``DIRECTORY'' plus the current permission bits of the directory. \item If \verb|p| does not refer to anything in this replica, then the contents of \verb|p| are the token ``ABSENT.'' \end{itemize} Unison keeps a record of the contents of each path after each successful synchronization of that path (i.e., it remembers the contents at the last moment when they were the same in the two replicas). We say that a path is {\em updated} (in some replica) if its current contents are different from its contents the last time it was successfully synchronized. Note that whether a path is updated has nothing to do with its last modification time---Unison considers only the contents when determining whether an update has occurred. This means that touching a file without changing its contents will {\em not} be recognized as an update. 
A file can even be changed several times and then changed back to its original contents; as long as Unison is only run at the end of this process, no update will be recognized. What Unison actually calculates is a close approximation to this definition; see \sectionref{caveats}{Caveats and Shortcomings}. \SUBSECTION{What is a Conflict?}{conflicts} A path is said to be {\em conflicting} if the following conditions all hold: \begin{enumerate} \item it has been updated in one replica, \item it or any of its descendants has been updated in the other replica, and \item its contents in the two replicas are not identical. \end{enumerate} \finishlater{Note that this isn't precisely what we implement, in the case of directory permission changes!} \SUBSECTION{Reconciliation}{recon} Unison operates in several distinct stages: \begin{enumerate} \item On each host, it compares its archive file (which records the state of each path in the replica when it was last synchronized) with the current contents of the replica, to determine which paths have been updated. \item It checks for ``false conflicts'' --- paths that have been updated on both replicas, but whose current values are identical. These paths are silently marked as synchronized in the archive files in both replicas. \item It displays all the updated paths to the user. For updates that do not conflict, it suggests a default action (propagating the new contents from the updated replica to the other). Conflicting updates are just displayed. The user is given an opportunity to examine the current state of affairs, change the default actions for nonconflicting updates, and choose actions for conflicting updates. \item It performs the selected actions, one at a time. Each action is performed by first transferring the new contents to a temporary file on the receiving host, then atomically moving them into place. \item It updates its archive files to reflect the new state of the replicas. 
\end{enumerate} \TOPSUBSECTION{Invariants}{failures} Given the importance and delicacy of the job that it performs, it is important to understand both what a synchronizer does under normal conditions and what can happen under unusual conditions such as system crashes and communication failures. % Unison deals with two sorts of information: the two replicas % themselves and its own memory of the ``last synchronized state'' of % each path in the replicas. The latter is what allows it to detect % correctly which replica is new when a file been updated. Roughly, % the sequence of actions that occur when Unison runs is: % \begin{enumerate} % \item It reads a private archive file stored with each replica % and checks which paths on each replica have been updated. % Technically, a path has been updated if its contents in a replica are % different from the contents of that replica at the end of the last % synchronization in which that path was successfully synchronized --- % i.e., the last time the two replicas were equal at that path at the % end of a run of Unison. The ``contents'' of a path can be either a % file, a directory, or nothing at all, so deleting a file or changing a % directory to a file count as updates to the contents at that path. % For efficiency, Unison does not try to calculate the set of updated % paths exactly: it will sometimes falsely detect a change in a path % whose contents have actually not changed (this can happen, for % example, when the file's modification time has been changed, for some % reason). As long as this path has not been modified in the other % replica, this ``conservativity'' in update detection is invisible to % the user. If the other replica {\em has} been modified, however, a % ``false conflict'' may be reported. 
% \item It combines the lists of paths that (may) have been updated in
% the two replicas, assigns default actions to those where the change
% was in one replica only, and records a conflict for those that were
% changed in both replicas.
% \item The current contents of the paths on this list are then
% compared, to see if they actually differ. (This is done by comparing
% fingerprints, not transferring the whole files.) Paths whose contents
% are actually identical are marked as synchronized and deleted from the
% list.
% \item The remaining paths are displayed to the user, who then has an
% opportunity to change the default actions and choose actions for
% conflicting paths.
% \item When this process is finished, the selected changes are actually
% propagated between the replicas.
% \item Finally, Unison updates its internal state, marking as
% synchronized all the files for which changes were successfully
% propagated.
% \end{enumerate}

Unison is careful to protect both its internal state and the state of the replicas at every point in this process. Specifically, the following guarantees are enforced:
\begin{itemize}
\item At every moment, each path in each replica has either (1) its {\em original} contents (i.e., no change at all has been made to this path), or (2) its {\em correct} final contents (i.e., the value that the user expected to be propagated from the other replica).
\item At every moment, the information stored on disk about Unison's private state can be either (1) unchanged, or (2) updated to reflect those paths that have been successfully synchronized.
\end{itemize}
The upshot is that it is safe to interrupt Unison at any time, either manually or accidentally. [Caveat: the above is {\em almost} true; there are occasionally brief periods where it is not (and, because of shortcomings of the Posix filesystem API, cannot be); in particular, when it is copying a file onto a directory or vice versa, it must first move the original contents out of the way.
If Unison gets interrupted during one of these periods, some manual cleanup may be required. In this case, a file called {\tt DANGER.README} will be left in the {\tt .unison} directory, containing information about the operation that was interrupted. The next time you try to run Unison, it will notice this file and warn you about it.] If an interruption happens while it is propagating updates, then there may be some paths for which an update has been propagated but which have not been marked as synchronized in Unison's archives. This is no problem: the next time Unison runs, it will detect changes to these paths in both replicas, notice that the contents are now equal, and mark the paths as successfully updated when it writes back its private state at the end of this run. If Unison is interrupted, it may sometimes leave temporary working files (with suffix \verb|.tmp|) in the replicas. It is safe to delete these files. Also, if the \verb|backups| flag is set, Unison will leave around old versions of files that it overwrites, with names like \verb|file.0.unison.bak|. These can be deleted safely when they are no longer wanted. Unison is not bothered by clock skew between the different hosts on which it is running. It only performs comparisons between timestamps obtained from the same host, and the only assumption it makes about them is that the clock on each system always runs forward. If Unison finds that its archive files have been deleted (or that the archive format has changed and they cannot be read, or that they don't exist because this is the first run of Unison on these particular roots), it takes a conservative approach: it behaves as though the replicas had both been completely empty at the point of the last synchronization. The effect of this is that, on the first run, files that exist in only one replica will be propagated to the other, while files that exist in both replicas but are unequal will be marked as conflicting. 
Touching a file without changing its contents should never affect whether or not Unison does an update. (When running with the \verb|fastcheck| preference set to true---the default on Unix systems---Unison uses file modtimes for a quick first pass to tell which files have definitely not changed; then, for each file that might have changed, it computes a fingerprint of the file's contents and compares it against the last-synchronized contents. Also, the \verb|-times| option allows you to synchronize file times, but it does not cause identical files to be changed; Unison will only modify the file times.)

It is safe to ``brainwash'' Unison by deleting its archive files {\em on both replicas}. The next time it runs, it will assume that all the files it sees in the replicas are new.

It is safe to modify files while Unison is working. If Unison discovers that it has propagated an out-of-date change, or that the file it is updating has changed on the target replica, it will signal a failure for that file. Run Unison again to propagate the latest change. \finishlater{There are some race conditions. We should probably talk about them.}

Changes to the ignore patterns from the user interface (e.g., using the `i' key) are immediately reflected in the current profile.

\SUBSECTION{Caveats and Shortcomings}{caveats}

Here are some things to be careful of when using Unison.
\begin{itemize}
\item In the interests of speed, the update detection algorithm may (depending on the OS architecture on which you run Unison) actually use an approximation to the definition given in \sectionref{updates}{What is an Update?}. In particular, the Unix implementation does not compare the actual contents of files to their previous contents, but simply looks at each file's inode number and modtime; if neither of these has changed, then it concludes that the file has not been changed.
Under normal circumstances, this approximation is safe, in the sense that it may sometimes detect ``false updates'' but will never miss a real one. However, it is possible to fool it, for example by using \verb|retouch| to change a file's modtime back to a time in the past. \finishlater{One user---Marcus Mottl---claimed that it could also happen if we use memory mapped I/O, but this is not clear} \item If you synchronize between a single-user filesystem and a shared Unix server, you should pay attention to your permission bits: by default, Unison will synchronize permissions verbatim, which may leave group-writable files on the server that could be written over by a lot of people. You can control this by setting your \verb|umask| on both computers to something like 022, masking out the ``world write'' and ``group write'' permission bits. Unison does not synchronize the \verb|setuid| and \verb|setgid| bits, for security. \item The graphical user interface is single-threaded. This means that if Unison is performing some long-running operation, the display will not be repainted until it finishes. We recommend not trying to do anything with the user interface while Unison is in the middle of detecting changes or propagating files. \item Unison does not understand hard links. \item It is important to be a little careful when renaming directories containing {\tt ignore}d files. For example, suppose Unison is synchronizing directory A between the two machines called the ``local'' and the ``remote'' machine; suppose directory A contains a subdirectory D; and suppose D on the local machine contains a file or subdirectory P that matches an ignore directive in the profile used to synchronize. Thus path A/D/P exists on the local machine but not on the remote machine. If D is renamed to D' on the remote machine, and this change is propagated to the local machine, all such files or subdirectories P will be deleted. 
This is because Unison sees the rename as a delete and a separate create: it deletes the old directory (including the ignored files) and creates a new one ({\em not} including the ignored files, since they are completely invisible to it).
\end{itemize}

\SECTION{Reference Guide}{reference}{ }

This section covers the features of Unison in detail.

\TOPSUBSECTION{Running Unison}{running}

There are several ways to start Unison.
\begin{itemize}
\item Typing ``{\tt unison \NT{profile}}'' on the command line. Unison will look for a file \texttt{\NT{profile}.prf} in the \verb|.unison| directory. If this file does not specify a pair of roots, Unison will prompt for them and add them to the information specified by the profile.
\item Typing ``{\tt unison \NT{profile} \NT{root1} \NT{root2}}'' on the command line. In this case, Unison will use {\tt \NT{profile}}, which should not contain any {\tt root} directives.
\item Typing ``{\tt unison \NT{root1} \NT{root2}}'' on the command line. This has the same effect as typing ``{\tt unison default \NT{root1} \NT{root2}}.''
\item Typing just ``{\tt unison}'' (or invoking Unison by clicking on a desktop icon). In this case, Unison will ask for the profile to use for synchronization (or create a new one, if necessary).
\end{itemize}
% \finish{Need to check that the text UI actually works this way. (It
% doesn't prompt, for sure, but it should.)}

\SUBSECTION{The {\tt .unison} Directory}{unisondir}

Unison stores a variety of information in a private directory on each host. If the environment variable {\tt UNISON} is defined, then its value will be used as the path/folder name for this directory. This can be just a name, or a path. A name on its own, for example {\tt UNISON=mytestname}, will place a folder with that name in the same directory that the Unison binary was run in. Using a path like {\tt UNISON=../mytestname2} will place that folder in the folder above the one the Unison binary was run from.
If {\tt UNISON} is not defined, then the directory depends on which operating system you are using. In Unix, the default is to use {\tt \$HOME/.unison}. In Windows, if the environment variable {\tt USERPROFILE} is defined, then the directory will be {\tt \$USERPROFILE$\backslash$.unison}; otherwise if {\tt HOME} is defined, it will be {\tt \$HOME$\backslash$.unison}; otherwise, it will be {\tt c:$\backslash$.unison}. On OS X, {\tt \$HOME/.unison} will be used if it is present, but {\tt \$HOME/Library/Application Support/Unison} will be created and used by default. The archive file for each replica is found in the {\tt .unison} directory on that replica's host. Profiles (described below) are always taken from the {\tt .unison} directory on the client host. Note that Unison maintains a completely different set of archive files for each pair of roots. We do not recommend synchronizing the whole {\tt .unison} directory, as this will involve frequent propagation of large archive files. It should be safe to do it, though, if you really want to. Synchronizing just the profile files in the {\tt .unison} directory is definitely OK. \SUBSECTION{Archive Files}{archives} The name of the archive file on each replica is calculated from \begin{itemize} \item the {\em canonical names} of all the hosts (short names like \verb|saul| are converted into full addresses like \verb|saul.cis.upenn.edu|), \item the paths to the replicas on all the hosts (again, relative pathnames, symbolic links, etc.\ are converted into full, absolute paths), and \item an internal version number that is changed whenever a new Unison release changes the format of the information stored in the archive. \end{itemize} This method should work well for most users. However, it is occasionally useful to change the way archive names are generated. Unison provides two ways of doing this. 
The function that finds the canonical hostname of the local host (which is used, for example, in calculating the name of the archive file used to remember which files have been synchronized) normally uses the \verb|gethostname| operating system call. However, if the environment variable \verb|UNISONLOCALHOSTNAME| is set, its value will be used instead. This makes it easier to use Unison in situations where a machine's name changes frequently (e.g., because it is a laptop and gets moved around a lot). A more powerful way of changing archive names is provided by the \verb|rootalias| preference. The preference file may contain any number of lines of the form: \begin{alltt} rootalias = //\NT{hostnameA}//\NT{path-to-replicaA} -> //\NT{hostnameB}/\NT{path-to-replicaB} \end{alltt} When calculating the name of the archive files for a given pair of roots, Unison replaces any root that matches the left-hand side of any rootalias rule by the corresponding right-hand side. So, if you need to relocate a root on one of the hosts, you can add a rule of the form: \begin{alltt} rootalias = //\NT{new-hostname}//\NT{new-path} -> //\NT{old-hostname}/\NT{old-path} \end{alltt} Note that root aliases are case-sensitive, even on case-insensitive file systems. {\em Warning}: The \verb|rootalias| option is dangerous and should only be used if you are sure you know what you're doing. In particular, it should only be used if you are positive that either (1) both the original root and the new alias refer to the same set of files, or (2) the files have been relocated so that the original name is now invalid and will never be used again. (If the original root and the alias refer to different sets of files, Unison's update detector could get confused.) % After introducing a new \verb|rootalias|, it is a good idea to run Unison a few times interactively (with the \verb|batch| flag off, etc.) 
and carefully check that things look reasonable---in particular, that update detection is working as expected. \SUBSECTION{Preferences}{prefs} Many details of Unison's behavior are configurable by user-settable ``preferences.'' Some preferences are boolean-valued; these are often called {\em flags}. Others take numeric or string arguments, indicated in the preferences list by {\tt n} or {\tt xxx}. Some string arguments take the backslash as an escape to include the next character literally; this is mostly useful to escape a space or the backslash; a trailing backslash is ignored and is useful to protect a trailing whitespace in the string that would otherwise be trimmed. Most of the string preferences can be given several times; the arguments are accumulated into a list internally. There are two ways to set the values of preferences: temporarily, by providing command-line arguments to a particular run of Unison, or permanently, by adding commands to a {\em profile} in the {\tt .unison} directory on the client host. The order of preferences (either on the command line or in preference files) is not significant. On the command line, preferences and other arguments (the profile name and roots) can be intermixed in any order. To set the value of a preference {\tt p} from the command line, add an argument {\tt -p} (for a boolean flag) or {\tt -p n} or {\tt -p xxx} (for a numeric or string preference) anywhere on the command line. To set a boolean flag to \verb|false| on the command line, use {\tt -p=false}. Here are all the preferences supported by Unison. This list can be obtained by typing {\tt unison -help}. \begin{quote} \verbatiminput{prefs.tmp} \end{quote} Here, in more detail, is what they do. Many are discussed in greater detail in other sections of the manual. It should be noted that some command-line arguments are handled specially during startup, including \verb|-doc|, \verb|-help|, \verb|-version|, \verb|-socket|, and \verb|-ui|. 
They are expected to appear on the command-line only, not in a profile. In particular, \verb|-version| and \verb|-doc| will print to the standard output, so they only make sense if invoked from the command-line (and not a click-launched gui that has no standard output). Furthermore, the actions associated with these command-line arguments are executed without loading a profile or doing the usual command-line parsing.
%
\input{prefsdocs.tmp}

\SUBSECTION{Profiles}{profile}

A {\em profile} is a text file that specifies permanent settings for roots, paths, ignore patterns, and other preferences, so that they do not need to be typed at the command line every time Unison is run. Profiles should reside in the \verb|.unison| directory on the client machine. If Unison is started with just one argument \ARG{name} on the command line, it looks for a profile called \texttt{\ARG{name}.prf} in the \verb|.unison| directory. If it is started with no arguments, it scans the \verb|.unison| directory for files whose names end in \verb|.prf| and offers a menu (provided that the Unison executable is compiled with the graphical user interface). If a file named \verb|default.prf| is found, its settings will be offered as the default choices.

To set the value of a preference {\tt p} permanently, add to the appropriate profile a line of the form
\begin{verbatim}
        p = true
\end{verbatim}
for a boolean flag or
\begin{verbatim}
        p = <value>
\end{verbatim}
for a preference of any other type. A profile may include blank lines and lines beginning with {\tt \#}; both are ignored. Spaces and tabs before and after {\tt p} and the value are ignored. Spaces, tabs, and non-printable characters within values are not treated specially, so that e.g. \verb|root = /foo bar| refers to a directory containing a space. (On systems using newline for line ending, carriage returns are currently ignored, but this is not part of the specification.)
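
The following fragment is a small sketch of these rules; the preferences and the root path are chosen only for illustration:
\begin{verbatim}
    # This comment line and the blank line below are ignored

    batch = true
       root   =   /foo bar
\end{verbatim}
Here \verb|batch| is a boolean flag, and the second line sets \verb|root| to the directory \verb|/foo bar|: the spaces around the preference name and around the value are dropped, while the space inside the value is kept.
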
When Unison starts, it first reads the profile and then the command line, so command-line options will override settings from the profile.

Profiles may also include lines of the form \texttt{include \ARG{name}}, which will cause the file \ARG{name} (or \texttt{\ARG{name}.prf}, if \ARG{name} does not exist in the \verb+.unison+ directory) to be read at that point and included as if its contents, instead of the \texttt{include} line, were part of the profile. Include lines allow settings common to several profiles to be stored in one place. A similar line of the form \texttt{source \ARG{name}} does the same except that it does not attempt to add a suffix to \ARG{name}. Similar lines of the form \texttt{include\mbox{?} \ARG{name}} or \texttt{source\mbox{?} \ARG{name}} do the same as their respective lines without the question mark except that it does not constitute an error to specify a nonexistent file \ARG{name}. In \ARG{name} the backslash is an escape character.

A profile may include a preference `\texttt{label = \ARG{desc}}' to provide a description of the options selected in this profile. The string \ARG{desc} is listed along with the profile name in the profile selection dialog, and displayed in the top-right corner of the main Unison window in the graphical user interface.

The graphical user-interface also supports one-key shortcuts for commonly used profiles. If a profile contains a preference of the form
%
`\texttt{key = \ARG{n}}', where \ARG{n} is a single digit, then pressing this digit key will cause Unison to immediately switch to this profile and begin synchronization again from scratch. In this case, all actions that have been selected for a set of changes currently being displayed will be discarded.
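
As an illustration, the directives described in this section can be combined as in the following sketch of a profile. The profile name \verb|work.prf|, the file names \verb|common| and \verb|site-local|, and the label text are hypothetical, chosen only for this example:
\begin{verbatim}
    # work.prf (hypothetical profile name)
    label = Work files
    key = 1

    # Reads common, or common.prf if common does not exist
    include common

    # Reads site-local exactly as named; no error if it is missing
    source? site-local
\end{verbatim}
With such a profile in the \verb|.unison| directory, pressing 1 in the graphical user interface would switch to it and restart synchronization.
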
\SUBSECTION{Sample Profiles}{profileegs}

\SUBSUBSECTION{A Minimal Profile}{minimalprofile}

Here is a very minimal profile file, such as might be found in {\tt .unison/default.prf}:
\begin{verbatim}
    # Roots of the synchronization
    root = /home/bcpierce
    root = ssh://saul//home/bcpierce

    # Paths to synchronize
    path = current
    path = common
    path = .netscape/bookmarks.html
\end{verbatim}

\SUBSUBSECTION{A Basic Profile}{basicprofile}

Here is a more sophisticated profile, illustrating some other useful features.
\begin{verbatim}
    # Roots of the synchronization
    root = /home/bcpierce
    root = ssh://saul//home/bcpierce

    # Paths to synchronize
    path = current
    path = common
    path = .netscape/bookmarks.html

    # Some regexps specifying names and paths to ignore
    ignore = Name temp.*
    ignore = Name *~
    ignore = Name .*~
    ignore = Path */pilot/backup/Archive_*
    ignore = Name *.o
    ignore = Name *.tmp

    # Window height
    height = 37

    # Keep a backup copy of every file in a central location
    backuplocation = central
    backupdir = /home/bcpierce/backups
    backup = Name *
    backupprefix = $VERSION.
    backupsuffix =

    # Use this command for displaying diffs
    diff = diff -y -W 79 --suppress-common-lines

    # Log actions to the terminal
    log = true
\end{verbatim}

\SUBSUBSECTION{A Power-User Profile}{powerprofile}

When Unison is used with large replicas, it is often convenient to be able to synchronize just a part of the replicas on a given run (this saves the time of detecting updates in the other parts). This can be accomplished by splitting up the profile into several parts --- a common part containing most of the preference settings, plus one ``top-level'' file for each set of paths that need to be synchronized. (The {\tt include} mechanism can also be used to allow the same set of preference settings to be used with different roots.)

The collection of profiles implementing this scheme might look as follows.
%
The file {\tt default.prf} is empty except for an {\tt include} directive:
\begin{verbatim}
    # Include the contents of the file common
    include common
\end{verbatim}
Note that the name of the common file is {\tt common}, not {\tt common.prf}; this prevents Unison from offering {\tt common} as one of the list of profiles in the opening dialog (in the graphical UI).

The file {\tt common} contains the real preferences:
\begin{verbatim}
    # Roots of the synchronization
    root = /home/bcpierce
    root = ssh://saul//home/bcpierce

    # (... other preferences ...)

    # If any new preferences are added by Unison (e.g. 'ignore'
    # preferences added via the graphical UI), then store them in the
    # file 'common' rather than in the top-level preference file
    addprefsto = common

    # Names and paths to ignore:
    ignore = Name temp.*
    ignore = Name *~
    ignore = Name .*~
    ignore = Path */pilot/backup/Archive_*
    ignore = Name *.o
    ignore = Name *.tmp
\end{verbatim}
Note that there are no {\tt path} preferences in {\tt common}. This means that, when we invoke Unison with the default profile (e.g., by typing `{\tt unison default}' or just `{\tt unison}' on the command line), the whole replicas will be synchronized. (If we {\em never} want to synchronize the whole replicas, then {\tt default.prf} would instead include settings for all the paths that are usually synchronized.)

To synchronize just part of the replicas, Unison is invoked with an alternate preference file---e.g., doing `{\tt unison workingset}', where the preference file {\tt workingset.prf} contains
\begin{verbatim}
    path = current/papers
    path = Mail/inbox
    path = Mail/drafts
    include common
\end{verbatim}
causes Unison to synchronize just the listed subdirectories.

The {\tt key} preference can be used in combination with the graphical UI to quickly switch between different sets of paths.
For example, if the file {\tt mail.prf} contains
\begin{verbatim}
    path = Mail
    batch = true
    key = 2
    include common
\end{verbatim}
then pressing 2 will cause Unison to look for updates in the {\tt Mail} subdirectory and (because the {\tt batch} flag is set) immediately propagate any that it finds.

\SUBSECTION{Keeping Backups}{backups}

When Unison overwrites (or deletes) a file or directory while propagating changes from the other replica, it can keep the old version around as a backup. There are several preferences that control precisely where these backups are stored and how they are named.

To enable backups, you must give one or more \verb|backup| preferences. Each of these has the form
\begin{verbatim}
    backup = <pathspec>
\end{verbatim}
where \verb|<pathspec>| has the same form as for the \verb|ignore| preference. For example,
\begin{verbatim}
    backup = Name *
\end{verbatim}
causes Unison to create backups of {\em all} files and directories. The \verb|backupnot| preference can be used to give a few exceptions: it specifies which files and directories should {\em not} be backed up, even if they match the \verb|backup| pathspec.

It is important to note that the \verb|pathspec| is matched against the path that is being updated by Unison, not its descendants. For example, if you set \verb|backup = Name *.txt| and then delete a whole directory named \verb|foo| containing some text files, these files will not be backed up because Unison will just check that \verb|foo| does not match \verb|*.txt|. Similarly, if the directory itself happened to be called \verb|foo.txt|, then the whole directory and all the files in it will be backed up, regardless of their names.

Backup files can be stored either {\em centrally} or {\em locally}. This behavior is controlled by the preference \verb|backuplocation|, whose value must be either \verb|central| or \verb|local|. (The default is \verb|central|.)
Note that central storage of backups can lead to backup files being stored in a different filesystem than the original files, which could have different security properties and different amounts of available storage.

When backups are stored locally, they are kept in the same directory as the original.

When backups are stored centrally, the directory used to hold them is controlled by the preference \verb|backupdir| and the environment variable \verb|UNISONBACKUPDIR|. (The environment variable is checked first.) If neither of these is set, then the directory \verb|.unison/backup| in the user's home directory is used.

The preference \verb|maxbackups| (default 2) controls how many previous versions of each file are kept (including the current version), following the usual plan of deleting the oldest when creating a new one.

By default, backup files are named \verb|.bak.VERSION.FILENAME|, where \verb|FILENAME| is the original filename and \verb|VERSION| is the backup number (1 for the most recent, 2 for the next most recent, etc.). This can be changed by setting the preferences \verb|backupprefix| and/or \verb|backupsuffix|. If desired, \verb|backupprefix| may include a directory prefix; this can be used with \verb|backuplocation = local| to put all backup files for each directory into a single subdirectory. For example, setting
\begin{verbatim}
    backuplocation = local
    backupprefix = .unison/$VERSION.
    backupsuffix =
\end{verbatim}
will put all backups in a local subdirectory named \verb|.unison|. Also, note that the string \verb|$VERSION| in either \verb|backupprefix| or \verb|backupsuffix| (it must appear in one or the other) is replaced by the version number. This can be used, for example, to ensure that backup files retain the same extension as the originals.

Other than \verb|maxbackups| (which will never delete the last backup), there are no mechanisms for deleting backups.

For backward compatibility, the \verb|backups| preference is also supported.
%
It simply means \verb|backup = Name *| and \verb|backuplocation = local|.

\SUBSECTION{Merging Conflicting Versions}{merge}

Unison can invoke external programs to merge conflicting versions of a
file. The preference \verb|merge| controls this process.

The \verb|merge| preference may be given once or several times in a
preference file (it can also be given on the command line, of course, but
this tends to be awkward because of the spaces and special characters
involved). Each instance of the preference looks like this:
\begin{verbatim}
    merge = <PATHSPEC> -> <MERGECMD>
\end{verbatim}
The \verb|<PATHSPEC>| here has exactly the same format as for the
\verb|ignore| preference (see \sectionref{pathspec}{Path Specification}).
For example, using ``\verb|Name *.txt|'' as the \verb|<PATHSPEC>| tells
Unison that this command should be used whenever a file with extension
\verb|.txt| needs to be merged.

Many external merging programs require as inputs not just the two files
that need to be merged, but also a file containing the {\em last
synchronized version}. You can ask Unison to keep a copy of the last
synchronized version for some files using the \verb|backupcurrent|
preference. This preference is used in exactly the same way as
\verb|backup| and its meaning is similar, except that it causes backups
to be created of the {\em current} contents of each file after it has
been synchronized by Unison, rather than the {\em previous} contents that
Unison overwrote. These backups are stored in {\em both} replicas in the
same place as ordinary backup files---i.e., according to the
\verb|backuplocation| and \verb|backupdir| preferences. They are named
like the original files if \verb|backuplocation| is set to
\verb|central|; otherwise, Unison uses the \verb|backupprefix| and
\verb|backupsuffix| preferences and assumes a version number 000 for
these backups. Note that there are no mechanisms (beyond the limit on the
number of backups for each file) to remove backup files.
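
As an illustration, a profile fragment along the following lines would keep
a copy of the last synchronized version of every text file, so that it can
later serve as the common ancestor in a three-way merge. This is only a
sketch: the \verb|Name *.txt| pattern is chosen for the example, and the
\verb|backuplocation| line merely restates the default.
\begin{verbatim}
# illustrative pattern: keep the last synchronized version of text files
backupcurrent = Name *.txt
# store these copies centrally (this is already the default)
backuplocation = central
\end{verbatim}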
The \verb|<MERGECMD>| part of the preference specifies what external
command should be invoked to merge files at paths matching the
\verb|<PATHSPEC>|. Within this string, several special substrings are
recognized; these will be substituted with appropriate values before
invoking a sub-shell to execute the command.
\begin{itemize}
\item \relax\verb|CURRENT1| is replaced by the name of (a temporary copy
of) the local variant of the file.
\item \relax\verb|CURRENT2| is replaced by the name of a temporary file,
into which the contents of the remote variant of the file have been
transferred by Unison prior to performing the merge.
\item \relax\verb|CURRENTARCH| is replaced by the name of the backed up
copy of the original version of the file (i.e., the file saved by Unison
if the current filename matches the path specifications for the
\verb|backupcurrent| preference, as explained above), if one exists. If
no archive exists and \relax\verb|CURRENTARCH| appears in the merge
command, then an error is signalled.
\item \relax\verb|CURRENTARCHOPT| is replaced by the name of the backed
up copy of the original version of the file (i.e., its state at the end
of the last successful run of Unison), if one exists, or the empty string
if no archive exists.
\item \relax\verb|NEW| is replaced by the name of a temporary file that
Unison expects to be written by the merge program when it finishes,
giving the desired new contents of the file.
\item \relax\verb|PATH| is replaced by the path (relative to the roots of
the replicas) of the file being merged.
\item \relax\verb|NEW1| and \relax\verb|NEW2| are replaced by the names
of temporary files that Unison expects to be written by the merge program
when it is only able to partially merge the originals; in this case,
\verb|NEW1| will be written back to the local replica and \verb|NEW2| to
the remote replica; \verb|NEWARCH|, if present, will be used as the
``last common state'' of the replicas.
(These three options are provided for later compatibility with the
Harmony data synchronizer.)
\item \relax\verb|BATCHMODE| is replaced according to the batch mode of
Unison; if it is in \texttt{batch} mode, then a non-empty string
(``\verb|batch|'') is substituted, otherwise the empty string is
substituted.
\end{itemize}

To accommodate the wide variety of programs that users might want to use
for merging, Unison checks for several possible situations when the merge
program exits:
\begin{itemize}
\item If the merge program exits with a non-zero status, then the merge
is considered to have failed and the replicas are not changed.
\item If the file \verb|NEW| has been created, it is written back to both
replicas (and stored in the backup directory). Similarly, if just the
file \verb|NEW1| has been created, it is written back to both replicas.
\item If neither \verb|NEW| nor \verb|NEW1| has been created, then Unison
examines the temporary files \verb|CURRENT1| and \verb|CURRENT2| that
were given as inputs to the merge program. If either has been changed (or
both have been changed in identical ways), then its new contents are
written back to both replicas. If either \verb|CURRENT1| or
\verb|CURRENT2| has been {\em deleted}, then the contents of the other
are written back to both replicas.
\item If the files \verb|NEW1|, \verb|NEW2|, and \verb|NEWARCH| have all
been created, they are written back to the local replica, remote replica,
and backup directory, respectively. If the files \verb|NEW1| and
\verb|NEW2| have been created, but \verb|NEWARCH| has not, then these
files are written back to the local replica and remote replica,
respectively. Also, if \verb|NEW1| and \verb|NEW2| have identical
contents, then the same contents are stored as a backup (if the
\verb|backupcurrent| preference is set for this path) to reflect the fact
that the path is currently in sync.
\item If \verb|NEW1| and \verb|NEW2| (resp. \verb|CURRENT1| and
\verb|CURRENT2|) are created (resp.
overwritten) with different contents but the merge command did not fail
(i.e., it exited with status code 0), then we copy \verb|NEW1| (resp.
\verb|CURRENT1|) to the other replica and to the archive.

This behavior is a design choice made to handle the case where a merge
command only synchronizes some specific contents between two files,
skipping some irrelevant information (order between entries, for
instance). We assume that, if the merge command exits normally, then the
two resulting files are ``as good as equal.'' (The reason we copy one on
top of the other is to avoid Unison detecting that the files are unequal
the next time it is run and trying again to merge them when, in fact, the
merge program has already made them as similar as it is able to.)
\end{itemize}

You can disable a merge by setting a \verb|<MERGECMD>| that does nothing.
For example, you can override the merging of text files specified in a
profile by typing on the command line:
\begin{verbatim}
    unison profile -merge 'Name *.txt -> echo SKIP'
\end{verbatim}

If the \verb|confirmmerge| preference is set and Unison is not run in
batch mode, then Unison will always ask for confirmation before actually
committing the results of the merge to the replicas.

You can detect batch mode by testing \verb|BATCHMODE|; for example, to
skip the merge command entirely when running in batch mode:
\begin{verbatim}
    merge = Name *.txt -> [ -z "BATCHMODE" ] && mergecmd CURRENT1 CURRENT2
\end{verbatim}

A large number of external merging programs are available. For example,
on Unix systems setting the \verb|merge| preference to
\begin{verbatim}
    merge = Name *.txt -> diff3 -m CURRENT1 CURRENTARCH CURRENT2
                            > NEW || echo "differences detected"
\end{verbatim}
\noindent
will tell Unison to use the external \verb|diff3| program for merging.
%
Alternatively, users of \verb|emacs| may find the following settings
convenient:
\begin{verbatim}
    merge = Name *.txt -> emacs -q --eval '(ediff-merge-files-with-ancestor
                             "CURRENT1" "CURRENT2" "CURRENTARCH" nil "NEW")'
\end{verbatim}
\noindent
(These commands are displayed here on two lines to avoid running off the
edge of the page. In your preference file, each command should be
written on a single line.)

Users running Emacs under Windows may find something like this useful:
\begin{verbatim}
    merge = Name * -> C:\Progra~1\Emacs\emacs\bin\emacs.exe -q --eval
                            "(ediff-files """CURRENT1""" """CURRENT2""")"
\end{verbatim}

Users running Mac OS X (you may need the Developer Tools installed to get
the {\tt opendiff} utility) may prefer
\begin{verbatim}
    merge = Name *.txt -> opendiff CURRENT1 CURRENT2 -ancestor CURRENTARCH -merge NEW
\end{verbatim}

Here is a slightly more involved hack. The {\tt opendiff} program can
operate either with or without an archive file. A merge command of this
form
\begin{verbatim}
    merge = Name *.txt ->
              if [ CURRENTARCHOPTx = x ];
              then opendiff CURRENT1 CURRENT2 -merge NEW;
              else opendiff CURRENT1 CURRENT2 -ancestor CURRENTARCHOPT -merge NEW;
              fi
\end{verbatim}
(still all on one line in the preference file!) will test whether an
archive file exists and use the appropriate variant of the arguments to
{\tt opendiff}.

Linux users may enjoy this variant:
\begin{verbatim}
    merge = Name * -> kdiff3 -o NEW CURRENTARCHOPT CURRENT1 CURRENT2
\end{verbatim}

\begin{quote}
\it Please post suggestions for other useful values of the
\verb|merge| preference to the {\tt unison-users} mailing list---we'd
like to give several examples here.
\end{quote}

\finishlater{
\SUBSECTION{Communicating with a Remote Server}{server}
If you can mount both filesystems on the same host, then you can run with
no server (note, though, that this won't be fast enough over a phone
line)..........
}

\SUBSECTION{The User Interface}{ui}

Both the textual and the graphical user interfaces are intended to be
mostly self-explanatory. Here are just a few tricks:
\begin{itemize}
\item By default, when running on Unix the textual user interface will
try to put the terminal into ``raw mode'' so that it reads the input a
character at a time rather than a line at a time. (This means you can
type just the single keystroke ``\verb|>|'' to tell Unison to propagate a
file from left to right, rather than ``\verb|>| Enter.'')

There are some situations, though, where this will not work --- for
example, when Unison is running in a shell window inside Emacs. Setting
the \verb|dumbtty| preference will force Unison to leave the terminal
alone and process input a line at a time.
\end{itemize}

\SUBSECTION{Interrupting a Synchronization}{intr}

It is possible to interrupt an ongoing synchronization process before it
completes. Different user interfaces offer different ways of doing it.

\begin{tkui}
In the graphical user interface the synchronization process can be
interrupted before it is finished by pressing the ``Stop'' button or by
closing the window. The ``Stop'' button causes the ongoing propagation to
be stopped as quickly as possible while still doing proper cleanup. The
application keeps running and a rescan can be performed or a different
profile selected. Closing the window in the middle of the update
propagation process will exit the application immediately without doing
proper cleanup; it is therefore not recommended unless the ``Stop''
button does not react quickly enough.
\end{tkui}

\begin{textui}
When not synchronizing continuously, the text interface terminates when
synchronization is finished normally or due to a fatal error occurring.

In the text interface, to interrupt synchronization before it is
finished, press ``Ctrl-C'' (or send signal \verb|SIGINT| or
\verb|SIGTERM|).
This will interrupt update propagation as quickly as possible but still
complete proper cleanup. If the process does not stop even after pressing
``Ctrl-C'', then press it repeatedly. This will bypass cleanup procedures
and terminate the process forcibly (similar to \verb|SIGKILL|). Doing so
may leave the archives or replicas in an inconsistent state or locked.

When synchronizing continuously (time interval repeat or with filesystem
monitoring), interrupting with ``Ctrl-C'' or with signal \verb|SIGINT| or
\verb|SIGTERM| works the same way as described above and will
additionally stop the continuous process. To stop only the continuous
process and let the last synchronization complete normally, send signal
\verb|SIGUSR2| instead.
\end{textui}

\SUBSECTION{Exit Code}{exit}

When running in the textual mode, Unison returns an exit status, which
describes whether, and at which level, the synchronization was
successful. The exit status could be useful when Unison is invoked from a
script. Currently, there are four possible values for the exit status:
\ifhevea\begin{itemize}\else\begin{quote}\begin{description}\fi
\numitem [0]: successful synchronization; everything is up-to-date now.
\numitem [1]: some files were skipped, but all file transfers were successful.
\numitem [2]: non-fatal failures occurred during file transfer.
\numitem [3]: a fatal error occurred, or the execution was interrupted.
\ifhevea\end{itemize}\else\end{description}\end{quote}\fi
The graphical interface does not return any useful information through
the exit status.

\SUBSECTION{Path Specification}{pathspec}

Several Unison preferences (e.g., \verb|ignore|/\verb|ignorenot|,
\verb|follow|, \verb|sortfirst|/\verb|sortlast|, \verb|backup|,
\verb|merge|, etc.) specify individual paths or sets of paths. These
preferences share a common syntax based on regular expressions. Each
preference is associated with a list of path patterns; the paths
specified are those that match any one of the path patterns.
\begin{itemize}
\item Pattern preferences can be given on the command line, or, more
often, stored in profiles, using the same syntax as other preferences.
For example, a profile line of the form
\begin{alltt}
    ignore = \ARG{pattern}
\end{alltt}
adds \ARG{pattern} to the list of patterns to be ignored.

\item Each \ARG{pattern} can have one of four forms. The most general
form is a POSIX Extended Regular Expression introduced by the keyword
\verb|Regex|. (The collating symbol, equivalence class expression, and
character class expression described in
\URL{https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1\_chap09.html\#tag\_09\_03\_05}{Section 9.3.5 of the POSIX specification}
are not currently supported.)
\begin{alltt}
    Regex \ARG{regexp}
\end{alltt}
For convenience, three other styles of pattern are also recognized:
\begin{alltt}
    Name \ARG{name}
\end{alltt}
matches any path in which the last component matches \ARG{name},
\begin{alltt}
    Path \ARG{path}
\end{alltt}
matches exactly the path \ARG{path}, and
\begin{alltt}
    BelowPath \ARG{path}
\end{alltt}
matches the path \ARG{path} and any path below.
%
The \ARG{name} and \ARG{path} arguments of the latter forms of patterns
are {\em not} regular expressions. Instead, standard ``globbing''
conventions can be used in \ARG{name} and \ARG{path}:
\begin{itemize}
\item a \verb|*| matches any sequence of characters not including
\verb|/| (and not beginning with \verb|.|, when used at the beginning of
a \ARG{name})
\item a \verb|?| matches any single character except \verb|/| (and
leading \verb|.|)
\item \verb|[xyz]| matches any character from the set
$\{{\tt x}, {\tt y}, {\tt z} \}$
\item \verb|{a,bb,ccc}| matches any one of \verb|a|, \verb|bb|, or
\verb|ccc|. (Be careful not to put extra spaces after the commas: these
will be interpreted literally as part of the strings to be matched!)
\end{itemize} \item The path separator in path patterns is always the forward-slash character ``/'' --- even when the client or server is running under Windows, where the normal separator character is a backslash. This makes it possible to use the same set of path patterns for both Unix and Windows file systems. \item A path specification may be followed by the separator ``\verb| -> |'' itself followed by a string which will be associated to the matching paths: \begin{alltt} Path \ARG{path} -> \ARG{associated string} \end{alltt} Not all pathspec preferences use these associated strings but all pathspec preferences are parsed identically and the strings may be ignored. Only the last match of the separator string on the line is used as a delimiter. Thus to allow a path specification to contain the separator string, append an associated string to it, even if it is not used. The associated string cannot contain the separator string. \end{itemize} Some examples of path patterns appear in \sectionref{ignore}{Ignoring Paths}. Associated strings are used by the preference \texttt{merge}. \SUBSECTION{Ignoring Paths}{ignore} Most users of Unison will find that their replicas contain lots of files that they don't ever want to synchronize --- temporary files, very large files, old stuff, architecture-specific binaries, etc. They can instruct Unison to ignore these paths using patterns introduced in \sectionref{pathspec}{Path Specification}. For example, the following pattern will make Unison ignore any path containing the name \verb|CVS| or a name ending in \verb|.cmo|: \begin{verbatim} ignore = Name {CVS,*.cmo} \end{verbatim} The next pattern makes Unison ignore the path \verb|a/b|: \begin{verbatim} ignore = Path a/b \end{verbatim} Path patterns do {\em not} skip filenames beginning with \verb|.| (as Name patterns do). 
For example,
\begin{verbatim}
    ignore = Path */tmp
\end{verbatim}
will include \verb|.foo/tmp| in the set of ignored directories, as it is
a path, not a name, that is ignored.

The following pattern makes Unison ignore any path beginning with
\verb|a/b| and ending with a name ending in \verb|.ml|.
\begin{verbatim}
    ignore = Regex a/b/.*\.ml
\end{verbatim}
Note that regular expression patterns are ``anchored'': they must match
the whole path, not just a substring of the path.

Here are a few extra points regarding the \texttt{ignore} preference.
\begin{itemize}
\item If a directory is ignored, all its descendants will be too.

\item The user interface provides some convenient commands for adding new
patterns to be ignored. To ignore a particular file, select it and press
``{\tt i}''. To ignore all files with the same extension, select it and
press ``{\tt E}'' (with the shift key). To ignore all files with the same
name, no matter what directory they appear in, select it and press
``{\tt N}''.
%
These new patterns become permanent: they are immediately added to the
current profile on disk.

\item If you use the \verb|include| directive to include a common
collection of preferences in several top-level preference files, you will
probably also want to set the \verb|addprefsto| preference to the name of
this file. This will cause any new ignore patterns that you add from
inside Unison to be appended to this file, instead of whichever top-level
preference file you started Unison with.

\item Ignore patterns can also be specified on the command line, if you
like (this is probably not very useful), using an option like
\verb|-ignore 'Name temp.txt'|.

\item Be careful about renaming directories containing ignored files.
Because Unison understands the rename as a delete plus a create, any
ignored files in the directory will be lost (since they are invisible to
Unison and therefore they do not get recreated in the new version of the
directory).
\item There is also an \verb|ignorenot| preference, which specifies a set of patterns for paths that should {\em not} be ignored, even if they match an \verb|ignore| pattern. However, the interaction of these two sets of patterns can be a little tricky. Here is exactly how it works: \begin{itemize} \item Unison starts detecting updates from the root of the replicas---i.e., from the empty path. If the empty path matches an \verb|ignore| pattern and does not match an \verb|ignorenot| pattern, then the whole replica will be ignored. (For this reason, it is not a good idea to include \verb|Name *| as an \verb|ignore| pattern. If you want to ignore everything except a certain set of files, use \verb|Name ?*|.) \item If the root is a directory, Unison continues looking for updates in all the immediate children of the root. Again, if the name of some child matches an \verb|ignore| pattern and does not match an \verb|ignorenot| pattern, then this whole path {\em including everything below it} will be ignored. \item If any of the non-ignored children are directories, then the process continues recursively. \end{itemize} \end{itemize} \SUBSECTION{Moved or Renamed Paths}{moves} Unison can, under certain conditions, detect moved and/or renamed files and directories. In that case, the move/rename is atomically propagated to the other replica, completely bypassing copying any data over the network or locally. To ask Unison to detect moves/renames, enable the \verb|moves-experimental| preference. When the \verb|moves-experimental| preference is not enabled then all moved and/or renamed files and directories will be detected and propagated as a separate creation at the new path and deletion at the old path, potentially having to copy a large amount of data. It is not possible to detect actual moves and renames simply by scanning the file system. 
Thus, moved and/or renamed files and directories are detected
heuristically by matching detected deletions and creations that have the
same contents; this may detect supposed moves/renames that do not match
the actions actually performed by the user. For directories, contents
means the names and contents of all children recursively.

When they have the same contents, a deletion and a creation are matched
as a possible move/rename by the following tests (stopping at the first
test that passes):
\begin{itemize}
\item (for files only) inodes match;
\item modification time and parent are the same, but names are not (a
renamed file/directory);
\item modification time and name are the same, but parents are not (a
moved file/directory);
\item parent is the same, but names are not (a renamed file/directory);
\item name is the same, but parents are not (a moved file/directory);
\item modification time is the same, but parents and names are not
(renamed and moved file/directory);
\item nothing is the same (except contents) (renamed and moved
file/directory).
\end{itemize}

If the contents have changed since the last sync then it will not be
detected as a move/rename.

If propagating a move/rename fails in the other replica then Unison falls
back to a regular copy.

\SUBSECTION{Symbolic Links}{symlinks}

Ordinarily, Unison treats symbolic links in Unix replicas as ``opaque'':
it considers the contents of the link to be just the string specifying
where the link points, and it will propagate changes in this string to
the other replica.

It is sometimes useful to treat a symbolic link ``transparently,'' acting
as though whatever it points to were physically {\em in} the replica at
the point where the symbolic link appears. To tell Unison to treat a link
in this manner, add a line of the form
\begin{alltt}
    follow = \ARG{pathspec}
\end{alltt}
to the profile, where \ARG{pathspec} is a path pattern as described in
\sectionref{pathspec}{Path Specification}.
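
As an illustration, the following line asks Unison to treat one specific
symbolic link transparently. This is only a sketch: the path
\verb|shared/docs| is hypothetical and stands for a link located at that
path relative to the root of the replica.
\begin{verbatim}
# shared/docs is a hypothetical symbolic link inside the replica
follow = Path shared/docs
\end{verbatim}
Naming the link with an exact \verb|Path| pattern, rather than a broad
\verb|Name| or \verb|Regex| pattern, limits the effect to the one link
intended.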
\begin{quote}
{\bf\ifhevea\red\fi Warning:} Be careful when using the \verb|follow|
preference. Using a \verb|pathspec| that is not specific enough may cause
Unison to follow symlinks that you did not intend it to follow. This can
cause paths outside the replica to be overwritten and deleted due to
updates in the other replica. This can also cause the targets of links
pointing to within the replica to be synchronized under two or more names
(once directly and once via the link, for example), leading to unintended
results and conflicts.
\end{quote}

{\em Warning}: Deleting, in one replica, the path where a followed
symbolic link is in the other replica will cause the {\em target} of the
``transparent'' link to be deleted in the other replica. The symbolic
link itself will not be deleted and remains as a broken link.
%% FIXME: [2026-04] Is this intentional
%% or is it a bug?
This happens even if there was a symbolic link in both replicas and only
the link, not the link target, was deleted in one.

Treating symbolic links ``transparently'' may not always work as expected
when it comes to directories. Deleting a file or directory in one replica
will cause the target of a ``transparent'' link to be deleted in the
other replica. Deleting a parent directory, however, which itself might
or might not be a followed link, will not delete the targets of any
``transparent'' links contained within.
%% FIXME: [2026-04] Is this intentional or is it a bug?

Renaming or moving a file or directory in one replica behaves as a
combination of a delete on the old path and a recreate on the new path.
Thus, all the caveats regarding deletion of followed symbolic links apply
(see above). Additionally, the link {\em target} is recreated in-place on
the new path, {\em not} the symbolic link.

Followed symbolic links are treated ``transparently'' also when it comes
to the \verb|copyonconflict| and \verb|backup| preferences.
This means that the target of the link is copied to the conflict or
backup location, and, should you need the backup, you must restore it to
the original link target location outside the replica.
%% FIXME: [2026-04] Unlike deletion, these copies follow
%% links within directory hierarchy, too. Is this inconsistency
%% intentional or a bug?

Not all Windows versions and file systems support symbolic links; Unison
will refuse to propagate an opaque symbolic link from Unix to Windows and
flag the path as erroneous if the support or privileges are lacking on
the Windows side. When a Unix replica is to be synchronized with such a
Windows system, all symbolic links should match either an \verb|ignore|
pattern or a \verb|follow| pattern. To completely ignore all symbolic
links, you may set the preference {\tt links} to {\tt false}.

{\em Warning}: Just like with \verb|ignore|, be careful with
``{\tt links = false}''. This makes Unison effectively ignore symbolic
links, so they could be deleted without notice.

You may need to acquire extra privileges to create symbolic links under
Windows. By default, this is only allowed for administrators. Unison may
not be able to automatically detect support for symbolic links under
Windows. In that case, set the preference {\tt links} to {\tt true}
explicitly.

\SUBSECTION{Permissions}{perms}

Synchronizing the permission bits of files is slightly tricky when two
different filesystems are involved (e.g., when synchronizing a Windows
client and a Unix server). In detail, here's how it works:
\begin{itemize}
\item When the permission bits of an existing file or directory are
changed, the values of those bits that make sense on {\em both} operating
systems will be propagated to the other replica. The other bits will not
be changed.
\item When a newly created file is propagated to a remote replica, the
permission bits that make sense in both operating systems are also
propagated.
The values of the other bits are set to default values (they are taken
from the current umask, if the receiving host is a Unix system).
\item For security reasons, the Unix \verb|setuid| and \verb|setgid| bits
are not propagated.
\item The Unix owner and group ids can be propagated (see \verb|owner|
and \verb|group| preferences) by mapping names or by numeric ids (see
\verb|numericids| preference).
\end{itemize}

\SUBSECTION{Access Control Lists - ACLs}{acls}

Unison allows synchronizing access control lists (ACLs) on platforms and
filesystems that support them. In general, synchronization makes sense
only when both replicas support the same type of ACLs and recognize the
same users and groups. In some cases you may be able to go beyond that
and synchronize ACLs to a replica that couldn't fully use them---this may
be useful for the purpose of preserving ACLs.

If one of the replicas does not support any type of ACLs then Unison will
not attempt ACL synchronization. If the other replica does support ACLs
then those will remain intact.

If both replicas support ACLs of any supported type then you can request
Unison to try ACL synchronization (\verb|acl| preference). Success of
synchronization depends on the permissions of the owner and group of the
Unison process (Unison must have permissions to set ACL) and the
compatibility of ACL types on both replicas. An ACL is propagated as a
single unit, with all ACEs. There is no merging of ACEs from the
replicas.

{\em Caveat}: ACE inheritance may in certain scenarios cause
synchronization inconsistencies. In Windows, only explicit ACEs are
synchronized; inherited ACEs are not actively synchronized, but Windows
will propagate ACEs from parent directories (unless inheritance is
explicitly prevented on a file or a directory---this prevention is also
synchronized). Due to inheritance, the ultimately effective ACL may be
different, or provide different access, even after synchronization.
Unison currently supports the following platforms and ACL types: \begin{itemize} \item Windows (Windows XP SP2 and later) \begin{itemize} \item NTFS ACL (discrete ACL (DACL) only) \end{itemize} \item Solaris, OpenSolaris and illumos-based OS (OpenIndiana, SmartOS, OmniOS, etc.) \begin{itemize} \item NFSv4 ACL (ZFS ACL) \item POSIX-draft ACL \item Some NFSv4 ACL (ZFS ACL) cross-synchronization with POSIX-draft ACL \item Full cross-synchronization with other platforms that support NFSv4 ACLs; limited cross-syn\-chro\-niza\-tion with POSIX-draft ACLs \end{itemize} \item FreeBSD, NetBSD \begin{itemize} \item NFSv4 ACL (ZFS ACL) \item Limited POSIX-draft ACL (access ACL only; not default ACL) \item Full cross-synchronization with other platforms that support NFSv4 ACLs \end{itemize} \item Darwin (macOS) \begin{itemize} \item Extended ACL \end{itemize} \end{itemize} Not all filesystems on the listed platforms support all ACL types (or any ACLs at all). Synchronizing POSIX ACLs on Linux is not supported directly. However, it is possible to synchronize these ACLs with another Linux system by synchronizing extended attributes (xattrs) instead, because POSIX ACLs are stored as xattrs by Linux. This is disabled by default (see \sectionref{xattrs}{Extended Attributes - xattrs}). A simple way to enable syncing POSIX ACLs on Linux is to enable the preference \verb|xattrs| and add a preference \verb|xattrignorenot| with a value \texttt{Path !system.posix\_acl\_*}. The \verb|*| will be expanded to include both \verb|posix_acl_access| and \verb|posix_acl_default| attributes -- if you only want to sync either one, just remove the \verb|*| and type out the attribute name in full. If you want to prevent other xattrs from being synced then add an \verb|xattrignore| with a value \texttt{Path *} (value \texttt{Regex .*} will also work). \SUBSECTION{Extended Attributes - xattrs}{xattrs} Unison allows synchronizing extended attributes on platforms and filesystems that support them. 
System attributes are not synchronized. What exactly is considered a
system attribute is platform-dependent. Synchronization is possible
cross-platform, but see caveats below.

If one of the replicas does not support extended attributes then Unison
will not attempt attribute synchronization. If the other replica does
support extended attributes then those will remain intact.

If both replicas support extended attributes then you can request Unison
to try attribute synchronization (\verb|xattrs| preference). Extended
attributes from both replicas will not be merged; all extended attributes
are propagated as a set from one replica to another.

Unison currently supports extended attributes on the following platforms:
\begin{itemize}
\item {\em Linux}
Attributes in user, trusted and security namespaces. Synchronization of
the latter two namespaces depends on \verb|unison| process privileges and
is disabled by default. To sync one or more attributes in the security
namespace, for example, you can set the preference \verb|xattrignorenot|
to \verb|Path !security.*| (for all) or to \verb|Path !security.selinux|
(for one specific attribute). Attributes in system namespace are not
synchronized, with the exception of \verb|system.posix_acl_default| and
\verb|system.posix_acl_access| (also disabled by default).
\item {\em Solaris, OpenSolaris and illumos-based OS (OpenIndiana, SmartOS, OmniOS, etc.)}
\item {\em FreeBSD, NetBSD}
Attributes in user namespace.
\item {\em Darwin (macOS)}
\end{itemize}
Not all filesystems on the listed platforms may support extended
attributes.

\noindent
{\it Caveats:}
\begin{itemize}
\item Some platforms and file systems support very large extended
attribute values. Unison synchronizes only up to 16 MB of each attribute
value.
\item Attributes are synchronized as simple name-value pairs. More
complex extended attribute concepts supported by some platforms are not
synchronized.
\item On Linux, attribute names always have a fully qualified form (\texttt{namespace.attribute}). Other platforms do not have the same constraint. The consequence of this is that Unison will sync the attribute names on Linux as follows: an \verb|!| is prepended to the namespace name, except for the \verb|user| namespace; the \verb|user.| prefix is stripped from attribute names instead. This allows syncing extended attributes from Linux to other platforms. These transformations are reversed when syncing {\em to} Linux, resulting in correct fully qualified attribute names. The \verb|xattrignore| and \verb|xattrignorenot| preferences work on the transformed attribute names. This means that any patterns for the user namespace must be specified without the \verb|user.| prefix and any patterns intended for other namespaces must begin with an \verb|!|. \end{itemize} The \verb|xattrignore| preference can be used to filter the names of extended attributes that will be synchronized. The most useful ignore patterns can be constructed with the \verb|Path| form (where shell wildcards \verb|*| and \verb|?| are supported) and with the \verb|Regex| form. The \verb|xattrignorenot| preference can be used to override \verb|xattrignore|. Disabling the security and trusted namespaces on Linux is achieved by setting a default \verb|xattrignore| pattern of \texttt{Regex !(security|trusted)[.].*}. Disabling the syncing of attributes used to store POSIX ACL on Linux is achieved by setting a default \verb|xattrignore| pattern of \texttt{Path !system.posix\_acl\_*}. \SUBSECTION{Cross-Platform Synchronization}{crossplatform} If you use Unison to synchronize files between Windows and Unix systems, there are a few special issues to be aware of. \textbf{Case conflicts.} In Unix, filenames are case sensitive: \texttt{foo} and \texttt{FOO} can refer to different files. In Windows, on the other hand, filenames are not case sensitive: \texttt{foo} and \texttt{FOO} can only refer to the same file. 
This means that a Unix \texttt{foo} and \texttt{FOO} cannot be synchronized onto a Windows system --- Windows won't allow two different files to have the ``same'' name. Unison detects this situation for you, and reports that it cannot synchronize the files. You can deal with a case conflict in a couple of ways. If you need to have both files on the Windows system, your only choice is to rename one of the Unix files to avoid the case conflict, and re-synchronize. If you don't need the files on the Windows system, you can simply disregard Unison's warning message, and go ahead with the synchronization; Unison won't touch those files. If you don't want to see the warning on each synchronization, you can tell Unison to ignore the files (see \sectionref{ignore}{Ignoring Paths}). \textbf{Illegal filenames.} Unix allows some filenames that are illegal in Windows. For example, colons (`:') are not allowed in Windows filenames, but they are legal in Unix filenames. This means that a Unix file \texttt{foo:bar} can't be synchronized to a Windows system. As with case conflicts, Unison detects this situation for you, and you have the same options: you can either rename the Unix file and re-synchronize, or you can ignore it. \SUBSECTION{Slow Links}{speed} Unison is built to run well even over relatively slow links such as modems and DSL connections. Unison uses the ``rsync protocol'' designed by Andrew Tridgell and Paul Mackerras to greatly speed up transfers of large files in which only small changes have been made. More information about the rsync protocol can be found at the rsync web site (\ONEURL{http://samba.anu.edu.au/rsync/}). If you are using Unison with {\tt ssh}, you may get some speed improvement by enabling {\tt ssh}'s compression feature. Do this by adding the option ``{\tt -sshargs -C}'' to the command line or ``{\tt sshargs = -C}'' to your profile. 
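As an illustration, a profile that turns on {\tt ssh} compression might look like the sketch below. The user name, host name, and paths are hypothetical placeholders; only the \verb|sshargs| line is taken from the discussion above.
\begin{verbatim}
# Hypothetical profile for a slow link; adjust roots to your setup.
root = /home/alice/work
root = ssh://alice@example.org//home/alice/work

# Ask ssh to compress the data stream.
sshargs = -C
\end{verbatim}
Whether compression actually helps depends on the link and on how compressible the transferred data is, so it is worth comparing a run with and without this line.
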
\SUBSECTION{Fast Update Detection}{fastcheck} If your replicas are large and at least one of them is on a Windows system, you may find that Unison's default method for detecting changes (which involves scanning the full contents of every file on every sync---the only completely safe way to do it under Windows) is too slow. Unison provides a preference {\tt fastcheck} that, when set to \verb|true|, causes it to use file creation times as ``pseudo inode numbers'' when scanning replicas for updates, instead of reading the full contents of every file. When \verb|fastcheck| is set to \verb|no|, Unison will perform slow checking---re-scanning the contents of each file on each synchronization---on all replicas. When \verb|fastcheck| is set to \verb|default| (which, naturally, is the default), Unison will use fast checks on Unix replicas and slow checks on Windows replicas. Fast checking may cause Unison to miss propagating an update if the modification time and length of the file are both unchanged by the update. However, Unison will never {\em overwrite} such an update with a change from the other replica, since it always does a safe check for updates just before propagating a change. Thus, it is reasonable to use this switch most of the time and occasionally run Unison once with {\tt fastcheck} set to \verb|no|, if you are worried that Unison may have overlooked an update. Fastcheck is (always) automatically disabled for files with extension \verb|.xls| or \verb|.mpp|, to prevent Unison from being confused by the habits of certain programs (Excel, in particular) of updating files without changing their modification times. \SUBSECTION{Mount Points and Removable Media}{mountpoints} Using Unison with removable media such as USB drives can be dangerous unless you are careful. 
If you synchronize a directory that is stored on removable media when the media is not present, it will look to Unison as though the whole directory has been deleted, and it will proceed to delete the directory from the other replica---probably not what you want! To prevent accidents, Unison provides a preference called \verb|mountpoint|. Including a line like \begin{verbatim} mountpoint = foo \end{verbatim} in your preference file will cause Unison to check, after it finishes detecting updates, that something actually exists at the path \verb|foo| within both replicas; if it does not, the Unison run will abort. Note that the preference's name is confusing; it is intended to be used when a root might or might not be mounted, but the value is a relative path within a root, for a file or directory that should be present. (The preference is not used to specify the path at which a replica is mounted.) \appendix \finishlater{ \SECTION{Other Synchronizers}{other}{other} See also: D. Duchamp A Toolkit Approach to Partially Disconnected Operation Proc. USENIX 1997 Ann. Technical Conf. USENIX, Anaheim CA, pp. 305-318, January 1997 \ONEURL{https://www.usenix.org/conference/usenix-1997-annual-technical-conference/toolkit-approach-partially-connected-computing} } \finishlater{ \SECTION{TODO}{todo}{ } Things to write about: \begin{itemize} \item When started in 'socket server' mode, Unison prints 'server started' on stderr when it is ready to accept connections. (This may be useful for scripts that want to tell when a socket-mode server has finished initialization.) \item {\tt DANGER.README}. \end{itemize} } \finishlater{ Things to write about later: \begin{itemize} \item Document different reporting of file status when no archives were found. \item Document buttons in graphical UI \end{itemize} } \iftextversion \SECTION{Junk}{ }{ } \fi \ifhevea\begin{rawhtml}
\end{rawhtml}\fi \end{document}