RALEE -- RNA alignment editor in Emacs -------------------------------------- Version 0.8 -- 2014-11-16 What have we here then? ----------------------- RALEE is a major mode for the emacs editor that adds functionality to aid the viewing and editing of alignments of structured RNAs. It was written by and primarily for curators of the Rfam database (http://rfam.xfam.org/), who are always wanting to do things that available multiple sequence alignment editors don't allow. Requirements ------------ - ralee-mode.tar.gz - get the lastest version from http://sgjlab.org/ralee/ - Emacs 21 or later - http://www.gnu.org/software/emacs/emacs.html - tested with GNU Emacs 21.2 on OSF1/alpha and GNU/Linux, GNU Emacs 22.0 on Mac OS X, and minimally on Windows XP. - some Stockholm format RNA alignments that you need to view/edit - http://rfam.sanger.ac.uk/ - http://en.wikipedia.org/wiki/Stockholm_format - Optionally: - folding and viewing structures requires the ViennaRNA package and ghostview - http://www.tbi.univie.ac.at/~ivo/RNA/ - http://www.cs.wisc.edu/~ghost/ - Building trees and sorting sequences thereby requires the quicktree programme. - http://www.sanger.ac.uk/Software/analysis/quicktree/ - An included perl script fetches sequences and annotation from NCBI entrez. LWP.pm is required. Installation ------------ 1. Untar ralee-mode.tar.gz $ gunzip -c ralee-mode.tar.gz | tar -xvf - 2. Copy the .el files from the elisp directory into a directory of your choice; somewhere in your home area if you are the only person who will use it, or somewhere more central if other people may want to (and if you have root access). The following example uses the location ~/site-lisp/. 3. Add this location to your emacs load-path by adding the following line to your emacs configuration file (likely ~/.emacs but could be ~/Library/Preferences/Emacs/Preferences.el if you're using Aquamacs Emacs on Mac OS X): ;------- (add-to-list 'load-path "~/site-lisp") ;------- 4. Add some stuff to your .emacs config file to auto-load the RALEE mode on demand: ;------- ;; ralee mode is good for RNA alignment editing (autoload 'ralee-mode "ralee-mode" "Yay! RNA things" t) (setq auto-mode-alist (cons '("\\.stk$" . ralee-mode) auto-mode-alist)) ;------- 5. Copy the contents of the perl directory to somewhere on your path, or add the perl directory to your path. 6. And you're good to go! 7. Optional useful .emacs additions: ;------- ;; Show current line and column number (column-number-mode t) (line-number-mode t) ;; Show parenthesis pairs - screws up XEmacs (show-paren-mode t) ;; Always end a file with a newline (setq require-final-newline t) ;------- 8. If you find that ralee can't find key executables on your path, for example RNAalifold for "fold-alignment", then try explicitely telling emacs where to find those executables. Add lines like the following to your .emacs file: ;------- (add-to-list 'exec-path "~/path/to/executables/i/need") ;------- You can add multiple lines to add multiple directories to your exec-path. Running RALEE ------------- 1. Format your input alignment to be unblocked stockholm format. You could use the sreformat tool from Sean Eddy's HMMER package (http://hmmer.wustl.edu): $ sreformat --pfam stockholm [alignfile] > [infile] Stockholm format can include a consensus secondary structure markup in a line with the tag "#=GC SS_cons". Most of RALEE's functionality makes use of this markup so you should add an SS_cons line if you don't already have one. The format of the line uses parentheses to show nested base pairs so: 0123456789012345678901234 .<<<<<...>>.<<...>>..>>>. Column 1 pairs with 23 2 with 22 3 with 21 4 with 10 5 with 9 12 with 18 13 with 17 Note that the column numbers are not part of the Stockholm format -- they are just shown here to aid explanation. You will also note that this doesn't allow markup of pseudoknot interactions -- there are ways of showing these, but RALEE doesn't yet support them. 2. Open the alignment in emacs: $ emacs [infile] If your infile name ends with .stk then you should see "(Ralee)" in the bottom bar of emacs. If it doesn't, you can load ralee-mode with: M-x ralee-mode (M-x means "hold down Alt and press x" or "press Esc then x" if your setup is vaguely normal.) 3. Get editing! Take a look at the file examples.txt to work through some examples. Read the MANUAL file for a list of commands and shortcuts. Bugs ---- - The code has only been tested with an unblocked version of the Rfam database (http://rfam.sanger.ac.uk/) alignment format. Things will fail randomly and unhelpfully if you try and open other alignment formats. - Full Stockholm compatability is by no means guarenteed. - Very (and I do mean very) bad things happen if your alignment has more than one sequence with the same name. - shift-sequence-right is buggy if we push off the end of the line with #=GR lines about. - unblock-alignment will keep header annotation, but lose anything that is mid-alignment and not #=GC or #=GR. - Some functions assume that gaps are all '.'s. - Highlighting a sequence by clicking doesn't work in XEmacs. - Highlighting a sequence screws up the colours if you have customised them. - re-align block fails if block comprises whole lines - Block functions fail in XEmacs. - write-ps fails without useful complaint if the alignment ends aren't flush, or (more annoyingly) if you have #=GR lines. - Temp files are written with names like /tmp/trees.ralee. If these don't get deleted then we get permission errors overwriting existing files. - Menus, method names and documentation contain an annoying mix of Anglicised and American spellings -- color and colour being the worst culprits. - Error messages are generally uninformative. - Unknown bugs are many and varied. Wishlist -------- - Colour scheme should handle pseudoknot markup. - Shift user defined blocks of sequence about. - Import/export other alignment formats. - Customise thresholds for colour markup, as well as the colours themselves. - Colour by posterior probability markup in Xfam alignments. - Choose a sequence to realign to the model - Validate Stockholm format Contact ------- We're using RALEE in anger to improve alignments in the Rfam database (http://rfam.sanger.ac.uk/). But the code is experimental in places, and some features are poorly tested. I'd appreciate a mail if you find it useful, and any bugs/requests/abuse should be directed at: sam.griffiths-jones@manchester.ac.uk Publication ----------- RALEE is published! "If you use it, cite it" should be your motto: Sam Griffiths-Jones. RALEE - RNA ALignment Editor in Emacs. Bioinformatics 2005, 21:257-259 History ------- 0.8 2014-11-16 - unblock-alignment now deals with #=GR lines (bug reports from Larry Ruzzo and Paul Gardner). - Replace tabs in input alignment with spaces (deal with Tom Jones' broken alignment formats :p ) - Fix write-ps bug for alignments that haven't been painted (from Karissa Sanbonmatsu). - Added some useful message outputs. - Code that checks for updates is more friendly and no longer uses external perl script. 0.7 2012-03-31 - Public release of collected bugfixes and requests from 0.62a and 0.63a (see below). - Add extremely minimal functionality for protein alignments (just colour by conservation right now). - Fix last column bug for colour by base identity method (Tom Jones). 0.63a 2011-03-17 - Fix file permission bug with unblock-alignment. - unblock-alignment will now retain information from the header of the alignment, eg. #=GF and #=GS lines from Rfam alignments (requested by Tom Jones). - Unblock blocked alignments on loading by default. Disable by setting the ralee-auto-unblock variable to nil (eg via "customize-group ralee"). - Reimplement paint-buffer-by-base to colour all bases, not just those that are conserved above arbitrary threshold (requested by Tom Jones). 0.62a 2008-01-28 - For Rfam Sangerites eyes only currently -- untested and half implemented bits and bobs. - Fix bug with long sequence ids and clustalw in realign-block. - Add remove-base-pair and add-base-pair methods in response to Paul Gardner request. WARNING: add-base-pair doesn't check whether nesting is preserved. If it isn't then remove-base-pair won't remove what you just added. - Reimplement sequence sorting and add sort-sequences-by-score. - Add remove-identical-sequences. - Rewrite all the calls to external programmes. - Fix bug with calculate-tree call, whereby the temporary *trees* buffer doesn't get over-written properly. - Parse clustalw output better to cope with its truncation of seq ids. 0.61 2007-12-06 - fixes bug with the gap positions in structures predicted with fold-structure. 0.6 2007-12-04 - protect-mode is no longer on by default. If you want it on then add the following lines to your .emacs file: (add-hook 'ralee-mode-hook 'protect-mode-on) If you like protect-mode to default to off then you can protect against future whimsical changes with: (add-hook 'ralee-mode-hook 'protect-mode-off) - realign-block takes a defined block and realigns it using clustalw. - paint-buffer-by-compensatory-changes is the default scheme for compensatory view (inspired by Jakob Pedersen's evofold track at UCSC). Significantly different from the previous implementation, available as paint-buffer-by-compensatory-all. - sort-sequences-by-(id|tree). Sort by tree order requires the quicktree program. Can also calculate-tree and calculate-distance-matrix. - fetch sequences and annotation from NCBI entrez with fetch-sequence and fetch-sequence-descriptions. Uses perl/efetch.pl to access NCBIs eutils. Roll your own sequence fetcher and configure in elisp/ralee-helpers. - Colouring should no longer use many entries of the undo buffer. 0.5 2005-07-10 - paint-buffer-by-compensatory colours columns if they have compensatory mutations - center-on-point on screen refresh - Paint arbitrary user defined regions - Search for a sequence motif - Colours used for all types of markup are now customisable - Scrolling around with C-arrows is more sensible - Define and operate on a sequence block (eg folding) - Shift and throw sequences left and right more sensibly, including concommittant movement of associated #=GR lines - Fixed bugs related to sequence names of unexpected format - XEmacs compatability is broken in several ways 0.4 2004-09-23 - Highlight a line with left mouse click - Add shift-block-to-gap-right and -left methods - Write a pretty postscript file with write-ps - Configure the page for the above with page-setup 0.3 2004-06-30 - Add protect-mode to stop inadvertant deletion and insertion - Add trim-left and trim-right methods - Add t-to-u and u-to-t methods - Rewrite XEmacs menu from scratch (largely lifted from XEmacs source) - Expand the examples 0.2 2004-06-08 - Add menus for both GNU Emacs and XEmacs - A couple of minimal examples included - Fold sequences to add "#=GR SS" lines, and allow colouring of buffer based on alternative structures - paint-buffer-by-cons and paint-buffer-by-base do different things 0.1 2004-05-31 - First effort at something that will be useful, at least for Rfam internal use. Colour bases in an alignment based on the base pairing. Legal things ------------ RALEE is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. RALEE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with RALEE; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. ------------------------------- 2014-11-16 Sam Griffiths-Jones