Release Notes
    
                       geWorkbench V2.0.0
               
                        June 8th, 2010

       Joint Centers for Systems Biology, Columbia University
                       New York, NY  10032

                   http://www.geworkbench.org

================================================================
                            Contents
================================================================
    
    1.0 geWorkbench Installation Notes
    2.0 geWorkbench Introduction and History
    3.0 New Features and Updates
    4.0 Known Issues/Defects
    5.0 Bug Reports and Support
    6.0 Documentation and Files
    7.0 geWorkbench Web Resources


================================================================
    1.0 geWorkbench Installation Notes
================================================================

    System Requirements:
        Java:    
                 The Java 6 JRE is required.  On Windows and Linux it can be 
                 installed separately or together with geWorkbench.  On MacOSX,
                 the Java 6 JRE is included with MacOSX versions 10.5 and higher.
                 Please note that Java 6 is referred to by Sun in some places
                 as Java 1.6.  32-bit and 64-bit versions of Java can be used
                 on appropriate platforms.
                 
                 See http://java.sun.com/javase/downloads/index.jsp

        Memory:  
                 At least 2 GB is recommended.  geWorkbench 2.0.0 by
                 default will request up to 1 GB of memory for the Java VM.
        
        Operating System:
                 Windows XP/Vista/Windows 7 (32 or 64-bit): no special requirements.
                 
                 MacOSX:  Version 10.5 or higher is required to provide
                     the Java 6 JRE.
                
                 Linux: no special requirements known.
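
    To confirm which Java version is on your path before installing, a
    check along the following lines can be used.  This is a minimal sketch
    and not part of the geWorkbench distribution: the helper names
    java_version_of and is_java6 are illustrative, and it assumes the
    usual banner of the form  java version "1.6.0_NN"  that the command
    "java -version" prints.

```shell
#!/bin/sh
# Illustrative helpers (not shipped with geWorkbench).

# Extract the quoted version string from `java -version` output,
# e.g. 1.6.0_20 from: java version "1.6.0_20"
java_version_of() {
    printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed only if the reported version is 1.6 (that is, Java 6).
is_java6() {
    case "$(java_version_of "$1")" in
        1.6|1.6.*) return 0 ;;
        *)         return 1 ;;
    esac
}

if command -v java >/dev/null 2>&1; then
    out=$(java -version 2>&1)      # the banner is written to stderr
    if is_java6 "$out"; then
        echo "Java 6 found: $(java_version_of "$out")"
    else
        echo "Java found, but not version 1.6: $(java_version_of "$out")"
    fi
else
    echo "No java executable found on the PATH"
fi
```

    A reported version beginning with 1.6 corresponds to Java 6.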
    
    
    All three platform-specific versions of geWorkbench (Windows,
    Linux, and Macintosh) provide an installation wizard
    (generated using InstallAnywhere).
    
    A generic version of geWorkbench, which does not use any installer,
    is also available.

    Additional installation details are provided below, and at
    www.geworkbench.org.  All user documentation is maintained 
    in online form at www.geworkbench.org.
    
    geWorkbench, unless otherwise noted for particular components, can be
    run on both 32 and 64-bit operating systems and JREs.


    Platform-specific release details:

    1. Windows (XP/Vista/Windows 7)
    
        Special note for Vista/Windows 7 - if you run the installer on Vista
        or Windows 7, please install geWorkbench to your root directory, e.g.
        C:\geWorkbench_2.0.0 rather than to C:\Program Files\geWorkbench_2.0.0.

        File: geWorkbench_v2.0.0_Windows_installer_with_JRE6.exe

            Includes the 32-bit Sun Java 6 JRE.
             
        File: geWorkbench_v2.0.0_Windows_installer_noJRE.exe
        
            No JRE is included.  You must make sure that an appropriate Java 6
            JRE is installed on your system before installing geWorkbench.

       Download and double-click the installer file to begin installation.


    2. MacOSX

       File: geWorkbench_v2.0.0_MacOSX_installer.zip

       This version relies on the Java 6 JRE included with recent updates to the
       MacOSX operating system.

       Double-click geWorkbench_v2.0.0_MacOSX_installer.zip to begin
       installation.

       Notes
          * Requires Mac OS X 10.5 or later
          * The compressed installer should be recognized by StuffIt
            Expander and should automatically be expanded after downloading.
            If it is not expanded, you can expand it manually using a recent
            version of StuffIt Expander. 


    3. Linux

       File: geWorkbench_v2.0.0_Linux_installer_with_JRE6.bin

            Includes the 32-bit Sun Java 6 JRE.
       
       The Linux version of geWorkbench relies on X-Windows being installed
       and running. If you are running Linux on a server and e.g. Windows
       on your desktop, you will also need to run an X-windows server on
       your desktop machine. Further information can be found on the
       Download and Installation page of geworkbench.org.

       After downloading, cd (if needed) to the directory to which you
       downloaded the installer.

       To begin the installation, type the command:

         "sh ./geWorkbench_v2.0.0_Linux_installer_with_JRE6.bin"

          This will extract geWorkbench into a new directory called
          geWorkbench_2.0.0. 
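
       Because the installer needs a running X-Windows display, it can
       help to check for one before starting.  The following is a minimal
       sketch under stated assumptions: the download directory
       ($HOME/Downloads) and the helper names has_x_display and
       installer_present are illustrative only, and a non-empty DISPLAY
       variable is used as the conventional sign that an X server is
       reachable.

```shell
#!/bin/sh
# Pre-flight checks before running the Linux installer (illustrative).
INSTALLER=geWorkbench_v2.0.0_Linux_installer_with_JRE6.bin

# Succeed only if an X display is advertised to this shell.
has_x_display() {
    [ -n "${DISPLAY:-}" ]
}

# Succeed only if the installer file is present in directory $1.
installer_present() {
    [ -f "$1/$INSTALLER" ]
}

# Assumed download location; adjust to wherever the file was saved.
DOWNLOAD_DIR=${DOWNLOAD_DIR:-$HOME/Downloads}

if ! has_x_display; then
    echo "DISPLAY is not set; start or forward an X server first."
elif ! installer_present "$DOWNLOAD_DIR"; then
    echo "Installer not found in $DOWNLOAD_DIR"
else
    # Same command as given above, run from the download directory.
    cd "$DOWNLOAD_DIR" && sh "./$INSTALLER"
fi
```

       If DISPLAY is empty on a remote server, the X-Windows server on
       your desktop machine is typically not yet running or not forwarded.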

       To run geWorkbench, and assuming you are using the Linux bash shell,
          issue the command:  
          
          "./rungeWorkbench_2.0.0" 


    4. Generic -   A non-installer-based version of
       geWorkbench is supplied in a Zip file which should work on
       any platform.

       File: geWorkbench_v2.0.0_Generic.zip

          Installation: 
    
             Unzip the file.  It will create a directory
                geWorkbench_2.0.0.
          

          Running geWorkbench (generic):

            You must have the Java 6 JRE installed and the
               JRE must be in the path for geWorkbench.

            Windows: you can double click on the file
                "launch_geWorkbench.bat" to launch geWorkbench, or
                run it from a command window.
            
            Linux/Unix:   Execute the script "launch_geworkbench.sh".
      
            Any: Alternatively, if you have Apache Ant installed,
                 you can type "ant run" in the geWorkbench directory.
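
          On Linux/Unix, putting the JRE on the path and launching the
          generic version might look like the sketch below.  It is a
          hedged example only: the JRE location /opt/jre1.6.0/bin is
          hypothetical, the helper path_with is illustrative, and the
          launch script is assumed to sit at the top level of the
          geWorkbench_2.0.0 directory.

```shell
#!/bin/sh
# Print a PATH value with directory $1 placed first,
# unless that directory is already listed.
path_with() {
    case ":$PATH:" in
        *":$1:"*) printf '%s\n' "$PATH" ;;
        *)        printf '%s\n' "$1:$PATH" ;;
    esac
}

JRE_BIN=${JRE_BIN:-/opt/jre1.6.0/bin}       # hypothetical JRE location
GEWB_DIR=${GEWB_DIR:-./geWorkbench_2.0.0}   # directory created by unzipping

PATH=$(path_with "$JRE_BIN")
export PATH

# Assumption: the launch script is at the top level of the unzipped directory.
if [ -f "$GEWB_DIR/launch_geworkbench.sh" ]; then
    (cd "$GEWB_DIR" && sh ./launch_geworkbench.sh)
else
    echo "launch_geworkbench.sh not found in $GEWB_DIR"
fi
```

          Set JRE_BIN and GEWB_DIR in the environment to match your own
          system before running the sketch.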


================================================================
    2.0 geWorkbench Introduction and History
================================================================


    geWorkbench, an open source bioinformatics platform 
    written in Java, makes sophisticated tools for data management, 
    analysis and visualization available to the community in a 
    convenient fashion.
 
    geWorkbench evolved from a project, caWorkbench, which was originally
    sponsored by the National Cancer Institute Center for Bioinformatics
    (NCICB). Some of the most fully developed capabilities of the platform
    include microarray data analysis, pathway analysis, sequence
    analysis, transcription factor binding site analysis,
    and pattern discovery. 

    geWorkbench 2.0.0 adds several new components contributed by the MAGNet
    Center at Columbia University.  It also adds file parsers for
    MAGE-TAB and GEO Soft files, new filtering components, and includes many
    other new features, enhancements, and bug fixes.


================================================================
    3.0 New Features and Updates
================================================================ 


Changes included in geWorkbench 2.0.0

    New components

    SkyLine - A high-throughput comparative modeling pipeline. 
             It creates structural homology models for protein sequences
             with similarity to a protein with an experimentally determined
             3-D structure.  The input is a PDB file.
                
    SkyBase - A database that stores the homology models built by
             SkyLine analysis for all NESG PSI2 protein structures.  It is
             queried using FASTA-format protein sequence files.
                
    Pudge -  Interface to a protein structure prediction server which integrates
             tools used at different stages of the structural prediction process.
             Modeling starts with a FASTA-format protein sequence file.

    Other major new features in release 2.0

    * More than 250 "bug reports" were closed. These included many new features,
        improvements in the usability of numerous components, and actual bug fixes.
    * Java 6 - Moved from Java 5 to Java 6. geWorkbench now requires Java 6.
        Works on both 32 bit and 64 bit VMs (JREs).
    * Look and Feel - Switched to new, more modern Look and Feel (Nimbus).
        geWorkbench appearance now consistent across all platforms.
    * caBIO component updated from 4.2 to 4.3. 
    * Cellular Network Knowledge Base (CNKB) - Revamped interface to allow choice
        of interactome and data types.
    * File parsers added:
        MAGE-TAB data matrix
        GEO Soft format - added series (GSE) and curated matrix (GDS). 
    * Filtering - completely revamped - now works directly for all modes,
        allows specification of minimum % matching arrays before filtering occurs.
    

    List of other major changes

    * caArray - Improved memory usage on downloads from caArray.
    * CNKB - Can now return markers direct from CNKB without use of Cytoscape.
    * Color Mosaic - enhancements to display (bug 2147):
        toggle array names on/off
        search on array name, accession, or label 
    * Component Configuration Manager - now can filter display list by categories:
        Analysis, Viewer, Normalizer, Filter
    * Cytoscape - Corrected mapping between gene names in Cytoscape display and
        markers in Marker Sets panel (now uses Entrez IDs).
    * Dendrogram - can now create Array subsets as well as marker subsets.
    * Markers and Arrays - Hover text available in Markers and Arrays phenotypes
        to visualize long names if needed.
    * Marker Annotation - search results can be saved to a text file, including
        relevant URLs and BioCarta pathway names.
    * File loading - Checking for "out of memory" errors during file loading.
    * GUI - in switching to new L&F, fixed many text highlighting problems that were
        previously seen on Macintosh only but now appeared on Windows also.
    * File parser menu - The file parser selection menu now shows valid file extensions
        for each type.
    * Promoter - JASPAR promoter motifs now filterable by taxon.
    * Sequence alignment (BLAST) - many enhancements, including:
        additional databases, to match those listed at NCBI
        improved handling of results from searches containing long query sequences. 

    Online Help chapters updated
    * CCM
    * Filtering
    * Normalization
	
	
    Versions of external files/components included in this release
    * gene_ontology.1_2.obo downloaded 5/24/2010 from geneontology.org.
    * Ontologizer.jar version 2.0, file released 3/10/2010.
    * Jaspar_CORE (http://jaspar.genereg.net/) SQL files last updated on server 10/2009.
    * JMOL - component updated. 


***Changes in previous versions***

    Changes included in geWorkbench 1.8.0

    New components

    1. Gene Ontology Enrichment - Analysis and visual components.
       Analysis component is built on Ontologizer 2.0.


    Other changes in release 1.8.0:

     1. caArray - Update caArray component to use caArray 2.3.0 Java API.
        Please note that geWorkbench 1.8.0 is not compatible with earlier
        versions of caArray.
     2. CNKB - The network graph generated by CNKB was only showing nodes
        centered about a focus node.  Now all accepted nodes will be
        displayed.
     3. Dataset History - Additions for several modules.
     4. Grid Services - A number of fixes to grid services were made.
     5. Marker Annotations - Fixed a problem with retrieving marker
        annotations when microarray data was downloaded from caArray.
     6. Mark-Us - JMOL dependency added for molecule display.
     7. Promoter - Update JASPAR motifs to release of December 2007.
        -Note on October 12, 2009 a new version of JASPAR was released
        which made an incompatible change in the file format.
     8. Promoter - component now displays logos using the "Schneider" method,
        including his "small-value correction", rather than using a previous
        "in-house" method.
     9. Promoter - the displayed data now does not include the effects of
        the pseudo-count normalization process.
    10. Promoter - Added ability to specify pseudocount or select previous
        hard-coded option of square root of number of sequences.
    11. Promoter - Loaded TFs now are properly added to the list of
        available TFs.
    12. Sequence Alignment (BLAST) - PFP filtering option removed
    13. Usability fixes - operation of cancel buttons, progress bar.
    14. Release Notes - Added specific installation instructions.


    Online Help chapters updated
     1. ANOVA
     2. ARACNe
     3. CNKB
     4. Marker Annotations
     5. Master Regulator Analysis
     6. Promoter
     7. Sequence Alignment (BLAST) 


    Changes included in geWorkbench 1.7.0


    New components

    1.  MarkUs - The MarkUs component assists in the assessment of the
        biochemical function for a given protein structure. It serves
        as an interface to the Mark-Us web server at Columbia. Mark-Us
        identifies related protein structures and sequences, detects
        protein cavities, and calculates the surface electrostatic
        potentials and amino acid conservation profile.

    2.  MRA - The Master Regulator Analysis component attempts to identify
        transcription factors which control the regulation of a set
        of differentially expressed target genes (TGs). Differential
        expression is determined using a t-test on microarray gene
        expression profiles from 2 cellular phenotypes, e.g. experimental
        and control.

    3.  Pudge - Interface to a protein structure
        prediction server (Honig lab) which integrates tools used
        at different stages of the structural prediction process.

    4.  ARACNe2 - upgraded to ARACNe2 distribution from Califano lab,
        which adds selectable modes (Preprocessing, Discovery, Complete)
        and a new algorithm (Adaptive Partitioning). Preprocessing allows
        determination of key parameters from actual input dataset.

    5.  caGrid v1.3 - Upgrading of grid services to caGrid v1.3 +
        introduction of caTransfer for large data transfers.

    6.  Component Configuration Manager - allows individual components to
        be loaded into or unloaded from geWorkbench.

    7.  genSpace collaborative framework - discovery and visualization
        of workflows. Implemented user registration and preferences.

    8. SVM 3.0 (GenePattern) - Support Vector machines for classification.

    Other changes in release 1.7.0:

     1. Analysis - Parameter saving implemented in all components. If
        current settings match a saved set, it is highlighted.  
     2. ARACNe - improved description of DPI in Online Help.  
     3. caArray - query filtering on Array Provider, Organism and Investigator
        implemented.  
     4. caArray - can now add a local annotation file to caArray data downloads.  
     5. caGrid - caGrid connectivity is now built directly into supported
        components rather than being a separate component itself.
     6. caScript - The caScript editor is no longer supported.  
     7. Color Mosaic - now interactive with the Marker Sets list and Selection set.  
     8. Cytoscape - Upgrade to Cytoscape version 2.4 for network visualization
        and interaction.
     9. Cytoscape - Set operations on genes being returned from
        Cytoscape network visualizations, via right-click menu.
    10. Cytoscape - Changes to tag-for-visualization - e.g., now only
        one way, from marker set to Cytoscape, not vice-versa.  
    11. Gene Ontology file - the OBO 1.2 file format is supported.  
    12. Marker Annotations - Direct access to the NCI Cancer Gene Index was
        added. It supplies detailed literature-based annotations on a
        curated set of cancer-related genes.  
    13. Marker Annotations - add export to CSV file.  
    14. Marker Sets component - a set copy function was added.  
    15. MINDy - many improvements to display and results filtering - including
        marker set filtering.  
    16. Scatter Plot - Up to 100 overlapping points can be displayed in a single
        tooltip.  
    17. Various - A number of components were refactored.
    18. Workspace saving - now works properly for all components.


    Changes included in geWorkbench 1.6.3

    * geWorkbench 1.6.3 fixes several caArray related issues:
       - connection issue that may cause a time-out on some machines.
       - incorrect caching of caArray query results.
       - duplicate query process removed. 

  
    Changes included in geWorkbench 1.6.2

    * geWorkbench 1.6.2 provides improved proxy communication with its grid
    service dispatcher component (see Mantis bug 1631).
    * A problem was fixed in the server-side grid implementation of
      hierarchical clustering (Mantis bug 1598).


    Changes included in geWorkbench 1.6.1

    * A Java servlet now provides connectivity to the Cellular Networks
      Knowledge Base database through the firewall.
    * Online help for the Sequence Retriever component was added.
    * The GenePix annotation parser was augmented to include more data fields.
    * Added a missing GenSpace component.
    * The GenSpace component was moved from the visual area to the command area.
    * Volcano plot scaling was fixed to display extreme P-values (E-45).
    

    Changes included in geWorkbench 1.6.0

    * Adds the MINDy component.
    * The GO Terms component is not included in this release.  It will
           return in a future release.
    * Fixed a problem (caused by a change in a server-side URL) with
        retrieving annotations for genes in BioCarta pathway diagrams (bug 1577).
    * The default caArray server was set to the production server at NCI
        (array.nci.nih.gov, port 8080) (bug 1602). The URL for the staging
        server was updated to array-stage.nci.nih.gov.
    * An incorrect argument was being sent to NCBI's BLAST server. Due to
        recent changes there implementing stricter checking, blastn would no
        longer run. (bug 1597).
    * Corrected a problem where, when using the adjusted Bonferroni correction,
        or the Westfall-Young with MaxT, only values with positive fold-changes
        were returned and displayed (bug 1603).
    * Added a feature whereby the user is warned before any operation that
        will alter the dataset, e.g. before filtering out markers, or before
        a log2 transformation.
    * Added a feature to allow adding a new empty marker set. This can then
        be used to receive markers selected interactively in Cytoscape (bug 1541).
    * Fixed a problem displaying patterns in the sequence viewer after running
        Pattern Discovery (SPLASH) (bug 1415).
    * Fixed a problem with displaying adjacency matrices generated by ARACNE
        in the Cytoscape component (bug 1449). 


    * Numerous changes were made to improve responsiveness, including when
          - selecting a marker in a large dataset (bug 1346),
          - right-clicking on Project with a large dataset (bug 1337),
          - saving a workspace (bug 1525), and
          - starting an analysis (bug 1544). 
    * Remaining changes, not listed here in detail, included
          - internal issues within geWorkbench,
          - improved verification of parameters and set selections before
            beginning a calculation,
          - improvements to the graphical user interfaces of many components, and
          - corrections to the grid implementations of analytical services
            (Hierarchical Clustering, SOM, ANOVA etc). 


    Changes included in geWorkbench 1.5.1:
        *  It addresses changes in the APIs for the caArray and caBIO
           data services since geWorkbench 1.5 was released.  geWorkbench 1.5.1
           can currently connect with caArray 2.1 and caBIO 4.0/4.1.
        *  It also includes an update to parse the new release 26 of Affymetrix
           annotation files.
        *  Fixes a problem where annotation information was not associated with
           arrays that were merged.


    Changes included in geWorkbench 1.5:

        New Modules:
          * ARACNE - gene network reverse engineering (from Andrea
              Califano's lab at Columbia University, 
              http://wiki.c2b2.columbia.edu/califanolab/index.php/Software). 
          * ANOVA - Analysis of variance (ported from TIGR's MEV,
              http://www.tm4.org/mev.html). 
          * caArray2.0 connectivity - query for and download data from
              caArray 2.0 directly into geWorkbench.
          * Cellular Networks Knowledge Base - database of molecular 
              interactions.  (from Andrea Califano's lab at Columbia University, 
              http://amdec-bioinfo.cu-genome.org/html/BCellInteractome.html).
          * GenSpace - provides social networking capabilities and
              allows you to connect with other geWorkbench users.
          * MatrixReduce - transcription factor binding site prediction
              (from Harmen Bussemaker's lab at Columbia University, 
              http://bussemaker.bio.columbia.edu/software/MatrixREDUCE/).
          * Analysis components ported from GenePattern (http://www.genepattern.org) 
              - Principal Component Analysis (PCA)
              - K-nearest neighbors (KNN)
              - Weighted Voting (WV)

        New File types supported
           * The NCBI GEO series matrix file for microarray data (tab-delimited)

        New server side architecture
           * Invocation of caGrid services is now delegated to an independent 
              component (the Dispatcher). This makes it possible to exit geWorkbench 
              after submitting a long-running job and then automatically pick up any 
              results next time the application starts. 

        Other changes
          * The Marker and Array/Phenotypes components now support algebraic operations 
              (union, intersection, xor) on marker and array groups.
          * Upon exiting the application, the user is prompted to store their workspace.
          * Workspace persistence problems have been resolved.
          * The Marker Annotations component has been enhanced in several ways:
              ** The integration with caBIO has been updated to use API Version 4.0
              ** The caBIO Pathway component (previously an independent geWorkbench 
                    component that would display BioCarta pathway images) has been 
                    integrated into the Marker Annotations component.
              ** Markers can be returned from BioCarta pathway diagrams.
              ** A new option is provided to choose between human or mouse CGAP 
                    annotation pages.


================================================================
    4.0 Known Issues/Defects
================================================================
   
     Affymetrix Annotation files:

        Due to licensing restrictions, Affymetrix annotation files cannot be
        included in this distribution.  geWorkbench users who are working
        with Affymetrix chip data should retrieve the latest version of the
        appropriate annotation file for the chip type they are using directly from

        https://www.affymetrix.com/site/login/login.affx

        A free account at Affymetrix.com is required.

        Current annotation files in CSV format are listed there.
        If you need an annotation file for an older chip type you can use
        its name in the search field on the web page, e.g. "HG_U95Av2".

        An example file from the Affymetrix site is
        "HG_U95Av2.na29.annot.csv.zip".  This file would need to be
        unzipped before use.  You can place the file in any convenient
        directory.  When you load a new data file, you will be asked
        for the location of the annotation file and can browse to it. 
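
        On Linux or MacOSX, unzipping the example file from a terminal
        could look like the sketch below.  It is illustrative only: the
        destination directory ($HOME/affy_annotations) and the helper
        csv_name_of are hypothetical, and the archive is assumed to
        contain a CSV file with the same name minus the ".zip" suffix.

```shell
#!/bin/sh
# Derive the CSV name expected inside an Affymetrix annotation archive,
# assuming the archive is named <file>.csv.zip as in the example above.
csv_name_of() {
    basename "$1" .zip
}

ZIP=${ZIP:-HG_U95Av2.na29.annot.csv.zip}
DEST=${DEST:-$HOME/affy_annotations}    # hypothetical directory; any will do

if [ -f "$ZIP" ]; then
    # -o overwrites an older copy, -d selects the output directory.
    mkdir -p "$DEST" && unzip -o "$ZIP" -d "$DEST"
    echo "Browse to $DEST/$(csv_name_of "$ZIP") when asked for the annotation file."
else
    echo "$ZIP not found; download it from the Affymetrix site first."
fi
```

        The directory chosen here is arbitrary, since geWorkbench asks
        for the annotation file location when a data file is loaded.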


     Grid Computations
        The reference implementations of the server-side grid-enabled algorithms
        currently are running on a single front-end server not meant for
        heavy computational use.  That server is not configured for computing on large
        datasets or for long-running jobs.
 

================================================================
    5.0 Bug Reports and Support
================================================================
    
    Support is provided via online forums at the NCI's Molecular Analysis Tools
    Knowledge Center. 

        See https://cabig-kc.nci.nih.gov/Molecular/forums/

    FAQs and other articles are also available at


        https://cabig-kc.nci.nih.gov/Molecular/KC/index.php/Main_Page#geWorkbench


    Finally, please see the geWorkbench project page for additional known issues and FAQs.

        www.geworkbench.org.

    
================================================================
    6.0 Documentation and Files
================================================================

    
    The documents and support files in this distribution include:

    geWorkbench Release Notes:
        ReleaseNotes_2.0.0.txt (this file)
         
  
    geWorkbench License: 
        geWorkbenchLicense.txt


    Online Help:
        Within geWorkbench, users can access "Help Topics" from the
        top menu.  The help provides detailed information about each module.

       
    For other documentation not directly included as part of the
    distribution, see the following section (7.0) Web Resources.

   
================================================================
    7.0 geWorkbench Web Resources
================================================================
     
 
    The geWorkbench team maintains a Wiki containing extensive documentation,
    a User Manual, tutorials and training slides.  It is available at:
        http://www.geworkbench.org