Release Notes

geWorkbench
Version 1.0.5

March 15, 2007

Joint Centers for Systems Biology, Columbia University
New York, NY 10032

================================================================
                            Contents
================================================================

1.0 geWorkbench Introduction and History
2.0 New Features and Updates
3.0 Known Issues/Defects
4.0 Bug Reports and Support
5.0 Documentation and Files
6.0 geWorkbench Web Resources

================================================================
1.0 - geWorkbench Introduction and History
================================================================

geWorkbench 1.0.5, an open source bioinformatics platform written
in Java, makes sophisticated tools for data management, analysis
and visualization available to the community in a convenient
fashion. geWorkbench evolved from a project, caWorkbench, which
was originally sponsored by the National Cancer Institute Center
for Bioinformatics (NCICB). Some of the most fully developed
capabilities of the platform include microarray data analysis,
pathway analysis, sequence analysis, transcription factor binding
site analysis, and pattern discovery.

geWorkbench 1.0.5 is primarily a maintenance and bug-fix release.
All included components have been tested using formal, documented
test procedures. All known problems which could affect data
quality or interpretation, or which could cause program failure,
have been fixed. Any remaining known issues were deemed to be
matters of convenience, layout, or feature requests.

Several components included in previous releases have been
removed from this release until they can be brought up to the
desired level of integration with the rest of geWorkbench. These
include ARACNE and Reverse Engineering.

================================================================
2.0 New Features and Updates
================================================================

New features added in geWorkbench 1.0.5:

No new modules were added in this release. This is a base release
of well-tested modules.

Changes from release 1.0.4:

General

All components using the Apache Axis code for SOAP communications
were revamped so that each component now has its own copy of Axis.

The caCORE code included in geWorkbench was updated to version 3.1
to maintain compatibility with NCI servers. This was necessary for
modules such as the BioCarta Pathways retrieval.

Annotations Panel

caBIO database searches for gene information:

- If the input dataset was associated with an annotations file
  when it was opened, then we retrieve the HUGO gene symbol
  associated with a marker (e.g., for marker 31335_at the HUGO
  symbol is IGF1R) and search caBIO using this gene symbol as a
  query.

- If the input dataset does not have an associated annotation
  file, then we do the caBIO search using the marker name. In this
  case we are restricted, as the only markers for which we will be
  able to retrieve information are the ones in the HU133 chip.

Browser access to CGAP gene annotations:

- In the past, clicking on a gene name hyperlink would directly
  bring up the corresponding CGAP page. Now users are given a
  choice: they are asked which of the supported CGAP organisms
  (human or mouse) they want to retrieve information for. In the
  near future we plan to provide additional options here, such as
  searching Entrez Gene instead of CGAP.

Extract markers/genes from pathways:

- In the past, the only operation available for BioCarta pathways
  was the ability to visualize the pathway image in the caBIO
  Pathways component. Now, two more options are available:

  -- Add pathway genes to set: Selecting this option retrieves the
     HUGO symbols of all genes that comprise the pathway.
     For each such symbol XXX, the application checks whether the
     currently selected microarray set has a marker whose
     associated gene is XXX (this works only if the microarray set
     has been associated with an annotations file). If one or more
     such markers exist, they are placed in a marker set which is
     named after the pathway and added to the Markers panel.

  -- Export genes to CSV: Information about all genes in the
     pathway is exported to a text file. The file contains one row
     per extracted gene, and each row contains two comma-separated
     values: (1) a gene symbol, and (2) the description associated
     with that gene.

Affymetrix Annotation files:

Previous releases have included a copy of an Affymetrix annotation
file for the HG-U95aV2 chip. Due to licensing restrictions, this
file is no longer included. geWorkbench users who are working with
Affymetrix chip data should retrieve the latest version of the
appropriate annotation file for the chip type they are using
directly from:

https://www.affymetrix.com/site/login/login.affx

A free account at Affymetrix.com is required. Current annotation
files in CSV format are listed there. If you need an annotation
file for an older chip, you can use its name in the search field
on the web page, e.g. "HG_U95Av2".

An example file from the Affymetrix site is
"HG_U95Av2.na21.annot.csv.zip". This file must be unzipped before
use. You can place the file in any convenient directory. When you
load a new data file, you will be asked for the location of the
annotation file and can browse to it.

Special Purpose hardware support removed:

The JCSB no longer supports the Paracel BLAST machine or
GeneMatcher 2 hardware. The interfaces to these special purpose
computers have been removed from geWorkbench. BLAST is still
available in geWorkbench via an interface to NCBI. A replacement
interface to run NCBI BLAST on a local JCSB cluster is under
development.
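
For readers who post-process pathway exports outside geWorkbench,
the two pathway options described under "Extract markers/genes
from pathways" can be approximated with the short Python sketch
below. It is an illustration only, not geWorkbench code: the
function names are hypothetical, every marker and gene except
31335_at / IGF1R is made up, and the quoting of descriptions that
contain commas is an assumption, since these notes do not say how
the exported file handles them.

```python
import csv
import io


def markers_for_pathway(pathway_symbols, marker_to_symbol):
    """Return the markers whose annotated gene symbol is in the pathway.

    marker_to_symbol stands in for the annotations file: it maps a
    marker name to its HUGO gene symbol (hypothetical structure).
    """
    wanted = set(pathway_symbols)
    return sorted(m for m, s in marker_to_symbol.items() if s in wanted)


def genes_to_csv(genes):
    """Render (symbol, description) pairs as two-column CSV text.

    Assumption: fields containing a comma are quoted (csv default).
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for symbol, description in genes:
        writer.writerow([symbol, description])
    return buf.getvalue()


# 31335_at -> IGF1R comes from these notes; the rest is made up.
annotations = {
    "31335_at": "IGF1R",
    "marker_b_at": "GENE_B",
    "marker_c_at": "GENE_C",
}
pathway_genes = [
    ("IGF1R", "insulin-like growth factor 1 receptor"),
    ("GENE_B", "made-up description, with a comma"),
]

marker_set = markers_for_pathway([s for s, _ in pathway_genes], annotations)
csv_text = genes_to_csv(pathway_genes)
```

Here marker_set holds the two markers annotated with a pathway
gene, and csv_text holds one row per gene.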

The following components are not included in this distribution,
but were included in 1.0.4:

* ARACNE
* Column Major Format
* Frequency Threshold Filter
* GCRMA Via R CEL Loader
* Genotypic File Format
* Interactions DB
* Network Browser
* Pattern Discovery Algorithm (association analysis)
* Reverse Engineering
* SVM Format
* Synteny

A new, pure-Java version of ARACNE has been developed and will
return to geWorkbench in a future release. The Reverse Engineering
component and associated Network Browser component are also being
redeveloped. Other components may return in a future distribution
as needed.

================================================================
3.0 Known Issues/Defects
================================================================

Pattern Discovery

In the Pattern Discovery module, the hierarchical discovery option
has been disabled. This is due to an unresolved communications
problem with the ARACNE server.

Gene Ontology

The Gene Ontology component uses data files obtained from
www.geneontology.org. These files are in the older GO format,
which is deprecated but still generated on a weekly basis at that
site. The geWorkbench download includes the three data files
process, function and location, current as of the time of this
release. However, geWorkbench at present contains no method of
updating these files.

To update the gene ontology files, download them from:

http://www.geneontology.org/GO.downloads.ontology.shtml

These files must be copied into the GO Terms component class
directory on your machine:

geworkbench_root\components\goterms\classes\org\geworkbench\components\goterms

where geworkbench_root is the directory you downloaded geWorkbench
into.

Due to the large size of these files, we have found that mapping
the Process tree will work best if more than 1 GB of memory is
installed in the computer.
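
The manual Gene Ontology update described above can be scripted.
The Python sketch below is one possible approach, not part of
geWorkbench: the function name install_go_files is hypothetical,
and the caller must supply the paths of the downloaded files,
because these notes do not give their exact file names. Only the
target directory layout is taken from the text above.

```python
import shutil
from pathlib import Path

# Directory layout taken from the release notes, relative to the
# geWorkbench installation root.
GO_SUBDIR = Path(
    "components", "goterms", "classes",
    "org", "geworkbench", "components", "goterms",
)


def install_go_files(downloaded_files, geworkbench_root):
    """Copy downloaded GO data files into the GO Terms class directory.

    Returns the list of destination paths. Raises FileNotFoundError
    if the target directory is missing, which usually means that
    geworkbench_root is not the installation directory.
    """
    target = Path(geworkbench_root) / GO_SUBDIR
    if not target.is_dir():
        raise FileNotFoundError(f"GO Terms directory not found: {target}")
    copied = []
    for src in downloaded_files:
        dest = target / Path(src).name
        shutil.copy2(src, dest)  # overwrites an older copy of the file
        copied.append(dest)
    return copied
```

A call would pass the three downloaded files and the installation
root, for example install_go_files([...], "C:/geworkbench"), where
both the file list and the root path are placeholders. It is
prudent to back up the existing files first, since the copy
overwrites them.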

================================================================
4.0 Bug Reports and Support
================================================================

Support is provided via the geWorkbench project on the NCI's
Gforge site. See the Forums section at:

https://gforge.nci.nih.gov/projects/geworkbench/

================================================================
5.0 Documentation and Files
================================================================

The documents and support files in this distribution include:

geWorkbench Release Notes: ReleaseNotes.txt (this file)
geWorkbench License: geWorkbenchLicense.txt

Online Help:

Within geWorkbench, users can open "Help Topics" from the top
menu. It provides detailed information about each module.

For other documentation not directly included as part of the
distribution, see section 6.0, geWorkbench Web Resources.

================================================================
6.0 geWorkbench Web Resources
================================================================

The geWorkbench team maintains a Wiki containing extensive
documentation, a User Manual, tutorials and training slides. It is
available at:

http://geWorkbench.org

For the majority of newer features, detailed requirements
specification documents are available at the caBIG CVS site:

http://cabigcvs.nci.nih.gov/viewcvs/viewcvs.cgi/caworkbenchcabig