README for blast_demo archive
-----------------------------
(last updated 01/23/02)

This archive contains demonstration programs written for Tom Madden's
BLAST programming tutorial at the O'Reilly Bioinformatics conference.
Their main purpose is to demonstrate how programmers might call BLAST
functions from C or C++.  Prior knowledge of C/C++ programming is assumed.
You must compile the NCBI toolkit before compiling these programs, because
the toolkit contains the BLAST libraries and these programs merely call
those libraries.

So far these programs have only been compiled under UNIX/LINUX, though
they should work on all platforms supported by the NCBI toolkit.  The
script and Makefile have only been tested under UNIX.

At some point in the near future these programs may be added to the demo
directory of the NCBI toolkit.


The first step is to compile the NCBI toolkit:
----------------------------------------------

1.) download ncbi.tar.Z or ncbi.tar.gz from the NCBI FTP site at
    ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/

2.) uncompress or gunzip the archive to obtain ncbi.tar, then extract
    the source code with "tar -xvf ncbi.tar".

3.) follow the instructions in ncbi/make/readme.unx.  Make SURE to
    redirect the output to "out.makedis.csh" exactly as directed in
    ncbi/make/readme.unx, as you will need this file to build the
    programs contained in this archive.

4.) write to "toolbox@ncbi.nlm.nih.gov" with questions about compiling
    the NCBI toolkit.


The second step is to compile the BLAST demo programs:
------------------------------------------------------

1.) download blast_demo.tar.gz from the NCBI FTP site at
    ftp://ftp.ncbi.nih.gov/blast/demo/

2.) gunzip blast_demo.tar.gz and place the resulting blast_demo.tar in
    the same directory as ncbi.tar and out.makedis.csh (created when the
    toolkit is built).

3.) execute "tar -xvf blast_demo.tar", which will create a directory
    called "blast_demo".

4.) cd to "blast_demo" and execute

        ./blast_demo.csh

    This will copy an "ncbi.mk" configuration file over from
    ../ncbi/platform and make the binaries.


The third step is to perform a few test runs:
---------------------------------------------

The archive also includes a copy of the BLAST ecoli protein database
(files called "ecoli.p*"), a sample query in FASTA format ("aa.129295"),
and a SeqAnnot file (ASN.1, "seqannot.ecoli.129295") that can be used as
input to blreplay.

1.) blreplay reads the ASN.1 results produced by a BLAST run, together
    with the query used in that run, and produces a BLAST report or XML
    output using local copies of the BLAST databases.  To run blreplay
    with the included query, ASN.1 input, and BLAST database, execute:

        ./blreplay -p blastp -d ecoli -i aa.129295 -O seqannot.ecoli.129295 -o out.ecoli

    The BLAST report will be in the file out.ecoli.  Other output options
    are also available; to see them, execute:

        ./blreplay -

    To produce other ASN.1 input for blreplay, run blastall or blastpgp
    with the -J option (to certify that the query has not been changed if
    you downloaded it from Entrez) and with -O FILENAME, which writes
    ASN.1 to FILENAME.  This ASN.1 and the query can then be used (with
    the appropriate database) to reproduce the BLAST output using
    blreplay.

2.) db2fasta reads an entry from the BLAST database and dumps it as
    FASTA.  To run it, execute:

        ./db2fasta -d ecoli -n "gi|1786183"

3.) doblast is a simple program that performs a BLAST run and
    demonstrates the high-level calls to the BLAST library that are
    needed.  You may run it by executing:

        ./doblast -p blastp -d ecoli -i aa.129295

Other BLAST databases may be retrieved from
ftp://ftp.ncbi.nih.gov/blast/db/FormatedDatabases/

Please refer to ftp://ftp.ncbi.nih.gov/blast/documents/README.bls for
instructions on running BLAST in general.

General questions about BLAST options, databases, etc. should be sent to
blast-help@ncbi.nlm.nih.gov

Specific questions about the three applications in this archive may be
sent to Tom Madden at madden@ncbi.nlm.nih.gov
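

Appendix: checking the directory layout before building
--------------------------------------------------------

The second step above depends on three things being in place: the
out.makedis.csh file written while building the toolkit, the
ncbi/platform directory that blast_demo.csh copies ncbi.mk from, and the
extracted blast_demo directory.  The following is a minimal sketch of a
pre-build check, not part of the archive.  The function name check_layout
is hypothetical, and the sketch assumes that blast_demo.csh sits directly
inside blast_demo/ and that ncbi/ and blast_demo/ share a parent directory
as described above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with blast_demo): verify the layout the
# README describes before running blast_demo.csh.
# $1 is the directory holding ncbi.tar, i.e. the parent of ncbi/ and
# blast_demo/; it defaults to the current directory.
check_layout() {
    base=${1:-.}
    status=0

    # Written by redirecting the toolkit build output, per ncbi/make/readme.unx.
    if [ ! -f "$base/out.makedis.csh" ]; then
        echo "missing: $base/out.makedis.csh" >&2
        status=1
    fi

    # blast_demo.csh copies ncbi.mk from ../ncbi/platform.
    if [ ! -d "$base/ncbi/platform" ]; then
        echo "missing: $base/ncbi/platform" >&2
        status=1
    fi

    # Created by "tar -xvf blast_demo.tar" (assumed location of the script).
    if [ ! -f "$base/blast_demo/blast_demo.csh" ]; then
        echo "missing: $base/blast_demo/blast_demo.csh" >&2
        status=1
    fi

    return $status
}
```

After sourcing the function, one might run "check_layout ." from the
directory that holds ncbi.tar and, if it returns success, cd to
blast_demo and execute ./blast_demo.csh as described in the second step.
Each missing item is reported on standard error, so a failed check shows
which earlier step needs to be repeated.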