Software and Algorithms

ISSR bands are scored as dominant markers: each locus (a band of a given inferred size in kb) is treated as diallelic, with band presence and band absence scored as its two alleles. To calculate average similarity and measures of genetic distance from ISSR banding patterns, it is appropriate to use algorithms that consider only band matches in the calculations (see Wolfe and Liston 1998 for a discussion of the analysis of RAPD data). An example is the Jaccard coefficient (Sneath and Sokal 1973, p. 131). Unlike allozyme and microsatellite alleles, codominance is not easy to establish for ISSR loci, so the measures of genetic diversity used with codominant markers (e.g., Nei's genetic distance) are inappropriate for these data.

A second consideration is the establishment of homology. Homology can be determined by Southern blot experiments, but few researchers have actually made the effort to establish the homology of RAPD and/or ISSR bands (reviewed in Wolfe and Liston 1998). If a band is present, one can make the basic assumption that the priming sites on either side of the band are, in fact, present. The absence of a band, on the other hand, can result from several phenomena, including: (1) lack of a priming site altogether; (2) mutation in either of the priming sites; (3) structural rearrangement of the chromosome during meiosis; and (4) an insertion or deletion large enough to shift the band size sufficiently for it to be scored as a separate locus.
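To make the band-matching idea concrete, here is a minimal Python sketch of the Jaccard coefficient for two band profiles. The function names and the two toy profiles are invented for illustration; the simple matching function is included only as a contrast, because it also counts shared absences.

```python
def jaccard(x, y):
    """Jaccard similarity for two band profiles written as strings of '1' and '0'.

    Only positions where at least one individual shows a band enter the
    calculation, so shared absences are ignored.
    """
    shared = sum(a == "1" and b == "1" for a, b in zip(x, y))
    either = sum(a == "1" or b == "1" for a, b in zip(x, y))
    return shared / either if either else 0.0


def simple_matching(x, y):
    """Fraction of positions that agree, counting shared absences as matches."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Two hypothetical individuals scored for ten bands.
p = "1100100000"
q = "1000100000"
print(jaccard(p, q))          # 2 shared bands out of 3 scored in either profile
print(simple_matching(p, q))  # 9 of 10 positions agree, mostly shared absences
```

The gap between the two values shows why a coefficient that rewards shared absences can overstate similarity when most bands are absent in most individuals.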


Software for ISSR Analysis:

PAUP 4.01b
Vera S. Ford's Programs
RAPDistance
RAPDFst


PAUP 4.01b

PAUP 4.01b is available from Sinauer Associates. This version of Dave Swofford's phylogenetic analysis program is extremely useful for UPGMA and neighbor-joining cluster analysis of ISSR data. The matrix is formatted as a NEXUS file with band presence coded as 1 and band absence coded as 0. The distance coefficient used for analysis is the Nei and Li coefficient, a band-matching coefficient that is also used by some of the other programs discussed below. An advantage of PAUP 4.01b is that a bootstrap or jackknife analysis can be run to assess the relative support for the tree topology.
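As a rough illustration of the 1/0 NEXUS coding described above, the following Python sketch writes a DATA block from a set of band profiles. The function name and the two toy taxa are assumptions made for this example, and the block uses only generic NEXUS keywords; check the PAUP documentation for the exact settings your analysis needs.

```python
def to_nexus(profiles):
    """Return a NEXUS DATA block for a dict of taxon label -> '0'/'1'/'?' string."""
    nchar = len(next(iter(profiles.values())))
    if any(len(bands) != nchar for bands in profiles.values()):
        raise ValueError("all taxa must have the same number of bands")
    width = max(len(name) for name in profiles)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(profiles)} NCHAR={nchar};",
        '  FORMAT DATATYPE=STANDARD SYMBOLS="01" MISSING=?;',
        "  MATRIX",
    ]
    # One row per taxon: label padded to a common width, then the band string.
    lines += [f"    {name.ljust(width)}  {bands}" for name, bands in profiles.items()]
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"


# Hypothetical two-taxon example with five bands each.
print(to_nexus({"TaxonA": "01101", "TaxonB": "0?100"}))
```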

We have been using PAUP 4.01b recently, but have also used Vera S. Ford's program and RAPDFst for some of our analyses.



Vera S. Ford's Programs

We have been using unpublished software written by Vera S. Ford (UC Davis), which calculates the average similarity between groups where only band matches are used in the calculations according to the following equations:

For the first group of individuals, the average pair-wise similarity is calculated by:

Where S = similarity; i and j = bands that are matched in individuals n1 to nx in pairwise combinations

The average pair-wise similarity for the second group is calculated by:

Where S = similarity; i and j = bands that are matched in individuals m1 to mx in pairwise combinations

The average pair-wise similarity for individuals in the two groups is calculated by:
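As a hedged illustration of these three averages, the Python sketch below assumes that the pairwise similarity is the Nei and Li band-sharing coefficient and that each average is a plain mean over all pairs. Both are assumptions made for this example; Vera Ford's programs may differ in detail. The function names and the toy groups are invented.

```python
from itertools import combinations, product


def nei_li(x, y):
    """Nei and Li band sharing: 2 * shared bands / (bands in x + bands in y)."""
    shared = sum(a == "1" and b == "1" for a, b in zip(x, y))
    total = x.count("1") + y.count("1")
    return 2 * shared / total if total else 0.0


def _mean_similarity(pairs):
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair of individuals")
    return sum(nei_li(a, b) for a, b in pairs) / len(pairs)


def mean_within(group):
    """Average similarity over all pairs of individuals inside one group."""
    return _mean_similarity(combinations(group, 2))


def mean_between(group1, group2):
    """Average similarity over all pairs with one individual from each group."""
    return _mean_similarity(product(group1, group2))


# Hypothetical groups n and m, four bands per individual.
group_n = ["1100", "1010", "1110"]
group_m = ["0011", "0111"]
print(mean_within(group_n), mean_within(group_m), mean_between(group_n, group_m))
```

Missing data (question marks) are not handled here; they simply never count as a band.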

Vera Ford has kindly agreed to make her programs available. You can download the software by FTP using Fetch. Sign on to 140.254.12.151 using "anonymous" as the user ID. Open the "incoming" folder and then the "ISSR" folder. Use the "get file" function of Fetch to transfer the files described below.

There are four programs in this package (!WXDL.EXE, !WXDNL.EXE, AVSIM.EXE, and WAVSIML.EXE). These programs are written for DOS; they read a data matrix saved as an ASCII file called "infile" in the directory where the executable files reside (make a directory called "RAPD"). The programs accept question marks for missing data.

For some interpretations as to what the names of the program mean, here's a translation: in WXDL, the W is for wrapped data, the X for matrix, the D for distance, and the L for large data sets. Vera Ford has offered to rename the programs for easier use as well as make a couple of other programs available. If you decide to download the programs, be sure to check my ISSR website for additional information.

!WXDL, !WXDNL and WAVSIML will analyze data that wrap over more than one line per individual. AVSIM will only analyze one line of data per individual. You can reformat your data matrix to fit on one line to get around this limitation (e.g., type your data matrix in a window in the RAPD directory, choose the font Courier New TT, and shrink the matrix down to size 1 or 2).

The output of each program is written to the RAPD directory under the program's name without the "exe" extension. For example, the output of !WXDL is a file named "wxdl" with no suffix. Every time a program is run, its previous output is replaced. The output records the date and time of the last run.

Program descriptions:

!WXDL produces a lower triangular distance matrix (1-similarity) for input into PHYLIP. The infile data can wrap around more than one line for up to 550 individuals and 400 bands.

!WXDNL produces a lower triangular distance matrix (1-similarity) for input into NTSYS. The infile data can wrap around more than one line for up to 550 individuals and 400 bands.

AVSIM calculates similarities between groups of individuals for up to 180 individuals and 260 bands per group. Output is on the screen and you must write the results by hand onto a datasheet.

WAVSIML calculates similarities between groups of individuals; infile data can wrap around more than one line for up to 550 individuals and 400 bands.

Sample matrix for Vera Ford's programs (a subset of my Hyobanche data matrix):

(Notes: the first number on line 1 is the number of taxa; the second is the number of bands. The next line is the first entry: the taxon label, five spaces, an "X", five more spaces, an "X", and then a hard return. The 1's and 0's are entered without spaces and without hard returns. End the data entry for each taxon with a hard return. Save the matrix in text format for a PC with the "delete soft returns" option. Name the file "infile" to run the programs.)

10 164
G1137 X X
000011101100001000001000000110010100100110011000010100000000100000 101000001100010000110000000001110000100000010110000000000000111000 10000000000011010000110010000000
G1139 X X
000010101100000000001000000000011000000110011000100100000000100000 100000001100010010010000000001100000100000010110000000000010111000 00000000000010010000100010000010
G1152 X X
000001101100001000001000100000010001110000001000110000000100100000 101010000000000100010000000001100000110010000100001001001000111000 00000010000110010010101010000000
R1121 X X
000001001100000010000000001000000000100000000000000000000100000000 100000000000100000010100000001100010110000010100000001000000111000 00000000010100000101100010000000
R1148 X X
000000001100000000000000111000000000001000000000000000000000000000 000010100000001000010000001001100101010000000100000000000000111000 00000000000001000100100010000000
R1155 X X
010001001100010100000000111000000000101000000000000000000100000000 110000000100000001010000001001100000010000010100100001000100111000 00000000000000000001100010000000
S3127 X X
110100101101011100000000010000010010001000100100000011010000011000 100110000000000100011011001000100001101001011100010010000100101010 00000000001000101101000010001001
S3133 X X
000100101101010100001000000000000010000010100000001000010000001000 000101000000000000001001000000100001111000011000000000000000101010 01000000011000100100010001001011
S3156 X X
000000000000000000000000000000000000000000000000000000000000000000 000000000000000000001001000000100001100000000000000000000000101010 00000000000001100001000000001000
S3157 X X
000000001101000000000000000000010101000000100000000001010000000000 000001010000000000001001000001100001100000000000000000000000111010 00100000000000100001000000100101
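For readers who want to check a matrix before running the DOS programs, here is a small Python sketch that reads the layout shown above. The layout is inferred from the sample and the notes (a header line, then a label line followed by the band string for each taxon), and the function name is hypothetical. Spaces or line wraps inside a band string are tolerated and removed.

```python
def parse_infile(text):
    """Parse the sample layout into a dict of taxon label -> band string."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ntax, nbands = (int(tok) for tok in lines[0].split()[:2])
    profiles, i = {}, 1
    while i < len(lines) and len(profiles) < ntax:
        label = lines[i].split()[0]  # drop the trailing "X X" columns
        i += 1
        bands = ""
        # Collect band characters until the announced number is reached.
        while i < len(lines) and len(bands) < nbands:
            bands += "".join(lines[i].split())
            i += 1
        if len(bands) != nbands or set(bands) - set("01?"):
            raise ValueError(f"bad band string for {label}")
        profiles[label] = bands
    if len(profiles) != ntax:
        raise ValueError("fewer taxa than the header announces")
    return profiles


# Hypothetical miniature matrix: 2 taxa, 6 bands, one '?' for missing data.
sample = "2 6\nT1     X     X\n010 110\nT2     X     X\n0?0011\n"
print(parse_infile(sample))
```

Running it on the full matrix and confirming that every taxon has exactly the announced number of bands catches most formatting slips.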




Sample output for !WXDL for Hyobanche data matrix (the entire data set) -- output file name is wxdl and this is then renamed "infile" when transferred to the PHYLIP directory.

33
A1130
A1147 0.478
G1137 0.804 0.759
G1139 0.767 0.720 0.231
G1152 0.755 0.679 0.405 0.447
G1161 0.800 0.789 0.412 0.403 0.422
G1165 0.708 0.709 0.398 0.467 0.333 0.463
G1136 0.755 0.714 0.429 0.395 0.439 0.470 0.383
G1144 0.795 0.783 0.459 0.455 0.444 0.397 0.465 0.500
G2125 0.828 0.754 0.484 0.482 0.473 0.522 0.489 0.429 0.407
G2149 0.782 0.806 0.533 0.537 0.500 0.528 0.471 0.432 0.385 0.155
G2135 0.737 0.733 0.452 0.538 0.493 0.556 0.400 0.493 0.508 0.550 0.506
G3132 0.721 0.720 0.615 0.543 0.605 0.558 0.547 0.553 0.515 0.600 0.537 0.446
G3141 0.787 0.778 0.488 0.514 0.450 0.506 0.443 0.475 0.486 0.416 0.442 0.536 0.459
G3142 0.778 0.824 0.645 0.630 0.700 0.672 0.661 0.667 0.640 0.681 0.667 0.633 0.630 0.621
G3150 0.667 0.622 0.569 0.474 0.524 0.625 0.452 0.492 0.434 0.556 0.507 0.423 0.298 0.475 0.561
R1121 0.722 0.721 0.549 0.556 0.449 0.571 0.441 0.507 0.458 0.487 0.493 0.483 0.492 0.373 0.574 0.360
R1148 0.688 0.692 0.642 0.627 0.569 0.576 0.531 0.538 0.564 0.541 0.521 0.556 0.627 0.524 0.535 0.522 0.462
R1155 0.750 0.745 0.573 0.582 0.507 0.541 0.500 0.534 0.524 0.488 0.494 0.548 0.582 0.408 0.647 0.519 0.333 0.393
R1158 0.783 0.736 0.531 0.562 0.443 0.500 0.487 0.519 0.507 0.523 0.553 0.529 0.562 0.481 0.614 0.567 0.455 0.452 0.400
R2131 0.837 0.786 0.476 0.579 0.561 0.518 0.457 0.512 0.528 0.560 0.545 0.549 0.553 0.450 0.567 0.524 0.507 0.508 0.452 0.443
S1123 0.745 0.593 0.683 0.676 0.725 0.704 0.646 0.600 0.657 0.663 0.674 0.710 0.649 0.590 0.690 0.639 0.642 0.619 0.662 0.558 0.575
S2124 0.755 0.750 0.738 0.711 0.780 0.663 0.679 0.610 0.694 0.648 0.636 0.746 0.526 0.575 0.700 0.619 0.681 0.662 0.671 0.646 0.585 0.325
S2134 0.879 0.800 0.853 0.800 0.818 0.791 0.754 0.788 0.786 0.760 0.722 0.818 0.733 0.719 0.864 0.745 0.736 0.714 0.789 0.778 0.758 0.531 0.485
S2143 0.769 0.763 0.701 0.696 0.694 0.651 0.643 0.600 0.653 0.660 0.648 0.676 0.494 0.518 0.587 0.576 0.611 0.588 0.632 0.585 0.529 0.301 0.176 0.507
S3127 0.833 0.761 0.705 0.701 0.677 0.617 0.674 0.613 0.687 0.608 0.576 0.756 0.609 0.604 0.690 0.676 0.675 0.632 0.571 0.644 0.570 0.538 0.441 0.584 0.458
S3133 0.826 0.811 0.728 0.699 0.772 0.650 0.744 0.696 0.768 0.727 0.718 0.765 0.616 0.688 0.684 0.700 0.697 0.742 0.743 0.684 0.620 0.532 0.418 0.619 0.463 0.378
S3156 0.600 0.630 0.818 0.830 0.849 0.778 0.769 0.774 0.814 0.839 0.831 0.762 0.787 0.804 0.806 0.765 0.750 0.722 0.818 0.800 0.774 0.608 0.585 0.514 0.607 0.656 0.600
S3157 0.714 0.667 0.714 0.710 0.706 0.623 0.701 0.618 0.759 0.688 0.730 0.719 0.645 0.667 0.826 0.633 0.673 0.686 0.729 0.692 0.647 0.576 0.529 0.692 0.549 0.544 0.508 0.487
S3159 0.714 0.755 0.714 0.652 0.707 0.632 0.649 0.627 0.723 0.690 0.654 0.656 0.565 0.589 0.660 0.607 0.613 0.621 0.636 0.639 0.573 0.507 0.387 0.627 0.410 0.419 0.361 0.565 0.508
S3160 0.892 0.864 0.694 0.719 0.800 0.690 0.768 0.743 0.800 0.747 0.763 0.864 0.781 0.676 0.750 0.804 0.719 0.774 0.770 0.731 0.714 0.618 0.543 0.667 0.589 0.630 0.522 0.659 0.571 0.524
S4128 0.733 0.692 0.675 0.639 0.692 0.620 0.662 0.692 0.706 0.724 0.714 0.672 0.556 0.579 0.643 0.593 0.600 0.705 0.623 0.600 0.615 0.500 0.436 0.645 0.481 0.528 0.493 0.633 0.594 0.437 0.485
S5138 0.636 0.517 0.789 0.796 0.818 0.714 0.741 0.782 0.822 0.812 0.803 0.773 0.755 0.774 0.758 0.722 0.714 0.684 0.696 0.769 0.745 0.585 0.527 0.641 0.552 0.667 0.654 0.385 0.561 0.625 0.721 0.569
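The shape of this output (a taxon count, then one row per taxon holding its distances to all earlier taxa) can be reproduced with a short Python sketch. This is not Vera Ford's code: the similarity function is assumed to be the Nei and Li coefficient, the ten-character name padding is a guess at what PHYLIP expects, and all names and toy profiles are invented.

```python
def nei_li(x, y):
    """Nei and Li band sharing: 2 * shared bands / (bands in x + bands in y)."""
    shared = sum(a == "1" and b == "1" for a, b in zip(x, y))
    total = x.count("1") + y.count("1")
    return 2 * shared / total if total else 0.0


def lower_triangle(profiles, similarity=nei_li):
    """Format 1 - similarity as a lower-triangular matrix, one taxon per row."""
    names = list(profiles)
    rows = [str(len(names))]
    for i, name in enumerate(names):
        # Distances from this taxon to every taxon listed before it.
        dists = [1 - similarity(profiles[name], profiles[other]) for other in names[:i]]
        rows.append(" ".join([name.ljust(10)] + [f"{d:.3f}" for d in dists]).rstrip())
    return "\n".join(rows) + "\n"


# Hypothetical three-taxon example with four bands each.
print(lower_triangle({"A": "1100", "B": "1010", "C": "0011"}))
```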

To get a UPGMA tree from PHYLIP, the program Neighbor was used with the UPGMA and jumble options. Drawtree was used to obtain a phenogram.
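For readers unfamiliar with UPGMA, the clustering step can be sketched in a few lines of Python. This is a generic textbook version written for illustration, not the code in PHYLIP's Neighbor program, and it does not reproduce the jumble option. The function name is hypothetical; the toy input is the first four taxa of the distance matrix above.

```python
def upgma(labels, lower):
    """UPGMA on a lower-triangular matrix; lower[i][j] is the distance for j < i.

    Returns the tree as nested tuples plus a list of (left, right, height) merges.
    """
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}  # id -> (subtree, size)
    dist = {frozenset((i, j)): lower[i][j] for i in range(n) for j in range(i)}
    merges, next_id = [], n
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        i, j = sorted(pair)
        height = dist.pop(pair) / 2
        (tree_i, size_i), (tree_j, size_j) = clusters.pop(i), clusters.pop(j)
        for k in clusters:
            # Size-weighted mean keeps every original pair equally weighted.
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            dist[frozenset((next_id, k))] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        merges.append((tree_i, tree_j, height))
        next_id += 1
    return next(iter(clusters.values()))[0], merges


labels = ["A1130", "A1147", "G1137", "G1139"]
lower = [[], [0.478], [0.804, 0.759], [0.767, 0.720, 0.231]]
tree, merges = upgma(labels, lower)
print(tree)
for left, right, height in merges:
    print(left, right, round(height, 4))
```

With only four taxa the result is a toy; it is not the tree from the full 33-taxon analysis.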

[Figure: UPGMA tree obtained from PHYLIP for the Hyobanche study.]



RAPDistance

RAPDistance Version 1.04 for the Analysis of Patterns of RAPD Fragments was written by John Armstrong, Adrian Gibbs, Rod Peakall, and Georg Weiller of the Australian National University, Canberra, Australia. The programs are written in ANSI C for the PC and can be downloaded by FTP from:

ftp://life.anu.edu.au/pub/software/RAPDistance/

If you haven't done FTP from your web browser, just type in the URL as listed above and hit the return button. You'll receive a gopher-style menu. Click on the rapd104.zip link and the zipped file for the programs will be transferred to your computer. To unzip them, you need the application pkunzip. You can get that from the ftp site provided by Bill Black, discussed in the next section.

pkunzip decompresses the zipped files when you type in the command:

pkunzip rapd104

Open the document files to read about the programs and how to use them.

This software package is quite comprehensive. I haven't yet used it, but I downloaded it successfully and have looked through the documentation. It offers a long list (18!) of options for calculating similarities, many of which look like band-matching algorithms, including the Dice and Jaccard coefficients, Kulczynski 1, Kulczynski 2, Pearson's Phi coefficient, Russell and Rao, five coefficients from Sokal and Sneath, Ochiai, Yule and Kendall, Upholt, Li & Graur, Simple Matching, Excoffier, Rogers and Tanimoto, and Hamann. The manual gives a brief overview of the similarities among these coefficients and plenty of references for further reading.
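Several of the coefficients named above are simple functions of four counts: a (band present in both individuals), b and c (band present in only one), and d (band absent in both). The Python sketch below uses the standard textbook definitions of seven of them; it is not taken from the RAPDistance source or manual, and the function names are invented. It assumes each profile contains at least one band and no missing data.

```python
from math import sqrt


def counts(x, y):
    """Return (a, b, c, d) for two band profiles written as '1'/'0' strings."""
    a = sum(p == "1" and q == "1" for p, q in zip(x, y))
    b = sum(p == "1" and q == "0" for p, q in zip(x, y))
    c = sum(p == "0" and q == "1" for p, q in zip(x, y))
    d = sum(p == "0" and q == "0" for p, q in zip(x, y))
    return a, b, c, d


def coefficients(x, y):
    a, b, c, d = counts(x, y)
    n = a + b + c + d
    return {
        # These four ignore shared absences (d) in the numerator.
        "Dice": 2 * a / (2 * a + b + c),
        "Jaccard": a / (a + b + c),
        "Ochiai": a / sqrt((a + b) * (a + c)),
        "Russell and Rao": a / n,
        # These three count shared absences as agreement.
        "Simple Matching": (a + d) / n,
        "Rogers and Tanimoto": (a + d) / (a + d + 2 * (b + c)),
        "Hamann": (a + d - b - c) / n,
    }


# Two hypothetical individuals scored for ten bands.
for name, value in coefficients("1101000000", "1011000000").items():
    print(f"{name}: {value:.3f}")
```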

The program has several options. Options 1-3 deal with data storage, listing, and statistics about the data set; options 4-11 deal with editing the data; and options 12-14 convert the data into formats for other programs. I didn't see anything in the manual about options 15-20; the next set starts with 21 and includes 22, 23, and 31-36, which calculate pairwise distances, produce a neighbor-joining tree, and so on.



RAPDFst


RAPDFst and other programs by Bill Black, Colorado State University, are available by FTP through your web browser at:

ftp://lamar.colostate.edu/pub/wcb4/

Bill Black has provided seven FORTRAN programs for population genetic analysis of RAPD PCR data, including:

RAPDPLOT -- Cluster analysis of RAPD-PCR patterns in individuals using Nei and Li's genetic similarity index.

RAPDBOOT -- A new program developed in 1996 to perform bootstrap analysis on RAPD data from individuals (uses Nei & Li's genetic similarity index).
RAPDDIST -- A new program developed in 1996 to perform bootstrap analysis on genetic distance estimates among populations -- has several options for distance measures.

FINGERS -- A program for genetic fingerprint analysis of individuals with RAPD-PCR markers.

RAPDBIOS -- A program that converts a RAPD-PCR dataset into a DATYPE=3 allele frequency dataset for further analysis by BIOSYS.

RAPDFST -- A program that estimates Fst among populations in which RAPD alleles have been surveyed.

RAPDLD -- A program that estimates linkage disequilibrium among RAPD alleles.
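RAPDBOOT and RAPDDIST in the list above are described as bootstrap programs. As a generic illustration of what bootstrapping a band matrix involves, the Python sketch below resamples band positions (columns) with replacement to build pseudo-replicate data sets. It is not based on Bill Black's FORTRAN code; the function name, the seed parameter, and the toy data are assumptions for this example.

```python
import random


def bootstrap_replicates(profiles, n_reps, seed=0):
    """Yield data sets whose band columns are resampled with replacement."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    nbands = len(next(iter(profiles.values())))
    for _ in range(n_reps):
        # Draw as many column indices as there are bands, allowing repeats.
        cols = [rng.randrange(nbands) for _ in range(nbands)]
        yield {name: "".join(bands[c] for c in cols) for name, bands in profiles.items()}


# Hypothetical data: two individuals, five bands.
data = {"T1": "10110", "T2": "01100"}
for replicate in bootstrap_replicates(data, 3):
    print(replicate)
```

Each replicate would then be run through the same distance and clustering steps as the original matrix, and support for a group is the fraction of replicates in which it appears.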

Details of RAPDBIOS, RAPDFST and RAPDLD appear in:

Apostol, B. L., W. C. Black IV, P. Reiter and B. R. Miller. 1996. Population genetics with RAPD-PCR markers: Aedes aegypti in Puerto Rico. Heredity 76: 325-334.

Also available at this FTP site are PKUNZIP and BIOSYS2.



Please send your suggestions, comments, corrections to wolfe.205@osu.edu
Last updated November 19, 1998.