NOTES ON PART 1

Before writing the NK model, you should have noticed a major problem in
constructing the fitness table that reflects the epistatic interactions of
order K. Since each unique combination of neighbors for each locus has a
separate fitness, the size of the fitness table becomes 2^(K+1) entries. The
+1 comes from the locus itself, which is assigned a separate fitness
depending on whether its own value is one or zero.

To be clearer: Let's say K=2 and we decide that the two neighbors of a locus
influence it. The fitness table would then look something like:

    Locus value    Neighbors    Fitness
         0            00          .34
         0            01          .22
         0            10          .52
         0            11          .68
         1            00          .33
         1            01          .72
         1            10          .16
         1            11          .85

The point is that when K becomes pretty big, like 24, we're talking about a
table that holds 2^25 floating point numbers. 2^25 is over 33 million, and
each floating point number is 8 bytes long. 8 x 33.5 million = over 268
million bytes (256 MB) of memory to hold this single table.

The code in nk.awk and nk_landscape.c employs a "jump table" to construct the
random fitnesses of the epistatic interactions. I haven't confirmed that this
method produces adequately random numbers, but the author of this code thinks
so. A description of this code can be found in the file 'readme.santefe.txt'.

----------------------------------------------------------------------------

NOTES ON PART 2

In order to produce the tables in the Kauffman and Weinberger paper, you need
to do several things:

1) Write code that will find the local optima under different conditions and
   record several different measures.

   The different conditions are:
       Changing N and K
       Whether the K sites bearing on a locus are its neighbors or are
       randomly chosen

   The different measures are:
       Walk length, or number of steps, to reach a local optimum
       Fitness of local optima
       How many fitter 1-mutant neighbors there are at each step
       An estimate of the number of optima in varying landscapes

2) Write code that will run the NK model for 100 runs and collect means and
   s.d.'s. You can do all of this within the AWK script.
3) You can use printf(" ... ") > "filename" to output to a specific file. If
   the filename is held in a variable that changes, like output.0, output.1,
   etc., you can use something like:

       filename = sprintf("output.%d",counter);
       printf(" ... ") > filename;

4) Suggestions: Send your output to a tab-delimited ("\t") file and use Excel
   or your favorite statistics package to graph and do regressions or t-tests
   (to compare, say, the behavior of two conditions).

The coding you're left to do will be shorter than what you did for the
garbage can model, but perhaps conceptually more difficult, since you're not
just translating from another language.

----------------------------------------------------------------------------

NOTES ON PART 3

According to Kathleen, this is to be a thought exercise. There is no right
answer in how you determine N, K, and the fitnesses for the garbage can
model. For instance, you can select "number of problems solved" as the
fitness and K to be the number of choices that decision-makers share. Find
something that seems to make sense and document why.

----------------------------------------------------------------------------

Just so you're on the right track...

If you use the print_out code I've included in the function
"find_local_optimum", and run the script using the following command:

    gawk -v print_fitter_count=1 -v show_epistasis=1 -v n=10 -v k=3 -v nt=1 -v seed=1 -f nk.awk

You should get the following output:

    The landscape's epistatic interactions are as follows:
    Locus 0: 8 9 1
    Locus 1: 0 9 2
    Locus 2: 1 3 4
    Locus 3: 1 2 4
    Locus 4: 3 5 6
    Locus 5: 3 4 6
    Locus 6: 5 7 8
    Locus 7: 6 8 9
    Locus 8: 6 7 9
    Locus 9: 8 0 1
    0100001110 0 1.386294 4 0.565916
    0100001100 1 1.386294 4 0.597159
    1100001100 2 1.386294 4 0.646332
    1110001100 3 1.386294 4 0.669840
    1010001100 4 1.098612 3 0.714876
    1010011100 5 0.000000 1 0.788865
    1010011000
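
----------------------------------------------------------------------------

For steps 2) through 4) of the Part 2 notes, a minimal sketch of collecting
means and s.d.'s over 100 runs and writing them to numbered tab-delimited
files might look like the following. The function fake_walk_length is a
hypothetical placeholder and is not part of nk.awk; the use of the sample
s.d. (dividing by runs - 1) and the loop over K = 0, 2, 4 are likewise
assumptions made only for illustration.

```awk
# Sketch only: summarize one measure over repeated runs and write the
# summary to a tab-delimited file, one file per condition.

# Arithmetic mean of x[1..n]. Names after the extra spaces are locals.
function mean(x, n,    i, sum) {
    sum = 0
    for (i = 1; i <= n; i++) sum += x[i]
    return sum / n
}

# Sample standard deviation of x[1..n] (divides by n - 1).
# Returns 0 when there are fewer than two values.
function sd(x, n,    i, m, ss) {
    if (n < 2) return 0
    m = mean(x, n)
    ss = 0
    for (i = 1; i <= n; i++) ss += (x[i] - m) ^ 2
    return sqrt(ss / (n - 1))
}

# HYPOTHETICAL stand-in for a real adaptive walk: returns a random
# integer in 0..n-1. Replace it with the measure your own code records.
function fake_walk_length(n) {
    return int(rand() * n)
}

BEGIN {
    srand(1)          # fixed seed so repeated runs give the same numbers
    runs = 100
    n = 10
    counter = 0
    for (k = 0; k <= 4; k += 2) {
        for (r = 1; r <= runs; r++) len[r] = fake_walk_length(n)

        # One output file per condition: output.0, output.1, ...
        filename = sprintf("output.%d", counter++)
        printf("N\tK\truns\tmean\tsd\n") > filename
        printf("%d\t%d\t%d\t%.4f\t%.4f\n", n, k, runs, mean(len, runs), sd(len, runs)) > filename
        close(filename)
    }
}
```

Saved as, say, stats_sketch.awk (a made-up name), this can be run with
"gawk -f stats_sketch.awk" and the resulting output.0, output.1 and output.2
files opened directly in Excel. Keeping mean and sd as separate functions
lets the same code summarize any of the measures listed under step 1).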