DNA Sequencer (aka "Genomatic")

Summary

Develop an FPGA-based design that reads data from an embedded memory and determines the frequency of occurrence of substrings defined in a second memory. The count for each substring is recorded in its own register, which can be selected for readout using switches on the Zybo board.

Technical Details

A set of substrings ("codons") will be defined in a coefficients (COE) file for a 32x4-bit memory (the Codon Memory). Each substring will consist of between one and five nibbles (4-bit "nucleotides"), delimited in memory by a single 'F', and there will be a maximum of six codons defined. Remaining locations will be filled with 'F'.
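As an illustration of the layout described above, the following Python sketch models the Codon Memory as a list of 32 nibbles and splits it into codons at each 'F'. It is only a software reference under stated assumptions, not the FPGA implementation: the function name parse_codons and the list-of-integers representation are hypothetical, and the example memory contents are made up for demonstration.

```python
def parse_codons(codon_mem):
    """Split a 32-entry list of nibbles (0x0-0xF) into codons.

    Assumption: 0xF is used only as a delimiter or as fill, so any run
    of non-F nibbles is treated as one codon.
    """
    if len(codon_mem) != 32:
        raise ValueError("Codon Memory model must hold exactly 32 nibbles")
    codons = []
    current = []
    for nibble in codon_mem:
        if nibble == 0xF:
            # A delimiter closes the codon in progress; repeated F's are fill.
            if current:
                codons.append(current)
                current = []
        else:
            current.append(nibble)
    if current:
        # A codon that runs up to the last location has no trailing delimiter.
        codons.append(current)
    return codons


# Hypothetical contents: codons "12" and "3", the rest filled with F.
example_mem = [0x1, 0x2, 0xF, 0x3, 0xF] + [0xF] * 27
example_codons = parse_codons(example_mem)
```

A model like this can be used to check a COE file against the limits stated above (at most six codons, each one to five nibbles long) before it is loaded into the design.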

A second memory, also initialized by a COE file, will contain the data to be read. This is known as the Gene Memory and will be organized as a 256x4-bit memory.
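For preparing test data, the sketch below generates COE text for either memory from a list of nibbles. It assumes the usual Xilinx COE syntax (a memory_initialization_radix line followed by a memory_initialization_vector list ending in a semicolon); the function name to_coe and the choice to pad unused Gene Memory locations with 'F' are assumptions made here, since the specification leaves those entries undefined.

```python
def to_coe(nibbles, depth, fill=0xF):
    """Return COE file text for a depth x 4-bit memory.

    Assumption: unused locations are padded with `fill` (0xF by default).
    """
    if len(nibbles) > depth:
        raise ValueError("too many entries for the memory depth")
    if any(not 0 <= n <= 0xF for n in nibbles):
        raise ValueError("every entry must fit in 4 bits")
    padded = list(nibbles) + [fill] * (depth - len(nibbles))
    # One hexadecimal digit per memory location, comma separated.
    body = ",\n".join(format(n, "X") for n in padded)
    return (
        "memory_initialization_radix=16;\n"
        "memory_initialization_vector=\n"
        + body
        + ";\n"
    )


# Hypothetical 4-deep example; the Gene Memory would use depth=256
# and the Codon Memory depth=32.
example_coe = to_coe([0x1, 0xA], depth=4)
```

The returned string can be written to a file and supplied to the memory generator in place of a hand-edited COE file.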

Upon reset, the FPGA design will begin accessing the memories and determine the number of occurrences within the gene of each codon defined in the 32x4-bit Codon Memory. The end of the data in the Gene Memory will be indicated by two consecutive nibbles of value 'F'. (Remaining entries are undefined.) Upon completion, the FPGA will light a DONE light, at which point the codon frequencies (i.e., counts) are available through internal Registers 1-6. The contents of a specific register can be read by applying the appropriate register ID using switches two through zero. The status of the DONE signal is available on LED0 through "virtual" register zero.
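A software reference model can be useful for checking the hardware results. The Python sketch below is one possible model, not the required implementation. It makes several assumptions that the text above does not settle: overlapping occurrences are each counted, a register with no codon defined reads as zero, and register ID 7 reads as zero. The function names are hypothetical.

```python
def gene_length(gene_mem):
    """Index of the first 'FF' terminator, or the full length if none is found."""
    for i in range(len(gene_mem) - 1):
        if gene_mem[i] == 0xF and gene_mem[i + 1] == 0xF:
            return i
    return len(gene_mem)


def count_codons(gene_mem, codons):
    """Count occurrences of each codon in the gene data before the terminator.

    Assumption: overlapping matches are counted individually.
    """
    gene = gene_mem[:gene_length(gene_mem)]
    counts = []
    for codon in codons:
        n = len(codon)
        hits = sum(1 for i in range(len(gene) - n + 1) if gene[i:i + n] == codon)
        counts.append(hits)
    return counts


def read_register(switches, counts, done):
    """Model of the readout selected by switches two through zero.

    ID 0 is the "virtual" register holding DONE; IDs 1-6 hold codon counts.
    Assumption: undefined codon slots and ID 7 read as zero.
    """
    if switches == 0:
        return 1 if done else 0
    if 1 <= switches <= 6 and switches - 1 < len(counts):
        return counts[switches - 1]
    return 0


# Hypothetical data: the gene is 1,2,1,2,1 followed by the FF terminator.
example_gene = [0x1, 0x2, 0x1, 0x2, 0x1, 0xF, 0xF, 0x3]
example_counts = count_codons(example_gene, [[1, 2], [1, 2, 1], [3]])
```

In this example the nibble 3 after the terminator is ignored, and the codon 1,2,1 is counted twice because its two occurrences overlap. If the intended behavior is to count only non-overlapping occurrences, the model would need to skip ahead by the codon length after each match.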

Constraints

Sample COE files

The sample files are located here. Enjoy!