Preliminary Demo by 7pm on Friday, February 9th
Final Demo by 7pm on Thursday, February 15th
Writeup due in-class on Friday, February 16th
This lab is to be done individually.
This lab is worth 25 points.
In this lab, you will construct the ALU (Arithmetic/Logical Unit) for a P37X ISA processor. Before you can build the ALU, you need to create a few building blocks (4-bit adder, 16-bit adder, 16-bit multiplier, 16-bit shifter) which you will then combine to form an ALU.
Before you begin, walk through one more tutorial: the ModelSim simulation tutorial. It covers simulating designs to verify that they are correct and debugging them when they are not.
Design and test the following combinational logic structures.
Important
Before you write any Verilog code, first create a hand-drawn schematic diagram of the circuit with all wires and input/outputs labeled. Why? When designing hardware, even when using Verilog, you need to be thinking explicitly about the structure and interconnectedness of the circuits. Only when the diagram is complete should you write the Verilog code that corresponds to the circuit. As described below, you need to turn in both the hand-drawn schematic and a printout of the Verilog code.
Before creating a 16-bit adder, first create a signed 4-bit ripple-carry adder as a basic building block. It has three inputs: two 4-bit signed values and a 1-bit carry-in signal. It has two outputs: the 4-bit sum and a 1-bit carry-out signal. You might want to use the 3-input, 2-output single-bit "full adder" you designed in Lab 0 (or an improved version of it) as a building block.
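When filling in expected values for simulation, a small software reference model can help. The following is a minimal Python sketch of the ripple-carry structure described above, not the Verilog you will turn in. The function names `full_adder` and `add4` are illustrative assumptions, not names required by the lab.

```python
def full_adder(a, b, cin):
    """One-bit full adder built only from XOR, AND, and OR."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def add4(a, b, cin):
    """4-bit ripple-carry adder: returns (4-bit sum, carry-out)."""
    total = 0
    carry = cin
    for i in range(4):
        # The carry out of bit i ripples into bit i + 1.
        bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= bit << i
    return total, carry


# Exhaustive check against ordinary integer addition (512 cases).
for a in range(16):
    for b in range(16):
        for cin in range(2):
            s, cout = add4(a, b, cin)
            assert (cout << 4) | s == a + b + cin
```

Because the input space is only 512 combinations, the same exhaustive sweep is practical in simulation as well.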
Testing: Test the adder both in simulation and on the board. To test the adder on the board, hook your 4-bit adder inputs to two sets of four input switches on the extension board; hook the outputs to five LEDs on the extension board.
The 16-bit adder takes in two 16-bit signed values and a single-bit carry-in signal. It has a single 16-bit signed output.
Implementation: For comparison purposes, create three different adder implementations (using the 4-bit adder specified above):
In the lab writeup, compare the delay (in nanoseconds) and area (in terms of lookup tables or "LUTs") of these three different adder implementations.
See the CSE371 lecture notes for more information on carry-select adders.
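As a rough illustration of how 4-bit blocks can be composed, the Python sketch below models a 16-bit ripple-carry arrangement and a carry-select arrangement. The uniform 4-bit block size for the carry-select version is an assumption made for this example, and `add4`, `nibble`, `add16_ripple`, and `add16_carry_select` are hypothetical names. The model only checks functional equivalence; it says nothing about delay or LUT count.

```python
def add4(a, b, cin):
    """Behavioral stand-in for the 4-bit adder block: (sum, carry-out)."""
    t = (a & 0xF) + (b & 0xF) + cin
    return t & 0xF, t >> 4


def nibble(x, i):
    """Bits [4*i+3 : 4*i] of x."""
    return (x >> (4 * i)) & 0xF


def add16_ripple(a, b, cin):
    """Four 4-bit blocks chained through their carries."""
    result, carry = 0, cin
    for i in range(4):
        s, carry = add4(nibble(a, i), nibble(b, i), carry)
        result |= s << (4 * i)
    return result  # 16-bit output only; the final carry is dropped


def add16_carry_select(a, b, cin):
    """Each upper block is computed for carry-in 0 and 1, then selected."""
    result, carry = add4(nibble(a, 0), nibble(b, 0), cin)
    for i in range(1, 4):
        s0, c0 = add4(nibble(a, i), nibble(b, i), 0)
        s1, c1 = add4(nibble(a, i), nibble(b, i), 1)
        # This selection models the 2-to-1 mux driven by the incoming carry.
        s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << (4 * i)
    return result


assert add16_ripple(0xFFFF, 0x0001, 0) == 0x0000
assert add16_carry_select(0xFFFF, 0x0001, 0) == 0x0000
```

In hardware both candidate sums of a block are available in parallel, so only the mux selection waits on the incoming carry; that is the source of the delay difference you are asked to measure.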
Testing: Test the adder both in simulation and on the board. Unfortunately, the extension boards do not have enough switches to represent two 16-bit inputs. As an incomplete workaround, test the adder on the board by sign-extending each of the two sets of four input switches on the extension board to 16 bits; hook the eight low-order bits of the 16-bit output to the eight LEDs on the extension board. This setup will give you partial test coverage (enough to demonstrate the design is basically working).
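To predict what the LEDs should show for a given switch setting, the workaround can be modeled as follows. This is a sketch only; `sign_extend_4_to_16` and `led_pattern` are hypothetical helper names, and the carry-in is assumed to be tied to 0.

```python
def sign_extend_4_to_16(x):
    """Replicate bit 3 of a 4-bit value into bits 15..4."""
    x &= 0xF
    sign = (x >> 3) & 1
    upper = 0xFFF0 if sign else 0x0000
    return upper | x


def led_pattern(sw_a, sw_b):
    """Low eight bits of the 16-bit sum of the two sign-extended inputs."""
    total = (sign_extend_4_to_16(sw_a) + sign_extend_4_to_16(sw_b)) & 0xFFFF
    return total & 0xFF


# Switches 1111 (-1) and 1000 (-8) sum to -9, which is 0xFFF7 in 16 bits.
assert led_pattern(0b1111, 0b1000) == 0xF7
```

Remember that this value is the logical result; the physical LED polarity depends on the board wiring.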
The 16-bit multiplier takes in two 16-bit signed values. It has a single 16-bit signed output. The multiplier is single-cycle and fully combinational (in contrast, a sequential multiplier takes multiple cycles and latches intermediate values).
Implementation: The most straightforward implementation uses a chain of 15 of the sixteen-bit adders you just created to add up the 16 partial products. You'll also need to use some multiplexers, ranged bit selection, and/or other combinational logic.
Important
Do not use the shifters described below within your multiplier. Shifting by a known value can be done easily and more efficiently using bit selection and concatenation operations.
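The shift-and-add idea can be sketched in Python as a reference model. Here `shift_left_const` mimics a constant shift done by bit selection and concatenation (keep the low 16 - i bits, append i zeros), and the loop stands in for the chain of 15 adders. Both function names are assumptions for illustration. Since only the low 16 bits are kept, the same bit pattern results whether the operands are read as signed or unsigned.

```python
MASK = 0xFFFF


def shift_left_const(a, i):
    """Keep the low 16 - i bits of a and append i zero bits."""
    kept = a & ((1 << (16 - i)) - 1)
    return kept << i


def mul16(a, b):
    """16-bit product (low 16 bits) from 16 partial products."""
    # Bit i of b selects either the shifted copy of a or zero.
    partials = [shift_left_const(a, i) if (b >> i) & 1 else 0
                for i in range(16)]
    acc = partials[0]
    adders_used = 0
    for p in partials[1:]:
        acc = (acc + p) & MASK  # models one 16-bit adder in the chain
        adders_used += 1
    assert adders_used == 15
    return acc


assert mul16(3, 5) == 15
assert mul16(0xFFFF, 0xFFFF) == 0x0001  # (-1) * (-1) = 1
assert mul16(0xFFFE, 0x0003) == 0xFFFA  # (-2) * 3 = -6
```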
For comparison purposes, create three different multiplier implementations:
Note: As you'll be including the 16-bit adder as a structural component, the textual differences between these multipliers should be minor.
In the lab writeup, explain your general multiplier design and compare its delay using these three adders.
Testing: Test the multiplier much like you tested the 16-bit adder.
The shifter unit has three inputs: a 16-bit value, a 4-bit shift amount, and a 2-bit shift type (00 is left shift, 01 is logical right shift, 10 is arithmetic right shift, 11 is no shift). It has a single 16-bit output.
Implementation: There are several ways to implement this shifter. One option is to create three different shifters using 2-to-1 MUXes at each level, and then use a 4-to-1 MUX to select among them at the end. An alternative is to use four copies of the 4-to-1 MUX, one per stage, to select between the three kinds of shift and no shift at all at each stage.
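A staged shifter of the second kind can be modeled in Python as below. Each stage shifts by a fixed power of two (1, 2, 4, 8) and is enabled by one bit of the shift amount. This is a sketch for checking expected values, with `stage` and `shift16` as hypothetical names; it is one way to read the staged design, not the required structure.

```python
MASK = 0xFFFF


def stage(value, k, shift_type):
    """Shift a 16-bit value by the constant k according to shift_type."""
    if shift_type == 0b00:  # left shift
        return (value << k) & MASK
    if shift_type == 0b01:  # logical right shift
        return value >> k
    if shift_type == 0b10:  # arithmetic right shift
        sign = (value >> 15) & 1
        fill = (MASK << (16 - k)) & MASK if sign else 0  # top k bits
        return (value >> k) | fill
    return value  # 11: no shift


def shift16(value, amount, shift_type):
    """Four stages; stage i applies a shift of 2**i if amount bit i is set."""
    value &= MASK
    for bit in range(4):
        if (amount >> bit) & 1:
            value = stage(value, 1 << bit, shift_type)
    return value


assert shift16(0x0001, 4, 0b00) == 0x0010
assert shift16(0x8000, 4, 0b01) == 0x0800
assert shift16(0x8000, 15, 0b10) == 0xFFFF
assert shift16(0x1234, 7, 0b11) == 0x1234
```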
Testing: Test the shifters much like you tested the 16-bit adder. Use an additional two switches to specify the specific shift operation.
The ALU has three inputs: two 16-bit signed values and a 4-bit control signal that determines which operation the ALU should perform. The ALU has a single 16-bit signed output, which is the result of the operation. The ALU can perform ten operations:
Description | Insn | Control |
---|---|---|
Addition | ADD | 0 100 |
Subtraction | SUB | 0 101 |
Multiplication | MUL | 0 110 |
Bitwise or | OR | 1 000 |
Bitwise not | NOT | 1 001 |
Bitwise and | AND | 1 010 |
Bitwise xor | XOR | 1 011 |
Shift left logical | SLL | 1 100 |
Shift right logical | SRL | 1 101 |
Shift right arithmetic | SRA | 1 110 |
A few notes:
Implementation: The ALU should instantiate a single 16-bit adder (also used for subtract), a 16-bit multiplier, and a left/right shifter. Use the outputs from these modules and some combinational logic to generate all ten possible values. Finally, use a 16-to-1 multiplexer to select the correct signal.
Testing: Test the ALU much like you tested the 16-bit adder, but use an additional four switches (the small switches on the main FPGA board) as the 4-bit input select.
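The generate-all-results-then-select structure can be sketched as a Python reference model using the control encodings from the table above. Several details are assumptions made for this example and should be checked against the lab's notes: the shift amount is taken from the low four bits of the second operand, NOT operates on the first operand, and unused control codes return 0. Subtraction is modeled the way a shared adder would do it, as a + ~b with a carry-in of 1.

```python
MASK = 0xFFFF


def to_signed(v):
    """Interpret a 16-bit pattern as a two's-complement integer."""
    return v - 0x10000 if v & 0x8000 else v


def alu(a, b, ctrl):
    a &= MASK
    b &= MASK
    shamt = b & 0xF  # ASSUMPTION: shift amount is the low 4 bits of b
    outputs = [0] * 16  # ASSUMPTION: unused control codes produce 0
    outputs[0b0100] = (a + b) & MASK                # ADD
    outputs[0b0101] = (a + (~b & MASK) + 1) & MASK  # SUB via the adder
    outputs[0b0110] = (a * b) & MASK                # MUL
    outputs[0b1000] = a | b                         # OR
    outputs[0b1001] = ~a & MASK                     # NOT (ASSUMPTION: of a)
    outputs[0b1010] = a & b                         # AND
    outputs[0b1011] = a ^ b                         # XOR
    outputs[0b1100] = (a << shamt) & MASK           # SLL
    outputs[0b1101] = a >> shamt                    # SRL
    outputs[0b1110] = (to_signed(a) >> shamt) & MASK  # SRA
    # Indexing the list models the final 16-to-1 multiplexer.
    return outputs[ctrl & 0xF]


assert alu(5, 3, 0b0101) == 2
assert alu(3, 5, 0b0101) == 0xFFFE  # 3 - 5 = -2
assert alu(0x8000, 3, 0b1110) == 0xF000
```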
This lab should be implemented using only low-level structural Verilog and the assign statement. You are not allowed to use the following Verilog operators: +, -, *, /, <<, >>, etc. However, you are allowed to use the following operators: ~, &, |, ^, ==, !=, ?:, {}, etc. If you're not sure if you're allowed to use a certain Verilog construct, just ask (post a message on the newsgroup, send an e-mail, etc.).
We'll be using an extension board that contains additional LEDs and switches. See lab1.v and lab1.ucf for a top-level Verilog module and mappings for the LED and switch pins.
Note
The switches on the extension boards are "active high", but (as described in Lab 0) the LEDs and the switches on the main board are "active low" signals.
Don't forget to walk through the ModelSim simulation tutorial before you begin.
In an effort to make the testing process slightly less painful, your friendly CSE 372 TAs have put together a testbench framework using behavioral Verilog. The testbench is available here: lab1_testbench.v.
The code is straightforward: the Unit Under Test (UUT) is instantiated at the top. The testbench code basically just reads a series of values from a file (named, by default, lab1.input.test) to send as input to the UUT. Expected output values are also read from this file, and compared with the UUT's actual output.
To make the testbench work, make sure you have both the testbench Verilog module and the test input file. Both can reside in the main directory of your Xilinx project, and they should work fine for both Xilinx and ModelSim.
Currently, the file lab1.input.test is pretty skimpy - you'll need to flesh it out with your own test cases. For example, you might want to make a testbench for each module or at least make input files to stress each part.
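One way to produce many test cases is to generate them with a short script that pairs corner-case operands with computed expected results. The Python sketch below does this for a few operations. The row layout (control, first operand, second operand, expected result, all in hex) is purely hypothetical; the actual format is whatever lab1_testbench.v parses, so adapt the format string before using the output.

```python
MASK = 0xFFFF

# Operand values that tend to expose carry and sign bugs.
CORNERS = [0x0000, 0x0001, 0x7FFF, 0x8000, 0xFFFF]

# Control encodings copied from the ALU table in this handout.
OPS = {
    0b0100: lambda a, b: a + b,  # ADD
    0b0101: lambda a, b: a - b,  # SUB
    0b1000: lambda a, b: a | b,  # OR
    0b1010: lambda a, b: a & b,  # AND
    0b1011: lambda a, b: a ^ b,  # XOR
}


def make_vectors():
    """Return one text row per (operation, a, b) combination."""
    rows = []
    for ctrl, fn in OPS.items():
        for a in CORNERS:
            for b in CORNERS:
                expected = fn(a, b) & MASK
                # HYPOTHETICAL layout: ctrl a b expected
                rows.append(f"{ctrl:X} {a:04X} {b:04X} {expected:04X}")
    return rows


if __name__ == "__main__":
    print("\n".join(make_vectors()))
```

Corner values alone are not exhaustive; adding a batch of random operands per operation is a reasonable complement.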
A couple of features:
Caveats: When a comment is read from the test input file, it causes the ALU control signal to dip to 0x0000 temporarily. This is usually not an issue, as 0x0000 is an invalid control signal, but if you write many lines of comments in succession, this can cause some bizarre timing issues.
The delay and resource usage of your design can be found in various reports:
When reporting timing results, use the "Post Place and Route Timing" information.
For this lab, there is a preliminary and final demo.
For each of the designs, turn in:
Please put the lab analysis first; interleave the schematics with the Verilog code for each module.
Answer the following questions for your lab report. When reporting timing results, use the information from the "Post Place and Route Timing" report.
Note
Because part of your grade is based on your lab writeups, they should be clear, concise, and neat (preferably typed). You could have the greatest design in the world, but if you cannot convey your idea clearly to the graders and convince them that it works, you will not get good marks. Your lab writeups should include a brief explanation of what the circuits are supposed to do and how they do it.