
Dependencies And Data Hazards

Dependency

A dependency exists between two instructions when one instruction needs the value of a resource after another instruction has modified it.

[Assume that this is executed without pipelining]
1) add R1 R2 R3
2) add R2 R1 R3

In the code above, instruction 2 needs the value of R1, which instruction 1 modifies, to calculate its result. Without pipelining this isn't a problem, because instruction 1 finishes and updates R1 well before instruction 2 even starts processing.
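To make the idea concrete, here is a minimal sketch of how one might check for this kind of dependency. It assumes the instruction format "op dest src1 src2" (so "add R1 R2 R3" writes R1 and reads R2 and R3); the helper names `parse` and `depends_on` are made up for this illustration.

```python
# Assumed format: "<op> <dest> <src1> <src2>", e.g. "add R1 R2 R3" writes R1.
def parse(instruction):
    """Split an instruction into (destination register, list of source registers)."""
    _op, dest, *sources = instruction.split()
    return dest, sources


def depends_on(later, earlier):
    """True if `later` reads the register that `earlier` writes."""
    written, _ = parse(earlier)
    _, read = parse(later)
    return written in read


program = ["add R1 R2 R3", "add R2 R1 R3"]
print(depends_on(program[1], program[0]))  # True: instruction 2 reads R1
```

Note that the check only says a dependency exists. Whether it causes any trouble depends on how the instructions are executed.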

Example

Think of your assignment. You were waiting for the TAs to upload the new test-cases so that you can finalise the code. If they had uploaded them in May, that would've been great. They would've made the test-cases, and you would've tested your code afterwards.

Think of the test-cases GitHub repository as a resource here. The TAs write into it, and you read from it. This is a read-after-write dependency. A resource in a computer can be many things, but for us the relevant resources are memory locations and registers.

Data Hazards

Hazards occur in pipelined processors when the processor starts executing the next instruction before the current one is finished. If there's a dependency between these instructions, then the value read by the new instruction may be "stale". This risk of reading a stale value is called a read-after-write hazard.

There are also write-after-write and write-after-read hazards, but those are not relevant to us.

A data hazard is a consequence of a dependency. Not all dependencies cause data hazards, but all data hazards are caused by dependencies.

Example

Coming back to our example from earlier, both you and the TAs were working towards the project. This work was going on in parallel. If we (the TAs) finished writing the test-cases before you were done with your code, that's great. No reason to worry, you can just test your code and submit. But if we released them after you were done, there's a problem.

You might end up testing on only the older cases, which only cover the Assembler. This is problematic, because you're relying on stale information (test-cases that are outdated for the main deadline). Surely, you would try to avoid this unintended outcome! But how?

Likewise, if our code ran on a pipelined processor, we would have a problem. When instruction 2 is in the D phase, instruction 1 is in the E phase. There are two more cycles till the stale value in the resource (R1) becomes fresh. We don't want to work with this stale value, so we'll have to do something else to resolve the data hazard.
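The timing above can be sketched in a few lines. This is only an illustration under stated assumptions: a five-stage pipeline with stages F, D, E, M, W (only D and E are named in the text above), one stage per cycle, registers read in D and written in W, no stalls, and a value written in a given cycle counts as fresh for a read in that same cycle. The function names are hypothetical.

```python
# Assumption: five-stage pipeline, one cycle per stage, no stalls.
STAGES = ["F", "D", "E", "M", "W"]


def stage_cycle(fetch_cycle, stage):
    """Cycle in which an instruction fetched in `fetch_cycle` occupies `stage`."""
    return fetch_cycle + STAGES.index(stage)


def cycles_until_fresh(producer_fetch, consumer_fetch):
    """Cycles the consumer's register read (D) comes before the producer's write (W).

    0 means the value is already fresh when the consumer reads it (assumption:
    a write and a read in the same cycle still give the fresh value).
    """
    write_cycle = stage_cycle(producer_fetch, "W")
    read_cycle = stage_cycle(consumer_fetch, "D")
    return max(0, write_cycle - read_cycle)


# Instruction 1 fetched in cycle 1, instruction 2 right behind it in cycle 2.
print(cycles_until_fresh(1, 2))  # 2: R1 is still stale when instruction 2 reads it
# The same dependency, but the consumer is fetched three cycles later.
print(cycles_until_fresh(1, 4))  # 0: dependency, but no hazard
```

The second call shows why not every dependency is a hazard: if the dependent instruction is far enough behind, the value is already fresh by the time it is read.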