- Q1: When predicting memory dependencies, what is the cost of
"over-predicting" (falsely predicting a dependence)? What is the cost of
"under-predicting" (failing to predict an actual dependence)?
- Q2: How were the simulation results for the "perfect" memory
dependence predictor generated? Why can't the hardware just use the
same approach, thus achieving perfect memory dependence prediction?
- Q3: Briefly describe the store sets approach to memory
dependence prediction (just a paragraph; a sketch of the core
structures appears after this list).
- Q4: The store sets approach works well for most benchmarks, but
the authors found that it doesn't work well on the benchmark "applu". What
was the root cause of the poor performance on applu? What do they
suggest the compiler might do to help? Although they leave it to
"future work", how might the hardware's dependence predictor be modified
to do better in such cases?