== Using MapReduce-style data analytics frameworks (5pts) ==
5 Report shows evidence of student's ability to run MapReduce jobs and to vary both basic compute (number of slaves) and basic data (replication level) parameters
3 Report shows evidence of student's ability to run MapReduce jobs and to vary either basic compute or basic data parameters
1 Report shows evidence of student's ability to run MapReduce jobs with a default configuration
0 No MapReduce jobs were run

== Effects of cluster size (2pts) ==
2 The effect of cluster size on MapReduce performance is accurately explained and supported by measurement results
1 Measurement results are presented for varying cluster sizes, but the results are not explained accurately or at all
0 No results or explanation of cluster size effects are included

== Effects of data parameters (4pts) ==
4 All three of the following questions are answered with insightful, accurate explanations supported by measurement results: How does the performance change with different block sizes? Does the replication level influence performance? How do different types of MapReduce jobs respond to the same replication level?
3 Two of the above questions are answered with insightful, accurate explanations supported by measurement results OR results are presented for all three of the above questions but the results are not sufficiently explained
2 One of the above questions is answered with an insightful, accurate explanation supported by measurement results OR results are presented for two of the above questions but the results are not sufficiently explained
1 Results are presented for one of the above questions but the results are not sufficiently explained
0 No results or explanation of data parameter effects are included

== Effects of compute parameters (4pts) ==
4 All three of the following questions are answered with insightful, accurate explanations supported by measurement results: How does the number of map tasks influence performance? How does the number of tasks per slave influence performance? How does the number of reduce tasks influence performance?
3 Two of the above questions are answered with insightful, accurate explanations supported by measurement results OR results are presented for all three of the above questions but the results are not sufficiently explained
2 One of the above questions is answered with an insightful, accurate explanation supported by measurement results OR results are presented for two of the above questions but the results are not sufficiently explained
1 Results are presented for one of the above questions but the results are not sufficiently explained
0 No results or explanation of compute parameter effects are included

== Performance estimation (3pts) ==
3 Differences between medium and high-cpu medium instances that may influence MapReduce performance are discussed; values are specified for all compute and data parameters; value choices are thoroughly explained and backed by earlier observations
2 Differences between medium and high-cpu medium instances that may influence MapReduce performance are discussed and values are specified for all compute and data parameters OR values are specified for all compute and data parameters and value choices are thoroughly explained and backed by earlier observations
1 Values are specified for all compute and data parameters
0 No MapReduce jobs are run on a cluster of high-cpu medium instances or parameter choices for this setup are not specified

== Parameter critique (2pts) ==
2 Choice of parameters is critiqued based on observed performance on a cluster of high-cpu medium instances; new values are specified for at least one compute or data parameter
1 New values are specified for at least one compute or data parameter
0 No second round of MapReduce jobs is run on a cluster of high-cpu medium instances or parameter choices for this second round are not specified

== Choice of performance metrics (3pts) ==
3 Several metrics of performance are considered; an appropriate subset of metrics is considered when evaluating specific parameters
2 Several metrics of performance are considered; metrics considered do not vary based on the parameters being evaluated
1 Only one metric of performance is considered for each parameter being evaluated
0 Only one metric of performance is considered throughout the entire study

== Measurement results (2pts) ==
2 All measurement results are presented in a clear format and accurately labeled
1 Some measurement results are not presented in a clear format or accurately labeled
0 Measurement results are hard to interpret