This first lab was about calculating metrics for spatial
data. One of the first things to get a handle on was the difference between
accuracy and precision: accuracy is how close a measured or observed value is to
the known or “true” value, while precision is how closely clustered the measured
or observed values are to each other. The first part of the lab introduced us to
using ArcMap to determine precision and accuracy. Using a set of 50 points
representing a single location mapped repeatedly with a GPS unit, we determined
the accuracy and precision of the unit. First, we determined the average
location by using the Statistics
tool to obtain an average X and Y location, and created a feature class, using
Editor to place this point on our map. The shapefiles needed to be projected
into a coordinate system that uses meters instead of decimal degrees. Next, we
needed to find a distance from the average location that corresponded to a
percentage of the observations, specifically 50%, 68%, and 95%. To do this, I
first performed a spatial join on the waypoints and average location
shapefiles, which created a distance field. After sorting the distance field in
ascending order, I set my buffers at the distances corresponding to the 25th
value (50%), the 34th value (68%), and the value halfway between the 47th and
48th values (95%). This created buffers at the 50th, 68th, and 95th percentiles,
which show the precision of the results. The map displaying these buffers
is shown below.
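Outside of ArcMap, the same percentile distances can be reproduced with a short script. This is a minimal sketch, not the lab workflow itself: the coordinates below are synthetic stand-ins for the 50 projected (meter-based) waypoints, and the percentile distances play the role of the buffer radii.

```python
import numpy as np

# Synthetic stand-in for the 50 GPS fixes of a single location,
# already projected into a meter-based coordinate system
rng = np.random.default_rng(0)
x = 500000 + rng.normal(0, 3, 50)   # eastings (m)
y = 4500000 + rng.normal(0, 3, 50)  # northings (m)

# Average location (what the Statistics tool reports)
mean_x, mean_y = x.mean(), y.mean()

# Distance of each fix from the average location
# (the field produced by the spatial join)
dist = np.hypot(x - mean_x, y - mean_y)

# Buffer radii that enclose 50%, 68%, and 95% of the fixes
for pct in (50, 68, 95):
    print(f"{pct}th percentile distance: {np.percentile(dist, pct):.2f} m")
```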
Using the 68th percentile as an accepted value
for precision, I compared the device’s accuracy and precision. In the
horizontal direction, the distance between the average location and the
reference point was determined (using the Measure tool) to be 3.8 meters. The precision
value was 4.4 meters. Since the distance between the average and the known
location is less than the precision value, I determined that the unit is
reasonably accurate in the horizontal direction. In the vertical direction, the
distance between the average and the reference value was approximately 6 meters,
while the precision value was 5.7 meters. This tells me that the device is not
as accurate in the vertical direction.
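The comparison I made boils down to a single check per direction: is the offset between the average location and the reference smaller than the 68% precision value? A small sketch using the numbers above:

```python
# Offsets and 68% precision values from Part A (meters)
checks = {
    "horizontal": {"offset": 3.8, "precision_68": 4.4},
    "vertical":   {"offset": 6.0, "precision_68": 5.7},
}

for direction, v in checks.items():
    verdict = "within" if v["offset"] <= v["precision_68"] else "outside"
    print(f"{direction}: offset of {v['offset']} m is {verdict} "
          f"the 68% precision value of {v['precision_68']} m")
```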
In Part B, we worked with a larger dataset, calculated the Root Mean Square
Error (RMSE), and created a cumulative distribution function (CDF) of the error.
After opening the Excel file and copying the benchmark X and Y values over, I
had columns for the measured points (X and Y values) and the benchmark X and Y
values. From these I created new columns and calculated the X and Y errors (the
differences between the point values and the benchmark values), the XY error
squared, and the Error_XY values (the square root of the XY error squared).
From the Error_XY values, I calculated the mean, median, RMSE, and the 68th,
90th, and 95th percentiles, and I also determined the minimum and maximum.
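The spreadsheet calculations translate directly into a few lines of code. This is a sketch with synthetic data standing in for the 200 points and their benchmark coordinates; the variable names are my own, not the column names from the Excel file.

```python
import numpy as np

# Synthetic stand-ins for the 200 measured points and their benchmarks
rng = np.random.default_rng(1)
n = 200
bench_x = 500000 + rng.uniform(-50, 50, n)
bench_y = 4500000 + rng.uniform(-50, 50, n)
pt_x = bench_x + rng.normal(0, 2, n)
pt_y = bench_y + rng.normal(0, 2, n)

# Error columns, as built in the spreadsheet
err_x = pt_x - bench_x
err_y = pt_y - bench_y
err_xy_sq = err_x**2 + err_y**2
err_xy = np.sqrt(err_xy_sq)  # Error_XY

# Summary statistics reported in the lab
print("mean   :", err_xy.mean())
print("median :", np.median(err_xy))
print("RMSE   :", np.sqrt(err_xy_sq.mean()))
print("68/90/95th percentiles:", np.percentile(err_xy, [68, 90, 95]))
print("min/max:", err_xy.min(), err_xy.max())
```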
From the Error_XY values and the cumulative percent (there
are 200 values, so each accounts for 0.5 percent of the total), I created a
scatterplot of Error_XY versus cumulative percent. The chart demonstrates that
several of the metrics, such as the median and the percentile values, can be
read directly from it.
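The CDF chart can be built the same way: sort the Error_XY values, assign each one 0.5 percent of the cumulative total, and plot the pairs. A sketch with placeholder error values, assuming matplotlib is available:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder Error_XY values; in the lab these come from the spreadsheet
rng = np.random.default_rng(1)
err_xy = np.abs(rng.normal(0, 2, 200))

# Sort ascending; with 200 values, each observation is 0.5% of the total
err_sorted = np.sort(err_xy)
cum_pct = np.arange(1, 201) * 0.5

plt.scatter(err_sorted, cum_pct, s=10)
plt.xlabel("Error_XY (m)")
plt.ylabel("Cumulative percent")
plt.title("CDF of Error_XY")
plt.show()
```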
The most difficult part of the assignment was sorting out
the difference between accuracy and precision. The definitions are
straightforward enough, but I had trouble picturing the difference while
performing the calculations. It made sense to say that if the distance between
the average location and the reference point is less than the 68% precision
value, then the data is reasonably accurate, but it's still a little unclear to
me. Hopefully I can work more with the statistical side of GIS in the future and gain a better understanding of it.