+ - 0:00:00
Notes for current slide
Notes for next slide

Overlay Spatial Objects in R

Soils Map and CDL for South Dakota

Xiaodan Lyu

Joint work with Heike Hofmann, Emily Berg

26 October 2017

1 / 39

Navigation

  1. Background

  2. Web Soil Survey

  3. Cropland Data Layer

  4. Spatial Objects in R

  5. Procedure

  6. Visualization with Shiny

1 / 39

Background

2 / 39

Conservation Effects Assessment Project (CEAP)

3 / 39

CEAP-Cropland

  • To predict the population mean of rainfall-erosion losses from cropland at county level.

  • One of the objectives is to classify the soils and identify cropland soil.

    • Public available data resources: Web Soil Survey and Cropland Data Layer (CDL).

    • Thinking about overlaying soils data with CDL.

4 / 39

Web Soil Survey

5 / 39

SSURGO database

  • Detailed Soil Map.

  • The SSURGO database can be viewed in the Web Soil Survey or downloaded in ESRI shapefile format.

  • The extent of a SSURGO dataset is a soil survey area, which may consists of a single county, multiple counties or parts of multiple counties.

  • There are 67 soil survey areas in South Dakota.

    • 57 survey areas (SD003-SD137) in South Dakota cover a single county;

    • 10 survey areas (SD600-SD603, SD606-SD613) do not.

6 / 39

10 Soil Survey Areas

df %>% group_by(AREASYM) %>%
summarise(n = length(unique(COUNTY)),
COUNTY = toString(unique(COUNTYNM))) %>%
arrange(desc(AREASYM)) %>% head(10)
## # A tibble: 10 × 3
## AREASYM n COUNTY
## <fctr> <int> <chr>
## 1 SD613 1 Shannon
## 2 SD612 3 Jackson, Pennington, Shannon
## 3 SD611 1 Jackson
## 4 SD610 1 Jackson
## 5 SD607 2 Custer, Pennington
## 6 SD606 2 Custer, Pennington
## 7 SD603 2 Brule, Buffalo
## 8 SD602 2 Hanson, Hutchinson
## 9 SD601 1 Meade
## 10 SD600 1 Meade
7 / 39

10 survey areas Map

8 / 39

SD612, Badlands National Park

9 / 39

Map Units

  • The soil map outlines areas called map units.

  • A soil survey area is a group of map units.

  • Labels for a mapunit include AREASYM, MUKEY, MUSYM and MUNAME.

mu %>% dplyr::select(AREASYM, MUKEY, MUSYM, MUNAME) %>% glimpse
## Observations: 7,501
## Variables: 4
## $ AREASYM <fctr> SD011, SD011, SD011, SD011, SD011, SD011, SD011, SD01...
## $ MUKEY <int> 418821, 418822, 418823, 418824, 2765292, 2765288, 2765...
## $ MUSYM <fctr> Wa, Wb, WeA, Wo, Z150A, Z151A, Z152A, Z153A, Z155A, Z...
## $ MUNAME <fctr> Wakonda-Chancellor complex, 0 to 2 percent slopes, Wa...
10 / 39

Mapunit Example Plotted Using R

MUKEY = 417991; MUSYM = "Pr";

MUNAME = "Prosper-Stickney complex, 0 to 2 percent slopes".

11 / 39

Cropland Data Layer

12 / 39

CDL data

  • Resource: USDA-NASS, available free at the CropScape portal.

  • a raster, geo-referenced, crop-specific land cover data layer;

  • available in GeoTIFF(.TIF) file format

    • .tif file: each pixel has an integer value ranging from 0 to 254.

    • .tif.vat.dbf file: information about each value, including CLASS_NAME and COLOR (RED, GREEN, BLUE, OPACITY).

cdl.sd.dbf %>% glimpse
## Observations: 255
## Variables: 6
## $ VALUE <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1...
## $ CLASS_NAME <fctr> Background, Corn, Cotton, Rice, Sorghum, Soybeans,...
## $ RED <dbl> 0.0000000, 1.0000000, 1.0000000, 0.0000000, 1.00000...
## $ GREEN <dbl> 0.0000000, 0.8274510, 0.1490196, 0.6588235, 0.61960...
## $ BLUE <dbl> 0.00000000, 0.00000000, 0.14901961, 0.89803922, 0.0...
## $ OPACITY <dbl> 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
13 / 39
14 / 39

2006 South Dakota CDL color legend

15 / 39

CDL Example Plotted Using R

16 / 39

Spatial Objects in R

17 / 39

Coordinate Reference Systems (CRS)

CRS for spatial objects are similar to units for volume or distance.

## a data frame of all the CRSs
rgdal::make_EPSG() %>% glimpse
## Observations: 5,078
## Variables: 3
## $ code <S3: AsIs> 3819, 3821, 3824, 3889, 3906, 4001, 4002, 4003, 4004...
## $ note <S3: AsIs> # HD1909, # TWD67, # TWD97, # IGRS, # MGI 1901, # Un...
## $ prj4 <S3: AsIs> +proj=longlat +ellps=bessel +towgs84=595.48,121.69,5...
## CRS used by Google Earth
sp::CRS("+init=epsg:4326")
## CRS arguments:
## +init=epsg:4326 +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84
## +towgs84=0,0,0
18 / 39

Coordinate Reference Systems (Cont'd)

When readOGR is used to import ESRI shapefiles, the CRS is automatically linked to the R spatial object.

poly.name <- "SD6XX"
poly <- rgdal::readOGR(dsn = path.to.poly.folder,
layer = poly.name)
## OGR data source with driver: ESRI Shapefile
## Source: "G:/Grads/lyux/mu_by_areasym/SurveyAreaPoly", layer: "SD6XX"
## with 10 features
## It has 2 fields
## To retrieve the CRS for a spatial object
sp::proj4string(poly)
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
## not run
## To assign a known CRS to a spatial data
proj4string(x) <- CRS("+init=epsg:4326")
## To transform from one CRS to another
newData <- spTransform(x, CRS("+init=epsg:4269"))
19 / 39

Raster Objects

cdl.sd <- raster::raster(x = path.to.cdl)
cdl.sd
## class : RasterLayer
## dimensions : 7368, 11016, 81165888 (nrow, ncol, ncell)
## resolution : 56, 56 (x, y)
## extent : -652194, -35298, 2164650, 2577258 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
## data source : C:\Users\lyux\Box Sync\SAEZeros\Data\CDL\CDL_2006_46.tif
## names : CDL_2006_46
## values : 0, 255 (min, max)
## retrieve spatial bounding box
sp::bbox(cdl.sd)
## min max
## s1 -652194 -35298
## s2 2164650 2577258
20 / 39

Raster Objects (Cont'd)

## To crop by row and column number
cdl.sub <- raster::crop(x = cdl.sd,
y = extent(cdl.sd, 1001, 1100, 1001, 1100))
dim(cdl.sub)
## [1] 100 100 1
## Extract Cell Values
getValues(cdl.sub) %>% table ## extract(cdl.sub)
## .
## 29 61 111 121 122 152 176 190 195
## 40 217 8 262 1 8011 1450 1 10
## Transfrom Raster to SpatialPointDataFrame
cdl.point <- raster::rasterToPoints(cdl.sub, spatial = TRUE)
class(cdl.point)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
21 / 39

Spatial Polygons Data Frame

rgdal::write.OGR(obj, dsn, layer, driver = "ESRI Shapefile")

mu.sd@data %>% glimpse ## slot(mu.sd, "data")
## Observations: 9,394
## Variables: 12
## $ OBJECTID <int> 856089, 856090, 856173, 856174, 856175, 856176, 856...
## $ AREASYMBOL <fctr> SD602, SD602, SD602, SD602, SD602, SD602, SD602, S...
## $ SPATIALVER <dbl> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, ...
## $ MUSYM <fctr> CdA, CdA, Bo, Bo, Bo, Bo, Cb, Bo, Cb, Bo, Bo, Bo, ...
## $ MUKEY <fctr> 417943, 417943, 417939, 417939, 417939, 417939, 41...
## $ Shape_Leng <dbl> 748.7136, 567.2321, 252.3721, 2637.9464, 2387.1247,...
## $ Shape_Area <dbl> 38846.432, 16061.870, 4102.656, 119761.515, 87533.6...
## $ NAME <fctr> Hanson, Hanson, Hanson, Hanson, Hanson, Hanson, Ha...
## $ STATE_NAME <fctr> South Dakota, South Dakota, South Dakota, South Da...
## $ STATE_FIPS <fctr> 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46...
## $ CNTY_FIPS <fctr> 061, 061, 061, 061, 061, 061, 061, 061, 061, 061, ...
## $ FIPS <fctr> 46061, 46061, 46061, 46061, 46061, 46061, 46061, 4...
mu <- mu.sd %>% subset(MUKEY == "417938")
22 / 39
23 / 39

Spatial Polygons Data Frame (Cont'd)

class(mu@polygons)
## [1] "list"
mu@polygons[[1]] %>% glimpse
## Formal class 'Polygons' [package "sp"] with 5 slots
## ..@ Polygons :List of 1
## .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
## ..@ plotOrder: int 1
## ..@ labpt : num [1:2] -97.9 43.5
## ..@ ID : chr "208"
## ..@ area : num 2.64e-06
geometry(mu)
## class : SpatialPolygons
## features : 77
## extent : -97.9669, -97.61069, 43.49695, 43.84431 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
24 / 39

Overlaying

x %over% y ## over(x, y)

at the spatial locations of object x retrieves the indexes or attributes from spatial object y.

## require identitical (CRS)
identicalCRS(cdl.point, mu)
## [1] FALSE
cdl.point <- spTransform(cdl.point, CRS = proj4string(mu))
sp::over(mu, cdl.point) %>% glimpse
## Observations: 77
## Variables: 1
## $ CDL_2006_46 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
25 / 39
26 / 39

Overlaying (Cont'd)

index <- !is.na(over(mu, cdl.point))
class(index)
## [1] "matrix"
cdl.point[index[1,], ]
## class : SpatialPointsDataFrame
## features : 0
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## variables : 1
## names : CDL_2006_46
cdl.point[mu, ] ## cdl.point[geometry(mu), ]
## class : SpatialPointsDataFrame
## features : 0
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## variables : 1
## names : CDL_2006_46
27 / 39

Plotting

data.frame(cdl.point) %>% ggplot() +
geom_point(aes(x = x, y = y, color = factor(CDL_2006_46)),
shape = 15) +
coord_equal()
28 / 39

Plotting (Cont'd)

fortify(mu) %>% ggplot() +
geom_path(aes(x = long, y = lat, group = factor(group))) +
coord_equal()
29 / 39

Procedure

30 / 39
31 / 39

Implementation

  1. Export SSURGO soils map in South Dakota into a shapefile.

  2. Download CDL raster file from year 2006 to year 2016 for South Dakota from USDA-NASS.

  3. Iterate first by county then by mapunits

    • Crop the CDL raster to the extent of the bounding box of the target mapunit;

    • Transforms rater file into spatial point object;

    • Overlay the mapunit polygon with the cropped CDL points;

    • Table pixel count by CDL category grouped by each mapunit.

32 / 39

Some tricks

  1. Split the shapefile into separate shapefiles grouped by soil survey areas.

  2. Omit background pixels and pixels with missing values in CDL data.

  3. download the latest county boundary shapefiles from the website of US census Bureau, then overlay them with the shapefiles of the 10 survey areas.

  4. Transform spatial objects to be with the same Coordinating Reference System.

  5. Using parallelized computing to speed up the process. [1]

[1] Package multidplyr is used.

33 / 39

Example overlaid image

overlay

Figure 1: SD602 MUKEY 417991 overlaid with 2006 CDL raster.

34 / 39

Example overlaid image

Hanson MUKEY 417991 overlaid with 2006 CDL raster.

35 / 39

Example CDL pixel counts

hanson.417991 %>%
arrange(desc(PIXEL_COUNT)) %>%
filter(PIXEL_COUNT>=50)
## CODE CLASS_NAME PIXEL_COUNT
## 1 5 Soybeans 12874
## 2 1 Corn 12214
## 3 176 Grass/Pasture 5836
## 4 121 Developed/Open Space 1749
## 5 24 Winter Wheat 1472
## 6 61 Fallow/Idle Cropland 813
## 7 36 Alfalfa 699
## 8 195 Herbaceous Wetlands 550
## 9 23 Spring Wheat 543
## 10 141 Deciduous Forest 196
## 11 28 Oats 50
36 / 39

Visualization with Shiny

37 / 39

Beta Version

available at shinyapps.io

38 / 39

Other Visualization Tools

Existing tools besides CropScape and Web Soil Survey:

  • Interfaces to SoilWeb - California Soil Resource Lab :: SoilWeb Apps

  • R packages provided by project Algorithms for Quantitative Pedology

    • aqp, sharpshootR and soilDB: a collection of algorithms related to the modeling of soil resources, soil classification, soil profile aggregation, and visualization.
39 / 39

Navigation

  1. Background

  2. Web Soil Survey

  3. Cropland Data Layer

  4. Spatial Objects in R

  5. Procedure

  6. Visualization with Shiny

1 / 39
Paused

Help

Keyboard shortcuts

, , Pg Up, k Go to previous slide
, , Pg Dn, Space, j Go to next slide
Home Go to first slide
End Go to last slide
Number + Return Go to specific slide
b / m / f Toggle blackout / mirrored / fullscreen mode
c Clone slideshow
p Toggle presenter mode
t Restart the presentation timer
?, h Toggle this help
Esc Back to slideshow