A journey into mapping

Introduction

mapping makes it easy to display data on maps without handling geographical coordinates directly: the coordinates are provided with the package and are linked to the data automatically. The mapping functions build on the existing, well-implemented functions in the tmap, cartography, and leaflet packages.

Different countries have different geographical structures and, in particular, different statistical units and subdivisions. For each supported area, mapping provides a single function that produces both static and interactive plots of these subdivisions:

Country                     Function
World                       mappingWR()
European Union              mappingEU()
Italy                       mappingIT()
United States of America    mappingUS()

A generic mapping() function is also provided; it is described in its own section below.

The most important step is to link each country, partition, or statistical unit with its coordinates. The package provides dedicated functions that automatically build an object containing both data and coordinates:

Coordinates                 Function    Object class
World                       WR()        WR
European Union              EU()        EU
Italy                       IT()        IT
United States of America    US()        US

The data can be linked either by first building the object with its dedicated function, or by passing the data directly to the mapping function, which links the input data and then maps it.



The CRAN version can be loaded as follows:

library('mapping')

or the development version can be installed from GitHub:

remotes::install_github('serafinialessio/mapping')

The population data sets available in the package are used to illustrate its features:

data("popWR")
str(popWR)
## 'data.frame':    269 obs. of  5 variables:
##  $ country     : Factor w/ 265 levels "","Afghanistan",..: 2 3 4 5 6 7 8 10 11 12 ...
##  $ country_code: Factor w/ 265 levels "","ABW","AFG",..: 3 5 60 11 6 4 12 9 10 2 ...
##  $ total       : num  37172386 2866376 42228429 55465 77006 ...
##  $ male        : num  19093281 1460043 21332000 NA NA ...
##  $ female      : num  18079105 1406333 20896429 NA NA ...
data("popEU")
str(popEU)
## 'data.frame':    2252 obs. of  5 variables:
##  $ TIME  : num  2019 2019 2019 2019 2019 ...
##  $ GEO   : chr  "BE" "BE1" "BE10" "BE100" ...
##  $ total : num  11455519 1215290 1215290 1215290 6596233 ...
##  $ male  : num  5644826 597008 597008 597008 3265134 ...
##  $ female: num  5810693 618282 618282 618282 3331099 ...
data("popIT")
str(popIT)
## 'data.frame':    107 obs. of  4 variables:
##  $ ID     : chr  "Roma" "Milano" "Napoli" "Torino" ...
##  $ maschi : num  2081239 1576316 1497289 1092504 624201 ...
##  $ femmine: num  2260973 1673999 1587601 1167019 641753 ...
##  $ totale : num  4342212 3250315 3084890 2259523 1265954 ...
data("popUS")
str(popUS)
## 'data.frame':    52 obs. of  2 variables:
##  $ id        : chr  "Maine" "North Carolina" "Georgia" "Alaska" ...
##  $ population: int  1338404 10383620 10519475 737438 4887871 626299 3034392 1805832 3943079 5813568 ...

Load coordinates and check names

Coordinates can be downloaded separately using these functions:

Coordinates                 Function
World                       loadCoordWR()
European Union              loadCoordEU()
Italy                       loadCoordIT()
United States of America    loadCoordUS()

Coordinates are downloaded from the GitHub repository, which provides .geojson and .RData files with the coordinates. The load functions return an object of class sf.

coord_eu <- loadCoordEU(unit = "nuts0")

The unit argument of the load functions indicates the type of statistical unit, geographical subdivision, or level of aggregation, which is specific to each country. For example, the EU has several statistical units; here we request the coordinates for “nuts0”, i.e. European countries.

library(tmap)
tm_shape(coord_eu) + tm_borders()


library(mapview)
mapview(coord_eu)

Because the result is an object of class sf, we can also use the mapping functions available in other R packages.

Note that the data are downloaded from an online repository, so an internet connection is preferable. If the use_internet argument is set to FALSE, the coordinates stored locally in the package are used instead.

The names supplied by the user must match the names available in the package for the coordinates to be linked. The checkNames functions return the names that do not match:
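As a minimal sketch of what working with an sf object outside this package can look like, the example below uses the nc shapefile shipped with the sf package as a stand-in for the coordinates returned by the load functions. The choice of data set and of the target CRS (EPSG:4326) are assumptions made for illustration only.

```r
library(sf)

# Example sf object shipped with the sf package (not part of mapping)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Inspect the coordinate reference system and the overall geometry type
nc_crs <- st_crs(nc)
geom_type <- st_geometry_type(nc, by_geometry = FALSE)

# Reproject to WGS84 (EPSG:4326), a common choice for web maps
nc_wgs84 <- st_transform(nc, 4326)

# Draw only the borders with the base plot method for sf geometries
plot(st_geometry(nc_wgs84))
```

The same calls should apply to any object of class sf, including one returned by a loadCoord function.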

checkNamesIT(popIT$ID, unit = "provincia")
## [1] "reggio di calabria"             "bolzano / bozen"               
## [3] "valle d'aosta / vallée d'aoste"
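One way to deal with such mismatches is to recode the names in the data before linking. The base R sketch below illustrates the idea on toy vectors: reference and user_names are hypothetical stand-ins, and the comparison shown here is not necessarily how the checkNames functions work internally.

```r
# Hypothetical stand-in for the names stored in the package
reference <- c("roma", "milano", "reggio di calabria", "bolzano / bozen")

# Hypothetical user-supplied names
user_names <- c("Roma", "Milano", "Reggio Calabria", "Bolzano")

# Case-insensitive comparison: which user names have no counterpart?
unmatched <- user_names[!tolower(user_names) %in% reference]
unmatched

# Recode the unmatched names with an explicit lookup table
lookup <- c("Reggio Calabria" = "reggio di calabria",
            "Bolzano"         = "bolzano / bozen")
fixed <- tolower(user_names)
hit <- user_names %in% names(lookup)
fixed[hit] <- unname(lookup[user_names[hit]])

# After recoding, every name has a counterpart in the reference
all(fixed %in% reference)
```

An explicit lookup table keeps the recoding auditable, which helps when official names change between years.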

The getNames functions return the names used in the package for each unit.

getNamesEU(unit = "nuts0")
##                                                 country iso2 iso3 country_code
## 1                                               Germany   DE  DEU          276
## 2                                               Czechia   CZ  CZE          203
## 3                                              Bulgaria   BG  BGR          100
## 4                                           Switzerland   CH  CHE          756
## 5                                               Albania   AL  ALB            8
## 6                                               Austria   AT  AUT           40
## 7                                                Cyprus   CY  CYP          196
## 8                                                Greece   GR  GRC          300
## 9                                               Belgium   BE  BEL           56
## 10                                               France   FR  FRA          250
## 11                                              Denmark   DK  DNK          208
## 12                                              Estonia   EE  EST          233
## 13                                                Spain   ES  ESP          724
## 14                                              Finland   FI  FIN          246
## 15                                               Norway   NO  NOR          578
## 16                                               Sweden   SE  SWE          752
## 17                                             Slovenia   SI  SVN          705
## 18                                          Netherlands   NL  NLD          528
## 19                                                Italy   IT  ITA          380
## 20                                            Lithuania   LT  LTU          440
## 21                                           Luxembourg   LU  LUX          442
## 22                                               Latvia   LV  LVA          428
## 23                                           Montenegro   ME  MNE          499
## 24                                      North Macedonia   MK  MKD          807
## 25                                                Malta   MT  MLT          470
## 26                                              Romania   RO  ROU          642
## 27                                               Serbia   RS  SRB          688
## 28                                              Croatia   HR  HRV          191
## 29                                             Slovakia   SK  SVK          703
## 30                                        Liechtenstein   LI  LIE          438
## 31                                             Portugal   PT  PRT          620
## 32                                              Hungary   HU  HUN          348
## 33                                              Ireland   IE  IRL          372
## 34                                              Iceland   IS  ISL          352
## 35 United Kingdom of Great Britain and Northern Ireland   GB  GBR          826
## 36                                               Poland   PL  POL          616
## 37                                               Turkey   TR  TUR          792
##    nuts0_id                                                nuts0
## 1        DE                                              Germany
## 2        CZ                                              Czechia
## 3        BG                                             Bulgaria
## 4        CH                                          Switzerland
## 5        AL                                              Albania
## 6        AT                                              Austria
## 7        CY                                               Cyprus
## 8        EL                                               Greece
## 9        BE                                              Belgium
## 10       FR                                               France
## 11       DK                                              Denmark
## 12       EE                                              Estonia
## 13       ES                                                Spain
## 14       FI                                              Finland
## 15       NO                                               Norway
## 16       SE                                               Sweden
## 17       SI                                             Slovenia
## 18       NL                                          Netherlands
## 19       IT                                                Italy
## 20       LT                                            Lithuania
## 21       LU                                           Luxembourg
## 22       LV                                               Latvia
## 23       ME                                           Montenegro
## 24       MK                                      North Macedonia
## 25       MT                                                Malta
## 26       RO                                              Romania
## 27       RS                                               Serbia
## 28       HR                                              Croatia
## 29       SK                                             Slovakia
## 30       LI                                        Liechtenstein
## 31       PT                                             Portugal
## 32       HU                                              Hungary
## 33       IE                                              Ireland
## 34       IS                                              Iceland
## 35       UK United Kingdom of Great Britain and Northern Ireland
## 36       PL                                               Poland
## 37       TR                                               Turkey
getNamesEU(unit = "nuts0", all_levels = FALSE)
##  [1] Germany                                             
##  [2] Czechia                                             
##  [3] Bulgaria                                            
##  [4] Switzerland                                         
##  [5] Albania                                             
##  [6] Austria                                             
##  [7] Cyprus                                              
##  [8] Greece                                              
##  [9] Belgium                                             
## [10] France                                              
## [11] Denmark                                             
## [12] Estonia                                             
## [13] Spain                                               
## [14] Finland                                             
## [15] Norway                                              
## [16] Sweden                                              
## [17] Slovenia                                            
## [18] Netherlands                                         
## [19] Italy                                               
## [20] Lithuania                                           
## [21] Luxembourg                                          
## [22] Latvia                                              
## [23] Montenegro                                          
## [24] North Macedonia                                     
## [25] Malta                                               
## [26] Romania                                             
## [27] Serbia                                              
## [28] Croatia                                             
## [29] Slovakia                                            
## [30] Liechtenstein                                       
## [31] Portugal                                            
## [32] Hungary                                             
## [33] Ireland                                             
## [34] Iceland                                             
## [35] United Kingdom of Great Britain and Northern Ireland
## [36] Poland                                              
## [37] Turkey                                              
## 37 Levels: Albania Austria Belgium Bulgaria Croatia Cyprus Czechia ... United Kingdom of Great Britain and Northern Ireland

Building a mapping object

Before mapping our data, we have to link the ids with the coordinates. The mapping package does this automatically through dedicated functions:

Coordinates                 Function    Object class
World                       WR()        WR
European Union              EU()        EU
Italy                       IT()        IT
United States of America    US()        US

These functions build the data set with data and coordinates and, through specific arguments, allow us to manipulate the data before mapping.

The popIT data, as shown in the previous section, do not contain any information about the geographical geometries:

str(popIT)
## 'data.frame':    107 obs. of  4 variables:
##  $ ID     : chr  "Roma" "Milano" "Napoli" "Torino" ...
##  $ maschi : num  2081239 1576316 1497289 1092504 624201 ...
##  $ femmine: num  2260973 1673999 1587601 1167019 641753 ...
##  $ totale : num  4342212 3250315 3084890 2259523 1265954 ...

The coordinates are then added as follows:

it <- IT(data = popIT, unit = "provincia", year = "2018",colID = "ID")

We have to specify the type of statistical unit, the column containing the ids and, if necessary, the year of the subdivision. The function automatically downloads the coordinates and links them to the data. In this example, the data describe the population of the Italian provinces in 2018.

library(tmap)
tm_shape(it) + tm_borders() + tm_fill("totale")

Some observations are missing because the names in the data differ from the names in the package, as shown earlier.

Each unit belongs to a level of aggregation or division. We can think of these levels as a hierarchy: starting from a subdivision, we know all the larger aggregations that contain it and can therefore build the larger geographical objects.

As an example, consider a hierarchy that runs from the largest level, level0, to the smallest, level4. Given a level3 unit, we know all the larger units that contain it, up to level0.

When linking the data and the coordinates, the functions in this package also return information about the larger units. In the Italian case, for example, the it object has columns indicating the ripartizione and the regione, which are larger aggregates than the provincia. The hierarchy is built by all the functions in this section and by the loading functions as well.

str(it,1)
## Classes 'sf', 'IT', 'IT' and 'data.frame':   107 obs. of  11 variables:
##  $ ripartizione     : chr  "Nord-ovest" "Nord-ovest" "Nord-ovest" "Nord-ovest" ...
##  $ regione          : chr  "Piemonte" "Piemonte" "Piemonte" "Piemonte" ...
##  $ code_ripartizione: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ code_regione     : int  1 1 1 1 1 1 2 7 7 7 ...
##  $ code_provincia   : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ ID               : chr  "torino" "vercelli" "novara" "cuneo" ...
##  $ code             : chr  "TO" "VC" "NO" "CN" ...
##  $ maschi           : num  1092504 82848 179588 289459 105011 ...
##  $ femmine          : num  1167019 88063 189430 297639 109627 ...
##  $ totale           : num  2259523 170911 369018 587098 214638 ...
##  $ geometry         :sfc_MULTIPOLYGON of length 107; first list element: List of 1
##   ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA
##   ..- attr(*, "names")= chr [1:10] "ripartizione" "regione" "code_ripartizione" "code_regione" ...
##  - attr(*, "unit")= chr "provincia"
##  - attr(*, "year")= chr "2018"
##  - attr(*, "colID")= chr "ID"

Data can be subset before mapping:

it <- IT(data = popIT, unit = "provincia", 
         year = "2018",colID = "ID",
         subset = ~ I(regione == "Lazio"))

In this case, we use the hierarchy to retrieve only the data for the “Lazio” region.

library(tmap)
tm_shape(it) + tm_borders()
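To make the formula interface of the subset argument more concrete, the following base R sketch shows one way a one-sided formula can be evaluated against a data frame. It is an illustration under stated assumptions (toy data, hypothetical object names), not the package's actual implementation.

```r
# Toy data: two provinces in Lazio and one in Piemonte
df <- data.frame(
  ID      = c("roma", "viterbo", "torino"),
  regione = c("Lazio", "Lazio", "Piemonte"),
  stringsAsFactors = FALSE
)

# A one-sided formula stores its condition unevaluated
cond <- ~ I(regione == "Lazio")

# Evaluate the right-hand side with the data frame columns in scope;
# unclass() drops the "AsIs" class that I() adds
keep <- unclass(eval(cond[[2]], envir = df, enclos = environment(cond)))

# Keep only the rows for which the condition holds
lazio <- df[keep, ]
lazio$ID
```

Because the condition is stored unevaluated, it can refer to columns such as regione that exist only after the hierarchy has been attached to the data.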

Suppose now that we want the proportion of males and females, but the data contain only the counts:

it <- IT(data = popIT, unit = "provincia", 
         year = "2018",colID = "ID",
         add = ~I(maschi/totale) + I(femmine/totale), 
         new_var_names = c("Male percentage", "Female percentage"),
         print = FALSE)

str(it,1)
## Classes 'sf', 'IT', 'IT' and 'data.frame':   107 obs. of  13 variables:
##  $ ripartizione     : chr  "Nord-ovest" "Nord-ovest" "Nord-ovest" "Nord-ovest" ...
##  $ regione          : chr  "Piemonte" "Piemonte" "Piemonte" "Piemonte" ...
##  $ code_ripartizione: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ code_regione     : int  1 1 1 1 1 1 2 7 7 7 ...
##  $ code_provincia   : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ ID               : chr  "torino" "vercelli" "novara" "cuneo" ...
##  $ code             : chr  "TO" "VC" "NO" "CN" ...
##  $ maschi           : num  1092504 82848 179588 289459 105011 ...
##  $ femmine          : num  1167019 88063 189430 297639 109627 ...
##  $ totale           : num  2259523 170911 369018 587098 214638 ...
##  $ Male.percentage  : 'AsIs' num  0.483510.... 0.484743.... 0.486664.... 0.493033.... 0.489247.... ...
##  $ Female.percentage: 'AsIs' num  0.516489.... 0.515256.... 0.513335.... 0.506966.... 0.510752.... ...
##  $ geometry         :sfc_MULTIPOLYGON of length 107; first list element: List of 1
##   ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr [1:12] "ripartizione" "regione" "code_ripartizione" "code_regione" ...
##  - attr(*, "unit")= chr "provincia"
##  - attr(*, "year")= chr "2018"
##  - attr(*, "colID")= chr "ID"

We now have the proportions, stored as Male.percentage and Female.percentage, the names we chose for the new variables.

Note that this object can also be built directly inside the mapping functions, but then we do not obtain a single, already manipulated object that can be reused across different mapping functions.

Country ids or units may be named in different ways. For example, we may have an iso2 code instead of a formal name. The matchWith argument indicates the type of name to link on.
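The new columns hold values between 0 and 1. As a small base R sketch, the same quantities can be computed by hand for the first two provinces in the output above and rescaled to percentages; the column names male_share, female_share, and male_pct are hypothetical.

```r
# First two provinces taken from the str(it) output above
pop <- data.frame(
  ID      = c("torino", "vercelli"),
  maschi  = c(1092504, 82848),
  femmine = c(1167019, 88063),
  totale  = c(2259523, 170911)
)

# Shares of the total population (values between 0 and 1)
pop$male_share   <- pop$maschi / pop$totale
pop$female_share <- pop$femmine / pop$totale

# Multiply by 100 to express the male share as a percentage
pop$male_pct <- round(100 * pop$male_share, 2)
pop
```

Since maschi and femmine add up to totale in these rows, the two shares sum to one.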

eu <- EU(data = popEU, unit = "nuts0", colID = "GEO", 
         matchWith = "id", check.unit.names = FALSE)

The popEU data identify the nuts units by their ids, which is what matchWith specifies.
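The choice of key matters because the same country can carry different codes under different conventions: in the getNamesEU() output above, Greece is GR as iso2 but EL as nuts0_id. The base R sketch below illustrates the effect with merge() on a small reference table copied from that output; the user data frame and its totals are hypothetical, and this is not a description of how matchWith is implemented.

```r
# Reference codes copied from the getNamesEU(unit = "nuts0") output
ref <- data.frame(
  country  = c("Germany", "Greece", "Belgium"),
  iso2     = c("DE", "GR", "BE"),
  nuts0_id = c("DE", "EL", "BE"),
  stringsAsFactors = FALSE
)

# Hypothetical user data whose codes follow the nuts0_id convention
user <- data.frame(
  GEO   = c("DE", "EL", "BE"),
  total = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Joining on iso2 silently drops Greece; joining on nuts0_id keeps it
by_iso2 <- merge(user, ref, by.x = "GEO", by.y = "iso2")
by_id   <- merge(user, ref, by.x = "GEO", by.y = "nuts0_id")

nrow(by_iso2)
nrow(by_id)
```

Comparing the number of rows before and after a join is a quick way to detect a key that follows the wrong convention.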

Static maps

We start with a map of the European Union countries:

mappingEU(data = coord_eu)

In this case, we do not provide any data to map. Suppose now that we want to look at the distribution of the population among European countries:

eu <- EU(data = popEU, unit = "nuts0", colID = "GEO", 
         matchWith = "iso2", check.unit.names = FALSE)
mappingEU(eu, var = "total")

Here matchWith is set to “iso2” because the ids in popEU are expressed as iso2 codes rather than iso3 codes or country names.

Equivalently, we can call the mapping function directly, without first building an EU object:

mappingEU(data = popEU,unit = "nuts0", colID = "GEO", matchWith = "iso2", var = "total")

The mapping function also provides arguments to manipulate the data before mapping.

The loadCoord functions, as explained in the previous section, automatically return all the larger aggregations of the statistical unit. We can use them in the mapping functions through the aggregation_unit argument: the values of the variables are aggregated (here, summed) within each aggregation unit.

eu <- EU(data = popEU, unit = "nuts1", colID = "GEO", 
         matchWith = "id", check.unit.names = FALSE)
mappingEU(eu, var = "total")


mappingEU(eu, var = "total", aggregation_unit = "nuts0", aggregation_fun = sum)
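Conceptually, this aggregation is a grouped summary over the hierarchy columns. A minimal base R sketch of the same idea, using aggregate() on hypothetical nuts1-level totals (the codes and values below are made up for illustration):

```r
# Hypothetical nuts1-level totals, each row tagged with its parent country
nuts1 <- data.frame(
  nuts0 = c("AA", "AA", "AA", "BB", "BB"),
  nuts1 = c("AA1", "AA2", "AA3", "BB1", "BB2"),
  total = c(10, 20, 30, 5, 15),
  stringsAsFactors = FALSE
)

# Sum the smaller units within each larger unit
by_country <- aggregate(total ~ nuts0, data = nuts1, FUN = sum)

# Any other summary function can be swapped in, for example the mean
by_country_mean <- aggregate(total ~ nuts0, data = nuts1, FUN = mean)

by_country
```

Summing suits counts such as population, while a mean or a weighted mean is usually more appropriate for rates and shares.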

We can also provide multiple variables to generate multiple maps:

mappingEU(eu, var = c("male","female"))

or, if we are interested only in a specific subset of the data, we can apply a subset statement before mapping:

mappingEU(eu, var = "total", 
          subset = ~I(country == "Spain" | country == "Italy"))

Let’s look at a USA example.

us <- US(data = popUS, unit = "state", matchWith = "name")
mappingUS(us)
mappingUS(us, var = "population")

mappingUS(us, var = "population", options = mapping.options(nclass = 10, legend.portrait = FALSE))


mappingUS(us, var = "population", add_text = "state_id",
          options = mapping.options(nclass = 10, legend.portrait = FALSE))

The facets argument returns small multiples; in this case, it maps all the divisions.

mappingUS(us, var = "population", aggregation_unit = "division", facets = "division")

If we are interested only in the Northeast region, we can apply a subset statement before mapping:

mappingUS(us, var = "population", subset = ~I(region == "Northeast"), facets = "id")

Interactive maps

Interactive maps are produced by the same functions as static maps, with the same arguments, by setting type = "interactive".

eu <- EU(data = popEU, unit = "nuts0", colID = "GEO", 
         matchWith = "id", check.unit.names = FALSE)
mappingEU(eu, var = "total", type = "interactive")
mappingEU(eu, var = "total", type = "interactive",
                      subset = ~I(country == "Spain" | country == "Italy"))

or aggregating by country (“nuts0”):

mappingEU(eu, var = "total", type = "interactive",
                      aggregation_unit = "nuts0")

Multiple variables produce a single interactive map with different layers:

mappingEU(eu, var = c("male","female"), type = "interactive")

The facets argument is also implemented for interactive maps.

A generic mapping() function

The package also provides a generic function to map data, mapping(). It accepts objects of class sf, WR, EU, IT, and US.

library(dplyr)

data("popIT")
popIT <- popIT
coords <- loadCoordIT(unit = "provincia", year = '2019')
cr <- left_join(coords, popIT, by = c( "provincia" = "ID"))
mapping(cr)

mapping(cr, var = "maschi")

library(sf)
nc = st_read(system.file("shape/nc.shp", package="sf"))
## Reading layer `nc' from data source `/github/workspace/pkglib/sf/shape/nc.shp' using driver `ESRI Shapefile'
## Simple feature collection with 100 features and 14 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## Geodetic CRS:  NAD27
class(nc)
## [1] "sf"         "data.frame"
mapping(nc)

Layout options

Aesthetic options are controlled by the mapping.options() function. All current options can be retrieved with:

mapping.options()

Single or multiple options may be retrieved:

mapping.options("palette.cont")
## [1] "YlGnBu"
mapping.options("legend.position")
## [1] "right" "top"

Options can be changed globally, until the end of the R session, as follows:

mapping.options(legend.position = c("left","bottom"))
mapping.options("legend.position")
## [1] "right" "top"

Options can be changed locally in mapping functions:

map <- mappingEU(eu, var = "total")
map_options <- mappingEU(eu, var = "total", 
                         options = mapping.options(list(legend.position = c("left","bottom"),
                                                        title = "EU total population",
                                                        map.frame = FALSE,
                                                        col.style = "pretty")))

library(tmap)
tmap_arrange(map, map_options)

mapping.options.reset()

Options can also be set globally, outside the mapping functions, as shown earlier. The original options can be restored with mapping.options.reset(), as in the call above.