Public data overview: Brazilian demographic census (IBGE)
The Brazilian demographic census (Censo Demográfico) is a survey conducted by the Brazilian Institute of Geography and Statistics (IBGE), ideally every ten years, as in the US. It is the main source of municipal demographic data in Brazil and covers a fairly wide range of topics, such as housing, education, the labor force and migration, to name a few.
The purpose of this article is to present the Brazilian demographic census to people interested in learning how to use its data to produce ad hoc analyses beyond the tools and reports published by the IBGE. A workflow is proposed, along with some basic concepts about the survey. Although a basic understanding of R and the `tidyverse` is needed to follow the tutorial, undergraduate students in any field, journalists and anyone interested in demographic data may benefit from this guide, regardless of programming background.
Brief history
The first census in Brazil was conducted in 1872 by the General Directory of Statistics (DGE), a precursor of the IBGE, during the reign of Emperor Pedro II. Since the IBGE was created under President Getúlio Vargas, the census has been conducted every ten years, in round years (1940, 1950, …, 2010). The exception was the 1990 edition, which financial issues under Fernando Collor delayed to 1991; the regular schedule resumed in 2000. The census after 2010, planned for 2020, was also postponed to 2021 due to the coronavirus pandemic (IBGE, [s.d.]).
Granularity: time and territory
Although census data is supposed to be free and open, not every edition is available on the official IBGE website. Currently, only the 2000 and 2010 microdata are officially available.
In Brazil, the smallest federative unit is the municipality, so census data are available for municipalities and for every aggregation built from them. However, the data are collected (and published) by census tract. Census tracts are arbitrarily delimited by the IBGE and respect only physical and administrative boundaries. Table 1 gives an overview of the hierarchy of territories[^1] for which data are available.
Territory | Description |
---|---|
Union | All the national territory. The sovereign entity within which all the other administrative boundaries are delimited. |
States | The second federative entity in the hierarchy. They correspond to the states of the US. |
Municipalities | The smallest political-administrative territories with their own representation in Brazil. Somewhat equivalent to US counties. |
Census tracts | Unlike the others, the census tracts are arbitrarily delimited by the IBGE, and only serve the purpose of guiding the data collection. |
Codification
To identify the territories, the IBGE uses a coding system that preserves this territorial hierarchy. For example, here is the code of the census tract that contains the Christ the Redeemer statue in Rio de Janeiro (RJ): 330455705280091.
Code | Territory | Description |
---|---|---|
33 | State | Rio de Janeiro (State) |
04557 | Municipality | Rio de Janeiro (Municipality) |
05 | District | Main district |
28 | Subdistrict | Santa Teresa |
0091 | Census tract | Census tract |
Other boundaries and regionalizations from the IBGE may have different structures and can be found in IBGE (2019). However, the data are stored and can be retrieved according to the structure presented in table 2.
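Because the code is positional, it can be split into its parts with plain string slicing. The following is a minimal sketch in base R, assuming the 15-digit layout shown in table 2; the helper name `split_tract_code` is hypothetical and not part of any IBGE tool.

```r
# Split a 15-digit census tract code into its hierarchical parts.
# Widths assumed from table 2: state (2), municipality (5), district (2),
# subdistrict (2) and census tract (4).
split_tract_code <- function(code) {
  stopifnot(is.character(code), all(nchar(code) == 15))
  data.frame(
    state        = substr(code, 1, 2),
    municipality = substr(code, 3, 7),
    district     = substr(code, 8, 9),
    subdistrict  = substr(code, 10, 11),
    tract        = substr(code, 12, 15),
    stringsAsFactors = FALSE
  )
}

# The census tract of the Christ the Redeemer statue
parts <- split_tract_code("330455705280091")
print(parts)
```

Keeping the codes as character strings preserves the leading zeros, which would be lost if the parts were converted to numbers.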
Universe and sample
The Brazilian demographic census is a survey with two questionnaires: the universe (basic) questionnaire is applied to every person in Brazil, and the sample questionnaire is applied to people selected through a stratified sampling method. The sample questionnaire contains all the subjects of the basic one and expands on most of them.
To preserve the identity of the respondents, some strategies are adopted when publishing the results. While the universe data are aggregated by census tract, in the sample data the census tract identifier is replaced with the identifier of the so-called weighting area.
Weighting areas are groups of contiguous census tracts arbitrarily chosen by the IBGE. They are large enough to make it difficult to identify respondents, but they lack the specificity of the universe data aggregated by census tract. Another drawback is that weighting areas may not match some intra-municipal regionalizations. On the other hand, these data may be the only ones available at the intra-municipal level that cover the whole national territory and allow comparison between neighborhoods of different cities.
A peek at the census survey design
The sample of the Brazilian demographic census is stratified, which means that the national territory was divided into census tracts and all of them were used, but, inside each census tract, a simple random sample was drawn from the universe of households in it.
This method is preferred over simple random sampling because it guarantees not only that everyone has the same chance of being in the sample, but also that no census tract will be under- or over-represented in the sample. The main disadvantage of this method is its higher cost in comparison with simple random sampling (LUMLEY, 2010).
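The idea can be illustrated with a toy simulation in base R. This is only a sketch under stated assumptions: the three tracts, their sizes and the 10% sampling fraction are invented for illustration and do not reproduce the IBGE's actual procedure.

```r
set.seed(42)

# Hypothetical universe: three census tracts with different numbers of households
universe <- data.frame(
  tract     = rep(c("A", "B", "C"), times = c(200, 500, 300)),
  household = 1:1000
)
sampling_fraction <- 0.1  # assumed for illustration only

# Draw a simple random sample of households inside each tract (stratum)
sampled <- do.call(rbind, lapply(split(universe, universe$tract), function(stratum) {
  n_h <- round(nrow(stratum) * sampling_fraction)
  picked <- stratum[sample(nrow(stratum), n_h), ]
  # Each sampled household represents N_h / n_h households of its tract
  picked$weight <- nrow(stratum) / n_h
  picked
}))

# Every tract is present in the sample, and the weights add up to its size
estimated_size <- tapply(sampled$weight, sampled$tract, sum)
print(estimated_size)
```

Because the draw happens inside every stratum, no tract can be left out of the sample, and the weights restore the size of each tract.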
Where and how to access the data
In this section, a workflow to analyze these data is suggested through an example. The question is: which is the most common country of origin of immigrants in each weighting area in the city of Rio de Janeiro? To answer it, we will tidy the data using the `tidyverse` metapackage, generate our analysis with the `survey` package and build a map using the `tmap` package.
The sample microdata and the data aggregated by census tract are available on the public FTP server of the IBGE. The universe data are fairly easy to open in R, although some cleaning may be necessary. The 2010 sample microdata, however, can be a little tricky to analyze, which is why they are used in this article.
The sample microdata are available as a fixed-width text file. In this case we have two options: manually specify the names and widths directly in the script, or parse the Excel layout spreadsheet to retrieve the widths and the names. The second option is preferred for reproducibility.
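As a minimal sketch of how a fixed-width file is read once the layout is known, the example below writes two invented records to a temporary file and reads them with `readr::read_fwf`. The positions and contents are made up for illustration; the real positions come from the layout spreadsheet.

```r
library(readr)

# Two invented fixed-width records (not real census data)
path <- tempfile(fileext = ".txt")
writeLines(c("3304557 10720", "3300100 19010"), path)

# A toy layout: variable name plus start and end positions
layout <- data.frame(
  VAR   = c("V0001", "V0002", "V0010"),
  START = c(1, 3, 9),
  END   = c(2, 7, 13)
)

# Read every column as character so that leading zeros are preserved
toy <- read_fwf(
  file = path,
  col_positions = fwf_positions(layout$START, layout$END, layout$VAR),
  col_types = "ccc"
)
print(toy)
```

Reading everything as character first and converting types afterwards avoids losing leading zeros in territorial codes.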
The `survey` package will be used to reconstruct the survey design (LUMLEY, 2020) and the `readxl` package to open the layout file in `*.xls` format.
library(tidyverse)
library(survey)
library(readxl)
We will use the data from Rio de Janeiro (RJ). It can be downloaded here, along with the other states and the documentation zip file (`Documentacao.zip`). The function `read_microdata` is designed to open any of the four microdata files of the census.
read_microdata <- function(microdata, topic = c("domicilios", "pessoas", "emigracao", "mortalidade")) {
  # Validate the topic and keep only the chosen value
  topic <- match.arg(topic)
  # Each topic is described in its own sheet of the layout spreadsheet
  topic_sheet <- case_when(
    topic == "domicilios" ~ 1,
    topic == "pessoas" ~ 2,
    topic == "emigracao" ~ 3,
    topic == "mortalidade" ~ 4
  )
  raw <- read_xls("~/Layout_microdados_Amostra.xls", sheet = topic_sheet)
  # Keep the variable name, start/end positions, integer/decimal widths and type
  layout <- raw %>%
    select(
      VAR = 1,
      START = `...8`,
      END = `...9`,
      INT = `...10`,
      DEC = `...11`,
      TYPE = `...12`
    ) %>%
    slice(-1) %>%
    mutate(
      START = as.integer(START),
      END = as.integer(END),
      INT = as.integer(INT),
      DEC = as.integer(DEC) %>% replace_na(0L),
      TYPE = str_replace_all(TYPE, "\n", "") %>%
        str_replace("A|C", "c") %>%
        str_replace("N", "d")
    )
  # Read every column as character first; types are converted at the end
  microdata <- read_fwf(
    file = microdata,
    col_positions = fwf_positions(layout$START, layout$END, layout$VAR),
    col_types = paste0(rep("c", nrow(layout)), collapse = "")
  )
  # Columns with implied decimals need the decimal point inserted by hand
  for (i in seq_along(microdata)) {
    if (layout$DEC[i] > 0) {
      int <- layout$INT[i]
      dec <- layout$DEC[i]
      integer_part <- str_sub(pull(microdata, i), 1, int)
      decimal_part <- str_sub(pull(microdata, i), int + 1, int + dec)
      float_number <- paste0(integer_part, ".", decimal_part) %>%
        as_tibble() %>%
        type_convert(col_types = "d")
      microdata[, i] <- float_number
    }
  }
  # Convert the remaining character columns to the types given in the layout
  microdata <- type_convert(
    microdata,
    col_types = paste0(layout$TYPE, collapse = "")
  )
  return(microdata)
}
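The loop inside the function exists because some numeric fields are stored without a decimal point, with the number of integer and decimal digits given in the layout. The step can be sketched in isolation as a small vectorized helper in base R. The name `insert_decimal` and the widths used below are assumptions for illustration, not values taken from the official layout.

```r
# Rebuild a number stored with an implied decimal point.
# `int` and `dec` are the numbers of integer and decimal digits.
insert_decimal <- function(x, int, dec) {
  out <- rep(NA_real_, length(x))
  ok <- !is.na(x)  # missing values stay missing
  out[ok] <- as.numeric(paste0(
    substr(x[ok], 1, int),
    ".",
    substr(x[ok], int + 1, int + dec)
  ))
  out
}

# Hypothetical field with 3 integer digits and 6 decimal digits
weights <- insert_decimal(c("010720090", NA, "019010606"), int = 3, dec = 6)
print(weights)
```

A vectorized helper like this could replace the per-column tibble conversion if the function turns out to be too slow for larger states.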
Mind the file size! The computer used to run this code has 16 gigabytes of RAM, but you may not be able to load this file directly into memory. Consider using an SQL database if you have trouble[^2].
rio <- read_microdata("~/RJ/Amostra_Pessoas_33.txt", "pessoas")
The RJ people microdata takes up 2.1 Gb once loaded.
print(object.size(rio), units = "auto")
## 2.1 Gb
glimpse(rio)
## Rows: 1,143,650
## Columns: 244
## $ V0001 <dbl> 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, …
## $ V0002 <chr> "00100", "00100", "00100", "00100", "00100", "00100", "00100", "…
## $ V0011 <dbl> 3.3001e+12, 3.3001e+12, 3.3001e+12, 3.3001e+12, 3.3001e+12, 3.30…
## $ V0300 <dbl> 12833, 12833, 20358, 20358, 20358, 20358, 20358, 20358, 27895, 2…
## $ V0010 <dbl> 10.720090, 10.720090, 19.010606, 19.010606, 19.010606, 19.010606…
## $ V1001 <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
## $ V1002 <chr> "05", "05", "05", "05", "05", "05", "05", "05", "05", "05", "05"…
## $ V1003 <chr> "013", "013", "013", "013", "013", "013", "013", "013", "013", "…
## $ V1004 <chr> "00", "00", "00", "00", "00", "00", "00", "00", "00", "00", "00"…
## $ V1006 <dbl> 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ V0502 <chr> "02", "01", "01", "02", "04", "04", "04", "04", "01", "02", "05"…
## $ V0504 <chr> "02", "01", "01", "02", "03", "04", "05", "06", "01", "02", "03"…
## $ V0601 <dbl> 1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 1, 2, 2, 1, 2, 2, 1, 2, 1, 1, 2…
## $ V6033 <dbl> 66, 68, 51, 47, 26, 25, 18, 9, 42, 40, 20, 40, 11, 37, 62, 67, 1…
## $ V6036 <dbl> 66, 68, 51, 47, 26, 25, 18, 9, 42, 40, 20, 40, 11, 37, 62, 67, 1…
## $ V6037 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6040 <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ V0606 <dbl> 1, 1, 4, 1, 4, 4, 2, 2, 1, 3, 1, 1, 1, 1, 4, 4, 1, 1, 1, 1, 4, 4…
## $ V0613 <dbl> NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ V0614 <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 2, 3, 4, 4, 3, 4, 4, 4…
## $ V0615 <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 2, 4, 4, 4, 4, 4, 4…
## $ V0616 <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4, 4…
## $ V0617 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ V0618 <dbl> 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 3, 1, 3, 3, 3, 3, 3, 2, 1…
## $ V0619 <dbl> 1, 1, 1, NA, NA, NA, NA, NA, NA, NA, NA, 3, NA, 3, NA, 3, 1, 1, …
## $ V0620 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, NA, 1, NA,…
## $ V0621 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0622 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, NA, 1, NA,…
## $ V6222 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3500000, NA, 3500000…
## $ V6224 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0623 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "017", NA, "017", NA…
## $ V0624 <chr> "003", "003", "030", NA, NA, NA, NA, NA, NA, NA, NA, "017", NA, …
## $ V0625 <dbl> 1, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1,…
## $ V6252 <dbl> 3300000, 3300000, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ V6254 <dbl> 3304805, 3304805, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ V6256 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0626 <dbl> 1, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ V6262 <dbl> 3300000, 3300000, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ V6264 <dbl> 3304805, 3304805, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ V6266 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0627 <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ V0628 <dbl> 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 1, 3, 1, 3, 3, 3, 1, 1, 3, 3, 3, 3…
## $ V0629 <chr> NA, NA, NA, NA, NA, NA, NA, "05", NA, NA, "08", NA, "05", NA, NA…
## $ V0630 <chr> NA, NA, NA, NA, NA, NA, NA, "03", NA, NA, NA, NA, "06", NA, NA, …
## $ V0631 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0632 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0633 <chr> "10", "10", "03", "04", "07", "07", "06", NA, "10", "07", NA, "1…
## $ V0634 <dbl> 1, 1, 2, 2, 2, 1, NA, NA, 1, 1, NA, 1, NA, 1, NA, NA, NA, NA, 2,…
## $ V0635 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6400 <dbl> 3, 3, 1, 1, 1, 2, 1, 1, 3, 2, 2, 3, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1…
## $ V6352 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6354 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6356 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0636 <dbl> NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, 1, NA, 1, NA, NA, NA, 2, …
## $ V6362 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6364 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6366 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0637 <dbl> 1, 1, 1, 1, 3, 3, 3, NA, 1, 1, 3, 1, 3, 1, 1, 1, 3, 3, 1, 1, 1, …
## $ V0638 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0639 <dbl> 1, 1, 1, 1, NA, NA, NA, NA, 4, 4, NA, 2, NA, 2, 4, 4, NA, NA, 3,…
## $ V0640 <dbl> 1, 1, 1, 1, 5, 5, 5, NA, 3, 5, 5, 1, 5, 1, 3, 4, 5, 5, 5, 5, 1, …
## $ V0641 <dbl> 2, 2, 1, 2, 2, 1, 2, NA, 1, 1, 2, 1, 2, 2, 1, 2, 2, 1, 1, 1, 1, …
## $ V0642 <dbl> 2, 2, NA, 2, 1, NA, 2, NA, NA, NA, 2, NA, 2, 2, NA, 2, 2, NA, NA…
## $ V0643 <dbl> 2, 2, NA, 2, NA, NA, 2, NA, NA, NA, 2, NA, 2, 2, NA, 2, 2, NA, N…
## $ V0644 <dbl> 2, 2, NA, 2, NA, NA, 2, NA, NA, NA, 2, NA, 2, 2, NA, 2, 2, NA, N…
## $ V0645 <dbl> NA, NA, 1, NA, 1, 1, NA, NA, 1, 1, NA, 1, NA, NA, 1, NA, NA, 1, …
## $ V6461 <chr> NA, NA, "7213", NA, "7233", "4120", NA, NA, "7212", "5230", NA, …
## $ V6471 <chr> NA, NA, "00000", NA, "45020", "82001", NA, NA, "30010", "48073",…
## $ V0648 <dbl> NA, NA, 1, NA, 1, 1, NA, NA, 1, 1, NA, 4, NA, NA, 1, NA, NA, 4, …
## $ V0649 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0650 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, N…
## $ V0651 <dbl> NA, NA, 1, NA, 1, 1, NA, NA, 1, 1, NA, 1, NA, NA, 1, NA, NA, 1, …
## $ V6511 <dbl> NA, NA, 1000, NA, 800, 600, NA, NA, 1800, 700, NA, 2800, NA, NA,…
## $ V6513 <dbl> NA, NA, 1000, NA, 800, 600, NA, NA, 1800, 700, NA, 2800, NA, NA,…
## $ V6514 <dbl> NA, NA, 1.96, NA, 1.57, 1.18, NA, NA, 3.53, 1.37, NA, 5.49, NA, …
## $ V0652 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6521 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6524 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6525 <dbl> NA, NA, 1000, NA, 800, 600, NA, NA, 1800, 700, NA, 2800, NA, NA,…
## $ V6526 <dbl> NA, NA, 1.96078, NA, 1.56863, 1.17647, NA, NA, 3.52941, 1.37255,…
## $ V6527 <dbl> 2200, 600, 1000, 0, 800, 600, 0, NA, 1800, 700, 0, 2800, 0, 0, 1…
## $ V6528 <dbl> 4.31373, 1.17647, 1.96078, 0.00000, 1.56863, 1.17647, 0.00000, N…
## $ V6529 <dbl> 2800, 2800, 2400, 2400, 2400, 2400, 2400, 2400, 2500, 2500, 2500…
## $ V6530 <dbl> 5.49020, 5.49020, 4.70588, 4.70588, 4.70588, 4.70588, 4.70588, 4…
## $ V6531 <dbl> 1400.00, 1400.00, 400.00, 400.00, 400.00, 400.00, 400.00, 400.00…
## $ V6532 <dbl> 2.74510, 2.74510, 0.78431, 0.78431, 0.78431, 0.78431, 0.78431, 0…
## $ V0653 <dbl> NA, NA, 50, NA, 50, 44, NA, NA, 50, 40, NA, 80, NA, NA, 40, NA, …
## $ V0654 <dbl> 2, 2, NA, 2, NA, NA, 1, NA, NA, NA, 1, NA, 2, 2, NA, 2, 2, NA, N…
## $ V0655 <dbl> NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, 1, NA, NA, NA, NA, NA, NA…
## $ V0656 <dbl> 1, 1, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, …
## $ V0657 <dbl> 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ V0658 <dbl> 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ V0659 <dbl> 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ V6591 <dbl> 2200, 600, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 510, 0, 0, 0,…
## $ V0660 <dbl> NA, NA, 2, NA, 2, 2, NA, NA, 2, 2, NA, 2, NA, NA, 2, NA, NA, 2, …
## $ V6602 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6604 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6606 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0661 <dbl> NA, NA, 1, NA, 1, 1, NA, NA, 1, 1, NA, 1, NA, NA, 1, NA, NA, 1, …
## $ V0662 <dbl> NA, NA, 2, NA, 2, 3, NA, NA, 3, 3, NA, 2, NA, NA, 2, NA, NA, 2, …
## $ V0663 <dbl> NA, 1, NA, 1, NA, 2, NA, NA, NA, 2, NA, NA, 2, 1, NA, 1, 2, NA, …
## $ V6631 <dbl> NA, 0, NA, 3, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, NA, …
## $ V6632 <dbl> NA, 2, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, NA, …
## $ V6633 <dbl> NA, 2, NA, 4, NA, 0, NA, NA, NA, 0, NA, NA, 0, 2, NA, 2, 0, NA, …
## $ V0664 <dbl> NA, 1, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, NA, …
## $ V6641 <dbl> NA, 0, NA, 3, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, NA, …
## $ V6642 <dbl> NA, 2, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, NA, …
## $ V6643 <dbl> NA, 2, NA, 4, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2, NA, 2, NA, …
## $ V0665 <dbl> NA, 2, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2, NA, 2, NA, …
## $ V6660 <dbl> NA, 35, NA, 9, NA, NA, NA, NA, NA, NA, NA, NA, NA, 11, NA, 47, N…
## $ V6664 <dbl> NA, 0, NA, 0, NA, 0, NA, NA, NA, 0, NA, NA, 0, 0, NA, 0, 0, NA, …
## $ V0667 <dbl> NA, 1, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, NA, …
## $ V0668 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6681 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6682 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V0669 <dbl> NA, 2, NA, 2, NA, 2, NA, NA, NA, 2, NA, NA, 2, 2, NA, 2, 2, NA, …
## $ V6691 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6692 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V6693 <dbl> NA, 0, NA, 0, NA, 0, NA, NA, NA, 0, NA, NA, 0, 0, NA, 0, 0, NA, …
## $ V6800 <dbl> NA, 2, NA, 4, NA, 0, NA, NA, NA, 0, NA, NA, 0, 2, NA, 2, 0, NA, …
## $ V0670 <dbl> 2, 1, 2, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 2, 2, 1…
## $ V0671 <chr> "01", NA, "02", NA, "02", "02", "02", "02", "03", "03", NA, NA, …
## $ V6900 <dbl> 2, 2, 1, 2, 1, 1, 1, NA, 1, 1, 1, 1, 2, 2, 1, 2, 2, 1, 1, 1, 1, …
## $ V6910 <dbl> NA, NA, 1, NA, 1, 1, 2, NA, 1, 1, 2, 1, NA, NA, 1, NA, NA, 1, 1,…
## $ V6920 <dbl> 2, 2, 1, 2, 1, 1, 2, NA, 1, 1, 2, 1, 2, 2, 1, 2, 2, 1, 1, 1, 1, …
## $ V6930 <dbl> NA, NA, 1, NA, 1, 1, NA, NA, 1, 1, NA, 3, NA, NA, 1, NA, NA, 3, …
## $ V6940 <dbl> NA, NA, 3, NA, 3, 3, NA, NA, 3, 3, NA, 5, NA, NA, 3, NA, NA, 5, …
## $ V6121 <chr> "120", "120", "110", "350", "240", "350", "350", "370", "110", "…
## $ V0604 <dbl> 3, 3, 3, 3, 1, 1, 1, 1, 2, 2, 2, 3, 1, 3, 2, 3, 1, 1, 2, 2, 2, 3…
## $ V0605 <chr> NA, NA, NA, NA, "02", "02", "02", "02", NA, NA, NA, NA, "02", NA…
## $ V5020 <chr> "01", "01", "01", "01", "01", "01", "01", "01", "01", "01", "01"…
## $ V5060 <dbl> 2, 2, 6, 6, 6, 6, 6, 6, 3, 3, 3, 3, 3, 3, 2, 2, 4, 4, 4, 4, 2, 2…
## $ V5070 <dbl> 1400.00, 1400.00, 400.00, 400.00, 400.00, 400.00, 400.00, 400.00…
## $ V5080 <dbl> 2.74510, 2.74510, 0.78431, 0.78431, 0.78431, 0.78431, 0.78431, 0…
## $ V6462 <chr> NA, NA, "7245", NA, "9113", "4121", NA, NA, "7243", "4211", NA, …
## $ V6472 <chr> NA, NA, "00000", NA, "50020", "74090", NA, NA, "35010", "53061",…
## $ V5110 <dbl> NA, NA, 1, NA, 1, 1, NA, NA, 1, 1, NA, 1, NA, NA, 1, NA, NA, 2, …
## $ V5120 <dbl> NA, NA, 1, NA, 1, 1, NA, NA, 1, 1, NA, 1, NA, NA, 1, NA, NA, 2, …
## $ V5030 <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
## $ V5040 <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ V5090 <dbl> 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3, 3, 3, 3, 1, 1…
## $ V5100 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ V5130 <chr> "02", "01", "01", "02", "03", "04", "05", "06", "01", "02", "03"…
## $ M0502 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0601 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6033 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0606 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0613 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0614 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0615 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0616 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0617 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0618 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0619 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0620 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0621 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0622 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6222 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6224 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0623 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2…
## $ M0624 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0625 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6252 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6254 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6256 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0626 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6262 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6264 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6266 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0627 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0628 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0629 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0630 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0631 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0632 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0633 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0634 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0635 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6352 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6354 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6356 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0636 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6362 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6364 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6366 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0637 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0638 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0639 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0640 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0641 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0642 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0643 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0644 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0645 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6461 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6471 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2…
## $ M0648 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0649 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0650 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0651 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6511 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0652 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6521 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0653 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0654 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0655 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0656 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0657 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0658 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0659 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6591 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0660 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6602 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6604 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6606 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0661 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0662 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0663 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6631 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6632 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6633 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0664 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6641 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6642 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6643 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0665 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6660 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0667 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0668 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6681 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6682 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0669 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6691 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6692 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6693 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0670 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0671 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6800 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6121 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0604 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M0605 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6462 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
## $ M6472 <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 1, 2, 1, 2, 2…
## $ V1005 <dbl> 1, 1, 4, 4, 4, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
The variable V0001 refers to the state, V0002 to the municipality, V0011 to the weighting area and V0300 to the household, while V0010 is the weight of the observation. Let's keep only the Rio de Janeiro municipality[^3].
rio_mun <- rio %>%
  filter(
    V0001 == 33, # Only needed if you have opened and bound two or more state files
    V0002 == "04557"
  )
To reconstruct the sample design, the `svydesign` function of the `survey` package is needed. It expects the cluster identification, the strata identification, the finite population correction (FPC), the data frame (a tibble in this case) and the weights of the observations. The FPC, which `svydesign` accepts either as the population size of each stratum or as the sampling fraction, is fairly easy to calculate, and we can pass `~ 1` to the `ids` argument because the sample doesn't have any clusters:
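rio_mun <- rio_mun %>%
  group_by(V0011) %>%
  # The FPC is the population size of each stratum, estimated here by the sum
  # of the weights; n() would give the sample size and zero out the variances
  mutate(FPC = sum(V0010)) %>%
  ungroup()
rio_design <- svydesign(
  ids = ~ 1,
  strata = ~ V0011,
  fpc = ~ FPC,
  data = rio_mun,
  weights = ~ V0010
)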
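To see what the design object does with these arguments, here is a toy sketch with invented data, assuming the FPC is given as the population size of each stratum (taken here as the sum of the weights). The data frame and its column names are hypothetical.

```r
library(survey)

# Invented sample: two weighting areas, four respondents each
toy <- data.frame(
  area         = rep(c("A1", "A2"), each = 4),
  weight       = rep(c(10, 5), each = 4),
  foreign_born = c(1, 0, 0, 0, 1, 1, 0, 0)
)

# Population size of each stratum, estimated by the sum of its weights
toy$fpc <- ave(toy$weight, toy$area, FUN = sum)

toy_design <- svydesign(
  ids     = ~ 1,      # no clusters
  strata  = ~ area,
  fpc     = ~ fpc,
  weights = ~ weight,
  data    = toy
)

# Weighted total of foreign-born people: 1 * 10 + 2 * 5
estimate <- svytotal(~ foreign_born, toy_design)
print(estimate)
```

The point estimate depends only on the weights, while the strata and the FPC affect the standard error reported next to it.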
The variable V0619 takes the value 3 for those who weren't born in the state, the variable V0622 takes the value 2 for those who were born outside Brazil, and the variable V6224 specifies the country of origin.
rio_top_immigrants <- svytable(~ V0011 + V6224, subset(rio_design, V0619 == 3 & V0622 == 2)) %>%
  as_tibble() %>%
  mutate(n = round(n)) %>%
  group_by(V0011) %>%
  arrange(-n) %>%
  slice(1)
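The last three steps of the pipeline keep, for each weighting area, the row with the largest estimated count. The same pattern can be sketched on a small invented table; `slice_max()` (available in dplyr 1.0.0 and later) is used here as an alternative to `arrange()` plus `slice(1)`, and the column names are hypothetical.

```r
library(dplyr)

# Invented counts of immigrants by area and country of origin
counts <- tibble(
  area    = c("A1", "A1", "A2", "A2", "A2"),
  country = c("X", "Y", "X", "Y", "Z"),
  people  = c(12, 30, 7, 3, 5)
)

# Keep the most frequent country inside each area
top_country <- counts %>%
  group_by(area) %>%
  slice_max(people, n = 1, with_ties = FALSE) %>%
  ungroup()
print(top_country)
```

Setting `with_ties = FALSE` guarantees exactly one row per area, which matters later when the result is joined to one polygon per weighting area.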
Now the `geobr` package will be used to download the geometries of the weighting areas of Rio de Janeiro. The next step is to join them with our processed dataset and finally plot the map with the `tmap` package.
library(geobr)
library(tmap)
rio_weighting <- read_weighting_area(3304557)
rj_mun <- read_municipality(33)
rio_final <- rio_weighting %>%
  left_join(
    rio_top_immigrants,
    by = c("code_weighting" = "V0011")
  ) %>%
  rename(origin_country = V6224)
Here is the map. Note the predominance of weighting areas where country 8000620 is the main country of origin.
rio_final %>%
tm_shape() +
tm_polygons(col = "origin_country")
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
Although this coding is easy to manipulate, it is difficult to read. Luckily, there is a spreadsheet in the `Documentacao.zip` file that helps translate these codes. It can be parsed with the `readxl` package.
country_code <- read_xls("~/Documentacao/Documentaç╞o/Anexos Auxiliares/Migraç╞o e Deslocamento_Paises estrangeiros.xls") %>%
  select(
    country = 2,
    code = 3
  ) %>%
  slice(-3) %>%
  drop_na() %>%
  mutate(
    country = str_to_title(country)
  )
rio_map <- rio_final %>%
  left_join(
    country_code,
    by = c("origin_country" = "code")
  ) %>%
  mutate(
    country = as_factor(country) %>% fct_lump(5) # This is important to reduce the number of classes
  )
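Lumping keeps the most frequent levels of a factor and collapses the rest into a single "Other" level, which keeps the map legend readable. A minimal sketch with invented values follows, using `fct_lump_n()` (forcats 0.5.0 and later), the explicit form of the `fct_lump(n)` call above.

```r
library(forcats)

# Invented countries of origin, one value per weighting area
origin <- factor(c("A", "A", "A", "B", "B", "C", "D"))

# Keep the two most frequent levels and collapse the others
lumped <- fct_lump_n(origin, n = 2)
print(table(lumped))
```

Here the levels "C" and "D" are merged into "Other", while "A" and "B" are kept as they are.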
Now, here is the map:
# st_bbox() comes from the sf package, which is not attached, so it is called with its namespace
tm_shape(rj_mun, bbox = sf::st_bbox(rio_map)) + tm_polygons(col = "white") +
  tm_shape(rio_map) +
  tm_polygons(
    title = "Origin country",
    col = "country",
    palette = "Pastel2"
  ) +
  tm_scale_bar(
    position = c("left", "bottom")
  ) +
  tm_compass(
    position = c("left", "top")
  ) +
  tm_layout(
    bg.color = "lightblue",
    legend.outside = TRUE,
    legend.outside.position = "right",
    legend.outside.size = .15,
    asp = 1.3
  )
References
[^1]: In this article, hierarchy refers to which boundary contains which, e.g., the states contain the municipalities, which contain the census tracts.
[^2]: Also consider redesigning the `read_microdata` function. It isn't the most efficient code in the universe. Opening this file in R Markdown took so much time that the file was preprocessed outside the document and imported with a simple `readr::read_csv` call. The function works really well, but it isn't that optimized and R Markdown stresses it.
[^3]: Other variables should be considered in your filter, e.g. `V1006` (household situation), which characterizes the household as urban or rural.