Climate change

Author

Aurélien Ginolhac

Published

February 9, 2024

Note

This practical connects most lectures and practicals covered in the course as they would work together for a typical data analysis with data import, transformation, summarizing and plotting.

Atmospheric carbon dioxide

\(CO_2\)

Carbon dioxide (CO2) is, as its name says, an oxide. This means that once in the atmosphere it is extremely stable and will remain there for thousands of years. Two main carbon sinks exist: forests, mainly trees that incorporate carbon as they grow, and oceans. The latter have absorbed approximately half of what humans have produced by burning oil, not without consequences: oceans are getting warmer, which lowers the solubility of CO2, and the dissolved CO2 lowers their pH. This acidification has already killed half of the animals that build coral reefs (91% as of 2022) and other calcifying organisms. CO2, like methane, is a greenhouse gas: it absorbs and radiates infrared thermal energy, which traps heat close to the ground. It is worth noting that the first scientist to discover the link between CO2 and heat trapping was a woman, Eunice Newton Foote, as early as 1856. The French version of the Wikipedia page is more in line with the Smithsonian article: she was not allowed to present her work because of her gender.

Find out how long carbon dioxide can last in our atmosphere and why it matters to look at cumulative emissions

Cumulative carbon dioxide emissions

Because CO2 remains in the atmosphere for such a long time, looking at yearly emissions is of little interest, especially since this metric is used by rich countries, which got rid of most of their industry, to justify making little effort. What matters is the cumulative emissions. For this, we will look at data from the World Bank. Unfortunately, they no longer provide the 1960-2020 series, only 1990-2021, so please use my local copy linked below.

Read the CSV API_EN.ATM.CO2E.PC_DS2_en_csv_v2_3731558.csv and assign it the name cum_co2
Tip
  • If you look at the file, the first 4 lines are not of interest and should be skipped.
  • Column names will have spaces, leading digits, etc. Using name_repair = "unique" helps.
  • The columns from "Country Code" to "Indicator Code" can be discarded.
library(tidyverse)

# skip the 4 leading metadata lines, then drop the unused identifier columns
read_csv("https://r-training.pages.uni.lu/biostat1/projects/data/API_EN.ATM.CO2E.PC_DS2_en_csv_v2_3731558.csv", 
         skip = 4L,
         show_col_types = FALSE, name_repair = "unique") |> 
  select(-c("Country Code":"Indicator Code")) -> cum_co2
New names:
• `` -> `...66`
cum_co2
# A tibble: 266 × 63
   `Country Name`    `1960`  `1961`  `1962`  `1963`  `1964` `1965` `1966` `1967`
   <chr>              <dbl>   <dbl>   <dbl>   <dbl>   <dbl>  <dbl>  <dbl>  <dbl>
 1 Aruba            NA      NA      NA      NA      NA      NA     NA     NA    
 2 Africa Eastern …  0.906   0.922   0.931   0.941   0.996   1.05   1.03   1.05 
 3 Afghanistan       0.0461  0.0536  0.0737  0.0742  0.0862  0.101  0.107  0.123
 4 Africa Western …  0.0909  0.0953  0.0966  0.112   0.133   0.185  0.194  0.189
 5 Angola            0.101   0.0822  0.211   0.203   0.214   0.206  0.269  0.172
 6 Albania           1.26    1.37    1.44    1.18    1.11    1.17   1.33   1.36 
 7 Andorra          NA      NA      NA      NA      NA      NA     NA     NA    
 8 Arab World        0.609   0.663   0.727   0.853   0.972   1.14   1.25   1.32 
 9 United Arab Emi…  0.119   0.109   0.164   0.176   0.133   0.147  0.160  5.40 
10 Argentina         2.38    2.46    2.54    2.33    2.55    2.66   2.81   2.87 
# ℹ 256 more rows
# ℹ 54 more variables: `1968` <dbl>, `1969` <dbl>, `1970` <dbl>, `1971` <dbl>,
#   `1972` <dbl>, `1973` <dbl>, `1974` <dbl>, `1975` <dbl>, `1976` <dbl>,
#   `1977` <dbl>, `1978` <dbl>, `1979` <dbl>, `1980` <dbl>, `1981` <dbl>,
#   `1982` <dbl>, `1983` <dbl>, `1984` <dbl>, `1985` <dbl>, `1986` <dbl>,
#   `1987` <dbl>, `1988` <dbl>, `1989` <dbl>, `1990` <dbl>, `1991` <dbl>,
#   `1992` <dbl>, `1993` <dbl>, `1994` <dbl>, `1995` <dbl>, `1996` <dbl>, …
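
To see where the `...66` name comes from, here is a minimal sketch on a hypothetical miniature CSV (the text and values are made up for illustration, not taken from the World Bank file). It assumes, as the "New names" message suggests, that every line of the real file ends with a trailing comma; readr then reads one extra, unnamed column, and `name_repair = "unique"` names it after its position.

```r
library(readr)

# Hypothetical miniature of the World Bank layout: one metadata line to skip,
# then a header and data rows that all end with a trailing comma.
toy_csv <- I('"Data Source","World Development Indicators",
"Country Name","1960","1961",
"Aruba",,,
"Albania",1.26,1.37,
')

toy <- read_csv(toy_csv, skip = 1L, show_col_types = FALSE,
                name_repair = "unique")

# The empty fourth header is repaired to a positional name
names(toy)
```

The repaired column holds only missing values, which is why it can be removed later together with the other NA rows.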
This dataset is not tidy, why?

Years are spread across the column names, although they are the values of a single variable. They should be in one column, year, with the measurements in a second one, co2_emissions_mt_per_cap

Pivot accordingly and assign the result to cum_co2_long
Tip
  • Pivot all columns but the identifier of interest: Country Name
  • It makes sense to set names_to to year.
  • Transform the names column to integers
cum_co2 |> 
  pivot_longer(cols = -"Country Name",
               names_to = "year",
               values_to = "co2_emissions_mt_per_cap",
               names_transform = as.integer) -> cum_co2_long
Warning in f(names[[col]]): NAs introduced by coercion
cum_co2_long
# A tibble: 16,492 × 3
   `Country Name`  year co2_emissions_mt_per_cap
   <chr>          <int>                    <dbl>
 1 Aruba           1960                       NA
 2 Aruba           1961                       NA
 3 Aruba           1962                       NA
 4 Aruba           1963                       NA
 5 Aruba           1964                       NA
 6 Aruba           1965                       NA
 7 Aruba           1966                       NA
 8 Aruba           1967                       NA
 9 Aruba           1968                       NA
10 Aruba           1969                       NA
# ℹ 16,482 more rows
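
The pivot is easier to follow on a small table. The sketch below uses a hypothetical two-country table (names and values are invented) and the list form of `names_transform`, which converts only the new `year` column.

```r
library(tidyr)

# Hypothetical wide table: one row per country, one column per year
wide <- tibble::tibble(
  `Country Name` = c("A", "B"),
  `1960` = c(1.0, 2.0),
  `1961` = c(1.5, NA)
)

long <- wide |>
  pivot_longer(cols = -"Country Name",
               names_to = "year",
               values_to = "co2_emissions_mt_per_cap",
               # list form: convert only the new `year` column to integer
               names_transform = list(year = as.integer))

long
```

Each country now contributes one row per year, and a column name that is not a number (such as `...66` in the real data) would become NA at the conversion step, hence the coercion warning.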
Clean up by removing the missing values and the invalid year that came from the ...66 column
Tip

The first year is 1960, so a filter on the year column is a coherent way to remove it. For removing the missing CO2 values, drop_na() is a good pick.

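cum_co2_long |> 
  # drop rows without an emission value
  drop_na(co2_emissions_mt_per_cap) |> 
  # keep valid years only; rows whose year is NA are dropped as well.
  # Note that `>` also excludes 1960 itself, use `>=` to keep it
  filter(year > 1960) -> cum_co2_long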
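
How `drop_na()` and `filter()` treat missing values can be checked on a toy table. This is a sketch on hypothetical values; the object names `no_na`, `strict` and `inclusive` are only illustrative.

```r
library(dplyr)
library(tidyr)

# Hypothetical long table with one unparsable year (NA) and one missing value
toy <- tibble(year  = c(1960L, 1961L, NA),
              value = c(0.5, NA, 0.7))

no_na     <- toy |> drop_na(value)        # drops the 1961 row (missing value)
strict    <- toy |> filter(year > 1960)   # keeps 1961 only; the NA year is dropped
inclusive <- toy |> filter(year >= 1960)  # keeps 1960 and 1961
```

`filter()` only keeps rows where the condition is TRUE, so a comparison that returns NA removes the row without any warning.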
Plotting Worldwide yearly emissions
Tip
  • Summarise per year the emissions to have them worldwide.
  • Compute the cumulative sum (checkout the cumsum() function) of emissions.
  • Plot the yearly sum.
  • Add vertical lines for the years 1973, 1979 and 2008, three major crises, and comment.
cum_co2_long |> 
  summarise(co2_sum = sum(co2_emissions_mt_per_cap),
            .by = year) |> 
  mutate(cum = cumsum(co2_sum)) |> 
  ggplot(aes(x = year)) +
  geom_line(aes(y = co2_sum), color = "purple") +
  geom_vline(xintercept = c(1973, 1979, 2008), linetype = "dashed") +
  scale_x_continuous(breaks = seq(1960, 2020, 10)) +
  labs(caption = "Source: World Bank",
       x = NULL,
       y = "Worldwide Carbon dioxide emissions")

Burning fossil fuels is what keeps the economy running. A crisis usually comes after a peak in energy prices; even the 2008 crisis was partly due to such a price increase. From 2010 on, the US reacted by ramping up its production of shale oil.

Using the Our World in Data website, load into R, as a CSV, their economic growth dataset: the World GDP over the last two millennia

GDP is the main economic indicator and has been the only target for most countries since WWII, even though it is a poor indicator (accidents and disasters increase GDP).

world_gdp <- read_csv("https://r-training.pages.uni.lu/biostat1/projects/data/world-gdp-over-the-last-two-millennia.csv") |> 
  rename(GDP = "World GDP in 2011 Int.$ (OWID based on World Bank & Maddison (2017))")
Rows: 76 Columns: 4
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): Entity, Code
dbl (2): Year, World GDP in 2011 Int.$ (OWID based on World Bank & Maddison ...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Join cum_co2_long and the World GDP per year. Plot this relationship and comment
cum_co2_long |> 
  summarise(co2_sum = sum(co2_emissions_mt_per_cap),
            .by = year) |> 
  mutate(co2_cum = cumsum(co2_sum)) |> 
  left_join(world_gdp, by = c("year" = "Year")) |> 
  ggplot(aes(x = co2_cum, y = GDP)) +
  geom_text(data = \(x) filter(x, year %% 10 == 0), 
            aes(label = year), nudge_x = -4e3) +
  geom_point() +
  geom_line() +
  scale_y_continuous(labels = scales::label_dollar(scale = 1e-12)) +
  scale_x_continuous(labels = scales::label_comma()) +
  labs(x = "Cumulative CO2 emissions (metric tons per capita)",
       y = "GDP (trillion $)")
Warning: Removed 3 rows containing missing values (`geom_point()`).
Warning: Removed 3 rows containing missing values (`geom_line()`).
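
The warnings above report rows removed because of missing values. A plausible cause, assumed here rather than checked against the data, is that a few years have no match in the GDP table: `left_join()` keeps them and fills `GDP` with NA. A minimal sketch with hypothetical tables:

```r
library(dplyr)

# Hypothetical tables: emissions cover 3 years, GDP only 2 of them
emissions <- tibble(year = c(2000L, 2001L, 2002L), co2_cum = c(5, 11, 18))
gdp       <- tibble(Year = c(2000, 2002), GDP = c(100, 120))

joined <- emissions |>
  # the key columns have different names in the two tables
  left_join(gdp, by = c("year" = "Year"))

# Years without a GDP match are kept, with GDP set to NA
unmatched <- joined |> filter(is.na(GDP))
unmatched
```

Running the same `filter(is.na(GDP))` on the real joined table would show which years are affected.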

Find the year by which we had emitted 50% of all human-caused carbon dioxide.
cum_co2_long |> 
  summarise(co2_sum = sum(co2_emissions_mt_per_cap),
            .by = year) |> 
  mutate(cum = cumsum(co2_sum)) |> 
  filter(cum > (max(cum) / 2))
# A tibble: 28 × 3
    year co2_sum    cum
   <int>   <dbl>  <dbl>
 1  1991    986. 27183.
 2  1992    953. 28136.
 3  1993    937. 29073.
 4  1994    928. 30001.
 5  1995    947. 30948.
 6  1996    961. 31909.
 7  1997    966. 32875.
 8  1998    967. 33842.
 9  1999    962. 34804.
10  2000    964. 35768.
# ℹ 18 more rows

1991!

However, one can argue that the emissions data start only in 1960 and are in per-capita units.
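
The halfway-year logic can be verified by hand on a short series. The sketch below uses base R only and hypothetical yearly totals; it returns the first year whose running total exceeds half of the final total, which is the first row of the filtered table in the solution above.

```r
# Hypothetical yearly totals, for illustration only
yearly <- data.frame(year = 2001:2006,
                     co2_sum = c(10, 20, 30, 40, 50, 60))

# Running total of emissions up to and including each year
yearly$cum <- cumsum(yearly$co2_sum)

# First year whose running total exceeds half of the final total
half_year <- yearly$year[which(yearly$cum > max(yearly$cum) / 2)[1]]
half_year
```

Here the running totals are 10, 30, 60, 100, 150 and 210, so half of the final total is 105 and the first year above it is 2005.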

Let’s explore with another dataset from the famous Our World in Data organisation

Download and open the CSV CO2 file, assign name owid_co2
Tip

This file already comes in a format you can use straight away. However, some columns contain missing values, so remember to use the argument na.rm = TRUE in your sums.

Backup file hosted here

owid_co2 <- read_csv("https://nyc3.digitaloceanspaces.com/owid-public/data/co2/owid-co2-data.csv",
                     show_col_types = FALSE)
Using the co2 column, sum up per year the cumulative emission of carbon dioxide and find out when 50% of emissions were produced
Tip

Select “World” directly as the country; the cumulative CO2 emissions are already computed in cumulative_co2

owid_co2 |> 
  filter(country == "World") |> 
  select(country, year, cumulative_co2) |> 
  filter(cumulative_co2 > (max(cumulative_co2) / 2))
# A tibble: 29 × 3
   country  year cumulative_co2
   <chr>   <dbl>          <dbl>
 1 World    1994        899500.
 2 World    1995        923024.
 3 World    1996        947274.
 4 World    1997        971670.
 5 World    1998        996001.
 6 World    1999       1020835.
 7 World    2000       1046336 
 8 World    2001       1072011.
 9 World    2002       1098259.
10 World    2003       1125908.
# ℹ 19 more rows

We find 1994 this time. Emissions before 1960 were tiny: the worldwide population has almost tripled between 1960 and today, and energy usage in Western countries simply exploded.

Display the top 10 carbon dioxide emitting countries of all time and comment
Tip

Our World in Data added the calculations for continents and the world inside the country column. To keep only individual countries, keep only the rows whose iso_code is 3 characters long.

owid_co2 |> 
  filter(str_length(iso_code) == 3L) |>
  summarise(co2_sum = sum(co2, na.rm = TRUE),
            .by = country) |> 
  slice_max(co2_sum, n = 10) |> 
  mutate(rank = row_number(), .before = 1L)
# A tibble: 10 × 3
    rank country        co2_sum
   <int> <chr>            <dbl>
 1     1 United States  426915.
 2     2 China          260619.
 3     3 Russia         119291.
 4     4 Germany         93986.
 5     5 United Kingdom  78835.
 6     6 Japan           67735.
 7     7 India           59741.
 8     8 France          39398.
 9     9 Canada          34613.
10    10 Ukraine         30962.
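
The `iso_code` filter can be tried on a few rows. The table below is hypothetical (both the values and the way aggregates are coded are assumptions made for illustration); it shows that rows with a longer code or a missing code are both excluded.

```r
library(dplyr)
library(stringr)

# Hypothetical rows mimicking the mix of countries and aggregates
toy <- tibble(
  country  = c("France", "World", "Europe", "Japan"),
  iso_code = c("FRA", "OWID_WRL", NA, "JPN"),
  co2      = c(300, 35000, NA, 1000)
)

top_country <- toy |>
  # str_length(NA) is NA, and filter() drops rows where the test is NA
  filter(str_length(iso_code) == 3L) |>
  summarise(co2_sum = sum(co2, na.rm = TRUE), .by = country) |>
  slice_max(co2_sum, n = 1)
top_country
```

Without the filter, the "World" row would top the ranking, which is why aggregates have to be removed before comparing countries.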

France and Germany, our closest neighbors, which argue that they produce less than 1% of the worldwide annual CO2, forget their historical emissions.

Acknowledgements

Appendix

There is too much fossil energy left for our climate to stay livable if it is all burned. The carbon dioxide concentration has not been as high as today for 800,000 years. All of humankind's history has thus unfolded at lower concentrations, and the new era is unknown territory. However, as seen in Delannoy et al. 2021 in Applied Energy, there is not enough fossil energy to sustain our current lifestyle, nor for lower-income countries to reach it.