Relational Database Intro

Duration: ~30 Minutes

5 Minute Break

Clock made of Legos

Learning Objectives

  • Define relational database
  • Identify basic structure of relational databases
  • Merge two databases using a key
  • Bind rows and columns to a dataset

What you will not learn

  • How to build relational databases
  • Why to use relational databases
  • Query language such as SQL

Relational database definition

Collection of datasets that are combined using keys.
This structure reduces the need for duplication.

Relational database structure

Pet-level data
pet_name pet_type child_key
Qua cat 2
Ray iguana 3
Sam dog 3
Tea cat 1
Urn cat 1
Child-level data
child_key child_name child_age
1 Amy 8
2 Ben 10
3 Cam 7

Merged dataset structure

Child and pet-level data
pet_name pet_type child_key child_name child_age
Qua cat 2 Ben 10
Ray iguana 3 Cam 7
Sam dog 3 Cam 7
Tea cat 1 Amy 8
Urn cat 1 Amy 8

Code for merging datasets

These functions merge on a key.

  • left_join()
  • right_join()
  • full_join()
  • inner_join()
Join illustrations

Download datasets

Download and load example pet and child data


pet <- read_csv("[myfolder]/petdata.csv")
child <- read_csv("[myfolder]/childdata.csv")

Left joining

Pets
pet_name pet_type child_key
Qua cat 2
Ray iguana 3
Sam dog 4
Children
child_key child_name child_age
1 Amy 8
2 Ben 10
3 Cam 7

leftjoin <- pet %>% left_join(child, by = "child_key")

Left joined table
pet_name pet_type child_key child_name child_age
Qua cat 2 Ben 10
Ray iguana 3 Cam 7
Sam dog 4 NA NA

Right joining

Pets
pet_name pet_type child_key
Qua cat 2
Ray iguana 3
Sam dog 4
Children
child_key child_name child_age
1 Amy 8
2 Ben 10
3 Cam 7

rightjoin <- pet %>% right_join(child, by = "child_key")

Right joined table
pet_name pet_type child_key child_name child_age
NA NA 1 Amy 8
Qua cat 2 Ben 10
Ray iguana 3 Cam 7

Full joining

Pets
pet_name pet_type child_key
Qua cat 2
Ray iguana 3
Sam dog 4
Children
child_key child_name child_age
1 Amy 8
2 Ben 10
3 Cam 7

fulljoin <- pet %>% full_join(child, by = "child_key")

Fully joined table
pet_name pet_type child_key child_name child_age
Qua cat 2 Ben 10
Ray iguana 3 Cam 7
Sam dog 4 NA NA
NA NA 1 Amy 8

Inner joining

Pets
pet_name pet_type child_key
Qua cat 2
Ray iguana 3
Sam dog 4
Children
child_key child_name child_age
1 Amy 8
2 Ben 10
3 Cam 7

innerjoin <- pet %>% inner_join(child, by = "child_key")

Inner joined table
pet_name pet_type child_key child_name child_age
Qua cat 2 Ben 10
Ray iguana 3 Cam 7

Code for binding data

These functions bind data to a dataset.

  • bind_rows()
  • bind_cols()
Bind illustrations

bind_rows()

Rows with the same name must be the same type!

Columns do not all need to match


bindrows <- innerjoin %>% bind_rows(pet)

    

bind_cols()

Columns must have the same number of rows!

Columns must be in a dataframe, but can be any type.


IDs <- data.frame(IDs = 1:nrow(innerjoin))
bindcols <- IDs %>% bind_cols(innerjoin)

    

And Now You Know!

Q & A