Relational Database Intro
    
        Duration: ~30 Minutes
        5 Minute Break
     
    
    Learning Objectives
    
        - Define relational database
- Identify basic structure of relational databases
- Merge two databases using a key
- Bind rows and columns to a dataset
What you will not learn
    
        - How to build relational databases
- Why to use relational databases
- Query language such as SQL
Relational database definition
    Collection of datasets that are combined using keys.
       This structure reduces the need for duplication.
    Relational database structure
    
    
        Pet-level data
        
            
                | pet_name | pet_type | child_key | 
        
        
            
                | Qua | cat | 2 | 
            
                | Ray | iguana | 3 | 
            
                | Sam | dog | 3 | 
            
                | Tea | cat | 1 | 
            
                | Urn | cat | 1 | 
        
    
    
        Child-level data
        
            
                | child_key | child_name | child_age | 
        
        
            
                | 1 | Amy | 8 | 
            
                | 2 | Ben | 10 | 
            
                | 3 | Cam | 7 | 
        
    
    Merged dataset structure
    
        Child and pet-level data
        
            
                | pet_name | pet_type | child_key | child_name | child_age | 
        
        
            
                | Qua | cat | 2 | Ben | 10 | 
            
                | Ray | iguana | 3 | Cam | 7 | 
            
                | Sam | dog | 3 | Cam | 7 | 
            
                | Tea | cat | 1 | Amy | 8 | 
            
                | Urn | cat | 1 | Amy | 8 | 
        
    
    
    
    Code for merging datasets
    
        These functions merge on a key.
        
        
            - left_join()
- right_join()
- full_join()
- inner_join()
 
    
    Download datasets
        Download and load example 
           pet and 
           child data
pet <- read_csv("[myfolder]/petdata.csv")
child <- read_csv("[myfolder]/childdata.csv")
  
    Left joining
    
    
        Pets
        
            
                | pet_name | pet_type | child_key | 
        
        
            
                | Qua | cat | 2 | 
            
                | Ray | iguana | 3 | 
            
                | Sam | dog | 4 | 
        
    
    
        Children
        
            
                | child_key | child_name | child_age | 
        
        
            
                | 1 | Amy | 8 | 
            
                | 2 | Ben | 10 | 
            
                | 3 | Cam | 7 | 
        
    
leftjoin <- pet %>% left_join(child, by = "child_key")
  
  
    
        Left joined table
        
            
                | pet_name | pet_type | child_key | child_name | child_age | 
        
        
            
                | Qua | cat | 2 | Ben | 10 | 
            
                | Ray | iguana | 3 | Cam | 7 | 
            
                | Sam | dog | 4 | NA | NA | 
        
    
  
  
    Right joining
    
    
        Pets
        
            
                | pet_name | pet_type | child_key | 
        
        
            
                | Qua | cat | 2 | 
            
                | Ray | iguana | 3 | 
            
                | Sam | dog | 4 | 
        
    
    
        Children
        
            
                | child_key | child_name | child_age | 
        
        
            
                | 1 | Amy | 8 | 
            
                | 2 | Ben | 10 | 
            
                | 3 | Cam | 7 | 
        
    
rightjoin <- pet %>% right_join(child, by = "child_key")
  
  
    
        Right joined table
        
            
                | pet_name | pet_type | child_key | child_name | child_age | 
        
        
            
                | NA | NA | 1 | Amy | 8 | 
            
                | Qua | cat | 2 | Ben | 10 | 
            
                | Ray | iguana | 3 | Cam | 7 | 
        
    
  
  
    Full joining
    
    
        Pets
        
            
                | pet_name | pet_type | child_key | 
        
        
            
                | Qua | cat | 2 | 
            
                | Ray | iguana | 3 | 
            
                | Sam | dog | 4 | 
        
    
    
        Children
        
            
                | child_key | child_name | child_age | 
        
        
            
                | 1 | Amy | 8 | 
            
                | 2 | Ben | 10 | 
            
                | 3 | Cam | 7 | 
        
    
fulljoin <- pet %>% full_join(child, by = "child_key")
  
  
    
        Fully joined table
        
            
                | pet_name | pet_type | child_key | child_name | child_age | 
        
        
            
                | Qua | cat | 2 | Ben | 10 | 
            
                | Ray | iguana | 3 | Cam | 7 | 
            
                | Sam | dog | 4 | NA | NA | 
            
                | NA | NA | 1 | Amy | 8 | 
        
    
  
  
    Inner joining
    
    
        Pets
        
            
                | pet_name | pet_type | child_key | 
        
        
            
                | Qua | cat | 2 | 
            
                | Ray | iguana | 3 | 
            
                | Sam | dog | 4 | 
        
    
    
        Children
        
            
                | child_key | child_name | child_age | 
        
        
            
                | 1 | Amy | 8 | 
            
                | 2 | Ben | 10 | 
            
                | 3 | Cam | 7 | 
        
    
innerjoin <- pet %>% inner_join(child, by = "child_key")
  
  
    
        Inner joined table
        
            
                | pet_name | pet_type | child_key | child_name | child_age | 
        
        
            
                | Qua | cat | 2 | Ben | 10 | 
            
                | Ray | iguana | 3 | Cam | 7 | 
        
    
  
    Code for binding data
    
        These functions bind data to a dataset.
        
        
     
    
  
    bind_rows()
    
    Rows with the same name must be the same type!
    Columns do not all need to match
    
bindrows <- innerjoin %>% bind_rows(pet)
    
  
  
    bind_cols()
    
    Columns must have the same number of rows!
    Columns must be in a dataframe, but can be any type.
    
IDs <- data.frame(IDs = 1:nrow(innerjoin))
bindcols <- IDs %>% bind_cols(innerjoin)
    
  
    And Now You Know!
    
    