This notebook accompanies the blog post on concatenating columns in R (https://www.marsja.se/how-to-concatenate-two-columns-or-more-in-r-stringr-tidyr/).


## Importing Data
First, download the example data file (`combine_columns_in_R.xlsx`) and place it in the same directory as this notebook, or change the path in `read_excel()` to point to where you saved it.

In [2]:
# Example Data:
library(readxl)
dataf <- read_excel("combine_columns_in_R.xlsx")

### See the Structure of the Data

In [3]:
str(dataf)

Classes 'tbl_df', 'tbl' and 'data.frame':	7 obs. of  5 variables:
 $ Date : num  15 11 11 10 14 10 12
 $ Month: chr  "Jun" "Jul" "Aug" "Sep" ...
 $ Year : num  2015 2016 2017 2018 2019 ...
 $ Snake: chr  "Python" "Boa" "Python" "Boa" ...
 $ Size : chr  "Small" "Large" "Medium" "Large" ...


### First 5 Rows of the Data


In [4]:
head(dataf, 5)

Date,Month,Year,Snake,Size
15,Jun,2015,Python,Small
11,Jul,2016,Boa,Large
11,Aug,2017,Python,Medium
10,Sep,2018,Boa,Large
14,Oct,2019,Python,Small


## Example 1: Concatenating with paste()

In [None]:
dataf$MY <- paste(dataf$Month, dataf$Year)
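
The cell above prints nothing, so here is a minimal sketch of what the call produces. The data frame `toy` is a hypothetical stand-in for the first rows of the Excel file, not part of the original notebook. By default, `paste()` joins its arguments with a single space.

```r
# Hypothetical toy data standing in for the first rows of the Excel file
toy <- data.frame(
  Month = c("Jun", "Jul", "Aug"),
  Year  = c(2015, 2016, 2017),
  stringsAsFactors = FALSE
)

# paste() converts the numeric Year to character and joins with sep = " "
toy$MY <- paste(toy$Month, toy$Year)

print(toy$MY)
```

The new column holds values such as `Jun 2015`, with one space between the month and the year.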

## Example 2: Concatenate Two Columns with Hyphen as Separator

In [6]:
dataf$MY <- paste(dataf$Month, "-", dataf$Year)


dataf

Date,Month,Year,Snake,Size,MY
15,Jun,2015,Python,Small,Jun - 2015
11,Jul,2016,Boa,Large,Jul - 2016
11,Aug,2017,Python,Medium,Aug - 2017
10,Sep,2018,Boa,Large,Sep - 2018
14,Oct,2019,Python,Small,Oct - 2019
10,Nov,2020,Python,Medium,Nov - 2020
12,Dec,2021,Boa,Medium,Dec - 2021


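#### Alternative:
Passing "-" as an ordinary argument leaves the default space on either side of it. To get no whitespace between the values, pass the hyphen through the `sep` argument instead: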

In [7]:
dataf$MY <- paste(dataf$Month, dataf$Year, sep = "-")
dataf

Date,Month,Year,Snake,Size,MY
15,Jun,2015,Python,Small,Jun-2015
11,Jul,2016,Boa,Large,Jul-2016
11,Aug,2017,Python,Medium,Aug-2017
10,Sep,2018,Boa,Large,Sep-2018
14,Oct,2019,Python,Small,Oct-2019
10,Nov,2020,Python,Medium,Nov-2020
12,Dec,2021,Boa,Medium,Dec-2021
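
A related option that the notebook does not cover is base R's `paste0()`, which behaves like `paste()` with `sep = ""`. The sketch below uses made-up vectors (`month` and `year` are hypothetical names) to compare the variants.

```r
# Made-up vectors for illustration only
month <- c("Jun", "Jul")
year  <- c(2015, 2016)

# Hyphen as separator, no surrounding spaces
with_sep <- paste(month, year, sep = "-")

# paste0() uses no separator at all
no_sep <- paste0(month, year)

# Putting the hyphen between the arguments of paste0() matches with_sep
manual <- paste0(month, "-", year)

print(with_sep)
print(no_sep)
print(manual)
```

Whether `paste(..., sep = "-")` or `paste0(..., "-", ...)` reads better is mostly a matter of taste; both return the same character vector here.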


## Example 3: Multiple Columns

In [8]:
dataf$DMY <- paste(dataf$Date, dataf$Month, dataf$Year)

head(dataf, 2)

Date,Month,Year,Snake,Size,MY,DMY
15,Jun,2015,Python,Small,Jun-2015,15 Jun 2015
11,Jul,2016,Boa,Large,Jul-2016,11 Jul 2016
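
When the columns to combine are held in a vector of names, one possible approach is to hand them to `paste()` through `do.call()`. This is a sketch on a hypothetical toy data frame, not something the original notebook does; `toy`, `cols`, and `DMY2` are illustrative names.

```r
# Hypothetical toy data standing in for the first rows of the Excel file
toy <- data.frame(
  Date  = c(15, 11),
  Month = c("Jun", "Jul"),
  Year  = c(2015, 2016),
  stringsAsFactors = FALSE
)

# Three columns joined explicitly, with a hyphen as separator
toy$DMY <- paste(toy$Date, toy$Month, toy$Year, sep = "-")

# Same result with the columns selected by name:
# do.call() passes each selected column, plus sep, as an argument to paste()
cols <- c("Date", "Month", "Year")
toy$DMY2 <- do.call(paste, c(toy[cols], sep = "-"))

print(toy)
```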


## Example 4: Using str_c()

In [10]:
library(stringr)

dataf$SnakeNSize <- str_c(dataf$Snake, " ", dataf$Size)

head(dataf)

Date,Month,Year,Snake,Size,MY,DMY,SnakeNSize
15,Jun,2015,Python,Small,Jun-2015,15 Jun 2015,Python Small
11,Jul,2016,Boa,Large,Jul-2016,11 Jul 2016,Boa Large
11,Aug,2017,Python,Medium,Aug-2017,11 Aug 2017,Python Medium
10,Sep,2018,Boa,Large,Sep-2018,10 Sep 2018,Boa Large
14,Oct,2019,Python,Small,Oct-2019,14 Oct 2019,Python Small
10,Nov,2020,Python,Medium,Nov-2020,10 Nov 2020,Python Medium
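
One practical difference between `paste()` and `str_c()` is how they treat missing values. The example data has none, so the sketch below uses made-up vectors containing an `NA` (an assumption for illustration): `paste()` turns the `NA` into the text "NA", while `str_c()` returns a missing value for that element.

```r
library(stringr)

# Made-up vectors with one missing value, for illustration only
snake <- c("Python", NA)
size  <- c("Small", "Large")

# Base R: NA is converted to the string "NA"
base_result <- paste(snake, size)

# stringr: a missing input makes the combined element missing
stringr_result <- str_c(snake, " ", size)

# str_c() also accepts a sep argument, like paste()
stringr_sep <- str_c(snake, size, sep = " ")

print(base_result)
print(stringr_result)
```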


## Example 5: Using tidyr and unite()

In [11]:
library(tidyverse) # or library(tidyr)

dataf <- dataf %>%
  unite("DM", Date:Month)

dataf

Registered S3 methods overwritten by 'ggplot2':
  method         from 
  [.quosures     rlang
  c.quosures     rlang
  print.quosures rlang
Registered S3 method overwritten by 'rvest':
  method            from
  read_xml.response xml2
-- Attaching packages --------------------------------------- tidyverse 1.2.1 --
v ggplot2 3.1.1       v readr   1.3.1  
v tibble  2.1.1       v purrr   0.3.2  
v tidyr   0.8.3       v dplyr   0.8.0.1
v ggplot2 3.1.1       v forcats 0.4.0  
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()


DM,Year,Snake,Size,MY,DMY,SnakeNSize
15_Jun,2015,Python,Small,Jun-2015,15 Jun 2015,Python Small
11_Jul,2016,Boa,Large,Jul-2016,11 Jul 2016,Boa Large
11_Aug,2017,Python,Medium,Aug-2017,11 Aug 2017,Python Medium
10_Sep,2018,Boa,Large,Sep-2018,10 Sep 2018,Boa Large
14_Oct,2019,Python,Small,Oct-2019,14 Oct 2019,Python Small
10_Nov,2020,Python,Medium,Nov-2020,10 Nov 2020,Python Medium
12_Dec,2021,Boa,Medium,Dec-2021,12 Dec 2021,Boa Medium
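
As the output shows, `unite()` joins values with an underscore by default and drops the input columns. Both behaviours can be changed through its `sep` and `remove` arguments. A minimal sketch on a hypothetical toy data frame (`toy` and `united` are illustrative names, not from the notebook):

```r
library(tidyr)

# Hypothetical toy data standing in for the first rows of the Excel file
toy <- data.frame(
  Date  = c(15, 11),
  Month = c("Jun", "Jul"),
  Year  = c(2015, 2016),
  stringsAsFactors = FALSE
)

# sep = "-" replaces the default underscore;
# remove = FALSE keeps the original Date, Month, and Year columns
united <- unite(toy, "DMY", Date, Month, Year, sep = "-", remove = FALSE)

print(united)
```

Keeping the original columns can be handy when the separate parts are still needed later in the analysis.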
