Conditional statements check whether some condition is true before deciding what code to execute. Conditional statements have two primary uses in R:
1) Executing different sections of code depending on whether a condition is satisfied
2) Subsetting a dataset to the subsample that meets some desired condition
These statements generate a value of type “logical”, which is either TRUE or FALSE.
Note: These aren’t the strings “TRUE” and “FALSE”. They are a special type of value.
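One way to see the difference between a logical value and a string that merely looks like one is to ask R for the class of each. This is a minimal sketch using only base R functions; the variable name result is just for illustration.

```r
# A logical value and a character string that only looks like one
class(TRUE)     # "logical"
class("TRUE")   # "character"

# is.logical() tests for the logical type directly
is.logical(FALSE)     # TRUE
is.logical("FALSE")   # FALSE

# A comparison always returns a logical value
result <- 3 > 2
class(result)   # "logical"
```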
Let’s look at some operators we use in conditional statements.
First, create a variable named x that is equal to the value 8.
x <- 8
x
Then, see if x meets certain conditions.
x == 8
x != 8
x > 10
x >= 10
x < 10
x <= 10
Logical vectors can also be treated as numeric (TRUE counts as 1 and FALSE counts as 0), which lets us count how many times a condition is satisfied.
x == 8
sum(x == 8)
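Since x holds a single value, the count above can only be 0 or 1. Counting becomes more useful with a longer vector. The sketch below uses a made-up vector named values, purely for illustration.

```r
# Hypothetical vector of values, used only for illustration
values <- c(8, 3, 8, 10, 8)

# The comparison returns one TRUE/FALSE per element
matches <- values == 8
matches

sum(matches)    # number of elements equal to 8: 3
mean(matches)   # proportion of elements equal to 8: 0.6
```

Taking the mean of a logical vector works for the same reason sum does: each TRUE contributes 1 and each FALSE contributes 0.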
We can also string together conditional statements to see if multiple conditions are true.
x > 5 & x < 10 # both have to be true
x > 9 | x == 8 # only one has to be true
x > 9 & x == 8 # both have to be true
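The & and | operators work element by element, so the same expressions can be applied to a whole vector at once. Here is a small sketch with a made-up vector named values; the ! operator, which flips TRUE and FALSE, is included as a related base R operator that the examples above do not cover.

```r
# Hypothetical vector of values, used only for illustration
values <- c(2, 6, 8, 12)

both   <- values > 5 & values < 10   # FALSE TRUE TRUE FALSE
either <- values > 9 | values == 8   # FALSE FALSE TRUE TRUE
negate <- !(values > 5)              # TRUE FALSE FALSE FALSE

both
either
negate
```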
If/else statements use conditionals to control the flow of a program and allow you to perform different actions depending on whether a condition is met. What are some examples for when if/else statements might be useful?
They use the following format:
if (condition is true)
{
  perform an action
}else{
  perform an alternative action
}
Let’s try one out with the variable we made, x.
if (x < 1)
{
  print('x is less than 1')
}else{
  print('x is greater than or equal to 1')
}
Note: you can add as many sub-conditions as you want by chaining else if!
if (x < 1)
{
  print('x is less than 1')
}else if(x >= 1 & x < 8){
  print('x is at least 1 and less than 8')
}else{
  print('x is greater than or equal to 8')
}
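To try several values without reassigning x each time, the same chain can be wrapped in a function. This is only a sketch: describe_x is a hypothetical helper name, not part of the lesson. Each branch is reached only when the earlier conditions were FALSE, so the second check does not need to repeat x >= 1.

```r
# describe_x is a hypothetical helper, written only for illustration
describe_x <- function(x) {
  if (x < 1) {
    'x is less than 1'
  } else if (x < 8) {
    # Only reached when x >= 1, because the first condition was FALSE
    'x is at least 1 and less than 8'
  } else {
    'x is greater than or equal to 8'
  }
}

describe_x(0)
describe_x(7.5)
describe_x(8)
```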
Now that we’ve played with some conditional statements in the context of if/else statements, let’s try applying conditionals to explore and subset a dataset.
First, let’s read in our data, which can be downloaded here.
#READ IN THE CARS CSV FILE
cars <- read.csv('~/Desktop/data/car-speeds-cleaned.csv', stringsAsFactors = FALSE)
#SEE WHAT TYPE OF VARIABLE CARS IS
class(cars)
Let’s explore some of the key variables in our data that we will be using. Remember the $ operator from this morning?
#LET'S SEE WHAT THE COLUMNS OF cars ARE
str(cars)
#LET'S USE TABLE TO SEE THE DIFFERENT COLORS AND STATES REPRESENTED IN cars
table(cars$Color)
table(cars$State)
#LET'S USE min AND max TO SEE THE RANGE OF SPEEDS
min(cars$Speed)
max(cars$Speed)
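If the CSV file is not at hand, the same exploration steps can be practiced on a small stand-in data frame. The sketch below assumes the three column names used in this lesson (Color, State, Speed); the data frame toy_cars and all of its values are made up for illustration. It also uses range(), a base R function that returns the minimum and maximum together.

```r
# toy_cars is a hypothetical stand-in for the real data; values are made up
toy_cars <- data.frame(
  Color = c('Blue', 'White', 'Blue', 'Red'),
  State = c('Utah', 'Utah', 'Arizona', 'NewMexico'),
  Speed = c(35, 42, 28, 51),
  stringsAsFactors = FALSE
)

# Count how often each color appears
color_counts <- table(toy_cars$Color)
color_counts

# range() returns the minimum and maximum in one call
speed_range <- range(toy_cars$Speed)
speed_range   # 28 51
```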
Now let’s use what we learned about conditional statements to explore the data.
#LET'S CHECK OUT HOW LONG cars$Color IS
cars$Color
length(cars$Color)
#HOW LONG IS THE LOGICAL CHECKING IF THE COLOR OF EACH CAR IS BLUE
cars$Color == 'Blue'
length(cars$Color == 'Blue')
#NOW, LET'S SEE HOW MANY CARS ARE BLUE BY TREATING THE LOGICAL AS NUMERIC
sum(cars$Color == 'Blue')
#COMPARISONS ARE CASE-SENSITIVE, SO 'blue' DOES NOT MATCH 'Blue'
sum(cars$Color == 'blue')
#TRY SOME OTHER COLORS ON YOUR OWN!
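String comparison with == is case-sensitive, which is why 'Blue' and 'blue' can give different counts. If the capitalization in a column is inconsistent, one possible approach is to convert everything to lowercase with tolower() before comparing. The vector car_colors below is made up for illustration and is not taken from the dataset.

```r
# Hypothetical color values with inconsistent capitalization
car_colors <- c('Blue', 'blue', 'White', 'BLUE')

exact_matches <- sum(car_colors == 'Blue')            # 1
any_case      <- sum(tolower(car_colors) == 'blue')   # 3

exact_matches
any_case
```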
Next, we are going to use logicals to subset our data frame to focus on the data we are interested in.
blue_cars <- cars[cars$Color == 'Blue', ]
dim(blue_cars)
table(blue_cars$Color)
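In the square brackets, the logical vector goes before the comma to select rows, and leaving the position after the comma empty keeps every column. The sketch below shows this on a small made-up data frame; toy_cars and its values are hypothetical, and only the column names come from this lesson.

```r
# toy_cars is a hypothetical stand-in for the real data; values are made up
toy_cars <- data.frame(
  Color = c('Blue', 'White', 'Blue', 'Red'),
  State = c('Utah', 'Utah', 'Arizona', 'NewMexico'),
  Speed = c(35, 42, 28, 51),
  stringsAsFactors = FALSE
)

# One TRUE/FALSE per row
is_blue <- toy_cars$Color == 'Blue'

# Rows where is_blue is TRUE, all columns
toy_blue <- toy_cars[is_blue, ]
dim(toy_blue)   # 2 3

# Rows where is_blue is TRUE, only the Speed column
blue_speeds <- toy_cars[is_blue, 'Speed']
blue_speeds     # 35 28
```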
Exercise: Create a new data.frame called white_cars_utah that includes all three columns of the data.frame cars, but only the rows for cars that are white and in Utah.
Solution:
white_cars_utah <- cars[cars$Color == 'White' & cars$State == 'Utah', ]
If we wanted the colors of only the cars in New Mexico and Arizona, we could combine two conditions with the | operator:
cars$Color[cars$State == 'NewMexico' | cars$State == 'Arizona']
However, we could also use the %in% operator:
cars$Color[cars$State %in% c('NewMexico', 'Arizona')]
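The two approaches select the same elements, and %in% stays readable as the list of states grows. A quick check on a made-up vector (states is hypothetical, not the real column) illustrates the equivalence.

```r
# Hypothetical state values, used only for illustration
states <- c('Utah', 'NewMexico', 'Arizona', 'Utah', 'Arizona')

# Two ways to flag the rows we want
with_or <- states == 'NewMexico' | states == 'Arizona'
with_in <- states %in% c('NewMexico', 'Arizona')

with_in                       # FALSE TRUE TRUE FALSE TRUE
identical(with_or, with_in)   # TRUE
```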