What's a factor and why would you use it?

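A factor is the R data type used to store categorical variables. A variable is one of two kinds:

  1. continuous variable: can take any of an infinite number of values
  2. categorical variable: can take only a limited number of categories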

What's a factor and why would you use it? (2)

# Sex vector
sex_vector <- c("Male", "Female", "Female", "Male", "Male")

# Convert sex_vector to a factor
factor_sex_vector <- factor(sex_vector)

# Print out factor_sex_vector
factor_sex_vector
[1] Male   Female Female Male   Male  
Levels: Female Male

What's a factor and why would you use it? (3)

Categorical variables come in two types: nominal and ordinal.

Nominal variable: a categorical variable without an implied order
Ordinal variable: a categorical variable with a natural ordering
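
To make the nominal/ordinal distinction concrete, here is a minimal sketch. The two vectors, animals_vector and temperature_vector, are made-up example data and not part of the notes above; the point is only that an ordinal factor is created by passing ordered = TRUE together with the levels listed from lowest to highest.

```r
# Hypothetical example data, assumed for illustration only
animals_vector <- c("Elephant", "Giraffe", "Donkey", "Horse")
temperature_vector <- c("High", "Low", "High", "Low", "Medium")

# Nominal: the levels have no implied order
factor_animals_vector <- factor(animals_vector)

# Ordinal: list the levels from lowest to highest and set ordered = TRUE
factor_temperature_vector <- factor(temperature_vector,
                                    ordered = TRUE,
                                    levels = c("Low", "Medium", "High"))

# Check which factor carries an order
is.ordered(factor_animals_vector)
is.ordered(factor_temperature_vector)

# Printing an ordered factor shows the levels joined by "<"
factor_temperature_vector
```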

Factor levels

# To correctly map "F" to "Female" and "M" to "Male", the levels should be set to c("Female", "Male"), in this order.

# Code to build factor_survey_vector
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
# Specify the levels of factor_survey_vector
levels(factor_survey_vector) <- c("Female", "Male")
factor_survey_vector
[1] Male   Female Female Male   Male  
Levels: Female Male
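
The order matters because factor() sorts the levels alphabetically, so "F" comes before "M" and assigning c("Female", "Male") lines up correctly. As a sketch of an alternative that does not rely on remembering the sort order, the mapping can be spelled out with the levels and labels arguments of factor(); this is offered as one possible approach, not as the method the exercise expects.

```r
survey_vector <- c("M", "F", "F", "M", "M")

# factor() sorts the levels alphabetically: "F" first, then "M"
levels(factor(survey_vector))

# Alternative: state the mapping explicitly, "F" -> "Female" and "M" -> "Male"
factor_survey_vector <- factor(survey_vector,
                               levels = c("F", "M"),
                               labels = c("Female", "Male"))
factor_survey_vector
```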

Summarizing a factor

summary(): gives a quick overview of the contents of a variable

# Build factor_survey_vector with clean levels
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector) <- c("Female", "Male")
factor_survey_vector

# Generate summary for survey_vector
summary(survey_vector)

# Generate summary for factor_survey_vector
summary(factor_survey_vector)
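
The difference between the two summaries is easier to see when the results are stored and inspected. The sketch below rebuilds the same vectors; the name counts is introduced here only for illustration.

```r
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector) <- c("Female", "Male")

# For a character vector, summary() reports only length, class and mode
summary(survey_vector)

# For a factor, summary() counts the observations in each level
counts <- summary(factor_survey_vector)
counts

# table() reports the same per-level counts
table(factor_survey_vector)
```

Here the factor summary shows 2 observations for Female and 3 for Male, which is the information the character vector summary cannot provide.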

Battle of the sexes

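NA: R returns NA (with a warning) when you compare the levels of a nominal factor with > or <, since the idea of one level being 'larger' than another doesn’t make sense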

# Build factor_survey_vector with clean levels
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector) <- c("Female", "Male")

# Male
male <- factor_survey_vector[1]

# Female
female <- factor_survey_vector[2]

# Battle of the sexes: Male 'larger' than female?
male > female
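
The comparison can also be checked programmatically. This sketch rebuilds the same objects and wraps the comparison in suppressWarnings() only to keep the output quiet; the variable name comparison is introduced here for illustration.

```r
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector) <- c("Female", "Male")

male <- factor_survey_vector[1]
female <- factor_survey_vector[2]

# '>' is not meaningful for an unordered factor, so the result is NA
comparison <- suppressWarnings(male > female)
is.na(comparison)

# Equality tests are still meaningful for nominal factors
male == female
```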

Ordered factors

As a first step, assign speed_vector a vector with 5 entries, one for each analyst. Each entry should be either "slow", "medium", or "fast". Use the list below:

Analyst 1 is medium,
Analyst 2 is slow,
Analyst 3 is slow,
Analyst 4 is medium and
Analyst 5 is fast.
No need to specify these are factors yet.

# Create speed_vector
speed_vector <- c("medium", "slow", "slow", "medium", "fast")
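
A likely next step, sketched here under the assumption that "slow" < "medium" < "fast" is the intended ranking, is to turn speed_vector into an ordered factor. The name factor_speed_vector is chosen for illustration and does not appear in the notes above.

```r
speed_vector <- c("medium", "slow", "slow", "medium", "fast")

# Convert to an ordered factor, listing the levels from slowest to fastest
factor_speed_vector <- factor(speed_vector,
                              ordered = TRUE,
                              levels = c("slow", "medium", "fast"))
factor_speed_vector
summary(factor_speed_vector)

# With an ordered factor, comparisons are meaningful:
# is analyst 2 (slow) slower than analyst 5 (fast)?
factor_speed_vector[2] < factor_speed_vector[5]
```

Unlike the nominal factor in the previous section, this comparison returns TRUE rather than NA, because the levels now carry an order.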