Week 2 - Activities
Programming Fundamentals in R (Part 1) Workshop
In this week’s workshop we are going to practice some basic programming concepts in R through several activities and exercises. These should build your confidence and skills in navigating R and RStudio. There is a lot here, so do not worry if you don’t get everything done in the session, or if it takes a while for certain concepts to click.
Each activity includes a summary of the key points to help you understand the concepts and techniques. If you get stuck on an exercise, take a moment to review the key points in the activity—it might give you the clarity you need to move forward.
Don’t hesitate to collaborate! Feel free to chat with your neighbours and help each other out. Ciara, Aoife, and I will also be around to provide support, so please ask us for help.
Activity 1: Set up your Working Directory
One of the first steps in each of these workshops is setting up your working directory. If you remember from last week, a directory is simply another word for a folder. The working directory is the default folder where R will look to import files or save any files you export.
If you don’t set the working directory, R might not be able to locate the files you need (e.g., when importing a dataset) or you might not know where your exported files have been saved. Setting the working directory beforehand ensures that everything is in the right place and avoids these issues.
Steps to Set Up Your Working Directory
Click: Session → Set Working Directory → Choose Directory
Navigate to the folder you created for this course (this should be the same folder where you created week1).
Create a new folder called week2 inside this directory.
Select the week2 folder and click Open.
Verify Your Working Directory
After setting the directory, check the output in the console to confirm that the file path inside the setwd() command is correct. It should look something like this:
> setwd("C:/Users/0131045s/Desktop/PS6183/rintro/week2")
You can always check your current working directory by typing the following command in the console:
> getwd()
[1] "C:/Users/0131045s/Desktop/PS6183/rintro/week2"
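As a minimal sketch of how setwd() and getwd() fit together, the code below switches to a throwaway folder and then switches back. The folder name week2_demo and the variable names old_wd, demo_dir, and current_wd are made up for illustration; your own paths will be different.

```r
# Store the current working directory so it can be restored later
old_wd <- getwd()

# Create a throwaway folder (hypothetical name) inside R's temporary directory
demo_dir <- file.path(tempdir(), "week2_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Point R at the new folder and confirm the change
setwd(demo_dir)
current_wd <- getwd()
print(current_wd)

# Return to the original working directory
setwd(old_wd)
```

In the workshop itself you would normally set the directory once through the Session menu; the sketch only shows that getwd() always reports whichever folder was set most recently.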
Activity 2: Console Commands
In R, we can write and execute code in two places: a script and the console. Both options enable you to run the same code with the same results, but there are important differences between the two:
Scripts are ideal for saving, sharing, and reusing your code. You can think of them as your “notebook” for coding, where you can document and organise your work.
The console, on the other hand, is more like a “scratchpad” for quick, one-off commands. Code entered in the console is not saved automatically and cannot be reused unless you copy and save it elsewhere.
For this reason, we’ll focus primarily on using scripts throughout this course. However, there are specific scenarios where using the console is more practical or efficient. When those situations arise, we’ll discuss why the console is the better choice.
For now, let’s take some time to practice using the console. This activity will help you feel comfortable working with it when needed.
For each of the following exercises, type the code into the console and press Enter/Return.
Exercises
Basic Calculations
Perform the following calculations in the console:
Add 45 and 32.
Divide 120 by 8 (use the / key to divide).
Multiply 7 by 15 (use * to multiply).
Multiply 21 by 4, and then divide by 2.
Calculate the Mean
Imagine you want to calculate the average (mean) of these five numbers: 15, 22, 18, 30, and 3. Use the R console to find the correct result.
If your answer seems unusually high, remember that R follows the BEDMAS (Brackets, Exponents, Division/Multiplication, Addition/Subtraction) order of operations.
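To see how the order of operations changes a result, here is a small sketch that uses different numbers from the exercise. The values and the variable names no_brackets and with_brackets are chosen purely for illustration.

```r
# Without brackets, the division happens before the addition
no_brackets <- 2 + 4 + 6 / 3      # 6 / 3 is calculated first, giving 2 + 4 + 2
print(no_brackets)                # 8

# With brackets, the addition happens first
with_brackets <- (2 + 4 + 6) / 3  # 12 / 3
print(with_brackets)              # 4
```

Only the second version is the mean of 2, 4, and 6, because the brackets force R to add the numbers up before dividing.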
Fixing the + Operator Error
Run the first line of code in the console. When you encounter the + operator, how can you fix it? Test your answer in the console.
(60 / 100
+
The + operator in R indicates that the command is incomplete. Finish the command and press Enter again.
Identifying and Fixing Errors
Run the following code exactly as it appears. Take note of the error message. What went wrong? What do you think the correct code should be?
(20 = 30) / 34 - 21
Error in 20 = 30: invalid (do_set) left-hand side to assignment
Look at the 20 = 30, and remember that R takes what we say literally. What would be wrong with this statement if we meant it literally?
Using the Up Arrow to Edit Code
After running the code in the previous exercise (Identifying and Fixing Errors), click anywhere in the console and press the Up arrow key on your keyboard to retrieve the last command. Fix the error in the code and run it again.
Exploring Console Navigation
Press the Up arrow key a few times in the console. What happens? Now press the Down arrow key. How does it behave? Try experimenting with this feature.
Activity 3: Set up your R Script
We’ve done enough work in the console for now! Let’s switch gears and create an R script that we’ll use for the rest of today’s activities.
Creating an R Script in RStudio
Follow these steps to create a new R script:
Go to the menu bar and select: File → New File → R Script
This will open an untitled R script.
To save and name your script, select: File → Save As, then enter the name: 02-basic-programming-activities
Click Save.
Keyboard Shortcut Tip
There’s a faster way to do this in RStudio on your laptop using your keyboard.
Create a new script (local version of RStudio):
Windows: Press Control + Shift + N
Mac: Press Command + Shift + N
If you are using Posit Cloud, the keyboard shortcuts to create a new script are slightly different.
Create a new script (Posit Cloud):
Windows: Press Control + Alt + Shift + N
Mac: Press Command + Shift + Option + N
Save your script (works on both the local version and Posit Cloud):
Windows: Press Control + S
Mac: Press Command + S
Add Comments to Your Script
To make your script organised and easy to understand, use comments (#) to include a title and author information at the top of your file.
# Title: Basic Programming Activities
# Author: [Your Name]
# Date: [Today's Date]
For the rest of the activities, make sure to write your code in the R script. Keep your code neat by leaving space between lines of code and by adding comments. I recommend adding a comment to mark the start of each new activity.
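As a rough sketch of what a tidy script could look like, the layout below uses one comment per activity. The dashes after each activity title are just one possible style, and example_number is a made-up variable used only for illustration.

```r
# Title: Basic Programming Activities
# Author: [Your Name]
# Date: [Today's Date]

# Activity 4: Data Types ----------------------------------------------

class("example")   # check the data type of a piece of text

# Activity 5: Variables -----------------------------------------------

example_number <- 10      # hypothetical variable, used only for illustration
print(example_number)
```

R ignores everything after a # on a line, so comments never change what the code does.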
Activity 4: Data Types
Information on Data Types
R categorises information into data types. There are four main data types you’ll encounter:
Character (often called a “string”):
Any text enclosed in quotation marks (either double or single).
Examples: "ryan", "PS6183", 'introduction to R'
Numeric (or Double):
Any real number, with or without decimal places.
Examples: 22, 34.43, 54.00
Integer:
Whole numbers without decimal places. To specify a number as an integer, add a capital L at the end.
Examples: 78L, 55L, 21L
Logical:
A value that is either TRUE or FALSE. This is case-sensitive, so only the following examples will work: TRUE, FALSE, T, F.
It’s essential to understand data types because certain operations are valid only for specific data types. For instance, mathematical operations work on Numeric and Integer data, but not on Character data, even when the text looks like a number (e.g., "42.34").
We can check the data type of a piece of information by using the class() function:
class(78L)
[1] "integer"
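As a quick illustration, the sketch below checks the same value written in three different ways, plus a logical value. The variable names are made up for this example.

```r
# The same "10" can be three different data types depending on how it is written
type_text    <- class("10")   # quotation marks make it a character
type_number  <- class(10)     # a plain number is numeric
type_integer <- class(10L)    # the L suffix makes it an integer
type_logical <- class(TRUE)   # TRUE and FALSE are logical

print(type_text)      # "character"
print(type_number)    # "numeric"
print(type_integer)   # "integer"
print(type_logical)   # "logical"
```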
Sometimes, we may need to convert one data type to another. R provides several functions to help with this:
as.numeric() # Converts to numeric
as.character() # Converts to character
as.integer() # Converts to integer
as.logical() # Converts to logical
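Here is a short sketch of these functions in action, using values that are different from the exercises below. The variable not_a_number is a made-up name; the last example shows what tends to happen when a conversion is not possible.

```r
as.numeric("3.14")    # text that looks like a number becomes the number 3.14
as.character(10)      # the number 10 becomes the text "10"
as.integer("7")       # the text "7" becomes the integer 7
as.logical("TRUE")    # the text "TRUE" becomes the logical value TRUE

# Text that does not look like a number cannot be converted:
# R returns NA (a missing value) and shows a warning
not_a_number <- as.numeric("seven")
print(not_a_number)
```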
Exercises
In the R script you created, complete the following exercises.
To run a piece of code from your script, you can use either of the following methods:
Using the Run Button
Highlight the code you want to run and click the Run button in the top-right corner of the script editor.
Using Keyboard Shortcuts
Place your cursor on the line of code you want to run (or highlight multiple lines), then press the following keys simultaneously:
- Windows: Ctrl + Enter
- Mac: Command + Enter
Feel free to use whichever method is most comfortable for you!
Data Type Exercises
Guess the Data Type
Look at each of the following pieces of code. Before running them, try to guess their data type. Then, use the class() function in R to check your answer.
"Hello World!"
43
"42.34"
FALSE
44.4
72L
Fix the Data Type Errors
The following data types have been incorrectly entered into R. Use the appropriate conversion functions to correct them:
Convert "42.34" from character to numeric.
Convert "FALSE" from character to logical.
Convert 2024 from numeric to character.
Convert 1 from integer to logical (observe the result!).
Bonus: Convert 0 from numeric to logical and note what happens.
Activity 5: Variables
Information on Variables
Variables are labels for pieces of information we want to save and use later. To create a variable, we first specify the variable’s name, use the assignment operator (<-), specify the information that will be stored in that variable, and then run that line of code.
year <- "2024"
Once we have created a variable, we can type the name of that variable instead of its information.
print(year)
[1] "2024"
We can change (reassign) the piece of information that is stored to a variable.
year <- "2025"
print(year)
[1] "2025"
If we look at the Environment tab in the environment pane, we can see R storing and updating the information as we run our code.
Variables are really useful when we want to perform operations across multiple pieces of information, for example calculating the total and mean of someone’s scores on Extraversion.
extra1 <- 1 #a participant's score on our first extraversion item
extra2 <- 2
extra3 <- 4
extra4 <- 2
extra5 <- 3 #a participant's score on our fifth extraversion item
total_extra <- extra1 + extra2 + extra3 + extra4 + extra5
mean_extra <- total_extra/5
print(total_extra)
[1] 12
print(mean_extra)
[1] 2.4
Rules for Naming Variables
There are strict and recommended rules for naming variables. Here is a summary of the strict ones (you will run into an error if you break any of these). For more details, see the textbook: Conventions for Naming Variables.
Strict Rules
Variable names can include letters (A-Z, a-z), numbers (0-9), periods (.), and underscores (_), but must start with a letter or a period (e.g., first_name is valid; 1st_name is not).
No spaces in variable names. Use underscores or periods instead (e.g., my_name, my.name).
Variable names are case-sensitive (e.g., my_name ≠ My_name).
Avoid using reserved words in R, such as if, else, TRUE, or FALSE.
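The sketch below tries out a few of these rules. All of the names and values are made up for illustration. It also uses make.names(), a base R function that was not covered above, to show how R would repair a name that breaks the rules.

```r
# Valid names: start with a letter, use underscores or periods instead of spaces
first_name <- "Ada"     # hypothetical value
first.name <- "Ada"

# Names are case-sensitive, so these are two different variables
my_name <- "lowercase version"
My_name <- "capitalised version"
print(my_name)
print(My_name)

# make.names() returns a valid version of a name
make.names("first_name")  # already valid, so it is returned unchanged
make.names("1st_name")    # invalid: starts with a number, so R alters it
make.names("my name")     # invalid: contains a space, so R alters it
```

You do not need make.names() for the exercises; it is only a convenient way to check whether a name you have in mind would be accepted.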
Exercises
There is a keyboard shortcut for writing the assignment operator (<-).
On Windows, press these two keys at the same time: Alt + -
On Mac, press the following two keys at the same time: Option + -
Create a Character Variable
Create a variable named favourite_colour and assign it your favourite colour.
What data type is this variable? Use class() to check.
Create Numeric Variables
Create two variables called num1 and num2. Assign them any two numbers.
Add num1 and num2 together and save the result in a new variable called sum_result.
If you receive an error such as Error: object 'num1' not found, it is because you have not run the code that defines the variable. Make sure to highlight that code and press Run before adding num1 and num2 together.
Use print() to show the value of sum_result.
Convert Height
Make a variable called height_cm and set it to your height in centimetres (e.g., 175).
Create a second variable called height_m and set it to your height in metres. To do this, divide height_cm by 100.
Use print() to display the values of height_cm and height_m.
Activity 6: Vectors
Information on Vectors
In data analysis, we rarely work with individual variables or data types. Instead, we usually work with collections of data organised in data structures.
The most basic and important data structure in R is the vector. Vectors are like a column or row of data.
Creating a Vector
To create a vector, assign it a variable name, use the assignment operator (<-), and combine multiple items using the c() function (short for “combine”):
rintro_instructors <- c("Ryan", "Ciara", "Aoife")
Key Points about Vectors
The textbook provides more details on vectors, but for today’s session, here are the key points to remember:
All Elements in a Vector Must Be the Same Data Type
rintro_names <- c("Gerry", "Aoife", "Liam", "Eva", "Helena", "Ciara", "Niamh", "Owen") #character vector
rintro_marks <- c(69, 65, 80, 77, 86, 88, 92, 71) #numeric vector
rintro_satisfied <- c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE) #true or false vector
If we include multiple data types, R will either throw an error or convert everything to a single data type.
rintro_grades <- c(69, 65, 80, 77, 86, 88, "A1", 71)
print(rintro_grades)
[1] "69" "65" "80" "77" "86" "88" "A1" "71"
In this example, all numbers are converted to characters because “A1” is a character.
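The same kind of conversion happens with other mixtures of data types. Here is a small sketch, with made-up vector names, showing what c() does when logical, numeric, and character values are combined.

```r
# Numbers mixed with a logical value: TRUE is converted to the number 1
mixed_numbers <- c(1, 2, TRUE)
class(mixed_numbers)     # "numeric"
print(mixed_numbers)     # 1 2 1

# Anything mixed with text: everything is converted to character
mixed_text <- c(1, "two", TRUE)
class(mixed_text)        # "character"
print(mixed_text)        # "1" "two" "TRUE"
```

If a vector you expected to be numeric turns out to be character, a stray piece of text among the values is a likely cause.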
Vectors Can Contain a Single Element
rintro_week <- 2L
We can check the type of vector with the class() function:
class(rintro_week)
[1] "integer"
Functions with Vectors
You can use functions to calculate useful information about vectors:
mean(rintro_marks) # Calculate the mean
[1] 78.5
sort(rintro_marks) # Sort from lowest to highest
[1] 65 69 71 77 80 86 88 92
sort(rintro_marks, decreasing = TRUE) # Sort from highest to lowest
[1] 92 88 86 80 77 71 69 65
summary(rintro_satisfied) # Summarise logical values
Mode FALSE TRUE
logical 3 5
Vector Indexing
Vectors use an index to keep track of the position of each element. The index is determined by the order in which the data was entered.
Use square brackets [] to extract specific elements:
rintro_satisfied[1] # Extract the first element
[1] TRUE
rintro_satisfied[3] # Extract the third element
[1] TRUE
rintro_satisfied[8] # Extract the eighth element
[1] FALSE
rintro_names[c(2, 4, 8)] # Extract the 2nd, 4th, and 8th elements
[1] "Aoife" "Eva" "Owen"
rintro_names[c(1:4)] # Extract elements 1 through 4
[1] "Gerry" "Aoife" "Liam" "Eva"
Exercises
Create Vectors of Different Data Types
Create a character vector called friends with the names of 3 of your friends.
Create an integer vector called years with the number of years you’ve been friends (use 1 for friendships less than a year).
Create a numeric vector called extra with their extraversion scores (out of 5).
Create a logical vector called galway to indicate whether each friend lives in Galway (TRUE) or not (FALSE).
Use the class() function to check the data type of each vector.
Index Specific Elements
Extract the 2nd, 4th, and 6th elements from each of the following vectors. Copy, paste, and run the code before attempting this.
vect1 <- c("No, not this element", "Yes, this element", "No, not this element", "Yes, this element", "No, not this element", "Yes, this element")
vect2 <- c(0, 1, 0, 1, 0, 1)
vect3 <- c("FALSE", "TRUE", "FALSE", "TRUE", "FALSE", "TRUE")
Extract and Save the Bottom 3 Marks
How would you extract and save the lowest 3 marks from the rintro_marks vector? Try it. Make sure to create the variable in your script first.
rintro_marks <- c(69, 65, 80, 77, 86, 88, 92, 71)
Hint: Use the sort() function as shown in the examples.
Bonus: Calculate the mean of the bottom 3 marks (use the mean() function).
Activity 7: Data Frames
Information on Data Frames
A data frame is a rectangular data structure made up of rows and columns, similar to a spreadsheet in Excel or a table in a Word document. In R, each column of a data frame is a vector, and all vectors must have the same length.
To create a data frame, we use the data.frame() function:
my_df <- data.frame(
  name = c("Alice", "Bob", "Charlie"), #a character vector
  age = c(25L, 30L, 22L), #an integer vector
  score = c(95.65, 88.12, 75.33) #a numeric vector
)
my_df
     name age score
1   Alice  25 95.65
2     Bob  30 88.12
3 Charlie  22 75.33
Why do we use = instead of <- to create our column vectors?
The main reason is that we are creating these vectors inside a function. Inside functions like data.frame(), we use the = operator to name each column and give it its values, rather than <-.
It’s hard to go into more detail when we have not covered functions yet. For now, just put it down as a quirk of the R language!
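One way to see what = is doing inside data.frame() is to look at the column names afterwards. In the sketch below, the data frame pets and its columns are made up for illustration.

```r
# Inside data.frame(), the text before each = becomes a column name
pets <- data.frame(
  animal = c("cat", "dog"),   # creates a column called animal
  legs   = c(4L, 4L)          # creates a column called legs
)

# The column names come from the left-hand side of each =
names(pets)
```

The names animal and legs exist as columns of pets, which is why they are reached through the data frame (e.g., pets$animal).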
Extracting Information from Data Frames
You can extract or subset data from a data frame in several ways:
Selecting Columns
Using $ notation to extract a single column:
my_df$name
[1] "Alice" "Bob" "Charlie"
Using [] notation to extract one or more columns, the syntax being the dataframe[the rows we want, the columns we want].
my_df[, "age"] #This selects all the rows for the age column
[1] 25 30 22
To extract multiple columns, use the c() function:
my_df[, c("age", "score")]
  age score
1  25 95.65
2  30 88.12
3  22 75.33
Selecting Rows
You can access rows using indexing, specifying the row number you want to retrieve, following the syntax: the dataframe[the rows we want, the columns we want].
To get the first row:
my_df[1, ] #extracts the first row and all columns
   name age score
1 Alice  25 95.65
To get specific rows, use the c() function:
my_df[c(1, 3), ]
     name age score
1   Alice  25 95.65
3 Charlie  22 75.33
To select a range of rows, use the : operator. Note that my_df has only three rows, so asking for rows 2 to 4 returns a row of NA (missing) values in place of the non-existent fourth row:
my_df[2:4, ]
      name age score
2      Bob  30 88.12
3  Charlie  22 75.33
NA    <NA>  NA    NA
Selecting Rows and Columns Together
You can select specific rows and columns simultaneously using the syntax: the dataframe[the rows we want, the columns we want].
my_df[c(1,3), c("age", "score")]
  age score
1  25 95.65
3  22 75.33
Adding Columns to a Data Frame
You can add a new column to an existing data frame by assigning values to it:
#existing_df$NewColumn <- c(Value1, Value2, Value3) #syntax
my_df$gender <- c("Female", "Non-binary", "Male")
print(my_df)
name age score gender
1 Alice 25 95.65 Female
2 Bob 30 88.12 Non-binary
3 Charlie 22 75.33 Male
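A new column does not have to be typed in by hand; it can also be calculated from a column that already exists. The sketch below assumes a made-up data frame called heights and reuses the centimetres-to-metres idea from Activity 5.

```r
# A hypothetical data frame, used only for illustration
heights <- data.frame(
  name      = c("Ann", "Ben", "Cara"),
  height_cm = c(160, 175, 182)
)

# Build a new column from an existing one: divide every height by 100
heights$height_m <- heights$height_cm / 100

print(heights)
```

Because each column is a vector, the division is applied to every row at once.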
Adding Rows to a Data Frame
To add a new row, you first need to create a new data frame with the same columns as the original:
new_row <- data.frame(name = "John", age = 30, score = 77.34, gender = "Male")
Then, use the rbind() function to combine the two data frames:
my_df <- rbind(my_df, new_row)
my_df
name age score gender
1 Alice 25 95.65 Female
2 Bob 30 88.12 Non-binary
3 Charlie 22 75.33 Male
4 John 30 77.34 Male
Important Note
Indexing does not modify the original data frame.
my_df[c(1,3), c("age", "score")]
age score
1 25 95.65
3 22 75.33
print(my_df)
name age score gender
1 Alice 25 95.65 Female
2 Bob 30 88.12 Non-binary
3 Charlie 22 75.33 Male
4 John 30 77.34 Male
To keep the extracted subset, assign the result to a new variable:
my_df2 <- my_df[c(1,3), c("age", "score")]
print(my_df2)
age score
1 25 95.65
3 22 75.33
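The same point can be checked by counting rows before and after indexing. This is a minimal sketch using a made-up data frame called scores_df; nrow() is a base R function that returns the number of rows.

```r
# A hypothetical data frame, used only for illustration
scores_df <- data.frame(
  id    = c(1L, 2L, 3L),
  score = c(10, 20, 30)
)

# Indexing on its own only displays the result; scores_df is unchanged
scores_df[1:2, ]
nrow(scores_df)          # still 3

# Assigning the result to a new name keeps the subset
first_two <- scores_df[1:2, ]
nrow(first_two)          # 2
```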
Exercises
Creating a Data Frame
Create a data frame called student_data with the following columns:
name: Names of 3 students (as a character vector).
age: Ages of the students (as integers).
mark: Their grades (as numeric values).
Print the student_data data frame.
Extracting Data
Using the student_data data frame, complete the following tasks:
Extract the name column using $ notation.
Extract the mark column using [] notation.
Extract the names and marks of the first two students using [] notation and save this data frame as partial_student_data.
Extract the first row of the partial_student_data data frame.
Adding a Column
Add a new column called attendance to student_data, with the values TRUE, FALSE, and TRUE to indicate whether each student attended the most recent class.
Print the updated student_data data frame.
Adding a Row
Create a new row (call it new_student) with the following details:
name: "Judith"
age: 31L
mark: 89.5
attendance: FALSE
Add this row to the student_data data frame using rbind().
Display the updated data frame.
Keyboard Shortcuts in RStudio
Here are some useful keyboard shortcuts in RStudio.
Action | Windows | Mac |
---|---|---|
Create a new script | Ctrl + Shift + N | Cmd + Shift + N |
Save the current script | Ctrl + S | Cmd + S |
Run the current line or selected code | Ctrl + Enter | Cmd + Enter |
Insert assignment operator <- | Alt + - | Option + - |
Comment/Uncomment lines | Ctrl + Shift + C | Cmd + Shift + C |