Introduction to R for Data Science :: Session 3
1. Introduction to R for Data Science
Lecturers
dipl. ing Branko Kovač
Data Analyst at CUBE / Data Science Mentor at Springboard
Data Science zajednica Srbije
branko.kovac@gmail.com
dr Goran S. Milovanović
Data Scientist at DiploFoundation
Data Science zajednica Srbije
goran.s.milovanovic@gmail.com
goranm@diplomacy.edu
2. Lists in R
• Lists can contain elements (objects) of various types/classes
• Lists can be recursive: a list of lists
• In R we use lists a lot; however, computing over lists is seldom the most efficient way
Intro to R for Data Science
Session 3: Lists & Functions
# Introduction to R for Data Science
# SESSION 3 :: 12 May, 2016
# It's time to speak about lists
num_vct <- c(2:5) # just another num vector
chr_vct <- c("data", "science") # char vector
data_frame <- data.frame(x = c("a", "b", "c", "d"), y = c(1:4)) # simple df
lista <- list(data_frame, num_vct, chr_vct) # and this is a list
lista # this is our list
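The bullets above say that a list can mix types and can contain other lists. A minimal sketch of both points follows; the object names (inner, outer) are hypothetical and are not part of the session code.

```r
# Hypothetical example objects; none of these names come from the slides
inner <- list(id = 1L, tags = c("r", "lists"))
outer <- list(label = "demo", payload = inner, flag = TRUE)

# Each top-level node keeps its own class
node_classes <- sapply(outer, class)
node_classes   # "character" "list" "logical" (named label, payload, flag)

# Reaching into the nested list: payload is itself a list
second_tag <- outer$payload$tags[2]
second_tag     # "lists"

str(outer)     # compact overview of the whole structure
```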
3. Lists in R
• Subsetting lists
• Think of an element (a node) of a list as a “container” which is always a list itself
• Subsetting with [[ ]] and [ ] – careful! [ ] returns a list holding the selected node(s), while [[ ]] returns the content of a single node
str(lista) # about a list
length(lista)
as.list(chr_vct) # another way to create a list
# Lists manipulation
names(lista) <- c("data", "numbers", "words")
lista[3] # 3rd element?
lista[[3]] # 3rd element?
is.list(lista[3]) # is this a list?
is.list(lista[[3]]) # and this?
class(lista[[3]]) # also a list? Don’t be so sure!
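The difference between the two subsetting operators can be checked directly. This is a small sketch on a stand-in list (demo is a hypothetical name chosen to mirror the structure of lista), not the session's own code.

```r
# A small stand-in list (hypothetical names, mirroring the slide's structure)
demo <- list(numbers = 2:5, words = c("data", "science"))

single <- demo["words"]     # single brackets: a list of length 1
double <- demo[["words"]]   # double brackets: the character vector itself

class(single)    # "list"
class(double)    # "character"
length(single)   # 1
length(double)   # 2

# Single brackets can select several nodes at once
both <- demo[c("numbers", "words")]
length(both)     # 2
```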
4. Lists in R
• More subsetting
• Adding and removing a node
• unlist()
lista$words # we can also extract an element this way
lista[["words"]] # or even like this
lista[["words"]][1] # digging even deeper
lista$new_elem <- c(TRUE, FALSE, FALSE, TRUE) # add new element
length(lista) # now list has 4 elements
lista$new_elem <- NULL # but we can remove it easily
new_vect <- unlist(lista) # creating a vector from list
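One thing worth knowing about unlist(): a vector holds a single type, so mixed content is coerced to a common type. The sketch below uses a hypothetical list (mixed) rather than lista to keep the output small.

```r
# Hypothetical list mixing integer, character and logical content
mixed <- list(a = 1:2, b = "x", c = TRUE)

flat <- unlist(mixed)
class(flat)   # "character": everything was coerced to strings
names(flat)   # "a1" "a2" "b" "c"

# Removing a node by assigning NULL, as on the slide
mixed$b <- NULL
length(mixed) # 2
```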
5. Functions in R
# Functions
# (w. less formalism but tips & tricks added)
# elementary: a definition
fun <- function(x) x + 10
fun(5)
# taking two arguments
fun2 <- function(x, y) x + y
fun2(3, 4)
# using "{" and "}" to enclose multiple R
# expressions in the function body
fun <- function(x, y) {
  a <- sum(x)
  b <- sum(y)
  a - b
}
r <- c(5,4,3); q <- c(1,1,1); fun(r,q)
fun(c(5,4,3),c(1,1,1))
# NOTE: "{" and "}" are generally used in R
# to mark the beginning and the end of
# a block
# a function is a function:
is.function(fun)
is.function(log) # log is built-in
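Since functions are ordinary objects in R, they can be passed to other functions like any other value. The sketch below illustrates that, together with default argument values; all names (apply_twice, add_ten, diff_of_sums) are hypothetical additions and do not appear in the session code.

```r
# Hypothetical helper: applies a function f twice to x
apply_twice <- function(f, x) f(f(x))

add_ten <- function(x) x + 10
apply_twice(add_ten, 5)                # 25

# Default argument values; the last evaluated expression is the return value
diff_of_sums <- function(x, y = 0) {
  sum(x) - sum(y)
}
diff_of_sums(c(5, 4, 3), c(1, 1, 1))   # 9
diff_of_sums(c(5, 4, 3))               # 12, y falls back to its default
```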
6. Functions in R
• Functional programming
# Functional programming ("Everything is a function...")
"^"(2,2)
"^"(2,3) # magic! - how do you do that?
2^2
2^3
# the difference between "operators" and "functions" in R: none
# Everything is a function:
"+"(2,2) # Four?
2+2 # yeah, right - Oh but I love this
"-"("+"(3,5),2)
"&"(">"(2,2),TRUE)
"&"(">"(3,2),TRUE) # punishment: write all your lab code for this week in this fashion...
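Treating operators as functions is more than a party trick: it lets an operator be handed to a higher-order function. A minimal sketch, using base R's Reduce(), sapply() and do.call(); the variable names are hypothetical.

```r
# Backticks refer to the operator as a function object
total  <- Reduce(`+`, 1:4)        # ((1 + 2) + 3) + 4
powers <- sapply(1:3, `^`, 2)     # 1^2, 2^2, 3^2
gap    <- do.call(`-`, list(8, 2))

total    # 10
powers   # 1 4 9
gap      # 6
```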
7. Lists and Functions in R
• Two things that come in handy: lapply() and apply()
# Step 1: here's a list:
aList <- list(c(1,2,3), c(4,5,6), c(7,8,9), c(10,11,12))
# Step 2: I want to apply the following function:
myFun <- function(x) {x[1]+x[2]-x[3]}
# to all elements of the aList list, and get the result as a list again.
# Here it is:
res <- lapply(aList, function(x) { x[1]+x[2]-x[3]})
# (passing the named function works too: lapply(aList, myFun))
unlist(res) # to get a vector
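When the goal is a vector rather than a list, base R also offers sapply() and vapply(), which combine the lapply() and unlist() steps. These two functions are not covered on the slide; the sketch below is an optional extension that reuses the slide's aList and myFun.

```r
aList <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9), c(10, 11, 12))
myFun <- function(x) x[1] + x[2] - x[3]

res_list   <- lapply(aList, myFun)              # always a list
res_sapply <- sapply(aList, myFun)              # simplified to a vector when possible
res_vapply <- vapply(aList, myFun, numeric(1))  # result type is declared up front

unlist(res_list)   # 0 3 6 9
res_vapply         # 0 3 6 9
```

vapply() fails with an error if any result does not match the declared template, which makes it the safer choice inside functions.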
8. Lists and Functions in R
• Two things that come in handy: lapply() and apply()
# Now say I've got a matrix
myMat <- matrix(c(1,2,3,4,5,6,7,8,9), nrow=3, ncol=3)
# now, I want the sums of all rows:
rsMyMat <- apply(myMat, 1, function(x) {sum(x)})
rsMyMat
is.list(rsMyMat) # just beautiful: FALSE, apply() returned a plain numeric vector
# for columns:
csMyMat <- apply(myMat, 2, function(x) {sum(x)})
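The second argument of apply() is the margin: 1 means rows, 2 means columns. Because matrix() fills column by column by default, it helps to print the matrix before reading the sums. A small sketch with the same values as on the slide (by_row is a hypothetical extra object):

```r
myMat <- matrix(1:9, nrow = 3, ncol = 3)  # filled column-wise by default
myMat
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

row_sums <- apply(myMat, 1, sum)   # margin 1: 12 15 18
col_sums <- apply(myMat, 2, sum)   # margin 2: 6 15 24

# Filling row by row instead swaps the two results
by_row <- matrix(1:9, nrow = 3, byrow = TRUE)
by_row_sums <- apply(by_row, 1, sum)   # 6 15 24
```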
9. Lists and Functions in R
• Two things that come in handy: lapply() and apply()
# with existing functions, such as sum(), this will do:
rsMyMat1 <- apply(myMat, 1, sum)
rsMyMat1
csMyMat1 <- apply(myMat, 2, sum)
csMyMat1
# try also…
rowSums(myMat)
colSums(myMat)
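The dedicated functions give the same numbers as the apply() calls, and the R documentation describes them as considerably faster. A quick equivalence check, rebuilt here so the snippet runs on its own (rowMeans() is an additional base R function not shown on the slide):

```r
myMat <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3)

same_rows <- all.equal(apply(myMat, 1, sum), rowSums(myMat))
same_cols <- all.equal(apply(myMat, 2, sum), colSums(myMat))
same_rows   # TRUE
same_cols   # TRUE

# The same family also covers means
row_means <- rowMeans(myMat)
row_means   # 4 5 6
```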