R for Biochemists @UAM

R2: Flow Control

Conditionals + if-else + nested conditionals + ifelse + switch
Loops + for loops + while loops + control statements + repeat
Author

Luis

Published

July 14, 2023

Modified

August 20, 2024

1 Flow control

Sequential execution of programming commands is insufficient for solving anything beyond trivial tasks. In most cases, solving a problem programmatically involves executing specific statements only under certain conditions and/or repeating a statement or set of statements multiple times. This ability to control when and how many times a statement or set of statements is executed is referred to as Flow control. In R, flow control is achieved using two specific structures in our code:

  • Conditional structures: These structures allow the execution of a statement or set of statements based on a condition.

  • Loop structures: These structures allow the execution of a statement or set of statements for a fixed number of times or until a condition is met.

In both cases, the block of code containing the statements to be conditionally or repeatedly executed is delimited by curly brackets.

2 Conditional structures

As indicated above, conditional structures are used to execute a block of code (contained within curly brackets) only when a condition is met, that is, when the condition evaluates to the Boolean TRUE. The conditional block of code is preceded by the statement if (condition), where condition is usually an expression using a relational operator (see the section on operators). The following code asks the user to enter a number and then prints negative if the number is less than zero. Then, it prints End to indicate that the program has finished.

x = readline("Enter a number:")
Enter a number:
y = as.integer(x)
if (y < 0) {
    print("negative")
}
Error in if (y < 0) {: missing value where TRUE/FALSE needed
print("End")
[1] "End"

Please notice that the statements after the if structure are always executed regardless of the condition. Often we want to execute one block of code if the condition is met and a different block when the condition is FALSE. To do that we use the if-else structure, simply adding the else statement (and its block of code) right after the block of code under if. The following code prints negative if the number is less than zero and positive otherwise.

x = readline("Enter a number:")
Enter a number:
y = as.integer(x)
if (y < 0) {
    print("negative")
} else {
    print("positive")
}
Error in if (y < 0) {: missing value where TRUE/FALSE needed
print("End")
[1] "End"
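
As a side note, an if-else structure in R also returns the value of the branch that was executed, so the result can be stored in a variable instead of being printed directly. The following is a minimal sketch that assumes a fixed value in place of the user input; the names y and Label are only illustrative.

```r
y <- -4  # assumed value standing in for the number typed by the user
# if-else returns the value of the executed branch, which is stored in Label
Label <- if (y < 0) {
    "negative"
} else {
    "positive"
}
print(Label)
```

With y set to -4 the first branch is executed, so Label contains "negative".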

Please notice that the following code is INCORRECT and does not work as intended: the statements after the if structure are always executed regardless of the condition, so positive is printed even when the number is negative.

print("This script is WRONG!! \n does not work as intended")
[1] "This script is WRONG!! \n does not work as intended"
x = readline("Enter a number:")
Enter a number:
y = as.integer(x)
if (y < 0) {
    print("negative")
}
Error in if (y < 0) {: missing value where TRUE/FALSE needed
print("positive")
[1] "positive"
print("End")
[1] "End"

2.1 Nested conditionals

Sometimes we need more than a single bifurcation in our code. For example, in the case above we may want to print negative when the number is less than zero, positive when it is more than zero and zero when it is equal to zero. To do that we can include (nest) an if-else structure within another if-else structure:

x = readline("Enter a number:")
Enter a number:
y = as.integer(x)
if (y < 0) {
    print("negative")
} else {
    if (y > 0) {
        print("positive")
    } else {
        print("zero")
    }
}
Error in if (y < 0) {: missing value where TRUE/FALSE needed
print("End")
[1] "End"
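
Nesting can also be used to guard against invalid input. When readline() receives no input (for example, when the code is run non-interactively) it returns an empty string, as.integer("") yields NA, and if (NA < 0) stops with the "missing value where TRUE/FALSE needed" error seen in the outputs on this page. The sketch below is one possible way to handle that case; the empty string assigned to x and the name Label are assumptions made for illustration.

```r
x <- ""             # assumed input: what readline() returns when nothing is typed
y <- as.integer(x)  # an empty string is converted to NA
if (is.na(y)) {
    # outer branch: the input could not be converted to a number
    Label <- "not a valid number"
} else {
    # inner (nested) if-else: only reached when y is a real number
    if (y < 0) {
        Label <- "negative"
    } else {
        Label <- "zero or positive"
    }
}
print(Label)
```

Here the outer condition is TRUE, so the nested if-else is never evaluated and no error is raised.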

2.2 Another way to code nested conditionals: else if statement

Nesting is not restricted to three alternative blocks of code; in fact, we can nest as many if-else structures as necessary. However, the code becomes cumbersome with more than two or three blocks of code. For this reason, in situations where many blocks of code are to be conditionally executed it is much more convenient to use the else if (condition) statement. In this case, the conditional structure begins with an if (condition) statement and continues with as many else if (condition) statements as required. Finally, the structure may end with an else, although this is not strictly required.

x = readline("Enter a number:")
Enter a number:
y = as.integer(x)
if (y < 0) {
    print("negative")
} else if (y > 0) {
    print("positive")
} else {
    print("zero")
}
Error in if (y < 0) {: missing value where TRUE/FALSE needed
print("End")
[1] "End"
Pay Attention

The conditionals are evaluated sequentially in the order they appear in the code and, once any of them results in TRUE, the remaining ones are skipped (not evaluated) and the code resumes execution after the whole conditional structure.

x = readline("Age in years:")
Age in years:
y = as.integer(x)
if (y < 0) {
    print("Not born yet!")
} else if (y < 1) {
    print("Infant")
} else if (y < 3) {
    print("Toddler")
} else if (y < 5) {
    print("Preschooler")
} else if (y < 9) {
    print("Child")
} else if (y < 13) {
    print("Preteen")
} else if (y < 18) {
    print("Teenager")
} else if (y < 25) {
    print("Young Adult")
} else if (y < 70) {
    print("Adult")
} else if (y < 100) {
    print("Elderly/Senior")
} else {
    print("Ancient!")
}
Error in if (y < 0) {: missing value where TRUE/FALSE needed

Note that, when coding many different conditionals, the else if statement is much more convenient than nesting if-else structures. Also note that the conditionals are evaluated sequentially. For example, the following code is a variation of the previous one in which only the order of some of the conditionals has changed and, as you can see, it does not work as intended, because y < 100 is now tested second and captures every age from 0 to 99:

print("This script is WRONG!! \n does not work as intended")
[1] "This script is WRONG!! \n does not work as intended"
x = readline("Age in years:")
Age in years:
y = as.integer(x)
if (y < 0) {
    print("Not born yet!")
} else if (y < 100) {
    print("Elderly/Senior")
} else if (y < 3) {
    print("Toddler")
} else if (y < 5) {
    print("Preschooler")
} else if (y < 9) {
    print("Child")
} else if (y < 13) {
    print("Preteen")
} else if (y < 18) {
    print("Teenager")
} else if (y < 25) {
    print("Young Adult")
} else if (y < 70) {
    print("Adult")
} else if (y < 1) {
    print("Infant")
} else {
    print("Ancient!")
}
Error in if (y < 0) {: missing value where TRUE/FALSE needed
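
The effect of the order can be checked without user input. The following sketch assumes a fixed age of 2 and reduces both versions to the three conditions that matter here; the variable names Wrong and Right are only illustrative.

```r
y <- 2  # assumed age, standing in for the value typed by the user

# Wrong order: the broad condition y < 100 is tested before y < 3
if (y < 0) {
    Wrong <- "Not born yet!"
} else if (y < 100) {
    Wrong <- "Elderly/Senior"
} else if (y < 3) {
    Wrong <- "Toddler"
}

# Right order: the narrower upper limit is tested first
if (y < 0) {
    Right <- "Not born yet!"
} else if (y < 3) {
    Right <- "Toddler"
} else if (y < 100) {
    Right <- "Elderly/Senior"
}

print(c(Wrong, Right))
```

In the first structure the condition y < 100 is already TRUE for a two-year-old, so the Toddler branch is never reached.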

Notice that, in any type of conditional structure, you can combine several comparisons with logical operators. In the following example each condition specifies both limits of its interval, which makes the order of the conditionals irrelevant:

x = readline("Age in years:")
Age in years:
y = as.integer(x)
if (y < 0) {
    print("Not born yet!")
} else if (y >= 70 & y < 100) {
    print("Elderly/Senior")
} else if (y >= 1 & y < 3) {
    print("Toddler")
} else if (y >= 3 & y < 5) {
    print("Preschooler")
} else if (y >= 5 & y < 9) {
    print("Child")
} else if (y >= 9 & y < 13) {
    print("Preteen")
} else if (y >= 13 & y < 18) {
    print("Teenager")
} else if (y >= 18 & y < 25) {
    print("Young Adult")
} else if (y >= 25 & y < 70) {
    print("Adult")
} else if (y >= 0 & y < 1) {
    print("Infant")
} else {
    print("Ancient!")
}
Error in if (y < 0) {: missing value where TRUE/FALSE needed
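
When many values have to be classified at once, the same non-overlapping intervals can also be expressed with the base R function cut(). This function is not used in the original example; the sketch below is only one possible alternative, and the vector Ages is invented for illustration. Setting right = FALSE makes each interval include its lower limit and exclude its upper limit, which matches the >= and < comparisons used above.

```r
# Hypothetical ages chosen to fall in each of the intervals
Ages <- c(0, 2, 4, 8, 12, 17, 24, 69, 99, 100)
# Interval limits: 11 break points define 10 intervals
Limits <- c(0, 1, 3, 5, 9, 13, 18, 25, 70, 100, Inf)
Stages <- c("Infant", "Toddler", "Preschooler", "Child", "Preteen",
    "Teenager", "Young Adult", "Adult", "Elderly/Senior", "Ancient!")
# right = FALSE: intervals are closed on the left, e.g. [1, 3)
Groups <- cut(Ages, breaks = Limits, labels = Stages, right = FALSE)
print(as.character(Groups))
```

Each age is assigned to exactly one interval, so the order in which the labels are listed only has to follow the order of the limits. Negative ages fall outside all intervals and would be returned as NA.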

2.3 A vectorized version of the if-else structures: ifelse statement

The ifelse function in R is a vectorized version of the if-else statement, allowing you to perform conditional operations on entire vectors or arrays at once. It takes three arguments: the condition to evaluate, the value to return if the condition is true, and the value to return if the condition is false.

For example, the code used above to determine whether a number was positive or negative evaluated a single value at a time. The ifelse function can evaluate multiple values simultaneously. That is, given a vector of numbers, ifelse can be used to return a vector indicating positive and negative values:

MyNumbers <- c(1, -3, 4, 5, -23)
ifelse(MyNumbers > 0, "Positive", "Negative")
[1] "Positive" "Negative" "Positive" "Positive" "Negative"

If more than one condition needs to be evaluated, ifelse calls can be nested as many times as needed. For example, the code below classifies numbers as positive, negative or zero:

MyNumbers <- c(1, -3, 0, 5, -23)
ifelse(MyNumbers > 0, "Positive", ifelse(MyNumbers < 0, "Negative",
    "Zero"))
[1] "Positive" "Negative" "Zero"     "Positive" "Negative"
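
Because ifelse is vectorized, it can also be applied to arrays such as matrices, and the result keeps the dimensions of the condition. The following sketch assumes a small matrix of made-up log2 fold changes; the names LogFC and Direction are hypothetical.

```r
# Hypothetical log2 fold changes: two genes (rows) in two conditions (columns)
LogFC <- matrix(c(2.5, -1.2, 0.3, -0.8), ncol = 2)
# Every element is tested at once; the result is a character matrix
Direction <- ifelse(LogFC > 0, "up", "down")
print(Direction)
```

The output is a 2 x 2 matrix in which each position holds the label for the corresponding element of LogFC.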

2.4 A final way to code conditionals: switch statement

The code below asks the user to enter a DNA nucleotide using a single-letter code (i.e., the user enters A, C, G or T) and the program prints the full name of the base (i.e., adenine, cytosine, guanine or thymine, respectively) or NA if the user did not enter a nucleotide. Notice that the else clause handles any input other than the expected A, C, G or T; this is a typical use of the else statement.

x = readline("Enter a nucleotide:")
Enter a nucleotide:
x = toupper(x)  #just in case
if (x == "A") {
    print("adenine")
} else if (x == "C") {
    print("cytosine")
} else if (x == "G") {
    print("guanine")
} else if (x == "T") {
    print("thymine")
} else {
    print(NA)
}
[1] NA

This code can also be written using a less frequent but more compact conditional structure based on the switch statement, as shown in the code below:

x = readline("Enter a nucleotide:")
Enter a nucleotide:
x = toupper(x)  #just in case!
switch(EXPR = x, A = "adenine", C = "cytosine", G = "guanine",
    T = "thymine", NA)
[1] NA

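Please note that the switch statement matches the value of the provided expression against the names of the given cases exactly, and that the final unnamed argument (NA in this example) is the value returned when no case matches. Unlike if-else statements, switch does not perform logical evaluations that check whether a condition is TRUE or FALSE.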
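
A further feature of switch is that a case left without a value falls through to the next case that has one, which is handy when several inputs share the same result. The sketch below assumes a fixed letter in place of the user input and classifies the base as purine or pyrimidine; the name BaseType is only illustrative.

```r
x <- "G"  # assumed input standing in for the letter typed by the user
# A has no value of its own, so it falls through to G; likewise C falls through to T
BaseType <- switch(EXPR = x, A = , G = "purine", C = , T = "pyrimidine",
    NA)
print(BaseType)
```

Both A and G return "purine", C and T return "pyrimidine", and any other input returns the default NA.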

3 Loop structures

Loop structures are programming constructs that allow you to repeat a set of statements or code multiple times. Loop structures are useful when you want to perform a specific action repeatedly, iterate over a sequence of values, or iterate until a certain condition is met. As in the case of conditionals, the block of code to be repeated is delimited by (contained within) a pair of curly brackets. There are three types of loop structures in R: for, while and repeat.

3.1 for loops

The for loop is used when you know the exact number of iterations required. It iterates over a sequence, such as a vector or a range of numbers, executing a block of code for each iteration. For example, the code below calculates the square of each of the values in a matrix:

MyMat <- matrix(1:9, ncol = 3)
print(MyMat)
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
MyMat3 <- MyMat
for (rr in 1:dim(MyMat3)[1]) {
    for (cc in 1:dim(MyMat3)[2]) {
        MyMat3[rr, cc] <- MyMat3[rr, cc]^2
    }
}

print(MyMat3)
     [,1] [,2] [,3]
[1,]    1   16   49
[2,]    4   25   64
[3,]    9   36   81

Note that, thanks to the vectorization of data structures in R, this task can be accomplished in a single operation:

MyMat2 <- MyMat^2
print(MyMat2)
     [,1] [,2] [,3]
[1,]    1   16   49
[2,]    4   25   64
[3,]    9   36   81
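
A for loop is more clearly needed when each iteration involves several steps. As a minimal sketch, the code below loops over a vector of made-up DNA sequences and stores the GC content of each one; the sequences and the names Seqs, Bases and GC are assumptions made for illustration.

```r
# Hypothetical short DNA sequences used only for illustration
Seqs <- c("ATGC", "GGCC", "ATAT")
GC <- numeric(length(Seqs))  # pre-allocated vector to store the results
for (i in seq_along(Seqs)) {
    # split the current sequence into single letters
    Bases <- strsplit(Seqs[i], "")[[1]]
    # fraction of letters that are G or C
    GC[i] <- sum(Bases %in% c("G", "C"))/length(Bases)
}
print(GC)
```

The loop variable i takes the values 1, 2 and 3, so the block is executed once per sequence and GC ends up with one value per element of Seqs.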

3.2 while loops

While loops are used to repeatedly execute a block of code (enclosed within curly brackets) as long as a specific condition remains true. The structure of a while loop includes a condition followed by a code block. The code block is executed iteratively until the condition evaluates to false. The computations within the code block usually modify the variable evaluated in the condition, so that the program exits the loop after a certain number of repetitions. The following code makes use of while loops to generate a vector containing all 64 codons:

Nucl <- LETTERS[c(1, 3, 7, 20)]
Codons <- c()
p1 <- 1
while (p1 < 5) {
    p2 <- 1
    while (p2 < 5) {
        p3 <- 1
        while (p3 < 5) {
            Codons <- c(Codons, paste0(Nucl[p1], Nucl[p2], Nucl[p3]))
            p3 <- p3 + 1
        }
        p2 <- p2 + 1
    }
    p1 <- p1 + 1
}
print(Codons)
 [1] "AAA" "AAC" "AAG" "AAT" "ACA" "ACC" "ACG" "ACT" "AGA" "AGC" "AGG" "AGT"
[13] "ATA" "ATC" "ATG" "ATT" "CAA" "CAC" "CAG" "CAT" "CCA" "CCC" "CCG" "CCT"
[25] "CGA" "CGC" "CGG" "CGT" "CTA" "CTC" "CTG" "CTT" "GAA" "GAC" "GAG" "GAT"
[37] "GCA" "GCC" "GCG" "GCT" "GGA" "GGC" "GGG" "GGT" "GTA" "GTC" "GTG" "GTT"
[49] "TAA" "TAC" "TAG" "TAT" "TCA" "TCC" "TCG" "TCT" "TGA" "TGC" "TGG" "TGT"
[61] "TTA" "TTC" "TTG" "TTT"
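
The codon example runs a number of iterations that is known beforehand. While loops are arguably most useful when the number of repetitions is not known in advance. The following sketch counts how many two-fold dilutions are needed to bring a sample below a threshold; the starting concentration, the threshold and the variable names are assumptions made for illustration.

```r
Conc <- 100  # assumed starting concentration (arbitrary units)
Steps <- 0   # number of dilutions performed so far
# keep diluting while the concentration is still above the threshold of 1
while (Conc > 1) {
    Conc <- Conc/2      # each dilution halves the concentration
    Steps <- Steps + 1
}
print(Steps)
print(Conc)
```

The condition is checked before every iteration, so the loop stops as soon as Conc is no longer greater than 1, which here happens after 7 dilutions.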

3.3 Control flow statements

Sometimes, it may be necessary for a program to skip executing a block of code (or part of it) and move to the next iteration. In other cases, the program may need to exit the loop prematurely if a certain condition is met. To achieve these actions, we can use the next and break statements, respectively. Both control statements are typically invoked within a conditional context. For instance, when computing the Shannon Entropy (\(H(X) = -\sum_i P(x_i) \log_2 P(x_i)\)) of a variable’s state distribution, it is common to skip terms with a frequency of zero to avoid encountering the undefined value of \(\log_2(0)\). Consider the following code, which takes a vector representing the observed absolute frequencies of nucleotides at a specific genomic position and calculates the Shannon Entropy for that position. The next statement skips the terms with a frequency of zero, so the loop moves on to the next iteration and the entropy is calculated from the non-zero frequencies only:

Nucleotide_frequencies <- c(25, 0, 5, 0)  #example of absolute frequencies of A, C, G and T
Total_Freq <- sum(Nucleotide_frequencies)
Entropy <- 0
for (Freq in Nucleotide_frequencies) {
    if (Freq == 0) {
        next  #skips the rest of the block and goes to the next iteration
    }
    Prob <- Freq/Total_Freq
    Entropy <- Entropy + (Prob * log2(Prob))
}
Entropy <- (-1) * Entropy
print(Entropy)
[1] 0.6500224

By utilizing the next statement, we ensure that the Shannon Entropy computation proceeds correctly without encountering any issues related to zero frequencies.
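
The break statement can be illustrated in a similar way. The sketch below scans a made-up list of codons and leaves the loop as soon as the first stop codon is found; the codon vector and the names StopCodons and FirstStop are assumptions made for illustration.

```r
# Hypothetical reading frame split into codons
Codons <- c("ATG", "GCC", "TGA", "AAA", "TAA")
StopCodons <- c("TAA", "TAG", "TGA")
FirstStop <- NA  # stays NA if no stop codon is found
for (i in seq_along(Codons)) {
    if (Codons[i] %in% StopCodons) {
        FirstStop <- i  # remember the position of the stop codon
        break  #exits the loop; the remaining codons are not examined
    }
}
print(FirstStop)
```

The loop ends at the third codon, so the later stop codon at position 5 is never reached.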

3.4 repeat loops

An additional statement in R that allows you to execute a block of code repeatedly is called repeat. Similar to while loops, the block of code within the repeat statement is repeated until a certain condition is met. However, unlike while loops, the condition itself is not specified within the repeat statement. Instead, the condition is part of the code block, and the loop is terminated using the break statement when the condition is satisfied. For example, the following code prints the Fibonacci terms smaller than 250:

Fib <- c(1, 1)
repeat {
    tmp <- Fib[length(Fib) - 1] + Fib[length(Fib)]
    if (tmp > 250) {
        break
    }
    Fib <- c(Fib, tmp)
}
print(Fib)
 [1]   1   1   2   3   5   8  13  21  34  55  89 144 233
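
For comparison, the same series can be produced with a while loop, where the stopping condition moves from the code block to the loop header. This is only a sketch of one possible equivalent; it reuses the variable name Fib from the example above.

```r
Fib <- c(1, 1)
# the condition is tested before each iteration: continue while the
# next term (sum of the last two) does not exceed 250
while (Fib[length(Fib) - 1] + Fib[length(Fib)] <= 250) {
    Fib <- c(Fib, Fib[length(Fib) - 1] + Fib[length(Fib)])
}
print(Fib)
```

Both versions stop at 233, because the next term (144 + 233 = 377) is greater than 250.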

4 References

5 Session Info

sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: x86_64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Madrid
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] webexercises_1.1.0 formatR_1.14       knitr_1.48        

loaded via a namespace (and not attached):
 [1] htmlwidgets_1.6.4 compiler_4.4.0    fastmap_1.2.0     cli_3.6.3        
 [5] tools_4.4.0       htmltools_0.5.8.1 rstudioapi_0.16.0 yaml_2.3.10      
 [9] rmarkdown_2.28    jsonlite_1.8.8    xfun_0.47         digest_0.6.36    
[13] rlang_1.1.4       evaluate_0.24.0  

Modesto Redrejo Rodríguez & Luis del Peso, 2024