Sådan opbygges en intuition til rekursion

Og hvordan man bruger det til at løse problemer

Rekursion er et af de mest skræmmende emner, som studerende står over for i programmering. Det er svært at forstå, fordi den menneskelige hjerne ikke er i stand til at udføre rekursion - men computere er det. Dette er præcis, hvorfor rekursion er et så stærkt værktøj for programmører, men det betyder også, at det er ekstremt vanskeligt at lære at bruge det. Jeg vil gerne hjælpe dig med at opbygge en intuition til rekursion, så du kan bruge den til at løse problemer.

Jeg er undervisningsassistent for det indledende datalogikursus på mit universitet. Jeg har forklaret rekursion på nøjagtig samme måde et dusin gange i denne uge. Min forklaring ser ud til at hjælpe de fleste studerende. Denne artikel har den mest generelle forklaring øverst og den mest specifikke forklaring i bunden. På denne måde kan du starte i starten og stoppe, så snart du føler at du forstår rekursion godt nok. Jeg har givet nogle eksempler i Java, og de er enkle nok til, at enhver med programmeringserfaring kan fortolke dem.

Hvad er rekursion?

For at forstå rekursion, lad os tage et skridt tilbage fra programmering. Lad os starte med at etablere en generel definition af udtrykket. Noget er rekursivt, hvis det til en vis grad defineres af sin egen definition. Det hjælper sandsynligvis ikke dig med at forstå rekursion meget, så lad os se på en matematisk definition. Du er fortrolig med funktioner - et nummer går ind, et andet nummer kommer ud. De ser sådan ud:

f (x) = 2x

Lad os ændre denne idé lidt og i stedet tænke på en sekvens. En sekvens tager et heltal, og et heltal kommer ud.

A (n) = 2n

Sekvenser kan betragtes som funktioner med input og output, der kun er begrænset til positive heltal. Generelt starter sekvenser med 1. Dette betyder, at A (0) er 1. Sekvensen ovenfor er følgende:

A (n) = 1, 2, 4, 6, 8, 10,… hvor n = 0, 1, 2, 3, 4, 5,…

Overvej nu følgende rækkefølge:

A (n) = 2 x A (n-1)

Denne sekvens er rekursivt defineret. Med andre ord afhænger værdien af ​​et givet element af værdien af ​​et andet element. Denne sekvens ser sådan ud:

A (n) = 1, 2, 4, 8, 16,… hvor n = 0, 1, 2, 3, 4,…

Ethvert element er defineret som 2 gange det forrige element.

  • Elementet n = 4, 16, defineres som 2 gange det foregående element.
  • Elementet n = 3, 8, defineres som 2 gange det forrige element.
  • Elementet n = 2, 4, defineres som 2 gange det foregående element.
  • Elementet n = 1, 2, defineres som 2 gange det foregående element.
  • Elementet n = 0, 1, er defineret som…

Elementet n = 0 kan ikke defineres rekursivt. Der er ikke noget tidligere element. Vi kalder dette en basissag , og det er en nødvendig konsekvens af rekursive definitioner. De skal være udtrykkeligt repræsenteret i din kode . Vi kunne repræsentere denne rekursive sekvens i Java sådan:

public int A(int n){ if (n == 0) return 1; return 2 * A(n - 1);}

Du bør gøre dig bekendt med anatomien i en rekursiv metode. Bemærk basissagen: hvis n er 0, defineres elementet som 1. Ellers defineres elementet som 2 gange det forrige element. Vi skal rekursivt kalde metoden for at få værdien af ​​det forrige element og derefter multiplicere den med 2. Alle rekursive metoder vil have disse to komponenter:

  • Basissag, som returnerer en veldefineret værdi.
  • Rekursiv sag, som returnerer en rekursivt defineret værdi.

Lad os gøre et andet eksempel og fortsætte med matematikkonteksten. Fibonacci-sekvensen bruges ofte til at illustrere rekursion. Ethvert element i Fibonacci-sekvensen er summen af ​​de to foregående elementer. Det går sådan her:

F (n) = 1, 1, 2, 3, 5, 8,… hvor n = 0, 1, 2, 3, 4, 5,…

  • N = 5, element, 8, defineres som summen af ​​elementet n = 4 og elementet n = 3 ...

På dette tidspunkt skal du tøve. I det foregående eksempel afhængede hvert element kun af et andet element, nu afhænger hvert element af to andre elementer. Dette komplicerer tingene.

  • Elementet n = 4, 5, er defineret som summen af ​​elementet n = 3 og elementet n = 2.
  • Elementet n = 3, 3, er defineret som summen af ​​elementet n = 2 og elementet n = 1.
  • Elementet n = 2, 2, defineres som summen af ​​elementet n = 1 og elementet n = 0.
  • The n = 1 element, 1, is defined as the sum of the n = 0 element and…

The n = 1 element cannot be recursively defined. Neither can the n = 0 element. These elements cannot be recursively defined because the recursive definition requires two preceding elements. The n = 0 element has no preceding elements, and the n = 1 element has only one preceding element. This means that there are two base cases. Before writing any code, I would write down something like this:

The n = 0 element is defined as 1. The n = 1 element is defined as 1.

The n element is defined as the sum of the n-1 element and the n-2 element.

Now we have an idea of how this task is recursively defined, and we can go ahead and write some code. Neverstart writing code without first having a natural understanding of the task.

public int F(int n) if (n == 0 

The Call Stack

As programmers, we want to have an intuition for recursion so that we may use it to do things. To do so effectively, we must understand how a computer processes recursion.

There is a data structure that the computer uses to keep track of method calls called the call stack. Each method call creates local variables from the method parameters. The computer needs to store these variables while the method is being executed. Then, the computer ditches the values when the method returns to avoid wasting memory.

The call stack (and stacks in general) function as you might imagine some sort of real-life stack would. Imagine a stack of papers on your desk — it starts as nothing, and then you add papers one by one. You don’t know anything about any of the papers in the stack except for the paper on top. The only way you can remove papers from the stack is by taking them off the top, one-by-one, in the opposite order that they were added.

This is essentially how the call stack works, except the items in the stack are activation records instead of papers. Activation records are just little pieces of data that store the method name and parameter values.

Without recursion, the call stack is pretty simple. Here’s an example. If you had some code that looked like this…

public static void main(String[] args) System.out.println(myMethod(1));

…The call stack would look like this:

* myMethod(int a)
* main(String[] args)

Here we see two methods under execution, main and myMethod. The important thing to notice is that main cannot be removed from the stack until myMethod is removed from the stack. In other words, main cannot complete until myMethod is called, executed, and returns a value.

This is true for any case of method composition (a method within a method) — so let’s look at recursive example: the A(int n) method we wrote earlier. Your code might look like this:

public static void main(String[] args) System.out.println(A(4));
public static int A(int n){ if (n == 0) return 1; return 2 * A(n - 1);}

When main is called, A is called. When A is called, it calls itself. So the call stack will start building up like so:

* A(4)* main(String[] args)

A(4) calls A(3).

* A(3)* A(4)* main(String[] args)

Now, it’s important to note that A(4) cannot be removed from the call stack until A(3) is removed from the call stack first. This makes sense, because the value of A(4) depends on the value of A(3). The recursion carries on…

* A(0)* A(1)* A(2)* A(3)* A(4)* main(String[] args)

When A(0) is called, we have reached a base case. This means that the recursion is completed, and instead of making a recursive call, a value is returned. A(0) comes off the stack, and the rest of the calls are then able to come off the stack in succession until A(4) is finally able to return its value to main.

Here’s the intuition: the return value of any method call depends on the return value of another method call. Therefore, all the method calls must be stored in memory until a base case is reached. When the base case is reached, the values start becoming well-defined instead of recursively defined. For example, A(1) is recursively defined until it knows the definition of the base case, 1. Then, it is well-defined as 2 times 1.

When we are trying to solve problems with recursion, it is often more effective to think about the order in which values are returned. This is the opposite of the order in which calls are made. This order is more useful because it consists of well-defined values, instead of recursively defined values.

For this example, it is more useful to consider that A(0) returns 1, and then A(1) returns 2 times 1, and then A(2) returns 2 times A(1), and so on. However, when we are writing our code, it can easier to frame it in the reverse order (the order that the calls are made). This is another reason that I find it helpful to write the base case and the recursive case down before writing any code.

Helper Methods and Recursion vs. Loops

We are programmers, not mathematicians, so recursion is simply a tool. In fact, recursion is a relatively simple tool. It’s very similar to loops in that both loops and recursion induce repetition in the program.

You may have heard that any repetitive task can be done using either a while loop or a for loop. Some tasks lend themselves better to while loops and other tasks lend themselves better to for loops.

The same is true with this new tool, recursion. Any repetitive task can be accomplished with either a loop or recursion, but some tasks lend themselves better to loops and others lend themselves better to recursion.

When we use loops, it is sometimes necessary to make use of a local variable to “keep track” of a calculation. Here’s an example.

public double sum (double[] a){ double sum = 0.0; for (int i = 0; i < a.length; i++) sum += a[i]; return sum;
}

This method takes an array of doubles as a parameter and returns the sum of that array. It uses a local variable, sum, to keep track of the working sum. When the loop is completed, sum will hold the actual sum of all values in the array, and that value is returned. This method actually has two other local variables that are less obvious. There is the double array a, whose scope is the method, and the iterator i (keeps track of the index), whose scope is the for loop.

What if we wanted to accomplish this same task using recursion?

public double recursiveSum(double[] a) # recursively calculate sum

This task is repetitive, so it is possible to do it using recursion, though it is probably more elegantly accomplished using a loop. We just need to create a few local variables to keep track of the working sum and the index, right?

Alas, this is impossible. Local variables only exist in the context of a single method call, and recursion makes use of repeated method calls to accomplish a repetitive task. This means that local variables are pretty much useless when we are using recursion. If you are writing a recursive method and you feel as though you need a local variable, you probably need a helper method.

A helper method is a recursive method that makes use of additional parameters to keep track of values. For recursiveSum, our helper method might look like this:

public double recursiveSum(double[] a, double sum, int index){ if (index == a.length) return sum; sum += a[index]; return recursiveSum(a, sum, index + 1);}

This method builds the sum by passing the working value to a new method call with the next index. When there are no more values in the array, the working sum is the actual sum.

Now we have two methods. The “starter method,” and the helper method.

public double recursiveSum(double[] a) # recursively calculate sum
public double recursiveSum(double[] a, double sum, int index){ if (index == a.length) return sum; sum += a[index]; return recursiveSum(a, sum, index + 1);}

The term “helper method” is actually a bit of a misnomer. It turns out that the helper method does all the work, and the other method is just a starter. It simply calls the helper method with the initial values that start the recursion.

public double recursiveSum(double[] a) return recursiveSum(a, 0.0, 0);
public double recursiveSum(double[] a, double sum, int index){ if (index == a.length) return sum; sum += a[index]; return recursiveSum(a, sum, index + 1);}

Note that the values used in the starter call to the helper method are the same values used to initialize the local variables in the loop example. We initialize the variable used to keep track of the sum to 0.0, and we initialize the variable used to keep track of the index to 0.

Earlier, I said that local variables are useless in the context of recursion. This isn’t completely true, because the method parameters are indeed local variables. They work for recursion because new ones are created every time the method is called. When the recursion is executed, there are many method calls being stored in the call stack, and as a result there are many copies of the local variables.

You might ask, “If the helper method does all the work, why do we even need the starter method? Why don’t we just call the helper method with the initial values, and then you only need to write one method?”

Husk, at vi prøvede at erstatte metoden, der brugte en for-loop. Denne metode var enkel. Det tog en matrix som parameter og returnerede summen af ​​matrixen som en dobbelt. Hvis vi udskiftede denne metode med en, der tog tre parametre, skulle vi huske at kalde den med de rigtige startværdier. Hvis en anden ville bruge din metode, ville det være umuligt, hvis han eller hun ikke kendte startværdierne.

Af disse grunde er det fornuftigt at tilføje en anden metode, der tager sig af disse startværdier for os.

Afslutter

Rekursion er et ret udfordrende koncept, men du klarede det helt til slutningen af ​​min forklaring. Jeg håber, du forstår magien lidt bedre. Nu giver jeg dig officielt titlen "Grand-Wizard of Recursion." Tillykke!