A brief theory about two pointers
Unfortunately, I have not found a sensible description of the two-pointer method anywhere, so I will write my own short text.
Example
Two pointers is a fairly simple idea that is often found in problems. Let's look at an example first.
So, let's say we have a problem: given an array of $N$ positive numbers, find several numbers running in a row (a contiguous segment) whose sum is greater than $K$ and which contains as few numbers as possible.
The first, fairly obvious idea: let us fix the position (index in the array) $i$ of the first of the required numbers. Then we need to find the closest position $j$ to the right of it such that the sum of the numbers in positions from $i$ to $j$ is greater than $K$. This can be done quite simply by a loop over $j$ from $i$ to $N$; the algorithm for solving the entire problem (including the loop over $i$) then looks as follows:
for i:=1 to n do begin
  s:=0;
  for j:=i to n do begin
    s:=s+a[j]; // in the variable s we keep the current sum
    if s>k then begin
      // writeln(j); // this print will come in handy later
      if j-i+1<min then min:=j-i+1;
      break;
    end;
  end;
end;
This works in $O(N^2)$. Let's figure out how it can be accelerated to $O(N)$.
The key idea here is as follows: if for some $i=i_0$ we reached a certain value $j=j_0$ in the loop over $j$, then for the next $i=i_0+1$ the found value of $j$ will necessarily be no less than $j_0$. To put it another way, if we uncomment the writeln in the code above, the output values of $j$ form a non-decreasing sequence.
Indeed, if for some $i=i_0$ we passed through all values of $j$ up to $j_0$, it means that the sum of the numbers from position $i_0$ to each intermediate $j$ was $\leq K$. But then, if we start counting the sum from position $i_0+1$, for all the same intermediate $j$ the sum will certainly also be $\leq K$ (because, by the problem statement, all numbers are positive).
And this means that we don't have to restart the search for $j$ from $i$ every time! We can start searching for $j$ each time from the value where we stopped last time:
s:=0;
for i:=1 to n do begin
  for j:=j to n do begin // the loop starts from the old j!
    s:=s+a[j];
    if s>k then begin
      if j-i+1<min then min:=j-i+1;
      s:=s-a[j]; // this is a dirty hack: in the next iteration
                 // we will do s:=s+a[j] again with the same j.
                 // Below you will see how to make it more beautiful.
      break;
    end;
  end;
  s:=s-a[i]; // so as not to recompute s from scratch every time
end;

(Note that here $j$ must be initialized to 1 before the outer loop.)

The same code can be rewritten in a slightly different way, perhaps more simply:
j:=1; s:=0;
for i:=1 to n do begin
  while (j<=n)and(s<=k) do begin // we thus continue from the old value of j
    s:=s+a[j];
    inc(j);
  end;
  if s>k then begin
    // no +1 here, because now j points to the first unused element
    if j-i<min then min:=j-i;
  end else
    // if we got here, then j>n but still s<=k: there is nothing further to look for
    break;
  s:=s-a[i]; // before the next iteration, throw out the number at position i
end;
This solution works in $O(N)$. Indeed, both loop variables — both $i$ and $j$ — only ever increase while the program runs. The total number of iterations of the inner loop equals the total number of increments of $j$, which is at most $N$. Everything outside the inner loop also works in $O(N)$, so the total complexity is $O(N)$.
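For readers who prefer a different language, here is a sketch of the same $O(N)$ solution in Python (the function name, 0-based indexing, and the convention of returning 0 when no suitable segment exists are my own choices, not part of the original problem):

```python
def shortest_subarray_over_k(a, k):
    # two pointers: j is the index of the first unused element
    n = len(a)
    best = n + 1  # "infinity": nothing found yet
    s = 0
    j = 0
    for i in range(n):
        while j < n and s <= k:
            s += a[j]
            j += 1
        if s > k:
            best = min(best, j - i)
        else:
            break  # j reached the end and s <= k: nothing more to find
        s -= a[i]  # throw out the number at position i
    return best if best <= n else 0
```
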
Variables $i$ and $j$ in this context are often called pointers — of course, not in the sense that they are of type pointer (they are not), but simply in the sense that they point to certain positions in the array. Hence the name: the two-pointer method.
General idea
In this example, the essence of the two-pointer method is clearly visible: we exploit the fact that when the value of one pointer increases, the value of the other pointer can also only increase. If we iterate over $i$ in ascending order, then $j$ will also only increase, so we don't have to restart the search every time; we can simply continue from the previous value.
Of course, this is not the case in every problem, but there are problems where it can be proved, and then the method works.
More examples:
- There are $N$ points on a line; we need to count the number of pairs of points whose distance is $\geq D$. Solution: sort the points by coordinate (if they are not already sorted in the input). We move one pointer $i$ from 1 to $N$ and maintain the second pointer $j$ so that, among the points lying to the right of $i$ (i.e. with larger indices), it points to the leftmost point whose distance from point $i$ is $\geq D$. At each step, we add $N-j+1$ to the answer — the number of points lying to the right of $i$ at distance at least $D$ from it. Since as $i$ increases the value of $j$ can also only increase, the two-pointer method works.
- Two sorted arrays are given; we need to check whether they have a number in common. Solution: we run two pointers, the first ($i$) over one array ($a$) and the second ($j$) over the other ($b$), so that $b[j]$ is always the first element greater than or equal to $a[i]$.
- $N$ points are given on a circle; we need to find the pair of points with the maximum distance between them (along the chord). Solution: sort the points so that they go in the order of walking around the circle, say counterclockwise. Take the first point ($i=1$) and find the point farthest from it; let it be point $j$. Then go to the next point counterclockwise ($i=2$). It is easy to see that the point farthest from it will also move counterclockwise (or stay in place), i.e. $j$ will only increase. And so on; we only need to take into account that the circle wraps around, and after the $N$th point comes the first one. As a result, both pointers make a full revolution around the circle.
- There are $N$ points on a circle; we need to find three points such that the area of the triangle spanned by them is maximal. Solution in $O(N^2)$: sort the points in the same way. Fix the first vertex of the triangle and run two pointers for the second and third vertices: when the second vertex shifts counterclockwise, the third one also moves counterclockwise. So in $O(N)$ we find the triangle of maximum area with a fixed first vertex; iterating over all first vertices solves the whole problem in $O(N^2)$.
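The first example above (pairs of points at distance at least $D$) can be sketched like this in Python; `count_far_pairs` is a hypothetical name, and the code assumes $D > 0$ and uses 0-based indexing:

```python
def count_far_pairs(points, d):
    # count pairs of points on a line whose distance is >= d (d > 0 assumed)
    pts = sorted(points)
    n = len(pts)
    count = 0
    j = 0
    for i in range(n):
        # move j to the leftmost point at distance >= d from pts[i];
        # since d > 0, the loop always pushes j past i
        while j < n and pts[j] - pts[i] < d:
            j += 1
        count += n - j  # all points from j onward are far enough
    return count
```
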
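The circle example can be sketched as follows, assuming the points are given as coordinates in counterclockwise order and the distances from each point are strictly unimodal (no exact ties); the helper takes indices modulo $N$, so the pointer $j$ can wrap around the circle:

```python
import math

def farthest_pair_distance(pts):
    # pts: points (x, y) on a circle, in counterclockwise order
    n = len(pts)

    def dist(a, b):
        # chord distance; indices are taken modulo n
        (ax, ay), (bx, by) = pts[a % n], pts[b % n]
        return math.hypot(ax - bx, ay - by)

    best = 0.0
    j = 1
    for i in range(n):
        if j < i + 1:
            j = i + 1
        # advance j counterclockwise while the next point is farther from i;
        # the guard j + 1 - i < n keeps j from catching up with i
        while j + 1 - i < n and dist(i, j + 1) > dist(i, j):
            j += 1
        best = max(best, dist(i, j))
    return best
```
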
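And a sketch of the $O(N^2)$ triangle solution, again assuming the points lie on a circle and are listed in counterclockwise order (so the heights of candidate third vertices above a fixed chord are unimodal); `max_triangle_area` is my own name, and it considers only index triples $i<j<k$, which covers every triangle:

```python
def max_triangle_area(pts):
    # pts: points (x, y) on a circle, in counterclockwise order
    n = len(pts)

    def area(a, b, c):
        # absolute area of the triangle via the cross product
        (ax, ay), (bx, by), (cx, cy) = pts[a], pts[b], pts[c]
        return abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay)) / 2.0

    best = 0.0
    for i in range(n - 2):
        k = i + 2
        for j in range(i + 1, n - 1):
            if k < j + 1:
                k = j + 1
            # move the third vertex while the area keeps growing
            while k + 1 < n and area(i, j, k + 1) > area(i, j, k):
                k += 1
            best = max(best, area(i, j, k))
    return best
```
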
Alternative implementations
Above, we always had one pointer as the "master" and the other as the "slave". Often a program can be written so that both pointers play equal roles. The key idea is that at every step we look at the current situation and, depending on it, move one of the two pointers.
For example, in our problem about a subsegment with sum greater than $K$:
i:=1; j:=1; s:=a[1];
while j<=n do begin
  if s>k then begin
    // if s>k then, first, we should check the solution,
    // and second, we can definitely move the first pointer
    if j-i+1<min then min:=j-i+1;
    s:=s-a[i]; // do not forget to adjust the sum
    inc(i);
  end else begin
    // otherwise we need to move the second pointer
    inc(j);
    if j<=n then s:=s+a[j]; // guard against running past the end of the array
  end;
end;
Even simpler — the problem of finding a common element in two sorted arrays (we assume both arrays are sorted in ascending order):
i:=1; j:=1;
while (i<=n)and(j<=m) do begin
  if a[i]=b[j] then begin
    writeln(i,' ',j);
    break;
  end else if a[i]>b[j] then
    inc(j)
  else
    inc(i);
end;
Two pointers and binary search
Many problems solved by two pointers can also be solved by binary search at the cost of an extra $\log N$ factor in the complexity. For example, in our main example, we could avoid moving the second pointer and instead find it anew every time by binary search — however, we would have to precompute the prefix sums (i.e., for each $k$, compute $s[k]$, the sum of the first $k$ elements of the array); after that, the sum on the segment from $i$ to $j$ is computed as the difference $s[j]-s[i-1]$.
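A sketch of that binary-search variant in Python (function and variable names are mine): since all numbers are positive, the prefix sums strictly increase, so `bisect_right` finds the first prefix sum strictly greater than `prefix[i] + k`.

```python
import bisect

def shortest_subarray_bsearch(a, k):
    # binary-search variant: O(N log N) instead of the O(N) two pointers
    n = len(a)
    prefix = [0] * (n + 1)
    for idx in range(n):
        prefix[idx + 1] = prefix[idx] + a[idx]
    best = n + 1  # "infinity": nothing found yet
    for i in range(n):
        # smallest j with prefix[j] > prefix[i] + k,
        # i.e. the sum of a[i..j-1] is greater than k
        j = bisect.bisect_right(prefix, prefix[i] + k)
        if j <= n:
            best = min(best, j - i)
    return best if best <= n else 0
```
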
Similarly, in the problem of finding a common element in two arrays, you can also use binary search to find the right $j$. But the problem about the two most distant points on a circle is not so easy to solve with binary search (although a ternary search works there).
Two pointers and event sorting or scanline
In the future, you will learn about the event-sorting method, also known as scanline. For now, I'll just say that it has a lot in common with the two-pointer method; many two-pointer problems can also be solved by event sorting. For example, in the problem of counting pairs of points on a line at distance $\geq D$, we could take all the given points, add the same points again shifted to the right by $D$, sort everything into one array while remembering each point's type (shifted or not), and then make a single pass through this array.
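A sketch of that scanline idea in Python (names are my own): sorting tuples puts a shifted "query" point (type 0) before an original point (type 1) at the same coordinate, so a point at distance exactly $D$ is still counted.

```python
def count_far_pairs_scanline(points, d):
    # each original point generates a "point" event (type 1);
    # each point shifted right by d generates a "query" event (type 0)
    events = [(x, 1) for x in points] + [(x + d, 0) for x in points]
    events.sort()  # at equal coordinates, queries come before points
    n = len(points)
    seen = 0   # original points strictly to the left of the current query
    pairs = 0
    for coord, kind in events:
        if kind == 0:
            pairs += n - seen  # points at coordinate >= p + d
        else:
            seen += 1
    return pairs
```
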
Similarly, the problem of finding a common element in two arrays can be solved simply by merging the two arrays into one, which can also be considered a variant of scanline or event sorting.

Many pointers
There are problems where you need many pointers. If you can arrange the algorithm so that all of your $M$ pointers only increase their values, the algorithm will work in $O(NM)$. Something like this:
while not all pointers have reached the end do begin
  if the first pointer can be advanced then
    advance the first pointer
  else if the second pointer can be advanced then
    advance the second pointer
  else
    ...
end;
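As a hypothetical illustration of this scheme (this particular problem is my own example, not from the text above): finding a value common to $M$ sorted arrays by advancing, at each step, every pointer whose current value lags behind the maximum. Each pointer only moves forward, giving $O(NM)$.

```python
def common_element(arrays):
    # find a value present in all sorted arrays, or None if there is none
    m = len(arrays)
    ptrs = [0] * m
    while all(ptrs[t] < len(arrays[t]) for t in range(m)):
        values = [arrays[t][ptrs[t]] for t in range(m)]
        hi = max(values)
        if min(values) == hi:
            return hi  # all pointers look at the same value
        # advance every pointer whose value is behind the maximum
        for t in range(m):
            if values[t] < hi:
                ptrs[t] += 1
    return None
```
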