% AlgoDat ue6 % 31 May 2020
Refer to scanned pages at the end!
Refer to scanned pages at the end!
Refer to scanned pages at the end!
For this example, we supply python code for the original variant of folge
as well as our optimized version opt_folge
.
strlst@ayaya ue6 cat folge.py
#!/usr/bin/env python3
def folge(n):
if n <= 2:
return 0
if n == 3:
return 1
else:
return folge(n-1) + 2 * folge(n-2) + 3 * folge(n-3)
print([folge(n) for n in range(30)])
strlst@ayaya ue6 cat opt_folge.py
#!/usr/bin/env python3
def opt_folge(n):
f = list()
for i in range(n + 1):
f.append(0)
if i <= 2:
continue
if i == 3:
f[i] = 1
continue
f[i] = f[i - 1] + 2 * f[i - 2] + 3 * f[i - 3]
return f[n]
print([opt_folge(n) for n in range(30)])
strlst@ayaya ue6 time python3 folge.py
[0, 0, 0, 1, 1, 3, 8, 17, 42, 100, 235, 561, 1331, 3158, 7503, 17812,
42292, 100425, 238445, 566171, 1344336, 3192013, 7579198, 17996232,
42730667, 101460725, 240910755, 572024206, 1358227891, 3225008568]
0m05.36s real 0m05.35s user 0m00.00s system
strlst@ayaya ue6 time python3 opt_folge.py
[0, 0, 0, 1, 1, 3, 8, 17, 42, 100, 235, 561, 1331, 3158, 7503, 17812,
42292, 100425, 238445, 566171, 1344336, 3192013, 7579198, 17996232,
42730667, 101460725, 240910755, 572024206, 1358227891, 3225008568]
0m00.02s real 0m00.02s user 0m00.00s system
It is easily verified that the results are identical up to n=30 and it is not hard to see how the actual algorithm of opt_folge
does the same as folge
. The first 10 numbers of the sequence are [0, 0, 0, 1, 1, 3, 8, 17, 42, 100]
.
As is apparent, the optimized version offers vastly increased performance. This makes sense, because variant folge
had exponential runtime (as can be seen in the call tree expansion), while we claim that variant opt_folge
has linear runtime. One quick glance at the algorithm reveals that it computes individual terms bottom-up and consists of one for
loop with operations that require constant time inside. This proves that opt_folge
has a runtime of $O(n)$.
The qualitative reason as for why opt_folge
is better was already vaguely hinted at. folge
has exponential runtime, because it computes many calls that are duplicates. folge(6)
for example would compute folge(5)
, folge(4)
and folge(3)
, whereas folge(7)' would compute
folge(6), folge(5)
and folge(4)
. As can be seen, folge(6)
is contained in the call stack of folge(7)
, folge(5)
is contained in the call stack of folge(7)
and folge(6)
, folge(4)
... and so forth. Using dynamic programming, we circumvent this issue by memorizing temporaray results.
Refer to scanned pages at the end!
To solve this problem, let's first try to construct an induction hypothesis that is always valid. We note the conditions for two values $a_i, a_j, j > i$ to be valid in sequence are $gcd(a_i, a_j) = 1$ as well as $a_i > a_j$. Note that $a_i = a_j$ cannot be valid, because then both $a_i, a_j$ would be their respective gcd.
To solve our problem, we allocate an array LS (longest sequence), where an entry at index i denotes the length of the longest subsequence up until and including index i. The solution we seek is then encoded in this array: $LS[n]$ gives us the length of the longest subsequence and using backtracking, we can reconstruct our individual choices of elements. This is because we essentially face a decision problem, should element $a_i$ be taken or not?. If we do take an element $a_i$, the length increases by 1, otherwise it doesn't.
Now, as is usual for dynamic programming problems, we make an important assumption: consider we have LS filled with correct values up to index l, meaning, LS contains all optimal solutions up until index l. Using this to our advantage we can construct the optimal solution for index $l + 1$ using the following relation:
$$ LS[l + 1] = \begin{cases} max{ LS[k] } + 1, & if\ gcd(a_{l + 1}, a_k) = 1, ak > a{l + 1}, 1 \leq k \leq l \ 1, & else \end{cases} $$
The recurrence relation only increments a previous value of LS when an element can be appended to an already valid subsequence. Correctness should follow intuitively, as the recurrence relation is valid for the base case $l = 0$, as well as for all subsequent elements due to the restrictions we placed on $LS[l + 1]$. We can now put our ideas into code. For illustration purposes, we provide a python program with a function length(...)
that provides $LS[n]$, getLS(...)
that provides LS and backtrackLS(...)
that provides a valid solution $(b_i) \subseteq (a_i)$.
#!/usr/bin/env python3
import sys
from math import gcd
def length(LS, n):
return LS[n - 1]
def getLS(A, n):
LS = list()
# loop all i until n
for i in range(n):
lengthiest = 1
# loop all j until i
for j in range(i):
# check for validity between a_i and a_j
if A[j] > A[i] and gcd(A[i], A[j]) == 1:
localLengthiest = LS[j] + 1
# update if new max was found
if localLengthiest > lengthiest:
lengthiest = localLengthiest
LS.append(lengthiest)
return LS
def backtrackLS(A, LS, n):
LSseq = list()
l = length(LS, n)
for i in reversed(range(1, n)):
# backtracking parameters
islengthier = l >= LS[i]
isfirst = len(LSseq) == 0
if not isfirst:
isbiggerthanlast = LSseq[-1] < A[i]
iscoprime = gcd(LSseq[-1], A[i]) == 1
# decide whether A[i] was part of initial solution or not
if (islengthier and isfirst) or (islengthier and
isbiggerthanlast and iscoprime):
LSseq.append(A[i])
l -= 1
# reverse result
return LSseq[::-1]
def main(args):
# parsing
A = list(map(int, args.split(' ')))
n = len(A)
print("(a_i): ", A)
# compute values of interest
LS = getLS(A, n)
print("LS: ", LS)
l = length(LS, n)
print("length: ", l)
LSseq = backtrackLS(A, LS, n)
print("sequence:", LSseq)
if __name__ == '__main__':
lines = sys.stdin.read()
main(lines)
Running this python program for the example that was given in the task description yields the following results:
strlst@ayaya ue6 echo 3 12 10 9 4 11 6 5 4 3 | python3 ls.py
(a_i): [3, 12, 10, 9, 4, 11, 6, 5, 4, 3]
LS: [1, 1, 1, 2, 3, 2, 3, 4, 5, 6]
length: 6
sequence: [12, 11, 6, 5, 4, 3]
The validity of this program hinges on our recurrence relation. The runtime of the actual implementation can be verified to be in $O(n^2)$. Because the most expensive part of the program is the double for
loop structure in getLS(...)
, because both these loops only count up to n and because all operations within the double for
loop structure are in $O(1)$, the program has to be in $O(n^2)$.