Problem 5:

Let f(z) = 1/GAMMA(z) , where is the gamma function, and let be the cubic polynomial that best approximates on the unit disk in the supremum norm . What is ?

Solution:

Since the functions are analytic, the supremum will occur on the unit circle. So write f(z) and p(z) as functions of t where z = exp(I*t).

> restart;
F:= t -> 1/GAMMA(exp(I*t));
F(Pi):= 0;

> P:= (a,t) ->
a[0] + a[1]*exp(I*t)+
a[2]*exp(2*I*t)+a[3]*exp(3*I*t);

Note that has the symmetry . It is easy to show that the best approximation must have that symmetry too, which means we can assume the coefficients to are real. And then , so it suffices to take in the interval .

It's convenient to have the conjugates of these functions available.

> Fbar:= t -> 1/GAMMA(exp(-I*t));Fbar(Pi):= 0;

> Pbar:= (a,t) -> P(a,-t);

We have to minimize with respect to the maximum with respect to of (this is a quadratic polynomial in ,..., , so it's a bit simpler to deal with than ). I can change this from a min-max question to a constrained minimization by writing it as

minimize (with respect to and ) subject to for each .

Now write the Karush-Kuhn-Tucker conditions:

The Lagrangian is

Formally the sum is over all . In general this might run into technical problems, but they won't occur here: will be nonzero for only finitely many . The Karush-Kuhn-Tucker conditions say

for

, , and for

Note that this last condition implies for only finitely many , since for only finitely many .

Also, since the objective is linear and the constraints are convex, we don't have to worry about local minima. The Karush-Kuhn-Tucker conditions should have a unique solution, which will give us the global minimum.

Finding the solution is not so easy, however. I spent some time exploring the parameter space, looking for approximate solutions.
For a starting point, we can use a Taylor polynomial for f(z), corresponding to a partial sum of the Fourier series for F(t). This will give us the best approximation in the L^2 norm on the circle.

> ptayl:= taylor(1/GAMMA(z),z=0,4);

> a0:= table([seq(j=evalf(coeff(ptayl,z,j)),j=0..3)]);

> plot(abs(P(a0,t)-F(t))^2,t=0..Pi);

Here's a fast way to approximate the maximum of . Instead of all t, I'll just look at 90 of them.
It's more convenient to use the list rather than the table (whose indices start at 0).

> eits:= evalf([seq(exp(I*i*Pi/90),i=0..90)]):
fvals:= evalf([seq(F(i*Pi/90),i=0..90)]):
appnorm:= proc(L) local i;
max(seq(
abs(fvals[i]-L[1]-eits[i]*(L[2]+eits[i]*(L[3]+eits[i]*L[4]))),i=1..91)) end:

Now we generate random steps from the current L, accepting the step if it improves the norm.

>    ti:= time():
bestL:= [a0[0],a0[1],a0[2],a0[3]]:
bestnorm:= appnorm(bestL);
for iter from 1 to 5000 do
  L:= bestL + .01*exp(-.001*iter)*[stats[random,normald](4)];
  newnorm:= appnorm(L);
  if newnorm < bestnorm then
    bestL:= L; bestnorm:= newnorm;
    print ("iteration",iter,bestnorm,L);
  else
    L:= 2*bestL-L;
    newnorm:= appnorm(L);
    if newnorm < bestnorm then
      bestL:= L; bestnorm:= newnorm;
      print ("iteration",iter,bestnorm,L);
    fi
  fi
od:
(time()-ti)*seconds;

The last solution turns out to have 4 correct digits of the actual answer.

> a1:= table([seq(j=bestL[j+1],j=0..3)]);

In our approximate solution, is close to constant from about to .

> plot(abs(F(t)-P(a1,t)),t=0..Pi);

On closer inspection, it achieves its maximum at three points in this interval, one of which is .

> plot(abs(F(t)-P(a1,t)),t=1..Pi,0.21 .. 0.21437 );

> peak1:= fsolve(diff((F(t)-P(a1,t))*(Fbar(t)-Pbar(a1,t)),t), t= 1.4 .. 1.5); peak1:= simplify(%,zero);

> peak2:= simplify(fsolve(diff((F(t)-P(a1,t))*(Fbar(t)-Pbar(a1,t)),t), t= 2.2 .. 2.3),zero); v1:= evalf(abs(F(Pi)-P(a1,Pi))^2);

So we should expect for three values of , say , and . We write the KKT equations with this assumption; note that in order to have for all and for we need at . This is automatic at .

> t[3]:= Pi:
kkt:= [ 1 - (lambda[1]+lambda[2]+lambda[3]),
seq(
add(((Pbar(a,t[i])-Fbar(t[i]))*exp(j*I*t[i])+(P(a,t[i])-F(t[i]))*exp(-j*I*t[i]))*lambda[i],i=1..3),j=0..3),
seq(
(F(t[i])-P(a,t[i]))*(Fbar(t[i])-Pbar(a,t[i])) - v, i=1..3),
seq(eval(diff((F(t)-P(a,t))*(Fbar(t)-Pbar(a,t)),t), t=t[i]),i=1..2)];