An Infinite Family of Hyper-Lambert Functions and Their Use In Numerically Solving Exponential Tower Equations

(The analysis that follows is a light introduction. For more rigorous results, consult Paper 5).

Lambert's W function is the function that satisfies the functional equation W(z)*e^W(z)=z   (1)

Various properties of this function have been presented in the articles: Infinite Exponentials, A Maclaurin Expansion for Lambert's W function and Numerically Approximating the Principal Branch of the Complex Lambert's W function.

This function will henceforth be denoted as W, unless explicitly named. W is multi-valued, so it is usually denoted as W(k,z), with k in Z denoting the appropriate branch and W(0,z)=W(z) denoting the principal branch.

Preliminaries

Let us return to equation (1) for a moment. As W satisfies it, it is obvious that if we have an equation that can be brought into the form x*A^x=y (2), W can solve it as follows:

x*A^x=y, =>

x*e^(log(A)*x)=y, =>

log(A)*x*e^(log(A)*x)=log(A)*y, =>

log(A)*x=W(log(A)*y), =>

x=W(log(A)*y)/log(A).

Now let us verify that x, as given above, satisfies equation (2):

W(log(A)*y)/log(A)*A^(W(log(A)*y)/log(A))=

W(log(A)*y)/log(A)*e^W(log(A)*y)=

log(A)*y/log(A), (using the definition (1) with z=log(A)*y)

=y.
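
As a quick numerical sanity check of the formula x=W(log(A)*y)/log(A), here is a minimal Maple sketch, with A=3 and y=5 as sample values, and assuming W has been aliased to Maple's built-in LambertW (e.g. via alias(W=LambertW)):

> x0:=evalf(W(ln(3)*5)/ln(3));
> evalf(x0*3^x0);    #should return 5, up to rounding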

Quite a large family of exponential equations that are otherwise unsolvable algebraically can be brought into form (2) or a similar form, and as such can be solved by the W function. For example, the equation c^x=x can be brought to a form closely resembling (2), and thus we can solve it analytically by W:

x=c^x, =>
x*c^(-x)=1, =>

-x*e^(-x*log(c))=-1, =>

-x*log(c)*e^(-x*log(c))=-log(c), =>

-x*log(c)=W(-log(c)), =>

x=-W(-log(c))/log(c).

A similar treatment[1] solves equations of the form: A*c^x=B*x, A,B=/=1, A=/=0. (3)

The principal motivation behind such manipulations is to solve exponential equations that are otherwise impossible to solve algebraically. Knowing the behavior of W, one can determine immediately, for example, whether the equation 2^x=x has any real solutions, without resorting to numerical methods. In this case, solving the equation using W, one gets x=-W(-ln(2))/ln(2).

As W is real valued for x in [-1/e,+∞) and as -ln(2)<-1/e, which is the branch point of the W, it follows that there are no real values that satisfy this equation (this could have been determined by analyzing the function f(x)=2^x-x).
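
The same conclusion can be checked numerically; a minimal sketch, again assuming W is an alias for Maple's LambertW:

> evalf(-W(-ln(2))/ln(2));    #complex, so 2^x=x has no real solution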

After working with several exponential equations, one starts noticing that there exist equations which cannot be brought into form (3). Examples would be the equations:

x*e^(e^x)=y.

x*e^(x*e^x)=y.

x*2^(x*7^x)=y.

Trying to bring any of the above into form (3) would be an exercise in futile circularity. A little tinkering reveals the most general form of equation which is "solvable" for z by W:

a*z^m*e^(b*z^n)=y.
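
To see why, here is a formal sketch of how W handles this form (ignoring branch and sign issues, and assuming the powers taken below are legitimate):

a*z^m*e^(b*z^n)=y, =>

z^n*e^((n*b/m)*z^n)=(y/a)^(n/m), (raising both sides to the power n/m and dividing by a^(n/m)) =>

(n*b/m)*z^n*e^((n*b/m)*z^n)=(n*b/m)*(y/a)^(n/m), =>

(n*b/m)*z^n=W((n*b/m)*(y/a)^(n/m)), =>

z=((m/(n*b))*W((n*b/m)*(y/a)^(n/m)))^(1/n).

For m=n=1 this reduces to z=W(b*y/a)/b, while anything nested more deeply, like the examples above, escapes this scheme.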

The question that now raises its head is: what's keeping us from defining a "function", similar to W, to solve these equations? Well, one has to be very careful with what one means by "function" here. How do we know that such objects, if they exist, are well defined?

Notation

Let us start with some consistent notation. Assume we are in possession of the following:

S={fi(x):R->R}, i in {1,2,3,...,n}, n in N.

We define a function F:N×R->R, recursively:

F(n,x)={e^f1(x), if n=1; (e^fn(x))^F(n-1,x), if n>1}.

The function that will ultimately interest us (these definitions can later be extended from R to C) will be the "inverse" of:

x*F(n,x), n in N, x in R. (4)

The above function in its nonrecursive form, for a given n in N, looks like this:

x*(e^fn(x))^(e^fn-1(x))^(e^fn-2(x))^...^(e^f1(x)).

The reason the author chose the indices to run from top to bottom is to conform to the rules of iterated exponentiation, and because in this form it can be easily defined in Maple.

Now, given S and momentarily disregarding issues of existence altogether, we introduce a new "object", called (the) (possibly multi-valued) "inverse" of x*F(n,x), denoted by:

HW(f1,f2,...fn;y), (5)

In other words, we hope that this "object" satisfies:

HW(f1,f2,...fn;y)*F(n,HW(f1,f2,...fn;y))=y. (6)

Examples

If n=1, then for f1(x)=x, x*F(1,x)=x*e^x, which is the left side of (2) (with A=e), the inverse of which is the real W.

If n=2, then for f1(x)=log(2)*x and f2(x)=1, x*F(2,x)=x*e^(2^x), a function whose inverse cannot be found via W algebraically.

A more exotic example: If n=3, then for f1(x)=log(3)*x, f2(x)=log(5)*x and f3(x)=log(7)*x^2, x*F(3,x)=x*7^(x^2*5^(x*3^x)), a function whose inverse might be even more exotic.

Using the above notation, the "inverses" of the above examples would be denoted as:

HW(x;y)=W(y)

HW(log(2)*x,1;y) and

HW(log(3)*x,log(5)*x,log(7)*x^2;y)

What should the fi look like so that we are guaranteed the existence of HW? For the real case the answer is easy.

Lemma #1:

If the fi, i in {1,2,3,...,n}, n in N, are defined on D ⊆ R and are such that x*F(n,x) is defined in all of D and is 1-1 there, then HW exists there.

Proof:

Standard Calculus. Care has to be taken to ensure that x*F(n,x) is 1-1 in D. For example, if n=2, f1=x and f2=1/x, D cannot contain any full neighborhood of x~0.6590460684, as this x is the positive root of the equation e^x*x-e^x+x=0 and that's where the (local) minimum of x*F(n,x) occurs.
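
For this particular pair, x*F(2,x)=x*e^(e^x/x), and setting its derivative to zero (for x>0) leads to the equation above. The critical point can then be located numerically; a minimal Maple sketch:

> fsolve(exp(x)*x-exp(x)+x=0,x=0..1);    #approximately .6590460684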

Numerical Examples

Let n=1, 2, 3, ..., N and look at x*F(n,x), with fi:R+->R, fi(x)=x, for all i in {1,2,...,n}. We are then effectively looking at the inverse of x*(^n(e^x)), using the tetration operator ^n for a tower of height n. The main idea here is that if we have an analytic f(x) and the equation f(x)=y, then perhaps by solving the equation Tm(x)=y (for a certain finite m), where Tm(x) is the Taylor polynomial of order m, we may be able to obtain approximations to a root a of f(x)=y. Once we have such an approximation, we can perhaps write a=f^(-1)(y), under suitable circumstances.

Our strategy is therefore: Given a certain analytic f and an equation f(x)=y:

  1. Solve the equation Tm(x)=y.
  2. Test all the solutions of the above equation and see if any of them are close to a root.

Let's try some Maple code and defer the question of why it works until later:

> N:=6;
> for i from 1 to N do
> f[i]:=z;    #each f[i] is the identity, f[i](x)=x
> od:

> F:=proc(n)    #the tower F(n,z), built recursively from f[1],...,f[n]
> if n=1 then exp(f[1]);
> else exp(f[n])^F(n-1);
> fi:
> end:

And the "inverse":

> HW:=proc(y,m,n) #y: value to invert at. m: accuracy (order of the Taylor polynomial). n: height of tower.
> local s,p,sol,i,aprx,dy,dist,solution;
> dist:=infinity;#assume infinite distance
> s:=series(z*F(n),z,m);#find Taylor polynomial
> p:=convert(s,polynom);
> sol:={fsolve(p=y,z,complex)};#solve Taylor polynomial
> for i from 1 to nops(sol) do #check all roots one by one
>  aprx:=evalf(subs(z=op(i,sol),z*F(n)));#get an estimate
>  dy:=evalf(abs(y-aprx));#get epsilon
>  if dy<=dist then#if closer
>   solution:=op(i,sol);#this root is a better one
>   dist:=dy;#update distance
>  fi;
> od;
> if Im(solution)<>0 then#imaginary solution
>  if Im(y)=0 then #pick Arg(z)>0
>   solution:=Re(solution)+abs(Im(solution))*I;
>  else #Im(y)<>0#original argument complex, so apply counterclockwise continuity
>   if Im(y)>0 then
>    solution:=Re(solution)+abs(Im(solution))*I;
>   else
>    solution:=Re(solution)-abs(Im(solution))*I;
>   fi;
>  fi;
> fi;
> solution;
> end:

> plot({seq('HW'(y,35,n),n=1..6)},y=0..3);

[Figure: hyper-Lambert functions]

From top to bottom, they are the functions:

HW(x;y),

HW(x,x;y),

...

HW(x,x,x,x,x,x;y).

The top function (HW(x;y)) is an old friend: The real W. The functions appear to converge. We don't know this yet, but note that informally, when ^n(e^x) converges, it converges to e^(-W(-x))[2]. Accordingly, we are looking at the inverse of x*e^(-W(-x)). And here we get a surprise with Maple:

> solve(x*exp(-W(-x))=y,x);
y/e^y, which makes perfect sense, since this is log(z) whenever the infinite tower z^z^z^... converges (to e^y)[3].

Informally thus, HW(x,x,x,...;y)=y/e^y.
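
As a rough numerical check of this informal limit, here is a minimal Maple sketch (the f[i] are still the identity, as set in the loop above; the outputs are not reproduced here):

> seq(evalf(HW(0.5,35,n)),n=1..6);    #heights 1 through 6 at y=0.5
> evalf(0.5*exp(-0.5));    #the conjectured limit y/e^y at y=0.5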

Note that according to the same reference, we have convergence for positive real y, whenever y<=1. And here are the graphs along with the limit:

[Figure: hyper-Lambert function convergence]

The Complex Case

Before proceeding we need to explain the code a bit:

First, the choice of "solution" is ambiguous in the do loop, whenever both "solution" and "conjugate(solution)" are in the solution set, since both minimize epsilon. For this, we pick among the two using counter-clockwise continuity, by defining our branch cut to be the negative real axis. If y is in R and "solution" is in C, we pick the one with Arg(solution)>0. If y is in C, we pick between "solution" and "conjugate(solution)" the one whose Arg has the same sign as Arg(y). This guarantees that if we approach the negative real axis from above, the solution's Arg will be positive, while if we approach the negative Real axis from below, the solution's Arg will be negative.

Second, how do we know that increasing m will increase the accuracy of the found root?

Proof:

Fix n and let g(z)=z*F(n,z), so we have the equation g(z)=y. Let f(z)=g(z)-y, gk(z)=Tk(z), where Tk(z) is the Taylor polynomial of degree k of g(z), and let fk(z)=gk(z)-y. Then fk(z)->f(z) uniformly on compact sets and all the functions involved are entire (assuming, as in our examples, that the fi are entire). In particular g(z) is entire and non-constant, so according to Picard's Little Theorem, for every y (with at most one exception) there exists a with g(a)=y; that is, except in at most one case, there exists a with f(a)=0.

Once we have a root a, Hurwitz's Root Theorem provides a neighborhood |z-a|<δ and a number N such that for every k>N, fk(z)=0 has a root in that neighborhood. In particular, there exists aT with Tk(aT)=y, and aT satisfies |aT-a|<δ.

Using the continuity of g(z), |y-g(aT)|<epsilon for aT close enough to a, and the code above picks the aT that minimizes this epsilon. Note that if aTk is in {z: fk(z)=0}, the (locally) uniform convergence of fk(z) guarantees that lim(k->+∞) inf{|a-aTk|}=0, so increasing the degree of the Taylor polynomials gives us better and better approximations of the root a.
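
As a concrete illustration in the simplest case n=1 (i.e. for W itself), one can watch a real root of the truncated equation approach W(1) as the truncation order grows. This is only a sketch; it assumes, as elsewhere in this page, that W is an alias for Maple's LambertW:

> p5:=convert(series(z*exp(z),z,5),polynom):     #degree-4 Taylor truncation of z*e^z
> p15:=convert(series(z*exp(z),z,15),polynom):   #degree-14 Taylor truncation
> fsolve(p5=1,z=0..1);     #root of the truncated equation near W(1)
> fsolve(p15=1,z=0..1);    #closer still
> evalf(W(1));             #the omega constant, approximately .5671432904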

Let's see if we can solve some complicated exponential equations this way. In each case, we have to run the do loop and then substitute for the functions fi that interest us. Then we run the iterated exponential code.

If our equation is: x*e^(e^x)=3, we are interested in HW(x,1;3) (indeed, with f1(x)=x and f2(x)=1, F(2,x)=(e^1)^(e^x)=e^(e^x)):

> for i from 1 to N do
> f[i]:=z;
> od:
> f[2]:=1;

> F:=proc(n)
> if n=1 then
> exp(f[1]);
> else
> exp(f[n])^F(n-1);
> fi:
> end:

Our y is 3:

> a:=HW(3,10,2);
a := .5397051810

> evalf(subs(z=a,z*F(2)));
3.000469496

Suppose our equation had been instead: e^(e^x)=x. This equation can be immediately brought into the form: x*e^(-e^x)=1. In this case we run the code similarly, as we are interested in HW(x,-1;1):

> f[2]:=-1;

Our y now is 1:

> a:=HW(1,10,2);
a := .3144595775 + 1.338932374*I
> evalf(subs(z=a,z*F(2)));
1.003995243 + .007277260574*I

Let's increase the accuracy a bit:

> a:=HW(1,30,2);
a := .3181315052 + 1.337235701*I
> evalf(subs(z=a,z*F(2)));
.9999999991 + .5033177302*10^(-9)*I

A more exotic example: 2^(7^(3^x))=x. This equation can be immediately brought into the form: x*2^(-7^(3^x))=1, so we are looking for HW(x*log(3),log(7),-log(2);1).
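
To spell out the bookkeeping: with f1(x)=log(3)*x, f2(x)=log(7) and f3(x)=-log(2), the definition gives F(1,x)=e^(log(3)*x)=3^x, F(2,x)=(e^log(7))^(3^x)=7^(3^x) and F(3,x)=(e^(-log(2)))^(7^(3^x))=2^(-7^(3^x)), so x*F(3,x)=x*2^(-7^(3^x)), as required. We set the f's accordingly: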

> f[1]:=log(3)*z;
> f[2]:=log(7);
> f[3]:=-log(2);

Our y is again 1:

> a:=HW(1,10,3);
a := -.1907010314 + .7599046786*I
> evalf(subs(z=a,1/F(3)));
-.5821011572 + 2.103917908*I

Let's increase the accuracy a bit:

> a:=HW(1,40,3);
a := .3517427618+.5499294499*I
> evalf(subs(z=a,1/F(3)));
.3453506993+.5666889887*I

Convergence is a little slow in the above case.

A Lambert W-like example: x*e^(x*e^x)=24, so we are looking for HW(x,x;24):

> f[1]:=z;
> f[2]:=z;

Our y is 24:

> a:=HW(24,35,2);
a := 1.068699460
> evalf(subs(z=a,z*F(2)));
24.00008895

The above Maple code is an attempt to "invert" z*F(n,z) locally. For a global complex inverse, the procedure is fairly straightforward. One finds the points where d(z*F(n,z))/dz=0, and one cuts the graph, making sure the cut passes through the images of those points. For example, for W:

d(z*e^z)/dz=0, =>

e^z+z*e^z=0, =>

e^z*(1+z)=0, =>

z=-1.

So, in the case of the W, the inverse is valid provided we have a cut that starts at (-1)*e^(-1)=-1/e. Indeed that's where the branch cut of the principal branch of W starts, extending towards the negative Real axis.
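
The analogous computation for the n=2 tower x*e^(x*e^x) (the case HW(x,x;y)) can be set up the same way; a sketch:

> dg:=diff(z*exp(z*exp(z)),z);    #mathematically this equals exp(z*exp(z))*(1+z*(1+z)*exp(z))

so any cut for HW(x,x;y) would have to pass through the images of the roots of 1+z*(1+z)*exp(z)=0; we do not pursue their location here.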

In the code above there is no such proviso; however, the code behaves correctly and gives the desired branch cut. Let's verify this. First, the graph of the W as it is coded internally by Maple:

> with(plots):    #for complexplot3d and display
> p[0]:=complexplot3d(W(w),w=-1-I..1+I):
> display(p[0]);

[Figure: Lambert W function]

Then, our code:

> p[1]:=complexplot3d('HW'(x+y*I,10,1),-1-I..1+I):
> display(p[1]);

[Figure: Lambert W function (our code)]

Does our Maple code provide a branch cut starting at -1/e? Let's see:

First, let's check the value of our code @ -1/e:

> HW(-exp(-1),20,1);
-1.000020402
> evalf(W(-exp(-1)));
-1.

Let's move a bit towards 0:

> HW(-exp(-1)+0.01,20,1);
-.7832291994
> evalf(W(-exp(-1)+0.01));
-.7832291993

Now let's move ε above and below the Real axis:

> HW(-exp(-1)+0.01+0.0001*I,20,1);
-.7832263387 + .001009590183*I
> evalf(W(-exp(-1)+0.01+0.0001*I));
-.7832263386 + .001009590182*I

> HW(-exp(-1)+0.01-0.0001*I,20,1);
-.7832263387 - .001009590183*I
> evalf(W(-exp(-1)+0.01-0.0001*I));
-.7832263386 - .001009590182*I

It sure looks as if our "function" is continuous at -1/e+ε as z approaches from above and below. No branch cut there.

Now let's move a little left of -1/e and approach from above and below:

> HW(-exp(-1)-0.01+0.0001*I,20,1);
-.9832468650 + .2314374776*I
> evalf(W(-exp(-1)-0.01+0.0001*I));
-.9809718562 + .2310842101*I
> HW(-exp(-1)-0.01-0.0001*I,20,1);
-.9832468650 - .2314374776*I
> evalf(W(-exp(-1)-0.01-0.0001*I));
-.9809718562 - .2310842101*I

Success! On the two sides we get imaginary parts of opposite sign that are not close to 0, which seems to confirm that we are approaching a cut!

With these thoughts in mind, let's see the principal branches of some of these "inverses" extended to the complex plane:

HW(z,z;w):
> p[2]:=complexplot3d('HW'(x+I*y,10,2),-1-I..1+I):
> display(p[2]);

[Figure: hyper-Lambert function 1]

HW(z,z,z;w):
> f[3]:=z;
> p[3]:=complexplot3d('HW'(x+I*y,10,3),-1-I..1+I):
> display(p[3]);

[Figure: hyper-Lambert function 2]

All three together:

> display({p[1],p[2],p[3]});

[Figure: hyper-Lambert functions 1]

Let's look at them from a different angle:

[Figure: hyper-Lambert functions 2]

Finally, all three, along with the limit function y/e^y:

> p[-1]:=complexplot3d(z*exp(-z),z=-1-I..1+I):
> display({p[-1],p[1],p[2],p[3]});

[Figure: HW 1, 2, 3 and the limit]

Notes/References

  1. It is instructive for the reader to try to solve the equation e^x+a*x+b=0, for example, using only definition (1) (answer: x=-W(e^(-b/a)/a)-b/a).
  2. See A Series Expansion for ^n(e^x).
  3. See A Slightly Deeper Analysis of Infinite Exponentials.