
### Chapter 3: Many Variables; Section 1: Differentiation; page 6

Previous topic: page 5 Stationary Points

Next topic: page 7 Jacobian

# Constraints

## Definition

In some applications of functions of many variables, the domain is restricted by constraints. A constraint is a functional relation among the variables. If the constraint can be solved for one variable in terms of the others, substituting that variable into the function reduces the number of variables by one. The function obtained this way can have stationary points that differ from those of the original function.

In order to clarify that, we will study the already known example of the function of two variables

 $z=\frac{xy}{2}$ (3.1.6.1)

It is advisable to view Fig. Constraints alongside this study, for illustration.

The constraint

 $x+y=0\text{ }⇒\text{ }y=-x$ (3.1.6.2)

changes the saddle point at the origin into a maximum, as can also be seen by substituting y into (3.1.6.1):

 $\left.\begin{array}{l}z\text{'}=-\frac{{x}^{2}}{2}\\ \frac{\text{d}z\text{'}}{\text{d}x}=-x\\ \frac{{\text{d}}^{2}z\text{'}}{\text{d}{x}^{2}}=-1\end{array}\right\}⇒\left\{\begin{array}{l}\frac{\text{d}z\text{'}}{\text{d}x}=0\\ x=0\\ \frac{{\text{d}}^{2}z\text{'}}{\text{d}{x}^{2}}<0\end{array}\right\}⇒\text{\hspace{0.17em}}\text{maximum}$ (3.1.6.3)

Similarly, the constraint

 $x-y=0\text{ }⇒\text{ }y=x$ (3.1.6.4)

brings a minimum to the same point.

A constraint can also create a stationary point, where the original function does not have one. For instance, the constraint

 $x+y=1\text{ }⇒\text{ }y=1-x$ (3.1.6.5)

that does not contain the original saddle point, gives rise to a new stationary point, which is a maximum.

 $\left.\begin{array}{l}z\text{'}=\frac{x\left(1-x\right)}{2}\\ \frac{\text{d}z\text{'}}{\text{d}x}=\frac{-2x+1}{2}\\ \frac{{\text{d}}^{2}z\text{'}}{\text{d}{x}^{2}}=-1\end{array}\right\}⇒\left\{\begin{array}{l}\frac{\text{d}z\text{'}}{\text{d}x}=0\\ x=\frac{1}{2}\\ \frac{{\text{d}}^{2}z\text{'}}{\text{d}{x}^{2}}<0\end{array}\right\}⇒\text{\hspace{0.17em}}\text{maximum}$ (3.1.6.6)
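The three linear constraints above can also be checked numerically. The following is a minimal sketch in Python that uses central finite differences in place of the analytic derivatives. The helper names (`constrained`, `d1`, `d2`) and the step sizes are illustrative choices, not part of the text.

```python
def z(x, y):
    """The function of (3.1.6.1)."""
    return x * y / 2


def constrained(y_of_x):
    """Return z' as a function of x alone, after substituting y = y_of_x(x)."""
    return lambda x: z(x, y_of_x(x))


def d1(g, x, h=1e-5):
    """Central-difference estimate of the first derivative."""
    return (g(x + h) - g(x - h)) / (2 * h)


def d2(g, x, h=1e-4):
    """Central-difference estimate of the second derivative."""
    return (g(x + h) - 2 * g(x) + g(x - h)) / h**2


# (constraint, substitution y(x), stationary x taken from the text)
cases = [
    ("x + y = 0", lambda x: -x, 0.0),
    ("x - y = 0", lambda x: x, 0.0),
    ("x + y = 1", lambda x: 1 - x, 0.5),
]

for name, y_of_x, x_s in cases:
    g = constrained(y_of_x)
    # The sign of the second derivative decides the class of the point.
    kind = "maximum" if d2(g, x_s) < 0 else "minimum"
    print(f"{name}: dz'/dx = {d1(g, x_s):.1e}, d2z'/dx2 = {d2(g, x_s):.3f} -> {kind}")
```

In each case the first derivative should come out close to zero at the quoted point, and the sign of the second derivative should reproduce the classification given in the text.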

A constraint does not have to be linear, as in the examples used so far; it can have any shape. In addition, there is no necessity to use Cartesian coordinates.

In the following example, polar coordinates are used to define the constraint as a circle of unit radius, centred at (x, y)=(0,1). According to (1.2.5.39), this circle is written as

 $r=2\mathrm{sin}\phi \text{ }\text{for}\text{ }0\le \phi \le \pi$ (3.1.6.7)

and the function becomes

 $z=\frac{xy}{2}=\frac{{r}^{2}\mathrm{cos}\phi \mathrm{sin}\phi }{2}=\frac{{r}^{2}\mathrm{sin}\left(2\phi \right)}{4}$ (3.1.6.8)

After substituting r from (3.1.6.7) into (3.1.6.8), one obtains the constrained function z' as a function of $\phi$, together with its derivative:

 $\left.\begin{array}{l}z\text{'}=\mathrm{sin}\left(2\phi \right){\mathrm{sin}}^{2}\phi \\ \left\{\begin{array}{l}\frac{\text{d}z\text{'}}{\text{d}\phi }=2\left[\mathrm{cos}\left(2\phi \right){\mathrm{sin}}^{2}\phi +\mathrm{sin}\left(2\phi \right)\mathrm{sin}\phi \mathrm{cos}\phi \right]=\\ =2\mathrm{sin}\phi \left[\mathrm{cos}\left(2\phi \right)\mathrm{sin}\phi +\mathrm{sin}\left(2\phi \right)\mathrm{cos}\phi \right]=\\ =2\mathrm{sin}\phi \mathrm{sin}\left(3\phi \right)\end{array}\right.\end{array}\right\}$ (3.1.6.9)

According to (3.1.6.9), the stationary points are at

 $\phi =0,\text{\hspace{0.28em}}\frac{\pi }{3},\text{\hspace{0.28em}}\frac{2\pi }{3},\text{\hspace{0.28em}}\left(\pi \right)$ (3.1.6.10)

There are three distinct stationary points, because $\phi$ = π describes the same point (the origin, r = 0) as $\phi$ = 0. The second derivative gives

 $\left.\begin{array}{l}\frac{{\text{d}}^{\text{2}}z\text{'}}{\text{d}{\phi }^{2}}=2\left[\mathrm{cos}\phi \mathrm{sin}\left(3\phi \right)+3\mathrm{sin}\phi \mathrm{cos}\left(3\phi \right)\right]=\\ =\left\{\begin{array}{l}0\text{ }\text{for}\text{ }\phi =0\text{ }⇒\text{ }\text{unspecified}\\ -6\mathrm{sin}\left(\frac{\pi }{3}\right)<0\text{ }\text{for}\text{\hspace{0.17em}}\phi =\frac{\pi }{3}\text{ }⇒\text{\hspace{0.17em}}\text{maximum}\\ 6\mathrm{sin}\left(\frac{2\pi }{3}\right)>0\text{ }\text{for}\text{\hspace{0.17em}}\phi =\frac{2\pi }{3}\text{ }⇒\text{\hspace{0.17em}}\text{minimum}\end{array}\right.\end{array}\right\}$ (3.1.6.11)

In order to classify the case of $\phi$ = 0, one needs further differentiation.

 $\left.\begin{array}{l}\frac{{\text{d}}^{\text{3}}z\text{'}}{\text{d}{\phi }^{3}}=2\left[-10\mathrm{sin}\phi \mathrm{sin}\left(3\phi \right)+6\mathrm{cos}\phi \mathrm{cos}\left(3\phi \right)\right]=\\ =12\ne 0\text{ }\text{for}\text{ }\phi =0\text{\hspace{0.17em}}\left(\pi \right)\text{ }⇒\text{\hspace{0.17em}}\text{inflection}\text{\hspace{0.17em}}\text{point}\end{array}\right\}$ (3.1.6.12)

In summary, if one expresses z as function of the polar coordinate $\phi$ along the constraint (3.1.6.7), then there is a maximum at $\phi$ = π/3 , a minimum at $\phi$ = 2π/3 , and an inflection point at $\phi$ = 0 (or π).
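The classification along the circle can be cross-checked numerically. The sketch below evaluates z' from (3.1.6.9) and compares each stationary value with its neighbours. The function names and the step size are assumptions made for illustration, and the neighbour comparison is only meaningful at points where the first derivative vanishes.

```python
import math


def z_on_circle(phi):
    """z' of (3.1.6.9): z = r^2 sin(2 phi) / 4 with r = 2 sin(phi)."""
    r = 2 * math.sin(phi)
    return r**2 * math.sin(2 * phi) / 4


def dz_dphi(phi):
    """Closed form of the first derivative from (3.1.6.9)."""
    return 2 * math.sin(phi) * math.sin(3 * phi)


def classify(phi, step=1e-3):
    """Compare z' at phi with its values at phi - step and phi + step.

    For phi = 0 the left neighbour lies outside 0 <= phi <= pi; the formula
    is still evaluated there, which corresponds to continuing along the circle.
    """
    left, mid, right = (z_on_circle(phi + s) for s in (-step, 0.0, step))
    if mid > left and mid > right:
        return "maximum"
    if mid < left and mid < right:
        return "minimum"
    return "inflection"


for phi in (0.0, math.pi / 3, 2 * math.pi / 3):
    print(f"phi = {phi:.4f}: dz'/dphi = {dz_dphi(phi):.1e}, {classify(phi)}")
```

The three labels should match the summary above: an inflection point at $\phi$ = 0, a maximum at $\phi$ = π/3 and a minimum at $\phi$ = 2π/3.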

A function of two variables can be subject to one constraint. Two different constraints yield two equations with two unknowns, fixing the variables. A function of n variables can be subject to at most n−1 different constraints.

The next example, with three variables, consists of finding the shortest distance between a point in three-dimensional space ${\left(x,\text{ }y,\text{ }z\right)}_{0}=\left({x}_{0},\text{ }{y}_{0},\text{ }{z}_{0}\right)$ and a plane containing the origin:

 $Ax+By+Cz=0$ (3.1.6.13)

For the solution we'll use the distance squared

 $f\left(x,y,z\right)={\left(x-{x}_{0}\right)}^{2}+{\left(y-{y}_{0}\right)}^{2}+{\left(z-{z}_{0}\right)}^{2}$ (3.1.6.14)

and (3.1.6.13) as the constraint on the variables $\left(x,\text{ }y,\text{ }z\right)$ .

Substituting z from (3.1.6.13) into (3.1.6.14) turns it into a function of two variables:

 $f\text{'}\left(x,y\right)={\left(x-{x}_{0}\right)}^{2}+{\left(y-{y}_{0}\right)}^{2}+{\left(\frac{Ax+By}{C}+{z}_{0}\right)}^{2}$ (3.1.6.15)

In order to obtain (3.1.6.15), it was assumed that C≠0. In any case, at least one of the coefficients of (3.1.6.13) must not vanish, and if C=0 we could substitute a different variable without modifying the final result.

The derivatives are

 $\left.\begin{array}{l}\left\{\begin{array}{l}\frac{\partial f\text{'}}{\partial x}=2\left(x-{x}_{0}\right)+2\left(\frac{Ax+By}{C}+{z}_{0}\right)\frac{A}{C}\\ \frac{\partial f\text{'}}{\partial y}=2\left(y-{y}_{0}\right)+2\left(\frac{Ax+By}{C}+{z}_{0}\right)\frac{B}{C}\end{array}\right.\\ \left\{\begin{array}{l}\frac{{\partial }^{2}f\text{'}}{\partial {x}^{2}}=2\left(1+\frac{{A}^{2}}{{C}^{2}}\right)>0\\ \frac{{\partial }^{2}f\text{'}}{\partial {y}^{2}}=2\left(1+\frac{{B}^{2}}{{C}^{2}}\right)>0\\ \frac{{\partial }^{2}f\text{'}}{\partial x\partial y}=2\frac{AB}{{C}^{2}}\end{array}\right.\end{array}\right\}$ (3.1.6.16)

which point to a minimum, since (3.1.6.16) also yields

 $D=\frac{{\partial }^{2}f\text{'}}{\partial {x}^{2}}\frac{{\partial }^{2}f\text{'}}{\partial {y}^{2}}-{\left(\frac{{\partial }^{2}f\text{'}}{\partial x\partial y}\right)}^{2}=4\frac{{A}^{2}+{B}^{2}+{C}^{2}}{{C}^{2}}>0$ (3.1.6.17)

The application of

 $\left(1+\frac{{B}^{2}}{{C}^{2}}\right)\frac{\partial f\text{'}}{\partial x}-\frac{AB}{{C}^{2}}\frac{\partial f\text{'}}{\partial y}=0$ (3.1.6.18)

eliminates the variable y and yields

 $\left.\begin{array}{l}\left(x-{x}_{0}\right)\left({A}^{2}+{B}^{2}+{C}^{2}\right)=-A\left(A{x}_{0}+B{y}_{0}+C{z}_{0}\right)\\ \text{and}\text{\hspace{0.28em}}\text{by}\text{\hspace{0.28em}}\text{considerations}\text{\hspace{0.28em}}\text{of}\text{\hspace{0.28em}}\text{symmetry:}\\ \left(y-{y}_{0}\right)\left({A}^{2}+{B}^{2}+{C}^{2}\right)=-B\left(A{x}_{0}+B{y}_{0}+C{z}_{0}\right)\\ \left(z-{z}_{0}\right)\left({A}^{2}+{B}^{2}+{C}^{2}\right)=-C\left(A{x}_{0}+B{y}_{0}+C{z}_{0}\right)\end{array}\right\}$ (3.1.6.19)

Yielding finally

 ${\left(x-{x}_{0}\right)}^{2}+{\left(y-{y}_{0}\right)}^{2}+{\left(z-{z}_{0}\right)}^{2}=\frac{{\left(A{x}_{0}+B{y}_{0}+C{z}_{0}\right)}^{2}}{{A}^{2}+{B}^{2}+{C}^{2}}$ (3.1.6.20)
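The substitution route can be reproduced numerically. Setting the first derivatives of (3.1.6.16) to zero gives a linear system in x and y. The sketch below solves it by Cramer's rule and compares the resulting distance squared with (3.1.6.20). The function names and the sample values of A, B, C and the point are arbitrary illustrations, and C≠0 is assumed, as in (3.1.6.15).

```python
def closest_point_by_substitution(A, B, C, p0):
    """Solve df'/dx = 0 and df'/dy = 0 from (3.1.6.16), then recover z.

    Assumes C != 0, as in (3.1.6.15).
    """
    x0, y0, z0 = p0
    # Setting the first derivatives to zero gives the linear system
    #   a11*x + a12*y = b1
    #   a12*x + a22*y = b2
    a11 = 1 + A**2 / C**2
    a12 = A * B / C**2
    a22 = 1 + B**2 / C**2
    b1 = x0 - A * z0 / C
    b2 = y0 - B * z0 / C
    det = a11 * a22 - a12**2  # equals D / 4 of (3.1.6.17), always positive
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    z = -(A * x + B * y) / C  # constraint (3.1.6.13)
    return x, y, z


def distance_squared(p, p0):
    """The function f of (3.1.6.14)."""
    return sum((a - b) ** 2 for a, b in zip(p, p0))


# Sample values, chosen only for illustration.
A, B, C = 1.0, 2.0, 2.0
p0 = (3.0, 4.0, 5.0)

p = closest_point_by_substitution(A, B, C, p0)
lhs = distance_squared(p, p0)
rhs = (A * p0[0] + B * p0[1] + C * p0[2]) ** 2 / (A**2 + B**2 + C**2)  # (3.1.6.20)
print(p, lhs, rhs)
```

For these sample values both sides of (3.1.6.20) should agree up to rounding, and the computed point should satisfy the constraint (3.1.6.13).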

## Lagrange multipliers

Solving a constraint analytically for one variable in terms of the others is sometimes difficult or even impossible. There is, however, a method that uses the constraints as they are given, in implicit form. This method, called Lagrange multipliers, allows us to find the stationary points, but not their classes. It is applied extensively in physics to problems of minima and maxima. In case of doubt, one applies additional methods for finding the class.

We are going to study this method, and apply it to a few examples. We'll start with the case of a function with two variables and one constraint.

 $\left.\begin{array}{l}f=f\left(x,\text{ }y\right)\text{ }\text{is}\text{\hspace{0.28em}}\text{the}\text{\hspace{0.28em}}\text{function}\\ S\left(x,\text{ }y\right)=0\text{ }\text{is}\text{\hspace{0.28em}}\text{the}\text{\hspace{0.28em}}\text{constraint}\end{array}\right\}$ (3.1.6.21)

The differentials of (3.1.6.21) are

 $\left.\begin{array}{l}\text{d}f=\frac{\partial f}{\partial x}\text{d}x+\frac{\partial f}{\partial y}\text{d}y=0\\ \text{d}S=\frac{\partial S}{\partial x}\text{d}x+\frac{\partial S}{\partial y}\text{d}y=0\end{array}\right\}$ (3.1.6.22)

In (3.1.6.22) the top expression equals zero at a stationary point. However, the differentials dx and dy are not independent; they are related by the constraint. Therefore df=0 does not require $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ to vanish separately. The bottom expression equals zero because S keeps a constant value along the constraint. The relation between dx and dy is the same in both expressions. We can rewrite (3.1.6.22) as

 $\left.\begin{array}{l}\frac{\partial f}{\partial x}\text{d}x=-\frac{\partial f}{\partial y}\text{d}y\\ \frac{\partial S}{\partial x}\text{d}x=-\frac{\partial S}{\partial y}\text{d}y\end{array}\right\}$ (3.1.6.23)

and, after dividing the two relations side by side, obtain

 $\frac{\frac{\partial f}{\partial x}}{\frac{\partial S}{\partial x}}=\frac{\frac{\partial f}{\partial y}}{\frac{\partial S}{\partial y}}=\lambda$ (3.1.6.24)

where λ is a proportionality factor that, generally speaking, is an unknown function of the variables. This yields two equations

 $\left.\begin{array}{l}\frac{\partial f}{\partial x}-\lambda \frac{\partial S}{\partial x}=0\\ \frac{\partial f}{\partial y}-\lambda \frac{\partial S}{\partial y}=0\end{array}\right\}$ (3.1.6.25)

with three unknowns. By adding the constraint

 $S\left(x,y\right)=0$ (3.1.6.26)

there are three equations with three unknowns. Their solution fixes the values of x, y and λ, for the stationary points. The value of λ, called the Lagrange multiplier, is a bonus but does not contain any important information.
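As a quick check of (3.1.6.25) and (3.1.6.26), one may apply them to the earlier function $z=\frac{xy}{2}$ with the constraint (3.1.6.5), written as $S\left(x,y\right)=x+y-1=0$. This is a worked sketch added for illustration, not part of the derivation:

 $\left.\begin{array}{l}\frac{y}{2}-\lambda =0\\ \frac{x}{2}-\lambda =0\\ x+y-1=0\end{array}\right\}⇒\text{ }x=y=\frac{1}{2},\text{ }\lambda =\frac{1}{4}$

The stationary point $x=\frac{1}{2}$ agrees with the one found by substitution in (3.1.6.6).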

As an example we'll use the function z (3.1.6.8) with the constraint S from (3.1.6.7), both expressed in polar coordinates:

 $\left.\begin{array}{l}z\left(r,\text{ }\phi \right)=\frac{{r}^{2}\mathrm{sin}\left(2\phi \right)}{4}\\ S\left(r,\text{ }\phi \right)=r-2\mathrm{sin}\phi =0\end{array}\right\}$ (3.1.6.27)

Here the multiplier is written with the opposite sign to that of (3.1.6.25); since λ is an unknown, this choice does not affect the stationary points. The three equations to be solved simultaneously are

 $\left.\begin{array}{l}\frac{\partial z}{\partial r}+\lambda \frac{\partial S}{\partial r}=\frac{r\mathrm{sin}\left(2\phi \right)}{2}+\lambda =0\\ \frac{\partial z}{\partial \phi }+\lambda \frac{\partial S}{\partial \phi }=\frac{{r}^{2}\mathrm{cos}\left(2\phi \right)}{2}-2\lambda \mathrm{cos}\phi =0\\ r=2\mathrm{sin}\phi \end{array}\right\}$ (3.1.6.28)

The Lagrange multiplier from the top relation of (3.1.6.28) can be substituted in the next one, yielding two equations with two unknowns

 $\left.\begin{array}{l}r\left[r\mathrm{cos}\left(2\phi \right)+2\mathrm{sin}\left(2\phi \right)\mathrm{cos}\phi \right]=0\\ r=2\mathrm{sin}\phi \end{array}\right\}$ (3.1.6.29)

with one immediate solution:

 $r=0\text{ }\text{and}\text{ }\phi =0\text{\hspace{0.17em}}\left(\pi \right)$ (3.1.6.30)

For r≠0 , the substitution of r from the second relation of (3.1.6.29) into the first, yields

 $\mathrm{sin}\phi \mathrm{cos}\left(2\phi \right)+\mathrm{sin}\left(2\phi \right)\mathrm{cos}\phi =\mathrm{sin}\left(3\phi \right)=0$ (3.1.6.31)

resulting in the same stationary points as in (3.1.6.10). The classes of the stationary points can be found by inspecting the level lines of the function, shown in Fig. Constraints.
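The equations (3.1.6.28) can be verified numerically at the stationary points. In the sketch below λ is taken from the top equation, so the interesting quantity is the residual of the second equation, which should vanish only at a stationary point. The function name `lagrange_residuals` is a hypothetical helper introduced for illustration.

```python
import math


def lagrange_residuals(phi):
    """Residuals of the first two equations of (3.1.6.28) on r = 2 sin(phi).

    lambda is taken from the top equation, so the first residual is zero by
    construction; the second one vanishes only at a stationary point.
    """
    r = 2 * math.sin(phi)             # third equation of (3.1.6.28)
    lam = -r * math.sin(2 * phi) / 2  # solved from the top equation
    top = r * math.sin(2 * phi) / 2 + lam
    second = r**2 * math.cos(2 * phi) / 2 - 2 * lam * math.cos(phi)
    return top, second, lam


# Stationary angles from (3.1.6.10), plus phi = pi/2 as a non-stationary contrast.
for phi in (0.0, math.pi / 3, 2 * math.pi / 3, math.pi / 2):
    top, second, lam = lagrange_residuals(phi)
    print(f"phi = {phi:.4f}: lambda = {lam:+.4f}, second residual = {second:+.2e}")
```

At $\phi$ = π/2 the second residual is far from zero, which confirms that this point of the circle is not stationary.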

It can be shown that the method of Lagrange multipliers can be applied to obtain the stationary points of a function of any number n of variables, subject to any number k of constraints, provided that k<n. For this purpose one has to define k Lagrange multipliers, one for each constraint.

 $\left.\begin{array}{l}f\left({x}_{1},\text{ }{x}_{2},.....,\text{ }{x}_{n}\right)\text{ }\text{function}\text{\hspace{0.28em}}\text{of}\text{\hspace{0.28em}}n\text{\hspace{0.28em}}\text{variables}\\ \left\{\begin{array}{l}{S}_{1}\left({x}_{1},\text{ }{x}_{2},.....,\text{ }{x}_{n}\right)=0\\ {S}_{2}\left({x}_{1},\text{ }{x}_{2},.....,\text{ }{x}_{n}\right)=0\\ ................................\\ {S}_{k}\left({x}_{1},\text{ }{x}_{2},.....,\text{ }{x}_{n}\right)=0\end{array}\right\}\text{\hspace{0.28em}}k\text{\hspace{0.28em}}\text{constraints}\text{\hspace{0.28em}}\left(k<n\right)\end{array}\right\}$ (3.1.6.32)

For calculating the stationary points of the function, one has to use the k constraints of (3.1.6.32), in addition to the following n equations:

 $\left.\begin{array}{l}\frac{\partial f}{\partial {x}_{1}}+\sum _{i=1}^{k}\left({\lambda }_{i}\frac{\partial {S}_{i}}{\partial {x}_{1}}\right)=0\\ \frac{\partial f}{\partial {x}_{2}}+\sum _{i=1}^{k}\left({\lambda }_{i}\frac{\partial {S}_{i}}{\partial {x}_{2}}\right)=0\\ ................................\\ \frac{\partial f}{\partial {x}_{n}}+\sum _{i=1}^{k}\left({\lambda }_{i}\frac{\partial {S}_{i}}{\partial {x}_{n}}\right)=0\end{array}\right\}$ (3.1.6.33)

With Lagrange multipliers, the number of equations to be solved, n+k, is larger than the n−k equations that remain after eliminating k variables by substitution. In addition, the solution yields k multipliers, which do not contain important information. On the other hand, the equations obtained by the method of Lagrange multipliers are, on many occasions, more convenient to solve. Furthermore, the method deals easily with constraints given implicitly.

The example of the function f (3.1.6.14) with one constraint (3.1.6.13) will be reused, this time with the method of the Lagrange multipliers:

 $\left.\begin{array}{l}f\left(x,y,z\right)={\left(x-{x}_{0}\right)}^{2}+{\left(y-{y}_{0}\right)}^{2}+{\left(z-{z}_{0}\right)}^{2}\\ S\left(x,y,z\right)=Ax+By+Cz=0\end{array}\right\}$ (3.1.6.34)

The four equations to be solved simultaneously are

 $\left.\begin{array}{l}\frac{\partial f}{\partial x}+\lambda \frac{\partial S}{\partial x}=2\left(x-{x}_{0}\right)+\lambda A=0\\ \frac{\partial f}{\partial y}+\lambda \frac{\partial S}{\partial y}=2\left(y-{y}_{0}\right)+\lambda B=0\\ \frac{\partial f}{\partial z}+\lambda \frac{\partial S}{\partial z}=2\left(z-{z}_{0}\right)+\lambda C=0\\ Ax+By+Cz=0\end{array}\right\}$ (3.1.6.35)

Eliminating the unnecessary multiplier, for non-vanishing A, B and C, yields:

 $\frac{x-{x}_{0}}{A}=\frac{y-{y}_{0}}{B}=\frac{z-{z}_{0}}{C}=-\frac{\lambda }{2}$ (3.1.6.36)

which leaves us with three linear equations

 $\left.\begin{array}{l}\frac{x-{x}_{0}}{A}=\frac{z-{z}_{0}}{C}\\ \frac{y-{y}_{0}}{B}=\frac{z-{z}_{0}}{C}\\ \left\{\begin{array}{l}A\left(x-{x}_{0}\right)+B\left(y-{y}_{0}\right)+C\left(z-{z}_{0}\right)=\\ =-\left(A{x}_{0}+B{y}_{0}+C{z}_{0}\right)\end{array}\right.\end{array}\right\}$ (3.1.6.37)

Substituting $\left(x-{x}_{0}\right)$ from the top equation of (3.1.6.37) and $\left(y-{y}_{0}\right)$ from the second equation into the last one gives the value of $\left(z-{z}_{0}\right)$ :

 $z-{z}_{0}=-\frac{C\left(A{x}_{0}+B{y}_{0}+C{z}_{0}\right)}{{A}^{2}+{B}^{2}+{C}^{2}}$ (3.1.6.38)

and, by symmetry,

 $\left.\begin{array}{l}x-{x}_{0}=-\frac{A\left(A{x}_{0}+B{y}_{0}+C{z}_{0}\right)}{{A}^{2}+{B}^{2}+{C}^{2}}\\ y-{y}_{0}=-\frac{B\left(A{x}_{0}+B{y}_{0}+C{z}_{0}\right)}{{A}^{2}+{B}^{2}+{C}^{2}}\end{array}\right\}$ (3.1.6.39)

Summing the squares of (3.1.6.38) and (3.1.6.39) yields the value of f at the stationary point

 ${\left(x-{x}_{0}\right)}^{2}+{\left(y-{y}_{0}\right)}^{2}+{\left(z-{z}_{0}\right)}^{2}=\frac{{\left(A{x}_{0}+B{y}_{0}+C{z}_{0}\right)}^{2}}{{A}^{2}+{B}^{2}+{C}^{2}}$ (3.1.6.40)

in full agreement with (3.1.6.20).
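The Lagrange route translates into a short computation. Inserting x = x0 − λA/2, and likewise for y and z, from (3.1.6.35) into the constraint gives λ = 2(Ax0+By0+Cz0)/(A²+B²+C²), which follows from (3.1.6.36) and (3.1.6.38). The sketch below uses this; the function name and the sample values are assumptions made for illustration.

```python
def closest_point_lagrange(A, B, C, p0):
    """Stationary point of (3.1.6.34), obtained from the equations (3.1.6.35)."""
    x0, y0, z0 = p0
    n2 = A**2 + B**2 + C**2
    if n2 == 0:
        raise ValueError("A, B and C must not all vanish")
    # Inserting x = x0 - lam*A/2 (and likewise y, z) into the constraint gives lam.
    lam = 2 * (A * x0 + B * y0 + C * z0) / n2
    point = (x0 - lam * A / 2, y0 - lam * B / 2, z0 - lam * C / 2)
    return point, lam


# Sample values, chosen only for illustration.
A, B, C = 1.0, 2.0, 2.0
p0 = (3.0, 4.0, 5.0)

point, lam = closest_point_lagrange(A, B, C, p0)
f_min = sum((a - b) ** 2 for a, b in zip(point, p0))
print(point, lam, f_min)
```

Written this way, no division by A, B or C is needed, so the same function also handles planes with a vanishing coefficient, for example C=0.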

## Exercises

Exercise 1. Use Lagrange multipliers to find the shortest distance of the plane

$\left.\begin{array}{l}Ax+By+Cz+G=0\\ \text{with}\text{ }G\ne 0\end{array}\right\}$

from the origin of coordinates! (Suggestion: first calculate the Lagrange multiplier.)

Exercise 2. The volume of a right circular cylinder is V0. Find the radius r of the base and the height h that minimize the surface area A of the cylinder:

1. By eliminating one of the variables, with verification of the minimum!
2. By using Lagrange multipliers! Compare the results!
3. Without minimization, what happens to the surface area in the limit r→0?

Exercise 3. A cuboid (rectangular box) is inscribed in the ellipsoid

$\frac{{x}^{2}}{{a}^{2}}+\frac{{y}^{2}}{{b}^{2}}+\frac{{z}^{2}}{{c}^{2}}=1$

so that all its vertices are on the ellipsoid. Use Lagrange multipliers in order to find its largest volume!

Exercise 4. On the (x, y) plane there are the following geometrical forms: a straight line and a parabola.

$\begin{array}{l}y=x-2\\ y={\left(x+1\right)}^{2}+2\end{array}$

1. Make a sketch!
2. Prove that they do not intersect!
3. Use Lagrange multipliers to set up the equations for the minimal distance between them! (Suggestion: four variables.)
4. Calculate the minimal distance between them!
