home | index | units | counting | geometry | algebra | trigonometry & functions | calculus
analysis | sets & logic | number theory | recreational | misc | nomenclature & history | physics

Final Answers
© 2000-2009 Gérard P. Michon, Ph.D.

 Joseph-Louis Lagrange 
 (1736-1813)

Optimization


 border
 border  border
 border
 border
 border
 border

Related articles:

Related Links (Outside this Site)

An Introduction to Lagrange Multipliers  by  Steuard Jensen.
Calculus of Variations and Functional Differentiation  by  Joceline Lega.
Saddlepoint Methods in Contour Integration  by  Michael Fowler  (Physics 752)

Wikipedia :   Plateau's Problem

 
border
border

Optimization

Vocabulary :  For an objective function of several variables, a "stationary point" is a set of values of those variables where all partial derivatives of the function vanish.  As illustrated below, a local extremum must be a stationary point but the converse need not hold.
 
Because of the inclusive meaning assigned by default to the simplest mathematical terms  (which is the exact opposite of the  exclusive  meaning often attributed to simple words in everyday language)  most mathematicians consider "stationary point" and "saddlepoint" to be synonymous:  At a saddlepoint, the relevant quantity  may  rise in some directions and fall in others but it's not required to do so...  There might very well be an extremum there!
 
Some authors reserve the term "saddlepoint" to  nonextremal  stationary points.  We prefer to call those  proper  saddlepoints  (thus following  normal  mathematical nomenclature).


(2007-09-23)   Optimizing a smooth function  f  of a single variable.
'Tis little more than finding where the function's derivative vanishes.

   Maximum within range, 
 minimum at extreme range.
For a point  x  in the  interior  of the domain of  f  and a small enough value of  e,  both points  x-e  and   x+e  are in the allowed domain and yield values of  f  which are on both sides of  f (x)  if we just assume that  f '  is nonzero.  Therefore, there can be an extremum at point  x  only when  f ' (x)  vanishes.

On the other hand, there's no such requirement for a point  x  at the border of the allowed domain, because small displacements of  only one sign  are allowed.

Away from the border,  a  saddlepoint  x  (let's use that general term to indicate that  f ' (x)  vanishes)  will be the location of a  minimum  (resp. a  maximum )  when  f '' (x)  is positive  (resp. negative).  If it's zero, further analysis is needed to determine whether  x  is the location of an extremum or not.

That discussion involving  second-order  (or higher)  derivatives is typical of what happens with several variables when it comes to decide whether a saddlepoint is an actual extremum, as illustrated in the next article.


(E. M. of Wisconsin Rapids, WI. 2000-11-21Saddlepoints & Extrema
Determine the points where the [objective] function  z = 3x3+3y3-9xy  is maximized or minimized.   [ Check for  second-order  condition. ]

A necessary (but not sufficient) condition for a smooth function of two variables to be extremal  ( "minimized" or "maximized" )  on an  open  domain is to be at a  saddlepoint  where both partial derivatives vanish.  In this case that means:

9x2-9y = 0     and     9y2-9x = 0

Thus, an extremum of  z  can  only  occur when  (x,y)  is either  (0,0)  or  (1,1).

To see whether a local extremum  actually  occurs, we must examine the second-order behavior of the function about each such  saddlepoint  (in the rare case where the second-order variation vanishes, further analysis would be required).

Well, if the second-order partial derivatives are  L, M and N,  then the second-order variation at point  (x+dx, y+dy)  is the quantity

½ [  L(dx)2 + 2M(dxdy) + N (dy)2 ]

We recognize the bracket as a quadratic expression whose sign remains the same  (regardless of the ratio dy/dx)  if and only if   its  (reduced)  discriminant (M2-LN)  is negative.  If it's positive, the point in question is not an extremum.

In our example, we have  L = 18x,  M = -9  and  N = 18y.  Therefore:

M2 - LN   =   81 (1 - 4xy)

At  (0,0)  this quantity is positive  (+81).  Thus, there's  no extremum  there.  On the other hand, at the point  (1,1)  this quantity is negative  (-243)  so the point  (1,1)  does correspond to  the only local extremum  of  z.

Is this a maximum or a minimum?  Well, just look at the sign of  L  (which is always the same as the sign of  N  for an extremum).  If it's positive, surrounding points yield higher values and, therefore, you've got a minimum.  If it's negative you've got a maximum.  Here, L = 18,  so a  minimum  is achieved at  (1,1).

To summarize:  z  has only  one  local extremum; it's a minimum of -3, reached at x=1 and y=1.  Does this mean we have an  absolute  minimum?  Certainly not!  (When  x  and  y  are negative, a large enough magnitude of either will make  z  fall below  any  preset threshold.)


(2007-09-22)   Unconstrained (or "absolute") saddlepoints
Saddlepoints of a function of several  independent  variables.

We're seeking saddlepoints (stationary points) of a scalar  objective function  M  of  n  variables  x1 , ... , xn  which we view as components of a vector  x.

M ( x1 , ... , xn )   =   M ( x )

If those  n  variables are  independent,  then a saddlepoint is obtained only at a point where the differential form  dM  vanishes.  This is to say:

0   =   dM   =   grad M  . dx

As this relation must hold for  any  infinitesimal vectorial displacement  dx, such  absolute  saddlepoints of  M  are thus characterized by:

grad M   =   0

That vectorial relation is shorthand for  n  separate  scalar  equations:

M     =   0
Vinculum
 xi


(2007-09-22)   Lagrange Multipliers
An optimization involves one  Lagrange multiplier  per constraint.

Instead of a set of independent variables  (as discussed above) we may have to deal with several variables that are tied to each other by some  constraints.

For example, the variables may be subject to a single constraint :

E ( x1 , ... , xn )   =   constant

While an unconstrained saddlepoint of  M  was obtained when  dM = 0.  A constrained saddlepoint is obtained when  dM  is proportional to  dE.

More generally, when  k  functions  E1 ... Ek  are given to be constant, a constrained saddlepoint of  M  is achieved when the differential form  dM  is a linear combination of the differentials of  E1 ... Ek.

dM   =   l1 dE1  +  l2 dE1  +   ...   +  lk dEk

In other words, there are  k  constants  li  (each is known as the  Lagrange multiplier  associated with the corresponding constraint)  such that the following function has an  unconstrained  (or absolute)  saddlepoint.

M  -  ( l1 E1  +  l2 E1  +   ...   +  lk Ek )

 Come back later, we're
 still working on this one...


(2007-09-23)   The  fattest  cone of given base...
What apex minimizes the lateral area of a cone of given base and volume?

The volume of a cone is one third of the product of its base area by its height.  So, by imposing the cone's volume, we're actually imposing the  height  of the cone and looking for an optimal apex somewhere in a fixed plane parallel to the base...

 Come back later, we're
 still working on this one...


(2007-09-23)   Calculus of Variations  (cf. Lagrangian mechanics)
Euler-Lagrange equations hold whenever a  path integral  is stationary.

Let  L  be a  smooth enough  function of  2n+1  variables:

L   =   L ( q 1 ,  q 2   ...  q n ,  v1 ,  v2   ...  vn ,  t )

Assume that the first  n  variables  (q)  are actually functions of the last one  (t)  and also that the subsequent variables  (v)  are their respective derivatives:

v i (t)    =     d   q i (t)
Vinculum
dt

This makes  L  a function of  t  alone and we may consider the following integral  S  (called the "action" of  L )  for given [fixed] values at both extremities of  all  the  2n  quantities  q i (a)  and  q i (b).

S    =    ò  b   L  dt
a

The fundamental problem of the  calculus of variations  is to establish the local conditions which make  S  stationary,  for  optimal  functions  q i .  Namely:

Euler-Lagrange equations
 L     =     d  (   L  )
Vinculum Vinculum Vinculum
q i dt vi

Proof :   From a set of  optimal  functions  q i  and  arbitrary  (sufficiently smooth)  functions  h i  which vanish at both extremities  a  and  b,  we may construct the following family of functions, depending on a single parameter  e :

Q i (t)   =   q i (t)  +  e h i (t)   V i (t)   =   v i (t)  +  e   d   h i (t)
Vinculum
dt

Those yield a value  S(e)  of the action which must be stationary at  e = 0  (since the functions  q i  are optimal).  Thus, the derivative of  S(e)  vanishes at  e = 0:

0    =    ò  b   å i   [  h i    L    +    d h i      L  ]  dt
Vinculum Vinculum Vinculum
a q i dt vi

Since every  h i  vanishes at both extremities  a  and  b ,  we may  integrate by parts  the second term of the square bracket to obtain:

0    =    ò  b   å i   h i   [   L    -    d  (   L  )  ]  dt
Vinculum Vinculum Vinculum
a q i dt vi

As  h i  is arbitrary, the square bracket must vanish everywhere, for every  i.    QED

Theoretical and Practical Examples :

The above is most commonly used as the basis for variational mechanics and related aspects of theoretical physics based on a  principle of least action.  However, it's also the correct answer to more practical concerns:

On 2008-10-27, Bill Swearer wrote:       [edited summary]
As a pilot, I’ve always been interested in writing a  proper  piece of flight-planning software to optimize the plane’s path with regard to time, fuel, or any combination thereof.  [...]
 
I’ve always felt the professionally-provided solutions were crude approximations that do not take into account the full range of possibilities, especially over long distance, as one crosses varying jet streams at different altitudes, etc.  [...] What is the correct mathematical approach?
Thanks a lot.
Bill Swearer

Well, just express carefully the  local  cost function  (L)  as a function of the position of the aicraft and of the related derivatives  (to a good approximation, the latter are only useful to compute horizontal speed).  Predicted meteorological conditions  (changing with time throughout space)  can be used for best planning.

The Euler-Lagrange equations then tell the pilot exactly what to do at all times.


(2009-07-05)   A Proof of Noether's Theorem   (1915)
Proving Noether's theorem for continuous symmetries.

A slight modification of the above proof yields a straight derivation of one of the greatest statements of mathematical physics.  Let's just do it...

Suppose that, for specific functions  hi ,  a symmetry exists which leaves  L  unchanged,  (to first-order variations of   e  about 0)  when  q i  is replaced by

Q i   =   q i  +  e h i

Formally, this leads to a situation similar to the previous one, since we still know that the derivative of  S(e)  vanishes at  e = 0  (albeit for very different reasons).  However, the  h  functions need not vanish at the extremities  a  and  b.  So, an extra "integrated term" appears in the following expression of the derivative of  S(e which results from our integration by parts:

0   =   å i   h i    L   |  b   +   ò  b   å i   h i   [   L    -    d  (   L  )  ]  dt
Vinculum Vinculum Vinculum Vinculum
vi a a q i dt vi

Now, as the integral still vanishes  (because the previously established Euler-Lagrange equations make each square bracket vanish)  the extra term must be zero as well.  This means that the following quantity doesn't change:

Conserved  Noether Charge
å i   h i    L
Vinculum
vi

Video:   Lagrangian & Field Theory  by  Leonard Susskind blog


(2008-11-10)   The Isoperimetric Inequality
Among planar loops of unit length, the circle encloses the largest area.

The surface area  S  enclosed by a closed planar curve of given perimeter  P  is largest in the case of a circle.  This ancient result can be expressed by the following relation, known as the  isoperimetric inequality:

S   ≤   P 2 / 4p

The three-dimensional equivalent of the isoperimetric inequality says that the closed surface of area  S  which encloses the largest volume  V  is a  sphere.

V 2   ≤   S 3 /  36 p

The above can be generalized to  n  dimensions:  No hypersurface of hyperarea  S  encloses a larger  hypervolume  V  than the hypersphere.  This makes the relations given in the  oldest Numericana article  yield:

V n-1   ≤    G ( 1 + n/2 )   [ S /  nÖp ] n


(2008-11-10)   Plateau's Problem
The surface of least area with a given border.

The  Plateau rules  (1873)  state that the solutions are smooth surfaces of constant mean curvature with only two possible types of singularities:  lines  where  3  such smooth surfaces meet at  120° angles and isolated  points  where  4  of those lines meet in a regular tetrahedral configuration.

 Come back later, we're
 still working on this one...


(2008-11-18)   Borderless embedded surfaces of minimal area
Minimal surfaces without edges or self-intersections in ordinary space.

Such surfaces are technically known as  complete embedded minimal surfaces.

Before 1982, only four examples of these were known:

  • The plane. 
  • The catenoid  (Euler, 1741).  In 1776, Meusnier remarked that the catenoid has zero  mean curvature. 
  • The helicoid  (Meusnier, 1776).  Catalan remarked that the plane and the helicoids are the only minimal surfaces which are  ruled. 
  • A fourth example was found by Riemann in 1860 which consists of infinitely many horizontal planes with slanted tunnels between adjacent pairs.

   Celso da Costa
Celso José da Costa
In 1982,  Celso J. Costa  (then a graduate student in Brazil)  stumbled upon a minimal surface topologically equivalent to a torus with three holes.  Costa suspected that his surface had no self-intersections but, at first, couldn't prove it...

That task was undertaken at the  University of Massachusetts at Amherst  by David Hoffman  (a 1973 Stanford Ph.D.).

 Costa's Surface   David Hoffman teamed up with  William H. Meeks III . and  Jim Hoffman,  (then a graduate student)  to produce a computer visualization of the stunning symmetries of the strange surface described by Costa  (which contains two straight lines meeting at a right angle).  Dividing Costa's surface into  8  congruent pieces, they proved, indeed, that it never intersects itself!

Loosely speaking, Costa's surface features two complementary pairs of "tunnels" through the equatorial plane which connect the inside of the catenoid's northern half and the outside of its southern half, or vice-versa.

   Dave Hoffman
David Hoffman

 
 
 Bill Meeks
William H. Meeks, III
Subsequently, Dave Hoffman and Bill Meeks discovered that Costa's surface is just the simplest member of a whole family of  complete minimal embedded surfaces  constructed in the same way but with more "tunnels"...

 Costa-Hoffman-Meeks Surface 
 (Image courtesy of Matthias Weber)

Applied to helicoids, the idea yields yet another family of  complete minimal embedded surfaces  where tunnels provide shortcuts between slices of space which would otherwise be connected only by circling around the helicoid's axis.

 Come back later, we're
 still working on this one...

Minimal Surfaces and their Classification  (pdf 4.44 MB)  by  William H. Meeks, III.


Denis Viala  (2008-02-28)   Connect the dots, without crossings...
Given n blue dots and n red dots in the plane (no 3 dots on the same line) there's always a set of n disjoint segments with blue and red extremities.

HINT:  This little puzzle appears among optimization problems...   Proof.

n=2, n=3, etc.


(2008-11-09)   Shortest Way to Connect Three Points
How to connect three dots with lines of least total length?

The three vertices of a triangle  ABC  can be connected using just two of its three sides.  If the angle between those two sides is  120°  or larger, this is the most economical way to do so.

On the other hand, for triangles where the largest inside angle is less than  120°, the most economical connecting network consists of three straight lines  (OA, OB, OC)  connecting the vertices to some optimal center  (O)  where the three lines  OA,  OB  and  OC  meet at  120° angles.

 Come back later, we're
 still working on this one...


(2008-11-09)   The Honeycomb Theorem
In 1999, Thomas Hales finally proved an "obvious" fact.

Nobody questioned the celebrated  Honeycomb conjecture  before 1993, when Phelan and Weaire showed its 3D counterpart  (Kelvin's conjecture)  to be false!  The issue was settled in 1999  with a proper proof by  Thomas C. Hales :  Thus, the 2D  Honeycomb conjecture  is now a theorem!  Here it is:

Honeycomb Theorem   In any partition of the plane into tiles of unit area, each tile cannot have a lesser perimeter than the regular hexagon of unit area.  (Loosely speaking, the regular  honeycomb  tiling is the most economical one.)

The first proof by  Thomas C. Hales  (1999)  was surprisingly intricate.  It was crucial to get rid of the convexity restriction which earlier authors had reluctantly imposed.  (The isoperimetric inequality implies that the boundaries between optimal shapes are either circular arcs or straight lines.  Both sides of such a boundary are convex only in the latter case.)  The reason curved boundaries are ruled out is not at all obvious;  it is  false  in the  3D case.


 William Thomson, Lord Kelvin 
 (1824-1907) (2008-11-09)  On Minimal Foams 
Kelvin's problem & counterexamples to Kelvin's conjecture.

The Kelvin problem asks for a partition of space into cells of equal volume and minimal area. It has been described by Simon Lhuilier (1750-1840) and others as  one of the most difficult problems in geometry.

 Truncated 
 Octahedron
The  truncated octahedron  can tile space without voids.
 

Lord Kelvin (1824-1907)  observed that a partition of space into  regular truncated octahedra  was quite economical.  In 1887, he conjectured that the optimal foam would be obtained by curving the faces of that tetradecahedron in accordance with Plateau's laws.  Kelvin's foam thus consists of a  single type of cell  whose vertices are  exactly the same  as a regular truncated osctahedron.  The square faces are planar figures with curved edges.  The hexagonal faces are monkey saddles  (with straight [great] diagonals).

 Basic pattern of the Weaire-Phelan foam 
 (c) 2008 by Ruggero Gabbrielli

In 1993, Denis Weaire and Robert Phelan disproved Kelvin’s conjecture by describing a totally different structure which turns out to be more economical than Kelvin's.  The  A15  Weaire-Phelan structure is one example of the so-called Frank-Kasper foams.  It consists of two distinct types of cells having either pentagonal or hexagonal faces; one is a squashed dodecahedron, the other is a tetradecahedron.

 Basic pattern of the p42a foam  
 (c) 2008 by Ruggero Gabbrielli

In a draft  (2008-10-25)  of a paper published on 2009-06-02Ruggero Gabbrielli  presents a new type of unit-cell foam (featuring some  quadrilateral  faces)  whose cost (5.303) is intermediary between the Weaire-Phelan foam (5.288) and Kelvin's foam (5.306).  Gabbrielli has dubbed this structure  "P42a".  It is based on the repeating pattern of  14  curved "polyhedra"  (of 4 different types)  pictured at right.  Those  10  tetradecahedra and  4  tridecahedra yield an  average  of  96/7 = 13.714285...  faces per cell.

The above pictures for the  A15  and  P42a  foams were obtained by  Ruggero Gabbrielli  using the  GAVROG 3dt visualization software.  Gabbrielli has also posted several related  interactive 3D visualizations  under Java.  The 1993 optimization on A15 and Gabbrielli's recent research both used Ken Brakke’s Surface Evolver (with specific code provided by John M. Sullivan, in the latter case at least).

At this writing, the  A15  foam described by Weaire and Phelan
in 1993 is still conjectured to be the answer to Kelvin's problem.
It is about 0.3% less costly than Kelvin's original foam.

The Weaire-Phelan foam was used as the basis for the design of the celebrated Water-cube  of the 2008 Olympics  (Beijing National Aquatics Center)  which is the largest structure covered with ETFE film.

The number of faces in a minimal foam  (1992)  by  Robert B. Kusner
Comparing the Weaire-Phelan foam to Kelvin's foam  (1996)  by Rob Kusner and John M. Sullivan
A new counterexample to Kelvin’s conjecture on minimal surfaces  by  Ruggero Gabbrielli
Comparing the Kelvin, Gabbrielli and Weaire-Phelan structures  (Java)

border
border
visits since Sept. 22, 2007
 (c) Copyright 2000-2009, Gerard P. Michon, Ph.D.