Fourier–Motzkin elimination

Mathematical algorithm for eliminating variables from a system of linear inequalities

Fourier–Motzkin elimination, also known as the FME method, is a mathematical algorithm for eliminating variables from a system of linear inequalities over the real numbers.

The algorithm is named after Joseph Fourier,[1] who proposed the method in 1826, and Theodore Motzkin, who re-discovered it in 1936.

Elimination

The elimination of a set of variables, say V, from a system of relations (here linear inequalities) refers to the creation of another system of the same sort, but without the variables in V, such that both systems have the same solutions over the remaining variables.

If all variables are eliminated from a system of linear inequalities, then one obtains a system of constant inequalities. It is then trivial to decide whether the resulting system is true or false. It is true if and only if the original system has solutions. As a consequence, elimination of all variables can be used to detect whether a system of inequalities has solutions or not.

Consider a system $S$ of $n$ inequalities with $r$ variables $x_1$ to $x_r$, with $x_r$ the variable to be eliminated. The linear inequalities in the system can be grouped into three classes depending on the sign (positive, negative or null) of the coefficient for $x_r$.

  • those inequalities that are of the form $x_r \geq b_i - \sum_{k=1}^{r-1} a_{ik} x_k$; denote these by $x_r \geq A_i(x_1,\dots,x_{r-1})$, for $i$ ranging from 1 to $n_A$, where $n_A$ is the number of such inequalities;
  • those inequalities that are of the form $x_r \leq b_i - \sum_{k=1}^{r-1} a_{ik} x_k$; denote these by $x_r \leq B_i(x_1,\dots,x_{r-1})$, for $i$ ranging from 1 to $n_B$, where $n_B$ is the number of such inequalities;
  • those inequalities in which $x_r$ plays no role, grouped into a single conjunction $\phi$.

The original system is thus equivalent to

$$\max(A_1(x_1,\dots,x_{r-1}),\dots,A_{n_A}(x_1,\dots,x_{r-1})) \;\leq\; x_r \;\leq\; \min(B_1(x_1,\dots,x_{r-1}),\dots,B_{n_B}(x_1,\dots,x_{r-1})) \;\wedge\; \phi.$$

Elimination consists in producing a system equivalent to $\exists x_r\; S$. Obviously, this formula is equivalent to

$$\max(A_1(x_1,\dots,x_{r-1}),\dots,A_{n_A}(x_1,\dots,x_{r-1})) \;\leq\; \min(B_1(x_1,\dots,x_{r-1}),\dots,B_{n_B}(x_1,\dots,x_{r-1})) \;\wedge\; \phi.$$

The inequality

$$\max(A_1(x_1,\dots,x_{r-1}),\dots,A_{n_A}(x_1,\dots,x_{r-1})) \;\leq\; \min(B_1(x_1,\dots,x_{r-1}),\dots,B_{n_B}(x_1,\dots,x_{r-1})))$$

is equivalent to the $n_A n_B$ inequalities $A_i(x_1,\dots,x_{r-1}) \leq B_j(x_1,\dots,x_{r-1})$, for $1 \leq i \leq n_A$ and $1 \leq j \leq n_B$.

We have therefore transformed the original system into another system where $x_r$ is eliminated. Note that the output system has $(n - n_A - n_B) + n_A n_B$ inequalities. In particular, if $n_A = n_B = n/2$, then the number of output inequalities is $n^2/4$.
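For illustration, one elimination step can be written out in a few lines of Python. The sketch below uses exact rational arithmetic; the function names (eliminate_last_variable, is_feasible) are illustrative and do not refer to any existing library. Each inequality is stored as a pair (a, b) meaning a·x ≤ b, with the last coefficient multiplying the variable to be eliminated.

from fractions import Fraction

def eliminate_last_variable(rows):
    # One Fourier–Motzkin step on a system of inequalities a·x <= b.
    # `rows` is a list of (a, b) pairs; the last entry of `a` multiplies
    # the variable x_r being eliminated.
    uppers, lowers, rest = [], [], []
    for a, b in rows:
        a = [Fraction(v) for v in a]
        b = Fraction(b)
        c = a[-1]
        if c > 0:    # dividing by c gives an upper bound B_j on x_r
            uppers.append(([v / c for v in a[:-1]], b / c))
        elif c < 0:  # dividing by c < 0 flips the sign: lower bound A_i on x_r
            lowers.append(([v / c for v in a[:-1]], b / c))
        else:        # x_r does not occur: part of the conjunction phi
            rest.append((a[:-1], b))
    out = list(rest)
    # Each pair (lower bound A_i, upper bound B_j) yields A_i <= B_j,
    # i.e. the difference of the two normalized rows.
    for al, bl in lowers:
        for au, bu in uppers:
            out.append(([u - l for u, l in zip(au, al)], bu - bl))
    return out

def is_feasible(rows):
    # Decide feasibility by eliminating every variable, then checking the
    # remaining constant inequalities 0 <= b.
    while rows and rows[0][0]:
        rows = eliminate_last_variable(rows)
    return all(b >= 0 for _, b in rows)

Running is_feasible on the example system of the next section returns True; for instance, (x, y, z) = (0, −1, 1) satisfies all four of its inequalities.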

Example

Consider the following system of inequalities:[2]: 100–102 

$${\begin{cases}2x-5y+4z\leq 10\\3x-6y+3z\leq 9\\-x+5y-2z\leq -7\\-3x+2y+6z\leq 12\end{cases}}$$

To eliminate x, we can write the inequalities in terms of x:

$${\begin{cases}x\leq {\frac {10+5y-4z}{2}}\\x\leq {\frac {9+6y-3z}{3}}\\x\geq 7+5y-2z\\x\geq {\frac {-12+2y+6z}{3}}\end{cases}}$$

We have two inequalities with "≤" and two with "≥"; the system has a solution for $x$ if and only if the right-hand side of each "≤" inequality is at least the right-hand side of each "≥" inequality. There are 2 × 2 = 4 such combinations:

$${\begin{cases}7+5y-2z\leq {\frac {10+5y-4z}{2}}\\7+5y-2z\leq {\frac {9+6y-3z}{3}}\\{\frac {-12+2y+6z}{3}}\leq {\frac {10+5y-4z}{2}}\\{\frac {-12+2y+6z}{3}}\leq {\frac {9+6y-3z}{3}}\end{cases}}$$

We now have a new system of inequalities, with one fewer variable.
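The 2 × 2 pairing above is mechanical and can be reproduced with a short self-contained sketch (hypothetical code using exact rational arithmetic; the bound labels are plain strings added only for readability). Each bound on x from the example is stored as a triple (constant, coefficient of y, coefficient of z).

from fractions import Fraction as F
from itertools import product

# Bounds on x from the example, each written as c0 + cy*y + cz*z.
uppers = {"(10+5y-4z)/2": (F(5), F(5, 2), F(-2)),
          "(9+6y-3z)/3":  (F(3), F(2),    F(-1))}
lowers = {"7+5y-2z":       (F(7),  F(5),    F(-2)),
          "(-12+2y+6z)/3": (F(-4), F(2, 3), F(2))}

# Each pair (lower bound A, upper bound B) yields A <= B, i.e. (A - B) <= 0.
for (ln, (a0, ay, az)), (un, (b0, by, bz)) in product(lowers.items(), uppers.items()):
    c0, cy, cz = a0 - b0, ay - by, az - bz
    print(f"{ln} <= {un}   <=>   {cy}*y + {cz}*z <= {-c0}")

For example, the first pairing prints the constraint (5/2)·y ≤ −2, which is the first inequality of the derived system after clearing denominators.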

Complexity

Running an elimination step over $n$ inequalities can result in at most $n^2/4$ inequalities in the output; thus naively running $d$ successive steps can result in at most $4(n/4)^{2^d}$ inequalities, a double exponential complexity. This is because the algorithm produces many unnecessary constraints (constraints that are implied by other constraints). Unnecessary constraints can be detected using linear programming. It follows from McMullen's upper bound theorem that the number of necessary constraints grows only singly exponentially.[3] A singly exponential implementation of Fourier–Motzkin elimination, together with complexity estimates, is given by Jing, Moreno-Maza and Talaashrafi.[4]
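The double-exponential bound follows by induction: if $n_k$ denotes the number of inequalities after $k$ elimination steps, then $n_0 = n$ and $n_{k+1} \leq n_k^2/4$; assuming $n_k \leq 4(n/4)^{2^k}$ gives $n_{k+1} \leq \tfrac{1}{4}\bigl(4(n/4)^{2^k}\bigr)^2 = 4(n/4)^{2^{k+1}}$, hence $n_d \leq 4(n/4)^{2^d}$ after $d$ steps.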

Imbert's acceleration theorems

Two "acceleration" theorems due to Imbert[5] permit the elimination of redundant inequalities based solely on syntactic properties of the formula derivation tree, thus curtailing the need to solve linear programs or compute matrix ranks.

Define the history $H_i$ of an inequality $i$ as the set of indexes of inequalities from the initial system $S$ used to produce $i$. Thus, $H_i = \{i\}$ for inequalities $i \in S$ of the initial system. When adding a new inequality $k : A_i(x_1,\dots,x_{r-1}) \leq B_j(x_1,\dots,x_{r-1})$ (by eliminating $x_r$), the new history $H_k$ is constructed as $H_k = H_i \cup H_j$.

Suppose that the variables $O_k = \{x_r, \ldots, x_{r-k+1}\}$ have been officially eliminated. Each inequality $i$ partitions the set $O_k$ into $E_i \cup I_i \cup R_i$:

  • $E_i$, the set of effectively eliminated variables, i.e. on purpose. A variable $x_j$ is in the set as soon as at least one inequality in the history $H_i$ of $i$ results from the elimination of $x_j$;
  • $I_i$, the set of implicitly eliminated variables, i.e. by accident. A variable is implicitly eliminated when it appears in at least one inequality of $H_i$, but appears neither in inequality $i$ nor in $E_i$;
  • $R_i$, all remaining variables.

A non-redundant inequality has the property that its history is minimal.[6]

Theorem (Imbert's first acceleration theorem). If the history $H_i$ of an inequality $i$ is minimal, then $1 + |E_i| \;\leq\; |H_i| \;\leq\; 1 + \left|E_i \cup (I_i \cap O_k)\right|$.

An inequality that does not satisfy these bounds is necessarily redundant, and can be removed from the system without changing its solution set.

The second acceleration theorem detects minimal history sets:

Theorem (Imbert's second acceleration theorem). If the inequality $i$ is such that $1 + |E_i| = |H_i|$, then $H_i$ is minimal.

This theorem provides a quick detection criterion and is used in practice to avoid more costly checks, such as those based on matrix ranks. See the reference for implementation details.[6]
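The bookkeeping needed for both theorems can be carried along with each derived inequality. The following Python sketch is purely illustrative (the dictionary layout and function names are invented for this example, and coefficients are omitted); it shows how histories and eliminated-variable sets propagate through a combination step and how the two theorems act as redundancy filters.

def combine(lower, upper, x):
    # Combine a lower-bound and an upper-bound inequality while eliminating
    # variable x, carrying the bookkeeping used by Imbert's theorems.
    # Each inequality is a dict (hypothetical structure, coefficients omitted):
    #   'history' : H_i, indices of the original inequalities it derives from
    #   'elim'    : E_i, variables effectively eliminated to obtain it
    #   'vars'    : variables still occurring in it
    #   'hvars'   : all variables occurring in the original inequalities of H_i
    return {
        "history": lower["history"] | upper["history"],
        "elim":    lower["elim"] | upper["elim"] | {x},
        "vars":    (lower["vars"] | upper["vars"]) - {x},
        "hvars":   lower["hvars"] | upper["hvars"],
    }

def passes_first_theorem(ineq, O_k):
    # Keep ineq only if 1 + |E_i| <= |H_i| <= 1 + |E_i ∪ (I_i ∩ O_k)|;
    # anything violating these bounds is redundant and can be dropped.
    E, H = ineq["elim"], ineq["history"]
    implicit = (ineq["hvars"] & O_k) - ineq["vars"] - E   # I_i ∩ O_k
    return 1 + len(E) <= len(H) <= 1 + len(E | implicit)

def history_is_minimal(ineq):
    # Imbert's second theorem: 1 + |E_i| = |H_i| guarantees a minimal history.
    return 1 + len(ineq["elim"]) == len(ineq["history"])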

Applications in information theory

Information-theoretic achievability proofs result in conditions under which the existence of a well-performing coding scheme is guaranteed. These conditions are often described by a linear system of inequalities. The variables of the system include both the transmission rates (which are part of the problem's formulation) and additional auxiliary rates used in the design of the scheme. Commonly, one aims to describe the fundamental limits of communication in terms of the problem's parameters only. This gives rise to the need to eliminate the aforementioned auxiliary rates, which is done via Fourier–Motzkin elimination. However, the elimination process results in a new system that possibly contains more inequalities than the original. Yet, often some of the inequalities in the reduced system are redundant. Redundancy may be implied by other inequalities or by inequalities in information theory (a.k.a. Shannon-type inequalities). A recently developed open-source software for MATLAB[7] performs the elimination while identifying and removing redundant inequalities. Consequently, the software outputs a simplified system (without redundancies) that involves the communication rates only.

A redundant constraint can be identified by solving a linear program as follows: given a system of linear constraints, if the $i$-th inequality is satisfied by every solution of all the other inequalities, then it is redundant. Similarly, Shannon-type inequalities (STIs) are inequalities that are implied by the non-negativity of information-theoretic measures and the basic identities they satisfy. For instance, the STI $I(X_1;X_2) \leq H(X_1)$ is a consequence of the identity $I(X_1;X_2) = H(X_1) - H(X_1|X_2)$ and the non-negativity of conditional entropy, i.e., $H(X_1|X_2) \geq 0$. Shannon-type inequalities define a cone in $\mathbb{R}^{2^n - 1}$, where $n$ is the number of random variables appearing in the involved information measures. Consequently, any STI can be proven via linear programming by checking whether it is implied by the basic identities and non-negativity constraints. The described algorithm first performs Fourier–Motzkin elimination to remove the auxiliary rates. Then, it imposes the information-theoretic non-negativity constraints on the reduced output system and removes redundant inequalities.
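As an illustration of the linear-programming test, the following sketch checks one inequality of a system A x ≤ b for redundancy using scipy.optimize.linprog (assuming SciPy is available; this is not the MATLAB software cited above, and tolerance and failure handling are simplified).

import numpy as np
from scipy.optimize import linprog

def is_redundant(A, b, i, tol=1e-9):
    # Check whether the i-th inequality of A @ x <= b is implied by the others:
    # maximize a_i @ x subject to the remaining inequalities; if that maximum
    # does not exceed b_i, the i-th constraint is redundant.
    A, b = np.asarray(A, dtype=float), np.asarray(b, dtype=float)
    mask = np.arange(len(b)) != i
    res = linprog(-A[i],                                  # linprog minimizes, so negate
                  A_ub=A[mask], b_ub=b[mask],
                  bounds=[(None, None)] * A.shape[1])     # variables are free reals
    if res.status == 3:        # objective unbounded: constraint i is needed
        return False
    if res.status == 2:        # remaining system infeasible: i is vacuously implied
        return True
    return res.success and -res.fun <= b[i] + tol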

See also

  • Farkas' lemma – can be proved using FM elimination.
  • Real closed field – the cylindrical algebraic decomposition algorithm performs quantifier elimination over polynomial inequalities, not just linear.

References

  1. ^ Fourier, Joseph (1827). "Histoire de l'Académie, partie mathématique (1824)". Mémoires de l'Académie des sciences de l'Institut de France. Vol. 7. Gauthier-Villars.
  2. ^ Gärtner, Bernd; Matoušek, Jiří (2006). Understanding and Using Linear Programming. Berlin: Springer. ISBN 3-540-30697-8. Pages 81–104.
  3. ^ David Monniaux, Quantifier elimination by lazy model enumeration, Computer aided verification (CAV) 2010.
  4. ^ Jing, R.-J.; Moreno-Maza, M.; Talaashrafi, D. "Complexity Estimates for Fourier-Motzkin Elimination". In: Boulier, F., England, M., Sadykov, T.M., Vorozhtsov, E.V. (eds.) Computer Algebra in Scientific Computing. CASC 2020. Lecture Notes in Computer Science, vol. 12291. Springer.
  5. ^ Jean-Louis Imbert, About Redundant Inequalities Generated by Fourier's Algorithm, Artificial Intelligence IV: Methodology, Systems, Applications, 1990.
  6. ^ a b Jean-Louis Imbert, Fourier Elimination: Which to Choose?.
  7. ^ Gattegno, Ido B.; Goldfeld, Ziv; Permuter, Haim H. (2015-09-25). "Fourier-Motzkin Elimination Software for Information Theoretic Inequalities". arXiv:1610.03990 [cs.IT].

Further reading

  • Schrijver, Alexander (1998). Theory of Linear and Integer Programming. John Wiley & sons. pp. 155–156. ISBN 978-0-471-98232-6.
  • Keßler, Christoph W. (1996). "Parallel Fourier–Motzkin Elimination". Universität Trier: 66–71. CiteSeerX 10.1.1.54.657.
  • Williams, H. P. (1986). "Fourier's Method of Linear Programming and its Dual" (PDF). American Mathematical Monthly. 93 (9): 681–695. doi:10.2307/2322281. JSTOR 2322281.

External links

  • Chapter 1 of Undergraduate Convexity, textbook by Niels Lauritzen at Aarhus University.
  • FME software for Information theory, open-source code in MATLAB by Ido B. Gattegno, Ziv Goldfeld and Haim H. Permuter.
  • Symbolic Fourier-Motzkin elimination, open-source code in Python implementing the two Imbert acceleration theorems.

