• Home
  • About Us
  • Contact Us
  • Disclaimer
  • Privacy Policy
Sunday, March 22, 2026
newsaiworld
  • Home
  • Artificial Intelligence
  • ChatGPT
  • Data Science
  • Machine Learning
  • Crypto Coins
  • Contact Us
No Result
View All Result
  • Home
  • Artificial Intelligence
  • ChatGPT
  • Data Science
  • Machine Learning
  • Crypto Coins
  • Contact Us
No Result
View All Result
Morning News
No Result
View All Result
Home Machine Learning

A Light Introduction to Nonlinear Constrained Optimization with Piecewise Linear Approximations

Admin by Admin
March 22, 2026
in Machine Learning
0
Daniel von appen gnxepl wzfg unsplash scaled 1.jpg
0
SHARES
0
VIEWS
Share on FacebookShare on Twitter

READ ALSO

The Math That’s Killing Your AI Agent

The Fundamentals of Vibe Engineering


downside, the purpose is to seek out one of the best (most or minimal) worth of an goal operate by choosing actual variables that fulfill a set of equality and inequality constraints.

A common constrained optimization downside is to pick nn actual determination variables x0,x1,…,xn−1x_0,x_1,dots,x_{n-1} from a given possible area in such a approach as to optimize (reduce or maximize) a given goal operate

[f(x_0,x_1,dots,x_{n-1}).]

We normally let xx denote the vector of nn actual determination variables x0,x1,…,xn−1.x_0,x_1,dots,x_{n-1}. That’s, x=(x0,x1,…,xn−1)x=(x_0,x_1,dots,x_{n-1}) and write the overall nonlinear program as:

[begin{aligned}text{ Maximize }&f(x)text{ Subject to }&g_i(x)leq b_i&&qquad(i=0,1,dots,m-1)&xinmathbb{R}^nend{aligned}]

the place every of the constraint features g0g_0 by way of gm−1g_{m-1} is given, and every bib_i is a continuing (Bradley et al., 1977).

This is just one attainable approach to write the issue. Minimizing f(x)f(x) is equal to maximizing −f(x).-f(x). Likewise, an equality constraint h(x)=bh(x)=b will be expressed because the pair of inequalities h(x)≤bh(x)leq b and −h(x)≤−b.-h(x)leq-b. Furthermore, by including a slack variable, any inequality will be transformed into an equality constraint (Bradley et al., 1977).

Some of these issues seem throughout many utility areas, for instance, in enterprise settings the place an organization goals to maximise revenue or reduce prices whereas working on sources or funding constraints (Cherry, 2016).

If f(x)f(x) is a linear operate and the constraints are linear, then the issue is named a linear programming downside (LP) (Cherry, 2016).

“The issue is named a nonlinear programming downside (NLP) if the target operate is nonlinear and/or the possible area is decided by nonlinear constraints.” (Bradley et al., 1977, p. 410)

The assumptions and approximations utilized in linear programming can typically produce an acceptable mannequin for the choice variable ranges of curiosity. In different circumstances, nonetheless, nonlinear habits within the goal operate and/or the constraints is crucial to formulate the appliance as a mathematical program precisely (Bradley et al., 1977).

Nonlinear programming refers to strategies for fixing an NLP. Though many mature solvers exist for LPs, NLPs, particularly these involving higher-order nonlinearities, are sometimes tougher to unravel (Cherry, 2016).

Difficult nonlinear programming issues come up in areas reminiscent of digital circuit design, monetary portfolio optimization, gasoline community optimization, and chemical course of design (Cherry, 2016).

One approach to sort out nonlinear programming issues is to linearize them and use an LP solver to acquire approximation. One wish to try this for a number of causes. For example, linear fashions are normally quicker to unravel and will be extra numerically steady.

To maneuver from concept to a working Python mannequin, we are going to stroll by way of the next sections:

This textual content assumes some familiarity with mathematical programming. After defining Separable Features, we current a separable nonlinear maximization Instance and description an method primarily based on piecewise linear (PWL) approximations of the nonlinear goal operate.

Subsequent, we outline Particular Ordered Units of Kind 2 and clarify how they help the numerical formulation.

We then introduce the theoretical background in On Convex and Concave Features, offering instruments used all through the rest of this work on nonlinear programming (NLP).

Lastly, we current a Python Implementation process that makes use of Gurobipy and PWL approximations of the nonlinear goal operate, enabling LP/MIP solvers to acquire helpful approximations for pretty giant NLPs.


Separable Features

This text’s answer process is for separable applications. Separable applications are optimization issues of the shape:

[begin{aligned}text{ Maximize }&sumlimits_{j=0}^{n-1}f_j(x_j)text{ Subject to }&sumlimits_{j=0}^{n-1} g_{ij}(x_j)leq 0qquad(i=0,1,dots,m-1)&xinmathbb{R}^nend{aligned}]

the place every of the features fjf_j and gijg_{ij} are identified. These issues are known as separable as a result of the choice variables seem individually: one in every constraint operate gijg_{ij} and one in every goal operate fjf_j (Bradley et al., 1977).


Instance

Contemplate the next downside from Bradley et al. (1977) that arises in portfolio choice:

[begin{aligned}text{ Maximize }&f(x)=20x_0+16x_1-2x_0^2-x_1^2-(x_0+x_1)^2text{ Subject to }&x_0+x_1leq5&x_0geq0,x_1geq0.end{aligned}]

Goal operate (Animation by the writer).
Goal operate over the possible area (Animation by the writer).

As said, the issue shouldn’t be separable due to the time period (x0+x1)2(x_0+x_1)^2 within the goal operate. Nonetheless, nonseparable issues can ceaselessly be lowered to a separable kind utilizing varied formulation tips. Particularly, by letting x2=x0+x1x_2=x_0+x_1 we will re-express the issue above in separable kind as:

[begin{aligned}text{ Maximize }&f(x)=20x_0+16x_1-2x_0^2-x_1^2-x_2^2text{ Subject to }&x_0+x_1leq5&x_0+x_1-x_2=0&x_0geq0,x_1geq0,x_2geq 0.end{aligned}]

Clearly, the linear constraints are separable, and the target operate can now be written as f(x)=f0(x0)+f1(x1)+f2(x2),f(x)=f_0(x_0)+f_1(x_1)+f_2(x_2), the place

[begin{aligned}f_0(x_0)&=20x_0-2x_0^2f_1(x_1)&=16x_1-x_1^2end{aligned}]

and

[f_2(x_2)=-x_2^2.]

Nonlinear features (Picture by the writer).

To kind the approximation downside, we approximate every nonlinear time period fj(xj)f_j(x_j) by a piecewise linear operate fj^(xj)hat{f_j}(x_j) through the use of a specified variety of breakpoints.

Every PWL operate fj^(xj)hat{f_j}(x_j) is outlined over an interval [l,u][l,u] and consists of line segments whose endpoints lie on the nonlinear operate fj(xj),f_j(x_j), beginning at (l,fj(l))(l,f_j(l)) and ending at (u,fj(u)).(u,f_j(u)). We characterize the PWL features utilizing the next parametrization. One can write any l≤xj≤ulleq x_jleq u as:

[x_j=sumlimits_{i=0}^{r-1}theta_j^i a_i,quad hat f_j(x_j)=sumlimits_{i=0}^{r-1}theta_j^if_j(a_i),quadsumlimits_{i=0}^{r-1}theta_j^i=1,quadtheta_j=(theta_j^0,theta_j^1,dotstheta_j^{r-1})inmathbb{R}_{geq0}^r]

for every j=0,1,2,j=0,1,2, the place the PWL features are specified by the factors {(ai,fj(ai)}{(a_i,f_j(a_i)} for i=0,1,…,r−1i=0,1,dots,r-1 (Cherry, 2016). Right here, we use 4 uniformly distributed breakpoints to approximate every fj(xj)f_j(x_j) Additionally, since x0+x1≤5,x_0+x_1leq5, x0+x1=x2,x_0+x_1=x_2, and x0,x1,x2≥0x_0,x_1,x_2geq0 we’ve that

[0leq x_0leq5,qquad0leq x_1leq 5,quadtext{ and }quad0leq x_2leq5.]

And so, our approximations needn’t prolong past these variable bounds. Subsequently the PWL operate fj^(xj)hat{f_j}(x_j) can be outlined by the factors (0,fj(0)),(0,f_j(0)), (5/3,fj(5/3)),(5/3,f_j(5/3)), (10/3,fj(10/3))(10/3,f_j(10/3)) and (5,fj(5))(5,f_j(5)) for j=0,1,2.j=0,1,2.

Any x0∈[0,5]∩ℝx_0in[0,5]capmathbb{R} will be written as

[begin{aligned}x_0&=theta_0^0cdot0+theta_0^1cdotfrac{5}{3}+theta_0^2cdotfrac{10}{3}+theta_0^3cdot5&=theta_0^1cdotfrac{5}{3}+theta_0^2cdotfrac{10}{3}+theta_0^3cdot 5text{ s.t. }sumlimits_{i=0}^3theta_0^i=1end{aligned}]

and f0(x0)f_0(x_0) will be approximated as

[begin{aligned}hat f_0(x_0)&=theta_0^0f_0(0)+theta_0^1f_0left(frac{5}{3}right)+theta_0^2f_0left(frac{10}{3}right)+theta_0^3f_0(5)&=theta_0^1cdotfrac{250}{9}+theta_0^2cdotfrac{400}{9}+theta_0^3cdot50.end{aligned}]

The identical reasoning follows for x1x_1 and x2x_2 For instance, evaluating the approximation at x0=1.5x_0=1.5 provides:

[begin{aligned}hat f_0(1.5)&=frac{1}{10}f_0(0)+frac{9}{10}f_0left(frac{5}{3}right)&=frac{1}{10}cdot 0+frac{9}{10}cdotfrac{250}{9}&=25end{aligned}]

since

[1.5=frac{1}{10}cdot0+frac{9}{10}cdotfrac{5}{3}.]

PWL approximation curves (Picture by the writer).

Such weighted sums are known as convex mixtures, i.e., linear mixtures of factors with non-negative coefficients that sum to 1. Now, since f0(x0),f_0(x_0), f1(x1),f_1(x_1), and f2(x2)f_2(x_2) are concave features, one could ignore extra restrictions, and the issue will be solved as an LP (Bradley et al., 1977). Nonetheless, on the whole, such a formulation doesn’t suffice. Particularly, we nonetheless must implement the adjacency situation: At most two θjitheta_j^i weights are constructive, and if two weights are constructive, then they’re adjoining, i.e., of the shape θjitheta_j^i and θji+1.theta_j^{i+1}. An identical restriction applies to every approximation. For example, if the weights θ00=2/5theta_0^0=2/5 and θ02=3/5theta_0^2=3/5 then the approximation provides

[begin{aligned}hat f_0(x_0)&=frac{2}{5}cdot 0+frac{3}{5}cdotfrac{400}{9}=frac{80}{3}x_0&=frac{2}{5}cdot0+frac{3}{5}cdotfrac{10}{3}=2.end{aligned}]

In distinction, at x0=2,x_0=2, the approximation curve provides

[begin{aligned}hat f_0(x_0)&=frac{4}{5}cdotfrac{250}{9}+frac{1}{5}cdotfrac{400}{9}=frac{280}{9}x_0&=frac{4}{5}cdotfrac{5}{3}+frac{1}{5}cdotfrac{10}{3}=2.end{aligned}]

Want for adjacency situation (Picture by the writer).

One can implement the adjacency situation by introducing extra binary variables into the mannequin, as formulated in Cherry (2016). Nonetheless, we leverage Gurobipy‘s SOS (Particular Ordered Set) constraint object.

The entire program is as follows

[begin{alignat}{2}text{ Maximize }&hat f_0(x_0)+hat f_1(x_1)+hat f_2(x_2)text{ Subject to }&x_0=sumlimits_{i=0}^{r-1}theta_0^ia_i&x_1=sumlimits_{i=0}^{r-1}theta_1^ia_i&x_2=sumlimits_{i=0}^{r-1}theta_2^ia_i&hat f_0=sumlimits_{i=0}^{r-1}theta_0^if_0(a_i)&hat f_1=sumlimits_{i=0}^{r-1}theta_1^if_1(a_i)&hat f_2=sumlimits_{i=0}^{r-1}theta_2^if_2(a_i)&sumlimits_{i=0}^{r-1}theta_j^i=1&&qquad j=0,1,2&{theta_j^0,theta_j^1,dotstheta_j^{r-1}}text{ SOS type 2 }&&qquad j=0,1,2&x_0,x_1,x_2in[0,5]capmathbb{R}&hat f_0in[0,50]capmathbb{R}&hat f_1in[0,55]capmathbb{R}&hat f_2in [-25,0]capmathbb{R}&theta_j^iin[0,1]capmathbb{R}&&qquad j=0,1,2,;i=0,1,dots,{r-1}.finish{alignat}]


Particular Ordered Units of Kind 2

“A set of variables {x0,x1,…,xn−1}{x_0,x_1,dots,x_{n-1}} is a particular ordered set of sort 2 (SOS sort 2) if xixj=0x_ix_j=0 at any time when |i−j|≥2,|i-j|geq2, i.e., at most two variables within the set will be nonzero, and if two variables are nonzero they have to be adjoining within the set.” (de Farias et al., 2000, p. 1)

In our formulation, the adjacency situation on the θjitheta_j^i variables matches precisely a SOS sort 2 situation: at most two θjitheta_j^i will be constructive, they usually should correspond to adjoining breakpoints. Utilizing SOS sort 2 constraints lets us guarantee adjacency immediately in Gurobi, which applies specialised branching methods that routinely seize the SOS construction, as a substitute of introducing additional binary variables by hand.


On Convex and Concave Features

For the next dialogue, we take a common separable constrained maximization downside occasion, as beforehand outlined within the Introduction part:

[begin{aligned}text{ Maximize }&sumlimits_{j=0}^{n-1}f_j(x_j)text{ Subject to }&sumlimits_{j=0}^{n-1} g_{ij}(x_j)leq 0qquad(i=0,1,dots,m-1)&xinmathbb{R}^n.end{aligned}]

the place every of the features fif_i and gijg_{ij} are identified.

All through, the time period piecewise-linear (PWL) approximation refers back to the secant interpolant obtained by connecting ordered breakpoints pairs (aok,f(aok))(a_k,f(a_k)) with line segments.

Convex and concave are among the many most important purposeful kinds in mathematical optimization. Formally, a operate f(x)f(x) is named convex if, for each yy and zz and each 0≤λ≤1,0leqlambdaleq1,

[f[lambda y+(1-lambda)z]leqlambda f(y)+(1-lambda)f(z).]

This definition implies that the sum of convex features is convex, and nonnegative multiples of convex features are additionally convex. Concave features are merely the damaging of convex features, for which the definition above follows aside from the reversed course of the inequality. In fact, some features are neither convex nor concave.

Convex and concave features (Eric W. Weisstein on MathWorld)

It’s simple to see that linear features are each convex and concave. This property is crucial because it permits us to optimize (maximize or reduce) linear goal features utilizing computationally environment friendly methods, such because the simplex methodology in linear programming (Bradley et al., 1977).

Moreover, a single-variable differentiable operate f(x)f(x) is convex on an interval precisely when its by-product f′(x)f^prime(x) is nondecreasing — which means the slope doesn’t go down. Likewise, differentiable f(x)f(x) is concave on an interval precisely when f′(x)f^prime(x) is nonincreasing — which means the slope doesn’t go up.

From the definition above, one can trivially conclude that PWL approximations overestimate convex features since each line phase becoming a member of two factors on its graph doesn’t lie beneath the graph at any level. Equally, PWL approximations underestimate concave features.

Nonlinear Goal Perform

This notion helps us clarify why we will omit the SOS sort 2 constraints from an issue such because the one within the Instance part. By concavity, the PWL approximation curve lies above the phase becoming a member of any two non-adjacent factors. Subsequently, the maximization will choose the approximation curve with solely adjoining weights. An identical argument applies if three or extra weights are constructive (Bradley et al., 1977). Formally, suppose that every fj(xj)f_j(x_j) is a concave operate and every identified gijg_{ij} is a linear operate in x.x. Then we will apply the identical process as within the Instance part — We let breakpoints fulfill

[l= a_0lt a_1lt dotslt a_{r-1}= u,qquad rgeq 2.]

For actual numbers l≤xj≤u.lleq x_jleq u. Additionally, let θj∈ℝ+rtheta_jinmathbb{R}_+^r such that,

[sumlimits_{i=0}^{r-1}theta_j^i=1]

and assign

[x_j=sumlimits_{i=0}^{r-1}theta_j^icdot a_i,qquadhat f_j=sumlimits_{i=0}^{r-1}theta_j^if_j(a_i)]

for every j=0,1,…,n−1.j=0,1,dots,n-1. And so, the reworked LP is as follows

[begin{alignat}{2}text{ Maximize }&sumlimits_{j=0}^{n-1}hat f_jtext{ Subject to }&sumlimits_{j=0}^{n-1} g_{ij}(x_j)leq 0&&qquad(i=0,1,dots,m-1)&x_j=sumlimits_{i=0}^{r-1}theta_j^icdot a_i&&qquad(j=0,1,dots,n-1)&hat f_j=sumlimits_{i=0}^{r-1}theta_j^if_j(a_i)&&qquad(j=0,1,dots,n-1)&sum_{i=0}^{r-1}theta_j^i=1&&qquad(j=0,1,dots,n-1)&theta_jinmathbb{R}_{geq 0}^r&&qquad(j=0,1,dots,n-1)&xinmathbb{R}^n.end{alignat}]

Now, let pj(xj)p_j(x_j) be the PWL curve that goes by way of the factors (ai,fj(ai))(a_i,f_j(a_i)) i.e., for every xj∈[ak,ak+1],x_jin[a_k,a_{k+1}], j=0,1,…,n−1:j=0,1,dots,n-1:

[p_j(x_j)=frac{a_{k+1}-x_j}{a_{k+1}-a_k}f_j(a_k)+frac{x_j-a_k}{a_{k+1}-a_k}f_j(a_{k+1})]

fj(xj)f_j(x_j) concave implies that its secant slopes are non-increasing, and since pj(xj)p_j(x_j) is constructed by becoming a member of ordered breakpoints on the graph of fj(xj),f_j(x_j), we’ve that pj(xj)p_j(x_j) can be concave. And so, by Jensen’s inequality,

[p_j(x_j)=p_jleft(sumlimits_{i=0}^{r-1}theta_j^icdot a_iright)geqsumlimits_{i=0}^{r-1}theta_j^icdot p_j(a_i)=sumlimits_{i=0}^{r-1}theta_j^if_j(a_i)=hat f_j.]

Furthermore, for all xjx_j in any possible answer ((x0,f0^,θ0),(x1,f1^,θ1),…,(xn−1,f^n−1,θn−1)),((x_0,hat{f_0},theta_0),(x_1,hat{f_1},theta_1),dots,(x_{n-1},hat{f}_{n-1},theta_{n-1})), one can select okok such that xj∈[ak,ak+1]x_jin[a_k,a_{k+1}] and outline the vector θj^hat{theta_j} by

[hattheta_j^k=frac{a_{k+1}-x_j}{a_{k+1}-a_k},qquad hattheta_j^{k+1}=frac{x_j-a_k}{a_{k+1}-a_k},qquadhattheta_j^i=0,;(inotin {k,k+1})qquad]

with θ^jok,θ^jok+1≥0hattheta_j^ok,hattheta_j^{ok+1}geq 0 and θ^jok+θ^jok+1=1.hattheta_j^ok+hattheta_j^{ok+1}=1. Then θ^jok⋅aok+θ^jok+1⋅aok+1=xjhattheta_j^kcdot a_k+hattheta_j^{ok+1}cdot a_{ok+1}=x_j and offers

[hat f_j^text{new}=hattheta_j^k f_j(a_k)+hattheta_j^{k+1}f_j(a_{k+1})=p_j(x_j)geqhat f_j.]

As a result of the constraints depend upon θjtheta_j solely by way of the induced worth xj,x_j, changing every θjtheta_j by θ^jhattheta_j preserves feasibility (as a result of it leaves every xjx_j unchanged) and it weakly improves the target operate (as a result of it will increase every f^jhat f_j to pj(xj)p_j(x_j)). Subsequently, the SOS sort 2 constraints that implement the adjacency situation don’t change the optimum worth in separable maximization issues with concave goal features. An identical argument reveals that the SOS sort 2 constraints don’t change the optimum worth in separable minimization issues with convex goal features.

Word, nonetheless, that this reveals that there exists an optimum answer by which solely weights similar to adjoining breakpoints are constructive. It doesn’t assure that each one optimum options of the LP fulfill adjacency. Subsequently, one should still embrace SOS sort 2 constraints to implement a constant illustration.

In follow, fixing fashions with concave goal features through PWL approximations can yield an goal worth that underestimates the true optimum worth. Conversely, fixing fashions with convex goal features through PWL approximations yield an goal worth that overestimates the true optimum worth.

Nonetheless, though these approximations can distort the target worth, in fashions the place the unique linear constraints stay unchanged, feasibility shouldn’t be affected by the PWL approximations. Any variable assignments discovered by fixing the mannequin by way of PWL approximations fulfill the identical linear constraints and due to this fact lie within the unique possible area. Consequently, one can consider the unique nonlinear goal f(x)f(x) on the returned answer to acquire the target worth, though this worth needn’t equal the true nonlinear optimum.

Nonlinear Constraints

What requires totally different approaches are PWL approximations of nonlinear constraints, since these approximations can change the unique possible area. We outline the possible area/set of the issue, just like the one above, by:

[F=bigcaplimits_{i=0}^{m-1}{xinmathbb{R}^n:G_i(x)leq0},]

the place

[G_i(x):=sumlimits_{j=0}^{n-1}g_{ij}(x_j).]

Now, suppose (for notational simplicity) that each one Gi(x)G_i(x) are nonlinear features. Approximate every univariate gij(xj)g_{ij}(x_j) by a PWL operate g^ij(xj)hat g_{ij}(x_j) and outline:

[hat G_i(x):=sumlimits_{j=0}^{n-1}hat g_{ij}(x_j).]

Then

[hat F=bigcaplimits_{i=0}^{m-1}{xinmathbb{R}^n:hat G_i(x)leq0},]

is the possible set of the reworked LP that makes use of PWL approximations. To keep away from actually infeasible options, we wish

[hat Fsubseteq F.]

If, as a substitute, F⊆F^Fsubseteqhat F then there could exist x∈F^∖Fxin hat Fsetminus F which means the mannequin that makes use of PWL approximations may settle for factors that violate the unique nonlinear constraints.

Possible units (Picture by the writer).

Given the way in which we assemble the PWL approximations, a adequate situation for F^⊆Fhat Fsubseteq F is that for constraints of the shape Gi(x)≤0G_i(x)leq0 every gij(xj)g_{ij}(x_j) is convex and for constraints of the shape Gi(x)≥0G_i(x)geq0 every gij(xj)g_{ij}(x_j) is concave.

To see why this holds, take any x∈F^.xinhat F. First, take into account constraints of the shape Gi(x)≤0G_i(x)leq0 the place every gij(xj)g_{ij}(x_j) is convex. As a result of PWL approximations overestimate convex features, for any mounted i=0,1,…,m−1,i=0,1,dots,m-1, we’ve:

[(forall j,;hat g_{ij}(x_j)geq g_{ij}(x_j))rightarrowsumlimits_{j=0}^{n-1}hat g_{ij}(x_j)geqsumlimits_{j=0}^{n-1}g_{ij}(x_j)rightarrowhat G_i(x)geq G_i(x)]

Since x∈F^,xinhat F, it satisfies G^i(x)≤0.hat G_i(x)leq0. Along with G^i(x)≥Gi(x),hat G_i(x)geq G_i(x), this means Gi(x)≤0.G_i(x)leq0.

Subsequent, take into account constraints of the shape Gi(x)≥0G_i(x)geq0 the place every gij(xj)g_{ij}(x_j) is concave. As a result of PWL approximations underestimate concave features, for any mounted i=0,1,…,m−1,i=0,1,dots,m-1, we’ve:

[(forall j,;hat g_{ij}(x_j)leq g_{ij}(x_j))rightarrowsumlimits_{j=0}^{n-1}hat g_{ij}(x_j)leqsumlimits_{j=0}^{n-1}g_{ij}(x_j)rightarrowhat G_i(x)leq G_i(x).]

Once more, since x∈F^,xinhat F, it satisfies G^i(x)≥0.hat G_i(x)geq0. Mixed with G^i(x)≤Gi(x),hat G_i(x)leq G_i(x), this means Gi(x)≥0.G_i(x)geq0.

This motivates the notion of a convex set: A set CC is convex if, for any two factors x,y∈Cx,yin C the complete line phase connecting them lies inside C.C. Formally, CC is convex if for all x,y∈Cx,yin C and any λ∈[0,1]lambdain[0,1] the purpose λx+(1−λ)ylambda x+(1-lambda)y can be in C.C.

Convex and concave features (Eric W. Weisstein on MathWorld)

Particularly, if Gi:ℝn→ℝG_i:mathbb{R}^ntomathbb{R} is convex, then

[S_i:={xinmathbb{R}^n:G_i(x)leq b}]

is convex for any b∈ℝ.binmathbb{R}. Proof: Take any x1,x2∈S,x_1,x_2in S, then Gi(x1)≤bG_i(x_1)leq b and Gi(x2)≤b.G_i(x_2)leq b. Since Gi(x)G_i(x) is convex, for any λ∈[0,1]lambdain[0,1] we’ve that

[G_i(lambda x_1+(1-lambda)x_2)leqlambda G_i(x_1)+(1-lambda)G_i(x_2)leqlambda b+(1-lambda)b=b.]

Therefore, λx1+(1−λ)x2∈S,lambda x_1+(1-lambda) x_2in S, and so, SS is convex. An identical argument applies to point out that

[T_i:={xinmathbb{R}^n:G_i(x)geq b}]

is convex for any b∈ℝbinmathbb{R} if Gi:ℝn→ℝG_i:mathbb{R}^ntomathbb{R} is concave.

And since one may simply present that the intersection of any assortment of convex units can be convex,

“for a convex possible set we wish convex features for less-than-or-equal-to constraints and concave features for greater-than-or-equal to constraints.” (Bradley et al., 1977, p.418)

Nonlinear optimization issues by which the purpose is to attenuate a convex goal (or maximize a concave one) over a convex set of constraints are known as convex nonlinear issues. These issues are usually simpler to sort out than nonconvex ones.

Certainly, if F⊆F^Fsubseteqhat F then variable assignments could result in constraint violations. In these circumstances, one would wish to depend on different methods, reminiscent of the straightforward modification MAP to the Frank-Wolfe algorithm offered in Bradley et al. (1977), and/or introduce tolerances that, to some extent, permit constraint violations.


Python Implementation

Earlier than presenting the code, it’s helpful to think about how our modeling method will scale as the issue dimension will increase. For a separable program with nn determination variables and rr breakpoints per variable, the mannequin introduces n⋅rncdot r steady variables (for the convex mixtures) and rr SOS sort 2 constraints. Because of this each the variety of variables and constraints develop linearly with nn and r,r, however their product can rapidly result in giant fashions when both parameter is elevated.

As beforehand talked about, we are going to use Gurobipy to numerically resolve this system above. Usually, the process is as follows:

  1. Select bounds [l,u][l,u]
  2. Select breakpoints aia_i
  3. Add θ,theta, convex-combination constraints
  4. Add SOS sort 2
  5. Clear up
  6. Consider the nonlinear goal on the returned xx
  7. Refine breakpoints if wanted

After importing the mandatory libraries and dependencies, we declare and instantiate our nonlinear features f0,f_0, f1,f_1, and f2:f_2:

# Imports
import gurobipy as grb
import numpy as np

# Nonlinear features
f0=lambda x0: 20.0*x0-2.0*x0**2
f1=lambda x1: 16.0*x1-x1**2
f2=lambda x2: -x2**2

Subsequent, we select the variety of breakpoints. To assemble the PWL approximations, we have to distribute them throughout the interval [0,5]∩ℝ.[0,5]capmathbb{R}. Many methods exist to do that. For example, one may cluster extra factors close to the ends and even depend on adaptive procedures as in Cherry (2016). For simplicity, we follow uniform sampling:

lb,ub=0.0,5.0
r=4  # Variety of breakpoints
a_i=np.linspace(begin=lb,cease=ub,num=r)  # Distribute breakpoints uniformly

Then, we will compute the worth of the features fj(xj)f_j(x_j) on the chosen breakpoints to approximate the choice variables:

# Compute operate values at breakpoints
f0_hat_r=f0(a_i)
f1_hat_r=f1(a_i)
f2_hat_r=f2(a_i)

The following step is to declare and instantiate a Gurobipy optimization mannequin:

mannequin=gp.Mannequin()

After creating of our mannequin, we have to add the choice variables. Gurobipy gives strategies so as to add a number of determination variables to a mannequin directly, and we leverage this function to declare and instantiate a few of our determination variables, particularly the coefficients of the PWL approximations. Word that each one determination variables in our mannequin are steady and will be assigned to any actual worth inside the offered bounds. Such fashions will be solved effectively utilizing fashionable software program packages reminiscent of Gurobi.

# Choice variables
x0=mannequin.addVar(lb=0.0,ub=5.0,title='x0',vtype=GRB.CONTINUOUS)
x1=mannequin.addVar(lb=0.0,ub=5.0,title='x1',vtype=GRB.CONTINUOUS)
x2=mannequin.addVar(lb=0.0,ub=5.0,title='x2',vtype=GRB.CONTINUOUS)

f0_hat=mannequin.addVar(lb=0.0,ub=50.0,title='f0_hat',vtype=GRB.CONTINUOUS)
f1_hat=mannequin.addVar(lb=0.0,ub=55.0,title='f1_hat',vtype=GRB.CONTINUOUS)
f2_hat=mannequin.addVar(lb=-25.0,ub=0.0,title='f2_hat',vtype=GRB.CONTINUOUS)

theta=mannequin.addVars(vary(0,3),vary(0,r),lb=0.0,ub=1.0,title='theta',vtype=GRB.CONTINUOUS)

Moreover, we have to add constraints that restrict the values our determination variables could take. Firstly, we add the constraints that guarantee the right determination of the variables that yield the approximations of the nonlinear features:

# Constraints
mannequin.addConstr(x0==gp.quicksum(theta[0,i]*a_i[i] for i in vary(0,r)))
mannequin.addConstr(x1==gp.quicksum(theta[1,i]*a_i[i] for i in vary(0,r)))
mannequin.addConstr(x2==gp.quicksum(theta[2,i]*a_i[i] for i in vary(0,r)))

mannequin.addConstr(f0_hat==gp.quicksum(theta[0,i]*f0_hat_r[i] for i in vary(0,r)))
mannequin.addConstr(f1_hat==gp.quicksum(theta[1,i]*f1_hat_r[i] for i in vary(0,r)))
mannequin.addConstr(f2_hat==gp.quicksum(theta[2,i]*f2_hat_r[i] for i in vary(0,r)))

mannequin.addConstr(gp.quicksum(theta[0,i] for i in vary(0,r))==1)
mannequin.addConstr(gp.quicksum(theta[1,i] for i in vary(0,r))==1)
mannequin.addConstr(gp.quicksum(theta[2,i] for i in vary(0,r))==1)

Secondly, we add the constraints that outline the unique downside, particularly these restrictions within the nonlinear formulation:

mannequin.addConstr(x0+x1<=5.0)
mannequin.addConstr(x0+x1-x2==0.0)

And at last, we add the SOS Kind 2 constraints to make sure the adjacency situation:

# Particular ordered units
mannequin.addSOS(GRB.SOS_TYPE2,[theta[0,i] for i in vary(0,r)])
mannequin.addSOS(GRB.SOS_TYPE2,[theta[1,i] for i in vary(0,r)])
mannequin.addSOS(GRB.SOS_TYPE2,[theta[2,i] for i in vary(0,r)])

All that continues to be is to outline the target operate, which we wish to maximize:

mannequin.setObjective(f0_hat+f1_hat+f2_hat,GRB.MAXIMIZE)  # Goal operate

The mannequin is now full and we will use the Gurobi solver to retrieve the optimum worth and optimum variable assignments. The code to do that is proven beneath:

mannequin.optimize()  # Optimize!

obj_val=mannequin.ObjVal

res={}

all_vars=mannequin.getVars()
values=mannequin.getAttr('X',all_vars)
names=mannequin.getAttr('VarName',all_vars)

# Retrieve values of variables after optimizing
for title,val in zip(names,values):
    res[name]=val

After executing the code above and printing attributes to the display, the output is as follows:

- Standing: 2
- Variety of breakpoints: 4
- Optimum goal worth: 45.00
- Optimum x0, f0(x0) worth (PWL approximation): 1.67, 27.78
- Optimum x1, f1(x1) worth (PWL approximation): 3.33, 42.22
- Optimum x2, f2(x2) worth (PWL approximation): 5.00, -25.00
PWL approximation curves (3 segments) with optimum values (Picture by the writer).

The distinction from our computed answer to the optimum worth of 139/3 is 4/3. Now, to extend accuracy and acquire a greater answer, one may add extra breakpoints, consequently rising the variety of segments used to approximate the nonlinear features. For example, the output beneath reveals how rising the variety of breakpoints to 16 results in an approximation with virtually no error to the true optimum worth:

- Standing: 2
- Variety of breakpoints: 16
- Optimum goal worth: 46.33
- Optimum x0, f0(x0) worth (PWL approximation): 2.33, 35.78
- Optimum x1, f1(x1) worth (PWL approximation): 2.67, 35.56
- Optimum x2, f2(x2) worth (PWL approximation): 5.00, -25.00
PWL approximation curves (15 segments) with optimum values (Picture by the writer).
Goal operate over the possible area with optimum worth utilizing PWL approximation curves consisting of 15 segments (Animation by the writer).

You could find the whole Python code on this GitHub repository: hyperlink.


Additional Readings

For a extra superior, strong algorithm that makes use of PWL features to approximate the nonlinear goal operate, Cherry (2016) is a precious reference. It presents a process that begins with coarse approximations, that are refined iteratively utilizing the optimum answer to the LP leisure from the earlier step.

For a deeper understanding of nonlinear programming, together with strategies, utilized examples, and a proof of why nonlinear issues are intrinsically tougher to unravel, the reader could seek advice from Nonlinear Programming: Utilized Mathematical Programming (Bradley et al., 1977).

Lastly, de Farias et al. (2000) current computational experiments that use SOS constraints inside a branch-and-cut scheme to unravel a generalized project downside arising in fiber-optic cable manufacturing scheduling.


Conclusion

Nonlinear programming is a related space in numerical optimization — with functions starting from monetary portfolio optimization to chemical course of management. This text offered a process for tackling separable nonlinear applications by changing nonlinear phrases with PWL approximations and fixing the ensuing mannequin utilizing LP/MIP solvers. We additionally offered theoretical background on convex and concave features and defined how these notions relate to nonlinear programming. With these components, a reader ought to be capable of mannequin and resolve LP/MIP-compatible approximations of many nonlinear programming cases with out counting on devoted nonlinear solvers.

Thanks for studying!


References

Cherry, A., 2016. Piecewise Linear Approximation for Nonlinear Programming Issues. A dissertation. Texas Tech College. Obtainable at: https://ttu-ir.tdl.org/server/api/core/bitstreams/2ffce0e6-87e9-4e4c-b41a-60ea670c5a13/content material (Accessed: 27 January 2026).

Bradley, S. P., Hax, A. C. & Magnanti, T. L., 1977. Nonlinear Programming. In: Utilized Mathematical Programming. Studying, MA: Addison-Wesley Publishing Firm, pp. 410–464. Obtainable at: https://internet.mit.edu/15.053/www/AMP-Chapter-13.pdf (Accessed: 27 January 2026).

de Farias, I. R., Johnson, E. L. & Nemhauser, G. L., 2000. A generalized project downside with particular ordered units: a polyhedral method. Mathematical Programming, 89(1), pp. 187–203. Obtainable at: https://doi.org/10.1007/PL00011392 (Accessed: 27 January 2026).


Join With Me

Tags: ApproximationsConstrainedGentleIntroductionLinearNonlinearOptimizationPiecewise

Related Posts

Image 252.png
Machine Learning

The Math That’s Killing Your AI Agent

March 21, 2026
Skarmavbild 2026 03 16 kl. 20.16.38.jpg
Machine Learning

The Fundamentals of Vibe Engineering

March 19, 2026
Head img.png
Machine Learning

Two-Stage Hurdle Fashions: Predicting Zero-Inflated Outcomes

March 18, 2026
Embed main image scaled 1.jpg
Machine Learning

Introducing Gemini Embeddings 2 Preview | In direction of Knowledge Science

March 17, 2026
Screenshot 2026 03 12 at 9.53.17 am.jpg
Machine Learning

The Causal Inference Playbook: Superior Strategies Each Knowledge Scientist Ought to Grasp

March 16, 2026
Image 167.jpg
Machine Learning

The Multi-Agent Entice | In the direction of Knowledge Science

March 14, 2026
Next Post
019d13d2 f30e 7b76 b8fc 6a67dc0b5f97.jpg

Resolv Labs’ Stablecoin Depegs Amid Exploit

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

POPULAR NEWS

Gemini 2.0 Fash Vs Gpt 4o.webp.webp

Gemini 2.0 Flash vs GPT 4o: Which is Higher?

January 19, 2025
Chainlink Link And Cardano Ada Dominate The Crypto Coin Development Chart.jpg

Chainlink’s Run to $20 Beneficial properties Steam Amid LINK Taking the Helm because the High Creating DeFi Challenge ⋆ ZyCrypto

May 17, 2025
Image 100 1024x683.png

Easy methods to Use LLMs for Highly effective Computerized Evaluations

August 13, 2025
Blog.png

XMN is accessible for buying and selling!

October 10, 2025
0 3.png

College endowments be a part of crypto rush, boosting meme cash like Meme Index

February 10, 2025

EDITOR'S PICK

Cardano unlock .jpg

Cardano bets on LayerZero to unlock $80B in cross-chain belongings

February 13, 2026
1722126754 scin1 examplehero.width 800.png

A brand new useful resource for consultant dermatology photos

July 28, 2024
1xa5sqfvuzzrfdqz25b5aiw.png

GGUF Quantization with Imatrix and Okay-Quantization to Run LLMs on Your CPU

September 13, 2024
Kdn chugani n8n csv feature.png

Automate Knowledge High quality Stories with n8n: From CSV to Skilled Evaluation

June 27, 2025

About Us

Welcome to News AI World, your go-to source for the latest in artificial intelligence news and developments. Our mission is to deliver comprehensive and insightful coverage of the rapidly evolving AI landscape, keeping you informed about breakthroughs, trends, and the transformative impact of AI technologies across industries.

Categories

  • Artificial Intelligence
  • ChatGPT
  • Crypto Coins
  • Data Science
  • Machine Learning

Recent Posts

  • Resolv Labs’ Stablecoin Depegs Amid Exploit
  • A Light Introduction to Nonlinear Constrained Optimization with Piecewise Linear Approximations
  • Escaping the SQL Jungle | In direction of Information Science
  • Home
  • About Us
  • Contact Us
  • Disclaimer
  • Privacy Policy

© 2024 Newsaiworld.com. All rights reserved.

No Result
View All Result
  • Home
  • Artificial Intelligence
  • ChatGPT
  • Data Science
  • Machine Learning
  • Crypto Coins
  • Contact Us

© 2024 Newsaiworld.com. All rights reserved.

Are you sure want to unlock this post?
Unlock left : 0
Are you sure want to cancel subscription?