RemNote Community

Study Guide

📖 Core Concepts

- Mathematical Optimization – choosing the best element from a set of alternatives by maximizing or minimizing a real-valued objective function.
- Search space (feasible set) \(A\) – all candidate solutions that satisfy the constraints.
- Objective / loss / utility function \(f(x)\) – maps each candidate \(x\in A\) to a real number to be minimized, \(\min_{x\in A} f(x)\), or maximized, \(\max_{x\in A} f(x)\).
- Feasible solution – any \(x\) that belongs to \(A\).
- Optimal solution – a feasible solution that attains the smallest (or largest) objective value.
- Local vs. global optimum – a local optimum is best only within a neighbourhood; a global optimum is best over the entire feasible set.
- Convexity – if \(f\) is convex (and \(A\) is convex), any interior local minimum is automatically global.
- Problem families – discrete (integer, combinatorial), continuous (constrained/unconstrained), multi-objective, and global (multimodal).
- Optimality conditions – first-order (gradient/subgradient = 0, KKT for constraints) and second-order (Hessian definiteness).

📌 Must Remember

- Standard forms: \(\min_{x\in A} f(x)\) or \(\max_{x\in A} f(x)\).
- Convex ⇒ local = global (interior points).
- KKT conditions are necessary for optimality in problems with equality/inequality constraints (under regularity conditions).
- Fermat's theorem: for unconstrained problems, an interior local extremum of a differentiable function is a stationary point, \(\nabla f(x)=0\).
- Extreme Value Theorem: a continuous function on a compact set attains its min and max.
- The simplex algorithm solves linear programs exactly in a finite number of pivots.
- Newton's method gives quadratic convergence near the solution when the Hessian is positive-definite.
- BFGS and other quasi-Newton methods achieve superlinear convergence without forming the full Hessian.
- Pareto optimality: no other feasible point improves one objective without worsening another.

🔄 Key Processes

Formulating an Optimization Problem
1. Identify the decision variables \(x\).
2. Write the objective \(f(x)\) (min or max).
3. List the constraints (equalities/inequalities) that define \(A\).

Checking Convexity
1. Verify \(f\) is convex (second derivative \(\ge 0\), or Hessian PSD).
2. Verify the feasible set \(A\) is convex (linear/convex constraints).

Applying First-Order Optimality
1. Compute the gradient \(\nabla f(x)\).
2. Set \(\nabla f(x)=0\) for unconstrained problems, or construct the Lagrangian \(L(x,\lambda,\mu)\) for constrained ones.
3. Solve the KKT system.

Second-Order Test
1. Form the Hessian \(H = \nabla^2 f(x)\).
2. Positive-definite ⇒ local min; negative-definite ⇒ local max; indefinite ⇒ saddle.

Algorithm Selection Workflow
- Is the problem linear? → Simplex or interior-point LP solver.
- Convex quadratic? → Interior-point QP or active-set methods.
- General convex? → Gradient-based methods (proximal, ADMM) or interior-point.
- Non-convex/discrete? → Heuristics (GA, SA, DE) plus local refinement.

🔍 Key Comparisons

- Linear Programming vs. Integer Programming – LP: continuous variables, polyhedral feasible region, solved exactly. IP: some/all variables integer → non-convex feasible set, NP-hard, often solved with branch-and-bound.
- Gradient Descent vs. Newton's Method – GD: uses \(-\nabla f\) only, linear convergence, cheap per iteration. Newton: uses \(-H^{-1}\nabla f\), quadratic convergence, expensive Hessian inversion.
- Local Search (e.g., hill-climbing) vs. Global Heuristics (e.g., simulated annealing) – local: fast, but may get trapped in local minima. Global: stochastic moves allow escape; slower, but better for multimodal problems.
- KKT Conditions vs. Lagrange Multipliers – Lagrange multipliers handle equality constraints only; KKT extends to inequality constraints with complementary slackness.

⚠️ Common Misunderstandings

- “Convex ⇒ easy” – convex problems are polynomial-time solvable, but large-scale instances still need careful algorithm choice.
- “Gradient = 0 guarantees a minimum” – it could be a maximum or a saddle point; a second-order test is required.
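The second-order test and the “gradient = 0” pitfall can be checked numerically. A minimal sketch, assuming NumPy and a toy function \(f(x,y)=x^2-y^2\) chosen only for illustration (the origin is stationary but not a minimum):

```python
import numpy as np

# Second-order test at the stationary point (0, 0) of f(x, y) = x^2 - y^2.
# The gradient (2x, -2y) vanishes there, but the Hessian diag(2, -2)
# is indefinite, so the point is a saddle, not a minimum.

H = np.array([[2.0, 0.0],
              [0.0, -2.0]])  # Hessian of x^2 - y^2 (constant everywhere)

eigvals = np.linalg.eigvalsh(H)  # eigenvalues of the symmetric Hessian
if np.all(eigvals > 0):
    verdict = "local minimum"    # positive-definite
elif np.all(eigvals < 0):
    verdict = "local maximum"    # negative-definite
else:
    verdict = "saddle point"     # indefinite

print(verdict)  # saddle point
```

Checking the Hessian's eigenvalue signs is exactly the definiteness test from the process above; for large problems one would test definiteness via a Cholesky attempt instead of a full eigendecomposition.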
- “Global optimum = best solution found by any heuristic” – heuristics provide no guarantee; only deterministic global methods do.
- “Feasibility implies optimality” – a feasible point may be far from optimal; it must still satisfy the optimality conditions.

🧠 Mental Models / Intuition

- “Landscape view” – imagine \(f(x)\) as a terrain; the gradient points uphill, its negative downhill. A convex terrain has a single basin (the global minimum).
- “Convex feasibility” – for convex constraints, any line segment between two feasible points stays feasible.
- “Pareto frontier as a skyline” – each point on the skyline cannot be dominated; moving along the frontier trades one objective for another.

🚩 Exceptions & Edge Cases

- Non-convex but locally convex – an interior local minimum may still be global if the convex region is isolated.
- Boundary optima – the KKT first-order conditions involve active constraints; the gradient need not vanish in the full space.
- Degenerate Hessian – positive semidefinite (not definite) gives a flat direction; additional criteria are needed.
- Discrete feasible set – gradients and Hessians lose meaning; rely on combinatorial algorithms or relaxations.

📍 When to Use Which

- Linear objective & linear constraints → Simplex or interior-point LP solver.
- Quadratic objective, linear constraints → quadratic programming (active-set or interior-point).
- Convex but non-quadratic → gradient-based methods (projected gradient, proximal) or interior-point methods.
- Non-convex continuous → multiple random starts with a local optimizer (BFGS, Newton) plus a global heuristic (GA, SA) for exploration.
- Integer/combinatorial → branch-and-bound, cutting planes, or problem-specific combinatorial algorithms.
- Multi-objective → compute the Pareto set via weighted-sum, ε-constraint, or evolutionary multi-objective algorithms (NSGA-II).

👀 Patterns to Recognize

- “All constraints linear → convex polyhedron” → LP or IP depending on variable type.
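The “all constraints linear” pattern can be illustrated with a tiny linear program. A sketch assuming SciPy is available; the problem data are invented for illustration:

```python
from scipy.optimize import linprog

# LP pattern: linear objective, linear constraints -> solvable exactly.
# Toy problem: maximize x + 2y  subject to  x + y <= 4,  x <= 2,  x, y >= 0.
# linprog minimizes, so we negate the objective.
c = [-1.0, -2.0]
A_ub = [[1.0, 1.0],   # x + y <= 4
        [1.0, 0.0]]   # x     <= 2
b_ub = [4.0, 2.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)  # optimum lies at a vertex of the polyhedron: (0, 4), value 8
```

As the cheat sheet notes, the optimum of an LP is attained at a vertex of the feasible polyhedron; here pushing all capacity into \(y\) dominates any mix. If \(x, y\) were required to be integers, the same model would become an IP and need branch-and-bound instead.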
- “Quadratic term only in the objective, linear constraints” → convex QP if the Hessian is PSD.
- “Inequality constraints + stationarity” → check KKT complementary slackness.
- “Multiple local minima + rugged landscape” → likely multimodal → needs global/metaheuristic methods.
- “Objective separable into a sum of terms” → coordinate descent or ADMM may be efficient.

🗂️ Exam Traps

- Confusing local with global optimum – the exam may ask which statement is always true; only convex interior optima are guaranteed global.
- Misapplying KKT without checking regularity – complementary slackness fails if constraint qualifications are violated.
- Assuming gradient descent always converges quickly – the step-size choice is critical; too large → divergence, too small → extremely slow progress.
- Choosing Simplex for a non-linear problem – Simplex solves linear programs only; a non-linear convex problem needs interior-point or other convex solvers.
- Treating integer variables as continuous – relaxation gives a bound but not a feasible integer solution; answer choices that ignore integrality are distractors.

---

Keep this sheet handy. Review each bullet before the exam, and practice identifying which bullet applies to a given problem statement.
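To see the step-size exam trap concretely, here is a minimal pure-Python sketch on the simplest possible objective (the function \(f(x)=x^2\) and the step sizes are assumptions for illustration):

```python
# Step-size sensitivity of gradient descent on f(x) = x^2, with f'(x) = 2x.
# The update x <- x - step * 2x multiplies x by (1 - 2*step) each iteration,
# so it contracts only when |1 - 2*step| < 1, i.e. for 0 < step < 1.

def run_gd(step, x0=1.0, iters=50):
    x = x0
    for _ in range(iters):
        x -= step * 2.0 * x  # gradient step on f(x) = x^2
    return x

good = run_gd(step=0.4)   # factor |1 - 0.8| = 0.2  -> converges fast
slow = run_gd(step=0.01)  # factor |1 - 0.02| = 0.98 -> barely moves in 50 steps
bad  = run_gd(step=1.1)   # factor |1 - 2.2| = 1.2  -> diverges
print(abs(good), abs(slow), abs(bad))
```

The same three regimes (fast convergence, crawl, divergence) occur for any smooth objective; line search or a step size below \(2/L\) for an \(L\)-smooth function is the standard safeguard.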