Approximation Algorithms
(近似アルゴリズム)
Data Structures and Algorithms
15th lecture, January 19, 2017
http://www.sw.it.aoyama.ac.jp/2016/DA/lecture15.html
Martin J. Dürst
© 2009-16 Martin J. Dürst, Aoyama Gakuin University
Today's Schedule
- Leftovers and summary of last lecture
- How to deal with NP problems
- Approximation algorithms
- Schedule from now on
Summary of Last Lecture
- There are problems that cannot (yet?) be solved in polynomial time
- Many of them are NP-complete (or NP-hard) problems
- There may be problems in NP that are neither in P nor NP-complete (a candidate example is graph isomorphism)
- It is important to recognize difficult (NP-complete) problems early
- There are problems that are even more difficult than NP-complete
problems
The Importance of Dealing with NP Problems
- There are many NP-complete and NP-hard problems
- Many of these problems have important practical applications
- No efficient algorithms are known
- A practical approach is needed
Strategies for Dealing with NP Problems
- Limit or change problem
- Concentrate on actual data
- Design and use an approximation algorithm
There are some general solutions, but mostly, each problem has to be
addressed separately.
Changing NP Problems
Example: Traveling salesman
- General problem: Arbitrary cost structure
- Simplification: Costs are planar distances (straight-line distances in the plane)
- Variant: Costs satisfy the triangle inequality (AB + BC ≧ AC)
(there is a polynomial-time algorithm that solves this variant to within 1.5 times the cost of the optimal solution)
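The 1.5 bound mentioned above is achieved by Christofides' algorithm. As a simpler, hedged illustration of why the triangle inequality helps, the following Python sketch implements the classic 2-approximation that walks a minimum spanning tree in preorder; the function and variable names are chosen for this example only.

# Sketch: 2-approximation for the traveling salesman problem when the
# costs satisfy the triangle inequality. Build a minimum spanning tree
# and visit the vertices in preorder; the triangle inequality guarantees
# that the resulting tour costs at most twice the optimal tour.
import heapq

def mst_preorder_tour(dist):
    """dist: symmetric cost matrix satisfying the triangle inequality."""
    n = len(dist)
    # Prim's algorithm for a minimum spanning tree rooted at vertex 0
    in_tree = [False] * n
    children = [[] for _ in range(n)]
    heap = [(0, 0, 0)]                    # (cost, vertex, parent)
    while heap:
        cost, v, parent = heapq.heappop(heap)
        if in_tree[v]:
            continue
        in_tree[v] = True
        if v != parent:
            children[parent].append(v)
        for w in range(n):
            if not in_tree[w]:
                heapq.heappush(heap, (dist[v][w], w, v))
    # A preorder walk of the tree gives the order in which to visit the vertices
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    return tour

# Example with four vertices whose costs satisfy the triangle inequality;
# the returned order (plus the edge back to vertex 0) is a tour whose
# cost is at most 2 times the optimum.
d = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
print(mst_preorder_tour(d))               # [0, 1, 2, 3]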
Deal with Actual Data
Example: 3-SAT
3-SAT instances can be classified into three main categories:
- The number of clauses is high relative to the number of variables
→ It may be easy to show that there is no solution
- The number of clauses is low relative to the number of variables
→ It may be easy to find a solution
- The number of clauses is medium relative to the number of variables
→ This is the really difficult case
With many careful optimizations and tricks, practical use is possible.
Competition: http://www.satcompetition.org
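As a hedged illustration of these three categories, the following Python sketch generates random 3-SAT instances with a chosen ratio of clauses to variables and checks them by brute force. This is only feasible for very small numbers of variables; the solvers in the competition above use far more sophisticated techniques. All names and parameter values are chosen for this example.

# Sketch: random 3-SAT instances with a given number of clauses per variable,
# checked by exhaustive search (exponential; for illustration only).
import itertools, random

def random_3sat(n_vars, n_clauses):
    """A clause is a tuple of 3 literals; literal k means variable |k|, negated if k < 0."""
    clauses = []
    for _ in range(n_clauses):
        chosen = random.sample(range(1, n_vars + 1), 3)
        clauses.append(tuple(v if random.random() < 0.5 else -v for v in chosen))
    return clauses

def satisfiable(n_vars, clauses):
    for bits in itertools.product([False, True], repeat=n_vars):
        assignment = dict(enumerate(bits, start=1))
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

# Few clauses per variable: usually satisfiable; many clauses: usually not;
# the interesting (difficult) instances lie in between.
n = 12
for ratio in (2, 4, 6):
    sat = sum(satisfiable(n, random_3sat(n, ratio * n)) for _ in range(20))
    print(f"{ratio} clauses per variable: {sat}/20 instances satisfiable")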
Approximation Algorithms
- Even if finding a perfect solution is impossible, it may still be
desirable to find a close-to-optimal solution
- An algorithm producing an approximate (close to optimal) solution is
called an approximation algorithm
- For many approximation algorithms, there may be a guarantee on how much the solution can differ from the optimum
- There are problem-specific approximation algorithms
- There are also general approximation algorithms (algorithm design
strategies)
- Hill climbing
- Simulated annealing
- Genetic algorithms
- ...
Problem-specific Approximation Algorithms
Example problem: Load balancing
Using m identical machines, finish (as quickly as possible) n tasks that each take time tᵢ to complete
- Algorithm 1:
- Assign the next task to a machine as soon as the previous task
finishes
- The overall time is guaranteed to be ≦ two times the optimal
solution time
- Algorithm 2:
- Process the tasks in decreasing order of duration, assigning the next task to a machine as soon as the previous task finishes
- The overall time is guaranteed to be ≦ 1.5 times the optimal
solution time
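Both algorithms can be sketched in a few lines of Python: a heap keeps, for each machine, the time at which it becomes free, and Algorithm 2 is simply the same greedy rule applied to the tasks sorted in decreasing order. This is a minimal illustration under those assumptions; the function names are chosen for this example.

# Sketch: greedy load balancing on m identical machines.
# Algorithm 1: always give the next task to the machine that becomes free first.
# Algorithm 2: the same rule, but process the tasks in decreasing order.
import heapq

def greedy_load_balance(times, m):
    """Return the overall finishing time (makespan) of the greedy assignment."""
    loads = [0] * m                       # finishing time of each machine
    heapq.heapify(loads)
    for t in times:
        earliest = heapq.heappop(loads)   # machine that becomes free first
        heapq.heappush(loads, earliest + t)
    return max(loads)

def lpt_load_balance(times, m):
    """Algorithm 2: sort the tasks in decreasing order first."""
    return greedy_load_balance(sorted(times, reverse=True), m)

tasks = [2, 3, 4, 6, 2, 2]
print(greedy_load_balance(tasks, 3))      # 8, guaranteed ≦ 2 × optimum
print(lpt_load_balance(tasks, 3))         # 7, guaranteed ≦ 1.5 × optimum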
Hill Climbing
- Start with a (maybe highly non-optimal) solution
- Produce solutions close to the current solution,
and select the best one among these
- Repeat until no improvement is possible anymore
- Problem: The search can get stuck in a local optimum, which plain hill climbing cannot escape
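A generic Python sketch of this loop; the neighborhood and scoring functions are problem-specific, and the ones in the example below are purely hypothetical.

# Sketch: generic hill climbing.
# Move to the best neighboring solution until no neighbor is an improvement.
def hill_climb(start, neighbors, score):
    current = start
    while True:
        best = max(neighbors(current), key=score, default=None)
        if best is None or score(best) <= score(current):
            return current                # local optimum reached
        current = best

# Hypothetical example: maximize -(x - 3)^2 over the integers, stepping by ±1.
result = hill_climb(0,
                    neighbors=lambda x: [x - 1, x + 1],
                    score=lambda x: -(x - 3) ** 2)
print(result)                             # 3 (here the local optimum is also global)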
Simulated Annealing
- Origin of the name: crystal production by carefully lowering the temperature
- Start with an arbitrary solution
- Randomly change the solution to produce new solution candidates
- Always keep new, better solutions
- Keep some of the not-so-good solutions, too (with a probability that shrinks as the temperature is lowered)
- Repeat but slowly reduce the amount of random change (corresponds to
reducing the temperature)
- Stop after a certain number of steps or when there is no further
improvement of solutions
- Use the overall best solution as the output of the algorithm
- Problems:
- Tuning is necessary for each problem (speed of lowering temperature,
...)
- Solutions cannot be combined
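A minimal Python sketch of this procedure, assuming a geometric cooling schedule and the usual rule of accepting a worse solution with probability exp(-Δ/T); here the temperature controls the acceptance of worse solutions rather than the size of the random change. The objective function and all parameter values are hypothetical and would need problem-specific tuning.

# Sketch: simulated annealing for minimizing a cost function.
import math, random

def simulated_annealing(start, neighbor, cost, t_start=10.0, t_end=0.01, alpha=0.95):
    current = best = start
    t = t_start
    while t > t_end:
        candidate = neighbor(current)     # random change of the current solution
        delta = cost(candidate) - cost(current)
        # always keep better solutions; keep worse ones with probability exp(-delta/t)
        if delta < 0 or random.random() < math.exp(-delta / t):
            current = candidate
        if cost(current) < cost(best):
            best = current                # remember the overall best solution
        t *= alpha                        # slowly lower the "temperature"
    return best

# Hypothetical example: a one-dimensional cost function with several local minima.
f = lambda x: (x - 3) ** 2 + math.sin(5 * x)
print(simulated_annealing(0.0, lambda x: x + random.uniform(-1, 1), f))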
Genetic Algorithms
- Using concepts from evolution theory
- Solution details are interpreted as genetic information
- Start with multiple randomly generated "solutions" as the first
generation
- To get from one generation to the next:
- Combine information from two (or more) different solutions to get a
new solution (corresponds to sexual reproduction)
- Move information pieces around inside the new solution (corresponds
to crossover)
- Modify information randomly (corresponds to mutation)
- In each generation, produce a lot of new solutions,
and delete (mostly) the less optimal ones (corresponds to natural
selection)
- Stop after a certain number of generations or when there is no further
improvement of solutions
- Use the overall best solution as the output of the algorithm
- Problems:
- How to combine solutions
- Parameter choice (number of generations, number of solutions per generation, ...)
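A toy Python sketch of these steps, assuming the simplest possible setting: a solution is a bit string and its fitness is the number of 1 bits ("one-max"). The combination, mutation, and selection steps are deliberately simplistic, and all names and parameter values are chosen for this example.

# Sketch: a tiny genetic algorithm for a toy problem
# (maximize the number of 1 bits in a fixed-length bit string).
import random

BITS, POP, GENERATIONS, MUTATION_RATE = 20, 30, 40, 0.02

fitness = sum                             # fitness = number of 1 bits

def combine(a, b):
    """Combine the information of two parent solutions at a random cut point."""
    cut = random.randrange(1, BITS)
    return a[:cut] + b[cut:]

def mutate(bits):
    """Randomly flip some bits (mutation)."""
    return [1 - b if random.random() < MUTATION_RATE else b for b in bits]

# First generation: randomly generated "solutions"
population = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]
for _ in range(GENERATIONS):
    # selection: keep the better half of the population as parents
    parents = sorted(population, key=fitness, reverse=True)[:POP // 2]
    # reproduction: combine two random parents, then mutate the result
    children = [mutate(combine(*random.sample(parents, 2))) for _ in range(POP)]
    # next generation: the best POP solutions among parents and children
    population = sorted(parents + children, key=fitness, reverse=True)[:POP]

best = max(population, key=fitness)
print(best, fitness(best))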
Summary
Approaches to deal with "intractable problems" that cannot be solved perfectly:
- Change or limit the problem
- Concentrate on actual data
- Design and use an approximation algorithm
- Find an approximate solution:
- Problem-specific algorithms
- Hill climbing
- Simulated annealing
- Genetic algorithms, ...
Schedule from Now On
- Spring term of junior year (3rd year): Language Theory and Compilers
- Senior year (4th year): Bachelor Research
- January 28 (Friday) 9:30~10:55: Term Final Exam
Glossary
- graph isomorphism
- グラフ同型
- approximation algorithm
- 近似アルゴリズム
- planar distance
- 平面距離 (平面上の直線の距離)
- triangle inequality
- 三角不等式
- hill climbing
- 山登り法
- simulated annealing
- 焼き鈍し法、シミュレーテッドアニーリング
- genetic algorithm
- 遺伝的アルゴリズム
- load balancing
- ロード・バランシング
- local optimum
- 局所的な最適解
- crystal
- 結晶
- evolution theory
- 進化論
- genetic information
- 遺伝的情報
- generation
- 世代
- sexual reproduction
- 有性生殖
- crossover
- 交叉 (交差、組み替え)
- mutation
- 突然変異
- natural selection
- 自然淘汰 (自然選択)