- Support Vector Machines
- (lightning talk)
- (LPW '07) (john melesky)
- ---
- Presupposing:
- <ul>
- <li>You have a bunch of something.</li>
- <li>You can transform relevant attributes of those things into numbers.</li>
- <li>You can connect those numbers into vectors (think coordinates in an attribute space).</li>
- <li>You want to categorise them based on those numbers.</li>
- </ul>
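- A minimal sketch of the "things into vectors" step, in Python. The fruit data, the attribute names, and the `to_vector` helper are all invented for illustration.

```python
# Hypothetical example data: items and attribute values are made up.
fruits = [
    {"name": "apple", "weight_g": 150, "redness": 0.9},
    {"name": "lemon", "weight_g": 100, "redness": 0.1},
]

# The attributes we consider relevant; each one is an axis of the attribute space.
ATTRIBUTES = ["weight_g", "redness"]


def to_vector(item):
    """Turn one item into a tuple of numbers, one coordinate per attribute."""
    return tuple(float(item[attr]) for attr in ATTRIBUTES)


vectors = [to_vector(fruit) for fruit in fruits]
print(vectors)
```

- Each tuple is one point in a two-dimensional attribute space; more attributes just mean more coordinates.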
- ---
- The problem: find a line that separates these two categories of thing
- <img style="float: right;" src="basedata.png" />
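- Once you have a line, using it is cheap. A rough sketch, assuming the line is written as w.x + b = 0 (the weights `w`, offset `b`, and sample points here are made up):

```python
def side(w, b, x):
    """Return +1 or -1 depending on which side of the line w.x + b = 0 the point x is on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical separating line: x + y - 5 = 0
w, b = (1.0, 1.0), -5.0

print(side(w, b, (4.0, 4.0)))  # 1
print(side(w, b, (1.0, 1.0)))  # -1
```

- The hard part, and the subject of the rest of the talk, is choosing `w` and `b`.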
- ---
- For humans, this is easy.
- <img style="float: right;" src="cleansep.png" />
- For mathematicians, it's actually not too hard.
- ---
- For humans, this is easy.
- <img style="float: right;" src="cleansep.png" />
- For <del>mathematicians</del> computers, it's actually not too hard.
- ---
- There are two problems, though.
- ---
- Problem, the first:
- <img style="float: right;" src="badline1.png" />
- ---
- Problem, the first:
- <img style="float: right;" src="badline2.png" />
- ---
- Problem, the second:
- <img style="float: right;" src="badset1.png" />
- ---
- Problem, the second:
- <img style="float: right;" src="badset2.png" />
- ---
- Problem, the second:
- <img style="float: right;" src="badset3.png" />
- ---
- Conveniently, Support Vector Machines address both of the problems I've identified.
- ---
- Solution, the first:
- ---
- Solution, the first:
- <img style="float: right;" src="bordervectors.png" />
- <ul>
- <li>Create "border" vectors, parallel to each other, touching the outermost edge of each category dataset.</li>
- </ul>
- ---
- Solution, the first:
- <img style="float: right;" src="bordervectors.png" />
- <ul>
- <li>Create "border" vectors, parallel to each other, touching the outermost edge of each category dataset.</li>
- <li>As you add new items, ensure these "borders" stay parallel.</li>
- </ul>
- ---
- Solution, the first:
- <img style="float: right;" src="supportvectors.png" />
- <ul>
- <li>Create "border" vectors, parallel to each other, touching the outermost edge of each category dataset.</li>
- <li>As you add new items, ensure these "borders" stay parallel.</li>
- <li>Create your categorizing vector equidistant from your two "borders".</li>
- </ul>
- ---
- Solution, the first:
- <img style="float: right;" src="supportvectors.png" />
- <ul>
- <li>Create "border" vectors, parallel to each other, touching the outermost edge of each category dataset.</li>
- <li>As you add new items, ensure these "borders" stay parallel.</li>
- <li>Create your categorizing vector equidistant from your two "borders".</li>
- <li>The data points these "borders" touch are called "support vectors".</li>
- </ul>
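- The steps above can be sketched in Python for a direction chosen by hand. This is only the "borders and midline" part; a real SVM trainer also searches for the direction that pushes the borders as far apart as possible. The data points and the direction `w` are assumptions.

```python
def dot(u, v):
    """Plain dot product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))


def borders_and_separator(w, positives, negatives):
    """For a fixed direction w, return the offsets of the two parallel
    borders (w.x = offset) and of the line midway between them."""
    pos_border = min(dot(w, x) for x in positives)  # edge of the positive set
    neg_border = max(dot(w, x) for x in negatives)  # edge of the negative set
    if pos_border <= neg_border:
        raise ValueError("classes are not separable along this direction")
    midline = (pos_border + neg_border) / 2.0
    return pos_border, neg_border, midline


# Hypothetical, cleanly separable data.
positives = [(4.0, 4.0), (5.0, 3.0), (6.0, 5.0)]
negatives = [(1.0, 1.0), (2.0, 0.0), (0.0, 2.0)]
w = (1.0, 1.0)  # direction picked by hand, not optimized

pos_border, neg_border, midline = borders_and_separator(w, positives, negatives)

# The points that sit exactly on a border are the ones that pin it in place.
touching = [x for x in positives if dot(w, x) == pos_border]
touching += [x for x in negatives if dot(w, x) == neg_border]

print(pos_border, neg_border, midline)
```

- New items are categorized by which side of the midline they fall on: compare `dot(w, x)` with `midline`.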
- ---
- A joke:
- Q: How many mathematicians does it take to change a lightbulb?
- ---
- A joke:
- Q: How many mathematicians does it take to change a lightbulb?
- A: One, who hands it to 127 Londoners, thus reducing it to an earlier joke.
- ---
- A question:
- Q: How do mathematicians categorize non-linearly-separable data?
- ---
- A question:
- Q: How do mathematicians categorize non-linearly-separable data?
- A: Munge the data until it's linearly separable, thus reducing it to an earlier slide.
- ---
- A question:
- Q: How do mathematicians categorize non-linearly-separable data?
- A: Munge the data until it's linearly separable, thus reducing it to an earlier slide.
- Seriously. The munging is done using what are known as "kernel methods".
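- A toy illustration of the munging, in Python. The data and the `munge` function are invented for the example; real kernels get the same effect without building the new coordinates explicitly.

```python
# Hypothetical 1-D data: the "inner" class sits between the two halves of
# the "outer" class, so no single cut point on the number line separates them.
inner = [-1.0, 0.0, 1.0]
outer = [-3.0, -2.0, 2.0, 3.0]


def munge(x):
    """Map a 1-D point into 2-D by adding its square as a second coordinate."""
    return (x, x * x)


# In the munged space a horizontal line (second coordinate = 2.5) does the job.
THRESHOLD = 2.5
inner_below = all(munge(x)[1] < THRESHOLD for x in inner)
outer_above = all(munge(x)[1] > THRESHOLD for x in outer)

print(inner_below, outer_above)
```

- Same data, one extra dimension, and the earlier slide applies again.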
- ---
- Kernel Methods
- <ul>
- <li>Functions that munge data</li>
- <li>Very faintly magical (because I have no idea how they were derived)</li>
- <li>Require some skill to choose the right one for the problem</li>
- </ul>
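- For the curious, a sketch of three commonly used kernel functions in Python. The function names, default parameters, and test points are assumptions for illustration, not any particular library's API. The last part checks that a degree-2 polynomial kernel gives the same number as munging both points with `phi` and taking an ordinary dot product.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, degree=2, coef0=1.0):
    return (dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=0.5):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def phi(x):
    """Explicit munging that matches polynomial_kernel(degree=2, coef0=0) in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


u, v = (1.0, 2.0), (3.0, 4.0)
kernel_value = polynomial_kernel(u, v, degree=2, coef0=0.0)
mapped_value = dot(phi(u), phi(v))

print(kernel_value, mapped_value)
```

- The two values agree up to floating-point rounding, which is the trick: the kernel computes the dot product in the munged space without ever constructing it.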
- ---
- Kernel Methods + Support Vectors = Support Vector Machines
- ---
- In Perl:
- Algorithm::SVM - bindings to libsvm
- (Also wrapped by AI::Categorizer)