first post!

john melesky 8 years ago
parent
commit
b71573e455

+ 11 - 0
css/default.css

@@ -95,6 +95,17 @@ div.figure {
     float: right;
 }
 
+img {
+    margin: 3px;
+    float: right;
+}
+
+div.sourceCode {
+    margin-left: 10%;
+    margin-right: 10%;
+    background-color: #ffffff;
+}
+
 div.caption {
     color: #335555;
     font-size: 10px;

BIN
images/post_2016_08_25/civ_voronoi.png


BIN
images/post_2016_08_25/dandd_voronoi.png


BIN
images/post_2016_08_25/euclid_voronoi.png


BIN
images/post_2016_08_25/manhattan_voronoi.png


BIN
images/post_2016_08_25/odd_dandd_voronoi.png


BIN
images/post_2016_08_25/taxi_voronoi.png


+ 266 - 3
posts/2016-08-25-different-distances.md

@@ -5,8 +5,12 @@ description: A few different ways of calculating distance in noncontinuous 2d sp
 ---
 
 How far is it from one place to another? How does that change if
-you're on a flat plane? What if that plane has discrete distances
-(like a checkerboard, or Civ map)?
+you're on a flat plane? What if that plane has discrete distances,
+like a checkerboard, or Civ map, or pixels on a screen?
+
+If we've got two points, *how far do you have to move to get from one
+to the other?*
+
 
 ### Euclidean Distance
 
@@ -18,19 +22,278 @@ higher-dimensional spaces, too.], [Euclidean
 distance](https://en.wikipedia.org/wiki/Euclidean_distance) is the
 shortest distance between two points.
 
+In the most general sense, the way you get this distance is by
+measuring it. If you're lucky enough to be using [Cartesian
+coordinates](https://en.wikipedia.org/wiki/Cartesian_coordinate_system),
+then you can calculate the distance based on the distance between the
+two coordinate points. For our purposes, I'm going to use $a$ and $b$,
+located at $(x_1, y_1)$ and $(x_2, y_2)$.
 
+[Pythagoras](https://en.wikipedia.org/wiki/Pythagorean_theorem)
+figured out how to calculate the length of the hypotenuse of a right
+triangle. The fun part is that, with Cartesian coordinates, *any pair
+of points* can now define a right triangle. Just add a third point
+that takes its x coordinate from one point and its y coordinate from
+the other, and the triangle can be defined as $((x_1, y_1), (x_1, y_2),
+(x_2, y_2))$^[Or $((x_1, y_1), (x_2, y_1), (x_2, y_2))$ if you're
+feeling frisky].
+
+At that point, it becomes pretty easy. The length of the two sides is
+just the absolute value of the difference between their x and y
+coordinates. So, $|x_1 - x_2|$ and $|y_1 - y_2|$. Those are easy to
+plug into the Pythagorean equation to get the distance between the
+points ($d$):
 
 $$d = \sqrt{ |x_1 - x_2|^2 + |y_1 - y_2|^2 }$$
 
+In code, this might look something like:
+
+~~~ { .python }
+import math
+
+def eucliddistance(x1, y1, x2, y2):
+    # lengths of the two legs of the right triangle
+    xdist = abs(x1 - x2)
+    ydist = abs(y1 - y2)
+    return math.sqrt( (xdist ** 2) + (ydist ** 2) )
+~~~
+
+This all seems a bit unnecessary, though, right? Cartesian coordinates
+are universal. They're how we think about points. All of our screens
+are Cartesian planes, and even our 3d worlds are Cartesian. Done and
+done.
+
+Still, I think it's good to appreciate how we got here. Why? **Because
+it doesn't work for our purposes.**
+
+[![A Euclidean distance Voronoi diagram](/images/post_2016_08_25/euclid_voronoi.png "A Euclidean distance Voronoi diagram"){width=150 height=150}](/images/post_2016_08_25/euclid_voronoi.png)
+
+A "point", to Euclid, is infinitesimal -- it has no size, nor
+shape. And a Euclidean plane is continuous -- there is no "smallest
+distance" you can move.
+
+Pixels, and graph squares, on the other hand, take up space. And if
+you want to get from $(0,0)$ to $(7,14)$ on a discrete plane, there's
+no such thing as moving 15.65247... That's the nature of a discrete
+space -- everything's an integer. And the other fun bit is that you
+can't move in arbitrary directions. On a square grid, you've only got
+8 directions to go, at most.
+
+It is also, frankly, a bit costly to calculate. Square roots are a bit
+more complicated than adding and subtracting.
+
+Still, Euclidean distance is the definition of distance. And though we
+can't use it to determine distance of movement, we can still use it
+for comparative purposes.
+
+#### Voronoi Diagrams
+
+For illustration purposes, I've decided to use [Voronoi
+diagrams](https://en.wikipedia.org/wiki/Voronoi_diagram). Voronoi
+diagrams are a way of splitting up planes around a set of sites, such
+that all points in the plane are associated with the closest site.
+
+The simplest, slowest possible way to generate such a diagram is to
+iterate over each pixel in an image, calculate the distance between
+that pixel and each site, sort the sites by distance, and assign the
+closest site to the pixel. So I coded that up. I've put that code [up on
+github](https://github.com/jmelesky/voronoi_example). Nothing fancy
+going on, I assure you.
+
+By looking at the Voronoi diagrams for the same set of sites, using
+different distance measures, we can get an appreciation for how those
+measures vary, and what their characteristics are.
 
 
 ### Manhattan Distance
 
+[![A taxicab distance Voronoi diagram](/images/post_2016_08_25/taxi_voronoi.png "A taxicab distance Voronoi diagram"){width=150 height=150}](/images/post_2016_08_25/taxi_voronoi.png)
+
+The island of Manhattan is densely populated, and densely built. For
+our purposes, though, it's also, notably, laid out like a grid. If you
+want to get two blocks up and two blocks over, you're going to have to
+move four total block lengths to get there. And that's a pretty
+straightforward distance measure.
+
+It's also known as taxicab distance, rectilinear distance, and a bunch
+of other things. And it's the simplest calculation we're going to be
+making. Just add the lengths of each side of our triangle:
+
+$$d = |x_1 - x_2| + |y_1 - y_2|$$
+
+In code, it might look like this:
+
+~~~ { .python }
+def taxidistance(x1, y1, x2, y2):
+    return (abs(x1 - x2) + abs(y1 - y2))
+~~~
+
+If you're on a checkerboard grid, and you are limited in movement to
+only the four cardinal directions, this is how you're going to measure
+getting from place to place. If you want to get from $(0,0)$ to
+$(7,14)$ in a taxi, you're going to go 21 squares, which is a fair bit
+further than 15.65.
+
+If you take a look at the Voronoi diagram derived from taxi distance,
+you'll see that the shapes have changed from straight-edged polygons
+to odd, almost art-deco-like shapes. In particular, some of the places
+where there were very long, thin cells in the Euclidean version have
+changed to blocky, angled, and often much distorted versions.
+
 ### Civ Distance
 
+[![A 'Civ' distance Voronoi diagram](/images/post_2016_08_25/civ_voronoi.png "A 'Civ' distance Voronoi diagram"){width=150 height=150}](/images/post_2016_08_25/civ_voronoi.png)
+
+If Manhattan distance is what you reckon with when you can only move
+in the four cardinal directions, what do you do when you can move in
+all 8 possible directions? Up until the most recent edition, the
+[Civilization games](https://en.wikipedia.org/wiki/Civilization_%28series%29) had
+exactly that scenario[^1]. So let's call this the Civ distance.
+
+Calculating this one is also easy. Instead of adding the sides
+together, you just take the length of the longest side.
+
+How's that work, exactly? Well, if you're going from $(0,0)$ to
+$(7,14)$, you go at an angle for 7 squares (moving twice as fast as
+taxicabs). At that point, you're at $(7,7)$, and it's a straight shot
+to $(7,14)$ only 7 squares away. Just 14 squares total.
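
One way to convince yourself of that count is to actually walk the path. The sketch below is a hypothetical helper (`civ_steps` is not part of the original code) that moves one square at a time, diagonally whenever both coordinates still need to change, and counts the moves.

```python
def civ_steps(x1, y1, x2, y2):
    """Count moves when each step may change x and y by at most one square."""
    x, y, steps = x1, y1, 0
    while (x, y) != (x2, y2):
        # step toward the target on each axis: -1, 0, or +1
        x += (x2 > x) - (x2 < x)
        y += (y2 > y) - (y2 < y)
        steps += 1
    return steps

print(civ_steps(0, 0, 7, 14))  # 14
```

The walk takes 7 diagonal steps followed by 7 straight ones, which is just the longer of the two sides.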
+
+Mathematically, it looks like this:
+
+$$d = \max(|x_1 - x_2|, |y_1 - y_2|)$$
+
+And in code, something like this:
+
+~~~ {.python}
+def civdistance(x1, y1, x2, y2):
+    return (max(abs(x1 - x2), abs(y1 - y2)))
+~~~
+
+Again, very simple. And 14 is actually shorter than 15.65. By taxi, up
+and over is a distance of two; by Euclid, that same distance is 1.41
+and change. But by Civ distance, that same span is only one. That, of
+course, leads to some different distortions in the Voronoi
+diagram. Where the taxicab Voronoi cells stretched up, down, left, and
+right, the Civ cells are stretching further in the diagonals.
+
+Overall, Civ distance is truer to Euclid than taking the taxi, but
+it's still a very different beast.
+
 ### D&D Distance
 
-This is a bit more complex.
+Perhaps you've played some Dungeons & Dragons. Perhaps you've
+specifically played 3.5ed^[Or its offshoot, Pathfinder; or, for that
+matter, any of the d20 games that also derived from that source]. That
+game was notorious for its complex, tactical, grid-based combat
+system. Each square represented 5 feet of space, and each character
+had a limited amount of movement per turn.
+
+[![A D&D distance Voronoi diagram](/images/post_2016_08_25/dandd_voronoi.png "A D&D distance Voronoi diagram"){width=150 height=150}](/images/post_2016_08_25/dandd_voronoi.png)
+
+Now, it didn't make sense to allow only movement in the four cardinal
+directions. That would, if nothing else, be incredibly frustrating for
+players trying to move their characters around the map.
+
+They also didn't want to allow willy-nilly Civ-style movement, since
+that would offer a clear advantage to anyone who wished to use the
+diagonals[^4ed].
+
+Ultimately, they decided on an iterative process which was a bit more
+complex:
+
+  - The first time (and every odd time) you move a diagonal, it costs
+    5 feet of movement/distance, just like moving in a cardinal
+    direction
+
+  - The second time (and every even time) you move a diagonal, it
+    costs 10 feet of movement/distance, like you're taking a taxicab
+
+The net result is that, if you start out moving diagonally only, it's
+5 feet, then 15, then 20, then 30, then 35, then 45, etc. Translating
+feet to grid square units, it's 1 unit, then 3, then 4, then 6, then
+7, then 9.
+
+That pattern, as it turns out, is the floor of 1.5 * the diagonal
+distance.
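
A quick way to check that claim is to apply the alternating rule move by move and compare it against the closed form. This is only a sketch: `diagonal_cost` is a name invented here, and costs are in grid squares rather than feet.

```python
def diagonal_cost(n):
    """Cost, in grid squares, of n consecutive diagonal moves under the 3.5 rule."""
    cost = 0
    for move in range(1, n + 1):
        # odd-numbered diagonals cost 1 square, even-numbered ones cost 2
        cost += 1 if move % 2 == 1 else 2
    return cost

print([diagonal_cost(n) for n in range(1, 7)])  # [1, 3, 4, 6, 7, 9]

# the iterative rule agrees with floor(1.5 * n), written here as (3 * n) // 2
assert all(diagonal_cost(n) == (3 * n) // 2 for n in range(100))
```

Writing the floor as `(3 * n) // 2` keeps everything in integer arithmetic.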
+
+To generalize that to more than just the purely diagonal, we figure
+out how much we can go straight, then do the rest diagonal, and add
+them together:
+
+$$d_1 = \max(|x_1 - x_2|, |y_1 - y_2|) - \min(|x_1 - x_2|, |y_1 - y_2|)$$
+$$d_2 = \lfloor 1.5 \times (\min(|x_1 - x_2|, |y_1 - y_2|))\rfloor$$
+$$d  = d_1 + d_2$$
+
+And in code, that looks like this[^odd]:
+
+~~~ {.python}
+def dandddistance(x1, y1, x2, y2):
+    mindist = min(abs(x1 - x2), abs(y1 - y2))
+    maxdist = max(abs(x1 - x2), abs(y1 - y2))
+    return ((maxdist - mindist) + (1.5 * mindist))
+~~~
+
+Looking at the Voronoi diagram, this is easily the closest match to
+Euclid. And that's not surprising, really. Where Euclid thinks up and
+over is 1.41 and change, and taxicabs and Civ think it's 2 and 1,
+respectively, D&D thinks it's (about) 1.5, which is much, much closer
+to the continuous distance. To get from $(0,0)$ to $(7,14)$ by way of
+D&D distance is 17, which is pretty close to the 15.65 of Euclid, and
+without the distortions of Civ.
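
To see the four measures side by side, here is a small sketch that computes each one for the same pair of points. The `compare` helper is hypothetical, and its D&D entry uses the floored formula from the math above rather than the un-floored code.

```python
import math

def compare(x1, y1, x2, y2):
    """Return all four distance measures for one pair of grid points."""
    dx, dy = abs(x1 - x2), abs(y1 - y2)
    return {
        "euclid": round(math.hypot(dx, dy), 2),
        "taxi": dx + dy,
        "civ": max(dx, dy),
        # straight moves plus floor(1.5 * diagonal moves)
        "dandd": (max(dx, dy) - min(dx, dy)) + (3 * min(dx, dy)) // 2,
    }

print(compare(0, 0, 7, 14))
# {'euclid': 15.65, 'taxi': 21, 'civ': 14, 'dandd': 17}
```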
+
+Of course, it's a bit more complex to calculate than Civ or taxi, but
+it doesn't require taking any square roots, which marks it as still
+computationally simpler than Euclid. And it's grid-native. The
+designers of 3.5 did a good job finding a relatively simple
+approximation for distance.
+
+### Conclusion
+
+And here we are at the end. Hopefully getting from point $a$ to point
+$b$ wasn't too rough, and I know I learned a thing or two on the way.
+
+Things get more complicated when you start measuring things in three
+dimensions. The D&D algorithm, particularly, is unlikely to scale well
+as you end up figuring out the appropriate values for each of the 26
+different directions you can go. Maybe I'll revisit this again.
+
+And let's not forget non-spatial distances! [Hamming
+distance](https://en.wikipedia.org/wiki/Hamming_distance) is
+particularly fun to think about. It has uses in small-scale AI tasks
+like autocorrect, as well as in very-large-dimension binary vector
+spaces[^duh]. I guess I'm saying that distances are fascinating
+things, and there's so much more to talk about.
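
For a flavor of how simple Hamming distance is, here is a minimal sketch over two equal-length sequences of binary features. The function name and the sample vectors are made up for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

# two 4-feature binary vectors that disagree in the 2nd and 4th positions
print(hamming([1, 0, 1, 1], [1, 1, 1, 0]))  # 2
```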
+
+Thanks for bearing with me.
+
+
+[^1]: For an even longer time, the king piece on a chessboard had the
+exact same abilities, which is why this is probably more commonly
+known as "chessboard distance", or [Chebyshev
+distance](https://en.wikipedia.org/wiki/Chebyshev_distance), after
+Pafnuty Chebyshev, a 19th-century Russian mathematician. But I've
+played more Civ than chess, frankly.
+
+[^4ed]: When it came time to publish the 4th edition of D&D, however,
+they changed their mind on this. The complexity of the 3.5 distance
+calculation wasn't worth the verisimilitude, so they moved to using
+Civ distances instead. This decision also led to the infamous [square
+fireballs](http://diceofdoom.com/blog/2009/10/powergaming-understanding-area-of-effect-in-dnd4e/),
+but no system is perfect.
 
+[^odd]: You might notice that there isn't a `math.floor` call
+in that code. There was, for a bit, but it was generating [some
+strange artifacts](/images/post_2016_08_25/odd_dandd_voronoi.png),
+where striping was occurring. I suspect that's due to an odd
+quantizing effect, but haven't looked into it deeply. In the meantime,
+using the un-floored numbers produces a more coherent image.
 
+[^duh]: *Obviously*\
+\
+...\
+\
+For some machine learning tasks, you end up
+representing different things in very large vector spaces (one
+dimension for each "feature" you are keeping track of). In many cases,
+*binary* features are more tractable and just as useful -- for
+example, you might keep track of *whether* a certain word shows up in
+a document, rather than *how many times* it shows up. When those two
+situations coincide, Hamming distance becomes more palatable and
+tractable than Euclidean distance.
 

+ 8 - 0
templates/default.html

@@ -9,6 +9,14 @@
         <script type="text/javascript"
                 src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
         </script>
+        <script>
+          MathJax.Hub.Config({
+          jax: ["input/TeX","output/HTML-CSS"],
+          displayAlign: "left",
+          displayIndent: "10%"
+          });
+        </script>
+
     </head>
     <body>
         <div id="header">