index.html 15 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621
  1. <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
  2. <html> <head>
  3. <title>Speed Up that Library when You Can't C a Thing</title>
  4. <style>
  5. .slide {
  6. border: 2px solid #668888;
  7. background-color: #A4CAE1;
  8. padding: 2%;
  9. width: 94%;
  10. }
  11. pre {
  12. border: 1px solid #224444;
  13. background-color: #B4DAF1;
  14. padding: 2px;
  15. }
  16. </style>
  17. <script src="scripts/jquery-1.2.3.js" type="text/javascript" />
  18. <script src="scripts/slideshow.js" type="text/javascript" />
  19. </head>
  20. <body>
  21. <div class='slide'>
  22. <h1>Speed up that library when you can't C a thing</h1>
  23. <p>John Melesky
  24. (Open Source Bridge, June 2009)</p>
  25. </div>
  26. <div class='slide'>
  27. <h1>Caveats</h1>
  28. <ul>
  29. <li>We're talking about Unix and Linux<ul>
  30. <li>Most should be applicable on OSX once you factor in the dynlib stuff</li>
  31. <li>I have no idea on the Windows front; sorry</li>
  32. </ul>
  33. </li>
  34. </ul>
  35. </div>
  36. <div class='slide'>
  37. <h1>Caveats</h1>
  38. <ul>
  39. <li>We're talking about Unix and Linux<ul>
  40. <li>Most should be applicable on OSX once you factor in the dynlib stuff</li>
  41. <li>I have no idea on the Windows front; sorry</li>
  42. </ul>
  43. </li>
  44. <li>The example is in Ruby and Ocaml<ul>
  45. <li>The FFIs for Perl and Python are similar</li>
  46. <li>Other called languages are possible</li>
  47. <li>I'll try to be explicit what's general and what's example-specific</li>
  48. </ul>
  49. </li>
  50. </ul>
  51. </div>
  52. <div class='slide'>
  53. <h1>Ever in this situation?</h1>
  54. </div>
  55. <div class='slide'>
  56. <h1>The culprit</h1>
  57. <pre><code>def matrix_mul(a,b)
  58. rows = (0 .. a.size - 1).collect {
  59. |i|
  60. (0 .. b[0].size - 1).collect {
  61. |j|
  62. vij = 0
  63. 0.upto(a[i].size - 1) do
  64. |k|
  65. vij += a[i][k] * b[k][j]
  66. end
  67. vij
  68. }
  69. }
  70. return rows
  71. end
  72. </code></pre>
  73. <p>(adapted from matrix.rb)</p>
  74. </div>
  75. <div class='slide'>
  76. <h1>The solution</h1>
  77. <table>
  78. <tr>
  79. <td><b><font size="+2">That's what C is for, right?</b></td>
  80. <td><img src="img/dhh.jpg"></td>
  81. </tr>
  82. </table>
  83. </div>
  84. <div class='slide'>
  85. <h1>That's not what C is for!</h1>
  86. <p>C is a systems language, intended for use with 30-year-old kernels and drivers, on processors with fewer than 10k transistors (we're currently in the billion-transistor range, thankyouverymuch Moore's Law).</p>
  87. </div>
  88. <div class='slide'>
  89. <h1>That's not what C is for!</h1>
  90. <p>C is a systems language, intended for use with 30-year-old kernels and drivers, on processors with fewer than 10k transistors (we're currently in the billion-transistor range, thankyouverymuch Moore's Law).</p>
  91. <p>In other words, it's fast only as a side effect.</p>
  92. </div>
  93. <div class='slide'>
  94. <h1>Other languages are better</h1>
  95. <p>For raw numerics, Fortran still has C beat. APL is pretty dang fast, too.</p>
  96. <p>For complex datatypes, languages like Haskell or ML will all give you better type support, safer usage, and possibly faster code.</p>
  97. <p>Really, almost any language offers benefits over C, whether it's faster, safer, more concise, more advanced, or all of the above.</p>
  98. </div>
  99. <div class='slide'>
  100. <h1>What are the options?</h1>
  101. <p>The only constraint is that your target language needs an FFI (Foreign Function Interface) that allows C to call it. Most awesome languages allow this. I'm going to use Ocaml.</p>
  102. <p>The other material requirements are your language header files (<code>ruby.h</code>, <code>caml/callback.h</code>, etc.), and perserverence.</p>
  103. </div>
  104. <div class='slide'>
  105. <h1>What are the options?</h1>
  106. <p>The only constraint is that your target language needs an FFI (Foreign Function Interface) that allows C to call it. Most awesome languages allow this. I'm going to use Ocaml.</p>
  107. <p>The other material requirements are your language header files (<code>ruby.h</code>, <code>caml/callback.h</code>, etc.), and perserverence.</p>
  108. <p>(You may look warily upon the whole "allows C to call it" thing. This is understandable. Things will get worse before they get better.)</p>
  109. </div>
  110. <div class='slide'>
  111. <h1>What's Ocaml?</h1>
  112. <p>In brief, it's a statically- and strongly- typed, non-pure, strict, functional programming language. It's from the ML family (which includes SML, F#, and some others).</p>
  113. <p>Ocaml's pretty cool at the moment, because it's one of the fastest languages around, routinely rivalling C++ in the Shootout.</p>
  114. </div>
  115. <div class='slide'>
  116. <h1>Matrix Multiplication in Ocaml, round 1</h1>
  117. <pre><code>let matrix_mul a b =
  118. List.map
  119. (fun row -&gt;
  120. mapn
  121. (fun column -&gt;
  122. List.fold_left (+) 0
  123. (List.map2 ( * ) row column))
  124. b)
  125. a
  126. </code></pre>
  127. <p>(cribbed from RosettaCode)</p>
  128. </div>
  129. <div class='slide'>
  130. <h1>Benefits</h1>
  131. <pre><code>let matrix_mul a b =
  132. List.map
  133. (fun row -&gt;
  134. mapn
  135. (fun column -&gt;
  136. List.fold_left (+) 0
  137. (List.map2 ( * ) row column))
  138. b)
  139. a
  140. </code></pre>
  141. <p>We've acheived a 10x speedup, and our code is not actually much more complicated than the Ruby.</p>
  142. <p>This code uses Lists, which are pretty natural to use in functional languages. Arrays would probably be faster, though.</p>
  143. </div>
  144. <div class='slide'>
  145. <h1>Matrix Multiplication in Ocaml, round 2</h1>
  146. <pre><code>let matrix_mul x y =
  147. let x0 = Array.length x
  148. and y0 = Array.length y in
  149. let y1 = if y0 = 0 then 0 else Array.length y.(0) in
  150. let z = Array.make_matrix x0 y1 0 in
  151. for i = 0 to x0-1 do
  152. for j = 0 to y1-1 do
  153. for k = 0 to y0-1 do
  154. z.(i).(j) &lt;- z.(i).(j) + x.(i).(k) * y.(k).(j)
  155. done
  156. done
  157. done;
  158. z
  159. </code></pre>
  160. <p>(also cribbed from RosettaCode)</p>
  161. </div>
  162. <div class='slide'>
  163. <h1>Benefits</h1>
  164. <pre><code>let matrix_mul x y =
  165. let x0 = Array.length x
  166. and y0 = Array.length y in
  167. let y1 = if y0 = 0 then 0 else Array.length y.(0) in
  168. let z = Array.make_matrix x0 y1 0 in
  169. for i = 0 to x0-1 do
  170. for j = 0 to y1-1 do
  171. for k = 0 to y0-1 do
  172. z.(i).(j) &lt;- z.(i).(j) + x.(i).(k) * y.(k).(j)
  173. done
  174. done
  175. done;
  176. z
  177. </code></pre>
  178. <p>Ocaml Arrays are pretty darn fast, and have a great initialization function tailored to this sort of problem (make_matrix).</p>
  179. <p>We've now managed a speedup around 30-35x. And, again, the code is not that much more complex than the original Ruby.</p>
  180. </div>
  181. <div class='slide'>
  182. <h1>Gluing Pieces Together</h1>
  183. <p>Of course, the Ocaml code is operating on Ocaml types, and Ruby wants to pass it Ruby types. What to do, what to do?</p>
  184. </div>
  185. <div class='slide'>
  186. <h1>It's gonna get worse before it gets better...</h1>
  187. <p>Ruby can call C code...</p>
  188. </div>
  189. <div class='slide'>
  190. <h1>It's gonna get worse before it gets better...</h1>
  191. <p>Ruby can call C code...</p>
  192. <p>Ocaml can be called by C code...</p>
  193. </div>
  194. <div class='slide'>
  195. <h1>It's gonna get worse before it gets better...</h1>
  196. <p>Ruby can call C code...</p>
  197. <p>Ocaml can be called by C code...</p>
  198. <p>They both have facilities to translate their types into C types (and Ocaml vice-versa)...</p>
  199. </div>
  200. <div class='slide'>
  201. <h1>FFI Aside</h1>
  202. <p>Many languages provide C type definitions and functions for extensions. There are two problems with this.</p>
  203. <ol>
  204. <li>Many of these FFIs aren't merely type declarations and functions, but also macros. That means they only work in C.</li>
  205. <li>Even if they were macro-free, few languages have all-purpose transparent C access. Thus, wrapper code is always needed.</li>
  206. </ol>
  207. </div>
  208. <div class='slide'>
  209. <h1>Dancing the Type Two-Step</h1>
  210. <p>Let's work our way out, from Ocaml to Ruby, via C. First, we need to tell Ocaml to export a function to C.</p>
  211. <pre><code>let matrix_mul x y =
  212. let x0 = Array.length x
  213. and y0 = Array.length y in
  214. let y1 = if y0 = 0 then 0 else Array.length y.(0) in
  215. let z = Array.make_matrix x0 y1 0 in
  216. for i = 0 to x0-1 do
  217. for j = 0 to y1-1 do
  218. for k = 0 to y0-1 do
  219. z.(i).(j) &lt;- z.(i).(j) + x.(i).(k) * y.(k).(j)
  220. done
  221. done
  222. done;
  223. z
  224. let _ = Callback.register "ocaml_matrix_mul" matrix_mul;;
  225. </code></pre>
  226. </div>
  227. <div class='slide'>
  228. <h1>Dancing the Type Two-Step</h1>
  229. <p>Next, we need some C code to call the Ocaml.</p>
  230. <pre><code>#include "caml/callback.h"
  231. void initcaml(void) {
  232. char *dummy_argv[] = {0};
  233. caml_startup(dummy_argv);
  234. }
  235. </code></pre>
  236. <p>First we initialize the Ocaml runtime.</p>
  237. </div>
  238. <div class='slide'>
  239. <h1>Dancing the Type Two-Step</h1>
  240. <pre><code>int interm_matrix_mul(*a, *b) {
  241. static value *f = NULL;
  242. f = caml_named_value("ocaml_matrix_mul");
  243. if (f == NULL)
  244. return -1;
  245. value r_caml;
  246. r_caml = callback2(*f, a, b);
  247. </code></pre>
  248. <p>This is how we grab and execute the exported code. Note that we haven't figured the types for <code>a</code> or <code>b</code>, nor have we converted r_caml to something C-like. Who's up for C arrays of C arrays of C <code>int</code>s? (Or should they be <code>long</code>s?)</p>
  249. </div>
  250. <div class='slide'>
  251. <h1>Dancing the Type Two-Step</h1>
  252. <p>From the Ruby side, we'd do something like this.</p>
  253. <pre><code>#include "ruby.h"
  254. VALUE OcamlMatrix = Qnil; // set up the Ruby namespace
  255. // prototype our init and method
  256. void Init_ocamlmatrix();
  257. VALUE method_matrix_mul(VALUE self, VALUE a, VALUE b);
  258. void Init_ocamlmatrix() {
  259. OcamlMatrix = rb_define_module("OcamlMatrix");
  260. rb_define_method(OcamlMatrix, "matrix_mul", method_matrix_mul, 0);
  261. }
  262. </code></pre>
  263. </div>
  264. <div class='slide'>
  265. <h1>Dancing the Type Two-Step</h1>
  266. <pre><code>VALUE method_matrix_mul(VALUE self, VALUE a, VALUE b) {
  267. // here we would convert a and b to C arrays
  268. c = interm_matrix_mul(conv_a, conv_b);
  269. // and then convert c from C arrays to a Ruby VALUE
  270. return conv_c;
  271. }
  272. </code></pre>
  273. </div>
  274. <div class='slide'>
  275. <h1>You probably saw this coming</h1>
  276. <p>You can convert directly from the Ruby types to Ocaml types without C intermediate types. This is nice, because it avoids any need for structs or pointer arithmetic.</p>
  277. </div>
  278. <div class='slide'>
  279. <h1>You probably saw this coming</h1>
  280. <p>You can convert directly from the Ruby types to Ocaml types without C intermediate types. This is nice, because it avoids any need for structs or pointer arithmetic.</p>
  281. <p>But... this only works if you can import both languages' headers without name conflicts (thankfully Ruby's VALUE and Ocaml's value are spelled differently).</p>
  282. </div>
  283. <div class='slide'>
  284. <h1>From Two-Step to Tango</h1>
  285. <pre><code>VALUE method_matrix_mul(VALUE self, VALUE a, VALUE b) {
  286. CAMLparam0 ();
  287. CAMLlocal2 (caml_a, caml_b);
  288. CAMLlocal1 (caml_interm);
  289. CAMLlocal1 (caml_item);
  290. int x = RARRAY(a)-&gt;len;
  291. for (int i = 0; i&lt;x; i++) {
  292. VALUE row = rb_ary_entry(a, i); // row = a[i]
  293. int y = RARRAY(row)-&gt;len;
  294. caml_interm = caml_alloc(y, 0);
  295. for (int j = 0; j&lt;y; j++) {
  296. VALUE item = rb_ary_entry(row, j);
  297. long c_item = NUM2LONG(item);
  298. caml_item = Val_long(c_item);
  299. Store_field(caml_interm, j, caml_item);
  300. }
  301. }
  302. // do the same for b
  303. caml_c = interm_matrix_mul(caml_a, caml_b);
  304. // and then convert caml_c from value to a Ruby VALUE
  305. CAMLreturnT(VALUE, ruby_c);
  306. }
  307. </code></pre>
  308. </div>
  309. <div class='slide'>
  310. <h1>Did you spot the problems?</h1>
  311. </div>
  312. <div class='slide'>
  313. <h1>Hmmm...</h1>
  314. </div>
  315. <div class='slide'>
  316. <h1>Hmmm...</h1>
  317. <p>Complex data is the problem.</p>
  318. </div>
  319. <div class='slide'>
  320. <h1>Hmmm...</h1>
  321. <p>Complex data is the problem.</p>
  322. <p>What if the data was already in Ocaml types when matrix_mul is called?</p>
  323. </div>
  324. <div class='slide'>
  325. <h1>Don't wrap a function, wrap a class</h1>
  326. </div>
  327. <div class='slide'>
  328. <h1>Tango to Salsa</h1>
  329. <pre><code>let new_matrix width height =
  330. Array.make_matrix width height 0
  331. let matrix_assign m x y item =
  332. m.(x).(y) &lt;- item
  333. let matrix_retrieve m x y =
  334. m.(x).(y)
  335. let _ = Callback.register "ocaml_new_matrix" new_matrix
  336. let _ = Callback.register "ocaml_matrix_assign" matrix_assign
  337. let _ = Callback.register "ocaml_matrix_retrieve" matrix_retrieve
  338. let _ = Callback.register "ocaml_matrix_mul" matrix_mul
  339. </code></pre>
  340. </div>
  341. <div class='slide'>
  342. <h1>Tango to Salsa</h1>
  343. <pre><code>void Init_ocamlmatrix() {
  344. OcamlMatrix = rb_define_module("OcamlMatrix");
  345. rb_define_singleton_method(OcamlMatrix, "w_new_matrix", singleton_matrix_new, 2);
  346. rb_define_method(OcamlMatrix, "w_matrix_assign", method_assign, 3);
  347. rb_define_method(OcamlMatrix, "w_matrix_retrieve", method_retrieve, 2);
  348. rb_define_method(OcamlMatrix, "w_matrix_mul", method_matrix_mul, 1);
  349. initcaml();
  350. }
  351. </code></pre>
  352. </div>
  353. <div class='slide'>
  354. <h1>Tango to Salsa</h1>
  355. <pre><code>VALUE singleton_matrix_new(VALUE class, VALUE width, VALUE height) {
  356. CAMLparam0 ();
  357. CAMLlocal1 (ocaml_m);
  358. ocaml_m = ocaml_new_matrix(Val_long(NUM2LONG(width)),
  359. Val_long(NUM2LONG(height)));
  360. VALUE my_m = Data_Wrap_Struct(class, 0, 0, ocaml_m);
  361. rb_obj_call_init(my_m, 0, NULL);
  362. CAMLreturnT (VALUE, my_m);
  363. }
  364. </code></pre>
  365. </div>
  366. <div class='slide'>
  367. <h1>Tango to Salsa</h1>
  368. <pre><code>void method_assign(VALUE self, VALUE x, VALUE y, VALUE item) {
  369. CAMLparam0 ();
  370. CAMLlocal1 (ocaml_m);
  371. Data_Get_Struct(self, value, ocaml_m);
  372. ocaml_matrix_assign(ocaml_m, Val_long(NUM2LONG(x)),
  373. Val_long(NUM2LONG(y)),
  374. Val_long(NUM2LONG(item)));
  375. CAMLreturn0;
  376. }
  377. VALUE method_retrieve(VALUE self, VALUE x, VALUE y) {
  378. CAMLparam0 ();
  379. CAMLlocal1 (ocaml_m);
  380. Data_Get_Struct(self, value, ocaml_m);
  381. VALUE retval = LONG2NUM(Long_val(ocaml_matrix_retrieve(ocaml_m,
  382. Val_long(NUM2LONG(x)),
  383. Val_long(NUM2LONG(y)))
  384. ));
  385. CAMLreturnT (VALUE, retval);
  386. }
  387. </code></pre>
  388. </div>
  389. <div class='slide'>
  390. <h1>Tango to Salsa</h1>
  391. <pre><code>VALUE method_matrix_mul(VALUE self, VALUE b) {
  392. CAMLparam0 ();
  393. CAMLlocal3 (ocaml_m1, ocaml_m2, ocaml_m3);
  394. Data_Get_Struct(self, value, ocaml_m1);
  395. Data_Get_Struct(b, value, ocaml_m2);
  396. ocaml_m3 = ocaml_matrix_mul(ocaml_m1, ocaml_m2);
  397. VALUE my_m = Data_Wrap_Struct(class, 0, 0, ocaml_m);
  398. rb_obj_call_init(my_m, 0, NULL);
  399. CAMLreturnT (VALUE, my_m);
  400. }
  401. </code></pre>
  402. </div>
  403. <div class='slide'>
  404. <h1>Add a Ruby slipper</h1>
  405. </div>
  406. <div class='slide'>
  407. <h1>No Silver Bullets</h1>
  408. <p>But we have a couple bronze ones.</p>
  409. </div>
  410. <div class='slide'>
  411. <h1>Perl</h1>
  412. <p>Inline:: libraries on CPAN.</p>
  413. </div>
  414. <div class='slide'>
  415. <h1>Perl</h1>
  416. <pre><code>use Inline "Befunge";
  417. hello();
  418. __END__
  419. __Befunge__
  420. ;:hello - print a msg;&lt;q_,#! #:&lt;"Hello world!"a
  421. ;
  422. :
  423. a q - ;tcartsbus:;
  424. d
  425. d
  426. ;
  427. + ;not a valid func def;
  428. q
  429. </code></pre>
  430. </div>
  431. <div class='slide'>
  432. <h1>Python</h1>
  433. <p>PyCaml, PyObjC.</p>
  434. </div>
  435. <div class='slide'>
  436. <h1>Python Calling Python</h1>
  437. <p>SciPy, NumPy.</p>
  438. <p>Cython.</p>
  439. </div>
  440. <div class='slide'>
  441. <h1>Ruby</h1>
  442. <p>Well after i decided on using Ruby and using Ocaml...</p>
  443. </div>
  444. <div class='slide'>
  445. <h1>Ruby</h1>
  446. <p>Well after i decided on using Ruby and using Ocaml...</p>
  447. <p>... i found <a href="http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/256569">Rocaml</a>.</p>
  448. </div>
  449. <div class='slide'>
  450. <h1>Ruby</h1>
  451. <p>Well after i decided on using Ruby and using Ocaml...</p>
  452. <p>... i found <a href="http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/256569">Rocaml</a>.</p>
  453. <p>Also available is the Ruby clone of Perl's Inline::, including InlineFortran.</p>
  454. </div>
  455. <div class='slide'>
  456. <h1>Phew</h1>
  457. </div>
  458. <div class='slide'>
  459. <h1>Phew</h1>
  460. <p>Any questions?</p>
  461. </div>
  462. <div class='slide'>
  463. <h1>Thanks</h1>
  464. <p>Thanks for coming. Thanks OSBridge for putting together an awesome conference.</p>
  465. </div>
  466. <div class='slide'>
  467. </div>
  468. </body></html>