Alexander Dymo - RubyConf 2014 - Ruby Performance Secrets and How to Uncover Them

Published in: Software

  1. 1. Ruby Performance Secrets and How to Uncover Them http://www.slideshare.net/adymo/adymo-rubyconf-performance
  2. 2. Who am I? Alexander Dymo. C/C++ since 2000. Ruby/Rails since 2006. Started to optimize back in 2007. Never stopped since then.
  3. 3. Rails Performance: What You Need to Know
     https://www.airpair.com/ruby-on-rails/performance
     Make Your Ruby/Rails App Fast: Performance And Memory Profiling Using ruby-prof and Kcachegrind
     http://www.acunote.com/blog/2008/02/make-your-ruby-rails-applications-fast-performance-and-memory-profiling.html
     Ruby Performance Tuning
     http://theprosegarden.com/contents-of-recent-issues/#10-14
  4. 4. Ruby Performance: the first comprehensive book on Ruby performance. I'm 50% done. Beta soon. ruby-performance-book.com
  5. 5. Big thanks to:
  6. 6. What do we talk about today? Performance tips. Performance best practices.
  7. 7. What do we talk about today? Performance tips. Performance best practices. How to understand what's wrong. How to find your own performance tips/best practices.
  8. 8. In examples
  9. 9. Example 1
  10. 10. What can go wrong with this code?
  11. 11. What can go wrong with this code?
  12. 12. This was faster
  13. 13. 100-200 ms faster. Sometimes …
  14. 14. Smells like...
  15. 15. https://www.flickr.com/photos/timquijano/5720765523/
  16. 16. Let's check what happens:
  17. 17. Let's profile memory allocations. Need patched ruby:
      rvm reinstall 1.9.3 --patch railsexpress
      rvm reinstall 2.0.0 --patch railsexpress
      rvm reinstall 2.1.4 --patch railsexpress
  18. 18. Let's profile memory allocations. Need profiler:
      gem install ruby-prof
  19. 19. Let's profile memory allocations. Need visualization tool:
      Mac: brew install qcachegrind
      Linux: <your package manager> install kcachegrind
      Windows: http://sourceforge.net/projects/qcachegrindwin/
  20. 20. Let's profile memory allocations:
      ruby-prof -p call_tree --mode=allocations before.rb > callgrind.out.before
      ruby-prof -p call_tree --mode=allocations after.rb > callgrind.out.after
      kcachegrind callgrind.out.before
      kcachegrind callgrind.out.after
  21. 21. static VALUE
      enum_inject(int argc, VALUE *argv, VALUE obj)
      {
          NODE *memo;
          VALUE init, op;
          rb_block_call_func *iter = inject_i;
          …
          memo = NEW_MEMO(init, argc, op);
          rb_block_call(obj, id_each, 0, 0, iter, (VALUE)memo);
          return memo->u1.value;
      }
  22. 22. > gdb `rbenv which ruby`
      GNU gdb (GDB) SUSE (7.5.1-2.5.1)
      Reading symbols from /home/gremlin/.rbenv/versions/2.1.4/bin/ruby...done.
      (gdb)
  23. 23. (gdb) l enum_inject
      632 * longest #=> "sheep"
      633 *
      634 */
      635 static VALUE
      636 enum_inject(int argc, VALUE *argv, VALUE obj)
      637 {
      638 NODE *memo;
      639 VALUE init, op;
      640 rb_block_call_func *iter = inject_i;
      641 ID id;
      (gdb)
  24. 24. 636 enum_inject(int argc, VALUE *argv, VALUE obj)
      637 {
      638 NODE *memo;
      639 VALUE init, op;
      640 rb_block_call_func *iter = inject_i;
      641 ID id;
      (gdb) b 638
      Breakpoint 1 at 0x1cbc0a: file enum.c, line 638.
      (gdb)
  25. 25. (gdb) r -e '[1,2,3].inject {}'
      Starting program: /home/gremlin/.rbenv/versions/2.1.4/bin/ruby -e '[1,2,3].inject {}'
      [Thread debugging using libthread_db enabled]
      Using host libthread_db library "/lib64/libthread_db.so.1".
      [New Thread 0x7ffff7ff2700 (LWP 3893)]
      Breakpoint 1, enum_inject (argc=0, argv=<optimized out>, obj=93825001586240) at enum.c:640
      640 rb_block_call_func *iter = inject_i;
      (gdb)
  26. 26. 640 rb_block_call_func *iter = inject_i;
      (gdb) n
      665 memo = NEW_MEMO(init, argc, op);
      (gdb)
  27. 27. 640 rb_block_call_func *iter = inject_i;
      (gdb) n
      665 memo = NEW_MEMO(init, argc, op);
      (gdb) n
      666 rb_block_call(obj, id_each, 0, 0, iter, (VALUE)memo);
      (gdb)
  28. 28. 640 rb_block_call_func *iter = inject_i;
      (gdb) n
      665 memo = NEW_MEMO(init, argc, op);
      (gdb) n
      666 rb_block_call(obj, id_each, 0, 0, iter, (VALUE)memo);
      (gdb) s
      rb_block_call (obj=93825001586240, mid=1456, argc=0, argv=0x0, bl_proc=0x555555722460 <inject_i>, data2=93825001586200) at vm_eval.c:1142
      1142 {
      (gdb)
  29. 29. 640 rb_block_call_func *iter = inject_i;
      (gdb) n
      665 memo = NEW_MEMO(init, argc, op);
      (gdb) n
      666 rb_block_call(obj, id_each, 0, 0, iter, (VALUE)memo);
      (gdb) s
      rb_block_call (obj=93825001586240, mid=1456, argc=0, argv=0x0, bl_proc=0x555555722460 <inject_i>, data2=93825001586200) at vm_eval.c:1142
      1142 {
      (gdb) s
      1145 arg.obj = obj;
      (gdb)
  30. 30. 640 rb_block_call_func *iter = inject_i;
      (gdb) n
      665 memo = NEW_MEMO(init, argc, op);
      (gdb) n
      666 rb_block_call(obj, id_each, 0, 0, iter, (VALUE)memo);
      (gdb) s
      rb_block_call (obj=93825001586240, mid=1456, argc=0, argv=0x0, bl_proc=0x555555722460 <inject_i>, data2=93825001586200) at vm_eval.c:1142
      1142 {
      (gdb) s
      1145 arg.obj = obj;
      (gdb) s
      1146 arg.mid = mid;
      (gdb)
  31. 31. 640 rb_block_call_func *iter = inject_i;
      (gdb) n
      665 memo = NEW_MEMO(init, argc, op);
      (gdb) n
      666 rb_block_call(obj, id_each, 0, 0, iter, (VALUE)memo);
      (gdb) s
      rb_block_call (obj=93825001586240, mid=1456, argc=0, argv=0x0, bl_proc=0x555555722460 <inject_i>, data2=93825001586200) at vm_eval.c:1142
      1142 {
      (gdb) s
      1145 arg.obj = obj;
      (gdb) s
      1146 arg.mid = mid;
      (gdb) s
      1147 arg.argc = argc;
      (gdb)
  32. 32. (gdb) s
      1147 arg.argc = argc;
      (gdb) s
      1148 arg.argv = argv;
      (gdb)
  33. 33. (gdb) s
      1147 arg.argc = argc;
      (gdb) s
      1148 arg.argv = argv;
      (gdb) s
      1149 return rb_iterate(iterate_method, (VALUE)&arg, bl_proc, data2);
      (gdb)
  34. 34. (gdb) s
      1147 arg.argc = argc;
      (gdb) s
      1148 arg.argv = argv;
      (gdb) s
      1149 return rb_iterate(iterate_method, (VALUE)&arg, bl_proc, data2);
      (gdb) s
      rb_iterate (it_proc=it_proc@entry=0x5555556c0790 <iterate_method>, data1=data1@entry=140737488340304, bl_proc=0x555555722460 <inject_i>, data2=93825001586200) at vm_eval.c:1054
      1054 {
      (gdb)
  35. 35. (gdb) s
      1147 arg.argc = argc;
      (gdb) s
      1148 arg.argv = argv;
      (gdb) s
      1149 return rb_iterate(iterate_method, (VALUE)&arg, bl_proc, data2);
      (gdb) s
      rb_iterate (it_proc=it_proc@entry=0x5555556c0790 <iterate_method>, data1=data1@entry=140737488340304, bl_proc=0x555555722460 <inject_i>, data2=93825001586200) at vm_eval.c:1054
      1054 {
      (gdb) s
      1057 NODE *node = NEW_IFUNC(bl_proc, data2);
      (gdb)
  36. 36. (gdb) s
      1147 arg.argc = argc;
      (gdb) s
      1148 arg.argv = argv;
      (gdb) s
      1149 return rb_iterate(iterate_method, (VALUE)&arg, bl_proc, data2);
      (gdb) s
      rb_iterate (it_proc=it_proc@entry=0x5555556c0790 <iterate_method>, data1=data1@entry=140737488340304, bl_proc=0x555555722460 <inject_i>, data2=93825001586200) at vm_eval.c:1054
      1054 {
      (gdb) s
      1057 NODE *node = NEW_IFUNC(bl_proc, data2);
      (gdb)
  37. 37. static VALUE
      enum_inject(int argc, VALUE *argv, VALUE obj)
      {
          NODE *memo;
          VALUE init, op;
          rb_block_call_func *iter = inject_i;
          …
          memo = NEW_MEMO(init, argc, op);
          rb_block_call(obj, id_each, 0, 0, iter, (VALUE)memo);
          return memo->u1.value;
      }
  38. 38. VALUE
      rb_block_call(…)
      {
          …
          return rb_iterate(iterate_method, (VALUE)&arg, bl_proc, data2);
      }
      VALUE
      rb_iterate(…)
      {
          int state;
          volatile VALUE retval = Qnil;
          NODE *node = NEW_IFUNC(bl_proc, data2);
          …
      }
  39. 39. 2 T_NODEs per inject() call
  40. 40. 10000.times { [].inject }: 20000 extra T_NODE objects, some work for GC
  41. 41. Ruby Performance. More in my book: ruby-performance-book.com
  42. 42. Lessons learned: 1. Use a profiler to understand why your code is slow. 2. Use a C debugger to understand Ruby behavior.
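
The allocation claim from slides 39-40 can be roughly cross-checked without a patched Ruby. This is a minimal sketch, not the talk's own method: it assumes Ruby 2.2 or later, where `GC.stat(:total_allocated_objects)` is available, and `allocations_during` is a hypothetical helper name. The exact counts depend on the Ruby version (newer versions no longer represent these internal objects as T_NODE), so no expected numbers are given.

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
# Assumes GC.stat(:total_allocated_objects), available in Ruby 2.2+.
def allocations_during
  GC.disable # keep collections from running during the measurement
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable
end

data = [1, 2, 3]

# Summing with inject, as in the gdb session above.
with_inject = allocations_during do
  10_000.times { data.inject { |sum, x| sum + x } }
end

# The same sum with a plain each loop, for comparison.
with_each = allocations_during do
  10_000.times do
    sum = 0
    data.each { |x| sum += x }
  end
end

puts "inject: #{with_inject} objects allocated"
puts "each:   #{with_each} objects allocated"
```

Comparing the two numbers on your own Ruby shows whether the extra internal objects per inject() call still apply there.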
  43. 43. Example 2
  44. 44. What's the difference?
      str = 'a'*1024*1024*10
      str = str.gsub('a', 'b')

      str = 'a'*1024*1024*10
      str.gsub!('a', 'b')
  45. 45. str = 'a'*1024*1024*10
      str = str.gsub('a', 'b')
      (replaces 'a' with 'b', creates a new object, reuses "str" name)
      str = 'a'*1024*1024*10
      str.gsub!('a', 'b')
      (replaces 'a' with 'b', changes the original)
  46. 46. Supposedly
  47. 47. Let's profile memory usage:
      ruby-prof -p call_tree --mode=memory after.rb > callgrind.out.after
      kcachegrind callgrind.out.after
  48. 48. So, gsub! doesn't save any memory
  49. 49. So, gsub! doesn't save any memory … except one slot on Ruby heap
  50. 50. So, gsub! doesn't save any memory except one slot on Ruby heap … which is 40 bytes
  51. 51. Not all bang! functions are the same:
      str = 'a'*1024*1024*10
      str.downcase!
      ruby-prof -p call_tree --mode=memory downcase.rb > callgrind.out.downcase
      kcachegrind callgrind.out.downcase
  52. 52. Lessons learned: 1. profile memory 2. challenge all tips/tricks/best practices
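
The intended difference between gsub and gsub! (slides 44-45) can be checked by object identity. A small sketch with illustrative variable names and a shorter string than on the slides. It shows only which object each name ends up referring to; it says nothing about how much memory each call uses, which is what the profiler runs above measure.

```ruby
original = 'a' * 1024
copy = original.gsub('a', 'b') # returns a new String, receiver untouched
same_object_after_gsub = copy.equal?(original)

inplace = 'a' * 1024
id_before = inplace.object_id
# gsub! modifies the receiver and returns it (or nil if nothing matched)
result = inplace.gsub!('a', 'b')
same_object_after_bang = result.equal?(inplace) && inplace.object_id == id_before

puts same_object_after_gsub # prints false
puts same_object_after_bang # prints true
puts original[0]            # prints a (gsub left the original alone)
puts inplace[0]             # prints b (gsub! changed it in place)
```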
  53. 53. Conclusions 1. Don't guess. Profile. 2. Guess. Profile. 3. Profile not only CPU, but Memory. 4. Look at the source, use GDB if not enlightened. 5. Challenge all tips/tricks. Understand instead.
  54. 54. Big thanks to:
  55. 55. Ruby Performance ruby-performance-book.com airpair.me/adymo @alexander_dymo
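
Conclusion 3 ("Profile not only CPU, but Memory") can be tried in a few lines of plain Ruby before reaching for ruby-prof. A rough sketch under stated assumptions: `measure_time_and_gc` is a hypothetical helper, it relies only on the standard `benchmark` library and `GC.stat(:count)`, and the workload is an arbitrary stand-in for your own code.

```ruby
require 'benchmark'

# Hypothetical helper: wall-clock time and number of GC runs for a block.
def measure_time_and_gc
  gc_runs_before = GC.stat(:count)
  seconds = Benchmark.realtime { yield }
  { seconds: seconds, gc_runs: GC.stat(:count) - gc_runs_before }
end

# Arbitrary workload that allocates many short-lived strings.
report = measure_time_and_gc do
  100_000.times.map { |i| i.to_s }
end

puts "time:    #{report[:seconds].round(4)} s"
puts "GC runs: #{report[:gc_runs]}"
```

A block that triggers many GC runs is a hint that memory, not only CPU, deserves a closer look with the profiler.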
