On Sept 4th 2020, I got pinged on a revert PR to fix a 150% slowdown on the Shopify monolith. It was a two-line change reverting the addition of a Sorbet signature on a Shop method, implicating Sorbet as the suspect.
That was the start of a journey that took me through a deeper understanding of the Sorbet, Rails and Ruby codebases. The fix is in Ruby 3.0 and I can sleep better now.
This talk will take you on that journey with me. Along the way, you will find tools and tricks that you can use for debugging similar problems. We will also delve into some nuances of Ruby, which is always fun.
This talk is intended for all audiences that have a basic familiarity working with Ruby. No prior knowledge of Sorbet or Rails is required.
This talk aims to provide an arsenal of debugging tools for the audience while taking them through a fun bug chasing journey.
The surface reason of the problem was the method_added
hook that was being added to the Shop
model to make Sorbet sig
blocks work. This is not a widely known technique but has specific uses, and the audience should be more familiar with them after the talk.
The root cause of the bug turns out to be related to the difference in behaviour between .clone
and .dup
and with how .clone
does the wrong thing with cloning singleton classes. Thus the talk will give a deeper understanding of these nuances to the audience as well.
The Bug
T::Private::Methods::MethodHooks#method_added
Sorbet
bundle open sorbet-runtime
and instrument (Key Takeaway #1)bin/rails console
with DISABLE_SORBET=1
and EAGER_LOAD=1
to simulate prod behaviour.shop.clone
does.method_added
method_added
using binding.pry if $break
(Key Takeway #2)$break = true
and call shop.clone
Kernel.clone
; doesn't make sense.initialize_clone
or initialize_dup
definitions in RailsRails
method_added
method_added
hooks are being called but only for mattr_
defined methods.?
) methods. Leading to this commit on RailsKernel.clone
and Kernel.dup
code.Kernel.clone
does a whole lot more copying than dup
. Especially that it is copying the singleton class, etc. (Key Takeaway #4)Shop
in other AR models. Add a custom model and notice that we can trigger the same case by including the Shopify custom podding module.Ruby
before_initialize
callback on the model.Kernel.clone
realize that someone should be setting shop.singleton_class
, but who?ObjectSpace.trace_object_allocations
and associated ObjectSpace.allocation_sourcefile
/ObjectSpace.allocation_sourceline
.instance_exec
being called by before_initialize
which ends up creating a singleton class for the model object.Kernel.clone
method doing something wrong.method_added
was being called but not singleton_method_added
.This bug hunt ended up being the most interesting one I've had in my 20+ year career and it thought me a lot about some of the areas of Sorbet, Rails and Ruby. I've had the help of experts which are Rails Core and Ruby Core members which have also been amazing.
I feel like there is a lot information captured in this bug hunt that would be useful for others in their similar adventures. Sharing the debugging methods and the deeper reasons for what exactly was breaking, and how I was able to hone in on it, should be educational for a general audience of any level of expertise.
This plays really well with the proposed topic of "Let Me Tell You About This Bug"