Jul 28, 2021
Your approach to dealing with the unbalanced dataset is pretty good, but there is a subtlety that may be missed.
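It may be worth emphasising that the process you used to create the new dataset gave it a 50:50 split between resignations and non-resignations, rather than the original 6%, and that this is why you didn't need a stratified split of the data: with balanced classes, a plain random split already leaves both classes well represented in the training and test sets.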
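To make the point concrete, here is a rough sketch using only the Python standard library. It is not your actual pipeline: the synthetic labels, the 10,000-row size, the 80/20 split, and the helper names (`undersample_to_balance`, `random_split`, `positive_rate`) are all assumptions for illustration, and I have assumed the balanced dataset was built by undersampling the majority class, which may not be how you did it.

```python
import random

def undersample_to_balance(labels, rng):
    """Return row indices with a 50:50 class mix by downsampling the majority class."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = rng.sample(majority, len(minority))  # sample without replacement
    return minority + kept

def random_split(indices, test_fraction, rng):
    """Plain (non-stratified) shuffle-and-cut split; returns (train, test)."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def positive_rate(labels, indices):
    """Fraction of the given rows that are labelled 1 (resignation)."""
    return sum(labels[i] for i in indices) / len(indices)

rng = random.Random(0)

# Hypothetical data: 6% resignations (1) and 94% non-resignations (0).
labels = [1] * 600 + [0] * 9400

balanced = undersample_to_balance(labels, rng)
train, test = random_split(balanced, 0.2, rng)

balanced_rate = positive_rate(labels, balanced)
train_rate = positive_rate(labels, train)
test_rate = positive_rate(labels, test)

print(f"balanced set: {len(balanced)} rows, resignation rate {balanced_rate:.2f}")
print(f"train: {len(train)} rows, resignation rate {train_rate:.2f}")
print(f"test:  {len(test)} rows, resignation rate {test_rate:.2f}")
```

With the classes balanced first, the random split should land close to 50:50 in both parts without any stratification. On the original 6% data the same random split could leave the test set with noticeably more or fewer resignations than the training set, which is the situation stratified splitting is meant to guard against.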