I think the asm-based call_c may be breaking the 16-byte (128-bit) stack alignment. This causes the program to crash later, whenever a dylib function is called.
We need to create OS X-specific call_c assembly code that preserves the 16-byte (128-bit) stack alignment.
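As a rough illustration of the adjustment the assembly would need to make, here is a minimal C sketch of the arithmetic. It assumes the stack grows downward and that the stack pointer must be 16-byte aligned at the point of the call. The helper names `align_stack_for_call` and `is_16_byte_aligned` are hypothetical and are not part of the existing call_c code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero if sp sits on a 16-byte boundary. */
static int is_16_byte_aligned(uintptr_t sp)
{
    return (sp & (uintptr_t)15) == 0;
}

/*
 * Hypothetical helper: given the current stack pointer and the number of
 * bytes of outgoing arguments to reserve, return the stack pointer value
 * to use at the call site. The stack is assumed to grow downward, so we
 * subtract the argument area first and then round down to a multiple of
 * 16. Rounding down may leave a few bytes of unused padding.
 */
static uintptr_t align_stack_for_call(uintptr_t sp, size_t arg_bytes)
{
    uintptr_t adjusted = (sp - arg_bytes) & ~(uintptr_t)15;
    assert(is_16_byte_aligned(adjusted));
    return adjusted;
}
```

In assembly, the same effect is usually achieved by subtracting the argument area from the stack pointer and then masking off its low four bits before the call, while saving the original stack pointer so it can be restored afterwards.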
Right now this is worked around by using the C-only call_c code.
This ticket should be expanded. We don't have asm call_c working on any x86_64 platform, and we seem to lack an asm call_c for ARM altogether.