Also introduces an optimization that uses a stack-allocated buffer when the required buffer size is below a predetermined threshold.
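
As a rough illustration of the pattern (not the code from this change), the sketch below uses a fixed-size stack array for small requests and otherwise allocates on the heap. The heap fallback, the threshold value `STACK_BUF_THRESHOLD`, and the helper `reverse_bytes` are all assumptions made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold; the value used by the actual change is not stated. */
#define STACK_BUF_THRESHOLD 256

/*
 * Hypothetical helper: writes the bytes of `src` into `dst` in reverse order,
 * going through a scratch buffer. Returns 0 on success, -1 if the heap
 * allocation fails.
 */
int reverse_bytes(const unsigned char *src, unsigned char *dst, size_t len)
{
    unsigned char stack_buf[STACK_BUF_THRESHOLD];
    unsigned char *buf = stack_buf;   /* default: scratch space on the stack */
    int on_heap = 0;

    assert(src != NULL && dst != NULL);

    /* Sizes at or above the threshold go to the heap (assumed fallback). */
    if (len >= STACK_BUF_THRESHOLD) {
        buf = malloc(len);
        if (buf == NULL)
            return -1;
        on_heap = 1;
    }

    for (size_t i = 0; i < len; i++)
        buf[i] = src[len - 1 - i];
    memcpy(dst, buf, len);

    /* Only heap memory needs to be released; the stack array is automatic. */
    if (on_heap)
        free(buf);
    return 0;
}
```

The small-size path avoids an allocator round trip entirely, which is typically where this kind of optimization pays off; the threshold bounds how much stack a single call can consume.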