For now the output is either 8- or 16-bit RGBA. Adding more output formats adds complexity and requires more test cases. What's your use case for a 3-byte layout?
RGB uses less memory than RGBA (obviously), so if there is no alpha channel I don't want to allocate memory for a channel filled entirely with the value 255.
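To put a rough number on the saving, here is a minimal sketch of the arithmetic. The `buffer_size` helper and the 1920x1080 dimensions are made up for illustration and are not part of the library.

```python
def buffer_size(width, height, channels, bit_depth=8):
    """Bytes needed for an unpadded pixel buffer (hypothetical helper)."""
    bytes_per_sample = bit_depth // 8  # 1 for 8-bit, 2 for 16-bit samples
    return width * height * channels * bytes_per_sample


# Example image size chosen for illustration only.
rgb_bytes = buffer_size(1920, 1080, channels=3)
rgba_bytes = buffer_size(1920, 1080, channels=4)

# The unused alpha channel accounts for a quarter of the RGBA buffer.
wasted = rgba_bytes - rgb_bytes
print(rgb_bytes, rgba_bytes, wasted)
```

For an 8-bit image the constant alpha channel costs one extra byte per pixel, and two extra bytes per pixel at 16 bits.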
A 3-byte layout can be worse for performance. I think the simplest solution would be to add a special format that always matches the PNG's own format, so that you always get RGB from RGB images, grayscale from grayscale images, etc. This wouldn't require a conversion step.
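As a rough sketch of what "matching the PNG's format" could mean for buffer sizes: the channel counts below come from the PNG specification's color types, while `native_row_bytes` is a hypothetical helper, not an existing function of this library. How palette images would be treated under such a format is also an assumption here (the sketch counts the raw palette index as one channel).

```python
# Channels per pixel for each PNG color type, as defined by the PNG spec.
CHANNELS_BY_COLOR_TYPE = {
    0: 1,  # grayscale
    2: 3,  # truecolor (RGB)
    3: 1,  # palette index (assumed to be passed through as-is)
    4: 2,  # grayscale + alpha
    6: 4,  # truecolor + alpha (RGBA)
}


def native_row_bytes(width, color_type, bit_depth):
    """Bytes per row if pixels are kept in the file's own layout.

    Hypothetical helper: samples narrower than 8 bits are assumed to
    stay packed, so the bit count is rounded up to whole bytes.
    """
    channels = CHANNELS_BY_COLOR_TYPE[color_type]
    bits = width * channels * bit_depth
    return (bits + 7) // 8


# An 8-bit RGB image needs 3 bytes per pixel, with no alpha padding.
print(native_row_bytes(100, color_type=2, bit_depth=8))
# An 8-bit RGBA image still needs 4 bytes per pixel.
print(native_row_bytes(100, color_type=6, bit_depth=8))
```

The caller would have to check the image's color type and bit depth before interpreting the buffer, which is the trade-off for skipping the conversion step.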