What you said about RAG makes sense, but my understanding is that fine-tuning is actually not very good at giving LLMs a deeper understanding of anything. It's more useful for teaching general behavior like output format than for teaching deep concepts like a new domain of science.
This isn't true either, because if you don't have access to the original data set, the model will overfit on your fine-tuning data set and (in extreme cases) lose its ability to do even basic reasoning.
Yes. It's called "catastrophic forgetting". These models were trained on trillions of tokens and then underwent a significant RLHF process. Fine-tuning them on your tiny data set (tiny relative to the original training data) almost always results in the model performing worse at everything else. There's also the issue of updating information that has changed. This is easy with RAG: replace the document in the repository with a new version and it just works. It's not so easy with fine-tuning, since you can't identify and update just the weights that encode the changed information (there's research in this area but it's early days).
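To make the "replace the document and it just works" point concrete, here's a rough toy sketch. Everything in it is made up for illustration: `DocStore`, `upsert`, `retrieve` and `build_prompt` are hypothetical names, and plain bag-of-words cosine similarity stands in for the embedding search a real RAG setup would likely use. No model is involved; it only shows that the retrieved context changes as soon as the stored document does.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class DocStore:
    """Hypothetical in-memory document repository (a stand-in for a vector DB)."""

    def __init__(self):
        self.docs = {}

    def upsert(self, doc_id, text):
        # Adding and updating are the same operation: overwrite by id.
        self.docs[doc_id] = text

    def retrieve(self, query):
        # Return the best-matching document text, or None if nothing overlaps.
        q = Counter(tokenize(query))
        scored = [
            (cosine(q, Counter(tokenize(text))), doc_id)
            for doc_id, text in self.docs.items()
        ]
        if not scored:
            return None
        score, doc_id = max(scored)
        return self.docs[doc_id] if score > 0 else None


def build_prompt(store, question):
    # The retrieved text is pasted into the prompt; the model weights never change.
    context = store.retrieve(question) or "(no relevant document found)"
    return f"Context: {context}\nQuestion: {question}"


store = DocStore()
store.upsert("pricing", "The pro plan costs 10 dollars per month.")
store.upsert("support", "Support is available by email on weekdays.")
question = "How much does the pro plan cost?"
print(build_prompt(store, question))

# The price changed: overwrite the one document. No retraining step.
store.upsert("pricing", "The pro plan costs 15 dollars per month.")
print(build_prompt(store, question))
```

The second prompt picks up the new price immediately because the knowledge lives in the store, not in the weights. With fine-tuning there's no equivalent of `upsert` for a single fact.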