Could you please point me to the best practice document/guidelines for encoding? I can't seem to find one. For the encoding of the source content, we always recommend having the "Scan" step as part of the workflow, as this will identify the correct encoding. For the encoding of the target content, you just need to make sure you have the correct setting in the configuration.
Why is the default encoding set to "Default" (a name that really doesn’t describe what it does) when the best practice is to apply UTF-8 encoding?
Suggestion: Update the default encoding to UTF-8, as per industry best practices.
Forgetting to change "Default" to "utf-8" on the Encoding page of a Config very often leads to target HTML or XML files being converted from "utf-8" to "windows-1252", even if you have a SCAN stage in your workflow. We get a lot of client complaints because of this. "Codepage" was the default encoding in the 1980s, before Unicode was invented, but we are in a new millennium now!
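To illustrate why clients complain, here is a minimal Python sketch of what a "utf-8" to "windows-1252" conversion does to the text. It does not reproduce the tool's actual behaviour; the function names and sample strings are made up for the demonstration.

```python
# Hypothetical helpers, for illustration only; not the tool's own conversion code.

def convert_to_cp1252(text: str) -> bytes:
    """Write text as windows-1252; characters the code page lacks become '?'."""
    return text.encode("cp1252", errors="replace")


def read_as_declared_utf8(data: bytes) -> str:
    """Read bytes the way a consumer would if the file still declares UTF-8."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Characters outside windows-1252 (here Polish letters) are lost outright.
    print(convert_to_cp1252("Zażółć"))

    # Characters inside windows-1252 survive the conversion, but the bytes are
    # no longer valid UTF-8, so a reader that trusts a UTF-8 declaration
    # ends up with replacement characters.
    print(read_as_declared_utf8(convert_to_cp1252("Café 50 €")))
```

So there are two failure modes: text that windows-1252 cannot represent is damaged for good, and text it can represent still breaks for anyone who reads the file as UTF-8.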
Unfortunately I don't have any document to confirm that.
However, I can say that we always set the source and target encodings for all non-object file types to UTF-8.
This is common practice, and if it is forgotten, it sometimes goes unnoticed until the client gets back to us with complaints.
Therefore I also think it would be great if UTF-8 was set automatically instead of Default.
If for some reason someone needed a different encoding (which would probably be 1% of all cases), they could change it manually.
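Until the default changes, one way to catch a forgotten setting before the client does is to check that the target files still decode as UTF-8. This is only a rough sketch that runs outside the tool: the function names, the folder layout and the list of file extensions are all assumptions.

```python
from pathlib import Path

# Hypothetical pre-delivery check; names and extensions are illustrative only.

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8_files(root: str, suffixes=(".html", ".htm", ".xml")) -> list:
    """List files under root with the given extensions that are not valid UTF-8."""
    bad = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            if not is_valid_utf8(path.read_bytes()):
                bad.append(path)
    return bad
```

Calling `find_non_utf8_files("target_folder")` on a delivery folder (the folder name is a placeholder) would list the suspicious files. Note that a file containing only ASCII characters passes this check whatever encoding it was saved in, so the check only flags files where the wrong setting actually changed the bytes.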