Anecdotally, I use it a lot, and I feel like the responses I get are better when I'm polite. I have a couple of theories as to why:
- More tokens in the context window of your question, and a clear separator between ideas in a conversation, make it easier for the model to recognize disparate ideas at inference time.
- Higher-quality datasets contain American boomer/millennial notions of "politeness," and when prompts are phrased in kind, responses are more likely to contain tokens from those higher-quality datasets.
I haven't mathematically proven any of this within the llama.cpp tokenizer, but I strongly suspect I could at least demonstrate a correlation between polite input tokens and output tokens drawn from those higher-quality datasets.
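If anyone wants to poke at the token-count side of this, here's a minimal sketch. It uses tiktoken's cl100k_base vocabulary as a stand-in because it needs no model file; llama.cpp's own tokenizer would produce different counts, but the comparison works the same way:

```python
# Rough sketch, not a test of the theory: compare token counts for a terse
# vs. polite phrasing of the same request. cl100k_base is a stand-in BPE
# vocabulary; swap in your model's tokenizer for real numbers.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

terse = "Fix this bug."
polite = "Hi! Could you please take a look and help me fix this bug? Thanks!"

for label, prompt in [("terse", terse), ("polite", polite)]:
    tokens = enc.encode(prompt)
    print(f"{label}: {len(tokens)} tokens")
```

The polite version lands noticeably more tokens in the context window, which is the first half of the theory; whether those extra tokens actually steer the output toward better data is the part that would need real measurement.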
I try to keep my commentary as apolitical as possible, so all I'll say is this:
If you believe that the current ticket is the Cthulhu/Lucifer ticket, imagine what they could accomplish if every bill only required a simple majority.