Search
1 results for “munhitsu”
-
I'm playing with G-Eval to test the LLM outputs using LLM. Sounds very meta, but there is logic to it. And it roughly works until it doesn't.
How am I supposed to reason with test result explanation:
"the actual output's prompt is in Polish which mismatches the language-prompt specified as Polish, aligning correctly"
???
#llm #gpt #deepeval