home.social

1 result for “munhitsu”

  1. I'm playing with G-Eval to test LLM outputs using an LLM. It sounds very meta, but there is logic to it. And it roughly works until it doesn't.
    How am I supposed to reason about a test result explanation like this:
    "the actual output's prompt is in Polish which mismatches the language-prompt specified as Polish, aligning correctly"
    ???
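
For context, the general idea of grading one model's output with another model can be sketched roughly as below. This is not G-Eval's actual implementation or API: `build_judge_prompt`, `judge`, and `fake_llm` are hypothetical names made up for illustration, the 1-5 scale and threshold are assumptions, and the judge model is replaced by a stub so the snippet runs offline.

```python
import json
from typing import Callable


def build_judge_prompt(criteria: str, actual_output: str) -> str:
    """Compose the instruction sent to the judge model (hypothetical format)."""
    return (
        "You are grading an LLM output against the criteria below.\n"
        f"Criteria: {criteria}\n"
        f"Output to grade: {actual_output}\n"
        'Reply with JSON only: {"score": <integer 1-5>, "reason": "<one sentence>"}'
    )


def judge(
    criteria: str,
    actual_output: str,
    call_llm: Callable[[str], str],
    threshold: int = 4,  # assumed pass mark, not taken from any library
) -> dict:
    """Ask the judge model for a score and a free-text reason, then apply a threshold."""
    raw = call_llm(build_judge_prompt(criteria, actual_output))
    verdict = json.loads(raw)
    score = int(verdict["score"])
    return {"passed": score >= threshold, "score": score, "reason": verdict["reason"]}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same canned verdict."""
    return json.dumps(
        {"score": 5, "reason": "The output is written in Polish as required."}
    )


result = judge("The output must be written in Polish.", "Dzień dobry!", fake_llm)
print(result["passed"], result["score"], result["reason"])
```

Nothing in this sketch checks the `reason` text against the `score`: the reason is free text produced by the judge, so it can contradict itself or the verdict, as the quoted explanation does.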