• The Picard Maneuver@lemmy.world (OP)
      5 days ago

      I wouldn’t be surprised if it’s literally zero. I’ve tried with a few LLMs, and they’re all very confident that they know how to play chess, but they just start hallucinating illegal moves immediately.

      • la_scriba@sopuli.xyz
        4 days ago

        Immediately? When was the last time you tried? The newer models can hold a game well for 10-20 moves.

        • The Picard Maneuver@lemmy.world (OP)
          4 days ago

          A few weeks ago, Gemini repeatedly tried to move first as black and got confused, so that’s the most immediate example I can remember. Last week, ChatGPT offered to set up chess puzzles for me, but it made mistakes in all 3 attempts.

          Maybe I’ll try again. Is there a certain one you’ve seen good performance out of?

          • la_scriba@sopuli.xyz
            9 hours ago

            DeepSeek is very consistent ime. ChatGPT is hit or miss: sometimes it’s excellent, sometimes it gets really confused and says random stuff. DeepSeek does have a server reliability problem, though.