• The Picard Maneuver@lemmy.worldOP
    5 days ago

    I wouldn’t be surprised if it’s literally zero. I’ve tried with a few LLMs, and they’re all very confident that they know how to play chess, but they just start hallucinating illegal moves immediately.

    • la_scriba@sopuli.xyz
      4 days ago

      Immediately? When was the last time you tried? The newer models can hold a game well for 10-20 moves.

      • The Picard Maneuver@lemmy.worldOP
        4 days ago

        A few weeks ago, Gemini got confused and tried to go first as black multiple times, so that’s the most immediate one I can remember. Last week, ChatGPT offered to set up chess puzzles for me, but it made mistakes 3 out of 3 times.

        Maybe I’ll try again. Is there a certain one you’ve seen good performance out of?

        • la_scriba@sopuli.xyz
          8 hours ago

          DeepSeek is very consistent ime. ChatGPT is hit or miss: sometimes it’s excellent, sometimes it gets really confused and says random stuff. DeepSeek does have a server reliability problem, though.