Lifecoach5000@lemmy.world to Technology@lemmy.world · English · 3 days ago

ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic

www.tomshardware.com

  • cross-posted to:
  • retrogaming@lemmy.world
OpenAI's latest and greatest AI model was outclassed by the 1.19 MHz near 50-year-old console gaming legend.
  • nednobbins@lemm.ee · 2 days ago

    I imagine the “author” did something like, “Search https://scholar.google.com/, find a publication where AI failed at something, and write a paragraph about it.”

    It’s not even as bad as the article claims.

    Atari isn’t great at chess. https://chess.stackexchange.com/questions/24952/how-strong-is-each-level-of-atari-2600s-video-chess
    Random LLMs were nearly as good 2 years ago. https://lmsys.org/blog/2023-05-03-arena/
    LLMs that are actually trained for chess have done much better. https://arxiv.org/abs/2501.17186

    • Lovable Sidekick@lemmy.world · 2 days ago

      Wouldn’t surprise me if an LLM trained on records of chess moves made good chess moves. I just wouldn’t expect the deployed version of ChatGPT to generate coherent chess moves based on the general text it’s been trained on.

      • nednobbins@lemm.ee · 2 days ago

        I wouldn’t either but that’s exactly what lmsys.org found.

        That blog post had ratings between 858 and 1169. Those are slightly higher than the average rating of human users on popular chess sites. Their latest leaderboard shows them doing even better.

        https://lmarena.ai/leaderboard has one of the Gemini models with a rating of 1470. That’s pretty good.
