• Capsicones@lemmy.blahaj.zone
    English
    arrow-up
    97
    arrow-down
    1
    ·
    2 days ago

    There seems to be some confusion here about what PTX is: it does not bypass the CUDA platform at all, nor does this diminish NVIDIA’s monopoly. CUDA is a programming environment for NVIDIA GPUs, but many people say “CUDA” to mean the C/C++ extension within it (CUDA can be thought of as a C/C++ dialect here). PTX is NVIDIA-specific and sits at a similar level to LLVM’s IR. If anything, DeepSeek is more dependent on NVIDIA than everyone else, since PTX is tightly tied to their specific GPUs. Things like ZLUDA (an effort to run CUDA code on AMD GPUs) won’t work. This is not a feel-good story.

    • pr06lefs@lemmy.ml
      English
      arrow-up
      27
      ·
      2 days ago

      This specific tech is, yes, NVIDIA-dependent. The game changer is that a team was able to beat the big players with less than 10 million dollars. They did it by operating at a low level of NVIDIA’s stack, practically machine code. What this team has done, another could do. Building for the AMD GPU ISA would be tough but not impossible.

    • Eager Eagle@lemmy.world
      English
      arrow-up
      11
      ·
      edit-2
      2 days ago

      I don’t think anyone is saying CUDA as in the platform, but as in the API for higher level languages like C and C++.

      PTX is a close-to-metal ISA that exposes the GPU as a data-parallel computing device and, therefore, allows fine-grained optimizations, such as register allocation and thread/warp-level adjustments, something that CUDA C/C++ and other languages cannot enable.
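
      To make that concrete, here is a minimal sketch of what dropping down to PTX can look like from inside CUDA C++, using inline `asm`. This is not DeepSeek's code; the names `add_ptx`, `add_kernel`, and `add_on_gpu` are made up for illustration, and the single `add.s32` instruction only shows the mechanism, not a real optimization.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Adds two ints with one hand-written PTX instruction instead of the C++ `+`.
// "=r" / "r" bind 32-bit registers to the output and input operands.
__device__ int add_ptx(int x, int y) {
    int r;
    asm("add.s32 %0, %1, %2;" : "=r"(r) : "r"(x), "r"(y));
    return r;
}

// One thread per element; the arithmetic itself goes through the PTX above.
__global__ void add_kernel(const int* a, const int* b, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = add_ptx(a[i], b[i]);
    }
}

// Hypothetical host helper: copies inputs to the GPU, runs the kernel, and
// copies the sums back. Assumes a and b have the same length.
// Error checking is omitted to keep the sketch short.
std::vector<int> add_on_gpu(const std::vector<int>& a, const std::vector<int>& b) {
    int n = static_cast<int>(a.size());
    std::vector<int> out(n);
    if (n == 0) return out;  // a zero-block launch would be an error

    size_t bytes = n * sizeof(int);
    int *da, *db, *dout;
    cudaMalloc(&da, bytes);
    cudaMalloc(&db, bytes);
    cudaMalloc(&dout, bytes);
    cudaMemcpy(da, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, b.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    add_kernel<<<blocks, threads>>>(da, db, dout, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(out.data(), dout, bytes, cudaMemcpyDeviceToHost);
    cudaFree(da);
    cudaFree(db);
    cudaFree(dout);
    return out;
}

int main() {
    std::vector<int> sums = add_on_gpu({1, 2, 3}, {10, 20, 30});
    for (int s : sums) printf("%d ", s);
    printf("\n");
    return 0;
}
```

      Built with `nvcc` and run on a machine with an NVIDIA GPU, this should print `11 22 33`. Note that the PTX is still compiled by NVIDIA's toolchain into the GPU's actual machine code, so working at this level stays inside the CUDA ecosystem rather than going around it.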

  • filister@lemmy.world
    English
    arrow-up
    48
    arrow-down
    2
    ·
    3 days ago

    What is amazing in this case is that they achieved this while spending a fraction of what OpenAI is paying for inference.

    Plus they are a lot cheaper too. But I am pretty sure that the American government will ban them in no time, citing national security concerns, etc.

    Nevertheless, I think we need more open source models.

    Not to mention that NVIDIA also needs to be brought to earth.

    • demesisx@infosec.pub
      English
      arrow-up
      11
      arrow-down
      4
      ·
      3 days ago

      Even if they get banned, any startup could replicate their work if it is truly open source. The best thing about their solution is that it breaks the CUDA monopoly that NVDA has enjoyed. Buy your puts when NVDA bounces because that stock is GOING DOWN. There’s no world where a company that makes GPUs is worth more than both Apple and Microsoft. It’s inevitable.

      • Pieisawesome@lemmy.world
        English
        arrow-up
        12
        ·
        2 days ago

        It’s written in PTX, NVIDIA’s instruction set, which is part of the CUDA ecosystem.

        That’s hardly going to affect NVIDIA.

        • demesisx@infosec.pub
          English
          arrow-up
          3
          ·
          edit-2
          2 days ago

          It certainly does.

          Until last week, you absolutely NEEDED an NVidia GPU equipped with CUDA to run all AI models.

          Today, that is simply not true. (watch the video at the end of this comment)

          I watched this video and my initial reaction to this news was validated and then some: this video made me even more bearish on NVDA.

          Edit: corrected and redacted.

  • Corngood@lemmy.ml
    English
    arrow-up
    35
    ·
    3 days ago

    This sounds like good engineering, but surely there’s not a big gap with their competitors. They are spending tens of millions on hardware and energy, and this is something a handful of (very good) programmers should be able to pull off.

    Unless I’m missing something, it’s the sort of thing that’s done all the time in console games.

    • KingRandomGuy@lemmy.world
      English
      arrow-up
      1
      ·
      1 day ago

      Part of this was an optimization that was necessary due to their resource restrictions. Chinese firms can only purchase H800 GPUs instead of H200 or H100. These have much slower inter-GPU communication (less than half the bandwidth!) as a result of export bans by the US government, so this optimization was done to try and alleviate some of that bottleneck. It’s unclear to me if this type of optimization would make as big of a difference for a lab using H100s/H200s; my guess is that it probably matters less.

  • mesamune@lemmy.world
    English
    arrow-up
    8
    arrow-down
    4
    ·
    3 days ago

    Reminds me of Bitcoin mining and how ASIC miners overtook graphics card mining practically overnight. It would not surprise me if this goes the same way.