GitHub is introducing rate limits for unauthenticated pulls, API calls, and web access

chaospatterns@lemmy.world · edit-2 2 months ago

traches@sh.itjust.works · 2 months ago

Probably getting hammered by ai scrapers

adarza@lemmy.ca · 2 months ago

you mean, doin’ what microsoft and their ai ‘partners’ do to others?

Also, as Microsoft appears to have recognized scraping for AI training as a problem, are you ceasing your own scraping activities on public code and the larger web or is this a case of double standards?

Ricky Rigatoni@lemm.ee · 2 months ago

Yeah but they’re allowed to do it because they have brazillions of dollars.

Ugurcan@lemmy.world · 2 months ago

They literally own GitHub. Brazillions well spent.

potatopotato@sh.itjust.works · 2 months ago

Everything seems to be. There was a period where you could kinda have a sane experience browsing over a VPN or otherwise using a cloud service IP range endpoint, but especially in the past 6 months or so things have gotten exponentially worse by the week. Everything is moving behind Cloudflare or other systems.

hackeryarn@lemmy.world · 2 months ago

If Microsoft knows how to do one thing well, it’s killing a successful product.

henfredemars@infosec.pub · 2 months ago

I came here looking for this comment. They bought the service to destroy it. It’s kind of their thing.

douglasg14b@lemmy.world · 2 months ago

Github has literally never been doing better. What are you talking about??

ZeroOne@lemmy.world · 2 months ago

We are talking about EEE

Boomer Humor Doomergod@lemmy.world · 2 months ago

RIP Skype

adarza@lemmy.ca · 2 months ago

we could have had bob or clippy instead of ‘cortana’ or ‘copilot’

Gork@lemm.ee · 2 months ago

Microsoft really should have just leaned into it and named it Clippy again.

Boomer Humor Doomergod@lemmy.world · 2 months ago

If Cortana was named Bob I don’t think people would have less of a problem with it

midori matcha@lemmy.world · 2 months ago

Github is owned by Microsoft, so don’t worry, it’s going to get worse

Lv_InSaNe_vL@lemmy.world · edit-2 2 months ago

I honestly don’t really see the problem here. This seems to mostly be targeting scrapers.

For unauthenticated users you are limited to public data only and 60 requests per hour, or 30k if you’re using Git LFS. And for authenticated users it’s 60k/hr.
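For a sense of scale, the hourly figures quoted above can be turned into an average gap between requests. This is only back-of-the-envelope arithmetic on the numbers in the comment, not anything GitHub publishes in this form; the function name is made up for illustration, and it assumes requests are spread evenly across the hour, which real clients rarely do.

```python
def min_interval_seconds(limit_per_hour: int) -> float:
    """Average spacing between requests that exactly uses up an hourly limit."""
    if limit_per_hour <= 0:
        raise ValueError("limit_per_hour must be positive")
    return 3600 / limit_per_hour


# Figures as quoted in the comment above (not independently verified here).
for label, limit in [("unauthenticated", 60), ("Git LFS", 30_000), ("authenticated", 60_000)]:
    print(f"{label}: one request every {min_interval_seconds(limit):.2f} s on average")
```

At 60 requests per hour that works out to one request per minute on average, so a short burst (a page load, a clone, a few list updates) can use up a large share of the hour.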

What could you possibly be doing besides scraping that would hit those limits?

chaospatterns@lemmy.world · edit-2 2 months ago

You might be behind a shared IP with NAT or CG-NAT that shares that limit with others, or might be fetching files from raw.githubusercontent.com as part of an update system that doesn’t have access to browser credentials, or Git cloning over https:// to avoid having to unlock your SSH key every time, or cloning a Git repo with submodules that separately issue requests. An hour is a long time. Imagine if you let uBlock Origin update filter lists, then you git clone something with a few modules, and so does your coworker and now you’re blocked for an entire hour.
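The shared-IP scenario can be sketched with a toy per-IP limiter. This is a minimal illustration, assuming a fixed one-hour window keyed on the source IP; how GitHub actually windows or keys its limits is not stated in the thread, and the class name, IP address, and request counts below are all hypothetical.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Toy per-IP limiter: at most `limit` requests per fixed time window."""

    def __init__(self, limit: int, window_seconds: int = 3600):
        self.limit = limit
        self.window_seconds = window_seconds
        self._counts = defaultdict(int)  # (ip, window index) -> requests seen

    def allow(self, ip: str, now: float) -> bool:
        key = (ip, int(now // self.window_seconds))
        if self._counts[key] >= self.limit:
            return False  # budget for this window is already spent
        self._counts[key] += 1
        return True


limiter = FixedWindowLimiter(limit=60)
shared_ip = "203.0.113.7"  # documentation-range address standing in for a NAT exit

# Hypothetical counts: you update filter lists and clone a repo (35 requests),
# then a coworker behind the same NAT does something similar (30 requests).
yours = sum(limiter.allow(shared_ip, now=0) for _ in range(35))
coworker = sum(limiter.allow(shared_ip, now=10) for _ in range(30))
print(yours, coworker)
```

Neither person comes close to 60 requests alone, but the limiter only sees one address, so the coworker's last few requests are refused until the next window starts.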

MangoPenguin@lemmy.blahaj.zone · 2 months ago

60 requests per hour per IP could easily be hit from say, uBlock origin updating filter lists in a household with 5-10 devices.

Disregard3145@lemmy.world · 2 months ago

I hit those many times when signed out just scrolling through the code. The front end must be sending off tonnes of background requests

Lv_InSaNe_vL@lemmy.world · 2 months ago

This doesn’t include any requests from the website itself

onlinepersona@programming.dev · 2 months ago

I see the “just create an account” and “just login” crowd have joined the discussion. Some people will defend a monopolist no matter what. If github introduced ID checks à la Google or required a Microsoft account to login, they’d just shrug and go “create a Microsoft account then, stop bitching”. They don’t realise they are being boiled and don’t care. Consoomer behaviour.

Anti Commercial-AI license

calcopiritus@lemmy.world · 2 months ago

Or we just realize that GitHub without logging in is a service we are getting for free. And when there’s something free, there’s someone trying to exploit it. Using GitHub while logged in is also free and has none of these limits, while allowing them to block exploiters much more easily.

onlinepersona@programming.dev · 2 months ago

I would like to remind you that you are arguing for a monopolist. I’d agree with you if it were for a startup or mid-sized company that had lots of competition and was providing a good product being abused by competitors or users. But Github has a quasi-monopoly, is owned by a monopolist that is part of the reason other websites are being bombarded by requests (aka, they are part of the problem), and you are sitting here arguing that more people should join the monopoly because of an issue they created.

Can you see the flaws in reasoning in your statements?

calcopiritus@lemmy.world · 2 months ago

No. I cannot find the flaws in my reasoning. Because you are not attacking my reasoning, you are saying that I am on the side of the bad people, and the bad people are bad, and you are opposed to the bad people, therefore you are right.

The world is more than black or white. GitHub rate-limiting non-logged-in users makes sense, and is the expected result in the age of web scraping for LLM training.

Yes, the parent company of GitHub also does web scraping for the purpose of training LLMs. I don’t see what that has to do with defending themselves from other scrapers.

onlinepersona@programming.dev · edit-2 2 months ago

Company creates problem. Requires users to change because of created problem. You defend company creating problem.

That’s the logical flaw.

If you see no flaws in defending a monopolist, well, you cannot be helped then.

calcopiritus@lemmy.world · 2 months ago

I don’t think Microsoft invented scraping. Or LLM training.

Also, GitHub doesn’t have an issue with Microsoft scraping its data. They can just directly access whatever data they want. And rate-limiting non-logged-in accounts won’t affect Microsoft’s LLM training at all.

I’m not defending a monopolist because of monopolist actions. First of all, because GitHub doesn’t have any kind of monopoly. There are plenty of git forges. And second of all, how does this make their position on the market stronger? If anything, it makes it weaker.

sturlabragason@lemmy.world · edit-2 2 months ago

No no, no no no no, no no no no, no no there’s no limit

https://forgejo.org/

Xanza@lemm.ee · 2 months ago

Until there will be.

I think people are grossly underestimating the sheer size and significance of the issue at hand. Forgejo will very likely eventually get to the same point Github is at right now, and will have to employ some of the same safeguards.

FlexibleToast@lemmy.world · 2 months ago

Except Forgejo is open source and you can run your own instance of it. I do, and it’s great.

Xanza@lemm.ee · 2 months ago

That’s a very accurate statement which has absolutely nothing to do with what I’ve said. The fact of the matter is that those who generally seek to use a Github alternative do so because they dislike Microsoft or closed source platforms. Which is great, but those platforms with hosted instances see an overwhelmingly significant portion of users who visit because they choose not to selfhost. It’s a lifecycle.

1. Create cool software for free
2. Cool software gets popular
3. Release new features and improve free software
4. Lots of users use your cool software
5. Running software becomes expensive, monetize
6. Software becomes even more popular, single stream monetization no longer possible
7. Monetize more
8. Get more popular
9. Monetize more

By step 30 you’re selling everyone’s data and pushing resource restrictions because it’s expensive to run a popular service that’s generally free. That doesn’t change simply because people can selfhost if they want.

FlexibleToast@lemmy.world · 2 months ago

To me, this reads strongly like someone who is confidently incorrect. Your starting premise is incorrect. You are claiming Forgejo will do this. Forgejo is nothing but an open source project designed to self host. If you were making this claim about Codeberg, the project’s hosted version, then your starting premise would be correct. Obviously, they monetize Codeberg because they’re providing a service. That monetization feeds Forgejo development. They could also sell official support for people hosting their own instances of Forgejo. This is a very common thing that open source companies do…

Xanza@lemm.ee · 2 months ago

> Obviously, they monetize Codeberg because they’re providing a service. That monetization feeds Forgejo development. They could also sell official support for people hosting their own instances of Forgejo. This is a very common thing that open source companies do…

This is literally what I said in my original post. Free products must monetize, as they get larger they have to continue to monetize more and more because development and infrastructure costs continue to climb…and you butted in as if this somehow doesn’t apply to Forgejo and then literally listed examples of why it does. I mean, Jesus my guy.

> You are claiming Forgejo will do this.

I’m claiming that it is a virtual certainty of the age of technology that we live in that popular free products (like Github) eventually balloon into sizes which are unmanageable while maintaining a completely free model (especially without restriction), which then proceed to get even more popular at which time they have to find new revenue streams or die.

It’s what’s happened with Microsoft, Apple, Netflix, Hulu, Amazon Prime, Amazon Prime Video, Discord, Reddit, Emby, MongoDB, just about any CMS, CRM, or forum software, and is currently happening to Plex. I mean, the list is quite literally endless. You could list any large software company that provides a free or mostly free product and you’ll find a commercial product that they use to fund future development because their products become so popular and so difficult/costly to maintain they were forced into a monetization model to continue development.

Why you think Forgejo is the only exception to this natural evolution is beyond my understanding.

I’m fully aware of the difference between Codeberg and Forgejo. And Forgejo is a product and it’s exceptionally costly to build and maintain. Costs which will continue to rise as it has to change over time to suit more and more user needs. People seem to heavily imply that free products cost nothing to build, which is just insane.

I’ve been a FOSS developer for 25 years and a tech PM for almost 20. I speak with a little bit of authority here because it’s my literal wheelhouse.

FlexibleToast@lemmy.world · 2 months ago

That’s a huge wall of text to still entirely miss the point. Forgejo is NOT a free service. It is an open-source project that you can host yourself. Do you know what will happen if Forgejo ends up enshittifying? They’ll get forked. Why do I expect that? Because that’s literally how Forgejo was created. It forked Gitea. Why don’t I think that will happen any time soon? It has massive community buy-in, including the Fedora Project. You being a PM explains a lot about being confidently incorrect.

Xanza@lemm.ee · 2 months ago

> That’s a huge wall of text to still entirely miss the point.

So it makes sense that you didn’t read the part where I very specifically and intentionally touch on the subjects you bring up.

If you’re not going to read what people reply, then don’t even bother throwing your opinion around. Just makes you look like an idiot tbh.

ABetterTomorrow@lemm.ee · 2 months ago

Dude, this is cool!

furikuri@programming.dev · 2 months ago

Amazon’s AI crawler is making my git server unstable

End of the day someone still has to pay for those requests

yo_scottie_oh@lemmy.ml · 2 months ago

No, no limits, we’ll reach for the skyyyy

theunknownmuncher@lemmy.world · edit-2 2 months ago

LOL!!! RIP GitHub

EDIT: trying to compile any projects from source that use git submodules will be interesting. eg ROCm has more than 60 submodules to pull in 💀
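A rough way to check whether a project is at risk is to count the entries in its .gitmodules file before cloning. This is a sketch only: the sample file contents and helper names are invented, it ignores nested submodules, and it assumes each repository fetch costs one request against the limit, which the thread does not confirm (a clone over HTTPS may well issue more than one).

```python
import re
from typing import List

# Matches section headers such as: [submodule "libs/alpha"]
SUBMODULE_HEADER = re.compile(r'^\s*\[submodule\s+"([^"]+)"\]', re.MULTILINE)


def submodule_names(gitmodules_text: str) -> List[str]:
    """Return the submodule names declared in a .gitmodules file."""
    return SUBMODULE_HEADER.findall(gitmodules_text)


def exceeds_budget(gitmodules_text: str, limit_per_hour: int = 60,
                   requests_per_fetch: int = 1) -> bool:
    """True if the parent repo plus its direct submodules need more requests
    than the hourly limit. requests_per_fetch is an assumption, not a known value."""
    repos = 1 + len(submodule_names(gitmodules_text))
    return repos * requests_per_fetch > limit_per_hour


# Hypothetical .gitmodules contents for illustration.
SAMPLE_GITMODULES = """
[submodule "libs/alpha"]
    path = libs/alpha
    url = https://github.com/example/alpha.git
[submodule "libs/beta"]
    path = libs/beta
    url = https://github.com/example/beta.git
"""

print(submodule_names(SAMPLE_GITMODULES))
print(exceeds_budget(SAMPLE_GITMODULES))
```

Under that one-request-per-fetch assumption, a project with more than 60 submodules, as claimed for ROCm above, would go over an unauthenticated budget of 60 on its own.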

John Richard@lemmy.world · 2 months ago

Crazy how many people think this is okay, yet left Reddit cause of their API shenanigans. GitHub is already halfway to requiring signing in to view anything like Twitter (X).

plz1@lemmy.world · 2 months ago

They make you sign in to use search, on code anyways.

calcopiritus@lemmy.world · 2 months ago

Making API costs unbearable for a social media user is not the same as rate-limiting non-logged-in users.

You can still use GitHub without being logged in. You can still use GitHub without almost any limit on a free account.

You cannot even use reddit on a third party app with an account with reddit gold.

bigkahuna1986@lemmy.ml · 2 months ago

Just browsing GitHub I’ve hit this limit

adarza@lemmy.ca · 2 months ago

i’ve hit it many times so far… even as quick as the second page view (first internal link clicked) after more than a day or two since the last visit (yes, even with cleaned browser data or private window).

it’s fucking stupid how quick they are to throw up a roadblock.

k_rol@lemmy.ca · 2 months ago

Just browse authenticated, you won’t have that issue.

adarza@lemmy.ca · 2 months ago

that is not an acceptable ‘solution’ and opens up an entirely different and more significant can o’ worms instead.

Xanza@lemm.ee · 2 months ago

Then login.

Sunshine (she/her)@lemmy.ca · 2 months ago

!codeberg@programming.dev

atzanteol@sh.itjust.works · 2 months ago

The enshittification begins (continues?)…

kixik@lemmy.ml · 2 months ago

just now? :)

irelephant [he/him]@programming.dev · 2 months ago

It’s always blocked me from searching in Firefox when I’m logged out for some reason.

kevin____@lemm.ee · 2 months ago

Good thing git is “federated” by default.

ZeroOne@lemmy.world · 2 months ago

& then you have fossil which is github in a box

varnia@lemm.ee · 2 months ago

Good thing I moved all my repos from git[lab|hub] to Codeberg recently.

katy ✨@lemmy.blahaj.zone · 2 months ago

is authenticated like when you use a private key with git clone? stupid question i know

also this might be terrible if you subscribe to filter lists on raw github in ublock or adguard

chaospatterns@lemmy.world · 2 months ago

> is authenticated like when you use a private key with git clone

Yes

> also this might be terrible if you subscribe to filter lists on raw github in ublock or adguard

Yes, that’s exactly why this is quite problematic. There’s a lot of HTTPS Git pull remotes around and random software that uses raw.githubusercontent.com to fetch data. All of that is now subject to the 60 req/hr limit and not all of it will be easy to fix.
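For software that can be changed, one mitigation is to back off politely when a rate-limited response comes in. The sketch below follows the order of checks recommended in GitHub's REST API documentation (retry-after first, then x-ratelimit-reset when x-ratelimit-remaining is 0, otherwise wait at least a minute). Whether raw.githubusercontent.com or plain Git HTTPS endpoints send the same headers is an assumption here, and the function name is made up.

```python
from typing import Mapping, Optional


def retry_delay_seconds(status: int, headers: Mapping[str, str],
                        now: float) -> Optional[float]:
    """Seconds to wait before retrying, or None if the status is not 403/429.

    Simplification: every 403 is treated as a possible rate limit, although a
    403 can also mean a plain permission error.
    """
    if status not in (403, 429):
        return None
    lowered = {name.lower(): value for name, value in headers.items()}

    # 1. An explicit Retry-After in seconds wins (the HTTP-date form is skipped).
    try:
        return float(lowered["retry-after"])
    except (KeyError, ValueError):
        pass

    # 2. Budget exhausted: wait until the advertised reset time (epoch seconds).
    if lowered.get("x-ratelimit-remaining") == "0":
        try:
            return max(0.0, float(lowered["x-ratelimit-reset"]) - now)
        except (KeyError, ValueError):
            pass

    # 3. No usable hints: wait at least one minute.
    return 60.0
```

An updater would call this with the response status and headers and sleep for the returned time before trying again, instead of retrying in a tight loop and burning through the budget of everyone else behind the same IP.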

Updated rate limits for unauthenticated requests - GitHub Changelog