Researchers have found that large language models (LLMs) tend to parrot buggy code when tasked with completing flawed snippets.
That is to say, when shown a snippet of shoddy code and asked to fill in the blanks, AI models are just as likely to repeat the mistake as to fix it.
I don’t see why anyone would expect anything else out of a “what is the most likely way to continue this” algorithm.
To be fair, if you give me a shit code base and expect me to add features with no time to fix the existing ones, I will also just add more shit on the pile. Because obviously that’s how you want your codebase to look.
And if you do that without saying you want to refactor, I likely won't stand up for you in the next round of layoffs. If I wanted to make the codebase worse, I'd use AI.
I’ve been in this scenario and I didn’t wait for layoffs. I left and applied my skills where shit code is not tolerated, and quality is rewarded.
But in this hypothetical, we didn't get this shit code because management encouraged the right behavior and gave people time to make it right. They're going to keep the yes men and fire the "unproductive" ones. (And I fully know that adding to the pile isn't productive in the long run, but what does the management overseeing this mess think?)
Fair.
That said, we have a lot of awful code at my org, yet we also have time to fix it. Most of the crap came from the “move fast and break things” period, but now we have the room to push back a bit.
There’s obviously a balance, and as a lead, I’m looking for my devs to push back and make the case for why we need the extra time. If you convince me, I’ll back you up and push for it, and we’ll probably get the go-ahead. I’m not going to approve everything though because we can’t fix everything at once. But if you ignore the problems and trudge along anyway, I’ll be disappointed.
I bet the people you work with are very happy to have you as a lead.
I hope so.
This is what I was thinking: if you give the code to a person and ask them to finish it, they would do the same.
If you instead ask the LLM for insights about the code, it might tell you what's wrong with it.
As a software developer I’ve never used AI to write code, but several of my friends use it daily and they say it really helps them in their jobs. To explain this to non-programmers, they don’t tell it “Write some code” and then watch TV while it does their job. Coding involves a lot of very routine busy work that’s little more than typing. AI can generate approximately what they want, which they then edit, and according to them this helps them work a lot faster.
A hammer is a useful tool, even though it can't build a building by itself and is really shitty as a drill. I look at AI the same way.
100%. As a solo dev who used to work corporate, I compare it to having a jr engineer who completes every task instantly. If you give it something well-documented and not too complex, it'll be perfect. If you give it something more complex or newer tech, it could work, but may have some mistakes or ill-advised shortcuts.
I've also found it pretty good for when a dependency I'm evaluating has shit documentation. Not always correct, but sometimes it'll spit out some APIs I didn't notice.
Edit: Oh, I should also mention that I've found TDD works pretty well with AI. Since I'm building the tests anyway, they often give the AI a good description of what I'm looking for and save some time.
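For example, tests like these (a toy sketch; `parse_duration` is a made-up function, and the implementation shown is the kind of thing the AI fills in) already tell it exactly what "done" looks like:

```python
import re

import pytest


def parse_duration(s: str) -> int:
    """Made-up function; this body is what the AI gets asked to fill in."""
    m = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", s)
    if not s or not m:
        raise ValueError(f"bad duration: {s!r}")
    minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return minutes * 60 + seconds


# The tests were written first; they double as the spec handed to the AI.
def test_parses_minutes_and_seconds():
    assert parse_duration("1m30s") == 90


def test_parses_bare_seconds():
    assert parse_duration("45s") == 45


def test_rejects_garbage():
    with pytest.raises(ValueError):
        parse_duration("banana")
```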
I've found it okay for getting a general feel for stuff, but I've also been given insidiously bad code: functions and data structures that look similar enough to real stuff but are deeply wrong or non-existent.
Mmm it sounds like you’re using it in a very different way to me; by the time I’m using an LLM, I generally have way more than a general feel for what I’m looking for. People rag on ai for being a “fancy autocomplete”, but that’s literally what I like to use it for. I’ll feed it a detailed spec for what I need, give it a skeleton function with type definitions, and tell the ai to fill it in. It generally fills in basic functions pretty well with that level of definition (ymmv depending on the scope of the function).
This lets me focus more on the code design/structure and validation, while the ai handles a decent amount of grunt work. And if it does a bad job, I would have written the spec and skeleton anyways, so it’s more like bonus if it works. It’s also very good at imitation, so it can help to avoid double-work with similar functionalities.
A kind of shortened/naive example of how I use it:
```rust
/* Example of another db update function within the app */
/* UnifiedEventUpdate and UnifiedEvent type definitions */
```

Help me fill in this function:

```rust
/// Updates event properties, and children:
/// - If `event.updated` is newer than existing, update as normal
/// - If `event.updated` is older than existing, error
/// - If no `event.updated` is provided, assume updated to be now()
/// For updating Content(s):
/// - If `content.id` exists, update the existing content
/// - If `content.id` does not exist, create a new content
/// - If an existing content isn't present, delete the content
pub fn update_event(
    conn: &mut Conn,
    event: UnifiedEventUpdate,
) -> Result<UnifiedEvent, Error> {
```
> Coding involves a lot of very routine busy work that's little more than typing.
That's right. You watch it type it out, and right where it gets to the important part you realize that's not what you meant at all, so you hit the stop button. Then you modify the prompt and repeat that one more time. That's when you realize there are so many things it's not even considering, which gives you the satisfaction that your job is still secure. Then you write a more focused prompt for one aspect of the problem and take whatever good-enough bullshit it spewed as a starting point for the manual work. Rinse and repeat.
Exactly. I have a coworker who uses it effectively.
Personally, I’ve been around the block so it’s usually faster for me to just do the busy work myself. I have lots of tricks for manipulating text quickly (I’m quite proficient with vim), so it’s not a big deal to automate turning JSON into a serializer class or copy and modify a function a bunch of times to build out a bunch of controllers or something. What takes others on my team 30 min I can sometimes get done in 5 through the power of regex or macros.
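(A toy version of that JSON-to-serializer busy work, done as a throwaway script instead of vim macros; the sample data and `ArticleSerializer` name are made up for illustration:)

```python
# Toy sketch: turn a JSON sample into a DRF-style serializer class to paste in.
import json

sample = '{"id": 1, "title": "hello", "published": true}'

FIELD_TYPES = {bool: "BooleanField", int: "IntegerField", str: "CharField"}

lines = ["class ArticleSerializer(serializers.Serializer):"]
for name, value in json.loads(sample).items():
    lines.append(f"    {name} = serializers.{FIELD_TYPES[type(value)]}()")
print("\n".join(lines))
```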
But at the end of the day, it doesn’t really matter what tools you use because you’re not being paid for your typing speed or ability to do mundane work quickly, you’re being paid to design and support complex software.
We have a handful of Python tools that we require to adhere to PEP 8 formatting, and we have Jenkins pipeline jobs to validate that and block merge requests if any of the code isn't properly formatted. I haven't personally tried it yet, but I wonder if these AIs might be good at fixing up this sort of formatting lint.
Why bother with AI for that? https://black.readthedocs.io/en/stable/index.html
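A deterministic formatter also slots straight into a pipeline like the one described. A minimal sketch of that kind of gate (`black --check` exits nonzero when any file would be reformatted):

```python
# Minimal CI gate sketch: fail the build if Black would change anything.
import subprocess
import sys

result = subprocess.run(["black", "--check", "."])
sys.exit(result.returncode)  # nonzero blocks the merge request
```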
Fancy autocomplete autocompletes whatever it's given. Tech bros: *surprised Pikachu*.
It's that time again… for LLMentalist: https://softwarecrisis.dev/letters/llmentalist/
Seriously, it should be linked to every mention of LLM anywhere.
If you ask the LLM for code, it will often give you buggy code, but if you run it, get an error, and then tell the AI what error you had, it will often fix it. So that is cool.
Won't always work though…
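Mechanically, that loop is roughly this (a sketch only; `ask_llm` is a made-up stand-in for whatever chat API or UI you're pasting into):

```python
# Sketch of the paste-the-error-back loop. ask_llm() is hypothetical;
# the real step is you copying the traceback into the chat window.
import subprocess
import sys


def ask_llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for your chat API/UI")


prompt = "Write a Python script that does X"
for _ in range(3):  # give up after a few rounds
    code = ask_llm(prompt)
    proc = subprocess.run([sys.executable, "-c", code],
                          capture_output=True, text=True)
    if proc.returncode == 0:
        break  # it ran cleanly
    # feed the error straight back and ask for a fix
    prompt = f"That code failed with:\n{proc.stderr}\nPlease fix it."
```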
AutoComplete 2, Wasted Electricity Boogaloo!
What a waste of time. Both the article and the researchers.
Literally by the time their research was published, it was using irrelevant models, on top of the fact that, yeah, that's how LLMs work. That would be obvious from five minutes of using them.