rjpower9000
Joined in 2013
139 Karma
35 posts
(Replying to PARENT post)
It's hard to imagine how a company could be more extractive than this.
(Replying to PARENT post)
Here you've got the advantage that you're repeating the same task over and over, so you can tweak your prompt as you go, and you've got the "spec" in the form of the C code there, so I think there's less to go wrong. It still did break things sometimes, but the fuzzing often caught it.
It does require careful prompting. In my first attempt Claude decided that some fields in the middle of an FFI struct weren't necessary. You can imagine the joy of trying to debug how a random pointer was changing to null after calling into a Rust routine that didn't even touch it. It was around then I knew the naive approach wasn't going to work.
The second attempt thus had a whole bunch of "port the whole struct or else" in the prompt: https://github.com/rjpower/zopfli/blob/master/port/RUST_PORT... .
In general I've found the agents to be a mixed bag, but overall positive if I use them in the right way. I find it works best for me if I use the agent as a sounding board to write down what I want to do anyway. I then have it write some tests for what should happen, and then I see how far it can go. If it's not doing something useful, I abort and just write things myself.
It does change your development flow a bit for sure. For instance, it's so much more important to have concrete test cases that force the agent to get it right; as you mention, otherwise it's easy for it to do something subtly broken.
For instance, I switched to tree-sitter from the clang API to do symbol parsing, and Claude wrote effectively all of it; in this case it was certainly much faster than writing it myself, even if I needed to poke it once or twice. This is sort of a perfect task for it though: I roughly knew what symbols should come out and in what order, so it was easy to validate the LLM was going in the right direction.
I've certainly had them go the other way, reporting back that "I removed all of the failing parts of the test, and thus the tests are passing, boss" more times than I'd like. I suspect the constrained environment again helped here: there's less wiggle room for the LLM to misinterpret the situation.
(Replying to PARENT post)
You could certainly try using c2rust to do the initial translation, and it's a reasonable idea, but I didn't find the LLMs really struggled with this part of the task, and there's certainly more flexibility this way. c2rust seemed to choke on some simple functions as well, so I didn't pursue it further.
And of course for external symbols, you're constrained by the C API, so how much leeway you have depends on the project.
You can also imagine having the LLM produce more idiomatic code from the beginning, but that can be hard to square with the incremental symbol-by-symbol translation.
(Replying to PARENT post)
That said, you could have the LLM write equivalence tests, and you'd still have the top-level fuzz tests for validation.
So I wouldn't say it's impossible, just a bit harder to mechanize directly.
(Replying to PARENT post)
I didn't measure consistently, but I would guess 60-70% of the symbols ported easily, with either one-shot or trivial edits, 20% Gemini managed to get there but ended up using most of its attempts, and 10% it just struggled with.
The 20% would be good candidates for multiple generations & certainly consumed more than 20% of the porting time.
(Replying to PARENT post)
I guess I worry it would be hard to separate out the "noise", e.g. the C code touches some memory on each call so now the Rust version has to as well.
(Replying to PARENT post)
Indeed, this is exactly the type of subtle case you'd worry about when porting. Fuzzing would be unlikely to discover a bug that only occurs on giant inputs or needs a special configuration of lists.
In practice I think it works out okay because most of the time the LLM has written correct code, and when it doesn't it's introduced a dumb bug that's quickly fixed.
Of course, if the LLM introduces subtle bugs, that's even harder to deal with...
(Replying to PARENT post)
https://lwn.net/Articles/820424/
The objections are of course reasonable, but I kept thinking this shouldn't be as big a problem in the future. A lot of times we want to make some changes that aren't _quite_ mechanical, and if they hit a large part of the code base, it's hard to justify. But if we're able to defer these types of cleanups to LLMs, it seems like this could change.
I don't want a world with no API stability of course, and you still have to design for compatibility windows, but it seems like we should be able to do better in the future. (More so in mono-repos, where you can hit everything at once).
Exactly as you write, the idea with prompts is that they're directly actionable. If I want to make a change to API X, I can test the prompt against some projects to validate agents handle it well, even doing direct prompt optimization, and then share it with end users.
(Replying to PARENT post)
I came to the conclusion that while I didn't necessarily _like it_ per se, I had to acknowledge how absurdly talented Joyce was, and that there was some justification for its place in the top books list. My feeling was that the lack of enjoyment wasn't a fault of the book but more that I didn't have the background to appreciate it. Though there were also some chapters where most people agree Joyce was just trying too hard, and it shows.
(Replying to PARENT post)
"They want that land to be clean corn and soybeans," Bishop said. Before the restrictions, his father was growing organic corn and soybeans on part of the field and letting Bishop grow vegetables on the rest.
I've seen this mentioned elsewhere, but the idea that you'd force someone else to create a mono-crop desert, not even out of a sense of efficiency, but _just because it looks right_, is just so frustrating.