rjpower9000

📅 Joined in 2013

🔼 139 Karma

✍️ 35 posts

15 latest posts

(Replying to PARENT post)

Part of the land, 120 of the nearly 700 acres, is rented from a family who owns multiple farm properties and wants their fields weed-free with perfectly straight grids of crops, a deep-rooted tradition among Midwestern farming communities.

“They want that land to be clean corn and soybeans,” Bishop said. Before the restrictions, his father was growing organic corn and soybeans on part of the field and letting Bishop grow vegetables on the rest.

I've seen this mentioned elsewhere, but the idea that you'd force someone else to create a mono-crop desert, not even out of a sense of efficiency, but _just because it looks right_, is just so frustrating.

👤 rjpower9000 🕑 3mo 🔼 0 🗨️ 0

(Replying to PARENT post)

Not sure where I stumbled on this, but it's a fascinating historical article on how people were already using spreadsheets for task management and development back in the '80s.
👤 rjpower9000 🕑 3mo 🔼 0 🗨️ 0

(Replying to PARENT post)

Thanks for sharing. I ended up reading through one of these -- https://atonementlicensing.com/surviving-your-first-oracle-l... -- it's truly amazing/terrifying that it's so bad.

It's hard to imagine how a company could be more extractive than this.

👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0

(Replying to PARENT post)

I might have phrased this unclearly: I meant specifically the case of translating one symbol at a time from C to Rust. I certainly won't claim I've figured out any magic that makes the coding agents consistent!

Here you've got the advantage that you're repeating the same task over and over, so you can tweak your prompt as you go, and you've got the "spec" in the form of the C code there, so I think there's less to go wrong. It still did break things sometimes, but the fuzzing often caught it.

It does require careful prompting. In my first attempt Claude decided that some fields in the middle of an FFI struct weren't necessary. You can imagine the joy of trying to debug how a random pointer was changing to null after calling into a Rust routine that didn't even touch it. It was around then I knew the naive approach wasn't going to work.

The second attempt thus had a whole bunch of "port the whole struct or else" in the prompt: https://github.com/rjpower/zopfli/blob/master/port/RUST_PORT... .
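To make the failure mode concrete, here is a minimal sketch of why dropping a "useless" middle field breaks an FFI struct. The `Options` and `TruncatedOptions` types and their fields are hypothetical, not taken from Zopfli, and `offset_of!` assumes Rust 1.77 or newer.

```rust
use std::mem::{offset_of, size_of};
use std::os::raw::{c_int, c_void};

/// Mirror of a hypothetical C struct:
///   struct Options { int verbose; void *scratch; int iterations; };
/// Every field is kept, in order, even if the Rust side never reads it.
#[repr(C)]
pub struct Options {
    pub verbose: c_int,
    pub scratch: *mut c_void, // unused in Rust, but required for the layout
    pub iterations: c_int,
}

/// What an over-eager port might produce: the middle field is gone,
/// so every later field sits at a different offset than C expects.
#[repr(C)]
pub struct TruncatedOptions {
    pub verbose: c_int,
    pub iterations: c_int,
}

/// Byte offsets of `iterations` in the full and the truncated layout.
pub fn iterations_offsets() -> (usize, usize) {
    (
        offset_of!(Options, iterations),
        offset_of!(TruncatedOptions, iterations),
    )
}

/// Total sizes of the full and the truncated layout.
pub fn sizes() -> (usize, usize) {
    (size_of::<Options>(), size_of::<TruncatedOptions>())
}

// Compile-time guard: the build fails if the two layouts ever coincide.
const _: () = assert!(size_of::<Options>() > size_of::<TruncatedOptions>());
```

With the truncated layout, Rust code writing `iterations` would land on bytes that C treats as part of `scratch`, which is one way a pointer can change without the Rust routine ever naming it.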

In general I've found the agents to be a mixed bag, but overall positive if I use them in the right way. I find it works best for me if I use the agent as a sounding board to write down what I want to do anyway. I then have it write some tests for what should happen, and then I see how far it can go. If it's not doing something useful, I abort and just write things myself.

It does change your development flow a bit for sure. For instance, it's so much more important to have concrete test cases to force the agent to get it right; as you mention, otherwise it's easy for it to do something subtly broken.

For instance, I switched to tree-sitter from the clang API to do symbol parsing, and Claude wrote effectively all of it; in this case it was certainly much faster than writing it myself, even if I needed to poke it once or twice. This is sort of a perfect task for it though: I roughly knew what symbols should come out and in what order, so it was easy to validate the LLM was going in the right direction.

I've certainly had them go the other way, reporting back that "I removed all of the failing parts of the test, and thus the tests are passing, boss" more times than I'd like. I suspect the constrained environment again helped here: there's less wiggle room for the LLM to misinterpret the situation.

👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0

(Replying to PARENT post)

It's in between. It's more C-like than the Claude port, but it's more Rust-y than c2rust. How much depends on how fine-grained you want to make your port and how you want to prompt your LLM. Inside functions and for internal symbols, the LLM is free to use more idiomatic constructions and structures. But since the goal was to test the effectiveness of the fuzz testing, using the LLM to do the symbol translation is more of an implementation detail.

You could certainly try using c2rust to do the initial translation, and it's a reasonable idea, but I didn't find the LLMs really struggled with this part of the task, and there's certainly more flexibility this way. c2rust seemed to choke on some simple functions as well, so I didn't pursue it further.

And of course for external symbols, you're constrained by the C API, so how much leeway you have depends on the project.

You can also imagine having the LLM produce more idiomatic code from the beginning, but that can be hard to square with the incremental symbol-by-symbol translation.

👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0

(Replying to PARENT post)

It seems feasible, but I haven't thought enough about it. One challenge is that as you Rustify the code, it's harder to keep the 1-1 mapping with C interfaces. Sometimes to make it more Rust-y, you might want an internal function or structure to change. You then lose your low-level fuzz tests.

That said, you could have the LLM write equivalence tests, and you'd still have the top-level fuzz tests for validation.
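As a rough illustration of what such an equivalence test could look like, the sketch below compares a C-style reference against an idiomatic rewrite on pseudo-random inputs. All names here are hypothetical; in a real port the reference would be an `extern "C"` call into the original library and the inputs would come from a fuzzer rather than this hand-rolled xorshift generator.

```rust
/// Stand-in for the original C routine, written in C style with indices.
pub fn match_len_reference(a: &[u8], b: &[u8]) -> usize {
    let mut i = 0;
    while i < a.len() && i < b.len() && a[i] == b[i] {
        i += 1;
    }
    i
}

/// Stand-in for the more idiomatic Rust version of the same symbol.
pub fn match_len_ported(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

/// Tiny deterministic xorshift generator so the sketch needs no crates.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        // The state must be non-zero for xorshift to produce anything.
        XorShift(seed.max(1))
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Random buffer of at most `max_len` bytes over a two-symbol alphabet,
    /// so that long common prefixes actually occur.
    pub fn bytes(&mut self, max_len: usize) -> Vec<u8> {
        let len = (self.next_u64() as usize) % (max_len + 1);
        (0..len).map(|_| (self.next_u64() % 2) as u8).collect()
    }
}

/// Returns the first input pair on which the two versions disagree, if any.
pub fn first_mismatch(iterations: usize, seed: u64) -> Option<(Vec<u8>, Vec<u8>)> {
    let mut rng = XorShift::new(seed);
    for _ in 0..iterations {
        let a = rng.bytes(64);
        let b = rng.bytes(64);
        if match_len_reference(&a, &b) != match_len_ported(&a, &b) {
            return Some((a, b));
        }
    }
    None
}
```

Returning the failing input instead of panicking keeps the counterexample around, which is useful if it is fed back to the LLM as a concrete test case.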

So I wouldn't say it's impossible, just a bit harder to mechanize directly.

👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0

(Replying to PARENT post)

I'm the author. That's a great idea. I didn't explore that for this session but it's worth trying.

I didn't measure consistently, but I would guess 60-70% of the symbols ported easily, with either one-shot or trivial edits, 20% Gemini managed to get there but ended up using most of its attempts, and 10% it just struggled with.

The 20% would be good candidates for multiple generations & certainly consumed more than 20% of the porting time.

👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0

(Replying to PARENT post)

That's an interesting idea. I hadn't thought about it, but it would be interesting to consider doing something similar for the porting task. I don't know enough about the space: could you have an LLM write a formal spec for a C function and then validate that the translated function has the same properties?

I guess I worry it would be hard to separate out the "noise", e.g. the C code touches some memory on each call so now the Rust version has to as well.

👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0

(Replying to PARENT post)

Thanks for sharing, I did not know about that!

Indeed, this is exactly the type of subtle case you'd worry about when porting. Fuzzing would be unlikely to discover a bug that only occurs on giant inputs or needs a special configuration of lists.
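A toy sketch of the "giant inputs" case, with hypothetical functions rather than anything from the port: the buggy version narrows a counter to `u16`, so it agrees with the reference on every input a typical fuzzer generates and only diverges once more than 65,535 zero bytes are seen.

```rust
/// Reference behavior: count the zero bytes in the input.
pub fn count_zeros_reference(data: &[u8]) -> usize {
    data.iter().filter(|&&b| b == 0).count()
}

/// Hypothetical buggy port: the counter was narrowed to u16, so it
/// silently wraps around after 65,535 zeros.
pub fn count_zeros_buggy(data: &[u8]) -> usize {
    let mut n: u16 = 0;
    for &b in data {
        if b == 0 {
            n = n.wrapping_add(1);
        }
    }
    n as usize
}

/// True if both versions agree on an all-zero buffer of `len` bytes.
pub fn agree_on_zero_buffer(len: usize) -> bool {
    let data = vec![0u8; len];
    count_zeros_reference(&data) == count_zeros_buggy(&data)
}
```

Here `agree_on_zero_buffer` holds for any length up to 65,535 and fails at 65,536, so a fuzzer with a small maximum input length would never report the difference.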

In practice I think it works out okay because most of the time the LLM has written correct code, and when it doesn't it's introduced a dumb bug that's quickly fixed.

Of course, if the LLM introduces subtle bugs, that's even harder to deal with...

👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0

(Replying to PARENT post)

I was pretty hand-wavy when I made the original comment. I was thinking implicitly of things like the Python sub-interpreter proposal, which had strong pushback from the NumPy engineers at the time (I don't know the current status, whether it's a good idea, etc., just something that came to mind).

https://lwn.net/Articles/820424/

The objections are of course reasonable, but I kept thinking this shouldn't be as big a problem in the future. A lot of times we want to make some changes that aren't _quite_ mechanical, and if they hit a large part of the code base, it's hard to justify. But if we're able to defer these types of cleanups to LLMs, it seems like this could change.

I don't want a world with no API stability of course, and you still have to design for compatibility windows, but it seems like we should be able to do better in the future. (More so in mono-repos, where you can hit everything at once).

Exactly as you write, the idea with prompts is that they're directly actionable. If I want to make a change to API X, I can test the prompt against some projects to validate that agents handle it well, even do direct prompt optimization, and then share it with end users.

👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0

(Replying to PARENT post)

Fixed, thanks!
👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0

(Replying to PARENT post)

I had a similar experience. I finally got around to reading Ulysses when I had some downtime between jobs and pushed my way through it. I ended up referring to https://www.ulyssesguide.com/ as I went along which helped substantially: the extra context and discussion made me appreciate the novel more.

I came to the conclusion that while I didn't necessarily _like it_ per se, I had to acknowledge how absurdly talented Joyce was, and that there was some justification for it being on the top books lists. My feeling was that the lack of enjoyment wasn't a fault of the book but more that I didn't have the background to appreciate it. Though there were also some chapters where most people agree Joyce was just trying too hard and it shows.

👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0

(Replying to PARENT post)

The one in which I spend way too much time building a crappy version of Claude Code, ending up with yet another Rust port of Zopfli, but different from that of Syzygy, and I promise I didn't make up any of these words.
👤 rjpower9000 🕑 4mo 🔼 0 🗨️ 0