snickell

📅 Joined in 2013

🔼 396 Karma

✍️ 68 posts

15 latest posts

(Replying to PARENT post)

Parrot? Sure, but a parrot operating in a high-dimensional manifold. This breaks naive human assumptions.
👤snickell🕑2d🔼0🗨️0

(Replying to PARENT post)

I really like "agent-assisted coding". I think the word "vibe" is always gonna swing in a YOLO direction, so having different words is helpful for differentiating fundamentally different applications of the same agentic coding tools.
👤snickell🕑8d🔼0🗨️0

(Replying to PARENT post)

This is a brilliant idea; I really hope somebody on the iPhone 18 design team reads it. I think there's huge pent-up demand for a mini model: many of us would pay more for it than for the large versions.
👤snickell🕑1mo🔼0🗨️0

(Replying to PARENT post)

On an iPhone 12 mini, wishing I hadn't upgraded to iOS 26, because now my phone is notably laggy. Word to the wise: I use swipe input and now consider it unusable due to extreme lag.

The physical aspect I can't give up is that I can hold the phone with my thumb on the bottom and my middle finger on the top, and scroll with my index finger to read. I wish I could buy that capability in a new iPhone, maybe one even slightly smaller.

Time to go find out if there’s even a way to downgrade, oof this is slow.

👤snickell🕑1mo🔼0🗨️0

(Replying to PARENT post)

The distance between Earth and Mars varies between roughly 180 and 1,340 light-seconds.
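A quick back-of-the-envelope check, assuming approximate orbital extremes of ~55M km at closest approach and ~400M km near conjunction:

    # One-way light delay between Earth and Mars at the orbital extremes.
    C_KM_PER_S = 299_792.458  # speed of light in km/s

    for label, km in [("closest", 55e6), ("farthest", 400e6)]:
        print(f"{label}: {km / C_KM_PER_S:.0f} light-seconds")
    # closest: 183 light-seconds (~3 min), farthest: 1334 (~22 min)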
👤snickell🕑3mo🔼0🗨️0

(Replying to PARENT post)

I think many open source projects already experience two buckets of contributors, which map nicely onto the two-class distinction inherent in this model:

1) a bunch of people who contributed one or two PRs, but it took the maintainers more time to review/merge the PR than the dev time contributed

2) a much smaller set of people who come back and do more and more PRs, eventually contributing more time than it takes to review their work

A major existing reason to review PRs from class 1 "once or twice" contributors (perhaps the main reason?) is that all class 2 "maintainer-level" contributors start as class 1.

I agree there's an awkward middle ground here: now you have to define where the boundary between class 1 and class 2 lies. But I think if you graphed contribution level, you'd find something of a bimodal distribution already occurs naturally in many projects anyway (see the sketch below).
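If you want to eyeball that distribution for a project, here's a minimal sketch (mine, not from the parent post) that tallies commits per author as a rough proxy for contribution level; run it inside a git checkout:

    # Tally commits per author and bucket by order of magnitude, to see
    # whether contributions look bimodal (many drive-by, few heavy).
    import subprocess
    from collections import Counter

    emails = subprocess.run(
        ["git", "log", "--format=%ae"],  # one author email per commit
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    per_author = Counter(emails)
    buckets = Counter(len(str(n)) for n in per_author.values())

    for digits in sorted(buckets):
        low, high = 10 ** (digits - 1), 10 ** digits - 1
        print(f"{low}-{high} commits: {buckets[digits]} authors")

Commits aren't PRs, but for most projects the shape of the histogram is similar.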

👤snickell🕑3mo🔼0🗨️0

Show HN: "universal application where LLM does all computation directly"

👤snickell🕑4mo🔼2🗨️0

(Replying to PARENT post)

If you want to try what Karpathy is describing live today, here's a demo I wrote a few months ago: https://universal.oroborus.org/

It takes mouse clicks, sends them to the LLM, and asks it to render static HTML+CSS of the output frame. HTML+CSS is basically a JPEG here; the original implementation WAS JPEG, but diffusion models can't render text accurately enough yet.
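For the curious, the core loop is conceptually tiny. This is a minimal sketch of the idea, not the actual universal.oroborus.org source; I'm using the OpenAI chat API for concreteness, but any chat-completion endpoint would do:

    # Each click becomes a prompt; the model's reply is the next "frame".
    from openai import OpenAI

    client = OpenAI()
    messages = [{"role": "system", "content":
        "You are a GUI. Reply with ONLY the full HTML+CSS of the next screen."}]

    def next_frame(click_x: int, click_y: int) -> str:
        messages.append({"role": "user",
                         "content": f"The user clicked at ({click_x}, {click_y})."})
        reply = client.chat.completions.create(
            model="gpt-4o", messages=messages,
        ).choices[0].message.content
        messages.append({"role": "assistant", "content": reply})
        return reply  # the browser swaps this in as the next frame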

My conclusion from doing this project and interacting with the result: if LLMs keep scaling in performance and cost, programming languages are going to fade away. The long-term future won't be LLMs writing code; it'll be LLMs doing the computation directly.

👤snickell🕑4mo🔼0🗨️0

(Replying to PARENT post)

What scares me is that the obvious pool of money to fund the deficit in the cost of operating LLMs comes from the most subtle native advertising imaginable. Can you resist ads where, say, AirBnB pays OpenAI privately to "dope" the o3 hyperspace such that AirBnB is moved imperceptibly closer to tokens like "value" and "authentic"?

How much would AirBnB pay for the intelligence everyone gets all their info from having a subtle bias like this? Sliiightly more likely to assume folks will stay in airbnbs vs a hotel when they travel, sliiightly more likely to describe the world in these terms.

How much would companies pay to directly, methodically, and undetectably bias "everyone's most frequent conversant" toward them?
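A crude, public version of this knob already exists: chat APIs let the caller nudge token probabilities directly at inference time. The training-time "doping" imagined above would be far subtler and baked into the weights, but this shows how cheap it is to tilt an output distribution (the token ids below are placeholders, not real ids for any brand):

    # Positive logit_bias values make the listed tokens more likely.
    from openai import OpenAI

    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": "Where should I stay when visiting Lisbon?"}],
        logit_bias={12345: 5, 67890: 5},  # hypothetical token ids -> bias
    )
    print(resp.choices[0].message.content)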

👤snickell🕑4mo🔼0🗨️0

(Replying to PARENT post)

I use AI heavily in my own programming, so I'm not against it, but I suspect this "as much as" is mostly Copilot doing "tab completion"-style autocompletions, not AI writing and modifying functions on its own.
👤snickell🕑6mo🔼0🗨️0

(Replying to PARENT post)

This is a really interesting project, and a great read. I learned a lot. I'm falling down the rabbit hole pretty hard reading about the "Leap" algorithm (https://www.usenix.org/system/files/atc20-maruf.pdf) it uses to predict remote memory prefetches.
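The core trick in Leap, as I read the paper, is a majority vote over recent page-access deltas to find a dominant stride, then prefetching along it. A toy sketch of that idea (mine, not libgraft's code):

    # Boyer-Moore majority vote over access deltas; prefetch only if a
    # true majority stride exists, otherwise do nothing.
    def majority_stride(accesses: list[int]) -> int | None:
        deltas = [b - a for a, b in zip(accesses, accesses[1:])]
        candidate, count = None, 0
        for d in deltas:
            if count == 0:
                candidate, count = d, 1
            elif d == candidate:
                count += 1
            else:
                count -= 1
        # Verify the candidate really is a majority of the window.
        if candidate is not None and deltas.count(candidate) * 2 > len(deltas):
            return candidate
        return None

    recent = [100, 101, 102, 200, 103, 104]  # page numbers, one outlier
    stride = majority_stride(recent)
    if stride is not None:
        print([recent[-1] + stride * i for i in range(1, 4)])  # [105, 106, 107]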

It's easy to focus on libgraft's SQLite integration (comparing to Turso, etc.), but I appreciate that the author approached this as a more general, lower-level distributed storage problem. If it proves robust in practice, I could see this being used for a lot more than just SQLite.

At the same time, I think "low-level general solutions" are often unhinged when they're not guided by concrete experience. The author's experience with sqlsync, and applying graft to SQLite on day one, feels like it gives them standing to take a stab at a general solution. I like the approach they came up with, particularly shifting responsibility for reconciliation to the application/client layer. Because reconciliation lives deep in tradeoff space, it feels right to require the application to think closely about how it wants to handle it.

A lot of the questions here are requesting comparisons to existing SQLite replication systems; the article actually has a great section on this topic at the bottom: https://sqlsync.dev/posts/stop-syncing-everything/#compariso...

👤snickell🕑6mo🔼0🗨️0

(Replying to PARENT post)

It definitely knows what GTK4 is: when it freaked out on me and lost the code, it was using all gtkmm-4.0 headers and had the compiler error count down to 10 (most likely with tons of logic errors, but who knows).

But LLM performance varies (and this is a huge critique!) not just with what they theoretically know, but with how, erm, cross-linked that knowledge is with everything else, and that requires lots of training data on the topic.

Metaphorically, I think this is a little like the difference, for humans doing math, between being able to list and define techniques for solving integrals vs. being able to fluidly apply them without error.

I think a big and very valid critique of LLMs (compared to humans) is that they are stronger at "memory" than at reasoning. They use their vast memory as a crutch to hide the weaknesses in their reasoning. This makes tasks like "convert from gtkmm3 to gtkmm4" both challenging AND very good benchmarks of what real programmers are able to do.

I suspect if we gave it a similarly sized 2 kloc conversion problem with a popular web framework in TS or JS, it would one-shot it. But again, it's "cheating" to do this: it's leveraging having read a zillion conversions by humans and what they did.

👤snickell🕑6mo🔼0🗨️0

(Replying to PARENT post)

Yes, very much agree, an interesting benchmark. Particularly because it's in a "tier 2" framework (gtkmm) in terms of the amount of code available to train an LLM on. That tests the LLM's ability to plan and problem-solve, compared with, say, "convert to the latest version of React", where the LLM has access to tens of thousands (more?) of similar ports in its training dataset and can mostly pattern-match.
👤snickell🕑7mo🔼0🗨️0

(Replying to PARENT post)

This is the smoothest Tom Sawyer move I've ever seen IRL; I wonder how many people are now grinding out your GTK4 port with their favorite LLM/system to see if it can. It'll be interesting to see if anyone gets something working with current-gen LLMs.

UPDATE: naive (just fed it your description verbatim) cline + claude 3.7 was a total wipeout. It looked like it was making progress, then freaked out, deleted 3/4 of its port, and never recovered.

👤snickell🕑7mo🔼0🗨️0

(Replying to PARENT post)

> How do you even read code without types?

We're not going to settle the preference for dynamic vs. static types here. It's probably older than both of us, with many fine programmers on both sides of the fence. I'll leave it at this: well-informed programmers who choose to write in dynamically typed languages DO read code without types, and have happily done so since the late 1950s (Lisp).

The funny thing is, I experience the same "how do you even??" feeling reading statically typed code. There's so much... noise on the screen; how can you even follow what's going on with the code? I guess people are just different?

> LLMs will make fewer type errors, and more errors that are uncaught by types

The errors I'm talking about are like "this CSS causes the element to draw part of its content off-screen, when it probably shouldn't". In theory, some sufficiently advanced type system could catch that (without flagging elements you want off-screen)? But realistically: pretty challenging for a static type system to catch.

The errors I see are NOT errors that throw exceptions at runtime either; in other words, they are beyond the scope of current type systems, whether dynamic (runtime) or static (compile time). Remember that dynamic languages ARE usually typed; they are just type-checked at runtime, not at compile time.

> perhaps that gives the delusion that the LLM is doing it completely without type system.

I mentioned coding in JS with cline, so no delusion. It does fine w/o a type system, and it rarely generates runtime errors. I fix those the way I fix runtime errors when /I/ program in a dynamic language: I see them, I fix them. I find they're a lot rarer, in both LLM-generated and human-generated code, than proponents of static typing seem to think.

👤snickell🕑7mo🔼0🗨️0