I don’t mean to be difficult. I’m neurodivergent

  • I use it almost every day, and most of those days, it says something incorrect. That’s okay for my purposes because I can plainly see that it’s incorrect. I’m using it as an assistant, and I’m the one who is deciding whether to take its not-always-reliable advice.

    I would HARDLY contemplate turning it loose to handle things unsupervised. It just isn’t that good, or even close.

    These CEOs trying to replace CSRs are caught up in hype from people like Eric Schmidt, who proclaim “no programmers in 4 months” and the like. Well, he said that about 2 months ago and, yeah, nah. Nah.

    If that day comes, it won’t be soon, and it’ll take many, many small, hard-won advancements. As they say, there is no free lunch in AI.


  • What do you mean by “retrain your model”? Retraining it would erase it. It’s not practical to prevent people from adjusting the weights on an open source model, because the weights have to be published for it to work at all (a rough sketch of how little that takes is at the end of this comment). Plenty of open source software can be used to do evil things, and isn’t regulated on that account. If someone were to sue the developers of Wireshark because it was used to exploit their network, they would be very likely to lose, because that software has many legitimate, non-criminal uses.

    Requiring US commercial vendors to implement fingerprinting would disadvantage them against open source models, and against vendors from other countries (like DeepSeek) who wouldn’t comply. In theory a government could try to mandate it anyway, but I don’t know whether that would survive legal challenges. The current US government is very unlikely to try in the first place, so it seems like a moot point for the next few years. After that, I don’t know.
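
    As a minimal sketch of how little “adjusting the weights” takes (the file name and the specific edit here are hypothetical, and it assumes the safetensors and torch packages): once the weights are published as ordinary tensor files, anyone can load them, change them, and save them back.

    ```python
    # Minimal sketch: "adjusting the weights" of an open-weight model is just
    # editing published tensor files. Assumes the safetensors and torch packages;
    # "model.safetensors" is a hypothetical checkpoint file name.
    import torch
    from safetensors.torch import load_file, save_file

    weights = load_file("model.safetensors")

    # Pick any parameter tensor and nudge it. A real fine-tune would use
    # gradients, but the point is that the tensors are freely editable.
    name, tensor = next(iter(weights.items()))
    weights[name] = tensor + 0.01 * torch.randn_like(tensor)

    save_file(weights, "model-modified.safetensors")
    print(f"modified parameter {name}, shape {tuple(tensor.shape)}")
    ```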


  • If you don’t play chess, the Atari is probably going to beat you as well.

    LLMs are only good at things to the extent that they have been well-trained in the relevant areas. That means not just pretraining to predict text sequences, but reinforcement learning after that, where a human or some other agent says “this answer is better than that one” enough times in enough of the right contexts. It mimics the way humans learn, which is through repeated and diverse exposure.

    If they set up a system to train it against some chess program, or (much simpler) just gave it a tool to call, it would do much better. Tool calling already exists and would be by far the easiest way; there’s a rough sketch of what that could look like at the end of this comment.

    It could also be instructed to write a chess engine and then run it, at which point it would be on par with the Atari, but it wouldn’t compete well with a serious chess engine.
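
    The tool-call sketch mentioned above: a hypothetical best_move tool in the style of OpenAI-style function calling, assuming the python-chess package and a Stockfish binary on the PATH. The LLM would only decide when to call the tool; the chess strength comes from the engine.

    ```python
    # Sketch of giving an LLM a chess tool instead of making it play from memory.
    # Assumes the python-chess package and a local "stockfish" binary; the tool
    # name and schema are hypothetical.
    import chess
    import chess.engine

    # What the model would see: a tool it can call with a position (FEN string).
    BEST_MOVE_TOOL = {
        "type": "function",
        "function": {
            "name": "best_move",
            "description": "Return the engine's best move for a FEN position.",
            "parameters": {
                "type": "object",
                "properties": {"fen": {"type": "string"}},
                "required": ["fen"],
            },
        },
    }

    def best_move(fen: str, think_time: float = 0.1) -> str:
        """What the host application runs when the model emits a best_move call."""
        board = chess.Board(fen)
        with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
            result = engine.play(board, chess.engine.Limit(time=think_time))
            return result.move.uci()

    if __name__ == "__main__":
        # Starting position; the engine, not the LLM, does the actual chess.
        print(best_move(chess.STARTING_FEN))
    ```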



  • If someone trains an open source AI model to fingerprint its output, someone else can use abliteration or other methods to defeat that, and it will not require retraining. An example of this is deepseek-r1’s “1776” variant, where someone uncensored it, and now it will talk freely about Tiananmen Square. (There’s a toy sketch of the idea behind abliteration at the end of this comment.)

    Even without that, it’s not practical for a government to find every instance of model training. Thousands of people can rent the same GPUs in the same data centers, and a small organization training one model can draw the same power as a large organization running inference. It would take fairly intrusive surveillance to tell those apart.

    It’s also becoming possible to train larger and larger models without needing a data center at all. Nvidia is coming out with a 128GB desktop machine that delivers 1 petaflop @ FP4 for 170 watts; FP8 would be on the order of hundreds of teraflops. Ten of them could talk over an InfiniBand switch. You could run that setup in an apartment, or in a LAN closet.
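
    The toy sketch of abliteration mentioned above: the usual recipe is to estimate a “behavior direction” from the difference in hidden activations between two sets of prompts, then project that direction out of the model’s weight matrices, with no gradient training. The code below runs on random tensors instead of a real model, so it only shows the shape of the linear algebra.

    ```python
    # Toy sketch of the linear algebra behind "abliteration"-style edits.
    # Real uses collect activations from an actual model on two prompt sets;
    # random tensors stand in here, so this only illustrates the idea.
    import torch

    hidden = 64
    acts_with_behavior = torch.randn(200, hidden) + 0.5   # shifted along some direction
    acts_without = torch.randn(200, hidden)

    # 1. Estimate the behavior direction as the normalized mean difference.
    direction = acts_with_behavior.mean(dim=0) - acts_without.mean(dim=0)
    direction = direction / direction.norm()

    # 2. Project that direction out of a weight matrix that writes to the
    #    residual stream (W maps hidden -> hidden here, purely illustrative).
    W = torch.randn(hidden, hidden)
    W_edited = W - torch.outer(direction, direction @ W)

    # The edited matrix can no longer produce any output along `direction`.
    print("component before:", (direction @ W).norm().item())
    print("component after: ", (direction @ W_edited).norm().item())
    ```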