OpenAI now has 35 in-house lobbyists, and will have 50 by the end of the year.

keepthepace@slrpnk.net · 5 days ago

It is llama3-8B so it is not out of the question, but I am not sure how much memory you would need to really go to a 1M context window. They use ring attention to reach a long context window, which I am unfamiliar with, but it seems to greatly lower the memory requirements.
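For a rough idea of the memory question, here is a back-of-the-envelope estimate of the KV cache alone at 1M tokens. It is only a sketch: it assumes the published Llama-3-8B shape (32 layers, 8 KV heads, head dimension 128) and fp16 storage, and it ignores weights, activations, and whatever ring attention saves by sharding the sequence across devices. The function name `kv_cache_bytes` is made up for illustration.

```python
# Back-of-the-envelope KV-cache size for a decoder-only transformer.
# Default values assume the published Llama-3-8B config and fp16 storage;
# they are assumptions for this sketch, not measurements.
def kv_cache_bytes(n_tokens, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_value=2):
    # Factor 2: one key vector and one value vector per token, per layer, per KV head.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return n_tokens * per_token

context = 1_048_576  # the "1048k" context length from the model name
total = kv_cache_bytes(context)
print(f"{total / 2**30:.0f} GiB")  # prints "128 GiB"
```

So under these assumptions the cache alone is far more than a single consumer GPU holds, which is why splitting the sequence over several devices matters.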

keepthepace@slrpnk.net · 5 days ago

To actually read how they did it, here is their model page: https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k

Approach:

meta-llama/Meta-Llama-3-8B-Instruct as the base

NTK-aware interpolation [1] to initialize an optimal schedule for RoPE theta, followed by empirical RoPE theta optimization

Progressive training on increasing context lengths, similar to Large World Model [2] (See details below)

Infra

We build on top of the EasyContext Blockwise RingAttention library [3] to scalably and efficiently train on contexts up to 1048k tokens on Crusoe Energy high performance L40S cluster.

Notably, we layered parallelism on top of Ring Attention with a custom network topology to better leverage large GPU clusters in the face of network bottlenecks from passing many KV blocks between devices. This gave us a 33x speedup in model training (compare 524k and 1048k to 65k and 262k in the table below).

Data

For training data, we generate long contexts by augmenting SlimPajama. We also fine-tune on a chat dataset based on UltraChat [4], following a similar recipe for data augmentation to [2].
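To make the NTK-aware interpolation step listed under Approach a bit more concrete, here is a minimal sketch of the commonly cited NTK-aware rule for rescaling the RoPE base. This is not Gradient's actual code: they only use it as a starting point before tuning theta empirically. The base theta of 500000, the original 8192 context, and the head dimension of 128 are assumed from the stock Llama-3-8B config, and both function names are hypothetical.

```python
# Sketch of NTK-aware RoPE base scaling (the commonly cited rule), not the
# Gradient team's implementation. All constants below are assumptions taken
# from the stock Llama-3-8B config.
def ntk_scaled_theta(base_theta, head_dim, scale):
    # Raise the base so that the lowest RoPE frequency is stretched by `scale`
    # while the highest frequency stays untouched.
    return base_theta * scale ** (head_dim / (head_dim - 2))

def rope_inv_freq(theta, head_dim):
    # Standard RoPE inverse frequencies: theta^(-2i/d) for i = 0 .. d/2 - 1.
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

base_theta = 500_000.0
head_dim = 128
scale = 1_048_576 / 8_192  # target context / original context = 128

new_theta = ntk_scaled_theta(base_theta, head_dim, scale)
old_freqs = rope_inv_freq(base_theta, head_dim)
new_freqs = rope_inv_freq(new_theta, head_dim)

print(f"new theta: {new_theta:.3e}")
print(f"lowest-frequency ratio: {old_freqs[-1] / new_freqs[-1]:.1f}")
```

The point of the rule is that the slowest-rotating dimension gets slowed down by exactly the context scale factor, while the fastest one is left alone, so short-range position information is preserved.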

keepthepace@slrpnk.net · 15 days ago

OpenAI now has 35 in-house lobbyists, and will have 50 by the end of the year.

keepthepace@slrpnk.net · 18 days ago

“Theft” is actually legal. Sharing (what they call “piracy”) is not. How about getting the fucking copyright reform that we should have done two decades ago?

keepthepace@slrpnk.net · 24 days ago

OpenAI should be fine. They are leaders but there are plenty of competitors.

Microsoft is in a much more dominant situation and will have to argue that Google competes with them, which is true but may be hard to sell given that I don't think Google offers its TPU services to any other company.

NVidia is in a monopoly situation. For them it will be hard to argue otherwise. AMD is simply not there, no one is using it.

keepthepace@slrpnk.net · 28 days ago

And this is why research is going in another direction: smaller models which allow easier experiments.

keepthepace@slrpnk.net · 29 days ago

I am pretty sure that there are ASICs being put into production as we speak with Whisper embedded. Expect a 4-dollar chip to add voice recognition and a basic LLM to any appliance.

keepthepace@slrpnk.net · 29 days ago

Also, as a side effect, we just solved speech recognition. In a year or two, speaking to machines will be the default interface.

keepthepace@slrpnk.net · 29 days ago

There is a company-wide demotivation plague at Google. Don’t blame middle managers, it extends to the top.

keepthepace@slrpnk.net · edit-2 30 days ago

I use it almost daily.

It does produce good code. It does not reliably produce good code. I am a programmer, it makes my job 10x faster and I just have to fix a few bugs in the code it usually generates. Over time, I learned what it is good at (UI code, converting things, boilerplate) and what it struggles with (anything involving newer tech, algorithmic understanding, etc.)

I often refer to it as my intern: It acts like an academically trained, not particularly competent, but very motivated, fast typing intern.

But then I am also working in the field. Prompting it correctly is too often dismissed as a skill (I used to dismiss it too). It needs more understanding than people give it credit for.

I think that, like a lot of IT tech, it will gradually go from being a dev tool to an everyday tool.

All the pieces of the puzzle needed to control a computer by voice using only natural language are there. You don’t realize how big it is. Companies haven’t assembled it yet because it is actually harder to monetize than to code. I think Apple is probably in the best position for it. Microsoft is going to attempt it and will fail like usual, and Google will probably make a half-assed attempt at it. I’ll personally go for the open source version of it.

keepthepace@slrpnk.net · edit-2 1 month ago

Damn I want to read it but it is from the only two accounts I muted (for different reasons)

EDIT: God the sewer when you unblock Musk’s account! I am never doing it again. Why do people talk over this stupidly noisy channel instead of having a threaded discussion like civilized great apes?

keepthepace@slrpnk.net · 1 month ago

Would be hard to believe just by itself, but when you put that together with the lies OpenAI told about the non-disparagement agreements and the number of employees quietly quitting to join other companies, I do lean toward Helen’s side.

I still have a hard time understanding how the petition to bring Altman back was so popular if he was so toxic.

keepthepace@slrpnk.net · 1 month ago

You know, it is a bit hard to believe that this type of argument is made in good faith when it only targets open models and open datasets. I have seen the same old arguments against the internet, file sharing, anonymity, open source encryption…

It is almost certain that LAION is also included in the datasets of closed models but these are not even mentioned.

keepthepace@slrpnk.net · 1 month ago

“We’re incredibly sorry that we’re only changing this language now; it doesn’t reflect our values or the company we want to be,” a spokesperson said.

Suuuuuure.

keepthepace@slrpnk.net · 1 month ago

Private companies are the last remnants of the feudal system.

keepthepace@slrpnk.net · 1 month ago

Weird choice of title. Clickbaity I guess? It is not about the sex and drugs:

While she said she doesn’t think there’s anything generally wrong with “sex parties and heavy LSD use,” she also charged that the culture surrounding these alleged parties “leads to some of the most coercive and fucked up social dynamics that I have ever seen.”

Would help to be a bit more specific though.

keepthepace@slrpnk.net · 1 month ago

Emad is kind of a hero for burning VC cash and giving us open source models. Stability is gone now for the OSS community, but that run was a blast!

keepthepace@slrpnk.net · edit-2 2 months ago

Heh, when you can’t compete on the performance anymore, compete on the openness.

I don’t like calling it “NSFW”, the correct term is “uncensored”. The problem is not that it can’t generate titties, the problem is that restricting it to “socially acceptable” answers seriously limits its intelligence. That was one of the points of Fahrenheit 451: censorship is a one way street. You can’t go back once you start squashing “fringe” opinions purely on the grounds that they may shock people.

keepthepace@slrpnk.net · 2 months ago

What do you guys think about this?

Not enough information to know if this is a good or a bad idea.

A tool that does bodycam -> written reports automatically may improve things. One that simply fakes a lot of details so that it looks like a well fleshed-out report is a terrible idea.

Generally speaking, the less human subjectivity intervenes in law enforcement, the better off we are. Yet companies and police always somehow find a way to turn good ideas into terrible implementations. I do hope it is for the best, but it could just as well mass-manufacture lies as increase accountability.

keepthepace@slrpnk.net · 2 months ago

An open source AI group had to discuss it openly. There is no way around it.

keepthepace@slrpnk.net · 3 months ago

“He was just giving shit away,” one former employee told Forbes. “That man legitimately wanted to transform the world. He actually wanted to train AI models for kids in Malawi. Was it practical? Absolutely not.”

That’s the nicest criticism of a former CEO I have ever seen.