Let them Fight
My one dark hope is AI will be enough of an impetus for somebody to update DMCA
> pay once, get access to everything everywhere
> thinks about Elsevier
OH GOD PLEASE NO
That doesn’t seem to be the same article
Turns out that whole idea of women being the primary bearers of hundreds of years of exploited reproductive labor might have had some weight to it, huh.
All that labor being redirected into “the economy” means that, at base, you’ll have fewer children.
This is interesting but I’ll reserve judgement until I see comparable performance past 8 billion params.
Sub-4-billion-parameter models all seem to have the same performance regardless of quantization nowadays, so it’s a little hard to see the potential in 3 billion.
I seriously doubt the viability of this, but I’m looking forward to being proven wrong.
The OSI just published a result of some of the discussions around their upcoming Open Source AI Definition. It seems like a good idea to read it and see some of the issues they’re trying to work around…
https://opensource.org/blog/explaining-the-concept-of-data-information
I would recommend instead to use the AI Horde: https://stablehorde.net/ It’s a collection of people hosting stable diffusion/text generation models
There’s also OpenRouter, which can connect to ChatGPT with a token-based system. (They check your prompts for hornyposting, though.)
It helps differentiate between GNU/Linux users and the five people who use GNU/Hurd
Judging by my bank account I’m transitioning to non-profit status as well.
Yes of course, there’s nothing gestalt about model training; fixed inputs result in fixed outputs.
I suppose the importance of the openness of the training data depends on your view of what a model is doing.
If you feel like a model is more like a media file that model loaders play back, where the prompt is a kind of control over how you access the model, then yes, I suppose from a trustworthiness aspect there’s not much to the model’s training corpus being open.
I see models more in terms of how any other text encoder or serializer would work, as if you were, say, manually encoding text. While there is a very low chance of any “malicious code” being executed, the importance is in the fact that you can check your expectations about how your inputs are being encoded against what the provider is telling you.
As an example attack vector, much like any malicious-replacement technique: if I were to download a pre-trained model from what I thought was a reputable source, but was man-in-the-middled and served a maliciously trained model instead, suddenly the system I was relying on that uses that model is compromised in terms of its expected text output. Obviously that exact problem could be fixed with some hash checking, but I hope you see that in some cases even that wouldn’t be enough (such as malicious “official” provenance).
As these models become more prevalent, being able to guarantee integrity will become more and more of an issue.
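To make the hash-checking point concrete, here’s a minimal sketch of verifying a downloaded model file against a published checksum before loading it. The filename and the idea of a separately published hash are assumptions for illustration; and, as noted above, this doesn’t help if the “official” source itself is compromised.

```python
# Sketch: verify a downloaded model file against a checksum published
# by the provider over a separate, trusted channel.
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so multi-gigabyte model files
    don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical usage — "model.safetensors" and the hash are placeholders:
# if sha256_of("model.safetensors") != published_hash:
#     raise RuntimeError("model file does not match published checksum")
```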
I’ve seen this said multiple times, but I’m not sure where the idea that model training is inherently non-deterministic is coming from. I’ve trained a few very tiny models deterministically before…
I’m not sure where you get that idea. Model training isn’t inherently non-deterministic. Making fully reproducible models is apparently 360ai’s entire modus operandi.
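A toy illustration of the determinism point: if you fix the RNG seed (and keep the data and order of operations the same), a training loop produces bit-identical weights every run. This is a stdlib-only sketch, not how any particular framework does it; real frameworks need the same seeding discipline across all their RNGs.

```python
# Minimal sketch: a tiny gradient-descent "training run" made
# reproducible by seeding every source of randomness.
import random

def train(seed: int, steps: int = 100) -> float:
    rng = random.Random(seed)      # one seeded RNG controls everything
    w = rng.uniform(-1.0, 1.0)     # random init, but reproducible
    for _ in range(steps):
        x = rng.uniform(0.0, 1.0)          # "data sampling"
        grad = 2 * (w * x - 2.0 * x) * x   # MSE gradient, target w = 2
        w -= 0.1 * grad
    return w

# Same seed, same data order, same ops -> exactly the same weight.
```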
There are VERY FEW fully open LLMs. Most are the equivalent of source-available in licensing, and at best they’re only partially open source because they only provide you with the pretrained model.
To be fully open source they need to publish both the model and the training data; the point is being “fully reproducible”, which is what makes a model trustworthy.
In that vein there’s at least one project that’s turning out great so far:
Holy crap there are still working nitter instances? God bless
In my experience these open models are where the real work is being done. The large supervised models like DALL-E etc. are flashier, but there’s a lot more going on behind the scenes than the model itself, so it feels hard to gauge the real progress being made.
ChatGPT already is multiple smaller models. Most guesses peg GPT-4 as an 8×220-billion-parameter mixture of experts, i.e. eight 220-billion-parameter models squished together.
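For anyone curious what “mixture of experts” means mechanically, here’s a toy-sized sketch of top-k routing: a gate scores every expert for a given input, only the best k actually run, and their outputs are mixed by softmaxed gate score. Everything here (expert count, gate weights) is made up for illustration; real MoE layers are learned neural networks, not lists of lambdas.

```python
# Toy top-k mixture-of-experts routing. Only k of the experts run
# per input, which is how an "8x220B" model avoids paying for all
# eight experts on every token.
import math

def moe_forward(x, experts, gate_weights, k=2):
    """Score each expert with a linear gate, run only the top-k,
    and mix their outputs by softmaxed gate score."""
    scores = [sum(w * xi for w, xi in zip(gw, x)) for gw in gate_weights]
    top = sorted(range(len(experts)), key=scores.__getitem__, reverse=True)[:k]
    weights = [math.exp(scores[i]) for i in top]   # softmax over the top-k only
    total = sum(weights)
    out = [0.0] * len(x)
    for i, w in zip(top, weights):
        for j, yj in enumerate(experts[i](x)):
            out[j] += (w / total) * yj
    return out
```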