• 0 Posts
  • 45 Comments
Joined 1 year ago
Cake day: June 14th, 2023

  • Separately, Jong has also alleged that Apple subjected her to a hostile work environment after a senior member of her team, Blaine Weilert, sexually harassed her. After she complained, Apple investigated and Weilert reportedly admitted to touching her “in a sexually suggestive manner without her consent,” the complaint said. Apple then disciplined Weilert but ultimately would not allow Jong to escape the hostile work environment, requiring that she work with Weilert on different projects. Apple later promoted Weilert.

    As a result of Weilert’s promotion, the complaint said that Apple placed Weilert in a desk “sitting adjacent” to Jong’s in Apple’s offices. Following a request to move her desk, a manager allegedly “questioned” Jong’s “willingness to perform her job and collaborate” with Weilert, advising that she be “professional, respectful, and collaborative,” rather than honoring her request for a non-hostile workplace.


  • Now instead of just querying the goddamn database, a one line fucking SQL statement, I have to deal with the user team

    Exactly, you understand very well the purpose of microservices. You can submit a patch if you need that feature now.

    Funnily enough I’m the technical lead of the team that handles the user service in an insurance company.

    Because people accessed our data directly without consulting us, we’re now facing legal issues: they were using addresses to guess where people lived instead of going through our endpoints.

    I guess some people really hate the validation that service layers have.


  • You are in a bubble. A neo-Nazi march was banned two weeks ago in France before being allowed again by the judicial system. The exact same scenario has been repeating for pro-Palestine protests.

    At least in France, the scenario seems to be that the government wants to ban any controversial march and is being kept under control by the justice system.


  • I’m afraid that would not be sufficient.

    These instructions are a small part of what makes a model answer like it does. Much more important is the training data. If you want to make a racist model, training it on racist text is sufficient.

    AI companies put great care into the training data of these models, to ensure that their biases are socially acceptable. If you train an LLM on the internet without care, a user will easily be able to prompt it into producing racist text.

    Gab is forced to use this prompt because they’re unable to train a model, but as other comments show, it’s a pretty weak way to force a bias.

    The ideal solution for transparency would be public sharing of the training data.



  • It’s absolutely amazing, but it is also literally and technologically impossible for that to spontaneously coalesce into reason/logic/sentience.

    This is not true. If you train these models on the game of Othello, they keep an internal state of the board and use it to predict the next move (1). To perform addition and multiplication, they execute an algorithm on which they were not explicitly trained (although the GPT family is surprisingly bad at it, due to a badly designed tokenizer).

    These models are still pretty bad at most reasoning tasks. But training on next-word prediction is a perfectly valid strategy: after all, the best way to predict what comes after the “=” in 1432 + 212 = is to do the addition.










  • These models do not see letters but tokens. For the model, violet is probably two symbols, viol and et. Short of memorizing the number of letters in each token, the model has no way to know the number of letters in a word.

    This is also why the GPT family is bad at addition: its tokenizer has symbols for common numbers like 14. This means that to do 14 + 1 it cannot use the knowledge that 4 + 1 is 5, as it cannot see the link between the token 4 and the token 14. The Llama tokenizer fixes this, and is thus much better at basic arithmetic even with much smaller models.
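
    To make the point above concrete, here is a minimal sketch of a toy tokenizer. The vocabulary, the token IDs, and the greedy longest-match rule are all made up for illustration (real GPT tokenizers use byte-pair encoding with a far larger vocabulary), but the effect is the same: the model receives opaque IDs, and nothing in the ID for 14 mentions the ID for 4.

```python
# Toy vocabulary: every entry and every ID is hypothetical, chosen only
# to mirror the "viol" + "et" and "14" examples from the comment.
TOY_VOCAB = {"viol": 101, "et": 102, "14": 201, "1": 202, "4": 203, "+": 204}

# Same vocabulary without the multi-digit token, so numbers split per digit.
DIGIT_VOCAB = {tok: tid for tok, tid in TOY_VOCAB.items() if tok != "14"}


def greedy_tokenize(text, vocab):
    """Split text into the longest matching vocabulary entries, left to right."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens


def to_ids(text, vocab=TOY_VOCAB):
    """Return the ID sequence the model would actually receive."""
    return [vocab[tok] for tok in greedy_tokenize(text, vocab)]


print(to_ids("violet"))             # two IDs, no letter information
print(to_ids("14+1"))               # the ID for "4" never appears
print(to_ids("14+1", DIGIT_VOCAB))  # per-digit split exposes the "4"
```

    With the multi-digit token, the ID for 4 is absent from the input, so anything learned about 4 + 1 has to be relearned for 14 + 1. With the per-digit vocabulary it is present.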


  • Yes to your question, but that’s not what I was saying.

    Here is one of the most popular training datasets: https://pile.eleuther.ai/

    If you look at the PDF describing the dataset, you’ll find that these documents are fairly short, with a mean length under 20 KB (about 20,000 characters) in most cases.

    You are asking for a model to retain a memory for the whole duration of a discussion, which can be very long. If I chat for one hour I’ll type approximately 8400 words, or around 42 KB, which is longer than most documents in the training set. If I chat for 20 hours, it’ll be longer than almost all the documents in the training set. The model needs to learn how to extract information from a long context, and it can’t do that well if the documents it was trained on are short.

    You are also right that during training the text is cut off. A value I often see is 2k to 8k tokens. This is arbitrary; some models are trained with a cutoff of 200k tokens. You can use models on context lengths longer than what they were trained on (with some caveats), but performance falls off badly.
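
    The chat-length estimate above can be reproduced with a few lines of arithmetic. The typing speed (140 words per minute) and the average word size (5 characters) are assumptions chosen because they reproduce the 8400-word and 42 KB figures; the comment does not state them.

```python
# Assumptions, not stated in the comment: they are picked so that
# one hour of chat comes out at 8400 words and 42,000 characters.
WORDS_PER_MINUTE = 140
CHARS_PER_WORD = 5

# The "less than 20 KB" mean document length quoted from the dataset PDF.
MEAN_DOC_CHARS = 20_000


def chat_chars(hours):
    """Characters typed in a chat lasting the given number of hours."""
    words = hours * 60 * WORDS_PER_MINUTE
    return words * CHARS_PER_WORD


for hours in (1, 20):
    chars = chat_chars(hours)
    ratio = chars / MEAN_DOC_CHARS
    print(f"{hours:>2} h: {chars:,} characters, {ratio:.1f}x the 20 KB mean")
```

    Under these assumptions a one-hour chat is already about twice the quoted mean document length, and a 20-hour chat is about 42 times longer.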


  • There are two issues with large prompts. One is linked to current language model technology, where computation time and memory usage scale badly with prompt size. This is being addressed by projects such as RWKV or Mamba, but these remain unproven at large sizes (more than 100 billion parameters). Somebody will have to spend a few million to train one.

    The other issue will probably be harder to solve: there is less high-quality long-context training data. Most datasets were created for small-context models.
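
    As a rough illustration of the first issue: standard self-attention compares every token with every other token, so the number of attention scores grows with the square of the prompt length. The sketch below only counts those scores for a single attention head in a single layer; the 2-byte value size is an assumption, and optimized implementations avoid storing the full matrix even though the amount of computation stays quadratic.

```python
def attention_score_entries(n_tokens):
    """Number of pairwise attention scores for one head in one layer."""
    return n_tokens * n_tokens


def score_matrix_bytes(n_tokens, bytes_per_value=2):
    """Memory needed to store the full score matrix (assumes 2-byte values)."""
    return attention_score_entries(n_tokens) * bytes_per_value


# Context lengths taken from the 2k, 8k and 200k figures mentioned earlier.
for n in (2_000, 8_000, 200_000):
    entries = attention_score_entries(n)
    megabytes = score_matrix_bytes(n) / 1e6
    print(f"{n:>7} tokens: {entries:,} scores, {megabytes:,.0f} MB per head")
```

    Multiplying the prompt length by 4 multiplies the score count by 16, which is why architectures with cost linear in sequence length, such as RWKV or Mamba, are attractive for long prompts.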




  • As long as the demographic chart of Palestinians murdered by the IDF looks like the actual Palestinian population demographic (1/3 women, 1/3 kids) it’s safe to assume that there is absolutely no real targeting taking place.

    Yes, there is a bump if you look at the Hamas fighting population demographics, but it is a minority. The large majority of people killed in this war are civilians; there is no doubt about that. I was denying the 1:100 figure. For example, Hamas has 1/3 of female victims, yet has a 1:4 casualty rate.

    Netanyahu literally said publicly that he wants to kill all Palestinians including the women and children and his deeds match his words.

    No, he didn’t, and you know it. Why lie?

    Some senior Hamas executives have made such statements about Jews before being very softly reprimanded by Hamas, but no executive from the Israeli government has. There have been plenty of dog whistles, but they are not stupid enough to say it literally.

    Edit: I didn’t realize it, but you were the person calling for the massacre of civilians in an earlier comment. That explains why you would lie: you need to dehumanise your enemy. I’m not spending more energy on this. You’re too far gone.