goofy1958

Copyright laws shouldn't apply to AI training, proposes Google

goofy1958

August 2023

22july2013 said:

AI has multiple data processing stages, and different legal rules could apply to each. It seems that Google is talking only about the training stage here. The training stage needs to examine large data sets, true, but the language model file that results from that training is surprisingly small (smaller than many large computer games, like World of Warcraft). I'm no AI expert or information theory expert, but I can't see any way that the language model itself (e.g., a 50 GB file) "contains a copy of the Internet," so the model by itself probably isn't violating anyone's copyright. If you ask a chatbot something like "how many basketballs can fit in an average house?", the file itself doesn't contain the answer in its data; it still has to go to the Internet to get the data (e.g., size of a basketball, size of a house) needed to generate the answer. If there is a copyright issue, it's probably at the answer-generation stage, not at the model-training stage, because the model that gets created does not contain any of the data that the training stage had to read to generate it.

In order for a human to get to where we are today, we had to read Dr. Seuss books (or something similar) to learn English, but we don't have to pay Dr. Seuss's estate every time we do something that makes money by exploiting our knowledge of the English language. We might even forget the words to the Seuss books, but the principles of the language that we learned from those books remain in our heads. "Language principles" extracted from any data should not be copyrightable, while specific data should be copyrightable.

If you want to argue that the Google search engine violates copyright because it actually requires 10 exabytes of data (stored on Google's server farms) that it obtained from crawling the Internet, I could probably agree with that. But I can't see how a puny 50 GB file or anything that small could be a violation of anyone's copyright. You can't compress the entire Internet into a puny file like that, therefore it can't violate anyone's copyright.

The reason most people can't run large language models on their local computers is that the "small 50 GB file" has to fit in local memory (RAM), and most users don't have that much RAM. The reason it needs to fit in memory is that every byte of the file has to be accessed about 50 times in order to generate the answer to any question. If the file was stored on disk, it could take hours or weeks to calculate a single answer.
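A rough way to sanity-check the disk-versus-RAM point is to work out how long it would take just to read the file 50 times at different speeds. The 50 GB size and the "about 50 times" figure come from the post; the read speeds below are illustrative assumptions, not measurements, and the sketch ignores everything except raw sequential read time.

```python
# Back-of-the-envelope read-time estimate.
MODEL_BYTES = 50 * 10**9   # 50 GB, figure taken from the post
PASSES = 50                # "about 50 times", figure taken from the post

# ASSUMED sustained read speeds in bytes per second (illustrative only).
ASSUMED_SPEEDS = {
    "hard disk (~200 MB/s)": 200 * 10**6,
    "NVMe SSD (~3 GB/s)": 3 * 10**9,
    "RAM (~50 GB/s)": 50 * 10**9,
}


def read_time_seconds(bytes_per_second):
    """Seconds needed to read the whole file PASSES times at the given speed."""
    return MODEL_BYTES * PASSES / bytes_per_second


for name, speed in ASSUMED_SPEEDS.items():
    print(f"{name}: {read_time_seconds(speed):,.0f} s per answer")
```

Under these assumed speeds, the hard disk case works out to roughly three and a half hours per answer, which is in line with the "hours" end of the estimate above. Real workloads depend on access patterns and hardware, so treat this only as an order-of-magnitude check.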

10,000,000,000,000,000,000 = The number of bytes of data on Google's servers
00,000,000,050,000,000,000 = The number of bytes that an LL model file requires, which is practically nothing
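The two figures above can be compared directly with a few lines of arithmetic. This is only a sketch: both numbers are the poster's own estimates, and the variable names are made up for illustration.

```python
# Both sizes are the estimates quoted in the post, not measured values.
GOOGLE_BYTES = 10 * 10**18   # 10 exabytes
MODEL_BYTES = 50 * 10**9     # 50 GB

# How many model files would fit into the larger data set.
ratio = GOOGLE_BYTES // MODEL_BYTES

# The model file as a percentage of the larger data set.
fraction_percent = MODEL_BYTES / GOOGLE_BYTES * 100

print(f"The server data is {ratio:,} times larger than the model file")
print(f"The model file is {fraction_percent:.7f}% of that size")
```

On these numbers the model file is 200 million times smaller than the data set it is being compared with.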

I'd be happy if someone can correct me, although evidence would be appreciated.

Where did you get the 50GB number? I don't see that anywhere in the article.

Lawmakers urged to block TSMC worker visas over Arizona plant construction

goofy1958

August 2023

beowulfschmidt said:

goofy1958 said:

hmurchison said:

Hiring Americans is part of the deal; otherwise... we don't need you here.

How short-sighted of you. They are only bringing help from Taiwan on a temporary basis until they can get all of the positions filled by Americans.

If that really is the case, then prohibiting them really is shortsighted. However, given that the first rule of corporate operations is to lie to everyone except the government and shareholders, and to withhold as much as possible even from them, I'm somewhat skeptical. If the interested parties, e.g. unions, can get them to actually agree to make these 500 people trainers, that would be good.

Wow! Tin foil hat much? So you really think that Apple would let TSMC get away with lying to them??? Since the plant is supposed to have 12,000 workers when fully staffed, what difference would 500 trained workers from Taiwan make to get the plant up and running sooner? And yes, I fully believe that they would end up training a lot of people.

Lawmakers urged to block TSMC worker visas over Arizona plant construction

goofy1958

August 2023

hmurchison said:

Hiring Americans is part of the deal; otherwise... we don't need you here.

How short-sighted of you. They are only bringing help from Taiwan on a temporary basis until they can get all of the positions filled by Americans.

GM ditching CarPlay could go bad, complain car dealers

goofy1958

July 2023

Well, what will happen is that GM will lose customers like us. No CarPlay, plus added subscription costs: hard pass. We will never own a GM EV because of this, and we had actually considered them prior to this boneheaded decision.

Why Apple uses integrated memory in Apple Silicon -- and why it's both good and bad

goofy1958

June 2023

I've been a Windows PC user my whole life until I got a MacBook from my company to use for a year. I fell in love with it, so my next PC will be the new MacBook Air 15" with maxed-out specs on memory and storage. It will be more than enough computer for my needs for the foreseeable future. I already have an iPhone, AirPods Pro, Apple TV 4K, and a pair of HomePods, so it will be nice to round out my Apple collection.
