Courts say AI training on copyrighted material is legal

Comments

  • Reply 21 of 49
    If everyone who writes a comment on this page will send a fee to Dr Seuss for learning from his books to read and speak, then I will pay attention to their views if they oppose AI learning from published sources. But if you aren't willing to pay everyone that you learn from, for every word that comes out of your mouth, then I don't see why AI should have to pay either. Next, are we going to charge aliens for learning English by reading the radio waves that are being sent into deep space?
    Somehow you’ve convinced yourself that AI is sentient and learning from influences. It’s not. It has a database of pirated data that it uses to essentially copy/paste responses from.
    By no means do I think AI is sentient. Stop putting words in my mouth. Also, I do not believe that sentience absolves an entity from paying fees for using data that it has absorbed from other beings, as you seem to be implying there. And I totally disagree with your explanation of AI. It is not a "copy" of data. In fact, every single time I use AI to get some data, it ALWAYS goes to the internet to look things up, because it hasn't "memorized the internet" as simpletons think it has.

    Up until now, the courts have been the entity that decides whether a "work" that has been "used for profit" has "infringed" on someone else's work. That's a perfectly valid system going forward. AI doesn't change anything here. If anyone uses AI to write a plagiarized work, then the persons who benefit from that plagiarism should be suable. But we shouldn't stop AI from creating fair use derivatives of other people's work, just as you shouldn't be sued for writing a song that sounds vaguely similar to an ABBA song. If you can take advantage of "fair use", then so can other people who use AI for the same thing. After all, half the videos on YouTube are taking advantage of fair use laws, by using someone else's video or audio.
    This is hilarious: multiple people have pointed out that your initial post about paying for content was incorrect, and rather than defend your initial comment you just opted to talk about something else. You nailed it on the credibility front.
  • Reply 22 of 49
    mfryd Posts: 273 member
    longfang said:
    mfryd said:
    Meh. Seems emotional and sentimental. If you are placing your content on the web, you are practically posting it on the street for general view with absurd hopes of pennies trickling in on some desperate fancy rather than through proper business channels with an effective strategy of legally protecting and promoting yourself - childish. Most people who do such art that they may avoid other types of structured paid work - what do they expect when they treat their skill set as a hobby - likely not wanting to work for others on a structured gig - if that's even around much? What's even the issue here - not getting a piece of the trifling leavings of scrapers and edu-content pedlars? pedantic. Art needs to stop being a vague creation-vocation of the rando people and grow up. Successful society is based on complex businesses and legal structures requiring serious people acting seriously. Creativity is a real skill and needs focused training and  a hierarchy of knowledgeable people to propagate it through society. Sorry, but I have little symp for the dilettantes and dabblers hoping to otherwise avoid the soulless cubicle, construction site, and assembly line.
    It's not that simple.  The AI companies are scraping material that isn't on the web.  They are scanning and scraping printed books.  They are scraping copyrighted movies.   

    They are scraping the copyrighted works of artists who earn their living licensing their work.
    Would it be okay then if the scraping were done via the AI’s “eyes” aka camera reading a physical book?
    Having AI "read" the book with a camera is an interesting example.  The camera is making a copy of the work to be stored in the computer's memory.   Is this fair use or a copyright infringement?

    I am not saying whether or not it is "OK".  I am merely pointing out that these are complicated issues.  In the past, it was acceptable for someone to study copyrighted work in order to better their skills. It was common for these copyrighted works to teach and influence the work of others.  An up and coming writer might read the copyrighted works of popular authors in order to learn their techniques and style.

    Modern technology has allowed these traditional uses on an unprecedented scale.  A computer can easily memorize the complete works of Stephen King.  That's not something a person is likely to do.

    Society needs to determine what should be allowed and whether there should be limitations.  These determinations are typically made by Congress, and interpreted by the courts.
  • Reply 23 of 49
    Meh. Seems emotional and sentimental. If you are placing your content on the web, you are practically posting it on the street for general view with absurd hopes of pennies trickling in on some desperate fancy rather than through proper business channels with an effective strategy of legally protecting and promoting yourself - childish. Most people who do such art that they may avoid other types of structured paid work - what do they expect when they treat their skill set as a hobby - likely not wanting to work for others on a structured gig - if that's even around much? What's even the issue here - not getting a piece of the trifling leavings of scrapers and edu-content pedlars? pedantic. Art needs to stop being a vague creation-vocation of the rando people and grow up. Successful society is based on complex businesses and legal structures requiring serious people acting seriously. Creativity is a real skill and needs focused training and  a hierarchy of knowledgeable people to propagate it through society. Sorry, but I have little symp for the dilettantes and dabblers hoping to otherwise avoid the soulless cubicle, construction site, and assembly line.
    You realize, of course, that you posted this on a site that does exactly what you describe as a childish, desperate fancy, right?

    I think most of us would agree that "Successful society is based on complex businesses and legal structures requiring serious people acting seriously." I might get that as a tattoo.

    Now, when I read that I think of wise, thoughtful comments like mfryd has been posting here. AI raises complex business and legal questions and perhaps we need new legal models to ensure we end up with a successful society. Not sure how denigrating content creators as dilettantes and dabblers helps advance the conversation (although I do appreciate the alliteration).
  • Reply 24 of 49
    By no means do I think AI is sentient. Stop putting words in my mouth. Also, I do not believe that sentience absolves an entity from paying fees for using data that it has absorbed from other beings, as you seem to be implying there. And I totally disagree with your explanation of AI. It is not a "copy" of data. In fact, every single time I use AI to get some data, it ALWAYS goes to the internet to look things up, because it hasn't "memorized the internet" as simpletons think it has.

    Is this true or not? My understanding was that once an LLM is trained, it knows what it knows and doesn't need to "look things up on the Internet." But perhaps I'm a simpleton.


  • Reply 25 of 49
    mfryd said: I am merely pointing out that these are complicated issues.
    "Fair use" isn't really complicated. I'll provide an example that applies to myself. I like to design my own graphics for movies and TV shows that I have ripped from discs and stream from a home media server. To create the graphics, I typically go to the internet and look at a variety of posters and packaging that exist for that particular movie/show and choose the images and typographic treatments I like best and then digitally edit those elements into a new composition. This approach is "fair use" ONLY because the copyrighted material that I'm using without permission will not appear anywhere other than on my home Apple TV. In other words, it's purely for personal use. It has no commercial or professional application. And I'm not displaying these graphics to the general public.

    However, if I put those same graphic treatments into a professional portfolio to try and get a job designing those kinds of graphics, it wouldn't be "fair use" anymore. I would be violating copyright because I had never received permission to use any of the material professionally. 

    So you can see how the ruling by this particular judge is ignoring a very obvious copyright issue in regards to permissions. The only way an AI program that was trained on copyrighted material without the appropriate permissions could be considered "fair use" would be if the AI program was never made available to the public. Literally like if Sam Altman was the only person that could use ChatGPT. Because once it's publicly available as a product of a professional organization, it can't possibly be considered personal use anymore...just like the example of putting my home ATV graphics that used copyrighted material without permission into a professional portfolio. 
    edited June 25
  • Reply 26 of 49
    netrox Posts: 1,577 member
    Finally a common sense ruling! 


  • Reply 27 of 49
    mfryd Posts: 273 member
    mfryd said: I am merely pointing out that these are complicated issues.
    "Fair use" isn't really complicated. I'll provide an example that applies to myself. I like to design my own graphics for movies and TV shows that I have ripped from discs and stream from a home media server. To create the graphics, I typically go to the internet and look at a variety of posters and packaging that exist for that particular movie/show and choose the images and typographic treatments I like best and then digitally edit those elements into a new composition. This approach is "fair use" ONLY because the copyrighted material that I'm using without permission will not appear anywhere other than on my home Apple TV. In other words, it's purely for personal use. It has no commercial or professional application.

    However, if I put those same graphic treatments into a professional portfolio to try and get a job designing those kinds of graphics, it wouldn't be "fair use" anymore. I would be violating copyright because I had never received permission to use any of the material professionally. 

    So you can see how the ruling by this particular judge is ignoring a very obvious copyright issue in regards to permissions. The only way an AI program that was trained on copyrighted material without the appropriate permissions could be considered "fair use" would be if the AI program was never made available to the public. Literally like if Sam Altman was the only person that could use ChatGPT. Because once it's publicly available as a product of a professional organization, it can't possibly be considered personal use anymore...just like the example of putting my home ATV graphics that used copyrighted material without permission into a professional portfolio. 
    "Fair use" includes all sorts of things.   My understanding is that I can incorporate verbatim scenes from a copyrighted movie into a new production in which I provide a review of the movie.  I believe this is the case even if my production is a commercial endeavor, and I am making a boatload of money from it. 

    According to §107 of the Copyright Act, "fair use" includes use by reproduction in copies or phonorecords or by any other means specified by that section, for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research.  One could make the case that AI scraping of copyrighted work falls into the "research" category.

    One can make a reasonable case that under current copyright law AI can scrape copyrighted material for training purposes.  However, the output of AI has to be careful not to violate any copyrights.  In other words, you can train AI on the Harry Potter books, but the AI must be careful that its output doesn't violate J. K. Rowling's copyrights.   As has been mentioned, this is not yet settled law.  The courts may see things differently than I do.  Congress may choose to update copyright law.

    In any case, one needs to be careful not to confuse discussions of "right and wrong" with discussions of what the current law allows.

  • Reply 28 of 49
    killroy Posts: 295 member
    timpetus said:
    I seem to remember someone offering a program that "poisons" image data in a way that is undetectable by the human eye, but makes it not only useless for AI training but actually harmful to any AI model trained using the photo. This is a great way to (a) protect your work without relying on nearly impossible detection and legal enforcement measures and (b) accelerate the inevitable destruction of so-called AI image generation. I believe no matter what, we will get to a state of GIGO with "AI" soon, where the verifiably human-generated pool of training data will continually shrink in comparison to the massive deluge of AI-generated data, some of which will be unidentifiable as such at time of selection.
    There's been a security report that the Russians are poisoning content that is used to train AI, to make it give errors in its output. It's gonna be another form of cyber attack.
    Exposing Pravda: How pro-Kremlin forces are poisoning AI ... (Atlantic Council, https://www.atlanticcouncil.org › blogs › new-atlanticist)
    edited June 25
  • Reply 29 of 49
    Mike Wuerthele Posts: 7,184 administrator
    Meh. Seems emotional and sentimental. If you are placing your content on the web, you are practically posting it on the street for general view with absurd hopes of pennies trickling in on some desperate fancy rather than through proper business channels with an effective strategy of legally protecting and promoting yourself - childish. Most people who do such art that they may avoid other types of structured paid work - what do they expect when they treat their skill set as a hobby - likely not wanting to work for others on a structured gig - if that's even around much? What's even the issue here - not getting a piece of the trifling leavings of scrapers and edu-content pedlars? pedantic. Art needs to stop being a vague creation-vocation of the rando people and grow up. Successful society is based on complex businesses and legal structures requiring serious people acting seriously. Creativity is a real skill and needs focused training and  a hierarchy of knowledgeable people to propagate it through society. Sorry, but I have little symp for the dilettantes and dabblers hoping to otherwise avoid the soulless cubicle, construction site, and assembly line.
    This is the strangest pro-AI manifesto that I've read this week.
  • Reply 30 of 49
    danox Posts: 3,882 member
    mfryd said:
    danox said:
    mfryd said:
    It's a complicated topic.

    There are good points on both sides of the training question.  On one hand, AI programs are being trained based on the hard work of previous human artists.  The AI companies are profiting, but the original artists get nothing. 

    On the other hand, the AI is not doing anything new.  It's common for individuals to study the work of others, and use that study to inform their work.  When interviewed, great directors often discuss how they have studied the works of great directors to learn their techniques and style.  The AI programs are simply really good at this.

    My understanding, is that an art student can study the works of a current artist, and produce new works in that style.   I don't believe an artist's style is protectable by copyright.  What an artist can't do, is to produce work that is essentially a copy of an existing copyrighted work, or that contains copyrighted elements (including copyrighted characters).  An artist also has to be careful that work done in someone else's style is not represented as being that artist's work.  If I were to write a book in the style of Dr. Seuss, I would need to make it very clear that the book was *not* a work by Dr. Seuss. 

    Copyright allows control over making copies of a creative work.  It does not allow control over works that were "inspired" by a copyrighted piece.

    An issue with current AI, is that it doesn't understand the limitations of copyright law, and can sometimes produce results that would typically be considered copyright infringement.  

    It's going to take a while to sort out what rights various parties should have.   There is more than one reasonable way to resolve the legal issues.  It will be interesting to see how Congress and the courts resolve these issues.

    Disclaimer: I am not an attorney, and this is not legal advice.  It is merely my imperfect understanding of some of the issues.

    AI can’t think and it can’t reason and because of that it knows no limitations today, however one day it will, but that day is decades away, but that does not mean you should get to scrape all of the copyrighted material since 1920 at your leisure but the protected class gets to do so.
    People are allowed to scrape as much copyrighted material as they like.  Machines are simply better at it.

    This is a common challenge with new technology.  In the past, certain activities were limited by the technology of the time.  Therefore, certain activities could not rise to the level where they were a common issue.  As technology improves, so do various abilities.

    For instance, 50 years ago we didn't really need laws governing the ability for private companies to track people.  If they wanted to track someone, they hired a private investigator, and he would follow the person of interest.  If you wanted to track 50 people, you would need 50 private investigators.  The available technology limited the collection of tracking data.   If a company wanted to track someone, and sell that information, they could.  It just wasn't a common thing.

    Today, the three major cellular companies maintain a real time database of where just about every adult is currently located.  They have to.  They need to know where you are so when someone calls you the signal only needs to go to the cell tower closest to you.  That data is extremely valuable.  Knowing where you are, and where you have been, makes it possible to make some very good guesses about your likes and dislikes.  That makes it possible to target you with ads, that are designed to appeal to your personal preferences, or feed off your personal fears.

    Once it becomes trivial to track people, we need to think about whether and how to regulate tracking.

    In the past, it wasn't possible to read a large percentage of what gets published.  It was even less possible to memorize every passage of every book you have ever read.   Now that computers are doing this, it's important that we consider whether we need new regulations and what they should be.

    People are not allowed to scrape, if scraping means reading something once or twice or thrice and then writing a thesis/paper at a university. Later on, become famous/prominent and see if you’ll be allowed to get away with copying/scraping (remembering it too well) once again. If you have all the knowledge from before 1920, which is in the public domain, shouldn’t that be enough? And everything afterwards, in the last 125 years, you pay for? How difficult is that?

    And the way the court systems work, if you don’t raise a fuss now you will never get satisfaction. It's similar to trademarks: if you don’t keep on top of it, if you don’t try to enforce it, the court system says too bad.

    Greedy AI companies: all of civilized (dawn of agriculture) human history, from approximately 11,000 B.C. until 1920, is free, and it still isn’t enough…. The kicker in this will be Apple being sought out and sued for scraping in the next five years, despite this ruling.
  • Reply 31 of 49
    danox said:
    mfryd said:
    danox said:
    mfryd said:
    It's a complicated topic.

    There are good points on both sides of the training question.  On one hand, AI programs are being trained based on the hard work of previous human artists.  The AI companies are profiting, but the original artists get nothing. 

    On the other hand, the AI is not doing anything new.  It's common for individuals to study the work of others, and use that study to inform their work.  When interviewed, great directors often discuss how they have studied the works of great directors to learn their techniques and style.  The AI programs are simply really good at this.

    My understanding, is that an art student can study the works of a current artist, and produce new works in that style.   I don't believe an artist's style is protectable by copyright.  What an artist can't do, is to produce work that is essentially a copy of an existing copyrighted work, or that contains copyrighted elements (including copyrighted characters).  An artist also has to be careful that work done in someone else's style is not represented as being that artist's work.  If I were to write a book in the style of Dr. Seuss, I would need to make it very clear that the book was *not* a work by Dr. Seuss. 

    Copyright allows control over making copies of a creative work.  It does not allow control over works that were "inspired" by a copyrighted piece.

    An issue with current AI, is that it doesn't understand the limitations of copyright law, and can sometimes produce results that would typically be considered copyright infringement.  

    It's going to take a while to sort out what rights various parties should have.   There is more than one reasonable way to resolve the legal issues.  It will be interesting to see how Congress and the courts resolve these issues.

    Disclaimer: I am not an attorney, and this is not legal advice.  It is merely my imperfect understanding of some of the issues.

    AI can’t think and it can’t reason and because of that it knows no limitations today, however one day it will, but that day is decades away, but that does not mean you should get to scrape all of the copyrighted material since 1920 at your leisure but the protected class gets to do so.
    People are allowed to scrape as much copyrighted material as they like.  Machines are simply better at it.

    This is a common challenge with new technology.  In the past, certain activities were limited by the technology of the time.  Therefore, certain activities could not rise to the level where they were a common issue.  As technology improves, so do various abilities.

    For instance, 50 years ago we didn't really need laws governing the ability for private companies to track people.  If they wanted to track someone, they hired a private investigator, and he would follow the person of interest.  If you wanted to track 50 people, you would need 50 private investigators.  The available technology limited the collection of tracking data.   If a company wanted to track someone, and sell that information, they could.  It just wasn't a common thing.

    Today, the three major cellular companies maintain a real time database of where just about every adult is currently located.  They have to.  They need to know where you are so when someone calls you the signal only needs to go to the cell tower closest to you.  That data is extremely valuable.  Knowing where you are, and where you have been, makes it possible to make some very good guesses about your likes and dislikes.  That makes it possible to target you with ads, that are designed to appeal to your personal preferences, or feed off your personal fears.

    Once it becomes trivial to track people, we need to think about whether and how to regulate tracking.

    In the past, it wasn't possible to read a large percentage of what gets published.  It was even less possible to memorize every passage of every book you have ever read.   Now that computers are doing this, it's important that we consider whether we need new regulations and what they should be.

    People are not allowed to scrape, if scraping means reading something once or twice or thrice and then writing a thesis/paper at a university. Later on, become famous/prominent and see if you’ll be allowed to get away with copying/scraping (remembering it too well) once again. If you have all the knowledge from before 1920, which is in the public domain, shouldn’t that be enough? And everything afterwards, in the last 125 years, you pay for? How difficult is that?

    And the way the court systems work, if you don’t raise a fuss now you will never get satisfaction. It's similar to trademarks: if you don’t keep on top of it, if you don’t try to enforce it, the court system says too bad.

    Greedy AI companies: all of civilized (dawn of agriculture) human history, from approximately 11,000 B.C. until 1920, is free, and it still isn’t enough…. The kicker in this will be Apple being sought out and sued for scraping in the next five years, despite this ruling.
    You're misconstruing mfryd's point. He's not defending the notion that AI should be ungoverned or that content creators shouldn't be compensated (somehow). His main thesis is that simple analogies and simple solutions won't work.

    And your first example isn't factually correct. Experts (real experts) do exactly what you say they aren't "allowed to get away with". They read voraciously and are able to present that information coherently and effectively to people. For example, that's exactly what Neil deGrasse Tyson does. As far as I know, he's not a world-class researcher who has independently amassed a wealth of astronomical data thru observation. He's a guy that has read what a lot of scientists and other experts have written, and he is very successful presenting that information. If he drifted into the bounds of plagiarism, we would hear about it and he would be chastised appropriately. I'm sure there are scientists who are jealous of his fame, but no one disputes his right to say and write what he says based on what he "scraped" from various sources.

    On the other hand, it feels very different when a computer does this, because it can do so at massive scale. So maybe, morally and/or legally, that's fine. Or maybe it's qualitatively different and should be treated differently. It's complicated.

    There are plenty of legal and regulatory changes that were sparked by something radically changing due to technology and industrialization. Why would we expect anything different here? 
  • Reply 32 of 49
    mfryd said:
    mfryd said: I am merely pointing out that these are complicated issues.
    "Fair use" isn't really complicated. I'll provide an example that applies to myself. I like to design my own graphics for movies and TV shows that I have ripped from discs and stream from a home media server. To create the graphics, I typically go to the internet and look at a variety of posters and packaging that exist for that particular movie/show and choose the images and typographic treatments I like best and then digitally edit those elements into a new composition. This approach is "fair use" ONLY because the copyrighted material that I'm using without permission will not appear anywhere other than on my home Apple TV. In other words, it's purely for personal use. It has no commercial or professional application.

    However, if I put those same graphic treatments into a professional portfolio to try and get a job designing those kinds of graphics, it wouldn't be "fair use" anymore. I would be violating copyright because I had never received permission to use any of the material professionally. 

    So you can see how the ruling by this particular judge is ignoring a very obvious copyright issue in regards to permissions. The only way an AI program that was trained on copyrighted material without the appropriate permissions could be considered "fair use" would be if the AI program was never made available to the public. Literally like if Sam Altman was the only person that could use ChatGPT. Because once it's publicly available as a product of a professional organization, it can't possibly be considered personal use anymore...just like the example of putting my home ATV graphics that used copyrighted material without permission into a professional portfolio. 
    "Fair use" includes all sorts of things.   My understanding is that I can incorporate verbatim scenes from a copyrighted movie into a new production in which I provide a review of the movie.  I believe this is the case even if my production is a commercial endeavor, and I am making a boatload of money from it. 
    From https://www.copyright.gov/help/faq/faq-fairuse.html:
    How much of someone else's work can I use without getting permission?

    Under the fair use doctrine of the U.S. copyright statute, it is permissible to use limited portions of a work including quotes, for purposes such as commentary, criticism, news reporting, and scholarly reports. There are no legal rules permitting the use of a specific number of words, a certain number of musical notes, or percentage of a work. Whether a particular use qualifies as fair use depends on all the circumstances. See Fair Use Index, and Circular 21, Reproductions of Copyrighted Works by Educators and Librarians.

    If you include whole "verbatim scenes" in a review, there's an excellent chance that you'd be violating "fair use." If one could argue that you are profiting by showing the unaltered scenes from the movie, rather than thru your commentary, you're on legal thin ice. But IANAL.

    (Update/clarification: the issue is not whether the potential offender "profits," it is whether those actions result in "actual or potential market substitution." That is, if someone would possibly view your movie instead of my movie because you included whole scenes of my movie in your "review" movie, that would be a point in favor of it not being fair use.)

    edited June 25
  • Reply 33 of 49
    anonymouse Posts: 7,122 member
    mfryd said:

    This brings up the question as to what AI should do if the prompt is "write a new short story about Harry Potter that takes place during his first year at Hogwarts."   Such a story would likely violate J. K. Rowling's copyrights, as the characters in the Harry Potter stories are copyrighted intellectual property.
    This is not technically correct. The characters in a copyrighted work are not copyrighted IP in and of themselves. What is actually copyrighted IP are the words used to describe those characters and what they do. However, copyright also includes the right to make, or control the making of, derivative works (that do not fall under fair use doctrine). So, even though the characters themselves are not copyrighted, a story that essentially duplicates the names, traits and behavior of one or more characters would likely violate copyright by being considered a derivative work.
  • Reply 34 of 49
    mfryd said:

    This brings up the question as to what AI should do if the prompt is "write a new short story about Harry Potter that takes place during his first year at Hogwarts."   Such a story would likely violate J. K. Rowling's copyrights, as the characters in the Harry Potter stories are copyrighted intellectual property.
    This is not technically correct. The characters in a copyrighted work are not copyrighted IP in and of themselves. What is actually copyrighted IP are the words used to describe those characters and what they do. However, copyright also includes the right to make, or control the making of, derivative works (that do not fall under fair use doctrine). So, even though the characters themselves are not copyrighted, a story that essentially duplicates the names, traits and behavior of one or more characters would likely violate copyright by being considered a derivative work.
    They are also likely to be protected by trademark law. That might be more relevant in this example.
  • Reply 35 of 49
    If everyone who writes a comment on this page will send a fee to Dr Seuss for learning from his books to read and speak, then I will pay attention to their views if they oppose AI learning from published sources. But if you aren't willing to pay everyone that you learn from, for every word that comes out of your mouth, then I don't see why AI should have to pay either. Next, are we going to charge aliens for learning English by reading the radio waves that are being sent into deep space?
    Somehow you’ve convinced yourself that AI is sentient and learning from influences. It’s not. It has a database of pirated data that it uses to essentially copy/paste responses from.
    By no means do I think AI is sentient. Stop putting words in my mouth. Also, I do not believe that sentience absolves an entity from paying fees for using data that it has absorbed from other beings, as you seem to be implying there. And I totally disagree with your explanation of AI. It is not a "copy" of data. In fact, every single time I use AI to get some data, it ALWAYS goes to the internet to look things up, because it hasn't "memorized the internet" as simpletons think it has.

    To answer my own question above, related to this assertion.  Here's a quote from the judge's ruling in this case. Sure sounds like the "simpletons" are onto something.

    https://storage.courtlistener.com/recap/gov.uscourts.cand.434709/gov.uscourts.cand.434709.231.0_2.pdf (page 7)

    Fourth, each fully trained LLM itself retained “compressed” copies of the works it had trained upon, or so Authors contend and this order takes for granted.  In essence, each LLM’s mapping of contingent relationships was so complete it mapped or indeed simply “memorized” the works it trained upon almost verbatim.  So, if each completed LLM had been asked to recite works it had trained upon, it could have done so.
  • Reply 36 of 49
    mfryd said: "Fair use" includes all sorts of things.  
    "Fair use" for professional products is always limited to excerpts. AI programs are not trained on excerpts and they are definitely professional products. 
  • Reply 37 of 49
    mfryd Posts: 273 member
    danox said:
    mfryd said:
    danox said:
    mfryd said:
    It's a complicated topic.

    There are good points on both sides of the training question.  On one hand, AI programs are being trained based on the hard work of previous human artists.  The AI companies are profiting, but the original artists get nothing. 

    On the other hand, the AI is not doing anything new.  It's common for individuals to study the work of others, and use that study to inform their work.  When interviewed, great directors often discuss how they have studied the works of great directors to learn their techniques and style.  The AI programs are simply really good at this.

    My understanding, is that an art student can study the works of a current artist, and produce new works in that style.   I don't believe an artist's style is protectable by copyright.  What an artist can't do, is to produce work that is essentially a copy of an existing copyrighted work, or that contains copyrighted elements (including copyrighted characters).  An artist also has to be careful that work done in someone else's style is not represented as being that artist's work.  If I were to write a book in the style of Dr. Seuss, I would need to make it very clear that the book was *not* a work by Dr. Seuss. 

    Copyright allows control over making copies of a creative work.  It does not allow control over works that were "inspired" by a copyrighted piece.

    An issue with current AI, is that it doesn't understand the limitations of copyright law, and can sometimes produce results that would typically be considered copyright infringement.  

    It's going to take a while to sort out what rights various parties should have.   There is more than one reasonable way to resolve the legal issues.  It will be interesting to see how Congress and the courts resolve these issues.

    Disclaimer: I am not an attorney, and this is not legal advice.  It is merely my imperfect understanding of some of the issues.

    AI can’t think and it can’t reason, and because of that it knows no limitations today. One day it will, but that day is decades away. That does not mean you should get to scrape all of the copyrighted material since 1920 at your leisure, but the protected class gets to do so.
    People are allowed to scrape as much copyrighted material as they like.  Machines are simply better at it.

    This is a common challenge with new technology.  In the past, certain activities were limited by the technology of the time.  Therefore, certain activities could not rise to the level where they were a common issue.  As technology improves, so do various abilities.

    For instance, 50 years ago we didn't really need laws governing the ability for private companies to track people.  If they wanted to track someone, they hired a private investigator, and he would follow the person of interest.  If you wanted to track 50 people, you would need 50 private investigators.  The available technology limited the collection of tracking data.   If a company wanted to track someone, and sell that information, they could.  It just wasn't a common thing.

    Today, the three major cellular companies maintain a real time database of where just about every adult is currently located.  They have to.  They need to know where you are so when someone calls you the signal only needs to go to the cell tower closest to you.  That data is extremely valuable.  Knowing where you are, and where you have been, makes it possible to make some very good guesses about your likes and dislikes.  That makes it possible to target you with ads, that are designed to appeal to your personal preferences, or feed off your personal fears.

    Once it becomes trivial to track people, we need to think about whether and how to regulate tracking.

    In the past, it wasn't possible to read a large percentage of what gets published.  It was even less possible to memorize every passage of every book you have ever read.   Now that computers are doing this, it's important that we consider whether we need new regulations and what they should be.

    People are not allowed to scrape, if scraping means reading something once or twice or thrice and then writing a thesis/paper at a university. Become famous/prominent later on, and see if you’ll be allowed to get away with copying/scraping it (remembering it too well) once again. If you have all the knowledge from before 1920, which is in the public domain, shouldn’t that be enough? And everything afterwards, in the last 125 years, you pay for? How difficult is that?

    And the way the court systems work, if you don’t raise a fuss now you will never get satisfaction. It's similar to trademarks: if you don’t keep on top of it, if you don’t try to enforce it, the court system says too bad.

    Greedy AI companies: all of civilized (dawn of agriculture) human history, from approximately 11,000 B.C. until 1920, is free, and it still isn’t enough... The kicker in this is Apple being sought out and sued for scraping in the next five years despite this ruling.
    I am not a lawyer, however this is my understanding of what someone can do without violating copyright law.

    A human writer can read everything ever written by Stephen King.
    They can study his story structure, how he introduces characters. How he spaces out the various plot points.  Whether or not he uses cliffhangers at the end of chapters.  How he structures story arcs into various acts.  Whether he uses fictional, or real locations.  Which aspects of characters he describes, and which he leaves to the reader's imagination.  Etc.  They can look at every nuance of King's style.

    With enough studying, the human could write a story that reads like a Stephen King novel.  Readers may enjoy it as much as a Stephen King novel.  If the reader didn't look at the author's name on the jacket, they might even mistake it for a new Stephen King novel.   

    I don't think such a story would violate King's intellectual property rights, as long as the human writer didn't claim it was a Stephen King story and didn't use any of King's plots, characters, or fictional locations.

    In essence, the new story would be heavily influenced by Stephen King's style, without using any of his copyrighted material.

    Now, this would be difficult for a human to do.  It would be hard for a human to fully study King's work to such a level that he could precisely match King's style. However, that's exactly what modern AI is good at doing.    

    We are presented with situations not envisioned by the law because AI can now easily do something that was previously impractical.
  • Reply 38 of 49
    mfryd said: I am not a lawyer, however this is my understanding of what someone can do without violating copyright law.
    Yes, a person can do that without violating copyright. But AI doesn't work like the human mind. AI requires the complete works of Stephen King to be copied into a database. If it's done without permission, then it's a violation of copyright. 
  • Reply 39 of 49
    mfryd Posts: 273 member
    mfryd said: I am not a lawyer, however this is my understanding of what someone can do without violating copyright law.
    Yes, a person can do that without violating copyright. But AI doesn't work like the human mind. AI requires the complete works of Stephen King to be copied into a database. If it's done without permission, then it's a violation of copyright. 
    One can certainly make a reasonable case that an internal copy is a violation.  One can also make a case that an internal copy is fair use.  

    AI may not store text in a simple format.  A computer can parse the text, break it down into verbs, nouns, concepts, etc.  The computer might be storing a sophisticated parse tree and analysis of the original text, not the original text itself.  Perhaps this conversion is transformative enough that it isn't a violation?   Does it make a difference if the computer can undo the transformation and recover the original text?

    The courts are currently working on clarifying how to apply the existing copyright law to these new, and unforeseen usages.
  • Reply 40 of 49
    If everyone who writes a comment on this page will send a fee to Dr Seuss for learning from his books to read and speak, then I will pay attention to their views if they oppose AI learning from published sources. But if you aren't willing to pay everyone that you learn from, for every word that comes out of your mouth, then I don't see why AI should have to pay either. Next, are we going to charge aliens for learning English by reading the radio waves that are being sent into deep space?
    In addition to the fee that was paid when the books were purchased?

    Somehow you’ve convinced yourself that AI is sentient and learning from influences. It’s not. It has a database of pirated data that it uses to essentially copy/paste responses from.

    If I wrote a book called “Blue Eggs and Spam” and charged people for it, you better believe I’d be sued. It shouldn’t be any different when AI companies do it. 
    True story: someone tried to publish a Star Trek-themed rewrite of a Dr. Seuss book, "Oh, the Places You'll Boldly Go!" The estate of Dr. Geisel sued for copyright infringement and won. The knockoff was judged “a non-transformative commercial work that targeted and usurped Go!’s potential market.”

    https://www.copyright.gov/fair-use/summaries/drseuss-comicmix-9thcir2020.pdf