Skip to content

Menu

LexBlog, Inc. logo
NetworkSub-MenuBrowse by SubjectBrowse by PublisherJoin the NetworkGet StartedSubscribeSupport
Contact Us
Search
Close

California Enacts Additional Generative AI Bills Touching on Training Data

By Jeffrey Neuburger on September 30, 2024
Email this postTweet this postLike this postShare this post on LinkedIn

After several weeks of handwringing about the fate of SB 1047 – the controversial AI safety bill that would have required developers of powerful AI models and entities providing the computing resources to train such models to put appropriate safeguards and policies into place to prevent critical harms – California Governor Gavin Newsom announced that he vetoed the “well-intentioned” bill because it was not necessarily “the best approach to protecting the public from real threats posed by the technology.” 

However, amidst the hubbub surrounding SB 1047, the Governor in the past month has reportedly signed 17 bills covering generative AI (“GenAI”), including bills on deepfakes, AI watermarking, child safety, performers’ AI rights and election misinformation, some of which we discussed in a prior post. 

Most notably, over the weekend the Governor signed AB 2013, which would require that, on or before January 1, 2026, developers of GenAI systems or services released on or after January 1, 2022 post a “high-level summary” of the datasets used to train[1] such GenAI systems or services “made publicly available to Californians for use” (as well as post a summary of the training data after a “substantial modification” to a GenAI system or service that is made publicly available to Californians for use).  The law lists multiple items that should be part of a “high-level summary,” including:

  • The sources or owners of the datasets
  • The general number of data points included in the datasets and a description of the types of data points within the datasets
  • Whether the datasets include any data protected by IP
  • Whether the datasets were purchased or licensed by the developer
  • Whether the datasets include personal information, or aggregate consumer information as defined under the CCPA
  • Whether there was any cleaning, processing, or other modification to the datasets by the developer
  • The time period during which the data in the datasets were collected or first used during the GenAI development process
  • Whether the GenAI system or service uses synthetic data generation in its development.

Beyond the requirement that the covered GenAI system be “publicly available to Californians,” which would perhaps exclude certain enterprise models that are not generally made available to the public, the law also exempts, among other things, GenAI systems whose “sole purpose” is for data security and integrity or used by a federal entity for national defense purposes. 

AB 2013 is at the forefront of state AI regulation surrounding GenAI training data sources. GenAI developers generally do not disclose data sources used to train existing or frontier models for a multitude of business or competitive reasons.  In addition, with a host of ongoing copyright and privacy-related litigations against major GenAI developers over allegations that such developers used copyrighted material or consumers’ personal information for training purposes without authorization, training data sources have been at the center of such claims and not discussed in detail in public.

Other AI-related bills recently signed include:

  • AB 1008: The law clarifies that “personal information” under the CCPA can exist in various formats, including, but not limited to, physical formats, digital formats, and  abstract digital formats, which would also include “artificial intelligence systems that are capable of outputting personal information.” [An Assembly Floor Analysis of AB 1008 law can be found here].
  • SB 1120: The law requires, among other things, a health care service plan or disability insurer that uses an AI, algorithm, or other software tool for the purpose of coverage decisions and functions (or contracts for such services through a third party vendor) to ensure that the AI tool bases its determination on specified information and is fairly and equitably applied.
  • AB 3030: The law requires a health facility, clinic, physician’s office, or group practice that uses GenAI to produce certain written or verbal patient communications pertaining to patient clinical information to include a disclaimer that informs the patient that GenAI was used to generate the communication, along with instructions on how the patient may contact a human health care provider or employee of the practice.

[1] Under AB 2013, “train a generative artificial intelligence system or service” includes “testing, validating, or fine tuning by the developer of the artificial intelligence system or service.”

Tags: AI
Photo of Jeffrey Neuburger Jeffrey Neuburger

Jeffrey Neuburger is co-head of Proskauer’s Technology, Media & Telecommunications Group, head of the Firm’s Blockchain Group and a member of the Firm’s Privacy & Cybersecurity Group.

Jeff’s practice focuses on technology, media and intellectual property-related transactions, counseling and dispute resolution. That expertise…

Jeffrey Neuburger is co-head of Proskauer’s Technology, Media & Telecommunications Group, head of the Firm’s Blockchain Group and a member of the Firm’s Privacy & Cybersecurity Group.

Jeff’s practice focuses on technology, media and intellectual property-related transactions, counseling and dispute resolution. That expertise, combined with his professional experience at General Electric and academic experience in computer science, makes him a leader in the field.

As one of the architects of the technology law discipline, Jeff continues to lead on a range of business-critical transactions involving the use of emerging technology and distribution methods. For example, Jeff has become one of the foremost private practice lawyers in the country for the implementation of blockchain-based technology solutions, helping clients in a wide variety of industries capture the business opportunities presented by the rapid evolution of blockchain. He is a member of the New York State Bar Association’s Task Force on Emerging Digital Finance and Currency.

Jeff counsels on a variety of e-commerce, social media and advertising matters; represents many organizations in large infrastructure-related projects, such as outsourcing, technology acquisitions, cloud computing initiatives and related services agreements; advises on the implementation of biometric technology; and represents clients on a wide range of data aggregation, privacy and data security matters. In addition, Jeff assists clients on a wide range of issues related to intellectual property and publishing matters in the context of both technology-based applications and traditional media.

Read more about Jeffrey NeuburgerEmail
Show more Show less
  • Posted in:
    Technology and AI
  • Blog:
    New Media and Technology Law Blog
  • Organization:
    Proskauer Rose LLP
  • Article: View Original Source

Call us at 1-800-913-0988 or email sales@lexblog.com.

Facebook LinkedIn Twitter RSS
  • About LexBlog
  • The Field We Built
  • Our Beliefs
  • Our Team
  • Contact LexBlog
  • Disclaimer
  • Editorial Policy
  • Terms of Service
  • Get Started
  • Publishing Solutions
  • Compass
  • Submit a Request
  • Support Center
  • System Status
Copyright © 2026, LexBlog, Inc. All Rights Reserved.
Law blog design & platform by LexBlog LexBlog Logo