In this document, we collect questions delineating the field of research and activism that we refer to as "Openness and AI."
This draft document was prepared by Zuzanna Warso, Paul Keller, and Alek Tarkowski. We are sharing this as a request for comments and additional questions.
If you’d like to be attributed as a contributor, please leave your name below.
If you have any questions, please write to [email protected]
We are, broadly speaking, interested in:
The role that “open” and the open movement play with regard to AI input data, and how that role can be understood and shaped by open movement actors.
The role of “open” in AI outputs (or “products”), including code and models. We are interested in whether the traditional assumptions about openness, and their codification in concepts like the Open Definition, the Open Source Definition, and CC licenses, still hold and suffice.
These questions can be answered through both research and advocacy/activism. We use them in our work to structure discussions about AI/ML and openness and to map our analytical and policy interests.
💡 These questions matter to stewards of the “information commons,” but AI researchers also identify them as a key issue. They also concern the rights of content providers.
What is the proper governance model for commons-based, public-interest AI datasets?
What role should data intermediaries (data trusts, co-ops, public data, commons) play?
What role do opt-out mechanisms play in AI and data governance? Could they be part of a broader concept of “open AI,” similar to the RAIL approach? (A minimal technical sketch of one such mechanism follows this list.)
Should living creators have a say in how their works are used for AI training? If so, what tools are, or should be, available to make this a reality?
Does the open movement have an obligation to ensure the public availability of data for AI training?
Should making an AI system open source entail a requirement to provide information about, and access to, the data used to develop it? What key considerations should inform this issue?
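To make the opt-out question above more concrete: one opt-out mechanism in use today is a robots.txt rule addressed to a specific AI crawler. The sketch below (Python, standard library only) shows how a data-collection pipeline could check such a rule before ingesting a page. The user-agent token “GPTBot” is one real example of an AI crawler name; treat this as an illustration of the mechanism under those assumptions, not a complete compliance solution.

```python
# Minimal sketch: honoring a robots.txt-based AI-training opt-out.
# Assumes the publisher opts out by disallowing a known AI crawler
# user-agent (here "GPTBot", used purely as an illustration).
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

def may_collect(page_url: str, crawler_agent: str = "GPTBot") -> bool:
    """Return True if robots.txt does not opt this page out for the agent."""
    parts = urlsplit(page_url)
    robots = RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
    robots.read()  # fetch and parse the site's robots.txt
    return robots.can_fetch(crawler_agent, page_url)

if __name__ == "__main__":
    url = "https://example.com/article"
    if may_collect(url):
        print(f"{url}: no opt-out signal, may be collected")
    else:
        print(f"{url}: publisher has opted out for this agent")
```

Note that robots.txt only expresses a request; whether and how such signals become binding (for example, as TDM opt-outs under EU copyright law) is exactly the governance question raised above.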
💡 This concerns non-traditional stakeholders of AI systems: creators and platforms that share creative content.
Should the output of AI systems be protected in any form (copyright or otherwise)?
What harms arise from AI systems replicating someone’s style, and how could these harms be managed?
What are the rights of creators (including collective rights) vis-à-vis AI systems that create content replicating their style?
What might be the impact of open AI systems on creativity (e.g., in the context of processes such as sampling and remixing)?
What might be the long-term impact of AI systems on openness, e.g., through the automation of commons-based peer production?
How does opening up (GP)AI systems impact AI research and development?
💡 These are critical questions for both AI researchers and the maintainers of open-source systems. They appear to be the starting point for understanding what “openness and AI” means and what its policy implications are.
Is “open AI” the same as “open-source AI” (is it defined just by licensing norms, or also by something else)? How do we define “open” in the context of open-source AI?
Where do the different types of licenses come in across the overall AI stack (data, datasets, models, etc.), and which licenses apply at each layer (content licenses, open source licenses, RAIL licenses)?
Are RAIL licenses compliant with the Open Definition? Do they need to be?
Should the definition of open source be expanded to include RAIL-type licensed AI?
If we assume that “open source AI” means AI licensed under an open source license, what do we call the broader space that also includes RAIL-type licensed AI?
What challenges does open/RAIL licensing of AI pose from the perspective of open frameworks (for example, license proliferation and the enforcement of use restrictions)?
What is the map of licensing practices in the open AI field?
How do transparency, traceability, and legibility requirements on the one hand, and “openness and AI” on the other, interact?
💡 These topics are relevant to AI Act advocacy or broader policy debates on AI regulation.
Should open-source AI be regulated differently than other AI systems? If yes, why? How should it be regulated?
What are the arguments in favor of excluding open-source GPAI from the scope of the AI Act?
What does “putting into service or on the market” mean in the case of an open-source (GP)AI?
What is the relationship between regulation and licensing? Is there a role for the terms of service of platforms such as GitHub?
What is the role of (GP)AI licensing vis-à-vis high-risk uses of (GP)AI? Can licenses shield the developer/provider from the requirements that apply to high-risk uses?