Responsible AI Music
Ensuring the Future of Music Creation aligns with Trustworthy AI Principles
Generative AI is radically changing the creative arts, transforming the way we create and interact with cultural artefacts.
While offering unprecedented opportunities for artistic expression, this technology also raises ethical, societal, and legal concerns. Key among these are the potential displacement of human creativity, copyright infringement stemming from vast training datasets, and the lack of transparency, explainability, and fairness mechanisms. In response, a coalition of organisations representing the creative industries formed the Human Artistry Campaign, advocating for the responsible use of creative AI. As generative systems become pervasive in this domain, responsible design is crucial.
Responsible AI Music (RAIM) is a collaborative initiative bringing together musicians, AI experts, ethicists, and legal experts to define, expand, and monitor requirements for generative music AI. Our goal is to work towards a framework providing guidance on the responsible development and use of generative models and systems for music. By balancing innovation with ethical considerations, we advocate for an approach in which artists and AI developers collaborate in a way that safeguards, inspires, and augments human creativity and artistry.
Can this be done by leveraging a Trustworthy AI framework? This initiative takes a holistic approach, harmonising previous work that has tackled specific aspects of generative systems (e.g., transparency, evaluation, data), within the Ethics Guidelines for Trustworthy AI produced by the European Commission - a framework for designing responsible AI systems across 7 macro requirements. Focusing on generative music AI, we illustrate how these requirements can be contextualised for the field, addressing trustworthiness across multiple dimensions and integrating insights from the existing literature.
What is Trustworthy AI?
Trustworthy AI encompasses artificial intelligence systems designed and implemented to adhere to fundamental ethical principles, technical robustness, and legal compliance. A reference work in this domain is the Ethics Guidelines for Trustworthy Artificial Intelligence, a document prepared by the High-Level Expert Group on Artificial Intelligence, an independent expert group appointed by the European Commission in 2018.
The guidelines include 7 key requirements that AI systems should meet to be trustworthy:
- Human agency and oversight
- Technical robustness and safety
- Privacy and data governance
- Transparency
- Diversity, non-discrimination, and fairness
- Societal and environmental wellbeing
- Accountability

Guiding features for Responsible AI Music
We contextualise the Trustworthy AI framework for the domain of generative music AI by defining responsible features that can drive the design and evaluation of generative systems, in accordance with the literature.

Before presenting each feature, let's start by introducing some jargon. When referring to Music AIs, a distinction needs to be made between music model and generative music system.
- A music model can be defined as an algorithmic procedure that either encodes a set of rules explicitly, or learns them from the data and the task it is provided. These rules, e.g., a probability distribution for predicting the next note or chord in a piece, or a set of logical statements, can be used to generate, complete, or manipulate music.
- A generative music system encompasses the whole computational infrastructure built on top of a music model to enable users to interact with the model and make use of its outputs, without needing to understand its inner workings. This includes both technical and regulatory aspects: the interface, the logic that abstracts or hides certain parameters of the model, and the way the model's predictions are consumed; but also the data management system, the legal framework regulating the exchange of data, etc.
Typically, a generative system is implemented in such a way as to conveniently wrap the functionalities of a particular model, meaning that a model can provide the computational backbone to various generative systems (e.g., plugins for music editors, production environments, smart instruments). For example, MusicVAE has been reused in different applications, such as Beat Blender, Melody Mixer, Latent Loops, and is also available through Magenta Studio, a plugin for the DAW Ableton Live. The distinction between model and system is a distinctive aspect of generative AI, as their design and implementation involve different stakeholders - machine learning engineers and mathematicians for the former; UX designers, software developers, and data engineers for the latter - while sharing music experts as a common denominator driving the evaluation efforts.
All features are presented below, grouped under the Trustworthy AI pillar to which they belong.
Pillar 1: Human Agency and Oversight
Central to the design of trustworthy systems is the capacity of individuals to understand and meaningfully influence the actions of AI systems (human agency), while monitoring and interacting with their behaviour to ensure they operate responsibly and align with human values and preferences (human oversight).

Personalisation
The system can reuse the musical repertoire of the user to personalise the style of the generation.
Imagine having a musical collaborator that learns your unique style. Personalisation in generative music AI means the model/system can ingest your existing music - your favorite songs, your own compositions, or even just snippets of melodies you like - and use that information to create new music that aligns with the given style. This isn't about copying your music; it's about understanding your preferences for things like melody, harmony, and rhythm. The system effectively learns your 'musical fingerprint' and uses it to generate something fresh that still feels like *your* kind of music. This feature helps ensure that the AI acts as an extension of your own creativity, rather than a generic music-making machine.

Creative feedback (HITL)
The system can iteratively refine or improve the generation based on the feedback provided by the user throughout the creative process.
Think of this like a jam session with an AI. Instead of just generating a complete piece of music in one go, the system allows you to provide feedback *during* the creation process. You might hear a melody it generated and say, "Make it a bit sadder," or "Change that chord to a minor key." The system then takes your feedback into account and adjusts the music accordingly. This is called Human-in-the-Loop (HITL). This iterative process ensures that you, the user, remain in control of the creative direction, shaping the music to fit your vision, step by step, as the AI learns from your guidance to achieve your musical goals.
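As a toy illustration of this loop, the sketch below shows a melody being refined by a user directive; `generate` and `apply_feedback` are made-up stand-ins for calls to a real music model.

```python
import random

# Hypothetical sketch of a human-in-the-loop (HITL) refinement cycle.
# `generate` and `apply_feedback` stand in for a real music model.

def generate(seed_notes, rng):
    """Produce a candidate melody (as MIDI pitches) from seed notes."""
    return seed_notes + [rng.randint(60, 72) for _ in range(4)]

def apply_feedback(melody, feedback):
    """Adjust the melody according to a simple user directive."""
    if feedback == "lower":          # e.g. "make it a bit sadder"
        return [p - 3 for p in melody]
    if feedback == "raise":
        return [p + 3 for p in melody]
    return melody

rng = random.Random(42)
melody = generate([60, 62, 64], rng)
# The user listens and asks for a darker feel; the system refines the draft.
melody = apply_feedback(melody, "lower")
```

In a real system, each refinement round would re-invoke the model conditioned on the updated draft, keeping the user in control of the creative direction.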

Controllability (conditioning)
To steer the generations depending on the user's preferences, the system provides a rich variety of modalities, including linguistic, melodic, harmonic, rhythmic, and emotional control of the generation.
This feature is about giving you, the user, fine-grained control over the music the AI generates. Instead of just saying "make a pop song," you can provide specific instructions or initial inputs to follow (known as conditioning). You might give it a starting melody, specify a particular chord progression, set the tempo, or even describe the desired mood (e.g., "happy," "melancholy," "energetic"). The system uses these conditions as a starting point, and may also let you control the generation via different modes of interaction, such as humming a tune or drawing a rhythmic pattern. This multi-modal approach gives you the power to sculpt the music in detail, making the AI a collaborative tool.
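In practice, conditioning signals are often bundled into a structured request passed to the model. The sketch below is purely illustrative - `GenerationRequest` and its fields are hypothetical names, not a real API.

```python
from dataclasses import dataclass, field

# Illustrative sketch: conditioning signals bundled into a request object.
# All names (GenerationRequest, mood, chords, ...) are hypothetical.

@dataclass
class GenerationRequest:
    prompt: str = ""                                  # natural-language description
    seed_melody: list = field(default_factory=list)   # MIDI pitches to continue
    chords: list = field(default_factory=list)        # harmonic conditioning
    tempo_bpm: int = 120                              # rhythmic conditioning
    mood: str = "neutral"                             # emotional conditioning

req = GenerationRequest(
    prompt="a melancholy waltz",
    seed_melody=[57, 60, 64],
    chords=["Am", "F", "C", "G"],
    tempo_bpm=90,
    mood="melancholy",
)
```

A real system would feed such a request to its model, interpreting each field as a separate conditioning channel.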

Red button
If the system is generating and playing music that is unpleasant or disturbing, or that contains lyrics with offensive language, the user can halt the generation process at any time.
This is a safety feature, like an emergency stop button. While generative AI aims to create enjoyable music, there's always a chance it might produce something unexpected, inappropriate, or even offensive to some categories of users. The 'Red Button' feature guarantees that you, the user, have ultimate control. If at any point you hear something you don't like - whether it's a jarring melody, disturbing sounds, or inappropriate lyrics - you can immediately stop the music generation. This ensures a safe and comfortable experience, preventing exposure to unwanted or potentially harmful content. It puts the power back in your hands, literally.
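One common way to implement such a halt is a shared stop flag that the generation loop checks between steps. A minimal sketch, with the button press simulated in code:

```python
import threading

# Minimal 'red button' sketch: a shared stop flag is checked between
# generation steps so the user can halt the stream at any moment.

stop = threading.Event()
generated = []

def generate_stream(n_steps):
    for step in range(n_steps):
        if stop.is_set():          # the red button was pressed
            break
        generated.append(step)     # stand-in for emitting one bar of audio
        if step == 2:
            stop.set()             # simulate the user pressing the button

generate_stream(100)
```

In a real application the flag would be set from the UI thread, while the generation loop runs in the background and stops within one step of the press.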

Safety disclaimer
The system always advises the user whenever it can generate music that may be deemed dangerous or offensive to some categories of users (e.g. offensive language in lyrics).
This feature is not about preventing certain outputs; it is about making users aware of them. It ensures that you know about the *potential* for the system to generate music that some might find offensive or inappropriate. Before you start using the system, or when you select certain options, it will provide a clear disclaimer. This might say something like, "This system may generate lyrics containing strong language," or "The music generated might be experimental and include dissonant sounds." This allows you to make an informed decision about whether and how to use the system, especially if you're sensitive to certain types of content or are sharing the music with others.

Deception avoidance
The system does not show any deceptive behaviour related to the copyright and ownership of the generated material. For instance, the system never tries to claim ownership or novelty of the generations when music is possibly plagiarised.
This feature is about the origin of the music. The AI system should never claim that a piece of music is entirely original if it has heavily borrowed from existing copyrighted material. It should be clear about the fact that it's an AI generating the music, and it should not mislead users into thinking they have full ownership rights if the generated music infringes on someone else's copyright. This protects both the rights of original artists and the users of the system, preventing accidental copyright violations and ensuring ethical use of the technology.

Artist involvement
The system has been designed with the active involvement of creative professionals throughout its development cycle.
This feature emphasises the importance of human expertise in shaping AI tools for music. It means that the system wasn't just created by programmers in isolation; musicians, composers, and other creative professionals were actively involved in the design and development process. This is a form of Human-on-the-Loop (HOTL) approach. Their input helps ensure that the system is actually useful and relevant to artists, that it meets their needs, and that it aligns with artistic values. This collaboration helps to avoid creating a tool that's technically impressive but artistically irrelevant or even harmful to the creative process. It bridges the gap between technology and art.
Pillar 2: Robustness & Safety
This requirement focuses on accuracy, reliability and reproducibility, resilience to attack, security, and fallback plans.

Music leakage prevention
The system does not allow for the full or partial reconstruction of the music material used as training data unless this is explicitly acknowledged and allowed by the copyright holders of the music data.
This is about protecting the intellectual property of artists from potential leaks. The AI is trained on a large dataset of music. This feature ensures the system doesn't simply memorise and regurgitate pieces of that training data. It should generate *new* music, not copy existing works. If, for some reason, the system *does* reproduce part of its training data, it must be transparent about this and only do so with the explicit permission of the copyright holders. This prevents plagiarism and respects the rights of the original creators.
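A crude proxy for detecting such memorisation is exact n-gram matching between a generation and the training corpus; real systems would use fuzzier audio or symbolic similarity measures. An illustrative sketch:

```python
# Illustrative leakage check: flag a generation that reproduces long note
# sequences from the training corpus via exact n-gram matching. This is a
# crude proxy for memorisation, chosen only for clarity.

def ngrams(seq, n):
    """All length-n contiguous subsequences of a note sequence."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def leaks_training_data(generation, corpus, n=8):
    """True if any n consecutive notes also appear in a training piece."""
    gen_grams = ngrams(generation, n)
    return any(gen_grams & ngrams(piece, n) for piece in corpus)

training = [[60, 62, 64, 65, 67, 69, 71, 72, 74, 76]]
copied = training[0][:8] + [50, 51]       # reuses 8 notes verbatim
fresh = [48, 50, 53, 55, 57, 60, 62, 63]  # no 8-note overlap

flag_copied = leaks_training_data(copied, training)
flag_fresh = leaks_training_data(fresh, training)
```

A deployment would run such a check before releasing an output, escalating flagged generations for review or regeneration.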

Generation fallback
If the system needs to generate music continuously over time, and the content is considered offensive at some point, the system can switch to a simpler generative strategy (e.g. a rule-based mode).
This is a safety net for continuous music generation. Imagine the AI is providing background music, and suddenly it starts producing something inappropriate. This feature means the system can automatically switch to a safer, more predictable mode of generation. This 'rule-based mode' might be less sophisticated, perhaps playing pre-programmed sequences or simple melodies, but it guarantees that the output remains appropriate while the issue with the main generative model is addressed. It's like having a backup DJ ready to step in if the main performer starts playing something unsuitable.
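The switching logic can be as simple as a content check guarding the learned model's output, with a rule-based generator as the safe path. The components below are placeholders for illustration:

```python
# Sketch of a generation fallback: if the learned model's output is flagged
# by a content filter, switch to a simple rule-based generator. All three
# components below are toy placeholders for real ones.

def neural_generate():
    return {"notes": [60, 61, 62], "lyrics": "offensive placeholder"}

def rule_based_generate():
    # A safe, predictable fallback: an arpeggiated C major chord.
    return {"notes": [60, 64, 67], "lyrics": ""}

def is_safe(output):
    return "offensive" not in output["lyrics"]

def generate_with_fallback():
    output = neural_generate()
    if not is_safe(output):
        output = rule_based_generate()   # degrade gracefully
    return output

result = generate_with_fallback()
```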

Model evaluation
The evaluation of the model/system is consistent with other frameworks and benchmarks for music generation.
This ensures that the AI's performance is measured objectively and comparably. There are established ways of evaluating generative music systems, looking at things like musicality, originality, and similarity to a given style. This feature means the developers have used these standard methods, allowing their system to be fairly compared to others. This helps researchers and users understand how well the system performs relative to the state-of-the-art and promotes healthy competition and improvement in the field.

Music evaluation
The system/model provides a comprehensive evaluation of the musical properties of its generations.
This goes beyond simply saying "the music sounds good." It means the system's developers have thoroughly analysed the *musical characteristics* of the generated output. This might involve looking at things like melody, harmony, rhythm, structure, and timbre. They've likely used both objective metrics (e.g., calculating the complexity of the melody) and subjective evaluations (e.g., asking people to rate the music's pleasantness). This detailed analysis provides a clear picture of the system's strengths and weaknesses, going beyond superficial assessments.

Expert evaluation
The evaluation of the system/model involves creative professionals or music experts.
While objective metrics are important, human judgment is crucial in evaluating music. This feature means that the developers haven't just relied on algorithms to assess their system. They've also involved musicians, composers, or musicologists - people with deep musical knowledge and experience - to listen to the generated music and provide their expert opinions. This provides a valuable qualitative perspective, ensuring that the system is evaluated not just on technical grounds, but also on its artistic merit.

Model availability
The computational model behind the system is fully publicly available and includes pre-trained checkpoints.
This is about openness and scientific rigour. Making the model's code and pre-trained parameters (checkpoints) publicly available allows other researchers to: (1) Verify the results claimed by the developers. (2) Build upon the work, improving the system or adapting it for new purposes. (3) Understand the inner workings of the system, contributing to transparency and trust. This fosters collaboration and accelerates progress in the field.

Training data availability
The training material on which the system relies is fully publicly accessible.
This is another crucial aspect of reproducibility and transparency. If the training data is publicly available (while respecting copyright, of course), other researchers can: (1) Understand the potential biases in the system, as the training data strongly influences the system's output. (2) Try to replicate the training process. (3) Potentially improve the system by using a modified or expanded dataset. It's important to note that in many cases, full training data availability might not be possible due to copyright restrictions, but where feasible, it significantly enhances scientific openness.

Prompt-to-Gen
If sample generations are released, the model can be seeded and prompted to recreate the same musical content.
This means that if the developers release examples of music generated by the system, they also provide the exact 'prompts' (instructions) and 'seeds' (random starting points) that were used. This allows others to run the same model and, in principle, obtain the *exact* same output. This is a strong test of reproducibility. It demonstrates that the system's behaviour is deterministic (given the same inputs, it produces the same outputs) and that the reported results are genuine. It guards against cherry-picking the best results and helps build confidence in the system's reliability.
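The key ingredient is seeding the random number generator so that sampling is deterministic. The toy example below (a hypothetical `generate` choosing notes from a scale) shows how the same prompt and seed yield the same output; a real system would seed its ML framework's RNG in the same spirit.

```python
import random

# Sketch of seeded, reproducible generation: the same prompt and seed
# fix the pseudo-random sampling sequence, hence the output.

def generate(prompt, seed, length=8):
    # `prompt` would condition a real model; it is unused in this toy.
    rng = random.Random(seed)             # the seed fixes all sampling
    scale = [60, 62, 64, 65, 67, 69, 71]  # C major, as a toy vocabulary
    return [rng.choice(scale) for _ in range(length)]

a = generate("calm piano", seed=1234)
b = generate("calm piano", seed=1234)   # identical: same prompt, same seed
c = generate("calm piano", seed=9999)   # a different seed diverges
```

Releasing the prompt and seed alongside a sample generation lets anyone rerun the model and verify the published output.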
Pillar 3: Privacy and Data Governance
Central to the goal of preventing unintended harm and achieving trust with users, data governance must encompass the quality and integrity of the data utilised, its pertinence within the intended domain of AI deployment, strict access protocols, and mechanisms to safeguard privacy.

Prompt leakage prevention
The system neither distributes nor leaks any personal data used by users to prompt the generation of music.
This is a fundamental privacy protection. When you give the system instructions (prompts) - whether it's text, a melody you hum, or other data - that information should remain private. This feature guarantees that the system doesn't share your prompts with others, store them insecurely, or use them in a way that could reveal your personal information. It's about respecting your privacy and ensuring that your creative process remains confidential.

Training metadata integrity
The model advises users if it is trained on music data that is not fully and correctly attributed to the rightful authors.
This is about transparency regarding the *source* of the AI's musical knowledge. The system should inform you if the music it was trained on has incomplete or inaccurate metadata. Metadata is information *about* the music, like the composer, title, and copyright holder. If the metadata is flawed, it raises concerns about copyright infringement and the potential for the AI to learn from incorrectly attributed works. This feature is a warning sign that the system's training data might not be ethically sound.

Safety of training data
The model advises if it is trained on music data that can be deemed offensive or socially harmful (e.g. lyrics).
Just as the system should warn about potential copyright issues, it should also warn you if it was trained on music containing potentially offensive content. This might include lyrics with hate speech, violent themes, or other harmful material. This transparency allows you to make an informed decision about using the system, knowing that its output *could* reflect those problematic elements from its training data. It's a proactive step towards responsible AI development.

Prompt governance
If the system stores data collected from users through the generation process (e.g., prompts, feedback, music), access to it is fully regulated.
Many AI systems collect user data to improve their performance. This feature addresses how that data is handled. If the system stores your prompts, your feedback on its generations, or the music you create with it, access to that data should be strictly controlled. This means only authorised personnel should have access, and there should be clear policies in place to prevent misuse, unauthorised sharing, or accidental leaks. It's about protecting your data and ensuring responsible data management.

Generative reuse of music data
The model uses music material (e.g. scores, audio recordings, lyrics, MIDI recordings and transcriptions) whose licensing and terms of use explicitly allow for training systems.
This is a crucial copyright consideration. The AI learns by analysing existing music. This feature ensures that the music used for training is legally obtained and that its copyright terms *specifically permit* its use for training AI models. This avoids infringing on the rights of artists and copyright holders. It means the developers have done their due diligence to ensure they're not using music without permission.

Copyright and licensing of generations
The system provides guidance or clear and comprehensive information about the copyright and licensing that apply to the generations.
This addresses the complex issue of ownership of the music *created* by the AI. The system should provide clear information about: (1) Who owns the copyright to the generated music (it might be you, the developers, or a combination). (2) What you are allowed to do with the music (can you use it commercially, share it, modify it?). (3) Any limitations or restrictions on its use. This mechanism helps you avoid legal problems and understand your rights and responsibilities when using the AI-generated music.
Pillar 4: Transparency
This requirement is closely linked with the principle of explainability and encompasses transparency of elements relevant to generative systems, e.g., the data, the system, and the business model. Hereafter, we outline the main elements influencing the transparency of these systems.

System documentation
The system's design is well-documented throughout its development cycle, including instructions for model implementation, training, and data generation.
Comprehensive documentation is crucial for understanding, replicating, and building upon any complex system. This feature ensures that every stage of the AI's development - from its initial design to how it was trained and how it generates music - is thoroughly documented. This documentation should be clear, detailed, and accessible, allowing others to understand how the system works and potentially recreate it. It is a cornerstone of scientific best practice.

Evaluation documentation
The evaluation of the model/system is well-documented and reproducible, promoting consistency and transparency in future evaluations (e.g., of other music models).
This builds upon the previous point, focusing specifically on *how* the system's performance was evaluated. The documentation should clearly describe the evaluation methods, the metrics used, the datasets involved, and the results obtained. This allows others to: (1) Verify the evaluation process. (2) Compare the system's performance to other systems using the same evaluation methods. (3) Conduct future evaluations in a consistent and transparent manner. This promotes scientific rigor and comparability across different AI music systems.

Artefact watermarking
The system automatically embeds a watermark into every generation to signal its artificial nature.
This is about clearly identifying AI-generated music as such. A 'watermark' in this context might be an inaudible signal embedded in the audio, or a visible marker if the music is represented as a score. This helps prevent the music from being mistaken for human-composed music, addressing potential concerns about deception or misuse. It's a simple but effective way to ensure transparency about the origin of the music.
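As a toy example of the audio case, the sketch below hides a tag bit pattern in the least significant bit of 16-bit samples - a simplistic scheme chosen for clarity; production watermarks must survive compression and editing.

```python
# Toy least-significant-bit (LSB) watermark: a bit pattern marking the
# audio as AI-generated is written into the lowest bit of 16-bit samples,
# which is near-inaudible. Real schemes are far more robust than this.

MARK = [1, 0, 1, 1, 0, 1, 0, 1]   # illustrative "AI-generated" tag

def embed(samples, bits=MARK):
    out = list(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit   # overwrite the LSB with a tag bit
    return out

def extract(samples, n=len(MARK)):
    return [s & 1 for s in samples[:n]]

audio = [1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008]
marked = embed(audio)
```

Note that each sample changes by at most 1 out of a 16-bit range, which is why the mark is effectively inaudible.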

Generation explainability
The system can explain how the generations are created in a way that is understandable to its target users.
This moves beyond simply *knowing* that the music is AI-generated; it's about understanding *how* the AI created it. The level of explanation should be appropriate for the intended users. For example, a musician might want a more technical explanation (e.g., which musical patterns were used), while a casual listener might just want a general overview (e.g., "The AI created a melody based on your humming and then added chords in a similar style."). This promotes trust and allows users to better understand the AI's capabilities and limitations.

Data explainability
The system can relate each generation to the training material that contributed to its creation process (e.g. a pattern, motive, or sample).
This is a deeper level of explainability. Ideally, the system could point to specific parts of its training data that influenced the generated music. For example, it might say, "This melody is similar to a phrase found in this particular song in the training set," or "The rhythmic pattern is based on this specific drum loop." This provides valuable insight into the AI's creative process and helps to understand where its musical ideas are coming from. It also helps in identifying potential copyright concerns if the generated music is too similar to specific training examples.

Artificial awareness
Users interacting with the system during the generation process are always aware of its artificial nature.
This is a basic principle when interacting with an AI. Users should *never* be misled into thinking they are interacting with a human when they are interacting with an AI. The system should be clearly identified as an AI, both at the beginning of the interaction and throughout the process. This prevents deception and maintains user trust.

Benefits communication
The benefits of using the particular system, compared to other solutions, are communicated to users prior to its use.
Before users start using the system, they should be informed about what it can do *better* than other existing tools or methods. This helps manage expectations and ensures that users choose the right tool for their needs. For example, the system might be particularly good at generating variations on a melody, or at creating music in a specific style. This upfront communication avoids disappointment and helps users make informed choices.

Limitations communication
The technical limitations and the potential risks of the system are communicated to users prior to its use.
Just as important as highlighting the benefits is being transparent about the system's *limitations*. No AI system is perfect. This feature ensures that users are aware of what the system *cannot* do, or where it might produce unsatisfactory results. This might include limitations on the styles of music it can generate, potential biases in its output, or the risk of generating unoriginal or even offensive content. Honest communication about limitations builds trust and prevents unrealistic expectations.

Instructional material
Appropriate instructional material and disclaimers are provided to users on how to adequately use the system.
To ensure users can effectively and responsibly use the system, they need clear instructions and guidance. This feature means providing tutorials, user manuals, FAQs, or other resources that explain how to use the system's features, interpret its output, and avoid potential pitfalls. Disclaimers are also important to reiterate limitations, potential risks, and copyright considerations. This empowers users to get the most out of the system while minimising the chances of misuse or negative outcomes.
Pillar 5: Diversity, Fairness, and Non-Discrimination
This requirement emphasises the need to promote diversity, non-discrimination, and fairness in generative systems by establishing mechanisms to avoid unfair bias, designing for accessibility, and ensuring fair treatment for all users.

Music corpus statistics
The system provides a quantification of the kind of music used for training (genre, style, period, etc.), with statistics on the training corpora.
AI models are shaped by the data they're trained on. This feature requires transparency about the *composition* of the training dataset. It means providing statistics about the music used for training, breaking it down by genre, style, historical period, geographic origin, and potentially other relevant factors (e.g., instrumentation, gender of the composer). This is crucial for identifying potential biases. For example, if the training data is overwhelmingly Western classical music, the system is likely to perform poorly when generating other genres, and its output may reflect the biases inherent in that specific musical tradition. This transparency allows users and researchers to understand the potential limitations and biases of the system.
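Such statistics can be computed directly from the corpus metadata. A minimal sketch, with a made-up corpus and fields:

```python
from collections import Counter

# Sketch: summarising the composition of a training corpus so users can
# spot skew. The corpus entries and fields are made up for illustration.

corpus = [
    {"genre": "classical", "period": "romantic"},
    {"genre": "classical", "period": "baroque"},
    {"genre": "jazz", "period": "20th century"},
    {"genre": "classical", "period": "classical"},
]

genre_counts = Counter(track["genre"] for track in corpus)
genre_share = {g: n / len(corpus) for g, n in genre_counts.items()}
# Here classical dominates at 75%, a skew any documentation should surface.
```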

Accessible interfaces
The system provides generative interfaces and/or prompting modalities to make it more accessible and inclusive to users.
This is about making the AI usable by people with a wide range of abilities and disabilities. 'Accessible interfaces' might include: (1) **Multiple input methods:** Allowing users to interact with the system not just through text, but also through voice, images, musical instruments, or other modalities. (2) **Customisable output:** Providing options to adjust the tempo, volume, or visual representation of the music. (3) **Support for assistive technologies:** Ensuring compatibility with screen readers, alternative input devices, and other tools used by people with disabilities. The goal is to remove barriers and make the creative process inclusive for everyone.

Accessibility assessment
If the system provides accessible features, these have been evaluated and tested with the specific target of users they are intended for.
It's not enough to simply *claim* a system is accessible. This feature means that any accessibility features have been rigorously tested with the people they are designed to help. This involves getting feedback from users with disabilities, observing how they interact with the system, and identifying any remaining barriers or usability issues. This user-centered approach ensures that the accessibility features are actually effective and meet the needs of their intended audience.

Accessibility awareness
The system explicitly acknowledges whether its use is limited or unsuitable for certain categories of users.
Even with the best efforts, a system might not be fully accessible to everyone. This feature emphasises transparency about any known limitations. For example, the system might state that it's not suitable for users who are profoundly deaf, or that certain features require visual interaction. This honesty allows potential users to make informed decisions about whether the system is appropriate for their needs, avoiding frustration and disappointment.

Continuous assessment
The system includes music stakeholders (creative professionals, ethical experts, AI engineers and researchers) as part of a long-term strategy for the continuous assessment of its outputs, impact, and trustworthiness.
Responsible AI development is not a one-time effort; it requires ongoing monitoring and evaluation. This feature emphasises the importance of involving a diverse group of stakeholders - musicians, composers, ethicists, AI experts, and potentially even representatives from different cultural groups - in a continuous process of assessing the system. This ongoing feedback loop helps to: (1) Identify and address any emerging biases or unintended consequences. (2) Ensure the system remains aligned with ethical principles and societal values. (3) Improve the system's performance and usefulness over time. It's a commitment to long-term responsibility and continuous improvement.
Pillar 6: Societal and Environmental Wellbeing
This requirement encourages sustainability and environmental friendliness, as well as attention to the system's broader social and societal impact.

Training footprint
The system provides an indication of the resources consumed for training the model, in terms of hardware, time, and energy consumption (cost of training).
Training footprint
Training large AI models can require significant computational resources, leading to substantial energy consumption and environmental impact. This feature promotes transparency about this 'training footprint.' It means providing information about: (1) The type of hardware used (e.g., GPUs). (2) The duration of the training process. (3) The estimated energy consumption (often expressed in kilowatt-hours or equivalent CO2 emissions). This allows users and researchers to assess the environmental cost of training the model and encourages developers to find ways to minimise this footprint.
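The training footprint described above can be approximated with a simple back-of-the-envelope calculation: hardware power draw multiplied by training time and a data-centre overhead factor (PUE), then converted to CO2 via a grid carbon-intensity figure. A minimal sketch, where all the numbers are illustrative assumptions rather than measurements from any real system:

```python
def training_footprint(num_gpus, gpu_power_watts, hours,
                       pue=1.5, grid_co2_kg_per_kwh=0.4):
    """Estimate (energy_kwh, co2_kg) for a training run.

    pue: data-centre power usage effectiveness (overhead factor).
    grid_co2_kg_per_kwh: carbon intensity of the local electricity grid.
    """
    energy_kwh = num_gpus * gpu_power_watts * hours / 1000 * pue
    co2_kg = energy_kwh * grid_co2_kg_per_kwh
    return energy_kwh, co2_kg

# Hypothetical run: 8 GPUs at 300 W each, training for 240 hours.
energy, co2 = training_footprint(num_gpus=8, gpu_power_watts=300, hours=240)
print(f"{energy:.0f} kWh, {co2:.0f} kg CO2e")  # 864 kWh, 346 kg CO2e
```

Reporting even a rough estimate like this alongside a released model gives users a concrete sense of the training cost.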

Generation footprint
The system provides an indication of the environmental footprint incurred by generating a whole song, or a part of it (cost of inference).
Generation footprint
What is the environmental cost of generating a song? While training has a large upfront environmental cost, *using* the trained model (called 'inference') also consumes energy. This feature focuses on the environmental cost of *generating* music. Ideally, the system would provide an estimate of the energy used per generation, or perhaps per minute of generated music. This is often much lower than the training cost, but it can still be significant, especially if the system is used very frequently or at a large scale. This transparency encourages responsible use and motivates developers to optimise for energy efficiency during inference.
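The relationship between the one-off training cost and the recurring inference cost can be made concrete with a small amortisation sketch. All figures below (generation time, GPU power, training energy) are placeholder assumptions for illustration only:

```python
def inference_footprint(seconds_per_song, gpu_power_watts, pue=1.5):
    """Energy in watt-hours to generate one song on a single GPU."""
    return gpu_power_watts * (seconds_per_song / 3600) * pue

# Hypothetical figures: 30 s of GPU time per song at 300 W.
wh_per_song = inference_footprint(seconds_per_song=30, gpu_power_watts=300)

training_kwh = 864  # assumed one-off training cost from a prior estimate
songs_to_match_training = training_kwh * 1000 / wh_per_song
print(f"{wh_per_song:.2f} Wh per song; "
      f"~{songs_to_match_training:,.0f} generations equal the training cost")
```

Under these assumptions each generation costs a few watt-hours, but at scale (hundreds of thousands of generations) the inference footprint rivals the training footprint, which is why reporting both matters.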

Responsible data collection
If the model uses any human-made annotation (for training, or evaluation), data collection or crowdsourcing was conducted and documented ethically, fairly, and with adequate compensation for annotators.
Responsible data collection
Many AI systems rely on human annotators to label data (e.g., tagging musical features, identifying emotions in music, or evaluating the quality of generated output). This feature addresses the ethical treatment of these workers. It means that: (1) The data collection process was conducted ethically, respecting privacy and obtaining informed consent. (2) Annotators were paid fairly for their work, avoiding exploitative practices common in some crowdsourcing platforms. (3) The process was documented, ensuring transparency and accountability. This promotes fair labour practices and ethical data sourcing in AI development.

Social purpose
The system has been designed and used to bring societal benefits, e.g. supporting teaching activities, wellbeing applications, or improving accessibility of creative technologies.
Social purpose
This feature highlights the positive social impact of the AI system. It goes beyond simply being a tool for creating music; it's about using the technology to address societal needs. Examples include: (1) **Education:** Using AI to assist music education, making it more engaging or accessible. (2) **Wellbeing:** Creating music for relaxation, therapy, or other applications that promote mental or physical health. (3) **Accessibility:** Developing tools that enable people with disabilities to create and experience music. This feature emphasises the potential of AI to contribute positively to society.

IP validation
The system has mechanisms in place to detect possible cases of plagiarism or IP infringement resulting from the generations.
IP validation
This is a crucial safeguard against copyright violations. The system should not simply reproduce existing copyrighted material. This feature means implementing mechanisms to detect potential plagiarism. This might involve: (1) Comparing generated music to a database of copyrighted works. (2) Using algorithms to identify melodic or rhythmic patterns that are too similar to existing songs. (3) Providing tools for users to check the originality of the generated music. This proactive approach helps to protect the rights of artists and prevent the unintentional creation of infringing content.
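The melodic-pattern comparison mentioned in point (2) can be illustrated with a toy similarity check: compare the pitch-interval n-grams of a generated melody against a reference work and flag high overlap. This is a deliberately minimal sketch with an arbitrary threshold; real IP-validation systems are far more sophisticated:

```python
def interval_ngrams(pitches, n=4):
    """Set of n-grams over consecutive pitch intervals (transposition-invariant)."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return {tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)}

def similarity(generated, reference, n=4):
    """Jaccard overlap between the two melodies' interval n-grams."""
    g, r = interval_ngrams(generated, n), interval_ngrams(reference, n)
    return len(g & r) / len(g | r) if g | r else 0.0

# MIDI pitch sequences: the generated melody copies the reference's
# contour, transposed up a whole tone (intervals are unchanged).
reference = [60, 62, 64, 65, 67, 65, 64, 62, 60]
generated = [62, 64, 66, 67, 69, 67, 66, 64]
score = similarity(generated, reference)
if score > 0.3:  # arbitrary illustrative threshold
    print(f"Possible match (similarity {score:.2f}) - review before release")
```

Using intervals rather than absolute pitches catches transposed copies, a common evasion; production systems would also consider rhythm, harmony, and audio-level fingerprinting.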

Revenue sharing
If the system is a paid service, revenues from the generations are also shared with the artists who contributed training data.
Revenue sharing
This addresses the complex issue of fair compensation in the age of AI-generated music. If the system generates revenue (e.g., through subscriptions or licensing fees), this feature proposes that a portion of that revenue should be shared with the artists whose music was used to train the AI. This is a challenging problem, as it's difficult to determine exactly *how much* each artist contributed to the success of the system. However, exploring models for revenue sharing is a crucial step towards ensuring that AI development benefits not just the technology companies, but also the creative community whose work makes it possible. This might involve creating a fund, developing a licensing scheme, or other innovative solutions.
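One of the simplest schemes that could be explored is a pro-rata split: a fixed share of revenue is divided among contributing artists in proportion to an attribution weight, here naively taken as seconds of audio contributed to the training set. Every name and number below is hypothetical; real schemes would need far richer attribution signals:

```python
def pro_rata_shares(revenue, artist_seconds, artist_share=0.3):
    """Split `artist_share` of `revenue` among artists, proportional
    to the seconds of training audio each contributed."""
    pool = revenue * artist_share
    total = sum(artist_seconds.values())
    return {artist: pool * secs / total
            for artist, secs in artist_seconds.items()}

# Hypothetical contributions (seconds of training audio per artist).
contributions = {"artist_a": 3600, "artist_b": 1800, "artist_c": 600}
payouts = pro_rata_shares(revenue=10_000, artist_seconds=contributions)
# artist_a receives 1800, artist_b 900, artist_c 300 (currency units)
```

Duration is a crude proxy for contribution, which is exactly the open problem the text describes; the sketch only shows that once an attribution weight is agreed, the distribution itself is straightforward.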
Pillar 7: Accountability
This requirement corroborates all the others by ensuring that AI systems are accountable for their responsible design, implementation, and impact throughout their lifecycle.

Audit access
If proprietary, access to the model behind the system and training material can be granted to internal and/or external auditors (on request) and their evaluation reports can be made available.
Audit access
Even if a system is not fully open-source, accountability requires allowing independent scrutiny. This feature means that, upon request, authorised auditors (either within the developing organisation or from external bodies) can access the model's code, training data, and other relevant information. This allows them to assess the system for potential biases, ethical concerns, or compliance with regulations. Making the audit reports available (while respecting confidentiality where necessary) further enhances transparency and builds trust.

Impact assessment
Prior to the deployment of the system, potential negative impacts have been identified, assessed, minimised and openly communicated.
Impact assessment
Responsible deployment requires that potential negative impacts of the system have been identified, assessed, minimised, and communicated openly, with risk assessments conducted alongside the involved stakeholders and protection for those who raise concerns. This is particularly important when the generative system cannot accommodate one or more responsible features, and can be facilitated through 'red-teaming' (similar to a penetration test in cybersecurity) or through questionnaires.

Responsible statement
The inability of a system to provide responsible features (like those outlined before), in full or in part, is explicitly motivated and documented. Trade-offs should be carefully selected to prioritise the minimisation of risks to ethical principles.
Responsible statement
It's not always possible to achieve *all* desirable ethical features in a system. There may be technical limitations, conflicting requirements, or other constraints. This feature requires that any such compromises or 'trade-offs' are explicitly acknowledged and justified. For example, if full training data transparency is impossible due to copyright restrictions, this should be stated, along with the reasons why. The justification should demonstrate that the chosen trade-offs prioritise minimising ethical risks. This promotes honest and responsible decision-making in AI development.

Generative redress
When unjust adverse impact occurs (e.g. generations contain offensive content, or the music is plagiarised), the system explicitly accounts for redress (e.g. compensation, direct correction through feedback).
Generative redress
Despite best efforts, things can still go wrong. This feature addresses what happens *after* a negative impact has occurred. It means having mechanisms in place to provide redress - to make amends for the harm caused. This could involve: (1) **Compensation:** Providing financial compensation to someone harmed by copyright infringement. (2) **Correction:** Allowing users to directly modify or remove offensive content generated by the system. (3) **Apology:** Issuing a public apology for the harm caused. (4) **System Modification:** Updating the system to prevent similar incidents in the future. The specific form of redress will depend on the nature of the harm, but the key is having a plan in place to address negative consequences fairly and effectively.
Call for action
How to get involved
We actively seek insights and views from AI researchers, ethicists, legal experts, and music professionals to ensure continuous refinement and responsible expansion of generative AI technologies in the music industry. We warmly welcome policymakers, music industry stakeholders, and civil society organisations to join us in this crucial dialogue. Your voice is invaluable in shaping the ongoing discourse surrounding the broader implications of generative AI for all creative industries. There are different avenues to contribute to the initiative. Here are some suggestions:
Contribute to the RAIM framework & join our study
We are building a network of experts who can provide advice and support on responsible AI music. This could involve participating in online forums, or attending our events. Share your insights and experiences, and help inform the next generation of AI Music systems. We are soon launching a study to assess the importance of each feature relative to each stakeholder group, leading to the definition of the framework. To register your interest in participating, please click the button below to send us an email. We will contact you with further details once the study commences and all protocols are in place.
Register Your Interest
Reach out and collaborate with us
If you are interested in working with us, please get in touch! We are actively seeking collaborations to drive the implementation of the RAIM framework. Join us in promoting the next generation of AI music systems that prioritise ethical considerations and adhere to responsible features. Let's work together to establish benchmarks, develop databases and tools, and explore innovative solutions for RAIM.
Reach Out
Share the initiative
Help us spread the word about the RAIM Initiative. Become an advocate and raise awareness about the potential of generative AI in music and the importance of responsible innovation. Share our website and social media channels with your network, or write about us in your publications.
Together, we aim to foster a vibrant and inclusive ecosystem where generative AI truly empowers musicians and enriches the musical landscape for everyone.