Recommendations of the Commission on the Implementation of the Code of Ethics in AI on the topic: “Transparency of artificial intelligence algorithms and information systems based on them” — AI Alliance Russia


MARCH 19, 2024

The Recommendations on ensuring transparency of artificial intelligence algorithms and information systems based on them (hereinafter — Recommendations) generalize the experience and views of the members of the Commission for the Implementation of the Code of Ethics in the Field of Artificial Intelligence (hereinafter — the Commission), as well as of the experts who took part in discussing the transparency of artificial intelligence algorithms. The Recommendations are intended for a wide range of developers of artificial intelligence algorithms, as well as of information systems and digital services based on them (hereinafter — Developers, Algorithms).


With the help of the Recommendations, Developers will be able to assess the comprehensiveness and proportionality of the efforts undertaken to ensure transparency of the Algorithms, as well as to plan and prepare the next steps for disclosing information about the Algorithms.


The main task of ensuring the transparency of the Algorithms is to make the consequences of their use predictable and, as a consequence, to secure a higher level of trust in them on the part of existing and potential users, as well as of third parties whose rights may be affected by the lawful use of the Algorithms.

As a result, solving the main task of Algorithm transparency enables:
• the exercise by consumers of their right to an informed and independent choice of an Algorithm among alternatives;
• good-faith conduct by the Developer in disclosing significant facts about the Algorithm;
• the accumulation and systematization of the Algorithm's behavioral features that are significant for consumers, both before information systems and services using it are designed and when unforeseen situations arise;
• the distribution of responsibility among Developers in complex information systems and services in which various Algorithms are used alongside traditional, hard-coded programming of such systems' behavior.
This task is addressed through the sequential and systematic disclosure by Developers of significant facts about the Algorithms, which includes:
• initial disclosure of information about the Algorithm, carried out when the Developer decides to inform the public about its development at any time, whether prompted by attracting investment, grant funding, or other activities of the Developer;
• additional disclosure of information when results emerge that significantly affect the assessment of the potential consequences of using the Algorithm;
• disclosure of information to a potential client, meaning good-faith informing of decision-makers about the use or implementation of the Algorithm in various fields.


Developers may disclose significant facts about the Algorithms in any format convenient for the general public and for particular categories of consumers of such information, including formatted full-text documents, web pages on corporate websites, and specialized machine-readable formats for subsequent use in databases and personal productivity tools (including digital personal assistants).

In doing so, the Commission recommends that published documents be kept unchanged and, if modification is required, properly versioned so that access to documents containing previously published information is preserved.
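The Recommendations do not prescribe any particular versioning mechanism; purely as an illustration of the principle above, a published disclosure could be kept immutable by recording every revision alongside its content hash rather than overwriting it:

```python
import hashlib

# Sketch of an append-only revision log (not prescribed by the Recommendations):
# each published revision is stored with its content hash, so documents
# containing previously published information remain accessible unchanged.
versions = []  # list of (revision number, sha256 hex digest, text)

def publish(text, versions):
    """Record a new immutable revision instead of overwriting earlier ones."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    versions.append((len(versions) + 1, digest, text))
    return digest

publish("Initial disclosure of the Algorithm.", versions)
publish("Initial disclosure of the Algorithm. Amended section on metrics.", versions)

# Earlier revisions are still present and unmodified.
assert versions[0][2] == "Initial disclosure of the Algorithm."
```

The hash gives consumers a simple way to verify that a previously published document has not been silently altered.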

The Commission also considers it appropriate that, over time, infographics, classifiers, and other aids to comprehension be developed for the significant facts, reducing the time users spend locating information about an Algorithm's transparency.

The initial disclosure of information about the Algorithm is made voluntarily by the Developer, is addressed to the general public, is publicly available, and covers the following significant facts:
• Algorithm training goals: what goals were set for the Algorithm during its training (in plain language);
• metrics for evaluating the effectiveness of the Algorithm: which objective function, over which parameters, was optimized during machine learning (in professional terminology, including mathematical formulas);
• composition of the training data set: what data was used for training, whether standard data sets were used, and, if not, the method of data collection and other important features that, in the Developer's opinion, could affect the statistical properties of the training set relative to the general population;
• machine learning algorithms used: a professional description of the standard algorithms and their combinations, in a form that does not infringe the Developers' commercial rights.
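For the machine-readable formats mentioned above, the significant facts of an initial disclosure could be serialized, for example, as JSON. The Recommendations define no schema, so every field name and value in this sketch is an illustrative assumption:

```python
import json

# Hypothetical machine-readable initial-disclosure record. The Recommendations
# do not define a schema; all field names and values here are illustrative only.
disclosure = {
    "algorithm": "ExampleClassifier",      # placeholder name
    "developer": "Example Developer LLC",  # placeholder name
    "training_goals": "Plain-language statement of what the Algorithm was trained to do.",
    "evaluation_metrics": "Objective function and parameters optimized during training.",
    "training_data": {
        "standard_datasets_used": False,
        "collection_method": "Description of how the data was gathered.",
    },
    "learning_algorithms": "Professional description of the standard algorithms and their combinations.",
}

# Serialize for publication in a machine-readable form.
print(json.dumps(disclosure, indent=2, ensure_ascii=False))
```

A fixed, published structure of this kind is what would allow databases and digital personal assistants to ingest disclosures automatically.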


In addition, building on the aforementioned significant facts, Developers are expected to provide reasoned judgments on the following significant facts during the initial disclosure:
• known deficiencies of the training data set, including categories or combinations of object properties that are deliberately absent or likely absent owing to the data source or the specifics of collection;
• known deficiencies of the algorithms used, including deficiencies that arise from their combination;
• results obtained by the Developer on its own test data sets, including indicative information on the shares of false positives and false negatives, where applicable, as well as the characteristics of objects on which the Algorithm performs better or worse than its average on the test data set.

The Commission considers it unethical for Developers to intentionally hinder consumers' identification of deficiencies in the training data set or in the algorithms used for training.
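The shares of false positives and false negatives can be reported both overall and per subgroup of objects, which is one way to show where an Algorithm performs better or worse than its average. A minimal sketch with synthetic labels and hypothetical subgroups:

```python
# Minimal sketch of reporting false-positive / false-negative shares overall
# and per subgroup. All labels and group assignments below are synthetic.
def error_shares(y_true, y_pred):
    """Return (false-positive share, false-negative share) for binary labels."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = len(y_true)
    return fp / n, fn / n

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]  # hypothetical subgroups

# Overall shares.
print(error_shares(y_true, y_pred))  # → (0.125, 0.125)

# Per-subgroup shares reveal objects on which performance deviates from average.
for g in sorted(set(groups)):
    idx = [i for i, gi in enumerate(groups) if gi == g]
    t = [y_true[i] for i in idx]
    p = [y_pred[i] for i in idx]
    print(g, error_shares(t, p))  # a → (0.0, 0.25); b → (0.25, 0.0)
```

Here subgroup "a" suffers only false negatives and subgroup "b" only false positives, exactly the kind of asymmetry a disclosure of this significant fact would surface.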