Responsible Adoption of AI: a Cloud-Centric Approach

Written by Holistic AI | June 6th, 2022

Introduction

With the exponential growth in the use of algorithms, there is an acute need to ensure that such systems are governed appropriately. Indeed, precipitated by high-profile cases of harm, such as bias in AI-driven recruitment and criminal justice tools, there is an active legislative debate [1][13] concerning the regulation and risks of AI.

In this paper, we provide an overview of the move towards Responsible AI, with a particular focus on the adoption of cloud technology. We begin by outlining the general risks endemic to algorithmic systems, including the potential financial and reputational costs they can impose on commercial enterprises, particularly as the importance and scope of such systems increase (section 2.1). We then turn to cloud-based AI in particular, to provide a more specific analysis of the risks and benefits of the field (section 2.2). Accordingly, this paper introduces the need for Responsible AI, which we use to denote the field of development towards regulated and safe algorithmic systems. To this end, we envision a new field: algorithmic auditing. As we set out in this paper, the purpose of algorithmic auditing is to perform ex ante assessments of the levels and types of risk in particular algorithmic systems, and to recommend risk mitigation and prevention strategies (section 2.3). Following our outline of the field, we survey the key technical risks and mitigation strategies (section 3).

Our main takeaways are the following: (a) the use of algorithmic systems – particularly in the context of cloud computing – occasions financial, reputational, and ethical risks; (b) a system of algorithmic auditing can provide effective assurance of the robustness, transparency, fairness, and privacy of an algorithmic system; (c) we envision the emergence of a new industry of algorithmic auditing and assurance at the centre of an ecosystem of trust in AI.

2. Background

In this section, we provide a précis of the field of Responsible AI by mapping the risks of algorithmic systems. We begin with a general assessment of algorithmic systems, before introducing a more specific assessment of cloud-based algorithmic systems. Thereafter, this section provides an overview of algorithmic auditing – a new field of risk assessment and management for algorithmic systems.

2.1 Responsible AI

Business reliance on algorithmic systems is set to become ubiquitous. AI is estimated to contribute approximately $16 trillion to global GDP by 2030 [14]. The commercial value of algorithmic decision and evaluation systems can be summarised as follows:

  • Volume: an increase in technical knowledge of, and resources invested in, algorithmic systems will drive an exponential proliferation of algorithms, with billions in commercial application.
  • Velocity: algorithms make decisions at unobservable speeds, including decisions about financial allocation, often with no human intervention.
  • Variety: algorithms are wide-ranging in commercial application (employment, finance, resource management, etc.) and will become ubiquitous in almost every part of an enterprise.
  • Veracity: the reliability, accuracy, and compliance of algorithms are increasingly becoming key to the management of commercial enterprises.
  • Value: the proliferation of algorithmic systems will create new services, sources of revenue, new sources of profit and cost-saving, and industries [7].

Algorithms will be ubiquitous, making billions of decisions with minimal or no human intervention, including decisions with important financial, legal, and political implications [7]. Despite the transformative potential of algorithmic systems, the reach of their effects – combined with the paucity of supervision – carries with it the risk of major financial and reputational damage [15]. Volkswagen's Dieselgate scandal [16] (with fines of $34.69B) and Knight Capital's bankruptcy (with losses exceeding $400M) are two high-profile examples of the potential costs of adopting unsafe algorithmic systems [17].

In light of the various activities and high-profile cases of harm and public interest [18], a community and literature has emerged that can broadly be encompassed by the phrase 'Responsible AI' (with synonyms including 'AI Ethics', 'Trustworthy AI', 'AI Safety', etc.). Stakeholders in this debate include government, industry, and academia. Indeed, we read the space as having gone through three stages of evolution, namely: a principles phase, where the impetus was to articulate and publish statements of principles to ensure responsible use of AI [1][19][20]; a processes phase, where the impetus was to build processes whereby 'ethical by design' could be achieved (in situ) [21]; and, finally, an audit and assurance phase [22], where systems should be assessed and reported upon with respect to their performance and in accordance with developing public standards (such as legislation or authoritative policy recommendations).

In particular, the current phase (audit and assurance) is maturing insofar as frameworks of assessment and reporting are being proposed and contested. In this paper we structure our discussion around our reading of best-in-class governance approaches; however, we recognise that significant debate remains outstanding (see [23], [24], [25], [26]).

2.2 Cloud-based AI: Benefits and Risks

Cloud-based AI brings together two technologies that have witnessed widespread growth and adoption over the past decade. By way of example, total AI startup funding worldwide grew from 670 million U.S. dollars in 2011 to 36 billion U.S. dollars in 2020, with 38 billion U.S. dollars raised in the first half of 2021 alone (see [27]), whilst the infrastructure-as-a-service (IaaS) industry is predicted to exceed 623 billion U.S. dollars in value by 2025, up from around 12 billion U.S. dollars in 2010.

Whilst the benefits and risks of both technologies have separately received attention (see [7] for a survey of risks pertaining to AI, and [28] for a discussion of the benefits and risks of cloud computing), bringing the two together, by implementing machine learning operations (MLOps) on a cloud provider's IaaS offering, highlights certain aspects for particular attention. In the following, the benefits and costs of implementing AI in the cloud versus on-premises are discussed in turn. It should be noted that only those aspects particularly exacerbated by the confluence of AI and cloud are set out, and that wider reading is required (for example, on how data science, in the absence of AI, and cloud come together) for a more complete understanding of the benefits and risks of implementing AI in the cloud. This section ends with a short discussion weighing these benefits against the risks.

Benefits

We see the benefits of implementing AI in the cloud as falling into four broad categories:

  • Cost
    • Efficient use of computing capacity: on-premises data centres typically use only 12-18% [29] of their server capacity, whilst the largest cloud providers can realise far higher utilisation rates (40-70%) [29], in part due to load sharing across time zones and smart resource allocation. Such efficiency massively reduces the amount of hardware needed to support machine learning operations (see the worked example after this list).
    • Energy efficiency: training machine learning (ML) models can be especially energy-intensive. For example, it is estimated that training OpenAI's GPT-3 natural language model consumed approximately 190,000 kWh [30] of electricity. Large cloud providers optimise building design and location (for climate, water supply for cooling, and co-location with renewable energy generation) to minimise non-renewable energy demand.
  • Operations
    • Pushing ML operations to the cloud removes an outsized on-premises operations overhead, reducing machine learning package and dependency installations, hardware and software conflicts, and ML-specific vulnerability updates.
    • With AI in the cloud typically provided as ML-as-a-service, the on-premises ML engineering requirement is reduced and those resources can be re-deployed.
  • Robustness
    • Cloud-based AI utilises robust model backup protocols by design, ensuring business continuity in the event of failure and protecting against high model re-training costs.
  • Privacy
    • Moving ML operations to the cloud allows the user to benefit from best-in-class enterprise data protection and privacy infrastructure; notably, inference data, particular to AI, can contain sensitive personal data that did not form part of the input data to the AI model.
    • Machine learning implementations have an outsized number of software dependencies, which create vulnerabilities to privacy and data attacks. The AI cloud offering abstracts away the intensive software monitoring and update requirement [31] for the user, which is a key mitigation against such attacks.
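
To make the utilisation arithmetic concrete, the following sketch compares the number of servers a fixed workload would demand at the mid-points of the utilisation ranges cited above. The workload and per-server capacity figures are invented purely for illustration.

```python
import math

def servers_needed(workload_units: float, capacity_per_server: float,
                   utilisation: float) -> int:
    """Servers required to host a workload at a given average utilisation."""
    return math.ceil(workload_units / (capacity_per_server * utilisation))

# Hypothetical figures: 1,000 units of compute demand, 10 units per server.
workload, per_server = 1000.0, 10.0

on_prem = servers_needed(workload, per_server, 0.15)  # mid-point of 12-18%
cloud = servers_needed(workload, per_server, 0.55)    # mid-point of 40-70%

print(f"on-premises: {on_prem} servers, cloud: {cloud} servers")
# -> on-premises: 667 servers, cloud: 182 servers (roughly 3.7x fewer)
```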

Risks

We see the risks of implementing AI in the cloud as falling into five broad categories:

  • Efficacy
    • Cloud-based AI offerings suffer from increased latency [32] at model inference time compared to on-premises implementations. With machine learning models, and in particular large deep learning models, already exhibiting higher latency than simpler data retrieval tasks, AI use-cases can be sensitive to any further increase (see the latency sketch after this list). For example, this would be a particular issue for fast market trading operations within the financial services industry, for whom latency is a key competitive differentiator (cf. stock exchange co-location) [33].
  • Robustness
    • Cloud-based AI does not provide certainty of computing capacity. This can be particularly acute when external shocks (e.g. pandemic, geopolitical action) require multiple AI cloud users to simultaneously and reactively re-train their ML models, and can result in sizeable financial losses for those users unable to re-train in a timely manner.
    • The cloud-based AI business continuity process might fail in the event of the cloud provider entering into forced liquidation or becoming subject to lawful restriction.
  • Privacy
    • Cloud-based AI necessitates data and information transfer between the user and the cloud provider, generating a new point of data protection vulnerability, especially as compared to a fully internal on-premises implementation.
    • ML models can contain training set data, either by design (e.g. Support Vector Machines, k-Nearest Neighbours; see the demonstration after this list) or through overfitting. Moreover, ML models' outputs (predictions or inference results) can contain sensitive personal data even where the data input to the models contained none. In addition to standard security protocols around the data input to the model (both during training and inference), cloud-based AI needs to protect against these further AI-specific data risks.
    • Cloud-based AI must have query-monitoring capabilities in place to protect against model and functionality extraction, both of which might form the user’s competitive advantage.
  • Explainability
    • Users of cloud-based AI do not have direct access to the model, the data, or time-stamped snapshots of both. This can inhibit the provision of acceptable post-hoc explanations of model predictions. Moreover, it adds regulation risk when such explanations are required in response to regulator requests.
  • Regulation
    • The generation of sensitive personal data by ML models can extend regulatory scope beyond the AI cloud provider's standard regulatory overhead, generating regulation risk for the user.
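
As a toy illustration of the latency point above, the sketch below times the same inference call with and without a simulated network round trip. The model and the 60 ms delay are invented placeholders, not measured cloud figures.

```python
import time
import numpy as np

# Toy "model": a single dense layer with random weights.
rng = np.random.default_rng(0)
weights = rng.normal(size=(512, 10))
x = rng.normal(size=(1, 512))

def predict(x: np.ndarray) -> np.ndarray:
    return x @ weights

# On-premises: inference cost only.
t0 = time.perf_counter()
predict(x)
local_ms = (time.perf_counter() - t0) * 1000

# Cloud: identical inference plus a simulated request/response transit.
t0 = time.perf_counter()
time.sleep(0.060)  # invented 60 ms stand-in for network overhead
predict(x)
remote_ms = (time.perf_counter() - t0) * 1000

print(f"local: {local_ms:.3f} ms, with round trip: {remote_ms:.1f} ms")
```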
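
The claim that some models contain training data by design is straightforward to verify. As a minimal scikit-learn demonstration (the dataset choice is illustrative), a fitted support vector machine stores verbatim rows of its training set, so moving the model moves that data with it:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
model = SVC(kernel="linear").fit(X, y)

# Each support vector retained by the fitted model is an exact copy
# of a training example.
sv = model.support_vectors_
all_copied = all(any(np.array_equal(row, x) for x in X) for row in sv)
print(f"{len(sv)} support vectors, all verbatim training rows: {all_copied}")
```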

Commentary

Although the risks above appear to outweigh the benefits, it should be noted that a number of the risks are anticipated to dissolve as cloud-based AI matures. In particular, the concerns around privacy, explainability, and regulation should be well addressed in the coming years as the risks specific to AI come into focus. The efficacy and robustness risks are inherent to the design and less straightforward to mitigate; such limitations will inevitably render certain AI use-cases non-viable on a cloud platform. Conversely, the benefits of implementing AI in the cloud are already well understood, and the reduction in cost and operational overhead, combined with outsourcing much of the AI risk mitigation to at-scale providers, outweighs the associated concerns for the majority of use-cases.

2.3 AI Auditing and Assurance

Towards the goal of achieving Responsible AI, we envision a new field: algorithmic auditing and assurance. The development of this field will operationalise and professionalise current theoretical research in Responsible AI, AI Ethics, and Data Ethics [7]. The purpose of AI auditing and assurance is to provide standards, practical codes, and regulations to assure users of the safety and legality of their algorithmic systems.

Algorithmic auditing is composed of four stages of activity:

  1. Development: an audit will have to account for the process of development and documentation of an algorithmic system.
  2. Assessment: an audit will have to evaluate an algorithmic system’s behaviours and capacities.
  3. Mitigation: an audit will have to recommend service and improvement processes for addressing particular high-risk features of algorithmic systems.
  4. Assurance: an audit will be aimed at providing a formal declaration that an algorithmic system conforms to a defined set of standards, codes of practice, or regulations.

The purpose of this process is to produce an ecosystem of Trustworthy and Responsible AI, in which algorithmic systems have been properly appraised (as per stages 1 and 2), all plausible measures for reducing or eliminating risk have been undertaken (as per stage 3), and users, providers, and third parties (including governments) have been assured of the systems' safety (as per stage 4).

In section 3, we survey the risks and mitigation strategies that will provide the content of the above stages of activity to constitute an algorithmic audit.

3. Key Technical Risks and Mitigation

Regardless of the algorithm, there are, broadly speaking, five stages of Model Development (illustrated in the sketch after this list):

  1. Data and Task Setup: collecting, storing, extracting, normalising, transforming, and loading data. Ensuring that the data pipelines are well-structured, and the task (regression, classification, etc.) has been well-specified and designed. Ensuring that data and software artefacts are well documented and preserved.
  2. Feature Pre-Processing: selecting, enriching, transforming, and engineering a feature space.
  3. Model Selection: running model cross-validation, optimization, and comparison.
  4. Post-Processing and Reporting: adding thresholds, auxiliary tools and feedback mechanisms to improve interpretability, presenting the results to key stakeholders, evaluating the impact of the algorithmic system on the business.
  5. Productionising and Deploying: passing through several review processes, from IT to Business, and putting in place monitoring and delivery interfaces. Maintaining an appropriate record of in-field results and feedback.
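
For concreteness, the sketch below maps the first three stages onto common scikit-learn primitives; the dataset, estimator, and hyperparameter grid are illustrative assumptions rather than recommendations.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stage 1: data and task setup (a binary classification task, fixed split).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Stage 2: feature pre-processing, kept inside the pipeline so the same
# transformation is applied identically at training and inference time.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Stage 3: model selection via cross-validated hyperparameter search.
search = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Stages 4 and 5 (reporting, productionising) would consume these artefacts.
print(f"best C: {search.best_params_['clf__C']}, "
      f"cross-validated accuracy: {search.best_score_:.3f}")
```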

Although these stages appear static and self-contained, in practice they interact in a dynamic fashion, following not a linear progression but a series of loops, particularly between pre- and post-processing. Below, we also list the four key risk levers with which each stage interacts:

  • Privacy: quality of a system to mitigate personal or critical data leakage.
  • Fairness: quality of a system to avoid unfair treatment of individuals or organisations.
  • Explainability: quality of a system to provide decisions or suggestions that can be understood by their users and developers.
  • Robustness: quality of a system to be safe and not vulnerable to tampering.

In a similar fashion to the stages, each lever appears to be self-contained, but the levers, too, are interrelated. Though research on each vertical is mostly conducted in silos, there is a growing recognition across the scientific and industrial communities of the trade-offs and interactions between them. Accuracy, a component of robustness, may need to be traded off to lower a given outcome metric of bias; making a model more explainable may affect the system's performance and privacy; improving privacy affects the ability to assess the impact of algorithmic systems. Optimisation of these features and trade-offs will depend on multiple factors, notably the use-case domain, the regulatory jurisdiction, and the risk appetite and values of the organisation implementing the algorithm.
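
As a concrete instance of one such lever being measurable, the sketch below computes a simple outcome metric of bias (the demographic parity gap) on invented predictions; interventions that shrink this gap, such as re-thresholding, are exactly the kind that can trade away accuracy.

```python
import numpy as np

# Invented predictions and a binary protected attribute, purely to
# illustrate the metric; no real system or dataset is implied.
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0, 0, 1])
group  = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# Demographic parity gap: difference in positive-outcome rates by group.
rate_a = y_pred[group == 0].mean()
rate_b = y_pred[group == 1].mean()
print(f"selection rates: {rate_a:.2f} vs {rate_b:.2f}, "
      f"gap: {abs(rate_a - rate_b):.2f}")
# -> selection rates: 0.60 vs 0.40, gap: 0.20
```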

Conclusion

The purpose of this paper was to map out the adoption of Responsible AI in cloud-based technology. We began, in section 2.1, by surveying the reputational, financial, and ethical risks inherent in the transition to algorithmic systems that have prompted a need for Responsible AI. In section 2.2, we surveyed more specifically the risks in the adoption of cloud-based algorithmic systems. To resolve concerns about both sets of risks, we propose the adoption of algorithmic auditing and assurance to achieve Responsible AI, including Responsible Cloud-based AI. The purpose of AI auditing is to assess algorithmic systems according to their key technical risks: bias and discrimination, performance and robustness, interpretability and explainability, and privacy.
