
How to Manage the Risk of AI Bias in Identity Verification

Published on Oct 20, 2022

This article was published in partnership with Braithwate

The spread of remote identity verification

We all interact with companies providing us financial products and services. Most of these interactions appear seamless. But all financial transactions, whether with traditional banks or fintech companies, are predicated on identity verification (IDV).

Regulations require financial services providers to know the identity of an individual before they are onboarded as a customer (know-your-customer, or KYC, rules). This helps to prevent money laundering and other illegal activity.

Previously, customers visited bank branches for identity verification and document validation.

The proliferation of online banking and the rise of fintechs with no physical branches have created the need for remote IDV.

IDV technology allows the individual to prove their identity by submitting images of their face and identity documents. It replaces the role of the bank teller, who previously would have checked the document against the individual who turned up at the counter, by seeking to determine the likelihood that:

  • the identity document provided is legitimate and not tampered with; and
  • the facial image embedded in the identity document matches the individual’s face.

After this is completed, the IDV provider generates a report documenting the likelihood that the identity document belongs to the individual, alongside any red flags that may have been triggered by the algorithm.

The bank must then decide how to proceed, based upon the IDV report and its own internal thresholds.
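
As a rough illustration, the sketch below shows how such a decision step might be wired up in Python. The report fields, threshold values and outcomes are hypothetical assumptions made for illustration, not any particular IDV provider's schema or any bank's actual policy.

```python
from dataclasses import dataclass, field


@dataclass
class IDVReport:
    """Hypothetical IDV report: fields are illustrative, not a real schema."""
    document_authenticity: float   # likelihood the document is legitimate (0-1)
    face_match_score: float        # likelihood the selfie matches the document photo (0-1)
    red_flags: list[str] = field(default_factory=list)


def decide(report: IDVReport,
           doc_threshold: float = 0.90,
           face_threshold: float = 0.85) -> str:
    """Apply a bank's internal thresholds to the IDV report."""
    if report.red_flags:
        return "manual_review"
    if (report.document_authenticity >= doc_threshold
            and report.face_match_score >= face_threshold):
        return "accept"
    return "reject"


print(decide(IDVReport(document_authenticity=0.97, face_match_score=0.91)))  # accept
```

Where exactly each threshold sits is a business decision: stricter thresholds reduce fraud but reject more legitimate customers, and vice versa.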

The ethical implications are complex.

Whilst such technology increases efficiency, it also poses new risks. If online IDV is the only route to a product and the algorithm fails to correctly match the individual, that individual has no other channel through which to access the service with that provider. This creates barriers to participation in banking and to time-critical products such as access to credit.

How does IDV work from a Machine Learning (ML) perspective?

The ML models that enable IDV perform two fundamental jobs:

  • they extract relevant data from the identity document, such as date of birth, first and last name, and document number, and validate that the document is original; and
  • they perform a facial verification check between the photo embedded in the identity document and the selfie taken within the IDV app.

To conduct these tasks, the ML model is trained to assign a numerical feature vector to each facial image presented: images belonging to the same individual will generate vectors with a high similarity score, whilst those representing different people will generate vectors with a low similarity score. The similarity score is judged against an agreed threshold: any pair of images scoring at or above the threshold is deemed to belong to the same person, whilst any pair scoring below it fails the similarity test.
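
A minimal sketch of this similarity test, assuming cosine similarity between the feature vectors and an illustrative threshold of 0.6 (in practice the threshold is derived from an agreed trade-off between false acceptances and false rejections):

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two face embeddings, in the range [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def same_person(doc_embedding: np.ndarray,
                selfie_embedding: np.ndarray,
                threshold: float = 0.6) -> bool:
    """Pass the similarity test only at or above the agreed threshold."""
    return cosine_similarity(doc_embedding, selfie_embedding) >= threshold


# Toy vectors standing in for embeddings produced by a face-recognition model.
doc_vec = np.array([0.10, 0.80, 0.30])
selfie_vec = np.array([0.12, 0.75, 0.35])
print(same_person(doc_vec, selfie_vec))  # True for these similar vectors
```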

Whilst it is reported that, thanks to advances in deep learning and computing power, such ML models on average perform these checks better than humans, they are still prone to inaccuracies. This is because the performance of ML models is contingent on the quality of their training data.

Algorithmic bias in IDV

Companies relying on ML tools to power their IDV must consider:

  • Are the training data sets sufficiently large, complete and relevant?
  • Are the training data sets sufficiently diverse?
  • Is the system sufficiently robust?

The quality of the datasets will be driven by both intrinsic and extrinsic factors. Intrinsic factors include diversity of gender, skin tone, age and facial geometry, whereas extrinsic factors include the background environment, image quality, facial expression and facial decoration.

A training dataset that is insufficiently representative can lead to poor model performance on under-represented populations, even if global metrics suggest strong performance.

If the dataset is deficient in key intrinsic factors such as skin tone and gender (and particularly an intersection thereof), the model will perform differently for individuals characterised by those factors, leading to algorithmic bias.
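
One practical check is to measure how well each intersectional group is represented in the training data. The sketch below runs such a check with pandas over a hypothetical training-data manifest; the column names, group labels and representation floor are illustrative assumptions.

```python
import pandas as pd

# Hypothetical manifest describing the images in a training set.
manifest = pd.DataFrame({
    "skin_tone": ["light", "light", "dark", "light", "dark", "light"],
    "gender":    ["male", "female", "female", "male", "male", "female"],
})

# Share of each intersectional group (skin tone x gender) in the training data.
shares = (
    manifest.groupby(["skin_tone", "gender"])
    .size()
    .div(len(manifest))
    .rename("share")
)
print(shares)

# Flag intersections that fall below a chosen representation floor.
floor = 0.10
print(shares[shares < floor])
```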

Algorithmic bias results in two outcomes, each detrimental to both the individual and the business:

  • A high false rejection rate: two images of the same person are given a low similarity score and the individual is rejected. These individuals face unfair treatment and are denied access to the service, whilst companies lose potential new customers.
  • A high false acceptance rate: two images belonging to different people are given a high similarity score and the individual is accepted. Individuals are exposed to a greater risk of their identities being stolen and used for fraudulent purposes, whilst companies bear the liability for failing to prevent fraud.
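
Both error rates can, and should, be measured separately for each demographic group rather than only in aggregate. The sketch below computes per-group false rejection and false acceptance rates over a hypothetical evaluation set; the column names and group labels are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical evaluation results: one row per verification attempt.
results = pd.DataFrame({
    "group":       ["A", "A", "A", "B", "B", "B"],
    "same_person": [True, True, False, True, True, False],   # ground truth
    "accepted":    [True, False, False, True, True, True],   # model decision
})


def error_rates(df: pd.DataFrame) -> pd.Series:
    genuine = df[df["same_person"]]       # pairs that truly match
    impostor = df[~df["same_person"]]     # pairs that truly do not match
    return pd.Series({
        "false_rejection_rate": (~genuine["accepted"]).mean(),  # genuine pair rejected
        "false_acceptance_rate": impostor["accepted"].mean(),   # impostor pair accepted
    })


# Disaggregating by group can expose differential performance that
# aggregate metrics hide.
print(results.groupby("group").apply(error_rates))
```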

How do we manage AI bias risk in IDV?

The only effective solution is to implement robust AI Risk Management systems and processes.

AI Risk Management is the process of identifying, verifying, mitigating and preventing AI risks. Concrete steps must be taken at each stage of the AI lifecycle, to reduce the likelihood of bias.

Risk management approaches must be adapted to reflect the novel risks AI poses. For example, as AI systems continuously learn and evolve, and performance tends to decay over time, they must be carefully monitored on an ongoing basis. This requires an automated and scalable solution.
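
As a simple illustration of such monitoring, the sketch below tracks a weekly false rejection rate against a validation baseline and flags weeks in which performance has drifted beyond a chosen tolerance. The figures, baseline window and tolerance are illustrative assumptions, not recommended values.

```python
import pandas as pd

# Hypothetical monitoring log: weekly false rejection rates for a deployed model.
weekly_frr = pd.Series(
    [0.012, 0.013, 0.011, 0.014, 0.021, 0.025],
    index=pd.period_range("2022-09", periods=6, freq="W"),
)

baseline = weekly_frr.iloc[:4].mean()   # performance when the model was validated
tolerance = 1.5                         # alert if FRR drifts 50% above baseline

alerts = weekly_frr[weekly_frr > baseline * tolerance]
print(alerts)  # weeks where decay should trigger investigation or re-validation
```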

Managing AI risks also requires technical assessment of the AI system’s code and data. Best practice entails the independent auditing, testing and review of AI tools against bias metrics and other industry standards.
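
One common way to express such a bias metric is a disparity ratio comparing each group's error rate with that of the best-performing group. The sketch below is illustrative only; the group labels, rates and tolerance are assumptions rather than a prescribed standard.

```python
import pandas as pd

# Per-group false rejection rates, e.g. produced by an evaluation like the one
# above (values are illustrative).
frr = pd.Series({"group_A": 0.010, "group_B": 0.034, "group_C": 0.012})

# Each group's FRR relative to the best-performing group.
disparity_ratio = frr / frr.min()
print(disparity_ratio)

# Flag groups whose error rate exceeds a chosen tolerance, e.g. 2x the best group.
print(disparity_ratio[disparity_ratio > 2.0])
```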

We have the technical expertise to assess the quality and performance of ML models, and the representativeness of their training datasets, to support IDV providers in mitigating bias risks. We also support businesses in designing and establishing policies and processes to effectively govern the use of AI, such as training, governance and accountability, and other operational controls.

To learn more about how you can identify and mitigate AI bias issues, contact us to request a demo!

DISCLAIMER: This blog article is for informational purposes only. This blog article is not intended to, and does not, provide legal advice or a legal opinion. It is not a do-it-yourself guide to resolving legal issues or handling litigation. This blog article is not a substitute for experienced legal counsel and does not provide legal advice regarding any situation or employer.
