Diversity in thinking about ethics in AI: Brief notes on some recent research findings

These are a few notes on the issue of diversity and how addressing it might or might not help improve ethical decision making and the creation and implementation of codes of conduct. There will be a fuller discussion of this in the forthcoming book associated with this project, Towards a Code of Ethics for Artificial Intelligence Research, to be published later this year by Springer.

Note that in the following discussion, some of the focus is on findings related to gender. This is partly because research demonstrates their significance, partly because there is a dearth of research on other aspects of diversity, and partly because some of the research on group diversity in fact examines proxies for gender such as dominance traits. However, the question of bias and discrimination of course also affects other categories.

The scale and difficulty of the problem: There’s known social influence bias in how we think. But we don’t know all that much about this, and we do know that the use of online resources and social media is likely to be a significant factor. In other words, there are changing and possibly significant influencers on our judgements and decision making, some of which have the potential to herd us into like-minded cabals of opinion (Muchnik, Aral, and Taylor 2013; Pariser 2011). It would be arrogant to assume that academics and the other opinion leaders working on the ethics of AI are immune to this.

There is also considerable work on social network theory which likewise demonstrates such effects, some of them large and others significant even if not large (Christakis and Fowler 2010; Jackson 2010).

There are reasons of fairness for including the voices of diverse groups in discussions of any topics and ethics is no exception. These groups are sometimes identified by legally protected characteristics, which may vary to an extent in different jurisdictions, and which may not necessarily cover all categories of people facing various forms of discrimination, or whose voices are harder to hear. Therefore, there are reasons which surpass purely legal reasons for including these voices, should one be concerned with fairness.

A considerable body of work in public engagement with policy making and with research and development deals with these issues of wider engagement and inclusion (O’Doherty and Einsiedel 2013). The rationale for this inclusion tends to focus on issues of justice in getting voices heard, and in forming policy and developing research and innovation which caters effectively to people’s needs.

Further reasons for inclusion: There could, however, be additional reasons for inclusion. It is helpful to distinguish between factors associated with bringing knowledge and understanding of particular experiences to a group, and contributions to a group’s thought and decision processes.

Work in social epistemology argues for the importance of considering the social structures within which knowers gain and share knowledge (Goldman and Blanchard 2015). Work in ethics, such as work on standpoint epistemologies, argues for the inclusion of diverse voices in debates, on the claim that those with certain experiences or identities have privileged, sometimes exclusive, access to certain insights and understandings relevant to ethical inquiry (Campbell 2015). This may or may not go along with relativist viewpoints; it should be noted that the need to check all the facts, which might involve something as non-controversial as asking someone else for information which one does not have, is essential to ethical judgement. Furthermore, the awareness that the viewpoints and critiques of others are needed to orient one’s moral views is consistent with an account of the objective nature of moral reality, and indeed seeking alternative viewpoints may be a good strategy for reaching an objective account (McNaughton 1988).

Collective intelligence: Research findings suggest that the problem solving skills of a diverse group outperform those of a group composed of the most able individuals (Hong and Page 2004). Recent work shows that collective intelligence in group decision making is a factor independent of individual intelligence, and that it is correlated with the average social sensitivity of group members, the equality in distribution of conversational turn-taking, and the proportion of females in the group (Woolley et al. 2010). This gives particular reason, in general, to pay attention to the gender balance of groups, although the research so far is suggestive of the importance of social skills, rather than necessarily gender per se. The notion of collective intelligence is of relevance to, e.g., concerns about the make-up of ethics committees, such as the finding that the average research ethics committee member is a middle aged female without degree level education. This may indeed matter less than it appears, and may even be an advantage.

This work in collective intelligence also finds some confirmation in recent work examining the notion of metacognition (Frith 2012). This is a process by which we monitor our own thought processes, taking into account the knowledge and intentions of others. This enhances and enables joint action by allowing us to reflect on and justify our thoughts to others. Individuals have limited ability to do this solo, but working in groups can enhance this capacity. Whilst metacognition would have widespread application, it is in ethics and in the justification of actions and decisions to others that it is of particular relevance. This recent work in psychology indeed chimes extremely well with long traditions in ethics, in both philosophy and theology, from diverse thinkers, about the importance of our ability to understand our own motivations and thoughts, yet the difficulty of doing this alone (Butler 1827; Boddington 1998). This strongly suggests that working in groups will be particularly important for work in ethics, over and above the need to include a diverse range of experience. (It also has possible implications for attempts to build ethics into AI, but that is a separate topic.) It is important to note that the value of gaining feedback from others is not inconsistent with views in ethics on the importance of moral autonomy and integrity of one’s own moral judgement and decision making, and indeed can enhance these values. Even such a foremost proponent of individual moral autonomy as Immanuel Kant recognised both that moral actions must be based on the right motivation and how difficult it is for us to know our own real motivations (Kant 1972). The insights of others can surely help here.

Inclusion, group performance, and hierarchy: However, there is also work which suggests that for maximal performance, it is not enough simply to create diverse groups, or mixed gender groups. Humans are not just social animals, they are also hierarchical animals, but working out which forms of hierarchy and which methods of collaboration produce the best results in particular situations is complex. Work which shows the effectiveness of groups with mixed social dominance (which relates to but is not simply a matter of gender composition) also shows the complexity of these effects, and suggests that more work is needed (Ronay et al. 2012). Other work shows that testosterone disrupts collaboration, with a tendency to overweight one’s own judgements compared to those of others (Wright et al. 2012). Since testosterone levels vary not just between genders but within genders, again, simple recipes for good group construction cannot necessarily be drawn from these hints.

Likewise, other work which echoes the work on metacognition in postulating group level traits and the importance of collaboration for humans, points out that in collaborating, it is important not just that you collaborate, but with whom within a group you collaborate (Smaldino 2014). These findings cohere with longer established findings within social science and communication studies showing that in communicating significant information and in making ethical decisions, people make precise judgements about whom to communicate and collaborate with, and who is the most appropriate person in a group to act (Arribas-Ayllon, Featherstone, and Atkinson 2011; Boddington and Gregory 2008).

Work on the gender mix of groups: Reviewing the current literature shows the value of the presence of females for improving team collaboration, but there are mixed findings for the value of the presence of females for team performance, with differences attributable to context (Bear and Woolley 2011). The suggestion is that in areas of work where there is already a reasonable gender balance, the presence of women enhances team performance, but where there is an imbalance with a preponderance of men, sub-groups with a fairer gender balance suffer from the negative stereotypes of women in that field and do not perform so well. This could be problematic in some technical areas, where gender parity is either far off, or may never be naturally achieved, since it is entirely possible that in some areas of human activity, even absent any barriers to participation, one gender may have a greater interest in participating than the other.

The double whammy of AI and philosophical ethics: At the moment, AI, computer science, and also, relevantly to our considerations, philosophy, are very male dominated areas (Beebee and Saul 2011; Aspray 2016). This is so even in the sub-speciality of philosophy, ethics. This places us in a possible conundrum. The presence of women in a group may enhance group collaboration, but not necessarily enhance group performance, where negative stereotypes of women persist. Hence, where ethics exists as an activity within a male dominated area where there are negative stereotypes of women, inclusion of women within the ethics endeavour might, it could be speculated, act to produce negative stereotypes of the enterprise of ethics, especially if the presence of women merely enhances group collaboration, but not group performance. My personal and admittedly unscientific hunch is that, within philosophy, ‘applied’ subjects like ethics are indeed still often seen as inferior to the hard-boiled theoretical subjects; but that, nonetheless (and varying greatly with local context and local culture), ethics is still often male dominated. There are many different elements to how human enterprises are slotted into our complex, hierarchical ways of operating in the world.

There are some specific difficulties with forming effective groups. Recent work shows that narcissistic individuals are perceived as effective leaders, yet this may diverge considerably from reality, where such narcissistic individuals inhibit information exchange between group members and so inhibit group performance (Nevicka et al. 2011). Note that this research does however yet again show the importance of group dynamics in effective group outcomes. The inhibition of sharing information would be especially relevant in the case of ethics, and the importance in ethics of effective self-reflection likewise suggests that the trait of narcissism in group leaders would be especially unwelcome. Given however the preference for picking narcissistic leaders, this is food for thought in how groups are made up. It seems that perhaps Plato had a good hunch when he argued in the Republic that only those who do not wish to lead should be allowed to do so.

A tentative conclusion from this research is that if you are interested solely in the representation of women or other particular groups for issues of justice pertaining to the participation of those groups, priorities and strategies may not be precisely the same as if you are interested additionally in how diversity improves outcomes. For instance, work which shows that group communication and interaction are vital to success would indicate that some moves to include particular groups (for the sake of argument here, women) might have little or no impact on improving outcomes or collaboration, if the particular ways in which women are included do not afford the opportunity for communication, empathy, feedback on ideas, and other such effective procedures as the research indicates merit attention. For example, inclusion of women in panels where the opportunities for interaction are very formal may do little to enhance collaboration or output. More worryingly, research indicating the possibility of negative reactions to the inclusion of women in certain contexts suggests that the visible inclusion of women in an area of work might lead to that area being stigmatised. It might be wise to take steps to consider and where possible ameliorate such possibilities. Perhaps steps could be taken in various ways to signal the high regard in which work on ethics in AI is held. (But then, I would say that, wouldn’t I!)

There are particular reasons to be especially concerned with these issues in relation to AI. The very question of bias in algorithms is a major ethical challenge in AI; potential problems of bias concern the application of algorithms in certain contexts, the assumptions that drive the creation of algorithms, and the data sets used to create algorithms and to drive machine learning (O’Neil 2016; Nature Editorial 2016). The problem is quite acute. Figures show, for instance, that the numbers of women in computer science are actually in significant decline, and that this is especially the case in AI, a problem that Margaret Mitchell, a researcher at Microsoft, calls the ‘sea of dudes’ problem (Clark 2016). Research even finds that many job ads in tech tend to contain words which are rated ‘masculine’ and hence tend to put women off applying (Snyder 2016). In Snyder’s study, a word was rated as gendered ‘if it statistically changes the proportion of men and women who respond to a job post containing it’; in other words, by an operational definition which used large data sets, rather than one making presuppositions about language and gender. Note then that this is making use of technology to spot the human problems in technology. Job ads for Machine Intelligence were found to be the most masculine biased of them all. The optimistic side of this is that having discovered these issues, steps can be taken to ameliorate them.

Within certain groups and certain professions, viewpoints may be less diverse than in the population as a whole. For instance, within universities, more academics lean towards the left of the political spectrum than to the right. There is currently a particular concern expressed about the dampening down of free speech both within universities and in the media, including social media. Whatever the exact nuances of this situation, in any endeavour towards developing codes of ethics in AI, it would be beneficial to watch out for and address any such biases or gaps in thinking. One arena which is attempting to encourage diversity of viewpoints in debate is the Heterodox Academy.

We wish to thank the Future of Life Institute for their generous sponsorship of our programme of research.


Arribas-Ayllon, Michael, Katie Featherstone, and Paul Atkinson. 2011. ‘The practical ethics of genetic responsibility: Non-disclosure and the autonomy of affect’, Social Theory & Health, 9: 3-23.

Aspray, William. 2016. Women and Underrepresented Minorities in Computing: A Historical and Social Study (Springer: Heidelberg).

Bear, Julia B., and Anita Williams Woolley. 2011. ‘The role of gender in team collaboration and performance’, Interdisciplinary Science Reviews, 36: 146-53.

Beebee, Helen, and Jenny Saul. 2011. “Women in Philosophy in the UK: A Report by the British Philosophical Association and the Society for Women in Philosophy in the UK.” London: British Philosophical Association, Society for Women in Philosophy UK.

Boddington, P. 1998. “Self-Deception.” In Encyclopedia of Applied Ethics, edited by Ruth Chadwick, 39-51. San Diego: Academic Press, Inc.

Boddington, Paula, and Maggie Gregory. 2008. ‘Communicating genetic information in the family: enriching the debate through the notion of integrity’, Medicine, Health Care and Philosophy, 11: 445-54.

Butler, Joseph. 1827. Fifteen Sermons Preached at the Rolls Chapel (Hilliard, Grey, Little and Wilkins: Boston).

Campbell, Richmond. 2015. “Moral Epistemology.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta. Stanford: Stanford University.

Christakis, Nicholas, and James Fowler. 2010. Connected: The Amazing Power of Social Networks and How They Shape Our Lives (Harper Press: London).

Clark, Jack. 2016. ‘Artificial Intelligence Has a ‘Sea of Dudes’ Problem’, Bloomberg Technology.

Editorial. 2016. ‘Algorithm and blues’, Nature: 449.

Frith, Chris D. 2012. ‘The role of metacognition in human social interactions’, Phil. Trans. R. Soc. B, 367: 2213-23.

Goldman, Alvin, and Thomas Blanchard. 2015. “Social Epistemology.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta.

Hong, Lu, and Scott E Page. 2004. ‘Groups of diverse problem solvers can outperform groups of high-ability problem solvers’, Proceedings of the National Academy of Sciences of the United States of America, 101: 16385-89.

Jackson, Matthew O. 2010. Social and Economic Networks (Princeton University Press: Princeton).

Kant, Immanuel. 1972. The Moral Law, translation of Groundwork for the Metaphysics of Morals (Hutchinson).

McNaughton, David. 1988. Moral Vision (Blackwell: Oxford).

Muchnik, Lev, Sinan Aral, and Sean J Taylor. 2013. ‘Social influence bias: A randomized experiment’, Science, 341: 647-51.

Nevicka, Barbora, Femke S Ten Velden, Annebel HB De Hoogh, and Annelies EM Van Vianen. 2011. ‘Reality at odds with perceptions: Narcissistic leaders and group performance’, Psychological Science: 0956797611417259.

O’Doherty, Kieran, and Edna Einsiedel. 2013. Public Engagement and Emerging Technologies (UBC Press: Vancouver).

O’Neil, Cathy. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (Allen Lane).

Pariser, Eli. 2011. The Filter Bubble (Viking Penguin: London).

Ronay, Richard, Katharine Greenaway, Eric M Anicich, and Adam D Galinsky. 2012. ‘The path to glory is paved with hierarchy: When hierarchical differentiation increases group effectiveness’, Psychological Science, 23: 669-77.

Smaldino, Paul E. 2014. ‘Group-level traits emerge’, Behavioral and Brain Sciences, 37: 281-95.

Snyder, Kieran. 2016. ‘Language in your job post predicts the gender of your hire’. https://textio.ai/gendered-language-in-your-job-post-predicts-the-gender-of-the-person-youll-hire-cd150452407d#.gz88w5ovr.

Woolley, Anita Williams, Christopher F Chabris, Alex Pentland, Nada Hashmi, and Thomas W Malone. 2010. ‘Evidence for a collective intelligence factor in the performance of human groups’, Science, 330: 686-88.

Wright, Nicholas D, Bahador Bahrami, Emily Johnson, Gina Di Malta, Geraint Rees, Christopher D Frith, and Raymond J Dolan. 2012. ‘Testosterone disrupts human collaboration by increasing egocentric choices’, Proceedings of the Royal Society of London B: Biological Sciences: rspb20112523.

Paula Boddington
