The Ethics Revolution: How Can We Move Beyond Profit-Driven Data Practices?

Much gets lost in the sea of chatter about ethics for data and AI.

For instance, we aren’t talking enough about the unseen, underpaid laborers behind the data that fuels Artificial Intelligence, nor about how data is extracted from everyone, with the risks falling on those who tend to be marginalized to begin with. These were some of the topics discussed on a recent conference panel that I was invited to speak at.

The panel, “Global Voices in Data: Navigating Responsible Management and Processing with Diverse Perspectives,” took place at the ACM DEI SIGMOD/PODS conference in Chile. The panel aimed to facilitate a conversation among a range of voices from both the Global North and South, to pinpoint essential elements that would advance ethical and reliable database research. This meant taking into account the distinct requirements, expectations, and viewpoints of communities across various regions.

The annual ACM SIGMOD/PODS Conference is a prominent global gathering for database researchers, practitioners, developers, and users to engage with the latest ideas and findings, and to share strategies, tools, and experiences. The conference’s audience includes individuals involved in research, industry, and development within the field of database technology, as well as representatives from leading companies. The event features a diverse technical program of research and industrial talks, tutorials, demos, workshops, a poster session, an industrial exhibition, and a careers-in-industry panel.

The discussion left me with a critical question: can we achieve an ethics revolution in data practices within a system driven by profit maximization?

The panel discussions were rich and diverse. Ricardo Baeza-Yates (Director of Research at the Institute for Experiential AI of Northeastern University and professor in the Department of Computer Science at the University of Chile) emphasized the importance of decolonial discourse and a multidisciplinary approach. Jocelyn Dunstan (a professor at the Catholic University of Chile with a joint appointment between the Department of Computer Science and the Institute for Mathematical and Computational Engineering) highlighted the need for personal awareness and systemic change from within. We all agreed on the importance of Indigenous Data Sovereignty, ensuring communities control their own data.

The participation of diverse voices, specifically from the Global South, brought value to the conversation surrounding responsible data analytics and AI. While discussions on responsible AI primarily come from the Global North and focus on technological perspectives, voices from the Global South are engaging in decolonial discourse that questions accountability and responsibility surrounding data exploitation. These discussions highlight the unequal distribution of benefits from data processing and the global colonization of data, and include key concepts such as Indigenous Data Sovereignty, “counterdata”, and missing data.

Several of the points discussed exemplify the importance of participation from the Global South: attention to colonial influence on data, data labeling, exploitation through data, and data work. Colonial labels on data, for instance, can be quite harmful to the people the data represents. The principles of Indigenous Data Sovereignty can serve as a framework to guide data practices toward being responsible and ethical.

My Presentation: Decolonizing Data for an Ethical Future

During my presentation, I explored the concept of Indigenous Data Sovereignty and its connection to responsible data practices. Traditional data practices often reflect colonial patterns of control and exploitation. To achieve truly ethical AI, we need to decolonize data by centering Indigenous knowledge and self-determination. The CARE Principles (Collective Benefit, Authority to Control, Responsibility, and Ethics) provide a framework for ethical data governance, particularly in relation to Indigenous communities. These principles ensure data projects benefit the community, respect its control over its knowledge, and are conducted ethically. They address a gap left by the FAIR Principles (Findable, Accessible, Interoperable, Reusable), which lack a focus on Indigenous rights.

The Responsibility Gap: Exploitation in Data Labeling

Transparency and accountability are crucial for responsible data practices. Those handling data, especially Indigenous data, must demonstrate how their work upholds self-determination and collective benefit for the communities involved. The underpaid data workers labeling sensitive content exemplify the dangers of neglecting responsibility in data practices.

The Path Forward: A Call to Collective Action

Discussions around codifying and operationalizing AI ethics are a critical step toward responsible data management. By translating ethical principles into actionable practices, we can create a ripple effect that fosters a new, more ethical way of thinking about data and AI. This includes embracing diverse cultural perspectives and dismantling systems of bias, discrimination, and exploitation.

But the question remains: How do we move beyond a system focused solely on profit?

Currently, the focus on profit maximization often leads to harmful practices, such as:

  • Paying data workers a pittance to label massive amounts of sensitive content, causing them psychological harm.

  • Exploiting communities through data collection without their knowledge or consent.

  • Perpetuating algorithmic bias through data that reflects historical inequalities.

The Need for a New Goal: Do No Harm

While AI has the potential to bring about positive change, there is also a possibility that it could negatively impact certain groups. It’s crucial to recognize that AI is just one tool among many, and a key initial step is determining whether it will be beneficial for a specific project.

Ethical considerations around data are paramount, as data can inadvertently lead to harm, such as in the case of revealing sensitive information through images like satellite imagery after a natural disaster.

It’s essential to handle data responsibly to prevent any leaks or unauthorized release of personal information, as there are real risks associated with compromised data, including potential misuse by authoritarian regimes for profiling and repressing dissidents.

Engaging with stakeholders is crucial to comprehensively understand potential negative effects, as seen in the example of tracking endangered species, where the collected data could be misused by poachers. It is therefore imperative to apply the principle of “Do No Harm” and carefully consider the potential impacts on all those affected.

The current system prioritizes profit over people and the planet. We need to shift the goal of AI and data practices to “Do No Harm”, a principle borrowed from the Hippocratic Oath, the foundation of ethics in medicine and healthcare. For the growing field of AI, this requires a fundamental change, from the very beginning, in how we approach data collection, labeling, use, and storage.

It’s a Difficult Question, But We Can’t Avoid It

Change can be daunting, but it’s necessary. The people and the planet deserve to be treated with dignity and respect. This should be the guiding principle for AI and data practices, not profit maximization. Achieving this will require a multi-pronged approach, including:

  • Empowering Communities: Indigenous communities and other marginalized groups should have control over their own data.

  • Fair Compensation for Data Labor: Data workers deserve fair wages and protections.

  • Transparency in Data Collection: People should know when and how their data is being collected and used.

  • Algorithmic Auditing: We need to continually identify and mitigate bias in AI systems.

The Responsibility Lies With All of Us

The conversation about ethics in AI and data practices needs to continue, and we all have a role to play in demanding change. Corporate responsibility matters enormously, but the public can pressure corporations to shift their priorities in order to protect everyone. As in many areas that need attention and change, the ability is there, but so is apathy, and, worse, the assumed apathy of everyone else. It is extremely important to build in responsibility now, at the level AI is at today, rather than waiting until it is far more powerful, and to press corporations to invest more in responsibility than they do in profit and speed.

Final Thoughts on the Panel

It was an honor to be invited to speak on this panel and be a part of such important conversations about data responsibility and ethics. I hope that it spurs much thought and work in the community for others to join the ethics revolution.

If you are working with data, your part in this revolution is to abide by the CARE Principles of Indigenous Data Sovereignty and to take the time to consider potential harms when working with data and AI. This needs to happen at every stage of the data lifecycle, especially when working with Indigenous data and data from the Global South. As a citizen of the world, vote for representatives who support regulation and policy that protect all people in the development of technology, and put your money toward ethically developed AI products.

If you want to read more about my work in this area and ethics and AI more broadly, you can access my published works below.
