Bridging the gap between theoretical and practical applications of machine learning with Kaggle

Jigsaw
Jun 23, 2022

It’s been five years since we launched Perspective API, Jigsaw’s free service that helps moderators manage user-generated content online. Looking back on how the tool has evolved and improved, we’ve been lucky to receive important and productive feedback from the machine learning community. When people interact directly with a model, they often uncover unanticipated outcomes. These problems are familiar to practitioners, and they are the primary focus of our ongoing efforts to improve and support Perspective.

That is one of the reasons we’ve repeatedly turned to Google’s Kaggle, a site that hosts machine learning competitions, to test our assumptions and engage with the machine learning community. Over the last five years, Kaggle competitions have become an essential step in Jigsaw’s product development process, helping to bridge the gap between technical hypothesis and practical application. Along the way we have tested our theories, revised our assumptions, and fundamentally improved our outcomes.

More data for the win

Data augmentation lets practitioners significantly increase the diversity of the data available for training models without collecting new data. In our Kaggle competitions we’ve seen contestants use datasets in novel ways, including back-translation: algorithmically translating phrases from one language to another and then back to the original language. These round-trip translations can introduce new phrasings and equivalent terms missing from the training data, improving coverage and model generalization.
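The round-trip idea can be sketched in a few lines. A real pipeline would call a machine-translation model in each direction; here `translate_en_fr` and `translate_fr_en` are hypothetical stand-ins backed by a toy phrase table, just to show the shape of the augmentation step:

```python
# Back-translation sketch: round-trip a sentence through a second language
# to generate paraphrases for data augmentation. The translate_* functions
# are illustrative stand-ins for a real machine-translation system.

TOY_EN_FR = {"this movie is great": "ce film est génial"}
TOY_FR_EN = {"ce film est génial": "this film is wonderful"}

def translate_en_fr(text):
    # Stand-in for an EN->FR translation model.
    return TOY_EN_FR.get(text, text)

def translate_fr_en(text):
    # Stand-in for an FR->EN translation model.
    return TOY_FR_EN.get(text, text)

def back_translate(sentence):
    """Round-trip EN -> FR -> EN; the result is a paraphrase candidate."""
    return translate_fr_en(translate_en_fr(sentence))

def augment(dataset):
    """Append round-trip paraphrases that differ from the originals."""
    augmented = list(dataset)
    for sentence in dataset:
        paraphrase = back_translate(sentence)
        if paraphrase != sentence:
            augmented.append(paraphrase)
    return augmented
```

In practice the paraphrase check is fuzzier than exact string inequality, and the translation calls dominate the cost, but the structure is the same: translate out, translate back, keep what changed.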

We are committed to improving the outcomes of Perspective API, and if there are ways to make the tool more effective, we want to know. Seeing how other engineers and scientists approach these challenges often influences the way that we build models. And while strategies used to solve challenges on Kaggle can be impractical in production environments, they often bring new perspectives that help us come at challenges in new ways.

Mitigating bias

We have written about the problem of unintended bias in models, and we chose this topic as the basis for one of our earlier Kaggle competitions. The competition featured a newly introduced scoring framework that required participants’ models to perform well on many slices of the test data related to particular identity groups, such as gender or ethnicity. The tendency of machine learning models to unfairly label comments from and about marginalized communities remains an open research challenge, and our team continues to build algorithms that mitigate these biases.
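The core of slice-based scoring is easy to illustrate. The competition’s actual metric combined several per-subgroup measures; the simplified sketch below evaluates one of them, AUC restricted to examples mentioning each identity group, so a model cannot score well overall while failing on a particular group. The field names (`label`, `score`, `identities`) are illustrative, not the competition’s schema:

```python
# Sketch of slice-based bias evaluation: compute AUC separately on the
# subset of test examples that mention each identity group. AUC here uses
# the Mann-Whitney rank formulation (probability that a random positive
# outranks a random negative, with ties counting half).

from itertools import product

def auc(labels, scores):
    """Rank-based AUC; returns NaN if a class is empty on this slice."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

def subgroup_aucs(examples, groups):
    """AUC computed only on examples tagged with each identity group."""
    results = {}
    for group in groups:
        subset = [e for e in examples if group in e["identities"]]
        results[group] = auc([e["label"] for e in subset],
                             [e["score"] for e in subset])
    return results
```

A model whose errors concentrate on one group will show a visibly lower AUC on that group’s slice even when its overall AUC looks healthy, which is exactly what the scoring framework was designed to surface.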

Over three thousand teams participated in this contest. The winning entry studied the bias metric closely and used several techniques unique to their solution, including a custom loss function. The competition’s more significant and lasting impact can be seen by tracing the research that followed: the winning approach has since been cited hundreds of times in publications and articles and covered in several YouTube tutorials.

Promoting transparency

At Jigsaw we have the opportunity to work with a number of academic institutions, and one of the benefits we’ve seen in this sector is the openness to share information, learnings and hypotheses. This collegial spirit is one we aspire to always encourage in our work, and it has been repeatedly apparent in our Kaggle competitions. Kaggle’s community sits at the intersection of the academic data science community, industry and an eclectic group of hobbyists and entrepreneurs that almost defy categorization. The winners are frequently members of the community who share notebooks and advice in the forums, helping others move their work forward even if it provides no immediate gains to them personally.

We have often used our Kaggle competitions as a way to share our thinking and the challenges we’re aiming to solve. Jigsaw has worked to build a collection of annotated resources that can be publicly shared and used in commercial research. And even when we cannot share specific data, we endeavor to share a discussion of which methods work and which don’t, which can help others tackling similar problems.

All of our Kaggle contests have included disclosure requirements: participants must publicly share the details of their winning submissions to receive their cash prizes. Some of our competitions, like the Multilingual Toxic Comments Challenge, have carried even stronger disclosure requirements for technical reasons. In that competition, contestants had to share all of their data with other competitors before they could train their models, because the Google TPU accelerator hardware did not yet support private data. As always, Kagglers took these technical constraints in stride, and Jigsaw recognized two participants with special cash prizes for building and sharing popular models and tools. Opening up access to the latest machine learning hardware, even before it’s commercially available, is part of the excitement that Kaggle brings to participants.

What’s Next

We view our Kaggle competitions as an opportunity to release data, engage with the community in a way that promotes transparency and provide performance baselines that can help inform discussions about algorithms and policies. Not every research question or product goal makes for a good competition. And even with a great idea, finding the right source of data and annotating it can pose serious challenges. We hope to continue providing competitions that engage the Kaggle community and bring new people into the field, and we’re actively considering topics for upcoming contests, including detecting toxic spans, incorporating annotator identity into model scores, or developing new models which better collaborate with human moderators.

Contributors: Jeffrey Sorensen, Ian Kivlichan, Nithum Thain, Tin Acosta, Lucy Vasserman

Jigsaw is a unit within Google that explores threats to open societies, and builds technology that inspires scalable solutions.