Data Labeling using Weak Supervision: In Action

In this blog post, I share my takeaways and results from using Weak Supervision to label Jigsaw’s comments data as toxic or non-toxic. In my previous

Weak Supervision for Online Discussions

When working on the data I used for my previous blog post, I grew particularly interested in learning how the dataset was labeled for toxicity and identities. I believe that...

Feature-based Approach with BERT

BERT is a language representation model pre-trained on a very large unlabeled text corpus using several pre-training tasks. It was proposed in the paper BERT: Pre-training of...

Semantic Entailment

A key part of our understanding of natural language is the ability to understand sentence semantics.