Data Labeling using Weak Supervision: In Action
In this blog post, I will share my takeaways and results from using Weak Supervision to label Jigsaw’s Comments data as toxic or non-toxic. In my previous
Weak Supervision for Online Discussions
When working on the data I used for my previous blog post, I grew particularly interested in learning how the dataset was labeled for toxicity and identities. I believe that...
Feature-based Approach with BERT
BERT is a language representation model pre-trained on a very large unlabeled text corpus over different pre-training tasks. It was proposed in the paper BERT: Pre-training of...
Semantic Entailment
A key part of our understanding of natural language is the ability to understand sentence semantics.