July 26, 2020

Data Labeling using Weak Supervision: In Action

In this blog post, I share my takeaways and results from using Weak Supervision to label Jigsaw’s Comments data as toxic or non-toxic. In my previous...

July 05, 2020

Weak Supervision for Online Discussions

When working on the data I used for my previous blog post, I grew particularly interested in learning how the dataset was labeled for toxicity and identities. I believe that...

April 04, 2020

Feature-based Approach with BERT

BERT is a language representation model pre-trained on a very large unlabeled text corpus across different pre-training tasks. It was proposed in the paper BERT: Pre-training of...

December 22, 2017

Semantic Entailment

A key part of understanding natural language is the ability to grasp sentence semantics.