While we have made great progress in natural language understanding, transferring the success from benchmark datasets to real applications has not always been smooth. Notably, models sometimes make mistakes that are confusing and unexpected to humans. In this talk, I will discuss shortcut learning in NLP tasks and present our recent work on guarding against spurious correlations in natural language understanding tasks (e.g., textual entailment and paraphrase identification) from the perspectives of both robust learning algorithms and better data coverage. Motivated by the observation that our data often contains a small number of "unbiased" examples that do not exhibit spurious correlations, we present new learning algorithms that better exploit these minority examples. Alternatively, we may directly augment the data with such "unbiased" examples. While recent work along this line is promising, we identify several pitfalls of the data augmentation approach.
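
As a loose illustration of what "exploiting minority examples" can look like in practice (this is not the specific algorithm from the talk), the sketch below upweights training examples that a first-pass model misclassifies, in the spirit of two-stage reweighting methods such as Just Train Twice (Liu et al., 2021). The synthetic data, model choices, and the upweight factor are all assumptions made for illustration.

```python
# A minimal sketch, assuming a JTT-style two-pass reweighting scheme;
# not the talk's method. All data and hyperparameters are placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: feature 0 is a spurious shortcut that agrees with the
# label on ~90% of examples; the remaining "minority" examples violate it.
n, d = 1000, 16
y = torch.randint(0, 2, (n,))
x = torch.randn(n, d)
majority = torch.rand(n) < 0.9
x[:, 0] = torch.where(majority, y.float(), 1.0 - y.float())

def train(model, x, y, weights, epochs=50):
    """Train with per-example loss weights (weights of 1.0 = standard ERM)."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss(reduction="none")
    for _ in range(epochs):
        opt.zero_grad()
        loss = (weights * loss_fn(model(x), y)).mean()
        loss.backward()
        opt.step()
    return model

# Pass 1: train a standard model; it tends to latch onto the shortcut.
first_pass = train(nn.Linear(d, 2), x, y, torch.ones(n))

# Treat the examples the first-pass model gets wrong as likely minority
# (shortcut-violating) examples.
with torch.no_grad():
    errors = first_pass(x).argmax(dim=1) != y

# Pass 2: retrain from scratch, upweighting the misclassified examples
# (the 5x upweight factor is an arbitrary illustrative choice).
weights = 1.0 + 4.0 * errors.float()
robust_model = train(nn.Linear(d, 2), x, y, weights)
```

The design intuition is that a model trained with standard ERM tends to fit the shortcut first, so its residual errors concentrate on the minority examples that the abstract describes; reweighting those examples forces the second-pass model to rely on features that hold across both groups.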