Pingpong Team ML Seminar, Season 6

Season 6 seminar materials from Pingpong's ML research scientists

Sep 16, 2020

Contents

- Open-Retrieval Conversational Question Answering (서상우)
- Beyond Accuracy: Behavioral Testing of NLP Models with CheckList (박채훈)
- Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval (김준성)
- Weight Poisoning Attacks on Pre-trained Models (정다운)
- Sparse, Dense, and Attentional Representations for Text Retrieval (구상준)
- Adversarial Filters of Dataset Biases (장성보)
- SimCLR: A Simple Framework for Contrastive Learning of Visual Representations (이주홍)
- Closing Remarks

서상우, 박채훈, 김준성, 정다운, 구상준, 장성보, 이주홍 | September 16, 2020 | #Machine_Learning

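Hello. This summer passed so quietly that we hardly noticed it arrive. We have gathered the presentation materials from the Season 6 seminar that our scientists held over the summer and are posting them here.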

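As in previous seasons, the seminar ran weekly from late July through September 2020 with no restriction on topic. Even so, this season's talks converged on two questions: ‘How does a sentence's representation affect the output?’ and ‘How can we measure model performance?’ It is well known that Transformers perform well on existing benchmarks, but how that translates into performance users actually perceive remains an open question. This is a challenge facing everyone who builds open-domain chatbots.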

Open-Retrieval Conversational Question Answering (서상우)

Open-Retrieval Conversational Question Answering
- Written by Chen Qu et al. @ University of Massachusetts Amherst, Ant Financial & Alibaba Group
- Published @ SIGIR 2020

https://speakerdeck.com/scatterlab/open-retrieval-conversational-question-answering

Beyond Accuracy: Behavioral Testing of NLP Models with CheckList (박채훈)

Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
- Written by Marco Tulio Ribeiro et al. @ Microsoft Research, University of Washington & University of California, Irvine
- Published @ ACL 2020

https://speakerdeck.com/scatterlab/beyond-accuracy-behavioral-testing-of-nlp-models-with-checklist

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval (김준성)

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
- Written by Lee Xiong et al. @ Microsoft Corp.
- Preprint on arXiv, 2020

https://speakerdeck.com/scatterlab/approximate-nearest-neighbor-negative-contrastive-learning-for-dense-text-retrieval

Weight Poisoning Attacks on Pre-trained Models (정다운)

Weight Poisoning Attacks on Pre-trained Models
- Written by Keita Kurita et al. @ Language Technologies Institute, Carnegie Mellon University
- Published @ ACL 2020

https://speakerdeck.com/scatterlab/weight-poisoning-attacks-on-pre-trained-models

Sparse, Dense, and Attentional Representations for Text Retrieval (구상준)

Sparse, Dense, and Attentional Representations for Text Retrieval
- Written by Yi Luan et al. @ Google Research
- Preprint on arXiv, 2020

https://speakerdeck.com/scatterlab/sparse-dense-and-attentional-representations-for-text-retrieval

Adversarial Filters of Dataset Biases (장성보)

Adversarial Filters of Dataset Biases
- Written by Ronan Le Bras et al. @ Allen Institute for Artificial Intelligence & University of Washington
- Published @ ICML 2020

https://speakerdeck.com/scatterlab/adversarial-filters-of-dataset-biases

SimCLR: A Simple Framework for Contrastive Learning of Visual Representations (이주홍)

SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
- Written by Ting Chen et al. @ Google Research, Brain Team
- Published @ ICML 2020

https://speakerdeck.com/scatterlab/simclr-a-simple-framework-for-contrastive-learning-of-visual-representations

Closing Remarks

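We have shared the materials from the machine learning seminar held in the summer of 2020. What exactly does the sentence “this language model is good” mean? If a language model that scores highly on benchmarks cannot actually be called a good model, what are we overlooking? The Pingpong team continues working to build chatbots that better reflect the complex nature of language.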

PS: Sign up for the mailing list below to be the first to receive the fun and useful posts published on the Pingpong blog 😉