Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks.
SOTA for Question Answering on SQuAD2.0
LINGUISTIC ACCEPTABILITY • NATURAL LANGUAGE INFERENCE • QUESTION ANSWERING • SEMANTIC TEXTUAL SIMILARITY • SENTIMENT ANALYSIS
To enable evaluation of progress on code search, we are releasing the CodeSearchNet Corpus and are presenting the CodeSearchNet Challenge, which consists of 99 natural language queries with about 4k expert relevance annotations of likely results from CodeSearchNet Corpus.
CMU-Perceptual-Computing-Lab/openpose_train
We present the first single-network approach for 2D whole-body pose estimation, which entails simultaneous localization of body, face, hands, and feet keypoints.
CMU-Perceptual-Computing-Lab/openpose_train
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks.
SOTA for Text Summarization on GigaWord (using extra training data)
ABSTRACTIVE TEXT SUMMARIZATION • LANGUAGE MODELLING • QUESTION ANSWERING • QUESTION GENERATION • TEXT GENERATION
We release an implementation of our language with this paper.
MIT-HAN-LAB/temporal-shift-module
The explosive growth in video streaming gives rise to challenges on performing video understanding at high accuracy and low computation cost.
VIDEO OBJECT DETECTION • VIDEO RECOGNITION • VIDEO UNDERSTANDING
CVPR 2018 • MIT-HAN-LAB/temporal-shift-module
Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time.
#3 best model for Instance Segmentation on COCO minival
INSTANCE SEGMENTATION • KEYPOINT DETECTION • OBJECT DETECTION • VIDEO CLASSIFICATION
MIT-HAN-LAB/temporal-shift-module
The other contribution is our study on a series of good practices in learning ConvNets on video data with the help of temporal segment network.
#3 best model for Multimodal Activity Recognition on EV-Action
ACTION RECOGNITION IN VIDEOS • MULTIMODAL ACTIVITY RECOGNITION
Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.
SOTA for Language Modelling on Text8 (using extra training data)
COMMON SENSE REASONING • DOCUMENT SUMMARIZATION • LANGUAGE MODELLING • MACHINE TRANSLATION • QUESTION ANSWERING • READING COMPREHENSION • TEXT GENERATION