WikiReading - Review of Literature
What is WikiReading
@ WIKIREADING: A Novel Large-scale Language Understanding Task over Wikipedia, ACL 2016
new dataset built from structured knowledge statements
Wikidata
(item, property, value) -> (document, question, answer)
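The mapping above can be sketched in a few lines. This is only an illustration, not the dataset's actual build pipeline: `Statement`, `Instance`, `to_instance`, and the `articles` lookup are hypothetical names, and the sample article text is made up for the example.

```python
from typing import Dict, NamedTuple


class Statement(NamedTuple):
    item: str   # Wikidata item, e.g. "Paris"
    prop: str   # Wikidata property, e.g. "country"
    value: str  # value of the property, e.g. "France"


class Instance(NamedTuple):
    document: str  # text of the Wikipedia article about the item
    question: str  # the property name acts as the question
    answer: str    # the property value acts as the answer


def to_instance(stmt: Statement, articles: Dict[str, str]) -> Instance:
    """Turn one (item, property, value) statement into a (document, question, answer) instance."""
    return Instance(document=articles[stmt.item], question=stmt.prop, answer=stmt.value)


# Hypothetical article store: item name -> article text.
articles = {"Paris": "Paris is the capital and most populous city of France."}
example = to_instance(Statement("Paris", "country", "France"), articles)
```

Here the question is just the property name, so the "short question" feature of the task follows directly from the construction.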
Statistics
- total instances: 18.58M
- train/val/test (85/10/5): 16.03M, 1.89M, 0.95M
- documents: 4.7M unique; instances per document: mean 5.31, median 4, max 879; words per document: mean 489.2, median 203
- properties: 867 unique; the top 20 cover 75% of instances, the top 180 cover 99%; categorical and relational
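The coverage figures above (20 properties for 75%, 180 for 99%) are cumulative frequencies over the most common properties. A minimal sketch of that computation follows; `coverage_of_top_k` and the toy property list are hypothetical and are not taken from the dataset.

```python
from collections import Counter
from typing import Iterable


def coverage_of_top_k(properties: Iterable[str], k: int) -> float:
    """Fraction of instances whose property is among the k most frequent properties."""
    counts = Counter(properties)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no instances given")
    covered = sum(n for _, n in counts.most_common(k))
    return covered / total


# Toy data: one property name per instance.
toy = ["instance of"] * 6 + ["country"] * 3 + ["date of birth"]
top1 = coverage_of_top_k(toy, 1)
top2 = coverage_of_top_k(toy, 2)
```

A long tail like this is why a small number of properties dominates the dataset while hundreds of others are rare.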
Features
- long document
- short question
- structured knowledge
A Real Case
Baselines
State-of-the-art Methods
Coarse-to-fine Model
@ Coarse-to-Fine Question Answering for Long Documents, ACL 2017
@ Hierarchical Question Answering for Long Documents, arXiv 2016
Intuition
long documents make recurrent readers slow
Model
Sentence Selection Methods
BoW Model
Chunked BoW Model
Convolutional Neural Network Model
Document Representation
Hard Attention
Soft Attention
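The difference between the two document representations can be sketched as follows, assuming each sentence has already been encoded as a fixed-size vector and scored for relevance to the question. The function names, the toy vectors, and the toy scores are all hypothetical; the papers' actual encoders and scoring networks are not reproduced here.

```python
import math
import random
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw relevance scores into a probability distribution over sentences."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def hard_attention(vectors: List[List[float]], probs: List[float], rng: random.Random) -> List[float]:
    """Sample one sentence vector according to the selection distribution."""
    idx = rng.choices(range(len(vectors)), weights=probs, k=1)[0]
    return vectors[idx]


def soft_attention(vectors: List[List[float]], probs: List[float]) -> List[float]:
    """Probability-weighted average of all sentence vectors."""
    dim = len(vectors[0])
    return [sum(p * v[i] for p, v in zip(probs, vectors)) for i in range(dim)]


sentence_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy sentence encodings
probs = softmax([2.0, 0.5, 0.1])                          # toy relevance scores
picked = hard_attention(sentence_vectors, probs, random.Random(0))
blended = soft_attention(sentence_vectors, probs)
```

Hard attention passes only the sampled sentence to the answer model, which keeps the expensive reader short but makes the choice non-differentiable; soft attention stays differentiable at the cost of blending every sentence.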
Learning Methods
Distant Supervision
First sentence that fully matches the answer
First sentence of the document if no full match exists
Reinforcement Learning
Soft Attention Learning
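The gold-sentence rule listed above (first sentence with a full match of the answer, otherwise the first sentence of the document) can be sketched as below. `gold_sentence_index` is a hypothetical helper, and it assumes the document is already split into sentences and that a "full match" means a case-sensitive substring match.

```python
from typing import List


def gold_sentence_index(sentences: List[str], answer: str) -> int:
    """Index of the first sentence containing the full answer string, else 0."""
    for i, sentence in enumerate(sentences):
        if answer in sentence:  # assumption: case-sensitive substring match
            return i
    return 0  # no full match: fall back to the first sentence of the document


sentences = [
    "Paris is a large city.",
    "It is the capital of France.",
    "France is in Europe.",
]
matched = gold_sentence_index(sentences, "France")
fallback = gold_sentence_index(sentences, "Germany")
```

The selected index then serves as a noisy training label for the sentence selector, with no manual annotation of evidence sentences.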
Result
Discussion
70%, 10%, 20%
Sliding-Window Encoder Attentive Reader (SWEAR)
@ Accurate Supervised and Semi-Supervised Machine Reading for Long Documents, EMNLP 2017
Intuition
Long document, Chunk
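Chunking a long document into overlapping windows can be sketched as follows. The helper name `sliding_windows` and the `size` and `stride` values are assumptions chosen for illustration, not the settings reported in the paper.

```python
from typing import List


def sliding_windows(tokens: List[str], size: int, stride: int) -> List[List[str]]:
    """Split tokens into windows of at most `size` tokens, starting every `stride` tokens."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + size])
        if start + size >= len(tokens):  # this window reached the end of the document
            break
        start += stride
    return windows


tokens = "a b c d e f g".split()
chunks = sliding_windows(tokens, size=4, stride=2)
```

With `stride` smaller than `size`, neighbouring windows overlap, so an answer that straddles a window boundary still appears whole in at least one window. Each window can then be encoded separately and combined with attention.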
Model
- Attention Method
Result
Feedback & Advice
- weibo: @伟康青年, @github
- mail: wavejkd@pku.edu.cn