Question answering (QA) is the task of automatically answering questions posed in natural language. Answers may be drawn from structured databases or unstructured text collections.
Input:
世界上最大的国家是什么? (What is the largest country in the world?)
Output:
俄国 (Russia)
The KBQA shared task at NLPCC 2017 asks systems to retrieve answers from a provided knowledge base (KB) of factual triples. The knowledge base consists of 8.7M entities and 47.9M triples.
The test set was formed by human annotators who selected triples from the KB. For each triple, the annotator wrote a natural-language question whose answer is the object of the triple. Only the Q/A pairs are released; the underlying triples are not.
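The core lookup step of the task can be sketched as follows. Once a question has been linked to a (subject, predicate) pair, the answer is simply the object of the matching triple. The toy triples and function names below are illustrative, not part of the shared task's data or tooling; the hard entity/predicate linking step is assumed to have happened already.

```python
# Index the KB so that (subject, predicate) maps to its objects.
def build_index(triples):
    index = {}
    for subj, pred, obj in triples:
        index.setdefault((subj, pred), []).append(obj)
    return index

# Toy KB for illustration only; the real KB has 47.9M triples.
triples = [
    ("Russia", "area_rank", "1"),
    ("Russia", "capital", "Moscow"),
]
index = build_index(triples)

def answer(subject, predicate, index):
    """Return the objects of all triples matching (subject, predicate)."""
    return index.get((subject, predicate), [])

print(answer("Russia", "capital", index))
```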
Test set | Size (Q/A pairs) | Genre |
---|---|---|
NLPCC-ICCPOL KBQA 2016 | 9870 | Open domain |
NLPCC KBQA 2017 | 7631 | Open domain |
The evaluation metric is averaged F1: the F1 score between a system's answer set and the gold answer set for each question, averaged over all questions.
14 teams participated.
System | Averaged F1 |
---|---|
Best anonymous score reported | 0.47 |
Train set | Size (Q/A pairs) | Genre |
---|---|---|
NLPCC KBQA 2016/2017 | 14,609 | Open domain |
The DBQA shared task at NLPCC 2017 asks systems to select, from a given document, the sentence that answers a natural-language question.
The test set was formed by human annotators who were given documents. For each document, an annotator selected a sentence, then constructed a natural-language question whose answer is that sentence.
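A naive baseline for this sentence-selection setup is to rank the document's sentences by word overlap with the question. The sketch below is a hypothetical illustration, not a submitted system; real Chinese text would need word segmentation before the whitespace split used here.

```python
def overlap_score(question, sentence):
    """Fraction of question words that also appear in the sentence."""
    q, s = set(question.split()), set(sentence.split())
    return len(q & s) / (len(q) or 1)

def select_sentence(question, sentences):
    """Pick the document sentence with the highest overlap score."""
    return max(sentences, key=lambda s: overlap_score(question, s))

doc = ["the cat sat", "dogs bark loudly", "where the cat sleeps"]
print(select_sentence("where does the cat sleep", doc))
```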
Test set | Size (document/sentence pairs) | Genre |
---|---|---|
NLPCC-ICCPOL DBQA 2016 | 5779 | Open domain |
NLPCC DBQA 2017 | 2500 | Open domain |
NLPCC DBQA 2016
System | MRR | F1 |
---|---|---|
ERNIE 2.0 | 95.8 | 85.8 |
Meng et al. (2019) (Glyce + BERT) | - | 83.4 |
ERNIE (Baidu) | 95.1 | 82.7 |
BERT | 94.6 | 80.8 |
NLPCC DBQA 2017
System | MRR | MAP | Accuracy @ 1 |
---|---|---|---|
Best anonymous score reported | 72.0 | 71.7 | 59.2 |
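The ranking metrics reported above (MRR, MAP, Accuracy@1) can be sketched as follows, assuming each question comes with its candidate sentences' binary relevance labels in ranked order. This is an illustration of the metric definitions, not the shared task's official scorer.

```python
def mrr(rankings):
    """Mean reciprocal rank of the first relevant candidate."""
    total = 0.0
    for labels in rankings:
        for i, rel in enumerate(labels, start=1):
            if rel:
                total += 1.0 / i
                break
    return total / len(rankings)

def average_precision(labels):
    """Precision averaged over the ranks of relevant candidates."""
    hits, total = 0, 0.0
    for i, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0

def map_score(rankings):
    """Mean average precision over all questions."""
    return sum(average_precision(l) for l in rankings) / len(rankings)

def accuracy_at_1(rankings):
    """Fraction of questions whose top-ranked candidate is relevant."""
    return sum(1 for labels in rankings if labels and labels[0]) / len(rankings)

# Two toy questions: relevant answer at rank 2, then at rank 1.
rankings = [[0, 1, 0], [1, 0]]
print(mrr(rankings), map_score(rankings), accuracy_at_1(rankings))
```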
Train set | Size (document/sentence pairs) | Genre |
---|---|---|
NLPCC DBQA 2016/2017 | 8772 | Open domain |
CLUE is a Chinese Language Understanding Evaluation benchmark. Machine Reading Comprehension (MRC) is the task of teaching machines to read and understand unstructured text and then answer questions about it. The MRC corpus in CLUE consists of three datasets: CMRC 2018 (Cui et al.), ChID (Zheng et al.), and C3 (Sun et al.).
System | CMRC 2018 | ChID | C3 |
---|---|---|---|
HUMAN (CLUE origin) | 92.40 | 87.10 | 96.00 |
RoBERTa-wwm-ext-large (CLUE origin) | 76.58 | 85.37 | 72.32 |
BERT-base (CLUE origin) | 69.72 | 82.04 | 64.50 |
Suggestions? Changes? Please send email to chinesenlp.xyz@gmail.com