Transliteration converts proper names and technical terms between languages that use different alphabets and sound systems, preserving pronunciation rather than meaning.
Input:
约翰伍兹 (yue han wu zi)
Output:
John Woods
The Named Entities Workshop (NEWS) shared task is a long-running transliteration evaluation campaign. Chinese/English is one of the most popular NEWS language pairs. The NEWS 2018 test sets for this pair are:
| Test set name | Source | Target | Test set size (phrase pairs) |
|---|---|---|---|
| NEWS 2018 Dataset_03 T-EnCh | English | Chinese | 1000 |
| NEWS 2018 Dataset_03 B-ChEn | Chinese | English | 1000 |
English-Chinese
| System | ACC | F-score | MRR | MAP |
|---|---|---|---|---|
| He, Cohen (2020) | 0.299 | 0.6799 | | |
| EDI (University of Edinburgh) | 0.304 | 0.6791 | 0.4364 | 0.304 |
Chinese-English
| System | ACC | F-score | MRR | MAP |
|---|---|---|---|---|
| UALB (University of Alberta) | 0.3 | 0.8 | 0.374 | 0.3 |
| EDI (University of Edinburgh) | 0.276 | 0.83 | 0.386 | 0.276 |
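
The ACC, F-score, and MRR columns above can be illustrated with a minimal sketch. It assumes the definitions commonly described for the NEWS shared tasks: ACC is top-1 exact match, F-score is based on the longest common subsequence (LCS) between the top candidate and a reference, and MRR is the mean reciprocal rank of the first correct candidate. The function names (`lcs_length`, `f_score`, `evaluate`) are hypothetical, the F-score takes the best-scoring reference as a simplification, and MAP is omitted. The official NEWS evaluation script remains the authoritative implementation.

```python
from typing import Dict, List, Sequence


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def f_score(candidate: str, references: Sequence[str]) -> float:
    """Best LCS-based F-score of the candidate over all references.

    Simplification (assumption): uses the best-scoring reference rather
    than selecting the reference closest by edit distance.
    """
    best = 0.0
    for ref in references:
        lcs = lcs_length(candidate, ref)
        if lcs == 0:
            continue
        precision = lcs / len(candidate)
        recall = lcs / len(ref)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


def evaluate(predictions: List[List[str]],
             references: List[List[str]]) -> Dict[str, float]:
    """Compute ACC, mean F-score, and MRR over ranked candidate lists."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    acc = f_total = mrr = 0.0
    for candidates, refs in zip(predictions, references):
        if not candidates:
            continue  # an empty candidate list scores zero on every metric
        top = candidates[0]
        acc += 1.0 if top in refs else 0.0
        f_total += f_score(top, refs)
        for rank, cand in enumerate(candidates, start=1):
            if cand in refs:
                mrr += 1.0 / rank
                break
    n = len(predictions)
    return {"ACC": acc / n, "F-score": f_total / n, "MRR": mrr / n}


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the NEWS datasets.
    preds = [["John Woods", "Jon Woods"], ["Jon Smith", "John Smith"]]
    refs = [["John Woods"], ["John Smith"]]
    print(evaluate(preds, refs))
```

Because the F-score gives partial credit for near misses, it is typically much higher than ACC, which matches the pattern in the tables above.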

The corresponding NEWS 2018 training sets:

| Train set name | Source | Target | Train set size (phrase pairs) |
|---|---|---|---|
| NEWS 2018 Dataset_03 T-EnCh | English | Chinese | 41318 |
| NEWS 2018 Dataset_03 B-ChEn | Chinese | English | 32002 |

Suggestions? Changes? Please send email to chinesenlp.xyz@gmail.com