GLOSS-Reading corpus

The corpus will be available for download soon.
In the meantime, you can download sample files.


The Global Language Online Support System (GLOSS) Reading corpus was developed by the Defense Language Institute (DLI) Foreign Language Center. It is a resource for independent second-language learners and consists of texts annotated according to the Interagency Language Roundtable (ILR) scale. The corpus covers five difficulty levels, and we analyzed it morphologically using MADAMIRA and Alkhalil-Toolkit. After the necessary cleaning and annotation steps, the GLOSS-Reading corpus has xx texts comprising xx sentences and roughly xx words. The corpus is encoded in raw text and XML; each text is annotated with its level of difficulty, and each word is annotated with its lemma and part of speech.
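
As a rough illustration of how the XML encoding described above might be read, here is a minimal Python sketch. The element and attribute names (`text`, `sentence`, `word`, `level`, `lemma`, `pos`), the sample words, and the `read_text` helper are all assumptions made for this example; the actual schema of the released files may differ, so check the sample files before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The tag and attribute names are assumed
# for illustration and are NOT the confirmed GLOSS-Reading schema.
SAMPLE = """<text id="t1" level="2">
  <sentence>
    <word lemma="كتاب" pos="noun">الكتاب</word>
    <word lemma="جديد" pos="adj">جديد</word>
  </sentence>
</text>"""


def read_text(xml_string):
    """Return the text-level difficulty and a list of (form, lemma, pos)."""
    root = ET.fromstring(xml_string)
    level = root.get("level")  # difficulty level stored on the text element
    words = [
        (w.text, w.get("lemma"), w.get("pos"))
        for w in root.iter("word")  # walk every word in every sentence
    ]
    return level, words


level, words = read_text(SAMPLE)
print("difficulty level:", level)
for form, lemma, pos in words:
    print(form, lemma, pos)
```

If the real files use different names, only the strings passed to `root.get`, `w.get`, and `root.iter` should need to change.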


For further details, please check the following paper(s):

Or contact us at: naoual.nassiri@gmail.com
