
The Quranic Arabic Corpus

Welcome to the Quranic Arabic Corpus, an annotated linguistic resource which shows the Arabic grammar, syntax and morphology for each word in the Holy Quran. The corpus provides three levels of analysis: morphological annotation, a syntactic treebank and a semantic ontology.

Developed at the Institute for Artificial Intelligence (AI) at the University of Leeds, the grammatical data was initially produced by a computer program that independently derived the rules of Classical Arabic grammar using machine learning. By learning from example annotation, the AI system provides detailed grammatical analysis for each word in the Quran. The resulting information has been made available online as a free educational resource.

Like Wikipedia, the corpus website is publicly accessible and contains no advertising. Although the annotation was generated automatically by an artificial intelligence system, a team of volunteer editors continually updates and improves the content by comparing it with authentic, traditional sources of Quranic grammar.

The corpus is a unique online website for studying the Classical Arabic language of the Quran through the traditional analysis of iʿrāb (إعراب), focusing on morphology (ṣarf - صرف) and syntactic analysis (naḥw - نحو). The website is actively used worldwide to access deep linguistic analysis of the Quran, as a study aid for learning Classical Arabic. Reviving the historical approach to linguistics by traditional Arabic grammarians, the website contains new grammatical information provided in unprecedented detail that is not available anywhere else on the internet or in book form.

What is the Quran?

The Quran is a significant religious text followed by believers of the Islamic faith. Written in Quranic Arabic, the Quran contains 6,236 numbered verses (āyāt) and is divided into 114 chapters. An example verse (21:30) from the Quran:

Have those who disbelieved not considered that the heavens and the earth were a joined entity, and We separated them and made from water every living thing? Then will they not believe?

Access the corpus...

  • What's new? - Updated features in version 0.5 of the corpus.
  • Word by Word Quran - Maps out the syntax of the entire Quran, with analysis and translation.
  • Quranic Grammar - Traditional Arabic grammar (إعراب) illustrated using dependency graphs.

How you can get involved

This project contributes to original research of the Quran by applying natural language computing technology to analyze the Arabic text of each verse. The word by word grammar is very accurate, but ensuring complete accuracy is not possible without your help. If you come across a word and you feel that a better analysis could be provided, you can suggest a correction online by clicking on an Arabic word.

World map of users of the Quranic Arabic Corpus, provided by Google Analytics. Countries with the highest number of users are shaded in darker blue.

The map above shows worldwide interest in the Quranic Arabic Corpus. The website is used by over 5 million people a year from 165 different countries. Help us review the information on this website so that together we can build the most accurate linguistic resource for Quranic Arabic.


The Quranic Arabic Corpus is divided into several research projects that focus on different aspects of linguistics. The main projects are the treebank, which covers morphology (صرف) and syntax (نحو), and the ontology, which maps concepts in the Quran.

The Quranic Arabic Treebank

The Quranic Treebank is an effort to map out the entire grammar of the Quran by linking Arabic words through dependencies. The linguistic structure of verses is represented using formal graphs (from mathematical graph theory). The Quranic Arabic Corpus provides a novel visualization of Quranic syntax using hybrid dependency graphs that combine aspects of both modern constituency and dependency grammar.

A hybrid dependency-constituency syntactic representation for verse (67:1) of the Quran.
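
As a rough illustration of what a hybrid graph involves, the sketch below stores both single words and phrase spans as nodes, and lets a labeled dependency edge connect either kind. It is a minimal sketch only: the class names, the placeholder tokens (w1 to w4) and the relation labels are hypothetical and are not the corpus's actual data format or tag set.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Word:
    """A terminal node: one word, identified by its position in the verse."""
    index: int
    text: str


@dataclass(frozen=True)
class Phrase:
    """A constituency node covering words start..end (inclusive)."""
    start: int
    end: int
    label: str


@dataclass
class HybridGraph:
    """Words plus phrase nodes, linked by labeled dependency edges."""
    words: list
    phrases: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (dependent, head, relation)

    def add_phrase(self, start, end, label):
        phrase = Phrase(start, end, label)
        self.phrases.append(phrase)
        return phrase

    def link(self, dependent, head, relation):
        # Either end of an edge may be a Word or a Phrase.
        self.edges.append((dependent, head, relation))

    def dependents_of(self, head):
        return [(d, r) for d, h, r in self.edges if h == head]


def build_example():
    # Placeholder tokens, not real corpus data.
    words = [Word(i, t) for i, t in enumerate(["w1", "w2", "w3", "w4"], start=1)]
    graph = HybridGraph(words)
    graph.link(words[1], words[0], "subject")      # word -> word dependency
    phrase = graph.add_phrase(3, 4, "PP")          # constituent over w3..w4
    graph.link(words[3], words[2], "prep-object")  # dependency inside the phrase
    graph.link(phrase, words[0], "modifier")       # phrase -> word dependency
    return graph
```

The last edge is the hybrid part: a whole phrase, rather than a single word, acts as the dependent of a head word.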

The Ontology of Quranic Concepts

The Quranic Ontology uses knowledge representation to define the key concepts in the Quran and shows the relationships between these concepts using predicate logic. Named entities in verses, such as the names of historic people and places mentioned in the Quran, are linked to concepts in the ontology.

A visual representation of the ontology with 300 linked concepts and 350 relations.
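
One simple way to picture concepts linked by predicates is as a set of (predicate, subject, object) facts that can be queried. The sketch below assumes this representation; the predicate names and the concepts shown are hypothetical placeholders, not entries taken from the Quranic Ontology.

```python
# Hypothetical facts of the form (predicate, subject, object).
FACTS = [
    ("is_a", "city", "place"),
    ("is_a", "place", "location"),
    ("is_a", "person", "living-being"),
    ("instance_of", "example-city", "city"),
]


def ancestors(facts, concept):
    """Return every concept reachable from `concept` by following is_a links."""
    parents = {}
    for predicate, child, parent in facts:
        if predicate == "is_a":
            parents.setdefault(child, set()).add(parent)

    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def concepts_of(facts, entity):
    """Concepts a named entity belongs to, directly or through is_a links."""
    direct = {o for p, s, o in facts if p == "instance_of" and s == entity}
    result = set(direct)
    for concept in direct:
        result |= ancestors(facts, concept)
    return result
```

Under these assumed facts, asking for the concepts of the placeholder entity "example-city" yields "city" together with the broader concepts "place" and "location".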

See also...

Language Research Group
University of Leeds
In research publications, please cite:

Statistical Parsing by Machine Learning from a Classical Arabic Treebank.
Kais Dukes (2013). PhD Thesis, University of Leeds.

Copyright © Dr. Kais Dukes (2009-2015).
E-mail: kais@kaisdukes.com