fisun.org

Russian Religious Corpora

A research-oriented collection of Russian corpora and resources hosted at fisun.org.

The corpora listed below can be queried via NoSketch Engine; a KonText interface is in development. Corpus query interface (NoSketch Engine): https://noske.fisun.org/

For corpus descriptions, compilation notes, licensing notes, and suggested citations, see the “Corpus info” pages linked below.

Corpora of Orthodox religious language

OrthRus available

A large, multi-genre Russian corpus of Orthodox religious language compiled from multiple portals and text types (data collected in December 2025 and January 2026). It includes magazine-style content, news and topical articles with an Orthodox focus, Q&A (“ask a priest”) materials, and an Orthodox digital library (books, textbooks, and academic works).

Tokens: 232,687,568
Documents (<doc>): 223,666

A multi-source corpus compiled from patriarchia.ru, dialog.elitsy.ru, foma.ru, azbyka.ru, and pravmir.ru.

Vopros available

A corpus of religious Q&A texts (questions and published answers) collected from several Russian Orthodox web portals (data collected in December 2025). Answers are not always authored by priests; in many cases they are provided by other church-affiliated respondents authorized by the portals.

Tokens: 15,568,332
Documents (<doc>): 67,771
Question segments (<question>): 71,510
Answer segments (<answer>): 94,277

Sources include Q&A sections of azbyka.ru and foma.ru, plus Q&A content from elitsy.ru and pravmir.ru.

Foma available

A Russian corpus compiled from the online portal of the Russian Orthodox magazine Foma (foma.ru), excluding Q&A content (data collected in December 2025).

Tokens: 29,890,878
Documents (<doc>): 31,744

Note: the corpus contains non-Q&A content from foma.ru.

Otechnik_20 available

A Russian corpus compiled from the online library “Otechnik” of the Russian Orthodox portal Azbyka (azbyka.ru), containing only texts by authors born in the 20th century (data collected in December 2025).

Tokens: 87,801,535
Documents (<doc>): 3,273

Source: azbyka.ru → Otechnik online library (20th-century authors only).

Corpora of Islamic religious language

IslamRus available

A large, multi-genre Russian corpus compiled from Islamic online resources representing different regions of Russia (data collected in December 2025 and January 2026). The corpus covers multiple registers and genres, including news, topical articles, fatwas, Q&A, and practical guidance texts.

Tokens: 36,021,460
Documents (<doc>): 76,385

A multi-source corpus compiled from islam.ru, islamdag.ru, islam-today.ru, umma.ru, azan.ru, and muftiyatrd.ru.

Islam-today available

A Russian corpus compiled from the portal islam-today.ru, excluding Q&A content (data collected in January 2026). It covers Islam-related discourse in Russia.

Tokens: 7,170,001
Documents (<doc>): 9,498

Otvet_islam available

A corpus of Islamic religious Q&A texts (questions and published answers), including fatwas and expert replies, collected from several Russian web portals (data collected in January 2026). Answers are authored by Islamic scholars, muftis, and imams representing different regions of Russia.

Tokens: 2,363,889
Documents (<doc>): 4,073
Question segments (<question>): 4,023
Answer segments (<answer>): 4,071

Sources include Q&A sections of muftiyatrd.ru, islam-today.ru, islam.ru, islamdag.ru, and azan.ru.
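
For a rough sense of scale, the token and document counts listed above can be turned into an average document length per corpus. The following is a minimal Python sketch that uses only the figures on this page; the dictionary layout and the function names (`mean_tokens_per_doc`, `summary`) are illustrative assumptions and not part of any fisun.org or NoSketch Engine tooling.

```python
# Token and document counts copied from the corpus entries on this page.
# The data structure and names below are hypothetical, for illustration only.
CORPORA = {
    "OrthRus": (232_687_568, 223_666),
    "Vopros": (15_568_332, 67_771),
    "Foma": (29_890_878, 31_744),
    "Otechnik_20": (87_801_535, 3_273),
    "IslamRus": (36_021_460, 76_385),
    "Islam-today": (7_170_001, 9_498),
    "Otvet_islam": (2_363_889, 4_073),
}


def mean_tokens_per_doc(tokens: int, docs: int) -> float:
    """Average number of tokens per <doc> structure."""
    return tokens / docs


def summary(corpora=CORPORA):
    """Return (corpus name, mean tokens per document), longest documents first."""
    rows = [(name, mean_tokens_per_doc(t, d)) for name, (t, d) in corpora.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, avg in summary():
        print(f"{name:<12} {avg:>10.1f} tokens/doc")
```

Such averages depend on what each source treats as one document (a book in a digital library versus a single Q&A exchange), so they are best read as a rough comparison of text types rather than as a measure of corpus quality.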

Reference corpora (not available)

Russian National Corpus (gold standard sample) not available

nkrja_gold_standard. This corpus cannot be queried on this server and is intended for personal use only. The dataset can be downloaded after completing a license agreement at https://ruscorpora.ru/page/corpora-datasets/.

Tokens: 1,393,233
Documents (<doc>): 556

Reference corpora (planned)

The following reference corpora are planned for integration into the same NoSketch Engine interface:

Entries will be activated once the corpora are available for querying on this server.

Contact

Maintainer: roman.fisun@ur.de