otechnik_20 / Azbyka is a Russian corpus compiled from the online library (“Otechnik”) of the Russian Orthodox portal Azbyka (azbyka.ru). It contains only texts by authors born in the 20th century and was collected in December 2025.
Azbyka.ru (“Азбука веры”) is a large Russian Orthodox educational portal that describes itself as a volunteer-driven project funded by reader donations through a nonprofit foundation.
The project reports a monthly audience of more than 8 million unique readers and approximately 100 TB of data downloaded per month across the portal.
The portal states that more than 700 volunteers contribute to the project and that the team is distributed across different cities and countries rather than based in a single office. It also states that its budget consists exclusively of reader donations managed through the nonprofit foundation “Azbyka Very”.
The portal is structured as a multi-section resource, described as a “tree” of projects, and includes major thematic blocks such as libraries, reference and encyclopedic materials, media, and practical thematic sections.
The library section includes the patristic and church-writer library “Otechnik” (“Отечник”). “Otechnik” is described as a large library of texts by church authors, including materials in patristics, theology, biblical studies, church history, canon law, and related fields.
The portal states that it was founded in 2005.
<doc> metadata: doc.language, doc.author, doc.authyear, doc.status, doc.title, doc.pubyear, doc.source, doc.url, doc.text_id.
<s> (attribute: s.id).
The corpus is annotated within the Universal Dependencies (UD) framework using UDPipe for lemmatization and UPOS tagging, with morphological features in UD FEATS format.
For the tagset and annotation notes, see: corpora.fisun.org/corpus-pages/tagset.html.
Positional attributes: word, lemma, pos, morph, lc.
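As an illustration of how these attributes might be consumed, the sketch below parses one token line and its UD FEATS string. It assumes a tab-separated vertical export with columns in the order listed above; that layout, the sample token, and the helper names `parse_token` and `parse_feats` are hypothetical and are not taken from the corpus documentation.

```python
from typing import NamedTuple


class Token(NamedTuple):
    """One token with the five positional attributes (assumed column order)."""
    word: str
    lemma: str
    pos: str
    morph: str
    lc: str


def parse_feats(morph: str) -> dict[str, str]:
    """Split a UD FEATS string such as 'Case=Nom|Number=Sing' into a dict."""
    if morph in ("", "_"):  # UD writes '_' for an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in morph.split("|"))


def parse_token(line: str) -> Token:
    """Parse one tab-separated token line; raise if the column count is off."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(Token._fields):
        raise ValueError(
            f"expected {len(Token._fields)} columns, got {len(fields)}"
        )
    return Token(*fields)


# Invented sample line, not taken from the corpus.
sample = (
    "Церковь\tцерковь\tNOUN\t"
    "Animacy=Inan|Case=Nom|Gender=Fem|Number=Sing\tцерковь"
)
token = parse_token(sample)
feats = parse_feats(token.morph)
print(token.lemma, token.pos, feats["Case"])
```

Keeping `morph` as a raw string on the token and parsing it on demand avoids building a dictionary for every token when only a few features are needed.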
| Item | Count |
|---|---|
| Tokens | 87,801,535 |
| Words | 67,438,001 |
| Sentences (<s>) | 5,230,177 |
| Documents (<doc>) | 3,273 |
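A few averages follow directly from these counts. The sketch below derives them in Python; the variable names are illustrative, and reading the gap between tokens and words as punctuation and other non-word tokens is an assumption, since the page does not define the two counts.

```python
# Counts copied from the size table.
tokens = 87_801_535
words = 67_438_001
sentences = 5_230_177
documents = 3_273

tokens_per_sentence = tokens / sentences
tokens_per_document = tokens / documents
sentences_per_document = sentences / documents
# Share of tokens not counted as words (assumed: punctuation and similar).
non_word_share = (tokens - words) / tokens

print(f"tokens per sentence:    {tokens_per_sentence:.2f}")
print(f"tokens per document:    {tokens_per_document:.0f}")
print(f"sentences per document: {sentences_per_document:.0f}")
print(f"non-word token share:   {non_word_share:.1%}")
```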
| Attribute | Count |
|---|---|
| word | 1,711,982 |
| lemma | 1,229,630 |
| pos | 17 |
| morph | 556 |
| lc | 1,540,887 |
The corpus defines <doc> with 9 attributes and <s> with 1 attribute (s.id).
| Structure | Attribute | Distinct values |
|---|---|---|
| <doc> | doc.author | 162 |
| <doc> | doc.authyear | 134 |
| <doc> | doc.language | 1 |
| <doc> | doc.pubyear | 135 |
| <doc> | doc.source | 2,823 |
| <doc> | doc.status | 25 |
| <doc> | doc.text_id | 3,043 |
| <doc> | doc.title | 3,273 |
| <doc> | doc.url | 3,043 |
| <s> | s.id | 1 |
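Since doc.text_id has 3,043 distinct values across 3,273 documents, some identifiers are necessarily shared by more than one document. The sketch below shows one way such groups could be found once document metadata is available as a list of dictionaries; the sample records and the function name `shared_values` are invented for illustration and do not reflect the actual export format.

```python
from collections import Counter


def shared_values(records: list[dict[str, str]], attribute: str) -> dict[str, int]:
    """Return attribute values that occur in more than one record, with counts."""
    counts = Counter(record[attribute] for record in records)
    return {value: n for value, n in counts.items() if n > 1}


# Invented metadata records; real values would come from the <doc> attributes.
docs = [
    {"text_id": "t1", "title": "Part 1"},
    {"text_id": "t1", "title": "Part 2"},
    {"text_id": "t2", "title": "Single text"},
]

print(shared_values(docs, "text_id"))

# Documents beyond one per text_id, from the published counts.
surplus = 3_273 - 3_043
print(surplus)
```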
Fisun, Roman. 2025. otechnik_20 / Azbyka: Russian Orthodox Library Corpus. Compiled from azbyka.ru (the “Otechnik” library; authors born in the 20th century; data collected in December 2025). Available at: https://corpora.fisun.org/ (corpus name: otechnik_20). Accessed: <YYYY-MM-DD>.
Access to this corpus is restricted (password-protected) and provided on an “as is” basis for research and educational use only. This service does not grant any licence or other rights to the underlying texts.
The corpus, including any excerpts, downloads, or derived copies of the original texts, is not freely distributable. Reproduction, redistribution, republication, mirroring, or making the content publicly available is prohibited unless you have explicit permission from the respective rights holders and/or the source website(s).
Copyright and any other rights in the original texts remain with the respective source website(s) and/or their authors. Users are solely responsible for ensuring that any use complies with applicable law and the terms of the original sources.
The maintainer makes no warranties regarding completeness, accuracy, fitness for a particular purpose, or continued availability of the service.
name.surname [at] ur.de