otechnik_20/Azbyka

otechnik_20/Azbyka is a Russian-language corpus compiled from the online library (“Otechnik”) of the Russian Orthodox portal Azbyka (azbyka.ru). It contains only texts by authors born in the 20th century. The data were collected in December 2025.

Source

About the portal (azbyka.ru)

Azbyka.ru (“Азбука веры”) is a large Russian Orthodox educational portal that describes itself as a volunteer-driven project, funded by reader donations via a nonprofit foundation.

Popularity and reach

The project reports a monthly audience of more than 8 million unique readers. It also reports about 100 TB of data downloaded per month across the portal.

Organization and editorial model

The portal states that more than 700 volunteers contribute to the project and that the team is distributed across different cities and countries (no single office). The portal states that its budget consists exclusively of reader donations, managed via the nonprofit “Azbyka Very” foundation.

Rubrics and content sections

The portal is structured as a multi-section resource (a “tree” of projects), including major thematic blocks such as libraries, reference/encyclopedic materials, media, and practical thematic sections.

The library section includes the patristic and church-writer library “Otechnik” (“Отечник”). The portal describes “Otechnik” as a large library of texts by church authors (patristics, theology, biblical studies, church history, canon law, etc.).

Online since

The portal states it was founded in 2005.

Significance

Document structure

Linguistic annotation

The corpus is annotated in the Universal Dependencies (UD) framework using UDPipe (lemmatization and UPOS tagging; morphological features in UD FEATS format).

Tagset and annotation notes: corpora.fisun.org/corpus-pages/tagset.html.
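
As a rough illustration of the UD FEATS format mentioned above, the sketch below splits a FEATS string into feature/value pairs. The function name parse_feats and the example string are illustrative and not taken from the corpus tooling. The sketch assumes that values of the morph attribute follow the standard UD layout (Name=Value pairs joined by "|", with "_" for an empty feature set).

```python
def parse_feats(feats: str) -> dict[str, str]:
    """Split a UD FEATS string such as 'Case=Nom|Number=Sing' into a dict.

    Assumption: the string uses the standard UD layout; '_' or an empty
    string means that no features are set.
    """
    if feats in ("", "_"):
        return {}
    # Each pair looks like 'Name=Value'; split on the first '=' only.
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical FEATS value for a feminine singular nominative noun.
example = parse_feats("Case=Nom|Gender=Fem|Number=Sing")
print(example)
```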

Corpus attributes

Positional attributes: word, lemma, pos, morph, lc.
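
The positional attributes can be combined in token-level queries. The sketch below builds a CQL-style token expression from attribute/value pairs. It assumes the corpus is searched with a CQL-compatible engine (as in CWB/CQP or NoSketch Engine), which this page does not state explicitly. The helper name cql_token and the example values are hypothetical.

```python
# Attribute names as listed on this page.
POSITIONAL_ATTRIBUTES = {"word", "lemma", "pos", "morph", "lc"}


def cql_token(**constraints: str) -> str:
    """Build a single-token CQL expression, e.g. [lemma="x" & pos="NOUN"].

    Values are inserted as-is; in CQL they are treated as regular
    expressions, so special characters are not escaped here.
    """
    unknown = set(constraints) - POSITIONAL_ATTRIBUTES
    if unknown:
        raise ValueError(f"unknown attribute(s): {sorted(unknown)}")
    parts = [f'{attr}="{value}"' for attr, value in constraints.items()]
    return "[" + " & ".join(parts) + "]"


# Hypothetical query: all noun tokens with the lemma 'молитва'.
query = cql_token(lemma="молитва", pos="NOUN")
print(query)
```

Restricting the helper to the five listed attributes catches misspelled attribute names before a query is sent to the server.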

Size (current index)

Item               Count
Tokens             87,801,535
Words              67,438,001
Sentences (<s>)    5,230,177
Documents (<doc>)  3,273
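
A few derived figures help put these counts in perspective. The sketch below computes average lengths and the share of word tokens from the numbers in the table. The variable names are illustrative, and the reading of "Words" as tokens excluding punctuation and similar non-word items is an assumption that this page does not confirm.

```python
# Counts copied from the size table above.
tokens = 87_801_535
words = 67_438_001
sentences = 5_230_177
documents = 3_273

tokens_per_sentence = tokens / sentences   # average sentence length
tokens_per_document = tokens / documents   # average document length
word_share = words / tokens                # fraction of tokens counted as words

print(f"tokens per sentence: {tokens_per_sentence:.2f}")
print(f"tokens per document: {tokens_per_document:.0f}")
print(f"share of words:      {word_share:.3f}")
```

On these counts the average sentence has roughly 16.8 tokens, and about 77% of all tokens are counted as words.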

Lexicon sizes

Attribute  Count
word       1,711,982
lemma      1,229,630
pos        17
morph      556
lc         1,540,887

Text types (metadata inventory)

The corpus defines <doc> with 9 attributes and <s> with 1 attribute (s.id).

Structure  Attribute     Distinct values
<doc>      doc.author    162
<doc>      doc.authyear  134
<doc>      doc.language  1
<doc>      doc.pubyear   135
<doc>      doc.source    2,823
<doc>      doc.status    25
<doc>      doc.text_id   3,043
<doc>      doc.title     3,273
<doc>      doc.url       3,043
<s>        s.id          1

How to cite

Fisun, Roman. 2025. otechnik_20/Azbyka: Russian Orthodox library corpus. Compiled from azbyka.ru (“Otechnik” library; authors born in the 20th century; data collected in December 2025). Available at: https://corpora.fisun.org/ (corpus name: otechnik_20). Accessed: <YYYY-MM-DD>.

Software

Terms of use

Access to this corpus is restricted (password-protected) and provided on an “as is” basis for research and educational use only. This service does not grant any license or other rights to the underlying texts.

The corpus (including any excerpts, downloads, or derived copies of the original texts) is not freely distributable. Reproduction, redistribution, republication, mirroring, or making the content publicly available is prohibited unless you have explicit permission from the respective rights holders and/or the source website(s).

Copyright and any other rights in the original texts remain with the respective source website(s) and/or their authors. Users are solely responsible for ensuring that any use complies with applicable law and the terms of the original sources.

The maintainer makes no warranties regarding completeness, accuracy, fitness for a particular purpose, or continued availability of the service.

Contact

Maintainer: roman.fisun@ur.de