IslamRus
IslamRus (islamrus) is a multi-portal Russian corpus of Islamic web discourse.
The data were collected in December 2025 and January 2026; the current index was compiled in January 2026.
Public subcorpora are provided by genre (doc.genre) and by portal (doc.portal).
Sources (portals)
IslamRus combines three analytical layers of Russian Islamic discourse: institutional communication (muftiate register), editorial media writing (news and commentary), and educational/advisory writing (didactic texts and Q&A).
Portal labels correspond to doc.portal.
Portal coverage (current index)
Token coverage is intentionally uneven: the index is built around large media sources and then complemented with smaller educational and institutional components to maximize register diversity.
| Portal | Tokens | Share |
| islamdag.ru | 11,804,986 | 32.8% |
| islam.ru | 11,223,197 | 31.2% |
| islam-today.ru | 7,843,284 | 21.8% |
| umma.ru | 3,413,729 | 9.5% |
| azan.ru | 1,212,972 | 3.4% |
| muftiyatrd.ru | 523,292 | 1.5% |
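The shares in the table follow directly from the token counts. As a minimal Python sketch, assuming the counts above are copied by hand into a dictionary (the names PORTAL_TOKENS and token_shares are illustrative and not part of any corpus tooling), the percentages can be recomputed like this:

```python
# Token counts copied from the portal coverage table (current index).
PORTAL_TOKENS = {
    "islamdag.ru": 11_804_986,
    "islam.ru": 11_223_197,
    "islam-today.ru": 7_843_284,
    "umma.ru": 3_413_729,
    "azan.ru": 1_212_972,
    "muftiyatrd.ru": 523_292,
}


def token_shares(counts):
    """Return each entry's share of the total, as a percentage rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = token_shares(PORTAL_TOKENS)
for portal, share in shares.items():
    print(f"{portal}: {share}%")
```

Because each share is rounded independently, the listed percentages need not sum to exactly 100.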
Reach (where available)
Audience figures are heterogeneous across portals. Where explicit portal metrics or widely used third-party estimates are available, they are listed here as contextual information.
- islam-today.ru: Semrush estimate — 1.51M visits (February 2025 snapshot).
- umma.ru: portal-reported — about 400,000 unique users per month; over 1,000,000 views per month.
- islam.ru: Ahrefs estimate — about 29K monthly organic search traffic (March 2025 snapshot).
islamdag.ru
islamdag.ru is treated as a high-coverage media-and-advisory source with a strong regional anchor in Dagestan.
The portal explicitly frames its work as religious education and as protection from pseudo-religious movements linked to extremist ideology.
islam.ru
islam.ru is treated as a large-scale media portal in the current index.
It contributes substantial news and editorial discourse and functions as one of the backbone sources for mainstream Russian-language Islamic public writing in IslamRus.
islam-today.ru
islam-today.ru is treated as a federal-scale media source combining news reporting with explanatory and advisory genres.
In its positioning statement, the portal links its ideological basis to traditions historically characteristic of Muslims of Tatarstan and Russia.
umma.ru
umma.ru is treated as an author-centered educational source in IslamRus.
The site describes itself as an educational project, states a 1999 launch, and identifies Shamil Alyautdinov as the author of materials.
azan.ru
azan.ru is treated as a structured educational and reference source.
A programmatic portal text frames the project as “traditional Sunni” in creed and madhhab-based in jurisprudence, and presents it as an educational initiative.
muftiyatrd.ru
muftiyatrd.ru is treated as the institutional component of IslamRus.
It represents the public communication of the Muftiate of the Republic of Dagestan and is used to capture a formal, administrative register (announcements, statements, organizational texts).
Public subcorpora
Genre subcorpora support controlled comparisons between editorial discourse, news reporting, and Q&A interaction.
By genre
genre_core: editorial and educational discourse (articles, explanations, commentaries, structured lessons).
genre_news: news reporting and chronicle-like updates.
genre_qa: Q&A and fatwa-style texts (questions and published answers).
By portal
portal_islamdag: islamdag.ru
portal_islam_ru: islam.ru
portal_islam_today: islam-today.ru
portal_umma: umma.ru
portal_azan: azan.ru
portal_muftiyatrd: muftiyatrd.ru
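Each subcorpus name corresponds to one value of doc.genre or doc.portal. The mapping can be written out as a small Python sketch; the dictionaries and the helper subcorpora_for are illustrative names, and the sketch assumes that doc.portal stores the bare domain exactly as listed above.

```python
# doc.genre value -> public genre subcorpus
GENRE_SUBCORPORA = {
    "core": "genre_core",
    "news": "genre_news",
    "qa": "genre_qa",
}

# doc.portal value (assumed to be the bare domain) -> public portal subcorpus
PORTAL_SUBCORPORA = {
    "islamdag.ru": "portal_islamdag",
    "islam.ru": "portal_islam_ru",
    "islam-today.ru": "portal_islam_today",
    "umma.ru": "portal_umma",
    "azan.ru": "portal_azan",
    "muftiyatrd.ru": "portal_muftiyatrd",
}


def subcorpora_for(genre, portal):
    """Return the (genre subcorpus, portal subcorpus) pair a document falls into."""
    return GENRE_SUBCORPORA[genre], PORTAL_SUBCORPORA[portal]


print(subcorpora_for("qa", "umma.ru"))
```

Every document therefore belongs to exactly one genre subcorpus and one portal subcorpus.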
Document structure
Units and segmentation
<doc> is the main document unit (one portal item: article, news entry, or Q&A page).
<s> marks sentence boundaries.
<question> and <answer> are present only in Q&A-type documents.
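To make the segmentation concrete, the Python sketch below parses an invented Q&A document. This page does not specify the on-disk serialization, so the well-formed XML fragment, the attribute spelling (portal, genre), and the nesting of <s> inside <question> and <answer> are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample document; the real serialization and nesting may differ.
SAMPLE = """
<doc portal="umma.ru" genre="qa">
  <question><s>Example question sentence .</s></question>
  <answer><s>First answer sentence .</s><s>Second answer sentence .</s></answer>
</doc>
"""


def summarize(doc_xml):
    """Count sentences and report which Q&A structures one <doc> contains."""
    doc = ET.fromstring(doc_xml.strip())
    return {
        "portal": doc.get("portal"),
        "genre": doc.get("genre"),
        "sentences": len(doc.findall(".//s")),
        "has_question": doc.find("question") is not None,
        "has_answer": doc.find("answer") is not None,
    }


summary = summarize(SAMPLE)
print(summary)
```

For documents in the core and news genres, the same function would report has_question and has_answer as False.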
<doc> metadata fields
Metadata is extracted from source pages and normalized where possible. Field availability differs across portals and sections.
| Field | Meaning |
| doc.text_id | Internal unique identifier (assigned during compilation). |
| doc.url | Source URL of the document. |
| doc.portal | Source portal label (domain-based). |
| doc.genre | Genre label used for public subcorpora: core, news, qa. |
| doc.title | Title/headline (for Q&A sources, the question text may be mapped to title during normalization). |
| doc.author | Author string when present. |
| doc.status | Author role/status descriptor when available (values may be noisy due to heterogeneous sources). |
| doc.rubric | Rubric/section label as defined by the source site. |
| doc.pubdate | Publication date when available. |
| doc.pubyear | Publication year. |
| doc.language | Language label when available. |
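Since field availability differs across portals, code that consumes the metadata should treat most fields as optional. The following is a minimal Python sketch of such a record. The class name DocMeta and the helper from_attrs are hypothetical, and treating text_id, url, portal, and genre as always present is an assumption rather than a documented guarantee.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DocMeta:
    # Assumed to be always present (not guaranteed by the documentation).
    text_id: str
    url: str
    portal: str
    genre: str
    # Source-dependent fields; None when the portal does not provide them.
    title: Optional[str] = None
    author: Optional[str] = None
    status: Optional[str] = None
    rubric: Optional[str] = None
    pubdate: Optional[str] = None
    pubyear: Optional[str] = None
    language: Optional[str] = None

    @classmethod
    def from_attrs(cls, attrs):
        """Build a record from a raw attribute dict, mapping empty strings to None
        and ignoring keys that are not listed in the metadata table."""
        known = {f.name for f in fields(cls)}
        cleaned = {k: (v or None) for k, v in attrs.items() if k in known}
        return cls(**cleaned)


# Hypothetical attribute values, for illustration only.
meta = DocMeta.from_attrs({
    "text_id": "x1",
    "url": "https://example.org/a",
    "portal": "azan.ru",
    "genre": "core",
    "author": "",
    "extra": "ignored",
})
print(meta)
```

Keeping pubdate and pubyear as strings avoids guessing at a date format that the sources may not share.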
Linguistic annotation
The corpus is annotated in the Universal Dependencies (UD) framework using UDPipe (lemmatization, UPOS tagging, and UD FEATS).
Tagset and annotation notes:
https://corpora.fisun.org/corpus-pages/tagset.html
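In the UD convention, FEATS is a single string of Name=Value pairs separated by "|", with "_" standing for an empty feature set. How the corpus interface exposes these values is described in the tagset notes, not here, so the Python sketch below only illustrates the standard UD string format. The function name parse_feats and the example feature bundle are illustrative.

```python
def parse_feats(feats):
    """Parse a UD FEATS string such as 'Case=Nom|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Illustrative feature bundle of the kind UD assigns to a Russian noun.
example = parse_feats("Animacy=Inan|Case=Nom|Gender=Fem|Number=Sing")
print(example["Case"])
```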
Size (current index)
| Tokens | 36,021,460 |
| Documents (<doc>) | 76,385 |
| Sentences (<s>) | 1,971,787 |
Minimal corpus profile
IslamRus is dominated by editorial and educational discourse (genre_core), while news and Q&A form smaller but analytically distinct registers.
In the current index, core accounts for 72.3% of tokens, news for 21.2%, and Q&A for 6.6%.
Subcorpus sizes (tokens)
| Subcorpus | Tokens | Share |
| genre_core | 26,035,268 | 72.3% |
| genre_news | 7,622,303 | 21.2% |
| genre_qa | 2,363,889 | 6.6% |
How to cite
Fisun, Roman. 2026. IslamRus: Multi-portal Russian Islamic web discourse corpus.
Compiled from islamdag.ru, islam.ru, islam-today.ru, umma.ru, azan.ru, muftiyatrd.ru (multi-genre; core sections, news, and Q&A; data collected in December 2025 and January 2026).
Available at: https://corpora.fisun.org/ (corpus name: islamrus). Accessed: <YYYY-MM-DD>.
Software
Terms of use
Access to this corpus is restricted (password-protected) and provided on an “as is” basis for research and educational use only. This service does not grant any license or other rights to the underlying texts.
The corpus (including any excerpts, downloads, or derived copies of the original texts) is not freely distributable. Reproduction, redistribution, republication, mirroring, or making the content publicly available is prohibited unless you have explicit permission from the respective rights holders and/or the source websites.
Copyright and any other rights in the original texts remain with the respective source websites and/or their authors. Users are solely responsible for ensuring that any use complies with applicable law and the terms of the original sources.
The maintainer makes no warranties regarding completeness, accuracy, fitness for a particular purpose, or continued availability of the service.
Contact
Maintainer: roman.fisun@ur.de