Islam-today is a Russian-language corpus compiled from the Islamic portal Islam-today.ru, excluding Q&A content. The data were collected in January 2026.
In its self-description, Islam-today.ru states that the ideological basis of the website is rooted in Islamic traditions historically characteristic of Muslims of Tatarstan and Russia.
The portal describes itself as a federal information-and-analytical project opened in 2012, and presents “openness, speed, and objectivity” as its journalistic motto.
The site also features content and author sections linked to Tatarstan’s Islamic institutions; for example, it maintains a dedicated blog section for Kamil khazrat Samigullin (Mufti and Chairman of the Spiritual Administration of Muslims of the Republic of Tatarstan, DUM RT).
At the time of data collection (Jan 2026), the portal navigation included (among others) the following major section groups:
<doc> metadata:
doc.portal, doc.language, doc.author, doc.status,
doc.title, doc.pubdate, doc.pubyear, doc.source,
doc.url, doc.rubric, doc.text_id.
<s>.
The corpus is annotated in the Universal Dependencies (UD) framework using UDPipe (lemmatization and UPOS tagging; morphological features in UD FEATS format).
The tagger/lemmatizer model is based on UD Russian SynTagRus.
Tagset and annotation notes: corpora.fisun.org/corpus-pages/tagset.html.
Positional attributes: id, word, lemma, pos, morph, head, deprel, and dynamic lc (plus lemma_lc, if enabled in the registry).
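As a rough illustration of how these positional attributes might be consumed outside the corpus interface, the sketch below parses one token line. It assumes a tab-separated vertical file whose columns follow the order listed above; that layout, the `Token` class, and the `parse_token_line` helper are assumptions made for this example, not a documented export format of the corpus. The `lc` property mimics a dynamic attribute by deriving the lowercased form from `word` instead of storing it.

```python
from typing import NamedTuple, Optional


class Token(NamedTuple):
    """One token line; field order is assumed to match the attribute list."""
    id: str
    word: str
    lemma: str
    pos: str
    morph: str
    head: str
    deprel: str

    @property
    def lc(self) -> str:
        # Derived on the fly from `word`, like a dynamic attribute.
        return self.word.lower()


def parse_token_line(line: str) -> Optional[Token]:
    """Return a Token for a token line, or None for blank lines and tags.

    Simplification: any line starting with "<" is treated as a structure
    tag such as <doc ...> or <s>, so a literal "<" token would be skipped.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("<"):
        return None
    fields = line.split("\t")
    if len(fields) != len(Token._fields):
        raise ValueError(f"expected {len(Token._fields)} columns, got {len(fields)}")
    return Token(*fields)


if __name__ == "__main__":
    # Hypothetical sample line, not taken from the corpus.
    sample = "1\tВ\tв\tADP\t_\t2\tcase\n"
    tok = parse_token_line(sample)
    print(tok.word, tok.lc, tok.pos, tok.deprel)
```

All values are kept as strings so that placeholder values such as `_` in the `morph` column pass through unchanged.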
| Item | Count |
|---|---|
| Tokens | 7,170,001 |
| Words | 5,777,586 |
| Sentences (<s>) | 390,995 |
| Documents (<doc>) | 9,498 |
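The counts above allow a few quick derived figures. The short sketch below computes average sentence and document length and the share of word tokens; the variable names are illustrative, and reading the gap between tokens and words as punctuation and other non-word tokens is an assumption, not something the table states.

```python
# Corpus size figures copied from the table above.
counts = {
    "tokens": 7_170_001,
    "words": 5_777_586,
    "sentences": 390_995,
    "documents": 9_498,
}

# Average lengths, measured in tokens.
tokens_per_sentence = counts["tokens"] / counts["sentences"]
tokens_per_document = counts["tokens"] / counts["documents"]

# Share of tokens counted as words (the remainder is presumably
# punctuation and other non-word tokens; this is an assumption).
word_share = counts["words"] / counts["tokens"]

print(f"tokens per sentence: {tokens_per_sentence:.2f}")
print(f"tokens per document: {tokens_per_document:.1f}")
print(f"word share of tokens: {word_share:.3f}")
```

On these numbers, a sentence averages roughly 18 tokens and a document roughly 755 tokens.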
| Attribute | Count |
|---|---|
| id | 249 |
| word | 326,702 |
| lemma | 174,873 |
| pos | 17 |
| morph | 551 |
| head | 237 |
| deprel | 38 |
| lc | 297,013 |
| lemma_lc | 163,016 |
The corpus defines two structures: <doc>, with 11 attributes, and <s>.
| Structure | Attribute | Distinct values |
|---|---|---|
| <doc> | doc.author | 1,685 |
| <doc> | doc.language | 1 |
| <doc> | doc.portal | 1 |
| <doc> | doc.pubdate | 3,332 |
| <doc> | doc.pubyear | 16 |
| <doc> | doc.rubric | 45 |
| <doc> | doc.source | 1 |
| <doc> | doc.status | 13 |
| <doc> | doc.text_id | 9,498 |
| <doc> | doc.title | 9,415 |
| <doc> | doc.url | 9,498 |
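For readers working with an exported vertical file, the <doc> attributes counted above could be read from each opening tag roughly as follows. This is a minimal sketch that assumes attributes are written as `name="value"` pairs on a single `<doc ...>` line with no escaped quotes inside values; the `parse_doc_tag` helper and the sample tag values are hypothetical and not taken from the corpus.

```python
import re

# Matches name="value" pairs; assumes values contain no double quotes.
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_doc_tag(line: str) -> dict[str, str]:
    """Extract attribute/value pairs from an opening <doc ...> tag."""
    line = line.strip()
    if not line.startswith("<doc"):
        raise ValueError("not an opening <doc> tag")
    return dict(ATTR_RE.findall(line))


if __name__ == "__main__":
    # Placeholder values for illustration only.
    tag = '<doc text_id="EXAMPLE-ID" pubyear="2015" rubric="EXAMPLE-RUBRIC">'
    meta = parse_doc_tag(tag)
    print(meta["text_id"], meta["pubyear"], meta["rubric"])
```

Since doc.text_id and doc.url each have as many distinct values as there are documents (9,498), either would serve as a dictionary key for per-document metadata, whereas doc.title (9,415 distinct values) would not.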
Fisun, Roman. 2026. Islam-today: Russian Islamic portal corpus. Compiled from Islam-today.ru (data collected in January 2026; Q&A excluded).
Available at: https://corpora.fisun.org/ (corpus name: islam-today). Accessed: <YYYY-MM-DD>.
Access to this corpus is restricted (password-protected) and provided on an “as is” basis for research and educational use only. This service does not grant any license or other rights to the underlying texts.
The corpus (including any excerpts, downloads, or derived copies of the original texts) is not freely distributable. Reproduction, redistribution, republication, mirroring, or making the content publicly available is prohibited unless you have explicit permission from the respective rights holders and/or the source website.
Copyright and any other rights in the original texts remain with the respective source website and/or their authors. Users are solely responsible for ensuring that any use complies with applicable law and the terms of the original source.
The maintainer makes no warranties regarding completeness, accuracy, fitness for a particular purpose, or continued availability of the service.
Maintainer: roman.fisun@ur.de