Islam-today

Islam-today is a Russian corpus compiled from the Islamic portal islam-today.ru, excluding Q&A content. Data was collected in January 2026.

Source

About the portal

According to its self-description, islam-today.ru is rooted in Islamic traditions historically characteristic of Muslims in Tatarstan and Russia.

The portal describes itself as a federal information and analytical project launched in 2012, with the journalistic motto “openness, speed, and objectivity”.

The site features content and author sections linked to Tatarstan’s Islamic institutions, including a dedicated blog section for Kamil khazrat Samigullin (Mufti and Chairman of the Spiritual Administration of Muslims of the Republic of Tatarstan, DUM RT).

Sections

At the time of data collection (January 2026), the portal navigation included the following major section groups (among others):

Scope and exclusions

Document structure

Linguistic annotation

The corpus is annotated within the Universal Dependencies (UD) framework using UDPipe for lemmatization and UPOS tagging, with morphological features in UD FEATS format.

The tagger and lemmatizer model is based on UD Russian SynTagRus.

For the tagset and annotation notes, see: corpora.fisun.org/corpus-pages/tagset.html.
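
Since the morph attribute holds values in UD FEATS format, a value can be split into feature/value pairs with a few lines of Python. The following is a minimal sketch, not part of the corpus tooling: the function name parse_feats and the example string are hypothetical, and it assumes the standard UD conventions (pairs joined by "|", name and value joined by "=", and "_" for an empty feature set).

```python
def parse_feats(feats: str) -> dict[str, str]:
    """Split a UD FEATS string such as 'Case=Nom|Number=Sing' into a dict.

    Assumes standard UD conventions: '|' between pairs, '=' inside a pair,
    and '_' (or an empty string) when no features are present.
    """
    if not feats or feats == "_":
        return {}
    result = {}
    for item in feats.split("|"):
        name, _, value = item.partition("=")
        result[name] = value
    return result


# Hypothetical example value, not taken from the corpus.
example = "Animacy=Inan|Case=Nom|Gender=Masc|Number=Sing"
features = parse_feats(example)
print(features["Case"], features["Number"])
```

Keeping the result as a plain dictionary makes it easy to filter tokens by a single feature, for example all tokens whose Case is Nom.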

Corpus attributes

Positional attributes: id, word, lemma, pos, morph, head, deprel, and the dynamic attribute lc (plus lemma_lc, if enabled in the registry).
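
To illustrate how these attributes relate to one another, the sketch below reads one token row into a small record. Several things here are assumptions rather than documented facts about this corpus: that a token is stored as one tab-separated line, that the columns follow the order listed above, and that lc is simply the lowercased word form. The names Token and parse_token_line and the sample row are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    # Assumed column order, following the attribute list above.
    id: str
    word: str
    lemma: str
    pos: str
    morph: str
    head: str
    deprel: str

    @property
    def lc(self) -> str:
        # Assumption: the dynamic attribute lc is the lowercased word form.
        return self.word.lower()


def parse_token_line(line: str) -> Token:
    """Parse one tab-separated token row into a Token (assumed layout)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}")
    return Token(*fields)


# Invented sample row, not taken from the corpus.
row = "1\tМечеть\tмечеть\tNOUN\tAnimacy=Inan|Case=Nom|Gender=Fem|Number=Sing\t2\tnsubj"
token = parse_token_line(row)
print(token.word, token.lc, token.pos, token.deprel)
```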

Size

Item                Count
Tokens              7,170,001
Words               5,777,586
Sentences (<s>)     390,995
Documents (<doc>)   9,498
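
The counts above can be turned into a few rough averages. The short sketch below only does arithmetic on the published figures; the variable names are arbitrary, and the averages say nothing about how lengths are distributed across documents.

```python
# Counts copied from the size table.
TOKENS = 7_170_001
WORDS = 5_777_586
SENTENCES = 390_995
DOCUMENTS = 9_498

tokens_per_sentence = TOKENS / SENTENCES
tokens_per_document = TOKENS / DOCUMENTS
sentences_per_document = SENTENCES / DOCUMENTS
word_share = WORDS / TOKENS  # words as a share of all tokens

print(f"Tokens per sentence:    {tokens_per_sentence:.1f}")
print(f"Tokens per document:    {tokens_per_document:.1f}")
print(f"Sentences per document: {sentences_per_document:.1f}")
print(f"Words as share of tokens: {word_share:.1%}")
```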

Lexicon sizes

Attribute   Count
id          249
word        326,702
lemma       174,873
pos         17
morph       551
head        237
deprel      38
lc          297,013
lemma_lc    163,016
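
Comparing the cased and lowercased lexicons shows how much case folding shrinks the vocabulary. The sketch below is plain arithmetic on the figures above; the dictionary and function names are illustrative only.

```python
# Lexicon sizes copied from the table above.
LEXICON = {
    "word": 326_702,
    "lc": 297_013,
    "lemma": 174_873,
    "lemma_lc": 163_016,
}


def folding_reduction(cased: int, folded: int) -> float:
    """Share of distinct values that disappear after lowercasing."""
    return (cased - folded) / cased


word_reduction = folding_reduction(LEXICON["word"], LEXICON["lc"])
lemma_reduction = folding_reduction(LEXICON["lemma"], LEXICON["lemma_lc"])

print(f"word -> lc:        {word_reduction:.1%} fewer distinct values")
print(f"lemma -> lemma_lc: {lemma_reduction:.1%} fewer distinct values")
```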

Metadata inventory

The corpus defines <doc> with 11 attributes.

Structure   Attribute      Distinct values
<doc>       doc.author     1,685
<doc>       doc.language   1
<doc>       doc.portal     1
<doc>       doc.pubdate    3,332
<doc>       doc.pubyear    16
<doc>       doc.rubric     45
<doc>       doc.source     1
<doc>       doc.status     13
<doc>       doc.text_id    9,498
<doc>       doc.title      9,415
<doc>       doc.url        9,498

How to cite

Fisun, Roman. 2026. Islam-today: Russian Islamic Portal Corpus. Compiled from islam-today.ru (data collected in January 2026; Q&A content excluded). Available at: https://corpora.fisun.org/ (corpus name: islam-today). Accessed: <YYYY-MM-DD>.

Software

Terms of use

Access to this corpus is restricted (password-protected) and provided on an “as is” basis for research and educational use only. This service does not grant any licence or other rights to the underlying texts.

The corpus, including any excerpts, downloads, or derived copies of the original texts, is not freely distributable. Reproduction, redistribution, republication, mirroring, or making the content publicly available is prohibited unless you have explicit permission from the respective rights holders and/or the source website.

Copyright and any other rights in the original texts remain with the respective source website and/or their authors. Users are solely responsible for ensuring that any use complies with applicable law and the terms of the original source.

The maintainer makes no warranties regarding completeness, accuracy, fitness for a particular purpose, or continued availability of the service.

Contact

name.surname [at] ur.de