The paper presents an approach to estimating the ultimate (upper-bound) number of papers that Russian authors submit to scientific journals. The approach rests on three elements: an estimate of the entire audience interested in publishing in scientific journals; the assumption that each author conducts research and writes a paper single-handedly; and a co-authorship indicator that accounts for the involvement of colleagues in the author's research. Data from Rosstat, the Higher School of Economics and the Scientific Electronic Library are used to identify all authors interested in publication. This figure is then multiplied by the co-authorship coefficient. The number of papers an author publishes per year is treated as a random variable, which serves as the basis for a probability distribution of the ultimate number of papers across all authors. The distribution is built by the Monte Carlo method and analysed with the tools of probability theory and linear algebra. The results are compared with data from the Scientific Electronic Library. The work shows that, when the annual number of papers exceeds one, the distribution of the ultimate number of papers from the Russian audience can be approximated by a normal distribution whose parameters are all determined by the maximum annual number of papers, the audience size and the co-authorship indicator. From this, the ultimate number of papers can be obtained for any breakdown (a group of disciplines or a particular discipline). Examples of finding the distribution are given. The results can be used to adjust the research policy of an organization or university in the areas studied and can serve as a guide to the number of publications needed in various fields of science.

Launching any journal, not only a scientific one, requires two components to be brought into agreement:

The second component has been much studied. By contrast, an assessment of the currently available audience of readers and authors can have catastrophic implications for a journal: it may reveal an audience so small that the journal will lose out to competing publications on the same or similar topics. This research deals with the audience of authors only. The audience of readers, though much wider, may be interested in the publication of scientific papers but has no need to publish.

Knowing the ultimate reachable number of papers flowing into scientific journals makes it possible to compare the current level with the potential one. An understanding of the potentially reachable ultimate number of papers also makes it possible to apply methods for controlling the flow of papers arriving at a journal [

Publication activity evidently depends on two factors: the size of the authors' audience and the number of papers required within a given period. The first factor can be estimated deterministically, whereas the second follows a probability distribution and has to be treated with probabilistic methods [

It should be noted that publication activity has been studied before [6–9], but the question of the ultimate activity, that is, an upper-bound estimate of what is potentially reachable, has not yet been addressed in the literature.

This research aims to determine the ultimate publication activity of authors, measured as the number of papers submitted to scientific journals.

The research carried out for a scientific and technical journal and described in paper [

In addition, paper [

where $K_{\text{field}}$ is the ratio of the number of researchers in the particular field of science to the number of researchers in all sciences;

$N_{\text{number of disciplines}}$ is the number of disciplines in which the journal can publish papers (can be taken from the current HAC list);

$N_{\text{total number of disciplines}}$ is the total number of disciplines in the field;

$N_{\text{number of research fellows}}$, $N_{\text{number of university professors}}$, $N_{\text{total number of post-graduate students}}$ are the total numbers of research fellows, university professors and post-graduate students according to the Rosstat and HSE data.

The $K_{\text{field}}$ coefficient, which shows the share of researchers in each field, can be obtained from Rosstat data for all researchers in Russia (see Table 1).

It should be noted that the $K_{\text{field}}$ coefficient for post-graduate students can also be calculated from Rosstat data, and its values differ from those for researchers (see Table 2).

Rosstat has no data related to university professors, which is why we will use those which Higher School of Economics [

The data from Tables 1–3 can now be used to estimate the audience of scientific journals by field of science. To do this, we use the HAC data on disciplines. At present, the order of the Ministry of Education and Science of Russia No. 118 of 21 February 2021 lists 351 of them [

Assuming that the audience is distributed uniformly across disciplines, we can estimate the audience of a scientific journal for a single discipline. The data are given in Table 4.
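The per-discipline estimate can be sketched as a simple division of each field's head count by the number of disciplines in that field. The snippet below is an illustration only: the dictionary `FIELDS` and the function `audience_per_discipline` are hypothetical names, the head counts are copied from Tables 1–3 and the discipline counts from Table 4, and the rounding convention (each group rounded separately, the total rounded from the unrounded sum) is an assumption chosen because it matches the figures in Table 4.

```python
# Head counts per field from Tables 1-3 and discipline counts from Table 4.
# Tuple layout: (disciplines, post-graduate students, researchers, professors)
FIELDS = {
    "Humanitarian": (47, 3510, 12326, 10223),
    "Natural": (96, 14918, 80966, 25287),
    "Medical": (52, 7898, 14584, 16944),
    "Social": (27, 36855, 20076, 108560),
    "Agricultural": (17, 3510, 14584, 9524),
    "Engineering": (112, 21938, 208994, 66741),
}


def audience_per_discipline(field):
    """Return (post-graduates, researchers, professors, total) for one discipline.

    Assumes the audience of a field is spread evenly over its disciplines.
    """
    disciplines, *groups = FIELDS[field]
    per_group = [count / disciplines for count in groups]
    # Each group is rounded for display; the total is rounded from the exact sum.
    return tuple(round(x) for x in per_group) + (round(sum(per_group)),)


for name in FIELDS:
    print(name, audience_per_discipline(name))
```

Under the uniformity assumption the discipline counts add up to the 351 disciplines of the HAC list, so no author is counted twice.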

It should be noted that the border between social and humanitarian sciences is more a matter of convention than that between engineering and natural sciences, which is why estimates of the audience potential for these fields may diverge widely.

The data from Table 4 allow for:

Data from Tables 1–4 can be used to estimate the ultimate number of scientific papers from the Russian audience. Suppose that every post-graduate student, researcher and university professor is to publish one paper per year. Suppose also, as a limiting case, that none of them has co-authors. Then the ultimate flow of papers that scientific journals receive from Russian authors is as presented in Table 5.

Table 5 shows that papers by Russian scientific authors can potentially amount to 677,000 per year. Allowing for a potential non-target audience, this figure may be increased by 30 % (up to 1 million papers per year). However, Table 5 assumes only one paper per year for each type of audience. In current Russian practice, the required number of papers is either set by an organization's local administrative documents or not regulated at all.
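The arithmetic behind these figures can be checked with a short sketch. The names `AUDIENCE`, `PAPERS_PER_AUTHOR` and `NON_TARGET_SHARE` are illustrative; the head counts are taken from Tables 1–3, and the 30 % uplift is applied as a plain multiplication, which is an assumption about how the non-target audience enters the estimate.

```python
# Head counts per field (Tables 1-3): post-graduate students, researchers, professors.
AUDIENCE = {
    "Humanitarian": (3510, 12326, 10223),
    "Natural": (14918, 80966, 25287),
    "Medical": (7898, 14584, 16944),
    "Social": (36855, 20076, 108560),
    "Agricultural": (3510, 14584, 9524),
    "Engineering": (21938, 208994, 66741),
}

PAPERS_PER_AUTHOR = 1    # limiting case: one paper per author, no co-authors
NON_TARGET_SHARE = 0.30  # assumed uplift for the non-target audience

# Papers per field and overall, as in Table 5.
per_field = {field: sum(groups) * PAPERS_PER_AUTHOR for field, groups in AUDIENCE.items()}
total = sum(per_field.values())
with_non_target = total * (1 + NON_TARGET_SHARE)

for field, papers in per_field.items():
    print(f"{field:13s} {papers:>8d}")
print(f"Total         {total:>8d}")
print(f"With non-target audience: {with_non_target:,.0f}")
```

With these inputs the uplifted figure comes to roughly 0.88 million, which the text rounds up to 1 million papers per year.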

It must be noted that science is not developed by individuals working alone, which explains why most papers are published with co-authors. To estimate the co-authorship coefficient, the first 500 journals of the 2020 Science Index rating were analysed. The rating includes 4249 journals in total, so the error in the average number of authors per paper written in 2020 is 4 % at the 0.05 significance level [
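The text does not state which formula produced the 4 % figure. One textbook expression that lands close to it is the margin of error of a simple random sample with a finite population correction, taken at the most conservative share p = 0.5 and z = 1.96 for the 0.05 significance level. The sketch below uses that expression; the function name `margin_of_error`, the choice of p and the treatment of the first 500 journals as a random sample are all assumptions made for illustration.

```python
import math


def margin_of_error(sample, population, z=1.96, p=0.5):
    """Margin of error of a simple random sample drawn from a finite population.

    z = 1.96 corresponds to the 0.05 significance level; p = 0.5 is the most
    conservative (largest-variance) assumption.
    """
    standard_error = math.sqrt(p * (1 - p) / sample)
    # Finite population correction: the sample is a sizeable part of the rating.
    fpc = math.sqrt((population - sample) / (population - 1))
    return z * standard_error * fpc


# 500 journals analysed out of the 4249 journals in the rating.
print(f"{margin_of_error(500, 4249):.1%}")
```

For 500 journals out of 4249 this gives a value of about 4 %, in line with the error quoted above.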

Table 1. Distribution of researchers in various scientific fields, according to Rosstat data (for the year 2020)

Field of science | Quantity | $K_{\text{field}}$, % |

Humanitarian | 12 326 | 4 |

Natural | 80 966 | 23 |

Medical | 14 584 | 4 |

Social | 20 076 | 6 |

Agricultural | 14 584 | 4 |

Engineering | 208 994 | 60 |

Total | 351 530 | 100 |

Table 2. Distribution of post-graduate students in various scientific fields, according to Rosstat data (for the year 2020)

Field of science | Quantity | $K_{\text{field}}$, % |

Humanitarian | 3 510 | 4 |

Natural | 14 918 | 17 |

Medical | 7 898 | 9 |

Social | 36 855 | 42 |

Agricultural | 3 510 | 4 |

Engineering | 21 938 | 25 |

Total | 88 629 | 100 |

Table 3. Distribution of university professors in various scientific fields, according to HSE data and calculations of the author (for 2019/20 academic year)

Field of science | Quantity | $K_{\text{field}}$, % |

Humanitarian | 10 223 | 4 |

Natural | 25 287 | 11 |

Medical | 16 944 | 7 |

Social | 108 560 | 46 |

Agricultural | 9 524 | 4 |

Engineering | 66 741 | 28 |

Total | 237 279 | 100 |

Table 4. Authors potentially interested in the journal and representing one HAC discipline

Field of science | Number of disciplines | Audience per discipline: post-graduate students | Audience per discipline: researchers | Audience per discipline: university professors | Total |

Humanitarian | 47 | 75 | 262 | 218 | 554 |

Natural | 96 | 155 | 843 | 263 | 1 262 |

Medical | 52 | 152 | 280 | 326 | 758 |

Social | 27 | 1 365 | 744 | 4 021 | 6 129 |

Agricultural | 17 | 206 | 858 | 560 | 1 625 |

Engineering | 112 | 196 | 1 866 | 596 | 2 658 |

Table 5. Potential scope of papers in terms of audience per year

Field of science | Papers from post-graduate students | Papers from researchers | Papers from university professors | Total |

Humanitarian | 3 510 | 12 326 | 10 223 | 26 059 |

Natural | 14 918 | 80 966 | 25 287 | 121 171 |

Medical | 7 898 | 14 584 | 16 944 | 39 426 |

Social | 36 855 | 20 076 | 108 560 | 165 491 |

Agricultural | 3 510 | 14 584 | 9 524 | 27 618 |

Engineering | 21 938 | 208 994 | 66 741 | 297 673 |

Total | | | | 677 438 |

Table 6. Field of science, branch of science and average number of authors per paper

Field of science | Branch of science | Average number of authors per paper |

Humanitarian | History | 1.5 |

Humanitarian | Literature | 1.35 |

Humanitarian | Science of science | 1.9 |

Humanitarian | Political science | 1.49 |

Humanitarian | Psychology | 2.28 |

Humanitarian | Philosophy | 1.63 |

Humanitarian | Linguistics | 1.37 |

Humanitarian | Average | 1.65 |

Natural | Astronomy | 3.2 |

Natural | Biology | 4 |

Natural | Geography | 2.98 |

Natural | Geology | 3.35 |

Natural | Geophysics | 3.17 |

Natural | Mathematics | 2.11 |

Natural | Mechanics | 2.4 |

Natural | Physics | 4.02 |

Natural | Chemistry | 4.45 |

Natural | Ecology | 3.5 |

Natural | Average | 3.32 |

Medical | Medicine | 4.40 |

Social | Domestic trade | 2 |

Social | State and law. Legal science | 1.36 |

Social | Demography | 1.7 |

Social | Comprehensive studies of countries and regions | 2.05 |

Social | Education | 2.7 |

Social | Pedagogics | 1.92 |

Social | Sociology | 1.58 |

Social | Economics | 1.82 |

Social | Average | 1.89 |

Agricultural | Agriculture | 2.91 |

Engineering | Automatics. Computer facilities | 2.55 |

Engineering | Informatics | 2.35 |

Engineering | Engineering | 3.16 |

Engineering | Metallurgical sector | 3.7 |

Engineering | Food industry | 2.68 |

Engineering | Communication | 2.4 |

Engineering | Construction. Architecture | 2.78 |

Engineering | Electronics. Radio engineering | 4.2 |

Engineering | Electrical engineering | 2.8 |

Engineering | Power engineering | 3.7 |

Engineering | Average | 3.03 |

- | Multidisciplinary | 2.07 |

Average for all branches* | | 2.82 |

Table 6 shows that the average number of authors per paper varies across fields of knowledge. For instance, it is 4.4 for medicine, close to 2 for social sciences (economics, law, political science, sociology, etc.), and slightly more than 3 for natural sciences (chemistry, physics, biology).

Once the audience size is known, the authors' publication activity can be assessed as follows. Assume that each kind of audience has a quota for the annual number of papers, for example up to two papers per year. Suppose that in the humanitarian sciences a post-graduate student is to write one paper and a university professor two, while in the engineering sciences both a post-graduate student and a university professor are to write two papers each. The ultimate value will then be a distribution that depends on how many papers each audience has to write annually. The distribution can be found by the well-known Monte Carlo method [

where $K_i^{\text{CO}}$ is the average co-authorship coefficient from Table 6 for each of the six fields of science,

$N_i^{\text{number of research fellows}}$, $N_i^{\text{number of university professors}}$, $N_i^{\text{total number of post-graduate students}}$ are the audience sizes for the six fields of science;

$n_{ij}$ is a random integer (in the modelling it takes any value between 1 and $n_{\max}$) that characterizes the number of papers the different kinds of audience are to submit to scientific journals per year.
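The simulation described by these definitions can be sketched as follows. This is not the author's Excel model: the exact form of formula (2) is not reproduced in the text, so the sketch assumes the total is the sum over fields of the co-authorship coefficient multiplied by each audience size and its random annual quota, with the quota drawn uniformly and independently from the integers 1 to n_max. The names `K_CO`, `AUDIENCE` and `simulate_total_papers` are hypothetical.

```python
import numpy as np

# Average co-authorship coefficient per field (Table 6), in the order:
# Humanitarian, Natural, Medical, Social, Agricultural, Engineering.
K_CO = np.array([1.65, 3.32, 4.40, 1.89, 2.91, 3.03])

# Audience per field (Tables 1-3): post-graduate students, researchers, professors.
AUDIENCE = np.array([
    [3510, 12326, 10223],
    [14918, 80966, 25287],
    [7898, 14584, 16944],
    [36855, 20076, 108560],
    [3510, 14584, 9524],
    [21938, 208994, 66741],
])


def simulate_total_papers(n_max, trials=100_000, seed=0):
    """Monte Carlo samples of the total annual number of papers.

    Assumption: each of the 18 (field, audience type) groups gets an
    independent integer quota drawn uniformly from 1..n_max.
    """
    rng = np.random.default_rng(seed)
    weights = (K_CO[:, None] * AUDIENCE).ravel()          # 18 group weights
    quotas = rng.integers(1, n_max + 1, size=(trials, weights.size))
    return quotas @ weights                               # one total per trial


samples = simulate_total_papers(n_max=6)
print(f"mean = {samples.mean() / 1e6:.2f} mln, std = {samples.std() / 1e6:.2f} mln")
```

A histogram of `samples` for each n_max gives an empirical density of the kind plotted in Fig. 1; a fixed seed keeps the run reproducible.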

Graphs of function (2) for $n_{\max}$ between 2 and 6 papers are shown in Fig. 1. The graphs were plotted in Excel. The number of Monte Carlo trials was $10^5$ for each distribution. In this case, the error in the distribution parameters is 0.6 % [

Fig. 1. Densities of probable number of papers, in millions per year

Fig. 2. Mathematical expectation and root mean square error vs annual number of papers

Several conclusions can be drawn from the analysis of Figure 1:

Fig. 2 shows the mathematical expectation and root mean square error plotted against the annual number of papers received by scientific journals.

Fig. 2 demonstrates that a trend line built in Excel closely approximates the growth of both the mean value and the root mean square error, which is why further results can be obtained without resorting to the Monte Carlo method.
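One way to see why simulation is not strictly needed is to write the first two moments in closed form. The sketch below assumes, as an illustration rather than as the paper's own derivation, that every annual quota is an independent integer drawn uniformly from 1 to n_max. Such a quota has mean (n_max + 1)/2 and variance (n_max² − 1)/12, so the mean of the total is exactly linear in n_max and its standard deviation is close to linear. The function `moments` and the data layout are hypothetical; the last printed column applies the three sigma rule that the text invokes further on. The values need not coincide with the rounded trend-line formulas of the paper.

```python
import math

# Co-authorship coefficients (Table 6) and audiences (Tables 1-3) per field.
K_CO = [1.65, 3.32, 4.40, 1.89, 2.91, 3.03]
AUDIENCE = [
    (3510, 12326, 10223),
    (14918, 80966, 25287),
    (7898, 14584, 16944),
    (36855, 20076, 108560),
    (3510, 14584, 9524),
    (21938, 208994, 66741),
]


def moments(n_max):
    """Mean and standard deviation of the total number of papers.

    Assumes independent quotas, uniform on the integers 1..n_max.
    """
    weights = [k * n for k, row in zip(K_CO, AUDIENCE) for n in row]
    mean_quota = (n_max + 1) / 2
    var_quota = (n_max ** 2 - 1) / 12
    mean = mean_quota * sum(weights)
    std = math.sqrt(var_quota * sum(w * w for w in weights))
    return mean, std


for n_max in range(2, 7):
    mean, std = moments(n_max)
    upper = mean + 3 * std  # three sigma upper bound
    print(f"n_max={n_max}: mean={mean / 1e6:.2f}, std={std / 1e6:.2f}, "
          f"mean+3*std={upper / 1e6:.2f} (mln papers)")
```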

Elementary algebraic transformations of the linear equations from Fig. 2, which amount to solving two systems of two equations in two unknowns and rounding the results to whole numbers, give

where $N$ is the audience size;

$k_{CA}$ is the average number of co-authors per paper from Table 6 for all fields of science ($k_{CA} = 2.82$).

Next, with regard for the three sigma rule, the ultimate value for the publication activity can be represented as [

Table 7 gives the ultimate values of publication activity for the various types of audience, obtained from formula (1), and the corresponding $k_{CA}$ coefficients. The ultimate number of papers in Table 7 means that each post-graduate student, professor or research fellow writes exactly the number of papers per year stated in the table heading. The total is then multiplied by the co-authorship coefficient, because each author not only writes a paper alone but is also involved in writing other authors' papers.

Using the data from Table 7 and a trend to reduction of scientific authors’ journals described in paper [

Table 7. Ultimate number of papers, million

Field of science | Audience, persons | $k_{CA}$ | Papers per year: 1 | 2 | 3 | 4 | 5 | 6 |

Humanitarian | 26 059 | 1.65 | 0.1 | 0.1 | 0.1 | 0.2 | 0.2 | 0.2 |

Natural | 121 171 | 3.32 | 0.6 | 0.9 | 1.3 | 1.6 | 2.0 | 2.3 |

Medical | 39 426 | 4.40 | 0.3 | 0.4 | 0.6 | 0.7 | 0.9 | 1.0 |

Social | 165 491 | 1.89 | 0.5 | 0.7 | 1.0 | 1.3 | 1.5 | 1.8 |

Agricultural | 27 618 | 2.91 | 0.1 | 0.2 | 0.3 | 0.3 | 0.4 | 0.5 |

Engineering | 297 673 | 3.03 | 1.3 | 2.1 | 2.9 | 3.7 | 4.4 | 5.2 |

Total | | | 2.9 | 4.5 | 6.1 | 7.8 | 9.4 | 11.1 |

Elibrary.ru data in Fig. 3 show that the actual maximum number of papers exceeds 4.5 mln, although the statistics do not indicate whether the uploads included archive issues of scientific journals and papers uploaded from the CIS countries. The graphs in Fig. 3 indicate that the output of Russian authors is not far from the ultimate values. For example, with a mandatory requirement of 6 papers per year for the entire audience, the ultimate number of papers is 11 mln a year, 2.3 times more than the maximum value according to elibrary.ru data for 2019.

The graph in Fig. 3 also supports the reverse calculation. Given the audience size and the number of co-authors, we can calculate the number of papers published per scientist. The average of the elibrary.ru values in Fig. 3 is approximately 2.5 mln papers. Dividing these 2.5 mln papers by the product of the audience of Russian scientists (0.6 mln) and the number of co-authors per paper (2.82) gives 1.4 papers per year for each Russian scientist.
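The reverse calculation is a single division, sketched below with the hypothetical helper `papers_per_scientist`. The 2.5 mln average and the 2.82 coefficient are taken from the text; the two audience values are the rounded 0.6 mln used above and the unrounded total of 677 438 from Table 5, included only to show how sensitive the result is to rounding.

```python
def papers_per_scientist(papers, audience, coauthors):
    """Average number of papers per scientist per year."""
    return papers / (audience * coauthors)


AVERAGE_PAPERS = 2.5e6  # approximate elibrary.ru average read from Fig. 3
K_CA = 2.82             # average number of co-authors per paper (Table 6)

rounded = papers_per_scientist(AVERAGE_PAPERS, 0.6e6, K_CA)      # audience rounded to 0.6 mln
unrounded = papers_per_scientist(AVERAGE_PAPERS, 677_438, K_CA)  # Table 5 total

print(f"audience 0.6 mln:  {rounded:.2f} papers per year")
print(f"audience 677 438:  {unrounded:.2f} papers per year")
```

Depending on the rounding of the audience, the estimate falls between about 1.3 and 1.5 papers per year, which brackets the 1.4 quoted in the text.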

Using the above reasoning, the distribution of the ultimate number of papers can be obtained for any topic or discipline. For example, Fig. 4 shows the probability density at $n_{\max} = 2$ for two groups of disciplines in the natural sciences.

Fig. 4 shows that the number of papers for the Mathematics group of disciplines ranges between 16 and 94 thousand papers per year, and for the Mechanics group between 10 and 70 thousand papers per year.

The following question is quite legitimate: if an upper limit exists, can the modelling data be used to estimate a lower limit or an average value? For instance, the graphs in Fig. 4 might suggest that the average annual number of papers in mechanics and mathematics is 42.1 and 52.2 thousand, respectively, and that the practically impossible minimum is 10 thousand for mechanics and 16 thousand for mathematics. Unfortunately, no such conclusion can be drawn: the approach described in this paper accounts for the entire audience, yet it cannot be claimed that the whole audience carries out scientific research, even though it consists of researchers. This is why the proposed apparatus can be used only for upper-bound estimation.

Fig. 3. Ultimate number of papers vs elibrary.ru data by years

Fig. 4. Probability density for the number of papers in groups of disciplines: mathematics (6 disciplines, $k_{CA}$ = 2.11), mechanics (4 disciplines, $k_{CA}$ = 2.4), thousands of papers

As the research makes clear, even with the number of participants declining, Russian science shows fairly good publication dynamics: the ultimate number of scientific papers is less than an order of magnitude greater than the number of papers uploaded to the elibrary.ru platform. The actual and the estimated ultimate numbers of papers are commensurable, even allowing for the somewhat idealized approach to assessing publication activity.

As noted above, the results of the study can be used to assess the publication activity of researchers in any field of science. In addition, the findings may help editors of new scientific journals to estimate the potential number of papers to be published in the planned field and to judge whether investing in attracting an audience to the journal is worthwhile.

The authors declare that there are no conflicts of interest present.