Autonomous Unbiased Study Group Formation Algorithm for Rapid Knowledge Propagation

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Computer Systems Science & Engineering, DOI: 10.32604/csse.2022.021964. Article, Tech Science Press.

Abstract: Knowledge propagation is a necessity, both in academia and in industry. The focus of this work is on how to achieve rapid knowledge propagation using collaborative study groups. The practice of knowledge sharing in study groups finds relevance in conferences, workshops, and classrooms. Unfortunately, there appears to be little research on empirical best practices and techniques for study group formation, especially for achieving rapid knowledge propagation. This work bridges this gap by presenting a workflow-driven computational algorithm for autonomous and unbiased formation of study groups. The system workflow consists of a chronology of stages, each made up of distinct steps. Two of the most important steps, subsumed within the algorithmic stage, are the algorithms that resolve the decision problem of how many study groups to form, and the most effective permutation of the study group participants to form collaborative pairs. This work contributes a number of new algorithmic concepts, including autonomous and unbiased matching, the exhaustive multiplication technique, twisted round-robin transversal, and equilibrium summation. The concept of autonomous and unbiased matching is centered on the constitution of study groups and pairs purely based on the participants' performances in an examination, rather than through any external process. As a practical demonstration of this work, study group formation and unbiased pairing were fully demonstrated for a collaborative learning size of forty (40) participants, and partially for populations of 50, 60 and 80 participants. The quantitative proof of this work was done through the technique called equilibrium summation, as well as the calculation of inter-study group Pearson Correlation Coefficients, which resulted in values higher than 0.9 in all cases. Real life experimentation

Introduction
The three key concepts that make up the title and content of this work are rapid knowledge propagation [1], autonomous and unbiased matching [2,3], and study group formation [4,5]. This research presents innovative algorithms and carefully evaluated techniques for creating and managing collaborative study groups that enhance fast knowledge propagation. One major deliverable of this work is that it is expected to fast-track knowledge propagation in a learning community, mainly because it enforces the mixing of the participants, known as pairing, based on their estimated knowledge gaps in an autonomous and unbiased manner. Collaborative learning [6] in a contact network setting involves the sharing of knowledge by people who interact in groups. While modern research and institutions acknowledge the strength of collaborative learning, there are some obvious challenges to resolve. One is the question of how many collaborative study groups should be formed for an arbitrary integer population P of learners. Another is how best to pair the participants in the collaborative study groups in order to achieve very rapid knowledge propagation. There is also the question of which computational algorithms [7] will be used to achieve the required goal. Moreover, the methodology must be evaluated [8] scientifically. These and other related issues constitute the focus of this research. One interesting attribute of this research is that the selection of the collaborative study group participants is autonomous and unbiased, depending purely on each person's performance rather than on trial and error or external factors. A practical experimental run of this research was carried out in the teaching of an undergraduate course titled Object Oriented Programming [9], with course code INSY 404, at Babcock University between February and May 2021. This work is organized into five sections.
First is an introduction, followed by an exploration of related works. Next is a detailed presentation of the system workflow, which discusses the eleven stages and four steps of the workflow, as well as the four puzzles or challenges tackled in this research. This is followed by the proofs and evaluation of the work, and then the presentation of findings and the conclusion. One peculiarity of this work is that it focuses on an even number of participants, mainly because of the need for pairing. Future research will tackle cases with an odd number of participants.

Related Works
A study in [10] underlined the need for active research on study group formation where the learners have varied abilities, such that academically weaker learners can learn from stronger colleagues, a concept termed heterogeneous mixing. Another study [11] uses a technique called automated group decomposition, based on k-means clustering, to build study groups. An extensive discussion of how cooperative learning positively impacted accounting students is presented in [12]; however, that work is silent on techniques for group formation. Furthermore, the managerial expertise needed to make the best of study groups is addressed in [13]. An extensive analysis of performance measures, and a justification for the use of study groups, is given in [14]. None of the works highlighted above, apart from [11], attempted to build a computational strategy for study group formation, and none has used autonomous and unbiased selection for study group formation and pairing, which is the motivation for the current research.

System Workflow
As already stated, the major aim of this work is to present a new computational algorithm for rapid domain knowledge propagation through precision-based study group formation. This implies that the resulting study groups enforce autonomous and unbiased mixing or permutation of the participants, such that for any two learners Lx and Ly in a collaborative pair (Lx, Ly) there is synergy: the perceived knowledge gap of one partner is filled by the other. The general workflow [15] of this research is shown in Fig. 1. As shown in the diagram, there are a total of eleven chronological steps, all labeled numerically from inception to conclusion.
Step 11 is a looping point [16], where the workflow control can be switched back to Step 3 for as long as further iteration is necessary. Before going into a detailed explanation of the workflow, two important points should be mentioned. First, a number of the chronological steps involved are procedural [17] in nature. Second, two steps constitute the major algorithmic implementations in this research. These are Steps 6 and 7, at which the creation of study groups and the autonomous and unbiased formation of study group pairs are achieved. All of these are explained further in the appropriate sections of this work.

Preparatory Stage
The algorithmic Steps 1 and 2 constitute the preparatory stage, since they involve putting strategic procedures in place before the actual learning is kick-started. The essence of Step 1, which is tagged [18] as Create LearnPOP in the system workflow, is to create a learning population. This is a procedural step which involves gathering details of the students that make up the learning population, just as in any conventional classroom. The minimum dataset [19] could be as simple as the full names of the students presented in a spreadsheet, which will be used to track the expected physical attendance of the students at class activities. Because collaborative learning at the lowest level is in pairs, the learning population should be an even integer [20] and not an odd number. Step 2 of the workflow is the registration of the learning participants, titled LearnREG. This involves implementing a simple registration of the study group participants. The standard registration number adopted in the experiment is SGXXX, where SG stands for study group and XXX represents an integer, though alternative nomenclatures could also be adopted. The next column after SGXXX is the space reserved for storing the examination scores. It is also possible to increase the number of columns by capturing such information as phone number or matriculation number (in the case of university students), among others, though such an extension was not implemented in this work. Tab. 1 shows a sample LearnREG dataset for a 40-learner experimental study group, where the study group members SG21 to SG38 were purposely hidden so as to conserve table space in this report.
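As a minimal sketch of the Create LearnPOP and LearnREG steps, the registration table can be modelled as a list of records. The function name, the field names and the placeholder learner names are hypothetical and only illustrate the idea of an even population with SG-prefixed registration numbers and a reserved score column.

```python
def create_learn_reg(names):
    """Build a LearnREG table: one record per learner with an empty score slot."""
    if len(names) % 2 != 0:
        # Pairing needs an even learning population.
        raise ValueError("the learning population must be an even integer")
    return [
        {"reg_no": f"SG{i}", "name": name, "exam1": None}  # field names are assumed
        for i, name in enumerate(names, start=1)
    ]


# Placeholder names stand in for the real class list.
roster = create_learn_reg([f"Learner {i}" for i in range(1, 41)])
print(roster[0]["reg_no"], roster[-1]["reg_no"])
```

Extra columns such as a phone number or matriculation number would simply become additional keys in each record.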

First Learning Stage
The algorithmic Steps 3, 4 and 5 constitute the first learning stage of the system workflow. During Step 3, tagged CRSession, which stands for classroom session, all members of the learning community are taught by an expert [21], for instance a class teacher, a facilitator, or a university lecturer, as the case may be. As shown in the workflow diagram, this stage is module driven. This implies that it follows a well-organized, module-driven course outline [22]. In testing this work, a university-approved course outline for Object-Oriented Programming was used. During the normal classroom teaching, the lecturer could cover one to two modules before launching into the next step. However, for a more thorough and evaluation-based learning, it is advisable to restrict coverage to only one module before launching into the first examination. The first examination, which is Step 4, tagged Exams-1, follows. The essence of the examination is to obtain an empirical measure of the academic capabilities of the participants after the just-concluded teaching.
Step 5, tagged GenResult-1, involves the generation of the examination scores of the learners, an outcome which marks the conclusion of the first learning stage of the workflow.

The Algorithmic Stage
The workflow Steps 6 and 7 tagged NumSG Generation and LearnPair Matching mark the core deliverables of this research. Incidentally, the earlier stages of the workflow lead to an examination result, which is an output from the first learning stage. Importantly, this initial output becomes the input to the core algorithmic stage as will be carefully outlined here. It is necessary to again recount the major problem statement of this research in more specific terms. Given the earlier learning community of size P = 40 participants shown in Tab. 1, there are a number of algorithmic challenges that need to be tackled, some of which are stated in Tab. 2.
To tackle the enumerated challenges as part of workflow Steps 6 and 7, a sample architecture [24] generated for a study group of 40 learners is shown in Fig. 2. The goal is to develop an algorithm for generating the number of study groups, as well as creating the pairs that make up a study group, and proving that the resulting outcome has scientific merit.

Challenge 1
How many study groups SG1, SG2, ..., SGX, where X is an integer, are most appropriate to be created for a learning population of cardinality P, where P is an even integer?
This issue was tackled in Section 3.3.

Challenge 2
How many learning pairs should be created for each study group? What algorithmic steps will be used to achieve this?
This issue was also tackled in Section 3.3.

Challenge 3
How does the computational algorithm ensure that the pairing of learners in a study group is autonomous and unbiased, rather than being influenced by external views?
This has been explained in several sections, and especially Section 3.3.

Challenge 4
What computational proof [23] or evaluation will be used to justify the overall algorithmic process?
This issue is tackled in the Proofs and Evaluations section.
The first challenge is to decide on the number of study groups to create, and then to create them. This is part of Step 6, tagged NumSG Generation in the general workflow.
The decision on the number of study groups is achieved through a technique termed exhaustive multiplication [25].

Exhaustive Multiplication Implementation
Given the population size P of learners to be organized into study groups, P must first and foremost be an even integer, since only an even number is divisible by 2 (pairing) without a remainder. The next step is to create an exhaustive multiplication table of P, consisting of four columns as shown in Fig. 3. For instance, the integer 40 = 2 × 2 × 10 = 2 × 10 × 2 = 2 × 5 × 4 = 2 × 4 × 5 and so on, until all possibilities are covered.
In a similar way, the exhaustive multiplication tables for P = 50, P = 60 and P = 80 are shown in Fig. 4. A more detailed explanation of the use of the exhaustive multiplication table will be based on the study group population P = 40. As shown in Fig. 3, the fourth column of the table is the comment field used to indicate the rejection or acceptance criteria [26]. The double asterisk (**) shows that a particular option is rejected, while 'OK' marks it as one of the acceptable options. An important criterion for rejection is a value of NumSG that is extreme, that is, either too low or too high compared to the others, since such options would create either too few or too many study groups. Thus, the rejected cases are the set of values NumSG = {2, 10, 1, 40}, and the selected values for the possible number of study groups are the set NumSG = {4, 5}. Suppose the value of 4 is selected, in line with the study group architecture shown earlier in Fig. 2. The next task in the NumSG Generation step, as shown in the workflow, is then to create the contents of each of the four study groups.
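The table construction can be sketched as follows, assuming each row has the form 2 × (pairs per study group) × NumSG, which is one reading of the 2 × 5 × 4 example above. The acceptance band of 3 to 9 study groups is a hypothetical stand-in for the "not too low, not too high" criterion; this work does not state numeric limits.

```python
def exhaustive_multiplication(p):
    """List every row 2 x pairs_per_group x num_sg whose product equals p."""
    if p % 2 != 0:
        raise ValueError("P must be an even integer")
    half = p // 2
    return [
        (2, half // num_sg, num_sg)
        for num_sg in range(1, half + 1)
        if half % num_sg == 0
    ]


def acceptable_num_sg(p, lowest=3, highest=9):
    """Keep NumSG values inside an assumed [lowest, highest] band."""
    return [
        num_sg
        for _, _, num_sg in exhaustive_multiplication(p)
        if lowest <= num_sg <= highest
    ]


for row in exhaustive_multiplication(40):
    print(row)
print(acceptable_num_sg(40))
```

With the assumed band, P = 40 leaves the candidates 4 and 5, which agrees with the accepted set above; a different band would change the outcome, so the limits should be treated as a tunable choice.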

Study Group Structural Creation
The algorithmic steps for achieving this are as follows. To create S study groups out of P learners, the number of learners L per study group is given by Eq. (1):
$$L = \frac{P}{S} \tag{1}$$
Therefore, for P = 40 and S = 4, the number of learners in each study group is L = 40/4, which is 10. First, create 4 arrays, SG1, SG2, SG3 and SG4, as shown in Fig. 5. The members of the learning community are then arranged in ranking positions [27] from 1 to 40, in ascending order of their performances (scores) in the first examination result generated earlier in workflow Step 5. Where two or more persons have the same score, for example 50, 50, 50 for three persons, such persons should simply be arranged in consecutive order, such as 10th, 11th, 12th, and so on, without prejudice.
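The ranking step, including the treatment of tied scores, can be sketched with a stable sort. The function name and the six sample scores are hypothetical; ties are resolved by registration order here, which is one way of arranging equal scores "without prejudice".

```python
def assign_ranks(scores):
    """Map each registration number to a rank, where rank 1 is the lowest score.

    Python's sort is stable, so learners with equal scores keep their
    original (registration) order and receive consecutive ranks.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1])
    return {reg_no: rank for rank, (reg_no, _) in enumerate(ordered, start=1)}


# Hypothetical scores; SG1, SG3 and SG5 are tied on 50.
ranks = assign_ranks({"SG1": 50, "SG2": 35, "SG3": 50, "SG4": 72, "SG5": 50, "SG6": 20})
print(ranks)
```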
The assigned rankings are then used to generate the study groups by filling the four arrays in a twisted round robin pattern shown in Fig. 5. As shown in the figure, the twisted round robin [28] makes use of four major movements tagged as HR4, VU1, HL4 and VU1 coloured as RED, BLUE, GREEN and YELLOW respectively. The movement patterns are defined using the following rules:
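One reading of the movement names, assumed here, is that HR4 is a horizontal move to the right across the four arrays, VU1 a move up by one row, and HL4 a horizontal move to the left, which amounts to a serpentine fill. The sketch below implements that assumed reading; the function name is hypothetical and the layout of Fig. 5 is not reproduced.

```python
def twisted_round_robin(p, s):
    """Deal ranks 1..p into s study groups, reversing direction on every row."""
    if p % s != 0:
        raise ValueError("P must be divisible by the number of study groups")
    groups = [[] for _ in range(s)]
    for row_index, row_start in enumerate(range(1, p + 1, s)):
        row = list(range(row_start, row_start + s))
        if row_index % 2 == 1:
            row.reverse()  # odd rows run right-to-left (the assumed HL move)
        for group, rank in zip(groups, row):
            group.append(rank)
    return groups


study_groups = twisted_round_robin(40, 4)
for name, group in zip(("SG1", "SG2", "SG3", "SG4"), study_groups):
    print(name, group, sum(group))
```

Under this reading every group receives ten ranks that sum to 205, the equilibrium value reported in the Proofs and Evaluations section, which supports the interpretation without confirming it.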

Study Group Contents Creation
It follows from Eq. (1) that if there are L = P/S learners in a study group, then after pairing [29] there will be a total of L/2, which is the same as P/(2S), pairs in each study group. Therefore, for P = 40 and S = 4, the number of pairs is 10/2, which is 5. The details of the LearnPair Matching algorithm are as follows. To create pairs for a study group, first pick the array that represents the study group, for example, study group SG1. Next, locate the max-end and min-end of the array. In this research, the max-end of an array is defined as the array element having the highest index, while the min-end is the one with the least index [30]. First match the max-end with the min-end. This operation is then repeated on the remaining part of the array until the entire array is used up. A practical demonstration of the evolution of the SG1 pairs is shown in Fig. 6.
In a similar way, the pairs are created for all the study groups SG1, SG2, SG3 and SG4 using the LearnPair Matching algorithm, and the final output is shown in Fig. 7. Thus, Challenge 2 has been tackled. The answer to Challenge 3 also follows. The process of pairing is purely based on the performance of each learner in an examination. It is the score of each learner that determines the ranking of the student, and not any other external reason. Thus, the pairing is autonomous and unbiased. Further discussion will be made on how the pairing arrangement leads to rapid knowledge propagation, which is the major goal of this work.
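The LearnPair Matching steps described above can be sketched with two indices that start at the min-end and max-end of the array and move inward. The function name is hypothetical, and the SG1 contents used in the example are illustrative values assumed from a serpentine fill of ranks 1 to 40, not a copy of Fig. 6.

```python
def learn_pair_matching(group):
    """Pair the min-end with the max-end, then repeat on the remaining array."""
    if len(group) % 2 != 0:
        raise ValueError("a study group needs an even number of learners")
    pairs = []
    low, high = 0, len(group) - 1
    while low < high:
        pairs.append((group[low], group[high]))
        low += 1
        high -= 1
    return pairs


sg1 = [1, 8, 9, 16, 17, 24, 25, 32, 33, 40]  # assumed, illustrative contents
sg1_pairs = learn_pair_matching(sg1)
print(sg1_pairs)
```

For a group of L = 10 learners this produces L/2 = 5 pairs, in line with the P/(2S) count above.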

The Collab Learning Stage
After the formation of study groups and learning pairs comes the Collab Learning Stage of the system workflow, where "Collab" is short for "Collaborative". This consists of Steps 8, 9 and 10. The stage begins with a collaborative revision, which brings to reality the very essence of creating study groups and learning pairs. So, what is the significance of collaborative learning pairs as projected in this research? The answer is that the autonomous and unbiased selection of the study groups and learning pairs ensures that there is synergy between the learners. A learner is paired with another learner based purely on their learning quotients. This is why the learner with the high rank 40 is paired with the learner with the low rank 1, the learner with rank 39 is paired with the learner with rank 2, and so on. This is based on the assumption that the learner with the high academic quotient of rank 40 will teach the learner with the low academic quotient of rank 1. Through such collaborative learning, all the learners will gain knowledge faster. In other words, the collaborative learning pairing ensures that a learner is complemented by a partner. This is the concept of rapid knowledge propagation, which is the goal of this research. Thus, Step 8 involves collaborative revision, Step 9 involves a second examination, and Step 10 involves the generation of the examination result. Note that the collaborative revision is also module-based, so that students are guided on what to discuss or revise in their collaborative pair learning sessions. It is important to mention that the second examination in Step 9 is a form of evaluator, used to confirm that there has been a significant positive impact since the first examination. The outcomes of the two examinations were used to create the performance evaluation graph and bar chart in the concluding part of this work.

Figure 6: Demonstration of study pairs creation for SG1
Figure 7: Pairs for all the study groups

The Loop Stage
The final stage in the general workflow is the loop stage. This is a point where the moderator, who is the overall lecturer, may decide to either continue with further sessions, or may terminate the workflow.

Proofs and Evaluations
The evaluation of this work is based on two perspectives. One is through what is termed as equilibrium summation, and another is through the use of statistical correlation [31].

Equilibrium Sum Checks
The major goal of this work is to ensure that learners are arranged in such a way that every learner with academic performance rating X is grouped with another learner with academic performance Y, such that there are visible equilibria in the summation for the entire study groups. This ensures that collaborative pairing enhances rapid knowledge propagation. The two important rules on summation in study groups are outlined as follows.

Rule Number One
The sum of the positional constituents of every study group gives an equal value. This is what is termed the equilibrium sum. For instance, for study group population P = 40, consisting of four distinct study groups, the value is 205 for all study groups. The flowchart for computing this value is shown in Fig. 8.
Thus, the outcome from the implementation of the flowchart is as follows: It is important to mention that a new flowchart symbol in the form of a solid cuboid was introduced in this work in order to effectively represent a program loop. This is shown in Figs. 8 and 9 respectively.
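As a cross-check on the value 205, and not as a rendering of the flowchart in Fig. 8, the equilibrium sum can be obtained in closed form: if ranks 1 to P are shared among S study groups with equal sums, each group must hold P(P + 1)/(2S). The function name below is hypothetical.

```python
def equilibrium_sum(p, s):
    """Rank sum each group must hold if ranks 1..p split into s equal-sum groups."""
    total = p * (p + 1) // 2  # sum of the ranks 1..p
    if total % s != 0:
        raise ValueError("ranks 1..P cannot be split into S equal-sum groups")
    return total // s


print(equilibrium_sum(40, 4))
```

For P = 40 and S = 4 the total of the ranks is 820, and 820/4 gives the stated 205.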

Rule Number Two
Apart from equilibrium at the study group level, there is also a unique sum for each of the pairs. The rule states that the sums of the positional contents of each pair in every study group in the entire population should be equal. For instance, for study group population P = 40, consisting of 4 study groups as shown in Figs. 6 and 7, the value of every pair sum is 41. The flowchart for computing this value is shown in Fig. 9.
Thus, the outcome from the implementation of the flowchart is as follows:
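A minimal sketch of the pair-sum check, independent of the flowchart in Fig. 9, is given below. The four arrays are illustrative group contents assumed from a serpentine fill of ranks 1 to 40; they are not copied from Fig. 7, and the function name is hypothetical.

```python
def pair_sums(groups):
    """Collect every min-end + max-end pair sum across all study groups."""
    sums = set()
    for group in groups:
        for i in range(len(group) // 2):
            sums.add(group[i] + group[-1 - i])
    return sums


# Assumed, illustrative study group contents for P = 40 and S = 4.
assumed_groups = [
    [1, 8, 9, 16, 17, 24, 25, 32, 33, 40],
    [2, 7, 10, 15, 18, 23, 26, 31, 34, 39],
    [3, 6, 11, 14, 19, 22, 27, 30, 35, 38],
    [4, 5, 12, 13, 20, 21, 28, 29, 36, 37],
]
print(pair_sums(assumed_groups))
```

A single value in the resulting set means Rule Number Two holds; for these arrays the only pair sum is 41, which equals P + 1.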

Correlation Coefficient
Correlation is a statistical measure of the linear relationship between variables. A presentation technique known as a scatter plot [32] can also be programmed to visualize such linear relationships. The value of a correlation coefficient [33] r lies in the range given by Eq. (2):
$$-1 \le r \le +1 \tag{2}$$
One of the ways to prove that the constitution of the study groups is near perfection is to compare the correlation coefficient of each study group with those of the others, and to confirm that all the resulting correlation coefficients are close to +1. The Pearson Correlation Coefficient P_C is given by Eq. (3):
$$P_C(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{3}$$
where P_C(X, Y) = Pearson Correlation Coefficient between variables X and Y, x_i and y_i are the i-th values of X and Y, and \bar{x} and \bar{y} are their means. Furthermore, the Spearman's Rank Correlation Coefficient S_C is given by Eq. (4):
$$S_C = 1 - \frac{6 \sum D^2}{n(n^2 - 1)} \tag{4}$$
where D = difference in ranks of the two variables representing the two study groups being analyzed and n = number of participants in each study group.
The results show a very strong correlation between the arrangements of the individual study groups SG1, SG2, SG3 and SG4. Similar correlation tests were carried out after generating study groups for populations of 50 learners with 5 study groups, 60 learners with 3 study groups, and 80 learners with 4 study groups using this algorithm, and the resulting correlation coefficients were all very close to +1.
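The inter-study group comparison can be sketched as below, assuming the coefficient is computed between the rank arrays of two study groups. The arrays are illustrative contents assumed from a serpentine fill of ranks 1 to 40, and the Pearson coefficient is written out directly from its standard definition rather than taken from a library.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


# Assumed, illustrative study group contents for P = 40 and S = 4.
rank_groups = {
    "SG1": [1, 8, 9, 16, 17, 24, 25, 32, 33, 40],
    "SG2": [2, 7, 10, 15, 18, 23, 26, 31, 34, 39],
    "SG3": [3, 6, 11, 14, 19, 22, 27, 30, 35, 38],
    "SG4": [4, 5, 12, 13, 20, 21, 28, 29, 36, 37],
}
coefficients = {
    (a, b): pearson(rank_groups[a], rank_groups[b])
    for a, b in combinations(rank_groups, 2)
}
for pair, value in coefficients.items():
    print(pair, round(value, 4))
```

For these assumed arrays all six pairwise coefficients exceed 0.9, which is consistent with the values reported in this work; results on real examination scores would depend on the data.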

Findings and Conclusion
This research has presented an unambiguous algorithm for achieving rapid domain knowledge propagation using study groups based on autonomous and unbiased matching. The result of the experiment was visualized [35] using a comparative performance graph [36] shown in Fig. 10, and a comparative bar chart shown in Fig. 11. As shown in the figures, the X-axis represents the standard registration format SGX, where X is in the range 1 to 40. Two examinations, Exam1 and Exam2, were taken in line with the system workflow. The results were used to plot the performance graph and bar chart [37] respectively.
There was a very significant [38] positive displacement for all the participants, especially those who had low grades in the first examination. The only exception was the participant with registration number SG2, whose score dropped from 55 in Exam1 to 20 in Exam2. Thus, the failure rate [39] of this experiment is estimated as 1/40, which is 2.5%.
In conclusion, this work has presented an innovative algorithm for creating study groups and pairs so as to achieve rapid knowledge propagation. A number of new concepts and computational techniques have evolved from this work as contributions to knowledge. The work has been presented in an unambiguous manner, with explicit and annotated workflows and flowcharts, among others. Mathematical proofs as well as statistical correlations were also used for further evaluation of the work, with very impressive outcomes. The outcome of the final experimental run in this research shows about 97.5% success and 2.5% failure. Consequently, future research will focus on further investigation of other factors that affect performance in collaborative study groups, especially with respect to the 2.5% of participants, such as SG2, who failed to perform as brilliantly as the others. Future research will also involve running the experiment on a large scale [40], for populations P > 40, and possibly covering an entire course outline, or taking up courses other than Object-Oriented Programming, which was used as a case study in this work. Spearman's rank correlation is also recommended for future studies in that regard. The algorithms, techniques, concepts and overall content of this research are expected to be very useful to stakeholders in the world of knowledge propagation. Three recent multi-disciplinary works on knowledge propagation are [41][42][43]. Finally, the dataset for Exams 1 and 2 is shown in Fig. 12.
Funding Statement: The authors received no specific funding for this study.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.