Modeling Anaerobic Decomposition: JMP Application with Biomass Data

Abubakar, A. M.; Elboughdiri, N.; Chibani, A.; Nneka, E. C.; Yunus, M. U.; Ghernaout, D.

doi:10.4152/pea.2025430604

Portugaliae Electrochimica Acta

Print version ISSN 0872-1904; On-line version ISSN 1647-1571

Port. Electrochim. Acta vol. 43 no. 6 Coimbra Dec. 2025 Epub 30-Nov-2025

https://doi.org/10.4152/pea.2025430604

Research Article


A. M. Abubakar¹²

N. Elboughdiri³⁴

A. Chibani⁵

E. C. Nneka⁶

M. U. Yunus³⁷

D. Ghernaout³⁷

¹Department of Chemical Engineering, Modibbo Adama University, Girei LGA, Adamawa State, Nigeria

²Department of Chemical Engineering, University of Maiduguri, Borno State, Nigeria

³Chemical Engineering Department, University of Ha’il, Ha’il, Saudi Arabia

⁴Chemical Engineering Process Department, University of Gabes, Gabes, Tunisia

⁵Research Center in Industrial Technologies CRTI, Cheraga, Algiers, Algeria

⁶Department of Chemical Engineering, Faculty of Engineering, Chukwuemeka Odumegwu Ojukwu University, Igbariam, Anambra State, Nigeria

⁷Chemical Engineering Department, University of Blida, Blida, Algeria

Abstract

Modern predictive modeling techniques, such as regression, NN and decision trees, can be used to build better and more useful models. In this study, JMP 17.2.0 was used to fit a model to observed microbial growth data from a mixture of chicken manure and banana peels (Sample A) and from a single chicken manure substrate (Sample B). Statistical metrics, including COD (R²), RASE, MAD, negative log-likelihood and SSE, were used to identify the best predictions of the SGR of microorganisms (µ) from biomass Ct (X) and SC (S), using 22 and 24 data points for Samples A and B, respectively. Along with the estimated Monod parameters, the SAS code of the TanH function for the 3 declared hidden nodes, also illustrated by surface plots, indicated that the Sample B model predicted best, even though R² values for both datasets pointed to a good fit for training (A: 0.9887916 and B: 1.0000) and validation (A: 0.9787637 and B: 0.9999999). Optimal conditions were: for A, biomass = 899868717 mg/L and SC = 4.62 x 10⁹ mg/L, corresponding to a high µ (0.010201 h^-1); and for B, biomass = 15351147 mg/L and SC = 9.2322 x 10⁹ mg/L, corresponding to µ = 0.007316 h^-1. RMSE, a standard measure of the accuracy of predictive models, including those based on NN, should be reported in future studies. This research is both timely and relevant to the pursuit of sustainable waste management and renewable energy generation.

Keywords: ANN; JMP; Monod; SAS code; SGR; SC; TanH function

Introduction

Decomposition of organic wastes in an oxygen-free environment requires anaerobic microorganisms, which feed on the nutrients the wastes contain, thereby converting them to biofuel. The development of several microbial growth kinetic models has helped to explain this process. Most models, such as Andrew, Contois, Monod, Moser and Verhulst, take biomass and SC in the bioreactor under consideration as independent factors, with the microorganisms’ SGR as the output variable. Since exponential microbial growth plays a key role in organic waste conversion, optimizing this growth, especially during anaerobic decomposition of biodegradable waste, is high on most environmental engineers’ agenda. Kinetic models have been employed in recent years, and their parameters have been used to predict optimal biogas production and the growth rate of isolated microbes that help to digest feedstock. Other scholars have employed RSM to generate DOE, proposing different combinations of input variables for single or multiple responses. Currently, NN predictive modeling has evolved as a new optimization tool in bioprocessing, for predicting biogas yield from palm oil mill effluents, cassava wastewater, cow dung, food waste, etc., under random combinations of factors, using MATLAB ²,³,⁹,¹².

A thorough literature search found no prior studies that apply typical microbial growth kinetic parameters for biomass and SC in this way, and in particular no practical application of JMP software ⁷ to the anaerobic process. Applying NN predictive modeling to the anaerobic decomposition of banana peels and chicken manure can provide valuable insights into process behavior and enhance its management, leading to more efficient and reliable biogas and biofertilizer production. This study aimed to: use two different digesters, separately charged with Sample A (banana peels + chicken manure) and Sample B (chicken manure), for a microbial growth study; determine, mathematically and experimentally, datasets of the depleting SC and the biomass Ct during the exponential growth phase in each chamber, in order to calculate the microbial SGR; record the datasets separately in JMP software, specifying an orthogonal CCD, before programming it to generate a neural predictive model; analyze the SGR statistical outputs of Samples A and B for training and validation fitted plots; employ OriginPro 2018 software for a user-defined regression analysis, specifying the Monod equation, to estimate kinetic parameters for each dataset; and compare fits, surface plots, kinetic parameters and the predictive model for the two datasets, to determine the optimal combination of input variables. Similar to this work, ⁸ have carried out a combined demonstration of kinetics (using the modified Gompertz model) and ANN for biogas production from anaerobic digestion observations. ⁵ have also examined the behavior of the biogas yield curve from lignocellulosic material using ANN. The present study suggests a different and more enhanced approach that merges regression, CCD for RSM and neural modeling to achieve the stated goals.

Methodology and materials

Materials sourcing

Banana peels and chicken manure were collected from Kasuwan Shanu Market and UNIMAID Faculty of Agriculture Poultry Farm, respectively, both in Maiduguri, Borno State, Nigeria. Drinkable water was utilized for anaerobic digestion and cell density determination. Before digestion, organic biomass was processed into Sample A and Sample B. Sample A is a mixture of chicken manure (4 kg), banana peels (0.5 kg) and water (7.5 kg) in a digester. Sample B is a mixture of chicken manure (7.5 kg) and water in equal proportions by weight, decomposed in a different digester. The two digesters were operated in batch mode, as described by ¹⁷.

Biomass and SC datasets

Viable cells in the Sample A and B digesters were determined in colony-forming units (CFU/mL), treated as hypothetical units and converted to mg/L. In microbiology and cell growth experiments, biomass means the viable cells counted in colonies, which here represent the biomass Ct (X). This determination was carried out by manual NA preparation. From the onset of the experiment, it was assumed that chicken manure contains microorganisms that are paramount to the anaerobic process ¹⁰. Using a biomass-to-SC ratio of Y = 400, SC was determined using Eq. 1 for Samples A and B.

(1)

The nutrient or feedstock remaining at a particular time, after depletion by biomass during anaerobic fermentation, is denoted S. In Eq. 1, X₀ and S₀ represent the initial biomass and SC, which are 899868717.4 and 4620000000 mg/L for Sample A, and 15351147.09 and 9232210402.48 mg/L for Sample B, respectively. By convention, µ and S in Monod plots are extracted from the growth phase of the microorganisms acting on the samples. Hence, the datasets for S and µ are inverted, so that they start at S = 0 and µ = 0, before the Monod curve is plotted (Figs. 1b and 2b).

Figure 1: SGR against X and S (sample A).

Figure 2: SGR against X and S (sample B).

RSM

JMP® Trial 17.2.0 (701896), Serial Number: T-VHFY3P0J09, was installed on Microsoft Windows 10 Pro (10.0.10240.0). The software was developed by JMP Statistical Discovery LLC, by Neil Hodgson (neilh@scintilla.org). Under the DOE menu, in the Classical drop-down menu, Response Surface Design was selected. For RSM analysis of Samples A and B, the response (µ), called SGR (h^-1), was to be maximized. The upper and lower limits specified for factors X and S are shown in Figs. 3 and 4.

Figure 3: Sample A analysis - factor boundary values and response goal specification.

Figure 4: Sample B analysis - factor boundary values and response goal specification.

Then, the CCD-Orthogonal design type, with 16 runs and 8 center points, was chosen, and a randomized run order was selected. The table of runs generated by JMP was then assessed for implementation. Where the X and S values proposed in the table were not realistic, they were deleted and replaced with those obtained empirically (Figs. 1 and 2).
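As a rough illustration of the design described above, the sketch below builds a two-factor central composite design in coded units. It assumes that the 16 runs are the total, made up of 4 factorial, 4 axial and 8 center points, and it takes the axial distance from the textbook orthogonality condition α² = (√(F·N) − F)/2, where F is the number of factorial points and N the total number of runs. Whether JMP uses exactly these values internally is an assumption, and all function names are illustrative.

```python
import math
import random

def orthogonal_alpha(n_factorial, n_total):
    """Axial distance that makes the pure quadratic terms of a CCD orthogonal."""
    return math.sqrt((math.sqrt(n_factorial * n_total) - n_factorial) / 2.0)

def ccd_two_factors(n_center=8, seed=0):
    """Return (alpha, runs) for a randomized two-factor CCD in coded units."""
    factorial = [(a, b) for a in (-1.0, 1.0) for b in (-1.0, 1.0)]
    n_total = len(factorial) + 4 + n_center
    alpha = orthogonal_alpha(len(factorial), n_total)
    axial = [(-alpha, 0.0), (alpha, 0.0), (0.0, -alpha), (0.0, alpha)]
    runs = factorial + axial + [(0.0, 0.0)] * n_center
    random.Random(seed).shuffle(runs)  # randomized run order
    return alpha, runs

def decode(coded, low, high):
    """Map a coded level to natural units, given the factor's low/high limits."""
    return (low + high) / 2.0 + coded * (high - low) / 2.0

alpha, runs = ccd_two_factors()
print(f"alpha = {alpha:.4f}, runs = {len(runs)}")
```

Because the axial distance exceeds 1 under these assumptions, `decode` places the axial points outside the low/high limits, which is one way a generated run table can contain unrealistic factor values.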

Incorporating a NN design

Using the sets of empirical X, S and µ values that were used to visualize microbial growth in Figs. 1 and 2, a NN was generated for µ. To do so, NN was selected under the ‘Predictive Modeling’ drop-down arrow of the ‘Analyze’ menu in JMP. NN is a modern predictive modeling technique that predicts the response variable using a flexible function of the input variables (here, X and S). The respective factors and the response were then chosen, and a Random Seed of 0 was defined, to generate a reproducible sequence of random numbers. In the Model Launch window, the Holdback validation method, which is used to assess the performance of a trained model, was selected. The default Holdback Proportion of 0.333, i.e., the fraction of the dataset reserved for validation, was retained, as done by ⁹. Afterwards, 3 was entered as the number of hidden nodes, and the model was launched using the ‘Go’ button.
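The holdback step can be sketched in a few lines. This is a minimal illustration, not JMP's procedure: the rounding rule (rounding the validation count up) is an assumption chosen because it reproduces the 14/8 and 16/8 training/validation counts listed under Sum freq in Table 1, and Python's random number generator will not select the same rows as JMP.

```python
import math
import random

def holdback_split(rows, proportion=0.333, seed=0):
    """Randomly reserve a fraction of rows for validation; the rest train the model."""
    n_valid = math.ceil(proportion * len(rows))  # assumption: round up
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    return shuffled[n_valid:], shuffled[:n_valid]

for name, n_rows in (("Sample A", 22), ("Sample B", 24)):
    train, valid = holdback_split(range(n_rows))
    print(name, "training:", len(train), "validation:", len(valid))
```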

Expected results after the model run were: statistical parameter estimates for the Sample A and B datasets; the NN diagram; prediction profiler representations for model interpretation and sensitivity analysis; 3D surface profiles ¹⁶; Actual by Predicted plots; a table containing formulas for the predicted response and the hidden layer nodes; and SAS code that can be used to score a new dataset. Using the prediction profiler, the settings of the predictor variables that led to the desired predicted outcomes were found, in order to optimize the process. Based on 40 experimental runs, JMP simulated several DOE for the microbial growth process, involving X, S and average µ estimates.

Determination of microbial growth optimization parameters

Parameters in basic microbial growth models were determined for observed datasets and NN model predicted values, by performing regression based on Monod equation. Predicted parameters were then assessed based on µ optimal prediction.
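The study estimates the Monod parameters by user-defined nonlinear regression in OriginPro. As a dependency-free stand-in, the sketch below swaps in the Lineweaver-Burk linearization, 1/µ = 1/µ_max + (K_s/µ_max)(1/S), fitted by ordinary least squares. The demonstration data are synthetic and noise-free, generated from the Sample B values in Table 4 purely for illustration; the function names are hypothetical.

```python
def monod(s, mu_max, k_s):
    """Monod specific growth rate for substrate concentration s."""
    return mu_max * s / (k_s + s)

def fit_monod_lineweaver_burk(s_values, mu_values):
    """Estimate (mu_max, K_s) from a straight-line fit of 1/mu against 1/S."""
    pairs = [(1.0 / s, 1.0 / mu) for s, mu in zip(s_values, mu_values) if s > 0 and mu > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx                    # equals K_s / mu_max
    intercept = mean_y - slope * mean_x  # equals 1 / mu_max
    mu_max = 1.0 / intercept
    return mu_max, slope * mu_max

# Synthetic demonstration data (not the study's measurements)
s_demo = [1.0e8 * k for k in range(1, 11)]
mu_demo = [monod(s, 0.00762, 3.838e8) for s in s_demo]
mu_max_hat, k_s_hat = fit_monod_lineweaver_burk(s_demo, mu_demo)
print(f"mu_max = {mu_max_hat:.5f} 1/h, K_s = {k_s_hat:.4e} mg/L")
```

The linearization weights low-S points heavily, so for noisy measurements a nonlinear fit of the Monod form, as used in the study, is generally preferable.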

Results and discussion

Training/validation and statistical measure based on predicted µ

Holdback validation is a crucial technique for assessing and fine-tuning NN models; it helps to prevent overfitting and to ensure that models generalize well to new and unseen data. A holdback of 0.333, as specified, implies that about 66.7% of the data were allocated for training, while 33.3% were set aside for validation. The specific choice of holdback proportion commonly depends on the nature of the problem, the dataset size and the goals of the machine learning experiment. In regression analysis, R² is a metric often used for both training and validation datasets ³,¹³. For Samples A and B, respectively, R² was 0.9888 and 1.0000 for training, and slightly lower, 0.9788 and 0.9999999, for validation (Table 1). A training R² well above the validation R² would indicate overfitting; here, the differences are small.

Table 1: Sample A and B statistical model fitting predictions.

	Sample A		Sample B
Measure	Training value	Validation value	Training value	Validation value
R²	0.9887916	0.9787637	1	0.9999999
RASE	0.0002279	0.0004126	1.8517 x 10^-7	4.8578 x 10^-7
Mean abs dev	0.0001933	0.0002962	1.5785 x 10^-7	4.2239 x 10^-7
-Log likelihood	-97.5472	-50.99224	-225.3291	-104.9486
SSE	7.2715 x 10^-7	1.3621 x 10^-6	5.486 x 10^-13	1.888 x 10^-12
Sum freq	14	8	16	8
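The fit measures in Table 1 can be computed from actual and predicted values as sketched below, assuming the usual definitions: SSE is the sum of squared residuals, RASE is the square root of SSE divided by the number of points, MAD is the mean absolute residual, and R² is 1 − SSE/SST. The input values are hypothetical and are not taken from the study's datasets.

```python
import math

def fit_metrics(actual, predicted):
    """Return R2, RASE, MAD, SSE and n for paired actual/predicted values."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    sse = sum(r * r for r in residuals)
    mean_actual = sum(actual) / n
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "R2": 1.0 - sse / sst,
        "RASE": math.sqrt(sse / n),
        "MAD": sum(abs(r) for r in residuals) / n,
        "SSE": sse,
        "n": n,
    }

# Hypothetical growth rates (1/h), for illustration only
actual = [0.0060, 0.0070, 0.0080, 0.0090]
predicted = [0.0061, 0.0069, 0.0082, 0.0089]
metrics = fit_metrics(actual, predicted)
print(metrics)
```

Under these definitions SSE = n × RASE², which gives a quick way to cross-check the SSE, RASE and Sum freq rows of a fit report against one another.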

Overfitting occurs when the model fits the training data almost perfectly but fails to generalize to unseen data. In such cases, the model may capture noise and idiosyncrasies in the training data, which results in a high R² for training. When applied to the validation dataset, the model performs poorly, resulting in a lower R². Conversely, a higher R² for validation and a lower R² for training are often a sign of underfitting. However, if R² values for both training and validation are reasonably high and close to each other, as shown in Table 1, the model is able to capture underlying patterns in the data without overfitting or underfitting ⁶. SSE can be interpreted in the same way. A training SSE close to zero implies that the model fits the training data almost perfectly, while a lower validation SSE indicates better generalization. Results in Table 1 approach the ideal scenario, in which validation SSE is reasonably close to training SSE for each sample, showing that the model learnt underlying patterns in the data without overfitting. Similarly, the low training RASE of 0.0002279 (Sample A) and 1.8517 x 10^-7 (Sample B) shows that the model predicted the training data with small errors.

MAD, or MAE, quantifies how far predictions are from true values on average, and it behaves consistently with R² on both training and validation datasets. If R² is high for the training dataset, MAD will typically be lower on it (e.g., 0.0001933 for Sample A). Conversely, a lower R² for the validation dataset comes with a higher MAD (viz. 0.0002962). Where R² values are nearly equal (dataset B), MAD can be used to select between models. To minimize prediction errors in practical applications where accuracy is crucial, the lower MAD (i.e., 1.5785 x 10^-7 for training) is preferable. The sum of frequencies shows how many data points are present in each dataset, which is crucial in assessing the reliability and generalizability of the NN predictive model. Larger datasets (as in Sample B) are often beneficial for training more robust and accurate models. The difference in “Sum Freq” between training and validation datasets does not in itself indicate an issue. “-Log likelihood” (negative log-likelihood) quantifies how well a statistical model’s predictions match observed data, with lower values (i.e., -97.5472 and -225.3291 for Samples A and B training, respectively) indicating a better fit.

Hidden layers’ structure

In a NN, hidden nodes, as earlier specified in the methodology, are computational units that are part of hidden layers, which stand between input (X and S blue box) and output layers (𝜇 square box), as shown in Fig. 5.

Figure 5: NN configuration for predicting 𝜇 from sample A and B datasets.

Hidden nodes are responsible for processing input data and learning complex patterns and representations from them ². A hidden layer with 3 hidden nodes contains 3 computational units. The units shown in Fig. 5 as green circles containing an ‘S’-shaped symbol apply the TanH activation function ¹⁴.
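The structure in Fig. 5 (two inputs, one hidden layer of three TanH nodes, one output) corresponds to a prediction of the form µ = b₀ + Σ wₖ·tanh(aₖ + bₖX + cₖS). The sketch below evaluates such a network. All weights are hypothetical placeholders, not the fitted values in Figs. 12 and 13; the inputs are assumed to be pre-scaled to a small range, and the exact scaling inside JMP's saved formulas may differ.

```python
import math

def predict_mu(x, s, hidden, output_bias, output_weights):
    """Single hidden layer: mu = b0 + sum_k w_k * tanh(a_k + b_k*x + c_k*s)."""
    activations = [math.tanh(a + b * x + c * s) for a, b, c in hidden]
    return output_bias + sum(w * h for w, h in zip(output_weights, activations))

# Hypothetical parameters for three hidden nodes: (intercept, weight on X, weight on S)
hidden_nodes = [(0.1, 0.8, -0.3), (-0.2, 0.4, 0.9), (0.05, -0.6, 0.5)]
mu_hat = predict_mu(0.5, 0.5, hidden_nodes, 0.007, [0.001, 0.002, -0.0005])
print(f"predicted mu = {mu_hat:.6f} 1/h")
```

Each hidden node contributes a bounded value between -1 and 1, so the output stays within the output bias plus or minus the sum of the absolute output weights.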

Because the same number of nodes was selected, Fig. 5 applies to both Sample A and Sample B data. Changing the number of hidden nodes from 3 to 5 or 10, for the Sample A dataset, affects the quality of fit and the R² values ⁴,¹¹. With 3, 4 and 5 hidden nodes, R² values were 0.9887916, 0.9678015 and 0.9926753 for training, and 0.9787637, 0.9786579 and 0.9786468 for validation. Initially, increasing the number of hidden nodes may improve R² for the training dataset, since a more complex model is able to learn intricate patterns in the data. A further increase in nodes (to 10) will lead to overfitting, as the model starts to fit noise or random variations in the training data, which results in an artificially high R². Likewise, increasing hidden nodes may at first improve R² for the validation dataset, because the model can capture more nuances in the data. Nevertheless, once the model starts to overfit the training data, R² for the validation dataset is likely to decrease. In general, the number of hidden nodes affects the prediction of the dependent variable (µ). A NN with more hidden nodes is computationally more complex, and it may require additional computational power and memory. A “black box” situation may also arise ², in which it becomes challenging to interpret how specific features or variables impact µ, owing to the larger number of hidden nodes.

Prediction profile under maximum desirability

In JMP software, prediction profiler is a tool used for exploring and visualizing X and S effects on predictions made by a statistical model. It is particularly useful for understanding how changes in input variables (predictors) impact predicted outcome (µ) from the model. As portrayed in Figs. 6 and 7, desirability refers to a measure used to assess the overall quality of a set of predicted outcomes for a given set of input conditions or factors.

Figure 6: Predicted SGR from sample A under X and S values’ maximal desirability.

In Fig. 6, a high desirability score of 0.994319 indicates that a combination of X = 899868717 mg/L and S = 4.62 x 10⁹ mg/L led to optimal or desirable values of µ = 0.010201 h^-1, which met the desired goal. This is because, when working with prediction profilers and desirability, the aim is to find factor settings or input conditions that maximize overall desirability score.

In the same way, for Sample B, the combination of X = 15351147 mg/L and S = 9.2322 x 10⁹ mg/L, at a high desirability of 0.981835, is the desired one, giving a maximum µ of 0.007316 h^-1 (Fig. 7).

Figure 7: Predicted SGR from sample B under maximal desirability of X and S values.

Simulation experiment

The ‘Simulation’ feature, under the prediction profiler in JMP, was used to perform Monte Carlo simulations that predict X and S values and the statistical properties (mean and SD) of µ for 40 experimental runs (Tables 2 and 3).
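The idea behind such a simulation can be sketched as follows: draw factor values at random within their limits, push each draw through the fitted model, and summarize the resulting µ values. This is only an illustration under stated assumptions. A Monod curve with the Sample A experimental parameters from Table 4 stands in for the fitted NN, only S is varied (the Monod form does not depend on X), the limits are the smallest and largest S values in Table 2, and the uniform distribution is an arbitrary choice.

```python
import random
import statistics

def monod(s, mu_max, k_s):
    """Monod specific growth rate for substrate concentration s."""
    return mu_max * s / (k_s + s)

def simulate_mu(n_draws, s_low, s_high, mu_max, k_s, seed=0):
    """Monte Carlo mean and SD of mu for S drawn uniformly in [s_low, s_high]."""
    rng = random.Random(seed)
    draws = [monod(rng.uniform(s_low, s_high), mu_max, k_s) for _ in range(n_draws)]
    return statistics.mean(draws), statistics.stdev(draws)

mean_mu, sd_mu = simulate_mu(5000, 2.31e9, 4.62e9, mu_max=0.01, k_s=9.49859e7)
print(f"mean mu = {mean_mu:.6f} 1/h, SD = {sd_mu:.2e} 1/h")
```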

Table 2: JMP predicted experimental runs for sample A dataset.

Run	X (mg/L)	S (mg/L)	µ mean (h^-1)	µ SD (h^-1)
1	830648046.83	3968461538.50	0.008959737	1.21E-17
2	819111268.40	2310000000.00	0.006560012	1.04E-17
3	888331938.97	3613076923.10	0.008731199	1.21E-17
4	784500933.12	4027692307.70	0.00881895	1.21E-17
5	461471137.13	3316923076.90	0.00617417	1.65E-17
6	634522813.55	3020769230.80	0.006670966	2.60E-18
7	530691807.70	3790769230.80	0.007191932	1.04E-17
8	484544693.98	3139230769.20	0.00605546	8.67E-18
9	669133148.84	2783846153.80	0.006513329	1.73E-18
10	449934358.70	2606153846.20	0.005140131	0
11	611449256.69	4383076923.10	0.008428288	1.21E-17
12	715280262.55	4501538461.50	0.009129081	2.43E-17
13	599912478.27	4560769230.80	0.00860883	2.26E-17
14	646059591.98	3198461538.50	0.006976319	1.73E-17
15	703743484.12	3435384615.40	0.007595131	1.91E-17
16	576838921.41	3080000000.00	0.006458603	1.73E-18
17	807574489.97	4146153846.20	0.009096647	1.21E-17
18	519155029.27	3909230769.20	0.007291284	1.73E-17
19	876795160.54	2487692307.70	0.007076688	5.20E-18
20	796037711.55	4620000000.00	0.009698155	1.56E-17
21	772964154.69	2428461538.50	0.006513781	1.65E-17
22	692206705.69	2369230769.20	0.006043987	1.56E-17
23	565302142.98	2902307692.30	0.006153266	8.67E-19
24	507618250.84	3257692307.70	0.006340537	1.04E-17
25	588375699.84	4323846153.80	0.008225424	0
26	865258382.12	4205384615.40	0.009456864	0
27	542228586.13	2961538461.50	0.006115486	8.67E-18
28	853721603.69	2665384615.40	0.007225682	6.07E-18
29	473007915.56	3731538461.50	0.006800266	3.47E-18
30	899868717.40	3553846153.80	0.008700551	1.21E-17
31	657596370.41	4086923076.90	0.008263504	1.21E-17
32	680669927.26	4264615384.60	0.008626408	1.21E-17
33	622986035.12	3672307692.30	0.007514357	1.73E-17
34	761427376.26	4442307692.30	0.009280284	1.21E-17
35	553765364.55	2843076923.10	0.006011707	1.56E-17
36	749890597.83	3376153846.20	0.007739604	1.13E-17
37	496081472.41	3850000000.00	0.00708624	0
38	842184825.26	2546923076.90	0.007003912	1.73E-17
39	738353819.41	2724615384.60	0.006768006	1.04E-17
40	726817040.98	3494615384.60	0.007792039	1.91E-17

A higher number of runs provides more precise estimates of the mean and SD of µ. The relationship between µ and the limiting S given in the tables, over 40 runs, can be described using the Monod equation (Eq. 2), for a typical controlled microbial environment such as the one to which Samples A and B were subjected.

$$\mu = \frac{\mu_{max}\,S}{K_s + S} \qquad (2)$$

where µ_max is the maximum SGR, i.e., the rate at which microorganisms can grow when S is not limiting, and K_s is the half-saturation constant, i.e., the value of S at which µ is half of µ_max. Since there are no estimates of µ_max and K_s at this stage to aid the selection of an optimal design, the run with the highest µ in Tables 2 and 3 may hint at the best combinations of X and S. For the Sample A dataset, the highest simulated µ in Table 2 is 0.009698155 h^-1, which is below the value obtained at maximum desirability (X = 899868717.4 mg/L and S = 4.62 x 10⁹ mg/L), as shown in Fig. 6. For the Sample B dataset, the highest simulated µ in Table 3 is 0.00728 h^-1, which is likewise below the value obtained at maximum desirability (X = 15351147.09 mg/L and S = 9232210402.5 mg/L), as shown in Fig. 7. These simulated estimates were not tested experimentally and compared with the ones given by JMP.
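As a quick consistency check on the Monod form, two substitutions are worked below. The first uses only the definition of K_s. The second plugs in the Sample B values of µ_max and K_s from Table 4 and the S value at maximal desirability (Fig. 7), so it is an illustration with the paper's own numbers rather than a new result.

```latex
\begin{align*}
S = K_s:&\quad \mu = \frac{\mu_{max}\,K_s}{K_s + K_s} = \frac{\mu_{max}}{2} \\
\text{Sample B}:&\quad \mu = \frac{0.00762 \times 9.2322 \times 10^{9}}{3.838 \times 10^{8} + 9.2322 \times 10^{9}} \approx 0.007316\ \mathrm{h^{-1}}
\end{align*}
```

The second value agrees with the µ of 0.007316 h^-1 reported for Sample B at maximal desirability.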

Table 3: JMP predicted experimental runs for sample B dataset.

Run	X (mg/L)	S (mg/L)	µ mean (h^-1)	µ SD (h^-1)
1	14367099.20	7811870340.60	0.007152	5.67255E-16
2	13776670.47	5562998575.90	0.006648	1.76074E-16
3	12005384.26	8166955356.00	0.00713	6.65267E-16
4	8856431.01	8640402043.30	0.007108	5.73326E-16
5	8659621.44	7456785325.10	0.006924	1.04951E-16
6	13383051.31	6864976965.90	0.006965	2.6975E-16
7	11808574.69	5918083591.30	0.006676	4.25875E-16
8	9053240.59	5089551888.50	0.006274	5.96745E-16
9	12989432.15	6509891950.50	0.006875	6.24501E-16
10	8266002.28	8877125387.00	0.007122	9.28077E-17
11	7872383.12	8522040371.50	0.007069	1.08507E-15
12	10627717.22	5326275232.20	0.006433	1.40599E-15
13	10824526.79	8995487058.80	0.007193	1.1163E-15
14	14170289.62	5799721919.50	0.006737	1.0365E-15
15	15154337.51	7930232012.40	0.007191	1.97759E-16
16	10234098.06	7575146996.90	0.006993	1.81105E-15
17	8462811.86	7220061981.40	0.006873	2.09902E-16
18	9250050.17	6154806935.00	0.006645	7.61544E-16
19	11611765.11	5681360247.70	0.006596	7.31186E-16
20	15351147.09	8758763715.20	0.00728	1.5786E-16
21	13973480.04	4852828544.90	0.006397	1.7486E-15
22	12595813.00	6273168606.80	0.006803	6.03684E-16
23	13579860.89	5444636904.00	0.006601	8.92515E-16
24	14957527.93	8285317027.90	0.007225	9.6971E-16
25	13186241.73	6983338637.80	0.006982	1.81279E-15
26	11414955.53	7101700309.60	0.006945	1.00614E-16
27	12792622.58	8048593684.20	0.007137	2.09902E-16
28	14563908.78	6628253622.30	0.006959	6.21031E-16
29	12202193.84	7693508668.70	0.007071	1.39559E-15
30	7675573.55	8403678699.70	0.007049	1.47452E-17
31	9643669.33	4616105201.20	0.006094	1.44503E-15
32	9446859.75	6746615294.10	0.006803	1.29931E-15
33	14760718.36	6391530278.60	0.006914	4.42355E-17
34	11021336.37	6036445263.20	0.006679	1.31319E-15
35	12399003.42	9232210402.50	0.007249	1.13798E-15
36	9840478.90	4971190216.70	0.00626	2.8276E-16
37	10430907.64	4734466873.10	0.006184	1.5786E-15
38	11218145.95	9113848730.70	0.007213	5.67255E-16
39	10037288.48	7338423653.30	0.006945	1.3262E-15
40	8069192.70	5207913560.40	0.006279	1.12497E-15

Surface profile

As previously demonstrated by ¹¹,¹⁵, the effect of the predictors (S and X) on the response can be simulated with the NN model and analyzed using 3D surface plots. Fig. 8 shows that µ increased gradually with X, before attaining µ_max of 0.015 h^-1, and then declined as nutrients became low. Likewise, µ increased as S decreased and X rose, as shown in Fig. 9.

Figure 8: Surface plot of X, S and 𝜇 showing patterns of changes in dataset A.

Figure 9: Surface plot of X, S and 𝜇 showing patterns of changes in dataset B.

Fitted plots and model equation command code

For the specified fit, plots of actual (Y-axis) against predicted (X-axis) values for the training and validation sets are shown in Figs. 10 and 11. They were generated by JMP from the number of data points (Sum Freq, or counts) given earlier in Table 1.

Figure 10: (a)- Actual vs. predicted correlation based on dataset A.

Figure 11: Actual vs. predicted correlation based on dataset B.

In both figures, the training plots contain more data points because the random data splitting process assigns more points to the training dataset. As previously described, MAD and R² are complementary metrics: a high R² usually corresponds to a lower MAD and a better model fit, and a low R² to a higher MAD and a worse fit (Fig. 10a and b). Fig. 11, for the chicken manure model fitting of µ, shows that R² values for training and validation datasets are equally high (Table 1). This suggests that the model performed well in explaining the variance in the target variable and generalized effectively to unseen data ⁹, which is a positive sign. However, if precise and accurate predictions are the priority of the analysis, the training model with the smaller error (SSE = 5.486 x 10^-13) should be chosen, based on its lower MAD. A model with higher MAD (validation) should be selected if more sensitivity to potential outliers or extreme cases is desired. Another indication of better fit in both training graphs is that their negative log-likelihoods are lower, while the higher values in validation plots suggest poorer fit ⁷.

Figs. 12 and 13 depict the NTanH(3) model equations (TanH activation with 3 hidden nodes). Notably, ¹³ utilized 2 hidden layers and both neural and fuzzy logic to optimize biogas yield from datasets of two reactor setups. The advantage of the TanH function is that it is nonlinear, which allows more intricate functions to be mapped, layers to be stacked and high precision to be obtained ¹⁴.

Figure 12: Model equation of 𝜇 for sample A dataset.

Figure 13: Model equation of 𝜇 for sample B dataset.

Experimental versus predicted 𝛍 datasets

In Samples A and B (Figs. 14 and 15), a higher µ means a faster rate of microbial growth, which depletes S more rapidly as the microorganisms consume the substrate. In that case, µ in the Monod equation represents microbial growth in relation to the limiting S. Predicted values (Figs. 14 and 15), obtained with the model equations in Figs. 12 and 13, showed an almost 100% fit, as demonstrated by the plots shown earlier. When regression was performed using OriginPro 2018, the µ_max and K_s values obtained for experimental and predicted datasets were almost equal (Table 4). Although the model equation for predicted µ of dataset A was reliable, giving µ_max = 0.00997 h^-1 and K_s = 9.12975 x 10⁷ mg/L (against 0.01 h^-1 and 9.498595 x 10⁷ mg/L for the experimental data), the parameters estimated from the experimental results gave the best combination, coupled with their corresponding X and S based on maximal desirability. The model equation for Sample B accurately mimicked Eq. 2, since it gave practically equal experimental and predicted µ_max and K_s values (0.00762 h^-1 and 3.838 x 10⁸ mg/L, respectively) and suited the chicken manure data (Fig. 13). These parameters, along with the maximal desirability of X and S (Fig. 7), indicate that Sample B was the optimal combination.

Figure 14: Sample A’s experimental dataset and predicted SGR.

The constant "H" in Figs. 14 and 15 likely represents a parameter in the model equations used for predicting µ of microorganisms from the X and S input variables. A positive "H" constant may suggest that the conditions or factors it represents have a stimulatory effect on microbial growth, indicating a direct relationship with µ, i.e., they increase together. Conversely, a negative "H" constant may imply inhibitory factors or conditions that negatively impact microbial growth, indicating an inverse relationship with µ. In this case, as the "H" value decreased, µ increased.

Parameters such as µ_max and K_s allow scientists to predict how fast microbial growth is at a specific nutrient Ct. This is quite important in this study, since µ directly impacts biogas generation and the associated production rates when Sample A or B is used.

Knowledge of these parameters helps to optimize S in the respective bioreactors in order to maximize microbial growth. In the Monod equation, an increase in µ_max leads to a higher µ achievable under optimal conditions. A higher µ_max implies higher microbial growth potential, which can lead to faster substrate consumption and, consequently, a decrease in S over time ¹.

Figure 15: Sample B’s experimental dataset and predicted SGR.

Table 4: Statistics and estimated Monod parameters.

	Sample A		Sample B
	Experimental	Predicted	Experimental	Predicted
Number of points	22	22	24	24
Degrees of freedom	20	20	22	22
Reduced chi-square	4.98155E-12	1.01704E-07	1.06662E-19	4.54345E-14
Residual sum of squares	9.46310E-11	2.03408E-06	2.34655E-18	9.99560E-13
R² (COD)	1	0.98769	1	1
Adj. R²	1	0.98707	1	1
Fit status	Succeeded (100)	Succeeded (100)	Succeeded (100)	Succeeded (101)
µ_max (h^-1)	0.01	0.00997	0.00762	0.00762
K_s (mg/L)	9.49859E+07	9.12975E+07	3.838E+08	3.83747E+08

Conclusion

R² values near or equal to 1, obtained on training and validation sets, show that the TanH neural models for the Sample A and B datasets predicted µ correctly. For Sample A, with a desirability score of 99.43%, µ_max of 0.01 h^-1 and K_s of 9.498595 x 10⁷ mg/L, the combination of X and S (899868717 mg/L and 4.62 x 10⁹ mg/L, respectively), giving an optimal µ of 0.010201 h^-1, was valid for the experimental observations. For Sample B, the estimated Monod parameters µ_max and K_s (0.00762 h^-1 and 3.838 x 10⁸ mg/L, respectively), together with 98.18% desirability for X and S (15351147 and 9.2322 x 10⁹ mg/L), defined the optimal microbial growth condition, with a maximum µ of 0.007316 h^-1, which was valid for both empirical and predicted outputs. Overall, Sample B’s anaerobic chamber produced higher biogas yields than Sample A’s bioreactor. This was due to the identical R² and kinetic parameter values obtained for experimental and forecasted µ. In future research, neural predictive modeling of the two datasets with linear and Gaussian activation functions can be run in JMP, and the outcome compared with the estimates in this study and with the TanH equation created here. In addition, varying the number of hidden nodes may improve R² and predicted values, especially for Sample B estimates, which still fall below the observed variables and obtained parameters. Applying advanced statistical tools like JMP and NN allows for data-driven decision making, which can result in optimized process parameters and improved waste-to-resource conversion.

Declaration of interests

Authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work herein reported.

Authors’ contributions

A. M. Abubakar: study conceptualization; integrated new ideas with existing literature; analyzed results using JMP and OriginPro; studied and discussed results. N. Elboughdiri: carried out literature survey; reviewed and edited the manuscript; provided project administration and support. A. Chibani: chose the research problem; gave outlines for the manuscript preparation. E. C. Nneka and D. Ghernaout: reviewed and edited the manuscript; guided the paper writing. M. U. Yunus: completed experimental part; helped in raw material sourcing.

Abbreviations

ANN: artificial neural network

CCD: central composite design

COD: coefficient of determination (R²)

DOE: design of experiments

MAD: mean absolute deviation

MAE: mean absolute error

NA: nutrient agar

NN: neural network

RASE: root average squared error

RMSE: root mean square error

RSM: response surface methodology

SAS: statistical analysis system

SC: substrate concentration

SD: standard deviation

SGR: specific growth rate

SSE: sum of squared errors

TanH: hyperbolic tangent

References

1. Abubakar AM, Soltanifar Z, Kida MM. Sensitivity analysis of kinetic growth model data: Monod equation. Schol: J Nat Med Educ. 2022;1:1-13.

2. Chen W-Y, Chan YJ, Lim JW, et al. Artificial neural network (ANN) modelling for biogas production in pre-commercialized integrated anaerobic-aerobic bioreactors (IAAB). Water. 2022;14(1410):1-36. DOI: https://doi.org/10.3390/w14091410

3. Cruiz IA, Nascimento VRS, Felisardo RJA, et al. Evaluation of artificial neural network models for predictive monitoring of biogas production from cassava wastewater: Comparison of training algorithms. Biom Bioener. 2023;175(106869):1-24. DOI: https://doi.org/10.1016/j.biombioe.2023.106869

4. Fakharudin AS, Sulaiman N, Salihon J, et al. Implementing artificial neural networks and genetic algorithms to solve modeling and optimisation of biogas production. Proceed 4th Inter Conf Comp Inform. 2013;088:121-126.

5. Ghatak MD, Ghatak A. Artificial neural network model to predict behavior of biogas production curve from mixed lignocellulosic co-substrates. Fuel. 2018;232:178-189. DOI: https://doi.org/10.1016/j.fuel.2018.05.051

6. Jaroenpoj S, Yu QJ, Ness J. Development of artificial neural network models for biogas production from co-digestion of leachate and pineapple peel. Glob Environ Eng. 2014;1(2):42-47. DOI: https://doi.org/10.15377/2410-3624.2014.01.02.2

7. Klimberg R, McCullough BD. Logistic regression. In R. Klimberg & B. D. McCullough (Eds.). Fundamentals of Predictive Analytics with JMP. 2nd ed. SAS Institute Inc; 2012. 104-133.

8. Mougari NE, Largeau JF, Himrane N, et al. Application of artificial neural network and kinetic modeling for the prediction of biogas and methane production in anaerobic digestion of several organic wastes. Int J Green Ener. 2021;18(15):1584-1596. DOI: https://doi.org/10.1080/15435075.2021.1914630

9. Obileke K, Tangwe S, Makaka G, et al. Comparison of prediction of biogas yield in a batch mode underground fixed dome digester with cow dung. Biom Conv Biorefin. 2023;14(20):26427-42. DOI: https://doi.org/10.1007/s13399-023-04593-z

10. Olufunmi AO. Microbiological potentials of co-digestion of chicken droppings and banana peels as substrates for biogas production. J Chem Pharm Res. 2014;6(4):1088-92.

11. Palaniswamy D, Ramesh G, Sivasankaran S, et al. Optimising biogas from food waste using a neural network model. Proceed Instit Civ Eng Munic Eng. 2017;170(ME4):221-29. DOI: https://doi.org/10.1680/jmuen.16.00008

12. Qdais HA, Hani KB, Shatnawi N. Modeling and optimization of biogas production from a waste digester using artificial neural network and genetic algorithm. Res Cons Recyc. 2010;54(6):359-63. DOI: https://doi.org/10.1016/j.resconrec.2009.08.012

13. Rego ASC, Leite SAF, Leite BS, et al. Artificial neural network modelling for biogas production in biodigesters. Chem Eng Transac. 2019;74:25-30. DOI: https://doi.org/10.3303/CET1974005

14. Rodriguez-Granrose D, Jones A, Loftus H, et al. Design of experiment (DOE) applied to artificial neural network architecture enables rapid bioprocess improvement. Bioproc Biosyst Eng. 2021;44(1):1-7. DOI: https://doi.org/10.1007/s00449-021-02529-3

15. Sakiewicz P, Piotrowski K, Ober J, et al. Innovative artificial neural network approach for integrated biogas - wastewater treatment system modelling: Effect of plant operating parameters on process intensification. Renew Sustain Energ Rev. 2020;124(109784):1-14. DOI: https://doi.org/10.1016/j.rser.2020.109784

16. Sudirman A, Rodríguez-Nieto CA, Dhidden LZB, et al. Ways of thinking 3D geometry: Exploratory case study in junior high school students. Polyhed Int J Math Educ. 2023;1(1):15-34. https://nakiscience.com/index.php/pijme

17. Um-e-Habiba, Khan MS, Raza W, et al. A study on the reaction kinetics of anaerobic microbes using batch anaerobic sludge technique for beverage industrial wastewater. Separations. 2021;8(43):1-16. DOI: https://doi.org/10.3390/separations8040043

Received: November 23, 2023; Accepted: April 18, 2024

Corresponding author: abdulhalim@mau.edu.ng

This is an open-access article distributed under the terms of the Creative Commons Attribution License