| Authors | معین الدین شیخ الطایفه, زهرا اسماعیلی طاهری, فرشته دهقانی |
| Journal | SN Computer Science |
| Page number | 1 |
| Volume number | 6 |
| IF | Not recorded |
| Paper Type | Full Paper |
| Published At | 2025-02-18 |
| Journal Grade | Scientific-Research |
| Journal Type | Electronic |
| Journal Country | Iran, Islamic Republic Of |
| Journal Index | ISC, SCOPUS |
Abstract
First-order optimization methods that leverage gradient information are fundamental for solving problems across diverse
domains due to their scalability and computational efficiency. Despite their effectiveness, traditional methods like Gradient
Descent (GD) often face challenges related to noise, scalability, and convergence to local optima. This paper introduces
Spawning Gradient Descent (SpGD), a novel algorithm that enhances gradient-based optimization by selecting appropriate
starting points, dynamically adjusting the learning rate through the proposed Augmented Gradient Descent (AGD) algorithm, and optimizing movement patterns. The AGD mechanism enables dynamic learning rate adjustment by comparing
the gradient signs at the newly generated point with those at the current point for each dimension, allowing the optimization
process to adapt to the current search state. These innovations mitigate zigzagging, improve initial positioning, and eliminate
the need for manual learning rate tuning. By incorporating controlled randomization, SpGD addresses key limitations of
traditional methods. Experimental results demonstrate that SpGD achieves enhanced accuracy, resolves step-size reduction
constraints, and generates randomized points with superior efficiency. Even with modest computational resources, SpGD facilitates improved exploration of the solution space and more precise location of global minima. SpGD demonstrates superior performance across both convex and non-convex benchmarks, significantly outperforming optimizers such as GD, Adam, RAdam, and AdaBelief. For
instance, on the Quadratic function, SpGD achieves a near-zero performance error of 1.7e−11, far surpassing AdaBelief
(0.061). On non-convex functions such as Ackley, Schaffer, and Rastrigin, SpGD consistently achieves better proximity to
the global optimum. For example, on the Ackley function, SpGD achieves a performance error of 0.005, significantly better
than Momentum (0.429). While SpGD’s execution time is marginally higher, this overhead is justified by its significantly better precision. SpGD also demonstrates superior performance in deep learning models, achieving
faster convergence and higher accuracy on the CIFAR-10 and Fashion-MNIST datasets using ResNet-20 and DenseNet-19.
SpGD achieves 80% and 85% accuracy on CIFAR-10 in just 28 and 20 epochs with ResNet-20 and DenseNet-19, respectively, significantly outperforming SRSGD, which requires up to 70 epochs to reach comparable results. These findings underscore SpGD’s potential for efficient training of large-scale neural networks with reduced computational time. The implementation details and source code for all experiments in this study are available on GitHub at https://github.com/z-esmaily/Spawn_Gradient_Descent/tree/main
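To make the AGD mechanism described in the abstract concrete, the following minimal sketch applies the per-dimension gradient-sign comparison to a simple quadratic objective. The function name `agd_step` and the grow/shrink factors are illustrative assumptions for this sketch, not values taken from the paper; the authors' actual implementation is in the linked repository.

```python
import numpy as np

def agd_step(f_grad, x, lr, grow=1.2, shrink=0.5):
    """One illustrative AGD-style update: the per-dimension learning rate is
    increased when the gradient sign at the candidate point matches the sign
    at the current point (no overshoot) and decreased when the signs flip
    (the step crossed a minimum along that dimension). The grow/shrink
    factors are assumptions for this sketch, not values from the paper."""
    g = f_grad(x)
    candidate = x - lr * g        # tentative move from the current point
    g_new = f_grad(candidate)     # gradient at the newly generated point

    same_sign = np.sign(g_new) == np.sign(g)
    lr = np.where(same_sign, lr * grow, lr * shrink)  # per-dimension adjustment

    return candidate, lr

# Usage on a simple quadratic f(x) = sum(x_i^2), whose gradient is 2x.
grad = lambda x: 2.0 * x
x = np.array([3.0, -4.0])
lr = np.full_like(x, 0.1)
for _ in range(50):
    x, lr = agd_step(grad, x, lr)
print(x)  # approaches the global minimum at the origin
```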
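The abstract also credits SpGD with selecting appropriate starting points through controlled randomization. The wrapper below is a hedged guess at what spawning randomized points could look like when combined with the `agd_step` sketch above; `spawn_and_descend`, the number of spawns, the sampling box, and the gradient-norm selection criterion are all assumptions for illustration, not the paper's algorithm.

```python
def spawn_and_descend(f_grad, bounds, n_spawn=5, steps=100, lr0=0.1, seed=0):
    """Hypothetical multi-start wrapper: spawn several randomized starting
    points within the given box, run AGD-style updates from each, and keep
    the point with the smallest gradient norm. All names and parameters here
    are illustrative assumptions, not the paper's method."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    best_x, best_score = None, np.inf
    for _ in range(n_spawn):
        x = rng.uniform(low, high)            # controlled random starting point
        lr = np.full_like(x, lr0)
        for _ in range(steps):
            x, lr = agd_step(f_grad, x, lr)
        score = np.linalg.norm(f_grad(x))     # proxy for closeness to a stationary point
        if score < best_score:
            best_x, best_score = x, score
    return best_x

# Example: two-dimensional quadratic with its global minimum at the origin.
best = spawn_and_descend(lambda x: 2.0 * x,
                         bounds=(np.array([-5.0, -5.0]), np.array([5.0, 5.0])))
print(best)
```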