Spawning Gradient Descent (SpGD): A Novel Optimization Framework for Machine Learning and Deep Learning

Authors: معین الدین شیخ الطایفه, زهرا اسماعیلی طاهری, فرشته دهقانی
Journal: SN Computer Science
Pages: 1
Volume: 6
Impact Factor (IF): Not recorded
Article type: Full Paper
Publication date: 2025-02-18
Journal rank: Scientific-Research
Journal type: Electronic
Country of publication: Iran
Journal indexing: ISC, SCOPUS

Abstract

First-order optimization methods that leverage gradient information are fundamental for solving problems across diverse domains due to their scalability and computational efficiency. Despite their effectiveness, traditional methods like Gradient Descent (GD) often face challenges related to noise, scalability, and convergence to local optima. This paper introduces Spawning Gradient Descent (SpGD), a novel algorithm that enhances gradient-based optimization by selecting appropriate starting points, dynamically adjusting the learning rate through the proposed Augmented Gradient Descent (AGD) algorithm, and optimizing movement patterns. The AGD mechanism enables dynamic learning rate adjustment by comparing the gradient signs at the newly generated point with those at the current point for each dimension, allowing the optimization process to adapt to the current search state. These innovations mitigate zigzagging, improve initial positioning, and eliminate the need for manual learning rate tuning. By incorporating controlled randomization, SpGD addresses key limitations of traditional methods. Experimental results demonstrate that SpGD achieves enhanced accuracy, resolves step-size reduction constraints, and generates randomized points with superior efficiency. Even with modest computational resources, SpGD facilitates improved exploration of the solution space and increases the precision of locating global minima, despite computational challenges. The Spawning Gradient Descent (SpGD) algorithm demonstrates superior performance across both convex and non-convex benchmarks, significantly outperforming optimizers like GD, Adam, RAdam, and AdaBelief. For instance, on the Quadratic function, SpGD achieves a near-zero performance error of 1.7e−11, far surpassing AdaBelief (0.061). On non-convex functions such as Ackley, Schaffer, and Rastrigin, SpGD consistently achieves better proximity to the global optimum. For example, on the Ackley function, SpGD achieves a performance error of 0.005, significantly better than Momentum (0.429). While SpGD’s execution time is marginally higher, the increase is modest and justified by its significantly better precision. The SpGD algorithm also demonstrates superior performance in deep learning models, achieving faster convergence and higher accuracy on the CIFAR-10 and Fashion-MNIST datasets using ResNet-20 and DenseNet-19. SpGD achieves 80% and 85% accuracy on CIFAR-10 in just 28 and 20 epochs, respectively, significantly outperforming SRSGD, which requires up to 70 epochs to achieve comparable results. These findings underscore SpGD’s potential for efficient training of large-scale neural networks with reduced computational time. The implementation details and source code for all experiments in this study are available on GitHub at https://github.com/z-esmaily/Spawn_Gradient_Descent/tree/main
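To make the per-dimension sign-comparison idea concrete, the sketch below illustrates one step of such an adjustment in Python. It is a minimal illustration of the mechanism described in the abstract, not the authors' reference implementation: the function name `agd_step` and the `grow`/`shrink` factors are assumptions chosen for the example; the paper's actual AGD rule and constants may differ (see the linked GitHub repository for the official code).

```python
import numpy as np

def agd_step(x, grad_fn, lr, grow=1.2, shrink=0.5):
    """Illustrative sign-based, per-dimension learning-rate adjustment.

    A candidate point is generated with the current per-dimension rates.
    For each dimension, if the gradient sign at the candidate matches the
    sign at the current point, the rate is increased (no overshoot yet);
    if the sign flips, the rate is decreased (the step likely jumped across
    a minimum, i.e. zigzagging). Factors 1.2 / 0.5 are arbitrary choices
    for this sketch, not values from the paper.
    """
    g = grad_fn(x)                 # gradient at the current point
    x_new = x - lr * g             # candidate point from current rates
    g_new = grad_fn(x_new)         # gradient at the candidate point

    same_sign = np.sign(g) == np.sign(g_new)
    lr = np.where(same_sign, lr * grow, lr * shrink)  # adapt each dimension
    return x_new, lr


# Usage example on a simple quadratic f(x) = sum(x_i^2), gradient 2x.
if __name__ == "__main__":
    x = np.array([3.0, -4.0])
    lr = np.full_like(x, 0.1)
    for _ in range(50):
        x, lr = agd_step(x, lambda v: 2.0 * v, lr)
    print(x)  # close to the global minimum at the origin
```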

tags: Enhancing gradient descent · Adaptive learning rates · Deep learning · Controlled randomization