Recent Advances in Mobile SoC: Design for Performance and Power Efficiency
Journal of Artificial Intelligence and Big Data | Vol. 3, Issue 1
Table 1. Comparative Analysis and Research Gaps in Mobile SoC Performance and Power Efficiency Studies
| Reference | Study Focus | Key Findings | Challenges Identified | Limitations | Future Work |
|---|---|---|---|---|---|
| BenSaleh, Qasim & Obeid (2019) | Design of advanced SoC architecture integrating ARM host processor, high-speed IPs & peripherals | Provided a modular SoC design with rich connectivity (I2C, SPI, UART, watchdog, TPM, etc.) suitable for mobile devices | Complexity of integrating multiple heterogeneous IP cores while maintaining power efficiency | Lacks performance/power optimization evaluation; no experimental results on energy consumption | Develop power-aware integration frameworks, evaluate on real-world mobile workloads, explore security–performance trade-offs |
| Zhu, Mattina & Whatmough (2018) | System-level co-design for ML-intensive mobile pipelines (camera → ISP → memory → NN accelerator) | Demonstrated that co-optimizing entire mobile vision pipeline yields better efficiency than focusing only on ML accelerators | Difficulty in coordinating multiple heterogeneous SoC components; data transfer overhead | Study limited to continuous computer vision pipelines only | Extend co-design to multi-stage AR/VR/ADAS applications, integrate scheduling and thermal management, generalize framework to next-gen ML accelerators |
| Kuo et al. (2018) | Dynamic compact thermal modeling (DCTM) for heterogeneous CPU–GPU SoCs | Proposed a simplified yet accurate thermal model capturing CPU–GPU coupling; achieved <1.58°C error in validation | Thermal coupling complicates power management; real-time thermal-aware decision making | Does not integrate with run-time DVFS or task schedulers; modeled specific smartphone environments | Develop real-time thermal-aware schedulers, integrate DCTM with adaptive DVFS, expand model to NPU/AI accelerators |
| Kumaki, Koide & Fujino (2017) | Secure data processing using a SIMD matrix processor (MX-1) with interleaved-bitslice cipher implementation | Achieved up to 93% fewer clock cycles for AES; energy efficiency 4.8× higher than ARM Cortex-A8 | Balancing high-speed cryptography with limited mobile energy budgets | Focused on block cipher workloads only; limited discussion of integration into full SoC | Expand to post-quantum cryptography, integrate MX-1-like accelerators into heterogeneous SoCs, evaluate thermal impact |
| Lee, Jang & Kim (2016) | CPU–GPU parallelization to accelerate Viola–Jones face detection on mobile devices | Achieved 3.3–6.29× speedups using optimized parallelism (task parallelism, sliding windows, thread allocation) | Irregular memory access and unbalanced workloads limit GPU utilization | Study focuses on a single computer-vision algorithm; outdated relative to modern AI workloads | Extend to deep learning-based CV algorithms, apply dynamic scheduling for heterogeneity, analyze energy vs. performance trade-offs |
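The Kuo et al. row flags integrating a compact thermal model with run-time DVFS as open future work. The idea can be sketched as a governor that predicts the next-step die temperature under each candidate frequency and selects the highest one that stays under a thermal cap. The sketch below is purely illustrative: the first-order RC thermal model, the f³ power curve, and all numeric parameters are assumptions for demonstration, not values taken from Kuo et al. (2018).

```python
# Illustrative thermal-aware DVFS governor (a sketch, not Kuo et al.'s DCTM).
# A first-order RC thermal model predicts the next temperature for each
# candidate frequency; the governor picks the highest frequency under the cap.

FREQS_GHZ = [0.8, 1.2, 1.6, 2.0]  # candidate CPU frequencies (assumed)
T_AMBIENT = 30.0                  # ambient temperature, deg C (assumed)
T_CAP = 70.0                      # thermal limit, deg C (assumed)
R, C = 2.0, 5.0                   # thermal resistance/capacitance (assumed)
DT = 1.0                          # control interval, s

def dynamic_power(freq_ghz: float) -> float:
    """Toy dynamic-power model: P grows roughly with f^3, since voltage
    is typically scaled together with frequency under DVFS."""
    return 4.0 * freq_ghz ** 3

def predict_temp(t_now: float, freq_ghz: float) -> float:
    """One explicit Euler step of the first-order RC thermal model:
    dT/dt = P/C - (T - T_ambient)/(R*C)."""
    p = dynamic_power(freq_ghz)
    return t_now + DT * (p / C - (t_now - T_AMBIENT) / (R * C))

def pick_frequency(t_now: float) -> float:
    """Highest frequency whose predicted temperature stays below the cap;
    falls back to the lowest frequency when every choice would exceed it."""
    for f in sorted(FREQS_GHZ, reverse=True):
        if predict_temp(t_now, f) < T_CAP:
            return f
    return min(FREQS_GHZ)

if __name__ == "__main__":
    print(pick_frequency(40.0))  # cool die: runs at full speed (2.0)
    print(pick_frequency(69.5))  # near the cap: throttles to 1.6
```

In a real scheduler this prediction would come from a calibrated model such as DCTM, which additionally captures CPU–GPU thermal coupling rather than treating the die as a single thermal node.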