Recent Advances in Mobile SoC: Design for Performance and Power Efficiency
Journal of Artificial Intelligence and Big Data | Vol. 3, Issue 1
Table 1. Comparative Analysis and Research Gaps in Mobile SoC Performance and Power Efficiency Studies
| Reference | Study Focus | Key Findings | Challenges Identified | Limitations | Future Work |
|---|---|---|---|---|---|
| BenSaleh, Qasim & Obeid (2019) | Design of advanced SoC architecture integrating ARM host processor, high-speed IPs & peripherals | Provided a modular SoC design with rich connectivity (I2C, SPI, UART, watchdog, TPM, etc.) suitable for mobile devices | Complexity of integrating multiple heterogeneous IP cores while maintaining power efficiency | Lacks performance/power optimization evaluation; no experimental results on energy consumption | Develop power-aware integration frameworks, evaluate on real-world mobile workloads, explore security–performance trade-offs |
| Zhu, Mattina & Whatmough (2018) | System-level co-design for ML-intensive mobile pipelines (camera → ISP → memory → NN accelerator) | Demonstrated that co-optimizing entire mobile vision pipeline yields better efficiency than focusing only on ML accelerators | Difficulty in coordinating multiple heterogeneous SoC components; data transfer overhead | Study limited to continuous computer vision pipelines only | Extend co-design to multi-stage AR/VR/ADAS applications, integrate scheduling and thermal management, generalize framework to next-gen ML accelerators |
| Kuo et al. (2018) | Dynamic compact thermal modeling (DCTM) for heterogeneous CPU–GPU SoCs | Proposed a simplified yet accurate thermal model capturing CPU–GPU coupling; achieved <1.58°C error in validation | Thermal coupling complicates power management; real-time thermal-aware decision making | Does not integrate with run-time DVFS or task schedulers; modeled specific smartphone environments | Develop real-time thermal-aware schedulers, integrate DCTM with adaptive DVFS, expand model to NPU/AI accelerators |
| Kumaki, Koide & Fujino (2017) | Secure data processing using a SIMD matrix processor (MX-1) with interleaved-bitslice cipher implementation | Achieved up to 93% fewer clock cycles for AES; energy efficiency 4.8× higher than ARM Cortex-A8 | Balancing high-speed cryptography with limited mobile energy budgets | Focused on block cipher workloads only; limited discussion of integration into full SoC | Expand to post-quantum cryptography, integrate MX-1-like accelerators into heterogeneous SoCs, evaluate thermal impact |
| Lee, Jang & Kim (2016) | CPU–GPU parallelization to accelerate Viola–Jones face detection on mobile devices | Achieved 3.3–6.29× speedups using optimized parallelism (task parallelism, sliding windows, thread allocation) | Irregular memory access and unbalanced workloads limit GPU utilization | Study focuses on a single computer-vision algorithm; outdated relative to modern AI workloads | Extend to deep learning-based CV algorithms, apply dynamic scheduling for heterogeneity, analyze energy vs. performance trade-offs |
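The Kuo et al. row flags integrating a compact thermal model with run-time DVFS as open future work. The idea can be sketched as a governor that predicts the next-step die temperature under each candidate frequency and selects the highest one that stays under a thermal cap. The sketch below is purely illustrative: the first-order RC thermal model, the f³ power curve, and all numeric parameters are assumptions for demonstration, not values taken from Kuo et al. (2018).

```python
# Illustrative thermal-aware DVFS governor (a sketch, not Kuo et al.'s DCTM).
# A first-order RC thermal model predicts the next temperature for each
# candidate frequency; the governor picks the highest frequency under the cap.

FREQS_GHZ = [0.8, 1.2, 1.6, 2.0]  # candidate CPU frequencies (assumed)
T_AMBIENT = 30.0                  # ambient temperature, deg C (assumed)
T_CAP = 70.0                      # thermal limit, deg C (assumed)
R, C = 2.0, 5.0                   # thermal resistance/capacitance (assumed)
DT = 1.0                          # control interval, s

def dynamic_power(freq_ghz: float) -> float:
    """Toy dynamic-power model: P grows roughly with f^3, since voltage
    is typically scaled together with frequency under DVFS."""
    return 4.0 * freq_ghz ** 3

def predict_temp(t_now: float, freq_ghz: float) -> float:
    """One explicit Euler step of the first-order RC thermal model:
    dT/dt = P/C - (T - T_ambient)/(R*C)."""
    p = dynamic_power(freq_ghz)
    return t_now + DT * (p / C - (t_now - T_AMBIENT) / (R * C))

def pick_frequency(t_now: float) -> float:
    """Highest frequency whose predicted temperature stays below the cap;
    falls back to the lowest frequency when every choice would exceed it."""
    for f in sorted(FREQS_GHZ, reverse=True):
        if predict_temp(t_now, f) < T_CAP:
            return f
    return min(FREQS_GHZ)

if __name__ == "__main__":
    print(pick_frequency(40.0))  # cool die: runs at full speed (2.0)
    print(pick_frequency(69.5))  # near the cap: throttles to 1.6
```

In a real scheduler this prediction would come from a calibrated model such as DCTM, which additionally captures CPU–GPU thermal coupling rather than treating the die as a single thermal node.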