Abstract
This paper investigates a load scheduling problem essential for optimizing GPU performance in parallel processing environments. We formulate the problem as (Formula presented.) which integrates elements of flexible flow shop and parallel machine scheduling. In this model, a single machine at the first stage is followed by m parallel machines at the second stage; each job (Formula presented.) is processed first on the single machine and then simultaneously on a set of (Formula presented.) machines, with the goal of minimizing the makespan. The variant (Formula presented.) requires that the (Formula presented.) machines at the second stage be consecutive (i.e., contiguous in the given order). Focusing on the special case (Formula presented.) which equals to (Formula presented.) coincidentally, we propose a Polynomial Time Approximation Scheme (PTAS) based on a novel three-parameter scheduling system. This PTAS is expandable to more general cases involving a constant number of machines at the second stage. Our approach improves the best-known approximation ratio and achieves optimal approximability under the strong NP-hardness of the problem.
Original language | English |
---|---|
Journal | Journal of the Operational Research Society |
DOIs | |
State | Published - May 17 2025 |
Scopus Subject Areas
- Modeling and Simulation
- Strategy and Management
- Statistics, Probability and Uncertainty
- Management Science and Operations Research
Keywords
- GPU load scheduling
- Two-stage flexible flow shop scheduling
- integer linear programming
- polynomial time approximation scheme