Run a GPU burn test
Stop the Runpod agent
Stop the Runpod agent to prevent the machine from accepting jobs during testing.
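A minimal sketch of this step, assuming the agent is installed as a systemd service. The unit name `runpod` and the helper name `stop_agent` are assumptions for illustration; check your own installation for the actual unit name.

```shell
# Assumption: the agent runs as a systemd unit called "runpod".
# Override AGENT_UNIT if your installation uses a different name.
AGENT_UNIT="${AGENT_UNIT:-runpod}"

# Hypothetical helper: stop the agent and confirm it is no longer active.
stop_agent() {
  sudo systemctl stop "$AGENT_UNIT"
  # "is-active --quiet" exits 0 only while the unit is still running.
  if systemctl is-active --quiet "$AGENT_UNIT"; then
    echo "agent still running" >&2
    return 1
  fi
  echo "agent stopped"
}
```

Call `stop_agent` before starting any tests. Once testing is finished, the agent can be started again with `sudo systemctl start "$AGENT_UNIT"`, under the same unit-name assumption.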
Run the burn test
Run a GPU burn test for at least 48 hours (172800 seconds) to verify stability under load.
Monitor GPU temperatures and watch for any errors during the test.
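One way to launch the burn test is through a containerized build of gpu-burn, which takes the run time in seconds as its argument. The sketch below assumes Docker with the NVIDIA container toolkit is installed. The image name `jorghi21/gpu-burn-test` is an assumption; substitute whichever gpu-burn image or local build you trust. `burn_seconds` and `run_burn` are hypothetical helper names.

```shell
# Assumptions: Docker with NVIDIA GPU support is installed, and the image
# below is a gpu-burn build that accepts the duration in seconds.
BURN_HOURS="${BURN_HOURS:-48}"
BURN_IMAGE="${BURN_IMAGE:-jorghi21/gpu-burn-test}"

# Convert hours to seconds: 48 h * 3600 s/h = 172800 s.
burn_seconds() {
  echo $(( $1 * 3600 ))
}

# Run the burn test on all GPUs for BURN_HOURS hours.
run_burn() {
  docker run --gpus all --rm "$BURN_IMAGE" "$(burn_seconds "$BURN_HOURS")"
}
```

Calling `run_burn` inside a `tmux` or `screen` session keeps the test alive if your SSH connection drops during the 48 hours.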
Verify CPU, memory, and storage
Use stress-ng to test other system components.
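As a rough sketch, a single stress-ng invocation can load CPU, memory, and disk at the same time. The worker counts, memory percentage, and one-hour timeout below are illustrative assumptions, not values from this guide; tune them to your hardware. `run_stress` is a hypothetical helper name.

```shell
# Assumption: stress-ng is installed (for example via your package manager).
STRESS_TIMEOUT="${STRESS_TIMEOUT:-1h}"

# Load CPU, memory, and storage together for STRESS_TIMEOUT.
#   --cpu 0         one CPU worker per online core
#   --vm 2          two virtual-memory workers
#   --vm-bytes 75%  memory size for the vm stressor, as a share of available RAM
#   --hdd 1         one worker writing temporary files in the current directory
#   --verify        check results of the work done where the stressor supports it
run_stress() {
  stress-ng --cpu 0 --vm 2 --vm-bytes 75% --hdd 1 \
    --verify --metrics-brief --timeout "$STRESS_TIMEOUT"
}
```

Run `run_stress` from a directory on the disk you want to exercise, since the `--hdd` worker writes its temporary files to the current directory.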
Self-rent your machine
Go to your machine dashboard and self-rent the machine. Test it with popular templates to verify everything works correctly with real workloads.
What to watch for
- GPU temperature: Sustained temperatures above 85°C may indicate cooling issues.
- Memory errors: Any GPU memory errors during the burn test require investigation.
- System stability: Crashes, freezes, or unexpected reboots indicate hardware problems.
- Performance degradation: Significant performance drops over time may indicate thermal throttling.
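The temperature check in the list above can be partly automated. The sketch below assumes `nvidia-smi` and `awk` are available. `check_temps` is a hypothetical helper that reads `index, temperature` lines and flags any GPU above the 85°C threshold. It inspects a single sample, so run it repeatedly to judge whether a high reading is sustained.

```shell
# Threshold in degrees Celsius, taken from the guidance above.
TEMP_LIMIT="${TEMP_LIMIT:-85}"

# Read "index, temperature" CSV lines on stdin; report GPUs over the limit.
# Exits non-zero if any GPU exceeds TEMP_LIMIT.
check_temps() {
  awk -F', *' -v limit="$TEMP_LIMIT" '
    $2 + 0 > limit {
      printf "GPU %s: %d C exceeds %d C\n", $1, $2, limit
      bad = 1
    }
    END { exit bad }
  '
}

# Example usage (not run here): sample every 60 seconds during the burn test.
#   while sleep 60; do
#     nvidia-smi --query-gpu=index,temperature.gpu \
#       --format=csv,noheader,nounits | check_temps
#   done
```

For memory errors and driver faults, the NVIDIA driver reports Xid events in the kernel log, so `sudo dmesg | grep -i xid` is a quick way to spot them during or after the test.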