complete codex docs

Frank Harris 2026-06-05 11:37:09 -05:00
parent b5dcf01a8c
commit 3cefad183d
62 changed files with 2730 additions and 50 deletions


@ -0,0 +1,29 @@
# Decision 0001: Keep `screen` As The Shared Backend
## Status
Accepted
## Decision
GSP should keep `screen` as the shared process/session backend for both Linux and Windows/Cygwin agents for the current platform generation.
## Reasoning
- The existing agents already implement server lifecycle around `screen`.
- Linux and Windows/Cygwin behavior can stay aligned if both sides share the same session model.
- The Panel already expects session-based lifecycle checks.
- Moving to a different backend too early would create a large amount of compatibility work for start/stop/restart, logging, and scheduler flows.
## Alternatives Considered
- `tmux`
- direct process supervision without session wrappers
- custom daemon per server
## Why Those Were Not Chosen
- They would require broader code changes across both agents and the Panel.
- The existing runtime, log, and scheduler flows already assume `screen`.
- Cross-platform parity is easier to maintain with the current backend.


@ -0,0 +1,32 @@
# Decision 0002: Agent-Truthed Status Detection
## Status
Accepted
## Decision
Server status should be derived from agent truth:
1. managed process/session existence
2. required game port listening
3. optional query and metadata lookup
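The check order above could be sketched as follows. This is a minimal illustration in Python, not agent code; the function name `derive_status` and the state names `OFFLINE`, `STARTING`, and `ONLINE` are assumptions made for the example.

```python
def derive_status(session_exists: bool, port_listening: bool) -> str:
    """Combine the two required agent-side checks into one status string.

    The state names are illustrative assumptions, not existing GSP constants.
    """
    if not session_exists:
        # No managed process/session: the server is down regardless of
        # what a marker file or a cached query result says.
        return "OFFLINE"
    if not port_listening:
        # The process exists but is not ready to accept players yet.
        return "STARTING"
    return "ONLINE"
```

Query results are deliberately not an input here: in this sketch they would only enrich an `ONLINE` server with metadata, never decide whether it is online.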
## Reasoning
- Marker files become stale after crashes, failed starts, and power loss.
- Query systems like LGSL and GameQ are useful but unreliable as the sole online signal.
- The agent can check the actual runtime state locally.
## Alternatives Considered
- query-only status
- marker-file status
- process-only status
## Why Those Were Not Chosen
- Query-only status can be wrong even for supported games.
- Marker files are not authoritative.
- Process-only status misses the important readiness condition: the port must actually listen.


@ -0,0 +1,29 @@
# Decision 0003: First-Class Companion Programs
## Status
Accepted
## Decision
Companion applications such as BEC, B3, Discord bridges, log watchers, and stats collectors should be modeled as managed companion programs rather than ad hoc customer-editable startup scripts.
## Reasoning
- Customers should not control privileged helper commands through editable startup files.
- Companion processes need to be started, stopped, and restarted alongside the game server.
- PID and log handling should be centralized.
- The same design should work across Linux and Windows/Cygwin.
## Alternatives Considered
- keep `_alsoRun.bat` style helper files
- use only `pre_start` scripts
- rely on manual customer scripts
## Why Those Were Not Chosen
- They are not centrally managed.
- They are hard to secure.
- They are difficult to cleanly stop on server shutdown.


@ -0,0 +1,26 @@
# Decision 0004: Server Content Manager Is The Workshop Layer
## Status
Accepted
## Decision
`Panel/modules/addonsmanager` should remain the primary future home for Workshop items, mods, add-ons, and server content. `steam_workshop` should remain a deprecated compatibility layer only.
## Reasoning
- `addonsmanager` already has the richer schema and more complete product direction.
- It supports content types beyond Steam Workshop.
- It is a better fit for load order, enable/disable, install history, and metadata.
## Alternatives Considered
- keep `steam_workshop` as the main module
- split mods, add-ons, and Workshop into separate modules
## Why Those Were Not Chosen
- `steam_workshop` is explicitly deprecated in the codebase.
- Separate modules would fragment user workflows and duplicate install logic.


@ -0,0 +1,26 @@
# Decision 0005: Keep Managed Control Paths Outside Customer-Easy Edit Paths
## Status
Accepted
## Decision
Managed control files, manifests, scheduler state, and helper scripts should live in protected control locations rather than in customer-editable startup files where possible.
## Reasoning
- Customer-editable startup areas are too easy to tamper with.
- Managed state should not depend on files that customers can modify through FTP or file manager.
- Secure and auditable behavior is easier when control files are outside the customer content path.
## Alternatives Considered
- keep helper scripts in the game home
- keep runtime manifests next to the game executable
## Why Those Were Not Chosen
- Those options make it too easy for customers to alter managed execution.
- They complicate cleanup and lifecycle tracking.


@ -0,0 +1,28 @@
# Decision 0006: Installers Must Be Game-Capability Driven
## Status
Accepted
## Decision
Installer behavior should be driven by game XML capabilities and module metadata instead of ad hoc shell scripts or one-off module pages.
## Reasoning
- Different games need different install strategies.
- Some games are content-copy based.
- Some are SteamCMD based.
- Some need key copying, startup parameter edits, or profile transforms.
- The Panel needs a structured model to support all of them cleanly.
## Alternatives Considered
- per-game custom shell scripts only
- raw customer-provided installer commands
## Why Those Were Not Chosen
- They are harder to secure and harder to document.
- They do not scale cleanly across games or platforms.


@ -0,0 +1,866 @@
# Companion Programs Design Investigation
This is an investigation-only design report for adding first-class companion/sidecar application support to GSP game servers. No implementation is included here.
Repository layout reviewed:
- `/Agent-Windows`
- `/Agent_Linux`
- `/Panel`
- `/Website`
Note: the repository currently uses `Agent_Linux` on disk, not `Agent-Linux`.
## 1. Current Flow Found
### Main Agent Files
Relevant files:
- `Agent_Linux/ogp_agent.pl`
- `Agent-Windows/ogp_agent.pl`
- `Panel/includes/lib_remote.php`
- `Panel/modules/gamemanager/mini_start.php`
- `Panel/modules/gamemanager/home_handling_functions.php`
- `Panel/modules/gamemanager/start_server.php`
- `Panel/modules/gamemanager/stop_server.php`
- `Panel/modules/gamemanager/restart_server.php`
- `Panel/modules/config_games/schema_server_config.xml`
- `Panel/modules/config_games/server_config_parser.php`
- `Panel/modules/config_games/config_servers.php`
- `Panel/modules/config_games/cli-params.php`
### Agent Startup Directory And Autostart
Both agents define an agent run directory and a startup flag directory:
- Linux: `AGENT_RUN_DIR`, `GAME_STARTUP_DIR`, `SCREEN_LOGS_DIR`, `SCREENRC_FILE`
- Windows/Cygwin: same general constants, with Windows-specific SteamCMD executable paths
Both agents create/read the `startups` directory on agent startup. If startup files exist and `--no-startups` is not set, the agent reads each startup file and calls `universal_start_without_decrypt(...)` with the saved startup arguments.
Startup files are CSV-like records containing:
```text
home_id, home_path, server_exe, run_dir, startup_cmd, server_port, server_ip, cpu, nice, preStart, envVars, game_key, console_log
```
These files are useful for agent autostart but should not be treated as runtime status truth.
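To make the record layout concrete, here is a hedged Python sketch that splits one startup record into named fields. It assumes plain comma separation; the agents' actual escaping rules for commas inside values were not reviewed here, and `parse_startup_record` is a hypothetical helper, not an existing agent function.

```python
import csv
import io

# Field order taken from the startup record layout listed above.
STARTUP_FIELDS = [
    "home_id", "home_path", "server_exe", "run_dir", "startup_cmd",
    "server_port", "server_ip", "cpu", "nice", "preStart", "envVars",
    "game_key", "console_log",
]


def parse_startup_record(line: str) -> dict:
    """Split one startup record into named fields.

    Assumes plain comma separation; real escaping rules for commas
    inside values are not covered by this sketch.
    """
    row = next(csv.reader(io.StringIO(line), skipinitialspace=True))
    if len(row) != len(STARTUP_FIELDS):
        raise ValueError(f"expected {len(STARTUP_FIELDS)} fields, got {len(row)}")
    return dict(zip(STARTUP_FIELDS, row))
```

A parsed record tells the agent what to relaunch at autostart. It says nothing about whether the server is running now.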
### Screen Usage
Both agents use `screen` as the shared runtime backend.
Linux:
- `create_screen_id(SCREEN_TYPE_HOME, $home_id)` creates names like `OGP_HOME_000000123`.
- `create_screen_cmd(...)` returns a `screen -d -m -t ... -S ...` command.
- `create_screen_cmd_loop(...)` creates a generated shell script named like `OGP_HOME_000000123_startup_scr.sh`, then runs that script inside `screen`.
- The generated shell script loops/restarts the server when autorestart is enabled.
- The script checks for `SERVER_STOPPED` before respawning.
Windows/Cygwin:
- Uses the same `create_screen_id(...)` naming pattern.
- `create_screen_cmd(...)` launches a command in `screen`.
- `create_screen_cmd_loop(...)` writes `_serverStart.bat`, then runs it through `cmd /Q /C` inside `screen`.
- The generated batch file loops/restarts the server when autorestart is enabled.
- The batch file checks `SERVER_STOPPED` before respawning.
### Start Flow
Panel start path:
- `Panel/modules/gamemanager/mini_start.php` builds `$start_cmd`.
- It collects XML `pre_start`, `environment_variables`, `lock_files`, executable name, executable location, port, IP, CPU affinity, nice value, game key, and console log.
- It calls `$remote->universal_start(...)`.
Linux agent start:
- `universal_start_without_decrypt(...)` validates the home path and executable.
- It may create a per-game Linux user when that preference is enabled.
- It writes/updates the startup flag file in `GAME_STARTUP_DIR`.
- It converts `pre_start` and environment variables between multiline and startup-safe formats.
- It builds the command according to executable type:
  - `.exe` / `.bat`: Wine command
  - `.jar`: startup command
  - other: `./server_exe startup_cmd`
- It creates a generated shell startup script through `create_screen_cmd_loop(...)`.
- It backs up `screenlogs/screenlog.<screen_id>`.
- It runs pre-start commands through `run_before_start_commands(...)`.
- It launches the generated command through `sudo_exec_without_decrypt($cli_bin, $owner)`.
- It renices server processes after launch.
Windows/Cygwin agent start:
- `universal_start_without_decrypt(...)` validates the home path and executable.
- It converts paths with `cygpath -wa`.
- It converts `pre_start` and environment variables.
- It builds Windows commands using `cmd /Q /C start ... /WAIT`.
- With autorestart enabled, it uses `create_screen_cmd_loop(...)` and `_serverStart.bat`.
- Without autorestart, it uses `create_screen_cmd(...)`.
- It backs up `screenlogs/screenlog.<screen_id>`.
- It runs pre-start commands through `run_before_start_commands(...)`.
- It launches the screen command with `system($cli_bin)`.
- It writes a startup file named `_serverStart.bat` under `GAME_STARTUP_DIR`. Unlike the Linux `<ip>-<port>` flag, this name is generic rather than per IP/port in the current Windows path.
### Stop Flow
Panel stop path:
- `Panel/modules/gamemanager/stop_server.php` calls `$remote->remote_stop_server(...)`.
- Stop parameters include home ID, IP, port, control protocol, control password, control type, and home path.
Linux agent stop:
- `stop_server_without_decrypt(...)` removes the startup flag for `$server_ip-$server_port`.
- If autorestart is enabled, it creates `SERVER_STOPPED` in the game home.
- It attempts a graceful stop when configured:
  - `rcon`
  - `rcon2`
  - `armabe`
  - `minecraft`
- If graceful stop does not complete, it collects PIDs from the screen session using `get_home_pids(...)`.
- It sends signal 15 (`SIGTERM`), then signal 9 (`SIGKILL`) if needed.
- It runs `screen -wipe`.
Windows/Cygwin agent stop:
- `stop_server_without_decrypt(...)` removes the startup flag.
- If autorestart is enabled, it creates `SERVER_STOPPED` in the game home.
- It finds the screen PID from `screen -list`.
- It maps that to a Windows PID using `ps -W`.
- It calls `cmd /C taskkill /f /fi 'PID eq ...' /T`.
- It runs `screen -wipe`.
### Restart Flow
Both agents expose `restart_server` and `restart_server_without_decrypt(...)`.
The current intended restart flow is:
1. Stop server.
2. Wait 60 seconds.
3. Start server.
That flow is now present in the current agent files. Any companion system should hook into this sequence explicitly:
1. Stop companions.
2. Stop game server.
3. Wait 60 seconds.
4. Start game server.
5. Start companions after their configured delays.
### PID Tracking
Current PID tracking exists but is inconsistent:
- The agents write their own `ogp_agent.pid`.
- FastDownload has `fd.pid`.
- Scheduler has `scheduler.pid`.
- Linux game server child PIDs are discovered from the screen PID with `pgrep -P`.
- Windows/Cygwin maps a screen PID to a Windows PID through `ps -W`.
- Some game XML uses the `PID_FILE` CLI param, for example Source-engine servers.
- Windows `_alsoRun.bat` behavior expects `_alsoRun.pid` to contain helper PIDs.
There is no general companion PID registry today.
### Marker Files
Relevant marker files:
- `GAME_STARTUP_DIR/<ip>-<port>` startup flag on Linux.
- Windows writes a startup file under `GAME_STARTUP_DIR`, currently `_serverStart.bat`.
- `SERVER_STOPPED` in the game home when autorestart is enabled.
- `_alsoRun.pid` for the current Windows helper behavior.
`SERVER_STOPPED` is used by generated autorestart scripts to decide whether to respawn. It should not be used as the source of truth for current runtime status.
### Logs
Game screen logs are written under:
```text
screenlogs/screenlog.<screen_id>
```
The agents call `backup_home_log(...)` before starting a server. If XML defines `console_log`, Panel/agent log functions may read the game-specific console log instead.
There is no dedicated companion stdout/stderr log path today.
## 2. Current Helper Script Behavior
### `_alsoRun.bat`
The clearest current companion behavior is Windows-only.
In `Agent-Windows/ogp_agent.pl`, `create_screen_cmd_loop(...)` generates `_serverStart.bat`. That generated batch file contains:
```bat
if exist "_alsoRun.bat" call "_alsoRun.bat"
start ... /wait <game command>
for /f %%p in (_alsoRun.pid) do taskkill /PID %%p /F
```
This means:
- `_alsoRun.bat` is executed before the game server process.
- `_alsoRun.pid` is read after the game server exits.
- Each PID listed in `_alsoRun.pid` is force-killed.
- This only exists in the Windows/Cygwin agent path.
- It only works when the generated `_serverStart.bat` loop path is used.
- It relies on files in or near the game server working directory.
- It depends on the helper script correctly writing `_alsoRun.pid`.
Some DayZ/Arma XML files generate `_alsoRun.bat` in `pre_start`. Examples found include:
- `Panel/modules/config_games/server_configs/dayz_arma2co_win32.xml`
- `Panel/modules/config_games/server_configs/dayz_epoch_mod_win32.xml`
Those XML files build `_alsoRun.bat` to start BEC and use WMIC to find/write the BEC PID into `_alsoRun.pid`.
Problems:
- The helper file lives in a customer-accessible game area.
- The customer may be able to edit/delete `_alsoRun.bat` or `_alsoRun.pid`.
- It is Windows-specific.
- PID capture depends on executable path matching and WMIC.
- Cleanup happens after the game process exits, not as a first-class stop action.
- It is not centrally visible to the Panel as companion state.
### `pre_start`
The XML schema supports `pre_start`.
Panel reads `pre_start` in:
- `Panel/modules/gamemanager/mini_start.php`
- `Panel/modules/gamemanager/restart_server.php`
- `Panel/modules/gamemanager/home_handling_functions.php`
The value is passed to the agent as `$preStart`.
Linux behavior:
- `run_before_start_commands(...)` writes `${home_path}/prestart_ogp.sh`.
- It writes each XML line into the shell file.
- It runs `bash prestart_ogp.sh` as the game owner.
- The script removes itself at the end.
Windows/Cygwin behavior:
- `run_before_start_commands(...)` writes `_prestart.bat` in the Windows-converted home path.
- It writes each XML line into the batch file.
- It runs `cmd /Q /C start /wait "<_prestart.bat>"`.
- It deletes `_prestart.bat`.
Concerns:
- `pre_start` is trusted XML/admin-defined script content, but it executes in/against the customer game home.
- It is not structured.
- It is not tracked.
- It is only "before start"; it is not a companion lifecycle model.
- It cannot reliably stop helper processes unless custom script authors implement that themselves.
### `post_start`
The schema and config editor order include `post_start`.
Found references:
- `Panel/modules/config_games/schema_server_config.xml`
- `Panel/modules/config_games/config_servers.php`
- `Panel/modules/config_games/xml_tag_descriptions.php`
- One example XML: `Panel/modules/config_games/server_configs/nexuiz_win.xml`
I did not find a start path that reads `server_xml->post_start` or sends it to the agents. As currently written, `post_start` appears schema-visible but not operational.
### `post_install`
Both agents include workshop/mod post-install script generation logic. This is install/update-related, not game-server runtime lifecycle management.
`post_install` should not be used for companion processes because it is not tied to server start/stop/restart.
### Custom Fields And Custom Scripts
Custom fields are supported in XML and stored per server. They are used by config replacement logic:
- `Panel/modules/gamemanager/cfg_text_replace.php`
- `Panel/modules/config_games/schema_server_config.xml`
Custom fields can influence config file content. They are not currently a safe or structured way to define executable companion commands.
Customers can also use extra startup parameters where permissions allow. That should not become the companion command source because it would permit arbitrary command injection into managed helper launches.
## 3. Recommended Method 1
### First-Class Companion Programs System
The best design is a first-class, structured companion program system.
Game XML should define the allowed companion programs for that game. The Panel should store per-server choices such as enabled/disabled and optional safe settings. The agent should own runtime launch, PID tracking, log paths, stop, and restart cleanup.
Example XML shape:
```xml
<companion_programs>
  <program key="bec" name="BattlEye Extended Controls" os="windows">
    <enabled_default>0</enabled_default>
    <delay_seconds_default>30</delay_seconds_default>
    <working_dir>{GAME_PATH}\BEC</working_dir>
    <command>Bec.exe</command>
    <args>-f Config.cfg</args>
    <process_name>Bec.exe</process_name>
    <stop_on_server_stop>1</stop_on_server_stop>
    <restart_on_server_restart>1</restart_on_server_restart>
    <stdout_log>companions/bec.out.log</stdout_log>
    <stderr_log>companions/bec.err.log</stderr_log>
  </program>
</companion_programs>
```
Recommended XML fields:
- `key`: stable internal ID.
- `name`: display name.
- `os`: `linux`, `windows`, or `any`.
- `enabled_default`: default panel setting.
- `delay_seconds_default`: default delay after game start.
- `working_dir`: trusted template path.
- `command`: trusted executable/script name or path template.
- `args`: trusted default arguments.
- `process_name`: optional verification aid only, not the kill target by itself.
- `stop_command`: optional graceful stop command.
- `stop_timeout_seconds`: default graceful wait.
- `kill_after_timeout`: boolean.
- `stop_on_server_stop`: boolean.
- `restart_on_server_restart`: boolean.
- `stdout_log` / `stderr_log`: relative managed log paths.
- `required_files`: optional list for Panel validation.
- `requires_game_online`: whether to delay until the game port is online.
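As an illustration of how such definitions might be read, the following Python sketch parses a trimmed copy of the example XML and keeps only the programs that apply to the agent's OS. The function `load_companions` and the returned dictionary shape are assumptions for this example; the Panel's real parser is PHP and is not shown here.

```python
import xml.etree.ElementTree as ET

EXAMPLE_XML = """
<companion_programs>
  <program key="bec" name="BattlEye Extended Controls" os="windows">
    <enabled_default>0</enabled_default>
    <delay_seconds_default>30</delay_seconds_default>
    <command>Bec.exe</command>
    <args>-f Config.cfg</args>
  </program>
</companion_programs>
"""


def load_companions(xml_text: str, agent_os: str) -> dict:
    """Return {key: definition} for programs that apply to agent_os."""
    companions = {}
    for program in ET.fromstring(xml_text).findall("program"):
        target_os = program.get("os", "any")
        if target_os not in ("any", agent_os):
            continue  # defined for a different platform
        companions[program.get("key")] = {
            "name": program.get("name"),
            "enabled_default": program.findtext("enabled_default", "0") == "1",
            "delay_seconds_default": int(program.findtext("delay_seconds_default", "0")),
            "command": program.findtext("command"),
            # Naive whitespace split; quoted arguments would need real rules.
            "args": (program.findtext("args") or "").split(),
        }
    return companions
```

Keying the result by `key` lets the same logical companion have separate Linux and Windows entries while per-server settings refer to one stable ID.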
### Where XML Defines Allowed Companion Programs
Add `companion_programs` to `Panel/modules/config_games/schema_server_config.xml` as an optional top-level element.
Only admins/editors of game XML should be able to define:
- executable path
- arguments
- working directory
- stop command
- process name
- default delay
- log paths
Customers should not be able to enter arbitrary commands.
### Where Panel Stores Per-Server Settings
Panel should store per-home companion settings in the database, not in customer-editable files.
Possible table:
```sql
CREATE TABLE server_companion_settings (
  id INT AUTO_INCREMENT PRIMARY KEY,
  home_id INT NOT NULL,
  companion_key VARCHAR(64) NOT NULL,
  enabled TINYINT(1) NOT NULL DEFAULT 0,
  delay_seconds INT NULL,
  settings_json LONGTEXT NULL,
  UNIQUE KEY uniq_home_companion (home_id, companion_key)
);
```
`settings_json` should only contain safe, schema-defined options. Do not allow free-form command or shell text. For example:
- config profile selection
- config file name from an allowed list
- delay override within min/max bounds
- environment preset key
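One way to enforce "schema-defined options only" is an allow-list check before anything is stored or sent to the agent. The sketch below is Python for illustration; the option names, allowed values, and the 600-second bound are all hypothetical placeholders, not values taken from the codebase.

```python
import json

# Hypothetical allow-list: option name -> check returning True if safe.
ALLOWED_OPTIONS = {
    "config_profile": lambda v: v in ("default", "strict"),
    "config_file": lambda v: v in ("Config.cfg", "Config_event.cfg"),
    "delay_seconds": lambda v: isinstance(v, int)
    and not isinstance(v, bool)
    and 0 <= v <= 600,
    "env_preset": lambda v: v in ("none", "verbose"),
}


def sanitize_settings(settings_json: str) -> dict:
    """Return the settings if every option is allow-listed; raise otherwise."""
    settings = json.loads(settings_json)
    if not isinstance(settings, dict):
        raise ValueError("settings_json must be a JSON object")
    for name, value in settings.items():
        check = ALLOWED_OPTIONS.get(name)
        if check is None or not check(value):
            raise ValueError(f"option not allowed: {name}")
    return settings
```

Rejecting unknown keys outright, rather than silently dropping them, makes it harder for a free-form `command` field to slip in unnoticed later.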
### How The Agent Receives Settings
Extend the encrypted XML-RPC start/restart call to include a companion payload, or add a dedicated agent RPC after start.
Recommended:
- `universal_start(..., companion_payload)` for direct lifecycle coupling.
- `server_companion_start(home_id, payload)` for explicit start action.
- `server_companion_stop(home_id)` for explicit stop action.
- `server_companion_status(home_id)` for status display.
The payload should be generated by the Panel from trusted XML plus stored per-server enabled settings.
Payload should contain fully resolved, validated data:
```json
{
  "home_id": 123,
  "server_ip": "1.2.3.4",
  "server_port": 2302,
  "companions": [
    {
      "key": "bec",
      "enabled": true,
      "delay_seconds": 30,
      "working_dir": "/home/ogp_agent/OGP_User_Files/123/BEC",
      "command": "Bec.exe",
      "args": ["-f", "Config.cfg"],
      "stdout_log": "bec.out.log",
      "stderr_log": "bec.err.log"
    }
  ]
}
```
For XML-RPC compatibility, this may be passed as JSON text.
### How The Agent Starts Companions
The agent should start companions after the game screen/session has launched.
Recommended approach:
- Use separate `screen` sessions per companion.
- Session naming:
  - Game: `OGP_HOME_000000123`
  - Companion: `OGP_COMPANION_000000123_bec`
- Write companion runtime state under an agent-controlled path outside the game FTP/file-manager root.
- Start delayed companions through a small agent-owned delay wrapper:
  - Linux: `screen -d -m -S OGP_COMPANION_... bash -lc 'sleep 30; cd ...; exec ...'`
  - Windows/Cygwin: `screen -d -m -S OGP_COMPANION_... cmd /Q /C "timeout /t 30 ... && cd /d ... && ..."`
This avoids blocking game startup. The Panel can show the game as `STARTING`/`ONLINE` independently while companion statuses move through `PENDING`, `STARTING`, `RUNNING`, or `FAILED`.
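The naming scheme and the Linux delay wrapper can be sketched as below. This is illustrative Python, not the agents' Perl; the helper names are hypothetical, and the sketch uses `cd ... && exec ...` so the companion is not started if the directory change fails.

```python
import shlex


def home_session(home_id: int) -> str:
    # Matches the OGP_HOME_000000123 pattern: nine zero-padded digits.
    return f"OGP_HOME_{home_id:09d}"


def companion_session(home_id: int, key: str) -> str:
    return f"OGP_COMPANION_{home_id:09d}_{key}"


def linux_companion_argv(home_id, key, delay, working_dir, command, args):
    """Build the argv for a detached, delayed companion screen session."""
    inner = "sleep {}; cd {} && exec {}".format(
        int(delay),
        shlex.quote(working_dir),
        " ".join(shlex.quote(part) for part in [command, *args]),
    )
    return ["screen", "-d", "-m", "-S", companion_session(home_id, key),
            "bash", "-lc", inner]
```

Only the inner `bash -lc` string needs shell quoting. The outer `screen` invocation stays an argument list, so the session name and wrapper are never re-parsed by a shell.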
### How The Agent Records PIDs
Create an agent-owned runtime directory, for example:
```text
<AGENT_RUN_DIR>/companions/<home_id>/
```
Files:
```text
companions/<home_id>/state.json
companions/<home_id>/pids/<companion_key>.pid
companions/<home_id>/logs/<companion_key>.out.log
companions/<home_id>/logs/<companion_key>.err.log
```
State should include:
- home ID
- companion key
- screen session name
- agent PID/screen PID
- child PID if known
- command hash or executable path
- start time
- last status
- last error
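A state file like this is only useful if readers never see it half-written. The Python sketch below shows one common approach, writing to a temporary file and renaming it into place. The field names mirror the list above but are otherwise assumptions, and `write_state` is a hypothetical helper.

```python
import json
import os
import tempfile
import time


def write_state(state_dir: str, state: dict) -> str:
    """Atomically write state.json so readers never see a partial file."""
    os.makedirs(state_dir, exist_ok=True)
    final_path = os.path.join(state_dir, "state.json")
    fd, tmp_path = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2, sort_keys=True)
    os.replace(tmp_path, final_path)  # atomic rename within one filesystem
    return final_path


example_state = {
    "home_id": 123,
    "companion_key": "bec",
    "screen_session": "OGP_COMPANION_000000123_bec",
    "screen_pid": None,       # filled in once the session is running
    "child_pid": None,        # filled in if the wrapper reports it
    "command_path": "Bec.exe",
    "start_time": int(time.time()),
    "last_status": "PENDING",
    "last_error": None,
}
```

Creating the temporary file in the same directory keeps the rename on one filesystem, which is what makes `os.replace` atomic.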
For Linux:
- Prefer launching through `screen`.
- Use the screen PID plus child process lookup.
- If possible, have the wrapper write its own PID (`echo $$`) to a PID file and then `exec` the companion, so the recorded PID is the companion's PID.
- Use process group/session cleanup where possible.
For Windows/Cygwin:
- Keep `screen` for parity.
- Prefer a wrapper that starts the companion and writes the Windows PID to a PID file.
- Use Cygwin `ps -W` or PowerShell/WMIC alternatives where available.
- Avoid killing by executable name alone.
### How The Agent Stops Companions
Stop by the recorded handle, not by executable name.
Recommended stop order:
1. Load `companions/<home_id>/state.json`.
2. For each running companion:
   - If `stop_command` exists, run it in the configured working directory.
   - Wait up to `stop_timeout_seconds`.
   - If still alive and `kill_after_timeout=1`, kill only recorded PID/process tree/session.
   - Close the companion screen session.
   - Update state.
3. Clean stale PID files only after verifying the process is gone.
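The graceful-then-forced part of this order could look roughly like the Python sketch below for a single recorded PID on Linux. It omits the optional `stop_command` and the screen session cleanup, and the function names and return strings are assumptions for illustration.

```python
import os
import signal
import time


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def stop_recorded_pid(pid, timeout, kill_after_timeout=True,
                      send=os.kill, alive=pid_alive, sleep=time.sleep):
    """Send SIGTERM to the recorded PID, wait, then optionally SIGKILL."""
    if not alive(pid):
        return "already_stopped"
    send(pid, signal.SIGTERM)
    waited = 0.0
    while waited < timeout:
        if not alive(pid):
            return "stopped"
        sleep(0.5)
        waited += 0.5
    if not alive(pid):
        return "stopped"
    if kill_after_timeout:
        send(pid, signal.SIGKILL)
        return "killed"
    return "still_running"
```

Because PIDs can be reused, a real implementation would also compare the live process against the recorded start time or command before signalling it.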
Avoid:
- `pkill Bec.exe`
- `taskkill /IM Bec.exe`
- broad executable-name matching
Those can kill unrelated customer processes.
### Restart Behavior
Recommended restart sequence:
1. Stop companions for the home.
2. Stop game server.
3. Wait 60 seconds.
4. Start game server.
5. Start enabled companions according to delay settings.
Do not let companions survive a restart unless explicitly marked as persistent and safe.
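Expressed as code, the sequence is a fixed ordering of four operations around one wait. The Python sketch below takes the operations as callables so the ordering is the only thing it asserts; `restart_home` and its parameters are hypothetical names, not agent functions.

```python
import time


def restart_home(home_id, stop_companions, stop_game, start_game,
                 start_companions, wait_seconds=60, sleep=time.sleep):
    """Run the restart steps in order; every callable takes the home ID."""
    stop_companions(home_id)   # 1. companions go down first
    stop_game(home_id)         # 2. then the game server
    sleep(wait_seconds)        # 3. fixed wait between stop and start
    start_game(home_id)        # 4. game server comes back
    start_companions(home_id)  # 5. companions apply their own delays
```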
### Companion Logging
Companion stdout/stderr should be logged separately from game screen logs.
Recommended default:
```text
<AGENT_RUN_DIR>/companions/<home_id>/logs/<key>.out.log
<AGENT_RUN_DIR>/companions/<home_id>/logs/<key>.err.log
```
Panel can add a companion log viewer later, separate from the game server log viewer.
For customer visibility, expose logs through the agent RPC after permission checks rather than placing them inside the customer FTP root by default.
### Security
This method reduces the attack surface because:
- Commands come from trusted game XML/admin configuration.
- Customer UI stores only enable/disable and bounded safe settings.
- Runtime files live outside the customer file manager/FTP root.
- The agent launches only known companion keys for the current game config.
- Stop/kill uses recorded PIDs/screen sessions, not executable names.
- Logs are controlled by the agent.
### Portability
This method is portable because:
- XML can define Linux and Windows variants with the same logical companion key.
- Agent receives normalized companion payload.
- Both agents use screen sessions.
- PID capture is platform-specific behind the same agent RPC model.
## 4. Recommended Method 2
### Agent-Managed Generated Startup/Cleanup Scripts
Alternate design: keep using scripts, but generate and store them in an agent-controlled directory outside customer-editable game files.
Example layout:
```text
<AGENT_RUN_DIR>/companions/<home_id>/
  companions.json
  start_companions.sh
  stop_companions.sh
  start_companions.bat
  stop_companions.bat
  pids/
  logs/
```
The Panel still defines companions from trusted XML and stores per-server enabled settings. The agent generates scripts from those trusted definitions.
Start flow:
1. Game server launches.
2. Agent starts an agent-owned companion startup script in its own screen session.
3. Script handles delays and PID capture.
Stop flow:
1. Agent runs an agent-owned cleanup script.
2. Script kills only PIDs listed in the agent-owned `pids` directory.
3. Agent verifies cleanup.
### Pros
- Closer to the existing `_serverStart.bat` and generated shell script model.
- Easier to debug for admins because generated scripts are readable.
- Can work with the existing screen backend.
- Avoids customer-editable `_alsoRun.bat`.
- Can be implemented incrementally.
### Cons
- Still script-heavy.
- Quoting rules remain difficult across Linux, Cygwin, and Windows.
- Risk of script injection if generation is not strict.
- PID capture still needs platform-specific care.
- State management can drift if scripts are edited manually by admins.
- Less clean than direct agent-managed process APIs.
### Comparison To Method 1
Method 1 is better long term because the agent owns lifecycle state directly. Method 2 is a reasonable migration bridge if implementation speed matters, but it should still use trusted XML/admin definitions and agent-owned control paths.
## 5. Security Considerations
### Do Not Trust Customer-Editable Startup Files
Avoid using customer-editable files such as:
- `_alsoRun.bat`
- `_alsoRun.pid`
- arbitrary `.bat` or `.sh` helper files in the game home
Those files can be edited, deleted, replaced, or used to run unintended commands.
### Managed Files Should Live Outside The FTP/File Manager Root
Recommended location:
```text
<AGENT_RUN_DIR>/companions/<home_id>/
```
This directory should be owned by the agent or game server runner user and should not be directly writable by customers.
### Commands Must Come From Trusted Configuration
Companion commands should come from:
- shipped game XML
- admin-managed XML
- a future admin-only companion catalog
They should not come from:
- customer text fields
- startup extra parameters
- uploaded files
- customer-editable config files
### Validate Commands
Validation should include:
- companion key exists in the current game XML
- OS matches the agent OS
- command path resolves under an allowed directory unless explicitly admin-approved
- working directory resolves under game home or an admin-approved path
- no unexpanded `{...}` tokens remain
- no shell metacharacters in fields that are supposed to be argv tokens
- delay is numeric and capped
- log paths are relative and cannot escape managed log directories
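Most of these checks are mechanical and can be sketched in a few lines. The Python below is an illustration under stated assumptions: POSIX-style paths, a hypothetical 600-second delay cap, an entry shaped like the payload example, and no OS match or symlink resolution (a real check would resolve paths on disk).

```python
import posixpath
import re

UNEXPANDED_TOKEN = re.compile(r"\{[^{}]*\}")
SHELL_METACHARS = re.compile(r"[;&|<>`$\n\r]")
MAX_DELAY_SECONDS = 600  # hypothetical cap


def validate_companion(entry: dict, allowed_keys: set, home_dir: str) -> list:
    """Return a list of problems; an empty list means the entry passed."""
    problems = []
    if entry["key"] not in allowed_keys:
        problems.append("unknown companion key")
    for field in ("working_dir", "command"):
        if UNEXPANDED_TOKEN.search(entry[field]):
            problems.append(f"unexpanded token in {field}")
    for arg in entry["args"]:
        if SHELL_METACHARS.search(arg):
            problems.append(f"shell metacharacter in argument: {arg!r}")
    delay = entry["delay_seconds"]
    if not isinstance(delay, int) or not 0 <= delay <= MAX_DELAY_SECONDS:
        problems.append("delay out of range")
    resolved = posixpath.normpath(posixpath.join(home_dir, entry["working_dir"]))
    if resolved != home_dir and not resolved.startswith(home_dir.rstrip("/") + "/"):
        problems.append("working_dir escapes the game home")
    for field in ("stdout_log", "stderr_log"):
        log_path = posixpath.normpath(entry[field])
        if posixpath.isabs(log_path) or log_path.split("/")[0] == "..":
            problems.append(f"{field} escapes the managed log directory")
    return problems
```

Returning every problem instead of stopping at the first gives XML authors one complete report per companion.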
### Avoid Arbitrary Command Execution
Use argv-style execution where possible.
When shell/batch is unavoidable:
- generate commands from trusted fields only
- quote every path/argument consistently
- avoid concatenating customer-provided strings into shell code
- keep generated files outside customer write paths
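The difference argv-style execution makes can be shown with a small, self-contained Python sketch. The child process here is just the Python interpreter echoing its first argument; `run_argv` is a hypothetical helper, not an agent function.

```python
import subprocess
import sys


def run_argv(argv: list) -> str:
    """Run a command from an argument list; no shell parses the values."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# The child just echoes its first argument back. The hostile-looking
# argument stays one literal token because shell=True is never used.
echo_child = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
output = run_argv(echo_child + ["Config.cfg; echo injected"])
```

Had the same string been interpolated into a shell command line, the `;` would have started a second command.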
### Avoid Killing Unrelated Processes
Do not kill by executable name alone.
Preferred kill targets:
- recorded companion PID
- recorded process group
- recorded screen session
- recorded Windows process tree for that PID
Process name should be used only as a verification hint, not as the primary kill selector.
## 6. Cross-platform Considerations
### Linux Agent
Linux can use:
- `screen`
- `bash`
- PID files
- `/proc`
- `pgrep -P`
- `kill 15`, then `kill 9`
Recommended Linux companion launch:
```text
screen -d -m -S OGP_COMPANION_<home_id>_<key> bash -lc '<agent-owned wrapper>'
```
Use an agent-owned wrapper or direct fork/exec to record PID and redirect logs.
### Windows/Cygwin Agent
Windows/Cygwin can use:
- `screen`
- `cmd /Q /C`
- `start`
- `taskkill /PID ... /T`
- `ps -W`
- possibly WMIC or PowerShell depending on environment
Current `_alsoRun.bat` depends on WMIC in some XML. Microsoft has deprecated WMIC and it may be missing on recent Windows installations, so future code should not require it.
Recommended Windows companion launch:
```text
screen -d -m -S OGP_COMPANION_<home_id>_<key> cmd /Q /C "<agent-owned wrapper.bat>"
```
The wrapper should record the Windows PID or enough process/session information for cleanup.
### Batch vs Shell Differences
Risks:
- quoting spaces in paths
- escaping backslashes
- environment variable syntax differences
- `start` window title behavior on Windows
- delayed expansion in batch files
- Cygwin path vs Windows path conversion
- signal semantics: Linux signals vs Windows taskkill
The Panel should not build platform command strings. It should send trusted structured definitions to the agent and let the agent perform OS-specific quoting/execution.
### Screen Behavior
Screen remains the default shared backend for now.
Use separate screen sessions for companions rather than attaching them to the game server screen. This gives:
- independent status
- independent logs
- safer cleanup
- clearer restart behavior
### Process Cleanup Risks
Companion processes may:
- fork children
- daemonize
- spawn background processes
- exit while leaving children
- fail before writing PID
- be started manually by the customer/admin
The agent should track process trees where possible and verify stop results.
## 7. Suggested Implementation Phases
### Phase 1: Inventory Current Flow And Add Report Only
This report is Phase 1.
No source code changes should be made for companion behavior in this phase.
### Phase 2: XML Schema Design
Add `companion_programs` to:
- `Panel/modules/config_games/schema_server_config.xml`
- `Panel/modules/config_games/config_servers.php`
- `Panel/modules/config_games/xml_tag_descriptions.php`
Create example XML for BEC, B3, and a generic log watcher.
### Phase 3: Panel Storage/UI Design
Add storage for per-server companion settings.
Initial UI:
- show companion list from XML
- enable/disable checkbox
- delay override
- safe predefined options only
Do not expose raw command editing to customers.
### Phase 4: Agent Start/Stop Integration
Add encrypted RPCs:
- `server_companion_start`
- `server_companion_stop`
- `server_companion_status`
Extend start/restart flow to call companion start/stop at the right time.
### Phase 5: PID Tracking And Cleanup
Implement:
- agent-owned state directory
- PID files
- screen session names
- process tree cleanup
- stale PID detection
- companion status reporting
### Phase 6: DayZ/BEC Test
Migrate one DayZ/Arma BEC case away from `_alsoRun.bat`.
Test:
- start game
- delayed BEC start
- BEC status
- stop game
- BEC cleanup
- restart flow
- crash/recovery behavior
- agent restart behavior
### Phase 7: Generalized Docs
Document:
- XML authoring
- admin setup
- supported placeholders
- security rules
- troubleshooting
- log locations
- migration path from `_alsoRun.bat`
## 8. Open Questions
1. Should companions be started after the game process exists, after the game port is listening, or after query/RCON succeeds when available?
2. Should companions be allowed to keep running if the game server crashes and autorestarts, or should they always restart with each game process cycle?
3. Should some companions be marked "persistent" across game restarts?
4. Which user should run companions on Linux when `LINUX_USER_PER_GAME_SERVER` is enabled?
5. Should companion logs be visible to customers by default, or admin-only?
6. Should companion config files be editable through the existing config file module?
7. How should Windows PID capture work on systems without WMIC?
8. Is PowerShell guaranteed available in the supported Windows/Cygwin agent environment?
9. Should the Panel support companion install/update packages, or only runtime start/stop of already-installed files?
10. Should companion definitions live only in game XML, or should there be a reusable global companion catalog?
11. How should firewall rules handle companion ports if a companion needs one?
12. What should happen if a companion fails but the game server is online?
13. Should companion failure affect billing/status/customer-facing uptime?
14. How should agent autostart restore companion state after power loss or agent restart?
15. Should the agent stop companions before or after graceful game stop for tools that send shutdown messages to the game?
16. Should companion environment variables be separate from game environment variables?
17. How should command placeholders be standardized across Linux and Windows?
18. What is the maximum acceptable startup delay for companion programs?
19. Should admins be able to override companion commands per server, or only per game XML?
20. How should existing `_alsoRun.bat` game configs be migrated without breaking current customers?
## Summary Recommendation
Build Method 1: a first-class companion programs system driven by trusted XML/admin configuration, stored per server in the Panel, and executed/owned by the agent.
Do not continue expanding `_alsoRun.bat`. It is a useful proof of need, but it is Windows-specific, customer-editable, hard to audit, and unreliable for stop/restart cleanup.
Use `screen` as the shared backend for now, with one screen session per companion and agent-owned PID/log/state files outside the customer file root.

21
docs/decisions/README.md Normal file
View file

@ -0,0 +1,21 @@
# Decisions
This folder holds permanent architecture decisions and a small set of preserved investigation reports that informed those decisions.
## Decision Records
- `0001-screen-vs-tmux.md`
- `0002-status-detection.md`
- `0003-companion-programs.md`
- `0004-workshop-system.md`
- `0005-control-path-layout.md`
- `0006-installers.md`
## Preserved Reports
- `COMPANION_PROGRAMS_DESIGN.md`
- `SCHEDULER_ACTIONS_DESIGN.md`
- `STEAM_WORKSHOP_DESIGN.md`
Use the numbered decision files for long-term design rules. Use the report files for the investigative context that led to those decisions.

View file

@ -0,0 +1,810 @@
# GSP Scheduler Actions Design
## Scope
This is an investigation and design report only. It does not implement code.
The goal is to redesign GSP's Scheduler / CRON feature into a safer, more useful automation system for game hosting customers and administrators.
Repository layout reviewed:
- `Agent-Windows`
- `Agent_Linux` (the Linux agent directory currently uses an underscore in this repository)
- `Panel`
- `Website`
## 1. Current Scheduler Module Findings
### Files inspected
Panel Scheduler module:
- `Panel/modules/cron/module.php`
- `Panel/modules/cron/navigation.xml`
- `Panel/modules/cron/cron.php`
- `Panel/modules/cron/user_cron.php`
- `Panel/modules/cron/shared_cron_functions.php`
- `Panel/modules/cron/events.php`
- `Panel/modules/cron/thetime.php`
Panel remote/API integration:
- `Panel/includes/lib_remote.php`
- `Panel/includes/api_functions.php`
- `Panel/modules/gamemanager/start_server.php`
- `Panel/modules/gamemanager/stop_server.php`
- `Panel/modules/gamemanager/restart_server.php`
- `Panel/modules/gamemanager/update_actions.php`
- `Panel/modules/gamemanager/rcon.php`
- `Panel/modules/addonsmanager/server_content_actions.php`
Agent scheduler implementation:
- `Agent_Linux/ogp_agent.pl`
- `Agent-Windows/ogp_agent.pl`
### Current database tables used
The current Scheduler module does not appear to own database tables. Module metadata has:
- `Panel/modules/cron/module.php`
- `$db_version = 0`
Scheduled jobs are stored on each agent in a flat file:
- Linux: `AGENT_RUN_DIR/Schedule/scheduler.tasks`
- Linux: `AGENT_RUN_DIR/Schedule/scheduler.pid`
- Linux: `AGENT_RUN_DIR/Schedule/scheduler.log`
- Windows/Cygwin: `AGENT_RUN_DIR/scheduler.tasks`
- Windows/Cygwin: `AGENT_RUN_DIR/scheduler.pid`
- Windows/Cygwin: `AGENT_RUN_DIR/scheduler.log`
This means the agent is currently the storage location for task definitions, and the Panel reconstructs task lists by asking each agent for its task file.
### Current actions
Current customer-visible scheduled actions from `get_action_selector()`:
- `restart`
- `stop`
- `start`
- `steam_auto_update` when the game XML installer is `steamcmd`
Additional server content actions are appended when `addonsmanager` is installed:
- `server_content_check_updates`
- `server_content_check_workshop_updates`
- `server_content_install_updates_if_stopped`
- `server_content_install_updates_next_restart`
- `server_content_install_updates_now`
- `server_content_install_updates_and_restart`
- `server_content_notify_updates_only`
- `server_content_update_all`
- `server_content_validate_files`
- `server_content_backup_before_update`
Admin-only raw command path:
- `Panel/modules/cron/cron.php` exposes a second form where an admin selects a remote server and enters a raw shell command.
### How tasks are created
Customer task creation path:
1. User opens `Panel/modules/cron/user_cron.php`.
2. User selects a game server and action.
3. Panel validates the five CRON fields using `checkCronInput()`.
4. Panel calls `build_cron_scheduler_command()`.
5. The command is built as a `wget` callback to `ogp_api.php`.
6. Panel sends the whole CRON line to the agent through `scheduler_add_task()`.
7. Agent appends the task line to `scheduler.tasks`.
8. Agent restarts its scheduler process.
Admin task creation path:
1. Admin opens `Panel/modules/cron/cron.php`.
2. Admin can use the same server/action selector.
3. Admin can also enter a raw command for a remote server.
4. Panel writes the raw command into the agent task file.
Current scheduled API callback examples:
```text
wget -qO- "<panel>/ogp_api.php?gamemanager/stop&token=<token>&ip=<ip>&port=<port>&mod_key=<mod_key>" --no-check-certificate > /dev/null 2>&1
```
```text
wget -qO- "<panel>/ogp_api.php?server_content/run_scheduled_action&token=<token>&home_id=<home_id>&action=<action>&options=<json>" --no-check-certificate > /dev/null 2>&1
```
### How tasks execute
Both agents use Perl `Schedule::Cron`.
Agent startup:
- Stops prior scheduler process using `scheduler_stop()`.
- Creates a `Schedule::Cron` object.
- Adds a read/reload task that runs every second:
  - `* * * * * *`
- Starts scheduler detached with `scheduler.pid`.
Agent task reload:
- `scheduler_read_tasks()` opens `scheduler.tasks`.
- It clears the in-memory timetable.
- It splits each line into five CRON fields plus command args.
- If args start with `%ACTION`, it uses `scheduler_server_action()`.
- Otherwise it adds a generic shell command task.
Current Panel-generated jobs are generic shell commands, not `%ACTION` jobs. They execute through:
- `scheduler_dispatcher()`
- backtick execution of the scheduled command
- append to `scheduler.log`
The older `%ACTION=start|stop|restart` direct-agent scheduler path still exists but does not appear to be the primary current Panel path.
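The reload logic described above can be sketched in a few lines. This is an illustration in Python, not the agents' Perl code; `ParsedTask` and `parse_task_line` are hypothetical names, and the exact layout of a real `%ACTION` line may carry more parameters than shown here.

```python
from dataclasses import dataclass


@dataclass
class ParsedTask:
    cron_fields: list  # minute, hour, day of month, month, day of week
    kind: str          # "server_action" or "shell"
    payload: str       # everything after the five CRON fields


def parse_task_line(line: str):
    """Split one task line into five CRON fields plus the command args."""
    line = line.strip()
    if not line:
        return None  # blank lines carry no task
    parts = line.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five CRON fields followed by a command")
    fields, args = parts[:5], parts[5]
    kind = "server_action" if args.startswith("%ACTION") else "shell"
    return ParsedTask(fields, kind, args)
```

The sketch makes the risk visible: anything that does not start with `%ACTION` is treated as a shell command, which is the path all current Panel-generated jobs take.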
### How task results are logged
Agent logging:
- `scheduler_log_events()` appends plain text to `scheduler.log`.
- Generic commands log:
  - the command text
  - any response text
Panel viewing:
- `Panel/modules/cron/events.php` reads `scheduler.log` from the selected remote server.
- It refreshes the log area periodically.
Limitations:
- No structured per-task run records.
- No status model such as pending/running/success/failed/skipped.
- No reliable per-run output attached to a task ID.
- No last run / next run / duration / exit code storage in the Panel DB.
- `wget` callbacks redirect output to `/dev/null`, so useful API responses are discarded.
### Current limitations and bugs
1. No Scheduler-owned database tables.
2. Tasks are stored per agent, so offline agents make task state invisible or stale.
3. Tasks contain API tokens in plain text inside agent task files.
4. Generic command scheduler can run arbitrary shell commands.
5. Admin raw command scheduling is powerful and should remain admin-only or be removed from the normal Scheduler UI.
6. Current customer tasks call the Panel through `wget`, so task execution depends on the agent reaching the Panel HTTP URL.
7. `--no-check-certificate` weakens TLS verification.
8. Task output is discarded for Panel API callbacks.
9. No retry policy.
10. No overlap prevention.
11. No conflict prevention, such as update and restart at the same time.
12. No job lock per game server.
13. No missed-run handling after agent downtime.
14. No clear timezone UX.
15. Admin and customer scheduling models are mixed in the same module.
16. Server content scheduled actions include duplicates and placeholders.
17. Some action names are not customer-friendly.
18. There is no typed argument system for warnings, backup paths, retention, RCON command allowlists, or wipe options.
19. There is no first-class notification support.
20. Linux and Windows store scheduler files in different relative locations.
## 2. Current Action Review
| Action | Keep/Remove/Admin Only | Why | Security Risk | Agent Support | Notes |
|---|---|---|---|---|---|
| `restart` | Keep | Core hosting feature. | Low if implemented through safe action. | Existing Panel API and agent restart support. | Should support warnings, save-world, lock, timeout, and result logging. |
| `stop` | Keep | Useful for scheduled shutdown windows and cost/resource control. | Low. | Existing Panel API and agent stop support. | Should verify stopped state through agent status. |
| `start` | Keep | Useful after maintenance windows. | Low. | Existing Panel API and agent start support. | Should show STARTING/ONLINE result, not only command fired. |
| `steam_auto_update` | Keep, rename | Useful but name is technical. | Medium due to Steam credentials and update side effects. | Existing `steam_cmd` update path. | Rename to `update_server_files`; require game XML installer support. |
| `server_content_check_updates` | Keep internally, remove from customer dropdown for now | Useful as backend action but unclear to customers. | Low. | Partial Panel support. | Replace with clearer `check_content_updates`. |
| `server_content_check_workshop_updates` | Keep internally, remove from customer dropdown for now | Useful once Workshop system is mature. | Low/medium. | Partial support. | Expose later as `check_workshop_updates`. |
| `server_content_install_updates_if_stopped` | Keep internally | Safe behavior for automatic content updates. | Low. | Panel support. | Customer label should be `Update content when stopped`. |
| `server_content_install_updates_next_restart` | Keep internally | Useful queued-update pattern. | Low. | Panel support. | Needs real next-restart integration. |
| `server_content_install_updates_now` | Keep advanced customer/admin | Updates while server may be running can break files. | Medium. | Partial support. | Gate by game support and require warning. |
| `server_content_install_updates_and_restart` | Keep advanced customer/admin | Very useful but needs locking and warnings. | Medium. | Partial support. | Should become `update_mods_and_restart`. |
| `server_content_update_workshop` | Remove from dropdown; keep as internal alias | Duplicate with Workshop update action. | Medium. | Partial support. | Hide until Workshop redesign is implemented. |
| `server_content_update_all` | Remove/merge | Duplicate with install/update all. | Medium. | Partial support. | Replace with one clear `update_all_content`. |
| `server_content_notify_updates_only` | Remove for now | Name suggests notification but no notification system exists. | Low. | Partial check-only path. | Reintroduce after notifications exist. |
| `server_content_validate_files` | Keep admin/advanced | Useful repair/validate action. | Medium. | Partial support via generic script action. | Rename to `validate_content_files`; game support required. |
| `server_content_backup_before_update` | Remove or redesign | Currently sets an option but there is no clear backup implementation in that path. | Medium due to false confidence. | Incomplete. | Replace with first-class backup action and update workflow option. |
| Raw remote shell command | Admin only or remove from normal UI | Powerful but dangerous. | High. | Existing generic scheduler execution. | Should not be available to customers. Should be audited if kept. |
| Legacy `%ACTION=start` | Remove/deprecate | Current Panel uses API callbacks. | Low. | Agent support exists. | Keep only during migration if old task files contain it. |
| Legacy `%ACTION=stop` | Remove/deprecate | Same as above. | Low. | Agent support exists. | Migrate to action registry. |
| Legacy `%ACTION=restart` | Remove/deprecate | Same as above. | Low. | Agent support exists. | Migrate to action registry. |
## 3. Competitor Feature Research
Sources reviewed:
- Nitrado automated tasks guide: https://server.nitrado.net/guides/automated-tasks-en
- Nitrado backup guide/FAQ: https://server.nitrado.net/en-US/guides/how-to-manage-and-restore-nitrado-server-backups/
- BisectHosting Starbase schedules: https://help.bisecthosting.com/hc/en-us/articles/40101097661083-How-to-Schedule-Tasks-on-the-Starbase-panel
- ZAP-Hosting scheduled tasks: https://zap-hosting.com/guides/docs/gameserver-scheduled-tasks
- Pterodactyl client schedules API docs: https://pterodactyl-panel.mintlify.app/api/client/schedules
- Shockbyte backups guide: https://shockbyte.com/help/knowledgebase/articles/how-to-backup-your-server-files
- PingPerfect scheduled server messages guide: https://pingperfect.com/knowledgebase/706/7-Days-to-Die--Scheduled-Server-Messages.html
- Host Havoc scheduled backups guide: https://hosthavoc.com/billing/knowledgebase/479/Creating-Scheduled-Backups.html
- GTXGaming Rust restart warning task guide: https://help.gtxgaming.co.uk/en/article/how-to-setup-restart-tasks-with-messages-for-your-rust-server-15zwnyr/
Common commercial features:
- Scheduled start/stop/restart.
- Scheduled backups.
- Scheduled console/RCON commands.
- Restart warning messages.
- Task offsets within a schedule.
- Backup retention limits.
- Restart-only-if-online option.
- Manual and automatic backup creation.
- Custom task scheduling.
- Game-specific tasks such as Rust wipes or server message tools.
Notable competitor patterns:
- Nitrado exposes simple automated power tasks: restart, start, stop. It also has automatic backups, but docs note timezone issues and game-specific backup behavior.
- BisectHosting Starbase schedules support separate schedules and tasks, including power actions and command tasks with time offsets.
- Pterodactyl's design is strong: schedules have multiple ordered tasks, time offsets, power actions, command actions, backup actions, `only_when_online`, and continue-on-failure behavior.
- ZAP-Hosting exposes start/stop/restart, restart-if-online, create backup, and execute command, with rate limits.
- Shockbyte emphasizes scheduled backup intervals and backup slot/auto-replace retention.
- PingPerfect supports scheduled messages and Console/RCON commands for games like 7 Days to Die.
- GTXGaming documents restart warnings/countdowns for Rust.
What GSP can do better:
- Use typed, safe game-aware actions instead of raw commands.
- Provide prebuilt restart workflows with save-world and warning steps.
- Tie Workshop/mod updates into the Scheduler.
- Add per-task locks and conflict prevention.
- Add structured logs and visible success/failure.
- Add notifications through Discord/email/panel.
- Add game XML capability detection so users only see actions that work.
- Add maintenance windows and "run when empty" automation.
- Add resource-based triggers using existing status/resource collection work.
## 4. Recommended GSP Scheduler Actions
### Customer-safe actions
- Restart server.
- Stop server.
- Start server.
- Backup server.
- Backup selected folders.
- Update Workshop mods.
- Update server content.
- Send warning message.
- Run allowed RCON command.
- Rotate logs.
- Delete old logs using admin-defined retention limits.
- Save world, if the game supports it.
- Check server status.
- Auto-restart if crashed.
### Advanced customer actions
- Scheduled wipe/reset for supported games.
- Validate/repair server files.
- Update SteamCMD game files.
- Clone backup to another server.
- Restore backup.
- Update mods and restart.
- Restart when player count is zero for X minutes.
- Restart if memory too high for X minutes.
- Restart if CPU stuck/high for X minutes.
- Scheduled config file replacement from approved templates.
- Scheduled database backup where applicable.
### Admin-only actions
- Arbitrary shell command.
- Raw script execution.
- Permission repair.
- Force kill process/session.
- Agent/node maintenance.
- Cleanup storage outside a server home.
- Clear global Workshop cache.
- Repair file ownership.
- Restart agent.
- Reboot node.
- Run panel update/maintenance.
## 5. Proposed Action System
Replace free-form action lists with a typed action registry.
Each action definition should include:
- `action_key`
- `display_name`
- `description`
- `category`
- `allowed_roles`
- `required_permissions`
- `supported_os`
- `required_agent_capability`
- `requires_game_running`
- `requires_game_stopped`
- `requires_rcon`
- `requires_workshop_support`
- `requires_steamcmd`
- `arguments_schema`
- `validation_rules`
- `timeout_seconds`
- `retry_policy`
- `overlap_policy`
- `conflict_group`
- `log_policy`
- `notification_events`
Example:
```yaml
scheduled_actions:
  restart_server:
    display_name: Restart Server
    role: customer
    agent_action: stop_wait_start
    required_permissions: [server.power.restart]
    args:
      warning_minutes:
        type: integer
        min: 0
        max: 60
        default: 5
      warning_message:
        type: string
        max_length: 160
        default: "Server restart in {minutes} minutes."
      save_world:
        type: boolean
        default: true
    timeout_seconds: 600
    conflict_group: server_power
    overlap_policy: skip_if_running
```
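To show how an `args` block like the one above could drive validation, here is a minimal sketch. It is written in Python for illustration (the Panel registry itself would be PHP), and `ARG_SCHEMA` and `validate_args` are hypothetical names; only the three argument definitions come from the example.

```python
ARG_SCHEMA = {
    "warning_minutes": {"type": "integer", "min": 0, "max": 60, "default": 5},
    "warning_message": {
        "type": "string",
        "max_length": 160,
        "default": "Server restart in {minutes} minutes.",
    },
    "save_world": {"type": "boolean", "default": True},
}


def validate_args(schema, supplied):
    """Return the supplied arguments merged with defaults, or raise ValueError."""
    unknown = sorted(set(supplied) - set(schema))
    if unknown:
        raise ValueError(f"unknown arguments: {unknown}")
    result = {}
    for name, rule in schema.items():
        value = supplied.get(name, rule.get("default"))
        kind = rule["type"]
        if kind == "integer":
            # bool is a subclass of int in Python, so reject it explicitly
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name} must be an integer")
            if value < rule.get("min", value) or value > rule.get("max", value):
                raise ValueError(f"{name} is out of range")
        elif kind == "string":
            if not isinstance(value, str):
                raise ValueError(f"{name} must be a string")
            if len(value) > rule.get("max_length", len(value)):
                raise ValueError(f"{name} is too long")
        elif kind == "boolean":
            if not isinstance(value, bool):
                raise ValueError(f"{name} must be a boolean")
        else:
            raise ValueError(f"unsupported type {kind!r}")
        result[name] = value
    return result
```

Rejecting unknown argument names outright, rather than ignoring them, keeps stored task definitions from accumulating fields the action never reads.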
### Recommended schedule model
Move from "one CRON line equals one command" to:
- Schedule:
  - name
  - cron expression or interval
  - timezone
  - enabled
  - only_when_online
  - missed_run_policy
- Tasks:
  - ordered tasks within a schedule
  - action key
  - arguments
  - time offset
  - continue on failure
This matches the strongest commercial pattern and allows:
- 10 minutes before restart: send warning.
- 5 minutes before restart: save world.
- At restart time: restart server.
- 5 minutes after restart: send Discord notification.
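The offset model can be sketched as a small planning function. It assumes signed offsets in seconds relative to the schedule's fire time, so that warnings can run before the restart; `plan_task_times` and the action keys in the usage note are hypothetical.

```python
from datetime import datetime, timedelta


def plan_task_times(run_at, tasks):
    """Return (fire_time, action_key) pairs in execution order.

    tasks is a list of (action_key, offset_seconds); a negative offset
    fires before run_at and a positive offset fires after it.
    """
    planned = [(run_at + timedelta(seconds=offset), key) for key, offset in tasks]
    # sorted() is stable, so tasks with equal fire times keep their listed order
    return sorted(planned, key=lambda item: item[0])
```

For a 04:00 restart, `plan_task_times(datetime(2026, 1, 1, 4, 0), [("send_warning", -600), ("save_world", -300), ("restart_server", 0), ("notify_discord", 300)])` yields fire times of 03:50, 03:55, 04:00, and 04:05.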
### Suggested DB tables
`gsp_schedules`
- `id`
- `home_id`
- `remote_server_id`
- `name`
- `cron_minute`
- `cron_hour`
- `cron_day_of_month`
- `cron_month`
- `cron_day_of_week`
- `timezone`
- `enabled`
- `only_when_online`
- `missed_run_policy`
- `max_runtime_seconds`
- `created_by`
- `created_at`
- `updated_at`
`gsp_schedule_tasks`
- `id`
- `schedule_id`
- `sort_order`
- `time_offset_seconds`
- `action_key`
- `arguments_json`
- `continue_on_failure`
- `enabled`
- `created_at`
- `updated_at`
`gsp_schedule_runs`
- `id`
- `schedule_id`
- `home_id`
- `status`
- `scheduled_for`
- `started_at`
- `finished_at`
- `duration_seconds`
- `trigger`
- `last_error`
`gsp_schedule_task_runs`
- `id`
- `schedule_run_id`
- `schedule_task_id`
- `action_key`
- `status`
- `started_at`
- `finished_at`
- `exit_code`
- `message`
- `error`
- `log_path`
- `output_excerpt`
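As a rough illustration of the two run tables, the sketch below creates them in an in-memory SQLite database and reads back the latest status for a schedule. The column types, the use of SQLite, and the helper names are assumptions for the example; the Panel's real schema would follow its own database conventions. The `trigger` column is quoted because it is a SQL keyword and may need quoting in MySQL as well.

```python
import sqlite3

SCHEMA = """
CREATE TABLE gsp_schedule_runs (
    id               INTEGER PRIMARY KEY,
    schedule_id      INTEGER NOT NULL,
    home_id          INTEGER NOT NULL,
    status           TEXT NOT NULL,
    scheduled_for    TEXT NOT NULL,
    started_at       TEXT,
    finished_at      TEXT,
    duration_seconds INTEGER,
    "trigger"        TEXT,
    last_error       TEXT
);
CREATE TABLE gsp_schedule_task_runs (
    id               INTEGER PRIMARY KEY,
    schedule_run_id  INTEGER NOT NULL REFERENCES gsp_schedule_runs(id),
    schedule_task_id INTEGER NOT NULL,
    action_key       TEXT NOT NULL,
    status           TEXT NOT NULL,
    started_at       TEXT,
    finished_at      TEXT,
    exit_code        INTEGER,
    message          TEXT,
    error            TEXT,
    log_path         TEXT,
    output_excerpt   TEXT
);
"""


def open_db():
    """Create an in-memory database with the two run tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_run(conn, schedule_id, home_id, scheduled_for, status, trigger="cron"):
    """Insert one schedule run and return its row id."""
    cur = conn.execute(
        "INSERT INTO gsp_schedule_runs "
        '(schedule_id, home_id, status, scheduled_for, "trigger") '
        "VALUES (?, ?, ?, ?, ?)",
        (schedule_id, home_id, status, scheduled_for, trigger),
    )
    return cur.lastrowid


def last_run_status(conn, schedule_id):
    """Return the status of the most recent run for a schedule, or None."""
    row = conn.execute(
        "SELECT status FROM gsp_schedule_runs WHERE schedule_id = ? "
        "ORDER BY scheduled_for DESC, id DESC LIMIT 1",
        (schedule_id,),
    ).fetchone()
    return row[0] if row else None
```

Keeping run rows in the Panel database is what allows "last status" and "last duration" to be shown even while an agent is offline.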
## 6. XML Integration
Game XML should declare game-specific Scheduler support.
Example:
```xml
<scheduler_support>
  <action key="restart" enabled="1" />
  <action key="rcon_warning" enabled="1" />
  <action key="world_save" enabled="1">
    <command>save</command>
  </action>
  <action key="workshop_update" enabled="1" />
  <action key="wipe" enabled="1">
    <strategy>rust_wipe</strategy>
  </action>
</scheduler_support>
```
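A capability lookup over this block might look like the following sketch. It is Python for illustration only (the Panel would do this in PHP), `supported_actions` is a hypothetical name, and it assumes the `<scheduler_support>` element has already been extracted from the game XML as a string.

```python
import xml.etree.ElementTree as ET


def supported_actions(xml_text):
    """Map each enabled action key to a dict of its child settings."""
    root = ET.fromstring(xml_text)
    actions = {}
    for node in root.findall("action"):
        if node.get("enabled") != "1":
            continue  # disabled actions are hidden from the Scheduler UI
        settings = {child.tag: (child.text or "").strip() for child in node}
        actions[node.get("key")] = settings
    return actions
```

With the example above, the result would contain `world_save` mapped to `{"command": "save"}` and `wipe` mapped to `{"strategy": "rust_wipe"}`, while actions without child elements map to an empty dict.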
Global actions:
- Start server.
- Stop server.
- Restart server.
- Backup server files.
- Rotate logs.
- Delete old backups/logs.
- Check status.
Game-specific actions:
- Send RCON warning.
- Save world.
- Run console command.
- Workshop update.
- Mod update.
- Wipe/reset.
- Database backup.
- Validate files.
Actions requiring RCON:
- Warning message.
- Save world.
- Player-count-aware restart when empty, if query data is not enough.
- Allowed RCON command.
- Game-specific graceful shutdown.
Actions requiring SteamCMD:
- Update SteamCMD game files.
- Validate/repair Steam game files.
Actions requiring Workshop support:
- Update Workshop mods.
- Repair Workshop mods.
- Update mods and restart.
Actions requiring backup support:
- Backup server.
- Backup selected folders.
- Restore backup.
- Clone backup.
## 7. Agent Integration
### Preferred direction
The agent should execute typed scheduled actions, not raw customer shell text.
New agent methods could be:
- `scheduler_action_start(home_id, action_manifest_json)`
- `scheduler_action_status(home_id, action_run_id)`
- `scheduler_action_log(home_id, action_run_id, offset)`
- `scheduler_action_cancel(home_id, action_run_id)`
The Panel should store schedules and send due actions to agents, or the agent should receive a structured schedule manifest from Panel. The cleanest long-term design is Panel-owned schedules plus an agent-side runner for actions.
### Start/stop/restart
Agent should:
- Use existing start/stop/restart functions.
- Use the new agent status model as source of truth.
- Wait for state transitions.
- Return structured result.
Restart should:
1. Optional RCON warning.
2. Optional save-world.
3. Stop.
4. Wait configured seconds.
5. Start.
6. Poll until STARTING/ONLINE/UNRESPONSIVE.
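The restart sequence can be sketched with the agent operations passed in as callables. All parameter names and defaults here are assumptions; only the step order and the three end states come from the list above.

```python
import time

END_STATES = {"STARTING", "ONLINE", "UNRESPONSIVE"}


def restart_server(stop, start, get_status, warn=None, save_world=None,
                   wait_seconds=5, poll_timeout=120, poll_interval=2,
                   sleep=time.sleep, clock=time.monotonic):
    """Run the restart steps in order and return a structured result."""
    steps = []
    if warn is not None:
        warn()
        steps.append("warn")
    if save_world is not None:
        save_world()
        steps.append("save_world")
    stop()
    steps.append("stop")
    sleep(wait_seconds)  # configured pause between stop and start
    start()
    steps.append("start")
    deadline = clock() + poll_timeout
    status = get_status()
    while status not in END_STATES and clock() < deadline:
        sleep(poll_interval)
        status = get_status()
    return {"steps": steps, "status": status if status in END_STATES else "timed_out"}
```

Injecting `sleep` and `clock` keeps the sketch testable without real delays. A fuller version would also verify the stopped state before calling `start`.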
### Backup
Agent should:
- Create compressed archives through a typed backup action.
- Support include/exclude folders from safe config.
- Store backup manifests.
- Enforce retention.
- Avoid backing up transient cache/log folders unless configured.
### Update
Agent should:
- Run SteamCMD update or server content update through typed job actions.
- Avoid overlapping update with running backup/restart.
- Mark restart required when applicable.
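Overlap avoidance could start from a per-server job lock. The sketch below is an in-memory illustration with hypothetical names; a real agent would need a lock that survives restarts, for example a lock file in the agent-owned state directory.

```python
import threading


class ServerJobLocks:
    """Allow at most one scheduled job per game server home at a time."""

    def __init__(self):
        self._guard = threading.Lock()
        self._active = {}  # home_id -> action_key currently running

    def try_acquire(self, home_id, action_key):
        """Claim the server for action_key; return False if it is already busy."""
        with self._guard:
            if home_id in self._active:
                return False
            self._active[home_id] = action_key
            return True

    def active_action(self, home_id):
        """Return the action currently holding the server, or None."""
        with self._guard:
            return self._active.get(home_id)

    def release(self, home_id):
        """Free the server; releasing an idle server is a no-op."""
        with self._guard:
            self._active.pop(home_id, None)
```

A job that fails to acquire the lock would be recorded as `skipped` rather than queued, which matches a `skip_if_running` overlap policy.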
### RCON/console command
Agent should:
- Use existing `send_rcon_command` support.
- Validate commands against action rules.
- Log command and response.
- Redact credentials.
Customer-safe RCON should use templates:
- `say {message}`
- `save`
- `save-all`
- game-specific warning command
Raw RCON text should be advanced/admin controlled.
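Template-based RCON can be sketched as an allowlist plus a conservative character filter for customer-supplied values. The template keys, the permitted character set, and the 160-character cap are assumptions for illustration; each game would define its own rules.

```python
import re

ALLOWED_TEMPLATES = {
    "say": "say {message}",
    "save": "save",
    "save_all": "save-all",
}

# Letters, digits, underscore, space, and basic punctuation; no newlines or separators
_SAFE_VALUE = re.compile(r"[\w .,!?:'()-]{1,160}")


def render_rcon(template_key, **params):
    """Build an RCON command from an allowed template, or raise ValueError."""
    template = ALLOWED_TEMPLATES.get(template_key)
    if template is None:
        raise ValueError(f"template {template_key!r} is not allowed")
    for name, value in params.items():
        if not isinstance(value, str) or not _SAFE_VALUE.fullmatch(value):
            raise ValueError(f"parameter {name!r} contains disallowed characters")
    try:
        return template.format(**params)
    except KeyError as exc:
        raise ValueError(f"missing parameter {exc}") from None
```

Because the customer only supplies the `{message}` value, a string such as `hi; quit` is rejected before it reaches the game console.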
### Mod update
Agent should:
- Run Workshop/server-content job runner from the Workshop design.
- Return job status and logs.
- Mark restart required.
### Log cleanup
Agent should:
- Delete only configured log paths.
- Enforce age/size limits.
- Log the number of removed paths and the total bytes freed.
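A minimal age-based cleanup might look like this. It assumes the caller passes only directories taken from trusted configuration; `cleanup_logs` and its return shape are hypothetical, and size limits are left out for brevity.

```python
import time
from pathlib import Path


def cleanup_logs(log_dirs, max_age_days, now=None):
    """Delete regular files older than max_age_days directly inside log_dirs."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed, freed = 0, 0
    for directory in log_dirs:
        root = Path(directory).resolve()
        if not root.is_dir():
            continue  # skip paths that do not exist on this server
        for path in root.iterdir():
            # never follow symlinks and never descend into subdirectories
            if path.is_symlink() or not path.is_file():
                continue
            info = path.stat()
            if info.st_mtime < cutoff:
                path.unlink()
                removed += 1
                freed += info.st_size
    return {"removed": removed, "bytes": freed}
```

The returned counts are what a task run record would store, so the customer sees how much was cleaned without the log listing every path.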
### Status/resource actions
Agent should:
- Check process/session/port status.
- Optionally check memory/CPU samples.
- Execute conditional restart only after threshold duration.
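The "threshold duration" rule can be made concrete with a small check over recent samples. It assumes samples are `(timestamp_seconds, value)` pairs sorted by time; `exceeded_for` is a hypothetical helper name.

```python
def exceeded_for(samples, threshold, duration_seconds):
    """Return True if the newest samples stay above threshold for long enough.

    Walks backwards from the latest sample and measures how long the
    current unbroken run of over-threshold values has lasted.
    """
    if not samples:
        return False
    latest_ts = samples[-1][0]
    breach_start = None
    for ts, value in reversed(samples):
        if value <= threshold:
            break  # the run of high values ends here
        breach_start = ts
    return breach_start is not None and latest_ts - breach_start >= duration_seconds
```

A single spike never triggers a restart under this rule, because one sample spans zero seconds.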
### Timeouts and failure reporting
Every action should have:
- timeout
- retry count
- retry delay
- result status
- error message
- log excerpt
- correlation ID
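A retry wrapper that produces such a result record could look like the sketch below. The field names mirror the list above, but the function and its defaults are assumptions, and timeout enforcement is deliberately left out because it depends on how the agent runs the action.

```python
import time
import uuid


def run_with_retries(action, retries=2, retry_delay=1.0, sleep=time.sleep):
    """Call action() up to retries + 1 times and return a structured result."""
    correlation_id = uuid.uuid4().hex  # ties Panel and agent log lines together
    attempts = 0
    last_error = None
    while attempts <= retries:
        attempts += 1
        try:
            output = action()
        except Exception as exc:
            last_error = str(exc)
            if attempts <= retries:
                sleep(retry_delay)  # wait before the next attempt
        else:
            return {"status": "success", "attempts": attempts, "error": None,
                    "output": output, "correlation_id": correlation_id}
    return {"status": "failed", "attempts": attempts, "error": last_error,
            "output": None, "correlation_id": correlation_id}
```

Actions that are not safe to repeat, such as a wipe, would set `retries=0` in their registry definition.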
## 8. Task Logs and User Feedback
Recommended run statuses:
- `pending`
- `running`
- `success`
- `failed`
- `skipped`
- `canceled`
- `timed_out`
The UI should show:
- schedule name
- enabled/disabled
- next run time
- last run time
- last status
- last duration
- current running task
- output log
- error message
- retry count
Run details should show:
- each task in the schedule
- action arguments summary
- start time
- finish time
- result
- output/log
Do not rely only on `scheduler.log`.
## 9. Notifications
Supported notification channels:
- Panel notification.
- Email.
- Discord webhook.
- Generic webhook later.
Notification events:
- Before restart.
- After restart.
- Backup succeeded.
- Backup failed.
- Update available.
- Update installed.
- Task skipped because server was offline/running.
- Task failed.
- Disk retention cleanup ran.
Security:
- Webhook URLs must be stored securely.
- Do not expose tokens in task logs.
- Customers should not be able to send arbitrary webhooks from shared infrastructure unless allowed by policy.
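Keeping tokens out of task logs can start with a redaction pass over every logged line. The sketch below masks the `token` and `mod_key` query parameters seen in the current `wget` callbacks; including `password` and the `<redacted>` marker are assumptions for illustration.

```python
import re

# Matches name=value pairs for parameters that carry secrets
_SECRET_PARAM = re.compile(r"(?i)\b(token|mod_key|password)=([^&\s\"']+)")


def redact(text):
    """Replace secret parameter values with a fixed marker before logging."""
    return _SECRET_PARAM.sub(lambda m: f"{m.group(1)}=<redacted>", text)
```

Redaction is a safety net only. The more durable fix is the one proposed earlier: stop embedding tokens in task definitions at all.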
Pre-restart warning types:
- RCON in-game message.
- Console command.
- Discord/webhook message.
- Panel notification.
## 10. Implementation Phases
### Phase 1: Inventory/report only
- Complete this report.
- Do not modify code.
### Phase 2: Remove or hide useless actions
- Hide duplicate server-content actions from customer dropdown.
- Keep internal aliases for backward compatibility.
- Hide `server_content_backup_before_update` until real backup exists.
- Keep raw remote command admin-only.
### Phase 3: Safe action registry
- Add PHP action registry.
- Define roles, permissions, arguments, validation, and display names.
- Replace hardcoded dropdown arrays.
### Phase 4: Task logging
- Add schedule/task/run tables.
- Store run status and results.
- Keep agent `scheduler.log` as low-level debug only.
### Phase 5: Restart/backup/update actions
- Implement typed restart with warning/save-world hooks.
- Implement first-class server backup action.
- Implement update server files action.
### Phase 6: RCON warnings
- Add game XML `scheduler_support`.
- Add allowed warning/save commands.
- Add command templates and validation.
### Phase 7: Workshop update integration
- Integrate with the redesigned Workshop/server-content job system.
- Add "update mods" and "update mods and restart" workflows.
### Phase 8: Notifications
- Add panel notifications.
- Add Discord webhook.
- Add email.
### Phase 9: Commercial polish
- Multi-task schedules with offsets.
- Clone schedule to another server.
- Maintenance window mode.
- Conditional empty-server restart.
- Resource threshold triggers.
- Missed-run handling.
- Conflict and overlap visualization.
## 11. Final Recommendation
### Remove or hide
- Hide raw server-content internal actions from customer dropdown.
- Remove customer-facing `server_content_notify_updates_only` until notifications exist.
- Remove customer-facing `server_content_backup_before_update` until backup is real.
- Merge duplicate update actions into clear labels.
- Deprecate legacy `%ACTION=` task format after migration.
### Keep
- Start server.
- Stop server.
- Restart server.
- SteamCMD update, renamed to `Update server files`.
- Server content / Workshop update, once the Workshop system is mature.
- Admin raw command only behind explicit admin permissions.
### Build first
1. Typed action registry.
2. DB-backed schedules and run logs.
3. Restart server with warning and optional save-world.
4. Backup server with retention.
5. Update server files.
6. Update Workshop mods.
7. Notifications.
### Admin-only
- Shell command.
- Raw script execution.
- Force kill.
- Permission repair.
- Node cleanup.
- Agent restart/reboot.
### Delay until later
- Resource-triggered restarts.
- Wipe/reset workflows.
- Restore backup scheduling.
- Clone schedules.
- Generic webhooks.
- Advanced conditional schedules.
## Summary
The current GSP Scheduler is functional but primitive. It stores CRON lines on agents, executes shell commands, and often calls back into the Panel through `wget`. That makes it flexible, but it does not provide the safety, visibility, or polish expected from a modern commercial game hosting panel.
The recommended path is a typed, DB-backed schedule system with safe action definitions, game XML capability flags, agent-side action execution, structured run logs, notifications, and first-class workflows for restart, backup, update, Workshop mods, and RCON warnings.

File diff suppressed because it is too large