# POP-Controlled Reverse Access Gateway Contract

## Implementation Status

Last updated: 2026-05-30 Asia/Shanghai.

Current phase: MVP bootstrap in progress.

Direction update under evaluation:

* Legacy agent naming has been migrated to `Client Agent` for the first MVP path.
* First capability being implemented:
  * Client connects to Client Agent.
  * Client Agent connects outbound to POP Server.
  * Client Agent wraps all traffic into LayLink frames.
  * POP Server authenticates/authorizes the request.
  * POP Server connects directly to the public target.
  * POP Server relays target data back to Client Agent through LayLink frames.
* KCP acceleration requirement:
  * The Agent-to-POP transport should support a KCP-over-UDP mode.
  * TCP framed mode should remain as a fallback/control-friendly transport.
  * KCP should be introduced behind a transport abstraction instead of leaking into session, policy, or routing code.
* Transport protocol configuration:
  * POP Server uses `POP_ALLOWED_AGENT_TRANSPORTS` to allow one or more Agent-to-POP transports.
  * Agent uses `AGENT_TRANSPORT_PROTOCOL` to choose one concrete transport.
  * Allowed names are `tcp`, `udp`, and `kcp`.
  * Current runnable implementations are `tcp` and experimental `kcp`; `udp` is reserved.
* Feasibility:
  * Workerman supports long-running async TCP servers and custom protocols; it is suitable for the framed fallback/control channel.
  * KCP itself is a UDP-based reliable ARQ protocol, so adding KCP means adding a UDP transport layer and session demultiplexing below the existing LayLink frame protocol.
  * Native PHP KCP support is possible through an extension or FFI binding to `ikcp.c`; pure-PHP KCP is not recommended for production performance.
* Recommended implementation order:
  1. Complete Client Agent naming migration in code, docs, config, and entrypoints.
  2. Implement TCP-framed Client Agent -> POP -> public target path.
  3. Define `TransportInterface` so frame protocol can run over TCP now and KCP later.
  4. Add KCP-over-UDP transport behind the transport abstraction after the TCP framed path is stable.
* Main risk:
  * KCP is not a socket by itself. It needs UDP I/O, timers, packet flush/update scheduling, MTU handling, retransmission tuning, and connection/session management.
  * PHP-only KCP may work as a prototype but is likely CPU-heavy under concurrency.
  * The cleanest production path is a PHP extension/FFI binding or a sidecar KCP transport process.

Completed in this checkpoint:

* Added Composer PSR-4 autoload for `LayLink\\`.
* Added `.env.example`, `config/nodes.php`, `config/policies.php`, and `config/routes.php`.
* Added Workerman CLI entrypoints:
  * `bin/pop-server.php`
  * `bin/client-agent.php`
* Added length-prefixed JSON frame protocol:
  * `Frame`
  * `FrameType`
  * `FrameCodec`
  * `FrameParser`
* Added POP-side MVP services:
  * agent listener with node token auth, heartbeat handling, node registry, and framed session relay
* Added Agent-side MVP client:
  * outbound POP connection
  * AUTH frame
  * heartbeat
  * local allowlist enforcement
  * target TCP connection and DATA/CLOSE relay
* Added local JSONL audit logger at `runtime/audit.log`.
* Configured Workerman logs and pid files under `runtime/`.
* Added `readme.md` with node type descriptions, per-role `.env` requirements, config examples, and deployment checklist.
* Ran `composer dump-autoload`.
* Verified all non-vendor PHP files with `php -l`.
* Verified PSR-4 autoload by instantiating `LayLink\\Protocol\\Frame`.
* Verified POP Workerman startup with localhost high port outside the sandbox:
  * `POP_AGENT_LISTEN=127.0.0.1:19001`
  * worker reached `[OK]` and stopped cleanly via `timeout`.
* Added Agent-to-POP transport configuration:
  * `POP_ALLOWED_AGENT_TRANSPORTS`
  * `AGENT_TRANSPORT_PROTOCOL`
  * POP rejects disallowed Agent transport during node authentication with `transport_not_allowed`.
* Renamed the agent entrypoint and defaults:
  * Client Agent entrypoint is `bin/client-agent.php`
  * default `NODE_ID=client-01`
  * default `NODE_TYPE=client`
  * runtime pid file `runtime/client-agent.pid`
  * worker name `laylink-client-agent`
* Verified `bin/client-agent.php` starts under Workerman and reaches `[OK]` in a short smoke test.
* Reworked new MVP data path:
  * local client connects to one enabled Client Agent ingress listener
  * Client Agent sends `OPEN` to POP Server
  * POP Server authenticates client request and checks policy
  * POP Server opens the public target directly
  * `DATA` and `CLOSE` frames relay the stream between Client Agent and POP Server
* Added Client Agent local ingress protocols:
  * `socks5`: SOCKS5 `CONNECT`; default enabled on `127.0.0.1:1080`
  * `http-proxy`: HTTP `CONNECT` and ordinary HTTP absolute URL proxy requests; default disabled on `127.0.0.1:8080`
  * `raw-json`: newline JSON debug ingress; default disabled on `127.0.0.1:9000`
* Added per-ingress env switches, listen IPs, and listen ports:
  * `CLIENT_AGENT_SOCKS5_ENABLED`
  * `CLIENT_AGENT_SOCKS5_LISTEN_IP`
  * `CLIENT_AGENT_SOCKS5_LISTEN_PORT`
  * `CLIENT_AGENT_SOCKS5_AUTH_MODE`
  * `CLIENT_AGENT_SOCKS5_USERNAME`
  * `CLIENT_AGENT_SOCKS5_PASSWORD`
  * `CLIENT_AGENT_HTTP_PROXY_ENABLED`
  * `CLIENT_AGENT_HTTP_PROXY_LISTEN_IP`
  * `CLIENT_AGENT_HTTP_PROXY_LISTEN_PORT`
  * `CLIENT_AGENT_RAW_JSON_ENABLED`
  * `CLIENT_AGENT_RAW_JSON_LISTEN_IP`
  * `CLIENT_AGENT_RAW_JSON_LISTEN_PORT`
* Added Client Agent default identity for generated proxy requests:
  * `CLIENT_AGENT_AUTH_TOKEN`
  * `CLIENT_AGENT_USER_ID`
* Completed SOCKS5 TCP proxy protocol handling for the current MVP:
  * method negotiation
  * no-auth method
  * RFC1929 username/password method
  * IPv4/domain/IPv6 target address parsing
  * `CONNECT`
  * standard SOCKS5 failure replies
  * `BIND` returns command-not-supported
  * `UDP ASSOCIATE` returns a local UDP relay endpoint and uses LayLink `UDP_DATA` frames
* Added LayLink UDP datagram relay path:
  * Client Agent parses SOCKS5 UDP request packets
  * Client Agent sends `UDP_DATA` frames to POP Server
  * POP Server validates client auth and `protocol=udp` policy
  * POP Server sends datagrams to public UDP targets
  * POP Server returns UDP responses as `UDP_DATA`
* Added UDP egress sample policy `public-udp-egress` for ports `53`, `123`, and `443`.
* Added `FrameType::descriptions()` as the code-level frame type catalog.
* Verified Client Agent can start SOCKS5, HTTP proxy, and raw-json listeners together on localhost high ports.
* Verified Client Agent can start SOCKS5 TCP plus SOCKS5 UDP relay listeners together on localhost high ports.
* Added `scripts/verify-socks5.sh` to verify real SOCKS5 HTTPS requests:
  * `https://bing.com/` for connectivity and HTTPS support
  * `https://ip.sb/` for egress IP
* Reorganized `.env.example` into readable sections:
  * `[config]`
  * `[kcp]`
  * `[client-agent]`
  * `[pop-server]`
  * Section headers are comments-for-humans in practice; the current Env loader ignores lines without `=`.
* Removed deprecated/compatibility-only surfaces:
  * `POP_CLIENT_LISTEN`
  * POP direct client listener
  * `src/Server/ClientListener.php`
  * `bin/client-gateway.php`
  * `src/Client/ClientGateway.php`
  * `bin/border-agent.php`
  * sample border node and border policy docs
* Verified POP now starts with only `laylink-pop-agent-listener`.
* Fixed SOCKS5 error behavior when POP is not connected:
  * SOCKS5 method negotiation no longer returns text errors.
  * POP connection failures during CONNECT now return standard SOCKS5 failure replies.
* Added Agent-to-POP Frame encryption:
  * `LAYLINK_FRAME_ENCRYPTION=none|chacha20`
  * `LAYLINK_FRAME_ENCRYPTION_KEY`
  * POP Server and Client Agent must use identical encryption settings.
  * `chacha20` currently uses libsodium XChaCha20 stream encryption with a random nonce per frame.
* Verified `none` and `chacha20` FrameCodec encode/decode round trips.
* Verified POP Server starts with `LAYLINK_FRAME_ENCRYPTION=chacha20`.
* Added port range matching for policy and agent allowlist ports:
  * `target_ports` supports exact ports such as `80` and string ranges such as `'8080-10080'`.
  * `allowed_ports` supports the same syntax.
* Allowed sample public TCP egress policy on port range `'8080-10080'` for HTTP-alt/speedtest style endpoints.
* Optimized TCP stream `DATA` frames:
  * Control frames remain JSON.
  * TCP `DATA` payloads now use binary frame encoding when both ends run the updated code.
  * This removes base64 expansion and JSON string encoding from the hot TCP data path.
* Verified binary TCP `DATA` frame encode/decode under both `none` and `chacha20`.
* Added POP-side target DNS pre-resolution:
  * Domain resolution failures return `OPEN_FAIL` with `dns_resolution_failed`.
  * POP no longer lets common target DNS failures bubble up as raw `stream_socket_client()` warnings.
* Added TCP backpressure for large transfers:
  * POP pauses target reads when the Agent connection send buffer crosses the high watermark.
  * POP resumes target reads when the Agent connection drains.
  * Client Agent pauses local client reads when the POP connection send buffer crosses the high watermark.
  * Client Agent pauses POP reads while a local client output buffer is full.
  * Send buffer limits default to 64 MiB with a 32 MiB backpressure high watermark.
  * Tuning envs:
    * `LAYLINK_DATA_CHUNK_BYTES`
    * `LAYLINK_MAX_SEND_BUFFER_BYTES`
    * `LAYLINK_BACKPRESSURE_HIGH_WATERMARK_BYTES`
* Fixed large-download truncation risk:
  * Client Agent now treats POP `CLOSE` as a graceful remote EOF and waits for the local client send buffer to drain before closing the local socket.
  * TCP `DATA` is split into configurable chunks, defaulting to 1 MiB, to reduce frame overhead while avoiding oversized frames.
  * POP refreshes Agent activity on any valid frame, not only `PING`, reducing heartbeat false positives during heavy traffic.
* Started Agent-to-POP transport abstraction:
  * Added `FrameClientTransport`.
  * Added `TcpFrameClientTransport` as the current TCP implementation.
  * `AgentClient` now sends and receives LayLink frames through the transport interface instead of directly owning `AsyncTcpConnection` and `FrameParser`.
  * This preserves current TCP behavior while preparing a `KcpFrameClientTransport` implementation.
* Added POP-side frame transport abstraction:
  * Added `FrameServerConnection`.
  * Added `TcpFrameServerConnection`.
  * Added `TcpFrameServerListener`.
  * `AgentListener`, `NodeConnection`, `NodeRegistry`, and `TunnelSession` now hold Agent connections through `FrameServerConnection`.
  * TCP listener decode/encode details are isolated from POP session, policy, heartbeat, and relay logic.
* Added transport factory/config selection:
  * `FrameClientTransportFactory` maps `AGENT_TRANSPORT_PROTOCOL=tcp` to `TcpFrameClientTransport`.
  * `FrameServerListenerFactory` maps the implemented POP transport `tcp` to `TcpFrameServerListener`.
  * `FrameClientTransportFactory` maps `AGENT_TRANSPORT_PROTOCOL=kcp` to `KcpFrameClientTransport`.
  * `FrameServerListenerFactory` maps POP transport `kcp` to `KcpFrameServerListener`.
  * `udp` still fails at factory boundaries with explicit not-implemented errors instead of leaking into business logic.
* Added experimental multi-connection Client Agent -> POP support:
  * `CLIENT_AGENT_POP_CONNECTIONS` controls how many parallel Agent-to-POP long connections a Client Agent opens.
  * New local TCP sessions are distributed round-robin across authenticated POP transports.
  * Each session stays bound to its selected POP transport for the whole session lifetime.
  * POP `NodeRegistry` now supports multiple live connections under the same `NODE_ID`.
  * Heartbeat activity and offline cleanup are tracked per Agent connection.
* KCP/UDP implementation decision:
  * Start with `kcp` before raw `udp` for Agent-to-POP frame transport.
  * Existing TCP tunnel sessions require ordered, reliable byte-stream semantics; raw UDP would need retransmission, ordering, MTU fragmentation, congestion/window handling, and session cleanup.
  * Implementing raw UDP as a general Frame transport would effectively recreate a weaker KCP.
  * Keep the existing SOCKS5 `UDP ASSOCIATE`/`UDP_DATA` feature separate: it is application datagram relay over the current reliable Agent-to-POP channel, not the Agent-to-POP transport itself.
  * Recommended KCP path is a transport implementation behind `FrameClientTransport` / `FrameServerConnection`, backed by a native extension, FFI binding, or sidecar process rather than pure PHP for production throughput.
* Added KCP Agent-to-POP transport:
  * `KcpPacketCodec` defines UDP packet types for `SYN`, `SYN_ACK`, `DATA`, `ACK`, and `CLOSE`.
  * `KcpFrameClientTransport` runs Client Agent frames over UDP while preserving the existing `FrameClientTransport` interface.
  * `KcpFrameServerListener` and `KcpFrameServerConnection` expose KCP/UDP sessions to POP as `FrameServerConnection`.
  * POP can now listen on both TCP and KCP when `POP_ALLOWED_AGENT_TRANSPORTS=tcp,kcp`.
  * `NativeKcpSession` uses PHP FFI to call native upstream `ikcp.c` through `native/kcp/liblaylink_kcp.so`.
  * `scripts/build-kcp-ffi.sh` builds the native shared library from vendored `native/kcp/ikcp.c`.
  * `LAYLINK_KCP_BACKEND=ffi` selects the native KCP backend; `LAYLINK_KCP_BACKEND=php` remains as a debugging fallback through `KcpReliableSession`.
  * `LAYLINK_KCP_FFI_LIB` can point to a custom native KCP library path.
* Added array-style env parsing:
  * `Env::csv()` accepts traditional comma-separated values such as `tcp,kcp`.
  * `Env::csv()` also accepts JSON arrays such as `["tcp","kcp"]`.
* Added KCP congestion and UDP EAGAIN controls:
  * `KcpUdpPacketSender` bypasses Workerman `UdpConnection::send()` for KCP packets and uses suppressed `stream_socket_sendto()` directly.
  * UDP `EAGAIN` / "Resource temporarily unavailable" no longer emits PHP warnings from the KCP transport path.
  * KCP packets that cannot be sent immediately are queued and retried on subsequent transport ticks.
* Added KCP tuning envs:
  * `LAYLINK_KCP_NODELAY`
  * `LAYLINK_KCP_INTERVAL_MS`
  * `LAYLINK_KCP_FAST_RESEND`
  * `LAYLINK_KCP_NO_CONGESTION_CONTROL`
  * `LAYLINK_KCP_SEND_WINDOW`
  * `LAYLINK_KCP_RECV_WINDOW`
  * `LAYLINK_KCP_MTU_BYTES`
  * `LAYLINK_KCP_TICK_MS`
  * `LAYLINK_KCP_UDP_SEND_QUEUE_BYTES`
  * `LAYLINK_KCP_UDP_FLUSH_PACKETS`
  * `LAYLINK_KCP_OUTPUT_DRAIN_PACKETS`
* Added POP worker count configuration:
  * `POP_AGENT_TCP_WORKERS` controls TCP Agent listener worker count.
  * `POP_AGENT_KCP_WORKERS` is exposed but currently clamped to `1` in `bin/pop-server.php`.
  * KCP/UDP must remain single-worker in the current architecture because KCP session state is process-local and UDP packets for one conv can otherwise be handled by different workers.
  * Native KCP output draining is capped per tick by `LAYLINK_KCP_OUTPUT_DRAIN_PACKETS` to reduce single-flow event-loop monopolization during large downloads.
* Fixed KCP POP-side session lookup for real-world UDP/NAT behavior:
  * POP no longer depends only on `remote ip:port + conv` for KCP session lookup.
  * POP keeps a `conv -> session` index and migrates the current UDP remote address when a known `conv` arrives from a changed source port.
  * KCP callbacks now resolve the active connection by `conv`, so migrated sessions continue delivering frames instead of silently dropping `DATA`.
* Fixed native KCP send-buffer accounting after heavy speedtest-style traffic:
  * The previous PHP-side native KCP `queuedBytes` counter could grow during large transfers and never fall back to zero unless user payload was received in the opposite direction.
  * This could make a long-lived KCP Agent connection permanently appear full after upload/download tests, causing later `OPEN_OK` / `DATA` sends to fail and audit rows to show zero transferred bytes.
  * Added native wrapper `laylink_kcp_waitsnd()` around upstream `ikcp_waitsnd()`.
  * `NativeKcpSession::getSendBufferQueueSize()` now derives watermarks from real KCP pending segment count plus pending UDP output bytes.
  * POP now treats failure to send `OPEN_OK` as `agent_buffer_overflow` instead of leaving a target connection open with no client-visible success.
* Restored KCP high-throughput defaults:
  * Early native KCP testing used hardcoded `nodelay=1, interval=10, resend=2, nc=1, sndwnd=1024, rcvwnd=1024, mtu=1350`.
  * The first exposed `.env` defaults were more conservative (`nc=0`, smaller windows, `mtu=1200`) and could reduce throughput dramatically on speedtest-style high-BDP paths.
  * `NativeKcpSession`, KCP UDP queue/flush fallback defaults, current `.env`, and `.env.example` now use the high-throughput profile by default.
  * For lossy or congested paths, tune down to `LAYLINK_KCP_NO_CONGESTION_CONTROL=0`, `LAYLINK_KCP_MTU_BYTES=1200`, smaller windows, or lower flush/drain packet counts.

Known MVP limitations:

* The current sandbox cannot bind TCP sockets; startup smoke tests need escalation or a normal shell environment.
* raw-json debug ingress uses newline-delimited JSON before switching to raw tunnel mode. Example:

  ```json
  {"auth_token":"dev-token","user_id":"admin","target_host":"example.com","target_port":443,"protocol":"tcp"}
  ```

* No TLS yet.
* No production-grade client identity yet; `dev-token` is hardcoded for MVP development.
* No automated integration test harness yet.
* TCP stream forwarding can now use multiple Agent-to-POP connections per Client Agent, but a single TCP session is still pinned to one POP transport. Binary `DATA` frames, chunking, graceful EOF, and backpressure reduce per-byte overhead and buffer blowups; KCP is experimental and still needs throughput/loss tuning, while multipath and per-session flow-control tuning are future performance work.
* No explicit idle timeout or connect timeout enforcement yet.
* UDP relay is datagram-oriented and currently creates short-lived POP-side UDP sockets per outbound datagram; pooling and stronger timeout accounting are still future work.
* HTTP proxy supports `CONNECT` and ordinary absolute URL HTTP requests; advanced proxy auth and full HTTP/2 proxying are not implemented.

Next recommended tasks:

1. Add a local integration harness that starts POP, Client Agent, and a mock TCP echo target, then verifies authorized tunnel, policy denial, and agent local denial.
2. Add configurable client auth token or JWT-ready auth interface.
3. Add target connect timeout and session idle timeout.
4. Add more detailed buffer overflow audit reasons and metrics.
5. Add README quickstart with exact local commands.
6. Add a reproducible throughput benchmark script for direct-vs-LayLink comparisons.
7. Keep TCP tuning as an ongoing task:
   * benchmark `LAYLINK_DATA_CHUNK_BYTES` at `524288`, `1048576`, `2097152`, and `4194304`
   * benchmark buffer pairs such as `64MiB/32MiB` and `128MiB/64MiB`
   * record direct-vs-LayLink throughput, CPU, memory, and disconnect behavior
8. Benchmark and tune `CLIENT_AGENT_POP_CONNECTIONS` for 1, 2, 4, and 8 long connections under mixed single-download and multi-session workloads.
9. Benchmark native FFI `kcp` against `tcp` under latency, loss, and high-throughput workloads; tune KCP nodelay, window, MTU, resend, interval, UDP queue, and flush settings.
10. Design KCP horizontal scaling before allowing `POP_AGENT_KCP_WORKERS>1`; options include multiple POP ports/instances, reuseport five-tuple affinity, external session state, or a UDP dispatcher keyed by conv.
11. Add raw UDP Agent-to-POP transport only for explicitly datagram-oriented frame classes, or after a reliability/window design exists.
12. Add per-session flow-control windows to reduce head-of-line blocking on one Agent connection.
13. Optimize UDP relay with POP-side UDP socket pooling.
14. Add UDP association idle timeouts and cleanup.
15. Aggregate UDP audit records per association instead of per datagram.
16. Add UDP and per-user rate limiting.

## 0. Project Name

`LayLink`

This project implements a PHP Workerman-based reverse tunnel gateway.

The system allows a Client Agent to establish an outbound persistent framed connection to a POP Server. The POP Server authenticates clients, enforces access policy, selects a route, and forwards authorized TCP streams to public Internet targets or later restricted network zones.

This is **not** a full Layer-3 VPN. It is a policy-controlled Layer-4 reverse access gateway.

---

## 1. Core Architecture

### 1.1 Node Types

The MVP contains two core logical node types:

1. `POP Server`
2. `Client Agent`

### 1.2 Required Topology

```text
Client
  |
  v
POP Server
  |
  +--> Direct public egress
  |
  +--> Client Agent framed access
```

### 1.3 Network Constraints

The Client Agent is located on the client side.

The Client Agent:

* Accepts local or LAN client connections.
* Initiates outbound connections to `popserver1`, for example `10.1.0.2`.
* Wraps client requests and stream data in LayLink frames.

The POP Server:

* Accepts user/client access.
* Maintains persistent connections from Client Agents.
* Performs authentication, authorization, route selection, session management, and auditing.
* Can optionally connect directly to public Internet destinations.

---

## 2. Non-Negotiable Design Principles

### 2.1 POP Server Owns Policy

The POP Server is the only component allowed to make authorization decisions.

Agents must not accept arbitrary user-specified forwarding requests. Agents only execute explicit `OPEN` instructions issued by the POP Server after authorization.

### 2.2 Agents Are Controlled Executors

Client Agents are controlled executors.

They may:

* Authenticate themselves to the POP Server.
* Maintain heartbeat.
* Accept local client connections on explicitly configured local proxy listeners.
* Send `OPEN` instructions to the POP Server.
* Relay stream data.
* Close sessions.

They must not:

* Expose a public SOCKS5/HTTP proxy unless explicitly configured and protected.
* Make authorization decisions locally.
* Override POP policy.
* Route traffic outside POP authorization.

### 2.3 No Full VPN in MVP

The MVP must not implement TUN/TAP, virtual network interfaces, routing tables, or full Layer-3 VPN behavior.

The MVP only supports authorized TCP stream forwarding.

UDP support may be added later.

---

## 3. MVP Scope

The first implementation must support:

1. POP Server starts a TCP listener for clients.
2. POP Server starts a TCP listener for agents.
3. Client Agent connects outbound to POP Server.
4. Agent authenticates with `node_id` and `node_token`.
5. Client connects to POP Server and requests access to a target.
6. POP Server checks policy.
7. POP Server selects a route.
8. POP Server sends `OPEN` frame to selected Agent.
9. Agent connects to the target service.
10. POP Server relays bidirectional TCP data between client and agent.
11. Session closes cleanly on either side disconnecting.
12. Audit log records the session.

MVP does not need:

* UDP relay.
* Web UI.
* Multi-POP clustering.
* Distributed HA.
* TLS certificate automation.
* SSH command audit.
* Database SQL audit.
* Complex identity provider integration.

---

## 4. Recommended Technology Stack

Language:

```text
PHP 8.2+
```

Framework:

```text
Workerman
```

Recommended packages:

```text
workerman/workerman
monolog/monolog
vlucas/phpdotenv
ramsey/uuid
```

Optional later:

```text
firebase/php-jwt
illuminate/database
react/promise
```

---

## 5. Directory Structure

The project should use the following structure:

```text
pop-tunnel-gateway/
  composer.json
  .env.example
  README.md
  CONTRACT.md
  bin/
    pop-server.php
    client-agent.php
  config/
    routes.php
    nodes.php
    policies.php
  src/
    Protocol/
      Frame.php
      FrameType.php
      FrameCodec.php
      FrameParser.php
    Server/
      PopServer.php
      AgentListener.php
    Agent/
      AgentClient.php
      TargetConnector.php
    Client/
      ClientGateway.php
    Session/
      TunnelSession.php
      SessionManager.php
    Node/
      NodeRegistry.php
      NodeConnection.php
    Auth/
      NodeAuthenticator.php
      ClientAuthenticator.php
      PolicyChecker.php
    Route/
      RouteResolver.php
      RouteDecision.php
    Audit/
      AuditLogger.php
    Util/
      BufferLimiter.php
      LoggerFactory.php
```

---

## 6. Frame Protocol

The system must use a framed protocol between POP Server and Agents.

Raw stream passthrough between POP and Agent is not allowed because the system needs multiplexing, session IDs, heartbeats, error handling, and auditability.

### 6.1 Frame Types

Required frame types:

| Type | Direction | Meaning |
| --- | --- | --- |
| `AUTH` | Client Agent -> POP | Agent authenticates itself with `node_id`, `node_type`, `node_token`, and `transport_protocol`. |
| `AUTH_OK` | POP -> Client Agent | Agent authentication accepted. |
| `AUTH_FAIL` | POP -> Client Agent | Agent authentication rejected; POP closes the connection after sending this frame. |
| `PING` | Client Agent -> POP | Agent heartbeat with active session count, load, and timestamp. |
| `PONG` | POP -> Client Agent | Heartbeat response. |
| `OPEN` | Client Agent -> POP | Client Agent requests POP to authorize and open a target stream. |
| `OPEN_OK` | POP -> Client Agent | POP has connected the target and the stream can begin. |
| `OPEN_FAIL` | POP -> Client Agent | POP rejected or failed the requested target stream. |
| `DATA` | Bidirectional | Stream bytes for one `session_id`; TCP stream payloads use binary frame encoding when both ends are updated. |
| `UDP_DATA` | Bidirectional | UDP datagram bytes for one UDP association; MVP payload uses base64 and includes target metadata. |
| `CLOSE` | Bidirectional | Close one stream session. |
| `ERROR` | Bidirectional | Explicit protocol or session error. |
| `WINDOW` | Bidirectional | Reserved flow-control window update for future backpressure. |

For the new MVP, `OPEN` always means:

```text
Client Agent asks POP Server to connect to the target.
```

It does not mean POP asks Agent to connect to an intranet target. That older direction is reserved for a later executor-agent mode.

### 6.2 Frame Fields

Each frame must contain:

```text
version
type
session_id
payload_length
payload
```

Control frames use JSON payloads. TCP stream `DATA` frames may use the binary DATA encoding below.

### 6.3 Frame Encoding

For control frames, use length-prefixed JSON frames.

Format:

```text
uint32_be length
json_payload
```

Example decoded control frame:

```json
{
  "version": 1,
  "type": "OPEN",
  "session_id": "018f6f4a-xxxx-xxxx",
  "payload": {
    "target_host": "example.com",
    "target_port": 443,
    "protocol": "tcp"
  }
}
```

TCP stream `DATA` frames use a binary body before optional encryption:

```text
uint32_be encrypted_or_plain_body_length
"LLB1"
uint8 binary_type=1
uint16_be session_id_length
session_id bytes
raw TCP payload bytes
```

Legacy JSON/base64 `DATA` decoding remains accepted for compatibility, but updated senders should emit binary `DATA`.

---

## 7. Agent Authentication

When an Agent connects to POP Server, it must immediately send an `AUTH` frame.
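As a rough illustration of how that first `AUTH` frame could be put on the wire using the length-prefixed JSON format from section 6.3, here is a minimal sketch. `encodeControlFrame()` and `decodeControlFrame()` are hypothetical helpers, not the project's `FrameCodec` API, and the sketch assumes the `uint32_be` length counts only the JSON bytes (not the 4-byte prefix) and that no frame encryption is enabled.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not the project's FrameCodec): encode one control
 * frame as uint32_be length + JSON body. Assumption: the length covers only
 * the JSON bytes, not the 4-byte prefix itself.
 */
function encodeControlFrame(string $type, ?string $sessionId, array $payload): string
{
    $json = json_encode([
        'version' => 1,
        'type' => $type,
        'session_id' => $sessionId,
        'payload' => $payload,
    ], JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES);

    return pack('N', strlen($json)) . $json;
}

/**
 * Try to decode one frame from the front of $buffer.
 * Returns [frame array, remaining bytes], or null when more bytes are needed.
 */
function decodeControlFrame(string $buffer): ?array
{
    if (strlen($buffer) < 4) {
        return null; // length prefix not complete yet
    }
    $length = unpack('N', substr($buffer, 0, 4))[1];
    if (strlen($buffer) < 4 + $length) {
        return null; // body not complete yet
    }
    $frame = json_decode(substr($buffer, 4, $length), true, 512, JSON_THROW_ON_ERROR);

    return [$frame, substr($buffer, 4 + $length)];
}

// Build the AUTH frame an Agent would send right after connecting.
$wire = encodeControlFrame('AUTH', null, [
    'node_id' => 'client-01',
    'node_type' => 'client',
    'node_token' => 'CHANGE_ME',
    'transport_protocol' => 'tcp',
]);

// Round-trip it the way a receiving parser might.
[$frame, $rest] = decodeControlFrame($wire);
echo $frame['type'], ' frame, ', strlen($wire), " bytes on the wire\n";
```

Returning `null` for a short buffer lets a stream parser keep appending received bytes and retry, which is the usual way to handle TCP delivering a frame in several pieces.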
Example:

```json
{
  "version": 1,
  "type": "AUTH",
  "session_id": null,
  "payload": {
    "node_id": "client-01",
    "node_type": "client",
    "node_zone": "corp",
    "node_token": "CHANGE_ME",
    "supported_protocols": ["tcp"]
  }
}
```

POP Server must verify:

```text
node_id exists
node_token matches
node_type matches config
node is not disabled
```

On success:

```json
{
  "version": 1,
  "type": "AUTH_OK",
  "session_id": null,
  "payload": {
    "node_id": "client-01",
    "heartbeat_interval": 10
  }
}
```

On failure:

```json
{
  "version": 1,
  "type": "AUTH_FAIL",
  "session_id": null,
  "payload": {
    "reason": "invalid_node_token"
  }
}
```

POP Server must close the connection after `AUTH_FAIL`.

---

## 8. Heartbeat

Agents must send `PING` every 10 seconds by default.

Example:

```json
{
  "version": 1,
  "type": "PING",
  "session_id": null,
  "payload": {
    "node_id": "client-01",
    "active_sessions": 12,
    "load": 0.35,
    "timestamp": 1710000000
  }
}
```

POP Server replies:

```json
{
  "version": 1,
  "type": "PONG",
  "session_id": null,
  "payload": {
    "timestamp": 1710000001
  }
}
```

If no heartbeat is received for 30 seconds, POP Server must mark the node as offline and close all sessions routed through that node.

---

## 9. Client Request Model

For MVP, the client may connect directly to a POP TCP listener and submit an initial JSON request.

Example:

```json
{
  "auth_token": "dev-token",
  "target_host": "192.168.10.20",
  "target_port": 22,
  "protocol": "tcp",
  "route_hint": "client-01"
}
```

After the initial request is accepted, the TCP stream becomes a bidirectional tunnel.

Later versions may implement:

```text
SOCKS5
HTTP CONNECT
WebSocket tunnel
mTLS client identity
JWT authentication
```

---

## 10. Route Selection

The POP Server must use `RouteResolver` to decide where traffic should go.

Possible route types:

```text
direct
agent
border
reject
```

Example route decision:

```php
[
    'allowed' => true,
    'route_type' => 'agent',
    'node_id' => 'client-01',
    'policy_id' => 'corp-ssh-admin',
]
```

Client `route_hint` is advisory only.
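To make the "advisory only" rule concrete, here is a small default-deny resolver sketch. `portMatches()` and `resolveRoute()` are hypothetical names rather than the project's `RouteResolver` API, and the policy array shape is assumed to follow the `config/policies.php` examples in this contract. In this sketch the hint can only pick among policies that already match; it can never create a route.

```php
<?php
declare(strict_types=1);

/** True when $port equals an exact entry or falls inside a 'low-high' string range. */
function portMatches(int $port, array $allowed): bool
{
    foreach ($allowed as $entry) {
        if (is_int($entry) || (is_string($entry) && ctype_digit($entry))) {
            if ((int) $entry === $port) {
                return true;
            }
        } elseif (is_string($entry) && preg_match('/^(\d+)-(\d+)$/', $entry, $m)) {
            if ($port >= (int) $m[1] && $port <= (int) $m[2]) {
                return true;
            }
        }
    }
    return false;
}

/** Hypothetical resolver: default deny; route_hint only breaks ties between matching policies. */
function resolveRoute(array $policies, string $user, string $host, int $port, ?string $routeHint): array
{
    $matches = [];
    foreach ($policies as $p) {
        $hostOk = in_array('*', $p['target_hosts'], true) || in_array($host, $p['target_hosts'], true);
        if (($p['enabled'] ?? false) && in_array($user, $p['users'], true)
            && $hostOk && portMatches($port, $p['target_ports'])) {
            $matches[] = $p;
        }
    }
    if ($matches === []) {
        return ['allowed' => false, 'route_type' => 'reject', 'node_id' => null, 'policy_id' => null];
    }
    $chosen = $matches[0];
    foreach ($matches as $p) {
        if ($routeHint !== null && ($p['node_id'] ?? null) === $routeHint) {
            $chosen = $p; // the hint is honoured only when policy already allows that node
            break;
        }
    }
    return [
        'allowed' => true,
        'route_type' => $chosen['route_type'],
        'node_id' => $chosen['node_id'] ?? null,
        'policy_id' => $chosen['policy_id'],
    ];
}

$policies = [
    ['policy_id' => 'corp-ssh-admin', 'users' => ['admin', 'devops'],
     'target_hosts' => ['192.168.10.20', '192.168.10.21'], 'target_ports' => [22],
     'route_type' => 'agent', 'node_id' => 'client-01', 'enabled' => true],
    ['policy_id' => 'public-web-egress', 'users' => ['normal-user', 'admin'],
     'target_hosts' => ['*'], 'target_ports' => [80, 443, '8080-10080'],
     'route_type' => 'direct', 'enabled' => true],
];

// A hint naming an unknown node does not change the policy-derived route.
$ssh = resolveRoute($policies, 'admin', '192.168.10.20', 22, 'client-99');
$web = resolveRoute($policies, 'admin', 'example.com', 9000, null);
$denied = resolveRoute($policies, 'guest', 'example.com', 443, 'client-01');
```

A real resolver would also need to check node availability and the requested protocol before returning an `agent` route.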
The POP Server may ignore, override, or reject the route hint.

---

## 11. Policy Rules

Policies must be defined in `config/policies.php`.

Example:

```php
return [
    [
        'policy_id' => 'corp-ssh-admin',
        'users' => ['admin', 'devops'],
        'target_hosts' => ['192.168.10.20', '192.168.10.21'],
        'target_ports' => [22],
        'route_type' => 'agent',
        'node_id' => 'client-01',
        'enabled' => true,
    ],
    [
        'policy_id' => 'public-web-egress',
        'users' => ['normal-user', 'admin'],
        'target_hosts' => ['*'],
        'target_ports' => [80, 443, '8080-10080'],
        'route_type' => 'direct',
        'enabled' => true,
    ],
];
```

Policy matching must consider:

```text
user identity
target host
target port
protocol
requested route
node availability
policy enabled/disabled state
```

Default behavior must be deny.

---

## 12. Agent Local Allowlist

Each Agent must enforce its own local allowlist.

Example `config/nodes.php`:

```php
return [
    'client-01' => [
        'node_type' => 'client',
        'token' => 'CHANGE_ME',
        'allowed_cidrs' => [
            '192.168.0.0/16',
            '10.10.0.0/16',
        ],
        'allowed_ports' => [22, 80, 443, '8080-10080', 3306, 5432],
        'enabled' => true,
    ],
];
```

If an Agent receives an `OPEN` request outside its local allowlist, it must return `OPEN_FAIL`.

Example:

```json
{
  "version": 1,
  "type": "OPEN_FAIL",
  "session_id": "018f6f4a-xxxx",
  "payload": {
    "reason": "agent_local_policy_denied"
  }
}
```

---

## 13. Opening a Target Connection

POP Server sends:

```json
{
  "version": 1,
  "type": "OPEN",
  "session_id": "018f6f4a-xxxx",
  "payload": {
    "target_host": "192.168.10.20",
    "target_port": 22,
    "protocol": "tcp",
    "user_id": "admin",
    "policy_id": "corp-ssh-admin"
  }
}
```

Agent connects to the target.

On success:

```json
{
  "version": 1,
  "type": "OPEN_OK",
  "session_id": "018f6f4a-xxxx",
  "payload": {
    "target_host": "192.168.10.20",
    "target_port": 22
  }
}
```

On failure:

```json
{
  "version": 1,
  "type": "OPEN_FAIL",
  "session_id": "018f6f4a-xxxx",
  "payload": {
    "reason": "connection_refused"
  }
}
```

---

## 14. Data Forwarding

After `OPEN_OK`, data is exchanged with `DATA` frames.

Updated implementations encode TCP `DATA` as the binary frame described in section 6.3. Legacy JSON/base64 `DATA` frames may still be decoded during rolling upgrades.

Both POP Server and Agent must map `session_id` to the corresponding local TCP connection.

---

## 15. Session Lifecycle

Session states:

```text
NEW
OPENING
OPEN
CLOSING
CLOSED
FAILED
```

State transitions:

```text
NEW -> OPENING
OPENING -> OPEN
OPENING -> FAILED
OPEN -> CLOSING
CLOSING -> CLOSED
OPEN -> CLOSED
```

A session must be closed when:

```text
client disconnects
target disconnects
agent disconnects
policy check fails
OPEN_FAIL is received
send buffer exceeds hard limit
idle timeout is reached
```

---

## 16. Timeouts

Required timeout defaults:

```text
agent heartbeat interval: 10 seconds
agent offline threshold: 30 seconds
target connect timeout: 5 seconds
session idle timeout: 300 seconds
client initial request timeout: 5 seconds
```

All timeout values should be configurable.

---

## 17. Buffer and Backpressure

The implementation must avoid unbounded memory growth.

Each Workerman connection should configure a maximum send buffer.

Suggested default:

```php
$connection->maxSendBufferSize = 8 * 1024 * 1024;
```

The implementation must handle:

```text
onBufferFull
onBufferDrain
onClose
onError
```

If a session exceeds buffer limits and cannot recover, close the session and write an audit log entry.

---

## 18. Audit Logging

Every session must produce an audit log.
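One way to meet this with a local JSON-lines file is sketched below. `appendAuditRecord()` is a hypothetical helper, not the project's `AuditLogger` API. The sketch writes to a temporary file instead of `runtime/audit.log` so it can run anywhere, and the sample values are illustrative only.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: append one session record as a single JSON line.
 * LOCK_EX keeps concurrent workers from interleaving partial lines.
 */
function appendAuditRecord(string $path, array $record): void
{
    $line = json_encode($record, JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES) . "\n";
    if (file_put_contents($path, $line, FILE_APPEND | LOCK_EX) === false) {
        throw new RuntimeException("cannot write audit log: {$path}");
    }
}

// Temporary file stands in for runtime/audit.log in this sketch.
$path = tempnam(sys_get_temp_dir(), 'audit');

$start = new DateTimeImmutable('2026-05-28T10:00:00+08:00');
$end = new DateTimeImmutable('2026-05-28T10:01:00+08:00');

appendAuditRecord($path, [
    'session_id' => '018f6f4a-xxxx',
    'user_id' => 'admin',
    'source_ip' => '1.2.3.4',
    'target_host' => '192.168.10.20',
    'target_port' => 22,
    'protocol' => 'tcp',
    'route_type' => 'agent',
    'node_id' => 'client-01',
    'policy_id' => 'corp-ssh-admin',
    'start_time' => $start->format(DATE_ATOM),
    'end_time' => $end->format(DATE_ATOM),
    // Derived from the two timestamps so the fields cannot disagree.
    'duration_ms' => ($end->getTimestamp() - $start->getTimestamp()) * 1000,
    'bytes_client_to_target' => 1024,
    'bytes_target_to_client' => 2048,
    'result' => 'closed',
    'failure_reason' => null,
]);

// Read the file back: one decoded record per line.
$rows = array_map(
    static fn (string $l): array => json_decode($l, true, 512, JSON_THROW_ON_ERROR),
    file($path, FILE_IGNORE_NEW_LINES)
);
```

Writing one complete line per session keeps the file greppable and lets later tooling stream it without parsing a surrounding array.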
Required fields:

```text
session_id
user_id
source_ip
target_host
target_port
protocol
route_type
node_id
policy_id
start_time
end_time
duration_ms
bytes_client_to_target
bytes_target_to_client
result
failure_reason
```

MVP may write JSON lines to a local file:

```text
runtime/audit.log
```

Example:

```json
{
  "session_id": "018f6f4a-xxxx",
  "user_id": "admin",
  "source_ip": "1.2.3.4",
  "target_host": "192.168.10.20",
  "target_port": 22,
  "protocol": "tcp",
  "route_type": "agent",
  "node_id": "client-01",
  "policy_id": "corp-ssh-admin",
  "start_time": "2026-05-28T10:00:00+08:00",
  "end_time": "2026-05-28T10:01:00+08:00",
  "duration_ms": 60000,
  "bytes_client_to_target": 1024,
  "bytes_target_to_client": 2048,
  "result": "closed",
  "failure_reason": null
}
```

---

## 19. Error Handling

Errors must be explicit.

Required error reasons include:

```text
invalid_frame
invalid_auth
node_not_found
node_offline
policy_denied
route_not_found
target_connect_timeout
target_connection_refused
agent_local_policy_denied
session_not_found
buffer_overflow
protocol_not_supported
internal_error
```

Do not silently drop sessions without logging.

---

## 20. Security Requirements

### 20.1 Required for MVP

* Node token authentication.
* Default-deny policy.
* Agent local allowlist.
* Audit logging.
* Explicit route decision.
* No arbitrary target access.
* No unauthenticated Agent registration.
* No unauthenticated client request in production mode.

### 20.2 Required Before Production

* TLS between Client and POP.
* TLS between Agent and POP.
* Strong node credentials.
* Rotatable node tokens.
* JWT or mTLS client authentication.
* Per-user policy.
* Rate limiting.
* Session concurrency limits.
* Structured audit storage.
* Log redaction for secrets.

---

## 21. Configuration

`.env.example`:

```env
APP_ENV=dev
POP_AGENT_LISTEN=0.0.0.0:9001
NODE_ID=client-01
NODE_TYPE=client
NODE_TOKEN=CHANGE_ME
POP_SERVER_ADDRESS=tcp://10.1.0.2:9001
AUDIT_LOG=runtime/audit.log
LOG_LEVEL=debug
```

---

## 22. CLI Entrypoints

### 22.1 Start POP Server

```bash
php bin/pop-server.php start
```

### 22.2 Start Client Agent

```bash
php bin/client-agent.php start
```

### 22.3 Stop Services

```bash
php bin/pop-server.php stop
php bin/client-agent.php stop
```

---

## 23. MVP Acceptance Tests

The implementation is acceptable only if the following tests pass.

### 23.1 Agent Registration

Given a valid node token, Client Agent connects to POP Server and becomes online.

Expected result:

```text
NodeRegistry contains client-01 as online.
```

### 23.2 Invalid Agent Rejected

Given an invalid node token, POP Server returns `AUTH_FAIL` and closes the connection.

Expected result:

```text
Node is not registered.
Audit/security log records invalid_auth.
```

### 23.3 Authorized TCP Tunnel

Given:

```text
Client requests 192.168.10.20:22
User is allowed by policy
client-01 is online
```

Expected result:

```text
POP sends OPEN to client-01.
Agent connects to 192.168.10.20:22.
Client can exchange TCP data with target.
Audit log records success.
```

### 23.4 Policy Denial

Given:

```text
Client requests 192.168.99.99:22
No policy allows this target
```

Expected result:

```text
POP rejects the request.
No OPEN frame is sent to Agent.
Audit log records policy_denied.
```

### 23.5 Agent Local Denial

Given:

```text
POP sends OPEN to a target outside Agent local allowlist
```

Expected result:

```text
Agent returns OPEN_FAIL with agent_local_policy_denied.
Session is closed.
Audit log records failure.
```

### 23.6 Agent Offline

Given:

```text
client-01 is offline
Client requests route through client-01
```

Expected result:

```text
POP rejects request with node_offline.
```

### 23.7 Clean Close

Given:

```text
Client closes connection
```

Expected result:

```text
POP sends CLOSE to Agent.
Agent closes target connection.
SessionManager removes session.
Audit log records closed.
```

### 23.8 Target Connection Failure

Given:

```text
Target host or port is unreachable
```

Expected result:

```text
Agent sends OPEN_FAIL.
POP closes client connection.
Audit log records target connection failure.
```

---

## 24. Implementation Priority

Implement in this order:

1. `Frame`, `FrameCodec`, `FrameParser`
2. `NodeAuthenticator`
3. `NodeRegistry`
4. Agent connection listener on POP Server
5. Client Agent outbound connection
6. Heartbeat
7. Client listener
8. Policy checker
9. Route resolver
10. Session manager
11. Agent target connector
12. DATA forwarding
13. CLOSE handling
14. Audit logger
15. Basic tests

Do not implement UDP transport, KCP transport, Web UI, or clustering before the MVP TCP-framed proxy path is stable.

---

## 25. Coding Rules

* Use strict types where possible.
* Keep protocol parsing separate from business logic.
* Do not mix policy checking with socket forwarding.
* Do not use global mutable arrays except Workerman bootstrap state if unavoidable.
* All session IDs must be unique.
* Every network error must be logged.
* Every rejected access must be auditable.
* All config values must be externalized.
* The system must run on Linux.
* The code must be readable and modular, not a single giant script.

---

## 26. Deliverables

Codex should generate:

```text
composer.json
.env.example
README.md
bin/pop-server.php
bin/client-agent.php
config/routes.php
config/nodes.php
config/policies.php
src/**/*.php
```

The generated code must be runnable with:

```bash
composer install
php bin/pop-server.php start
php bin/client-agent.php start
```

---

## 27. Out of Scope for First Version

The following features must not be implemented in the first version unless explicitly requested later:

```text
TUN/TAP VPN
Layer-3 routing
Kernel packet capture
UDP relay
QUIC
Web dashboard
Clustered POP Server
Redis-based session sharing
TLS certificate automation
SSH command recording
Database SQL auditing
Browser-based remote desktop
```

---

## 28. Final Goal

The final MVP should prove this flow:

```text
Client -> POP Server -> policy check -> Client Agent -> internal TCP service
```

The POP Server must remain the only policy authority.

Agents must remain controlled executors.

Default access must always be denied unless a policy explicitly allows it.
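
The default-deny rule can be illustrated against the policy shape from section 11. The following is a minimal sketch, not the contract's implementation: `isPortAllowed()`, `isHostAllowed()`, and `matchPolicy()` are hypothetical helper names, string port entries are assumed to be inclusive `low-high` ranges, and `'*'` is assumed to be the only host wildcard. Protocol, requested route, and node availability are left out for brevity, although section 11 requires them in a full matcher.

```php
<?php
declare(strict_types=1);

// Hypothetical helpers for illustration only; not names defined by this contract.

/** Returns true if $port equals an integer entry or falls inside a 'low-high' range entry. */
function isPortAllowed(int $port, array $allowedPorts): bool
{
    foreach ($allowedPorts as $entry) {
        if (is_int($entry)) {
            if ($entry === $port) {
                return true;
            }
            continue;
        }
        // Assumption: string entries are inclusive ranges such as '8080-10080'.
        if (is_string($entry) && preg_match('/^(\d+)-(\d+)$/', $entry, $m) === 1) {
            if ($port >= (int) $m[1] && $port <= (int) $m[2]) {
                return true;
            }
        }
    }
    return false;
}

/** Returns true for an exact host match or the '*' wildcard. */
function isHostAllowed(string $host, array $allowedHosts): bool
{
    return in_array('*', $allowedHosts, true) || in_array($host, $allowedHosts, true);
}

/**
 * Returns the first enabled policy that allows the request.
 * A null result means no policy matched, so the request is denied.
 */
function matchPolicy(array $policies, string $user, string $host, int $port): ?array
{
    foreach ($policies as $policy) {
        if (($policy['enabled'] ?? false) !== true) {
            continue; // disabled or malformed policies never grant access
        }
        if (!in_array($user, $policy['users'] ?? [], true)) {
            continue;
        }
        if (!isHostAllowed($host, $policy['target_hosts'] ?? [])) {
            continue;
        }
        if (!isPortAllowed($port, $policy['target_ports'] ?? [])) {
            continue;
        }
        return $policy;
    }
    return null; // default deny
}

// Same shape as the config/policies.php example in section 11.
$policies = [
    [
        'policy_id' => 'corp-ssh-admin',
        'users' => ['admin', 'devops'],
        'target_hosts' => ['192.168.10.20', '192.168.10.21'],
        'target_ports' => [22],
        'route_type' => 'agent',
        'node_id' => 'client-01',
        'enabled' => true,
    ],
    [
        'policy_id' => 'public-web-egress',
        'users' => ['normal-user', 'admin'],
        'target_hosts' => ['*'],
        'target_ports' => [80, 443, '8080-10080'],
        'route_type' => 'direct',
        'enabled' => true,
    ],
];

$allowed = matchPolicy($policies, 'admin', '192.168.10.20', 22);        // matches corp-ssh-admin
$denied = matchPolicy($policies, 'normal-user', '192.168.99.99', 22);   // null, so policy_denied
```

Returning `null` instead of throwing keeps the deny path explicit at the call site: the caller can map it to `policy_denied`, skip sending `OPEN`, and write the audit entry, which matches acceptance test 23.4.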