RHL-015 refactor(docs): update API and storage documentation for PDF streaming endpoint

Code_Uwe 4 weeks ago
parent
commit
f85168a999
3 changed files with 270 additions and 11 deletions
  1. Docs/api.md (+133, -2)
  2. Docs/frontend-api-usage.md (+51, -7)
  3. Docs/storage.md (+86, -2)

+ 133 - 2
Docs/api.md

@@ -1,3 +1,13 @@
+<!-- --------------------------------------------------------------------------- -->
+
+<!-- Ordner: Docs -->
+
+<!-- Datei: api.md -->
+
+<!-- Relativer Pfad: Docs/api.md -->
+
+<!-- --------------------------------------------------------------------------- -->
+
 # API Overview
 
 This document describes the HTTP API exposed by the application using Next.js **Route Handlers** in the App Router (`app/api/*/route.js`).
@@ -60,7 +70,7 @@ RBAC is enforced on filesystem-related endpoints.
 
 ### 3.1 Standard error response format
 
-All endpoints return JSON.
+Most endpoints return JSON.
 
 Success responses keep their existing shapes (**unchanged**).
 
@@ -82,6 +92,15 @@ Notes:
 - `error.code` is a stable machine-readable identifier (frontend handling, tests, monitoring).
 - `error.details` is optional. When present, it must be a JSON object (e.g. validation info).
 
+**Binary endpoints**
+
+Some endpoints may return a non-JSON body on the **200** happy path (for example `application/pdf`).
+
+Rules for such endpoints:
+
+- On success: return the documented binary payload.
+- On non-200 errors: return the standardized JSON error payload above.
+
 ### 3.2 Status code rules
 
 The API uses the following status codes consistently:
@@ -105,11 +124,29 @@ The API uses these machine-readable codes (non-exhaustive list):
 - Validation:
 
   - `VALIDATION_MISSING_PARAM`
+
   - `VALIDATION_MISSING_QUERY`
+
   - `VALIDATION_INVALID_JSON`
+
   - `VALIDATION_INVALID_BODY`
+
   - `VALIDATION_MISSING_FIELD`
 
+  - `VALIDATION_BRANCH`
+
+  - `VALIDATION_YEAR`
+
+  - `VALIDATION_MONTH`
+
+  - `VALIDATION_DAY`
+
+  - `VALIDATION_FILENAME`
+
+  - `VALIDATION_FILE_EXTENSION`
+
+  - `VALIDATION_PATH_TRAVERSAL`
+
 - Storage:
 
   - `FS_NOT_FOUND`
@@ -149,6 +186,10 @@ All JSON API responses explicitly disable HTTP caching:
 
 This is applied centrally by `lib/api/errors.js` (the `json()` / `jsonError()` helpers). The policy applies to both success and error responses.
 
+For binary endpoints (e.g. PDF streaming), the route handler must explicitly set:
+
+- `Cache-Control: no-store`
+
 #### 3.6.2 Next.js route handler execution
 
 All API route handlers are forced to execute dynamically at request time:
@@ -467,6 +508,96 @@ Example:
 
 ---
 
+### 4.10 `GET /api/files/:branch/:year/:month/:day/:filename`
+
+**Purpose**
+
+Stream (or download) a single PDF file from the NAS while enforcing authentication and branch-level RBAC.
+
+**Authentication**: required.
+
+**RBAC behavior**
+
+- `401 AUTH_UNAUTHENTICATED` when no valid session exists.
+- `403 AUTH_FORBIDDEN_BRANCH` when the session is not allowed to access `:branch`.
+
+**URL params**
+
+- `branch`: `NL` + digits (e.g. `NL01`)
+- `year`: `YYYY` (4 digits)
+- `month`: `MM` (`01`–`12`)
+- `day`: `DD` (`01`–`31`)
+- `filename`: PDF file name (must be a simple file name; no path segments)
+
+**Query params (optional)**
+
+- `download=1` or `download=true`
+
+  - Forces `Content-Disposition: attachment` (download)
+  - Default is `inline` (open in browser)
+
+**Success response (200)**
+
+- Body: raw PDF bytes (not JSON)
+- Headers (example):
+
+  - `Content-Type: application/pdf`
+  - `Content-Disposition: inline; filename="<filename>"` (or `attachment` when `download=1`)
+  - `Cache-Control: no-store`
+
+**Error responses (JSON)**
+
+- `400` validation errors:
+
+  - `VALIDATION_MISSING_PARAM`
+  - `VALIDATION_BRANCH`
+  - `VALIDATION_YEAR`
+  - `VALIDATION_MONTH`
+  - `VALIDATION_DAY`
+  - `VALIDATION_FILENAME`
+  - `VALIDATION_FILE_EXTENSION`
+  - `VALIDATION_PATH_TRAVERSAL`
+
+- `401`
+
+  ```json
+  { "error": { "message": "Unauthorized", "code": "AUTH_UNAUTHENTICATED" } }
+  ```
+
+- `403`
+
+  ```json
+  { "error": { "message": "Forbidden", "code": "AUTH_FORBIDDEN_BRANCH" } }
+  ```
+
+- `404` (file not found)
+
+  ```json
+  {
+  	"error": {
+  		"message": "Not found",
+  		"code": "FS_NOT_FOUND",
+  		"details": {
+  			"branch": "NL01",
+  			"year": "2024",
+  			"month": "10",
+  			"day": "23",
+  			"filename": "example.pdf"
+  		}
+  	}
+  }
+  ```
+
+- `500`
+
+  ```json
+  {
+  	"error": { "message": "Internal server error", "code": "FS_STORAGE_ERROR" }
+  }
+  ```
+
+---
+
 ## 5. API v1 freeze (RHL-008)
 
 The endpoints and response shapes documented here (and in `Docs/frontend-api-usage.md`) are considered **API v1** for the first frontend implementation.
@@ -485,7 +616,7 @@ When adding new endpoints:
 
 1. Define URL + method.
 2. Implement a `route.js` under `app/api/...`.
-3. Use `lib/storage` for filesystem access.
+3. Use `lib/storage` for filesystem listing and navigation.
 4. Enforce RBAC (`getSession()` + `canAccessBranch()` as needed).
 5. Use the standardized error contract (prefer `withErrorHandling` + `ApiError` helpers).
 6. Add route tests (Vitest).

+ 51 - 7
Docs/frontend-api-usage.md

@@ -25,7 +25,11 @@ Scope:
 Non-goals:
 
 - New major features.
-- PDF streaming/viewer implementation details (see “Out of scope / planned”).
+- PDF viewer UI implementation details (planned as RHL-023).
+
+Notes:
+
+- The backend provides a PDF stream/download endpoint (RHL-015). This document describes the **frontend calling pattern** for that endpoint.
 
 ---
 
@@ -232,7 +236,8 @@ If the UI needs “newest first”, reverse the arrays in the UI.
 Notes:
 
 - `relativePath` is **relative to `NAS_ROOT_PATH`** inside the container.
-- Treat it as an opaque identifier for a future download/stream endpoint.
+- The PDF stream endpoint uses **route segments** (`branch/year/month/day/filename`).
+- For v1 UI usage, treat `files[].name` as the canonical filename and build the stream URL from the current Explorer route segments.
 
 ---
 
@@ -340,6 +345,47 @@ Success:
 }
 ```
 
+#### `GET /api/files/:branch/:year/:month/:day/:filename`
+
+This endpoint returns **binary PDF data** (not JSON).
+
+Recommended frontend usage:
+
+- Open inline in a new tab/window so the browser handles PDF rendering.
+- Do **not** call this endpoint via `apiClient.apiFetch()` because it is JSON-centric.
+
+Open inline:
+
+```js
+const url = `/api/files/${branch}/${year}/${month}/${day}/${encodeURIComponent(
+	filename
+)}`;
+window.open(url, "_blank", "noopener,noreferrer");
+```
+
+Force download:
+
+```js
+const url = `/api/files/${branch}/${year}/${month}/${day}/${encodeURIComponent(
+	filename
+)}?download=1`;
+window.open(url, "_blank", "noopener,noreferrer");
+```
+
+Important notes:
+
+- Use the exact `files[].name` returned by `getFiles()` (case-sensitive on Linux).
+
+- Filenames with special characters must be URL-encoded.
+
+  - In particular, `#` **must** be encoded as `%23`.
+    Otherwise the browser treats it as a fragment and the server receives a truncated filename.
+
+- Host consistency matters for cookies:
+
+  - If you are logged in on `http://localhost:3000`, also open the PDF on `http://localhost:3000`.
+  - Switching to `http://127.0.0.1:3000` will not send the cookie (different host) and results in `401`.
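
The encoding notes above can be captured in one small helper so that every caller builds the URL the same way. This is a minimal sketch, not part of the commit: `buildPdfUrl` and its option names are hypothetical, and it assumes the route segments come from the current Explorer route and `filename` from `files[].name`.

```js
// Hypothetical helper (not in the codebase): builds the stream URL from
// route segments and a file name, encoding each segment separately.
function buildPdfUrl({ branch, year, month, day, filename }, { download = false } = {}) {
	const segments = [branch, year, month, day, filename].map((segment) =>
		// encodeURIComponent escapes "#", "?", "/" and spaces, so the server
		// receives the full file name instead of a truncated one.
		encodeURIComponent(String(segment))
	);
	const url = `/api/files/${segments.join("/")}`;
	return download ? `${url}?download=1` : url;
}
```

A caller would then pass the result to `window.open(url, "_blank", "noopener,noreferrer")` exactly as in the snippets above.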
+
 ### 4.4 Health
 
 #### `GET /api/health`
@@ -466,8 +512,6 @@ Rules:
 
 ## 9. Out of scope / planned additions
 
-PDF delivery (download/stream) is not part of the current v1 surface documented above.
-
-Planned as additive change:
-
-- a dedicated endpoint to stream or download a PDF while enforcing RBAC server-side.
+- PDF viewer UX (RHL-023): enabling the Explorer “Open” button and providing a polished open/download experience.
+- Search UI and filters (`/:branch/search`).
+- Admin/dev branch selector and additional sidebar navigation.

+ 86 - 2
Docs/storage.md

@@ -10,9 +10,14 @@
 
 # Storage Module (`lib/storage`)
 
-The `lib/storage` module is the **single source of truth** for reading delivery note PDFs from the NAS share.
+The `lib/storage` module is the **single source of truth** for reading delivery note metadata from the NAS share.
 
-All code that needs to read from the NAS should go through this module instead of using Node.js `fs` directly. This keeps filesystem logic centralized and makes it easier to change conventions later.
+All code that needs to **list** branches/years/months/days/files on the NAS should go through this module instead of using Node.js `fs` directly. This keeps filesystem listing logic centralized and makes it easier to change conventions later.
+
+> Note:
+>
+> - Binary streaming endpoints (e.g. PDF streaming) may use direct filesystem streaming APIs (`fs.createReadStream`) for performance and memory safety.
+> - Such endpoints must still follow the same safety rules (validated segments, no path traversal) and the same error mapping conventions (`lib/api/storageErrors.js`).
 
 ---
 
@@ -159,3 +164,82 @@ The NAS content can change at any time (new scans). To reduce filesystem load wh
 
 - Unit tests clear the storage cache between tests via a test-only helper.
 - A TTL test verifies “stable within TTL” and “refresh after TTL”.
+
+---
+
+## 7. File streaming endpoints (PDF delivery)
+
+The storage module currently focuses on **listing** directory contents.
+
+For endpoints that must return **binary file data** (PDF streaming/download), a direct stream approach is preferred:
+
+- **Do not** read the whole PDF into memory.
+- Use `fs.stat()` first (for existence/type) and then `fs.createReadStream()`.
+
+### 7.1 Security rules
+
+A streaming endpoint must never accept arbitrary paths.
+
+Rules:
+
+- Build the absolute file path from:
+
+  - `NAS_ROOT_PATH`
+  - validated route segments (`branch`, `year`, `month`, `day`)
+  - validated filename (`filename`)
+
+- Validate route segments using strict patterns:
+
+  - `branch`: `^NL\d+$`
+  - `year`: `^\d{4}$`
+  - `month`: `01`–`12`
+  - `day`: `01`–`31`
+
+- Validate filename:
+
+  - must be a simple file name (no `/`, `\`, or `..` segments)
+  - only `.pdf` is allowed
+
+- Apply a root containment check:
+
+  - after resolving the absolute path, ensure the resolved path stays within `NAS_ROOT_PATH`
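
The rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the route handler from this commit: `resolvePdfPath` and `fail` are hypothetical names, the month/day regular expressions are one possible encoding of `01`–`12` and `01`–`31`, the extension check is assumed to be case-insensitive, and the mapping of each failure to a `VALIDATION_*` code is a guess. The real code would presumably use the project's `ApiError` helpers and ES module imports.

```js
const path = require("node:path");

// [segment name, pattern, error code]; the code assignment is an assumption.
const SEGMENT_RULES = [
	["branch", /^NL\d+$/, "VALIDATION_BRANCH"],
	["year", /^\d{4}$/, "VALIDATION_YEAR"],
	["month", /^(0[1-9]|1[0-2])$/, "VALIDATION_MONTH"],
	["day", /^(0[1-9]|[12]\d|3[01])$/, "VALIDATION_DAY"],
];

// Hypothetical stand-in for the project's ApiError helpers.
function fail(code, message) {
	const err = new Error(message);
	err.code = code;
	return err;
}

function resolvePdfPath(rootPath, params) {
	for (const [name, pattern, code] of SEGMENT_RULES) {
		if (typeof params[name] !== "string" || !pattern.test(params[name])) {
			throw fail(code, `Invalid ${name}`);
		}
	}

	const { branch, year, month, day, filename } = params;
	if (typeof filename !== "string" || filename === "") {
		throw fail("VALIDATION_FILENAME", "Invalid filename");
	}
	// Reject anything that is not a simple file name.
	if (/[\/\\]/.test(filename) || filename === "." || filename === "..") {
		throw fail("VALIDATION_PATH_TRAVERSAL", "Invalid filename");
	}
	if (!filename.toLowerCase().endsWith(".pdf")) {
		throw fail("VALIDATION_FILE_EXTENSION", "Only .pdf is allowed");
	}

	// Root containment check: the resolved path must stay below the root.
	const root = path.resolve(rootPath);
	const absPath = path.resolve(root, branch, year, month, day, filename);
	const rel = path.relative(root, absPath);
	if (rel === "" || rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
		throw fail("VALIDATION_PATH_TRAVERSAL", "Path escapes storage root");
	}
	return absPath;
}
```

The containment check is redundant once the segment patterns pass, which is intentional: it still holds if a pattern is loosened later.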
+
+### 7.2 Error mapping rules
+
+Even when the happy-path response is binary, **errors must remain standardized JSON**.
+
+Recommended approach:
+
+1. `stat(absPath)`
+
+2. If it throws:
+
+   - map via `mapStorageReadError(err, { details })`
+
+3. If `stat` succeeds but `!stat.isFile()`:
+
+   - return `404 FS_NOT_FOUND`
+
+### 7.3 HTTP headers
+
+For PDF streaming:
+
+- `Content-Type: application/pdf`
+- `Content-Disposition: inline; filename="..."` (default)
+- `Content-Disposition: attachment; filename="..."` (when `download=1`)
+- `Cache-Control: no-store`
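
As a rough illustration of the header set above, the following sketch derives the headers from the file name and the raw `download` query value. `buildPdfHeaders` is a hypothetical name and the quoting strategy is an assumption; the commit does not show how the handler builds these headers.

```js
// Hypothetical helper: returns the response headers for a PDF stream.
// `downloadParam` is the raw value of the `download` query parameter.
function buildPdfHeaders(filename, downloadParam) {
	const download = downloadParam === "1" || downloadParam === "true";
	const disposition = download ? "attachment" : "inline";
	// Escape characters that would break the quoted-string form.
	const safeName = filename.replace(/["\\]/g, "\\$&");
	return {
		"Content-Type": "application/pdf",
		"Content-Disposition": `${disposition}; filename="${safeName}"`,
		"Cache-Control": "no-store",
	};
}
```

File names with non-ASCII characters would additionally need the `filename*` parameter described in RFC 6266; that case is left out here.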
+
+---
+
+## 8. Future extensions
+
+Potential follow-ups for the storage layer:
+
+- A dedicated helper for streaming files (e.g. `openPdfStream(...)`) that centralizes:
+
+  - strict validation
+  - safe path construction
+  - `stat()` + stream creation
+  - consistent error details
+
+This is optional; the current v1 design keeps `lib/storage` focused on listing operations.