RHL-001-refactor: enhance storage module with improved path handling and documentation

Code_Uwe 1 week ago
parent
commit
3f39ad5079
2 changed files with 555 additions and 30 deletions
  1. 449 0
      Docs/Lieferscheine.md
  2. 106 30
      lib/storage.js

+ 449 - 0
Docs/Lieferscheine.md

@@ -0,0 +1,449 @@
+# Storage Module (`lib/storage`)
+
+The `lib/storage` module is the **single source of truth** for reading files
+from the network file share that contains scanned delivery notes.
+
+All code that needs to read from the NAS must go through this module instead
+of using Node.js `fs` directly. This keeps filesystem logic centralized and
+makes it easier to change paths or conventions later.
+
+## High-Level Responsibilities
+
+- Resolve paths under the NAS root (`NAS_ROOT_PATH`).
+- Provide high-level, intention-revealing helpers:
+
+  - `listBranches()` → `['NL01', 'NL02', ...]`
+  - `listYears(branch)` → `['2023', '2024', ...]`
+  - `listMonths(branch, year)` → `['01', '02', ...]`
+  - `listDays(branch, year, month)` → `['01', '02', ...]`
+  - `listFiles(branch, year, month, day)` → `[{ name, relativePath }, ...]`
+
+- Enforce **read-only** access to the filesystem (no delete/move/write logic
+  here).
+- Use asynchronous filesystem APIs (`fs/promises`) to avoid blocking the
+  event loop when reading from a network filesystem (SMB). A synchronous call
+  would stall all other requests until the slow network read returns.
+
+## Environment Configuration
+
+The storage module depends on a single environment variable:
+
+- `NAS_ROOT_PATH`
+
+  Absolute path where the NAS share is mounted on the host.
+
+Typical values:
+
+- **Production (Linux server):**
+
+  ```env
+  NAS_ROOT_PATH=/mnt/niederlassungen
+  ```
+
+- **Local development (optional):**
+
+  ```env
+  # Example: local test folder
+  NAS_ROOT_PATH=/Users/<username>/dev/test/niederlassungen
+  ```
+
+  or, if the NAS is mounted locally (e.g. on macOS):
+
+  ```env
+  NAS_ROOT_PATH=/Volumes/Niederlassungen
+  ```
+
+If `NAS_ROOT_PATH` is not set, the helpers will throw when called. This is
+intentional: configuration issues should fail fast instead of causing
+confusing downstream errors.
+
+## Directory Layout Assumptions
+
+The helpers assume the following structure under `NAS_ROOT_PATH`:
+
+```text
+NAS_ROOT_PATH/
+  @Recently-Snapshot/   # ignored
+  NL01/
+    2024/
+      10/
+        23/
+          file1.pdf
+          file2.pdf
+  NL02/
+    2023/
+      12/
+        01/
+          ...
+  ...
+```
+
+Rules:
+
+- Branch directories follow the pattern `NL<Number>`, e.g. `NL01`, `NL23`.
+- Year directories are 4-digit numeric (`2023`, `2024`, ...).
+- Month and day directories are numeric; the helpers normalize them to
+  two‑digit strings for consistent display in the UI:
+
+  - Months: `"01"` … `"12"`
+  - Days: `"01"` … `"31"`
+
+- Only `.pdf` files are returned by `listFiles`.
+
+If the on-disk structure changes, update the logic in `lib/storage` only.
+API routes and UI components should not need to know about the exact layout.
+
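The month and day rules above can be tried on plain strings. This is a minimal sketch, assuming a list of directory names has already been read; `normalizeNumericDirs` is a hypothetical name used only for illustration and is not exported by `lib/storage`.

```javascript
// Hypothetical helper for illustration; not part of lib/storage.
function normalizeNumericDirs(names) {
	return names
		.filter((name) => /^\d{1,2}$/.test(name)) // keeps "1", "03", "12"; drops "2024", "tmp"
		.map((name) => name.padStart(2, "0")) // "3" -> "03"
		.sort((a, b) => parseInt(a, 10) - parseInt(b, 10)); // numeric order
}

console.log(normalizeNumericDirs(["10", "3", "tmp", "1", "2024"]));
// prints [ '01', '03', '10' ]
```

One consequence worth keeping in mind: if a folder is stored on disk as `3` rather than `03`, the normalized value no longer matches the on-disk name. Whether such folders exist on this NAS is not stated here.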
+## Helper Functions
+
+All helper functions are asynchronous and return Promises.
+
+### `listBranches(): Promise<string[]>`
+
+Returns the list of branch directories (`NLxx`) under `NAS_ROOT_PATH`.
+
+- Ignores `@Recently-Snapshot`.
+- Filters for names matching `^NL\d+$` (case-insensitive).
+- Sorts branches numerically by their suffix (`NL1`, `NL2`, …, `NL10`).
+
+Example result:
+
+```json
+["NL01", "NL02", "NL03"]
+```
+
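The filter and sort rules for branches can be illustrated on plain names. This sketch leaves out the `isDirectory()` check that the real helper performs on `Dirent` objects, and `pickBranches` is a hypothetical name, not part of the module.

```javascript
// Illustrative only: applies the documented name filter and numeric sort.
function pickBranches(names) {
	return names
		.filter((name) => /^NL\d+$/i.test(name)) // also drops "@Recently-Snapshot"
		.sort(
			// Compare by the number after the two-letter prefix.
			(a, b) => parseInt(a.slice(2), 10) - parseInt(b.slice(2), 10)
		);
}

console.log(pickBranches(["NL10", "@Recently-Snapshot", "NL2", "NL01", "docs"]));
// prints [ 'NL01', 'NL2', 'NL10' ]
```

A plain `.sort()` would place `NL10` before `NL2`, which is why the suffix is compared as a number.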
+### `listYears(branch: string): Promise<string[]>`
+
+Reads the year directories for a given branch.
+
+- Path: `${NAS_ROOT_PATH}/${branch}`
+- Filters for directories matching `^\d{4}$`.
+- Returns sorted year strings as `['2023', '2024', ...]`.
+
+### `listMonths(branch: string, year: string): Promise<string[]>`
+
+Reads the month directories for the given `branch` and `year`.
+
+- Path: `${NAS_ROOT_PATH}/${branch}/${year}`
+- Filters for directories matching `^\d{1,2}$`.
+- Normalizes month names to two digits (e.g. `'1' → '01'`).
+- Returns sorted month strings.
+
+Example result:
+
+```json
+["01", "02", "03", "10"]
+```
+
+### `listDays(branch: string, year: string, month: string): Promise<string[]>`
+
+Reads the day directories for the given `branch`, `year`, and `month`.
+
+- Path: `${NAS_ROOT_PATH}/${branch}/${year}/${month}`.
+- Filters for directories matching `^\d{1,2}$`.
+- Normalizes day names to two digits (e.g. `'3' → '03'`).
+- Returns sorted day strings.
+
+Example result:
+
+```json
+["01", "02", "03", "23"]
+```
+
+### `listFiles(branch: string, year: string, month: string, day: string): Promise<{ name: string; relativePath: string }[]>`
+
+Reads all PDF files for the given `branch`, `year`, `month`, and `day`.
+
+- Path: `${NAS_ROOT_PATH}/${branch}/${year}/${month}/${day}`.
+- Filters for files whose names end with `.pdf` (case-insensitive).
+- Sorts filenames alphabetically.
+- Returns an array of objects with:
+
+  - `name`: the raw filename (e.g. `"Stapel-1_Seiten-1_Zeit-1048.pdf"`).
+  - `relativePath`: the relative path from `NAS_ROOT_PATH` (e.g.
+    `"NL01/2024/10/23/Stapel-1_Seiten-1_Zeit-1048.pdf"`).
+
+Example result:
+
+```json
+[
+	{
+		"name": "Stapel-1_Seiten-1_Zeit-1048.pdf",
+		"relativePath": "NL01/2024/10/23/Stapel-1_Seiten-1_Zeit-1048.pdf"
+	},
+	{
+		"name": "Stapel-1_Seiten-2_Zeit-1032.pdf",
+		"relativePath": "NL01/2024/10/23/Stapel-1_Seiten-2_Zeit-1032.pdf"
+	}
+]
+```
+
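As a sketch of how a consumer might use `relativePath`: this document lists `GET /api/file?path=...` only as a possible future endpoint, so the URL shape and the `toDownloadUrl` helper below are assumptions rather than existing code.

```javascript
// Hypothetical helper: the /api/file endpoint does not exist yet.
function toDownloadUrl(file) {
	// URLSearchParams percent-encodes "/" and other reserved characters.
	const query = new URLSearchParams({ path: file.relativePath });
	return `/api/file?${query}`;
}

const exampleFile = {
	name: "Stapel-1_Seiten-1_Zeit-1048.pdf",
	relativePath: "NL01/2024/10/23/Stapel-1_Seiten-1_Zeit-1048.pdf",
};
const downloadUrl = toDownloadUrl(exampleFile);

// Decoding the query string yields the original relative path again.
const decodedPath = new URL(downloadUrl, "http://localhost").searchParams.get("path");
console.log(decodedPath === exampleFile.relativePath); // prints true
```

Because `relativePath` is built with forward slashes regardless of platform, it can be passed through a query string without any separator conversion.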
+## Error Handling
+
+`lib/storage` does **not** swallow errors:
+
+- If a folder does not exist or is not accessible, the underlying
+  `fs.promises.readdir` call will throw (e.g. `ENOENT`, `EACCES`).
+- Callers (API routes, services) are responsible for catching these errors and
+  converting them into appropriate HTTP responses.
+
+This separation keeps responsibilities clear:
+
+- `lib/storage` → _How do we read data from the filesystem?_
+- API layer (`app/api/.../route.js`) → _How do we map errors to HTTP responses?_
+
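One way a caller could turn a thrown storage error into the documented `{ "error": ... }` shape is sketched below. The helper name and the message texts are assumptions; the status stays at `500` because that is the only status this document assigns to filesystem and configuration errors.

```javascript
// Hypothetical mapping for the API layer; not part of lib/storage.
const KNOWN_FS_ERRORS = new Map([
	["ENOENT", "Directory not found on the NAS"],
	["EACCES", "NAS directory is not accessible"],
]);

function toErrorResponse(error) {
	// Prefer a friendly text for known codes, else fall back to the message.
	const message =
		KNOWN_FS_ERRORS.get(error?.code) ?? error?.message ?? "Unknown error";
	return { status: 500, body: { error: message } };
}

// Simulated error in the shape Node.js uses for filesystem failures.
const simulated = Object.assign(new Error("ENOENT: no such file or directory"), {
	code: "ENOENT",
});
console.log(toErrorResponse(simulated));
```

Errors without a `code`, such as the one thrown for a missing `NAS_ROOT_PATH`, pass their own message through unchanged.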
+---
+
+# API Overview
+
+This document describes the HTTP API exposed by the application using Next.js
+**Route Handlers** in the App Router (`app/api/*/route.js`).
+
+> All routes below are served under the `/api` prefix.
+
+> **Note:** Authentication and authorization are not implemented yet. In the
+> final system, branch users should only see their own branch, while admins
+> can access all branches.
+
+## General Conventions
+
+- All endpoints return JSON.
+
+- Successful responses use HTTP status `200`.
+
+- Error responses use `4xx` or `5xx` and have the shape:
+
+  ```json
+  { "error": "Human-readable error message" }
+  ```
+
+- Route Handlers are implemented in `app/api/.../route.js` using the standard
+  Web `Request` / `Response` primitives as described in the Next.js
+  documentation.
+
+- Filesystem access must use `lib/storage` (no direct `fs` calls inside
+  route handlers).
+
+---
+
+## Health Check
+
+### `GET /api/health`
+
+**Purpose**
+
+Check whether:
+
+- The database is reachable.
+- The NAS root path (`NAS_ROOT_PATH`) is readable from the app container.
+
+**Response 200 (example)**
+
+```json
+{
+	"db": "OK",
+	"nas": {
+		"path": "/mnt/niederlassungen",
+		"entriesSample": ["@Recently-Snapshot", "NL01", "NL02", "NL03", "NL04"]
+	}
+}
+```
+
+**Error cases**
+
+- If the database is not reachable, the `db` field contains an error message.
+- If the NAS path cannot be read, the `nas` field contains an error string,
+  e.g. `"error: ENOENT: no such file or directory, scandir '/mnt/niederlassungen'"`.
+
+This endpoint is intended for operations/monitoring and quick manual checks.
+
+---
+
+## Delivery Notes Hierarchy
+
+The following endpoints reflect the filesystem hierarchy:
+
+> `NAS_ROOT_PATH` → Branch → Year → Month → Day → PDF files
+
+### `GET /api/branches`
+
+List all branch directories based on the names under `NAS_ROOT_PATH`.
+
+**Response 200**
+
+```json
+{
+	"branches": ["NL01", "NL02", "NL03"]
+}
+```
+
+**Errors**
+
+- `500` – Internal error (e.g. filesystem error, missing `NAS_ROOT_PATH`).
+
+---
+
+### `GET /api/branches/[branch]/years`
+
+Example: `/api/branches/NL01/years`
+
+Return all year folders for a given branch.
+
+**Response 200**
+
+```json
+{
+	"branch": "NL01",
+	"years": ["2023", "2024"]
+}
+```
+
+**Errors**
+
+- `400` – `branch` parameter is missing (indicates a route/handler bug).
+- `500` – Error while reading year directories.
+
+---
+
+### `GET /api/branches/[branch]/[year]/months`
+
+Example: `/api/branches/NL01/2024/months`
+
+Return all month folders for the given branch and year.
+
+**Response 200**
+
+```json
+{
+	"branch": "NL01",
+	"year": "2024",
+	"months": ["01", "02", "03", "10"]
+}
+```
+
+**Notes**
+
+- Months are returned as two‑digit strings (`"01"` … `"12"`) so that UI
+  code does not need to handle formatting.
+
+**Errors**
+
+- `400` – `branch` or `year` parameter is missing.
+- `500` – Filesystem or configuration error.
+
+---
+
+### `GET /api/branches/[branch]/[year]/[month]/days`
+
+Example: `/api/branches/NL01/2024/10/days`
+
+Return all day folders for the given branch, year, and month.
+
+**Response 200**
+
+```json
+{
+	"branch": "NL01",
+	"year": "2024",
+	"month": "10",
+	"days": ["01", "02", "03", "23"]
+}
+```
+
+**Notes**
+
+- Days are returned as two‑digit strings (`"01"` … `"31"`).
+
+**Errors**
+
+- `400` – `branch`, `year`, or `month` parameter is missing.
+- `500` – Filesystem or configuration error.
+
+---
+
+### `GET /api/files?branch=&year=&month=&day=`
+
+Example:
+
+```text
+/api/files?branch=NL01&year=2024&month=10&day=23
+```
+
+Return the list of PDF files for a specific branch and date.
+
+**Query parameters**
+
+- `branch` – branch identifier (e.g. `NL01`).
+- `year` – four‑digit year (e.g. `2024`).
+- `month` – month (e.g. `10`).
+- `day` – day (e.g. `23`).
+
+**Response 200**
+
+```json
+{
+	"branch": "NL01",
+	"year": "2024",
+	"month": "10",
+	"day": "23",
+	"files": [
+		{
+			"name": "Stapel-1_Seiten-1_Zeit-1048.pdf",
+			"relativePath": "NL01/2024/10/23/Stapel-1_Seiten-1_Zeit-1048.pdf"
+		},
+		{
+			"name": "Stapel-1_Seiten-2_Zeit-1032.pdf",
+			"relativePath": "NL01/2024/10/23/Stapel-1_Seiten-2_Zeit-1032.pdf"
+		}
+	]
+}
+```
+
+**Errors**
+
+- `400` – one or more required query parameters are missing.
+- `500` – filesystem error while reading the day directory or files.
+
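The `400` case could be detected before any filesystem access. The following is a minimal sketch, assuming empty values count as missing; `parseFilesQuery` and the error wording are hypothetical and not taken from the actual route handler.

```javascript
// Hypothetical validation step for GET /api/files.
const REQUIRED_PARAMS = ["branch", "year", "month", "day"];

function parseFilesQuery(url) {
	// The base is only used when `url` is relative; request.url is absolute.
	const { searchParams } = new URL(url, "http://localhost");
	const params = {};
	const missing = [];

	for (const key of REQUIRED_PARAMS) {
		const value = searchParams.get(key);
		if (value) params[key] = value;
		else missing.push(key); // absent or empty
	}

	return missing.length > 0
		? { ok: false, status: 400, error: `Missing query parameter(s): ${missing.join(", ")}` }
		: { ok: true, params };
}

console.log(parseFilesQuery("/api/files?branch=NL01&year=2024&month=10&day=23"));
console.log(parseFilesQuery("/api/files?branch=NL01&year=&month=10"));
```

The second call reports `year` and `day` as missing, so the handler can return early without touching the NAS.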
+---
+
+## Adding New Endpoints
+
+When adding new endpoints:
+
+1. **Define the URL and method first**, e.g.:
+
+   - `GET /api/file?path=...` (download a single PDF)
+   - `GET /api/search?branch=&query=...` (full‑text search via Qsirch)
+
+2. **Create a `route.js` file** in `app/api/...` following Next.js 16 Route
+   Handler conventions. For dynamic routes, use the `(request, ctx)` signature
+   and resolve parameters via `const params = await ctx.params`.
+
+3. **Use `lib/storage` for filesystem access** instead of calling `fs`
+   directly inside route handlers. If needed, add new helpers to
+   `lib/storage`.
+
+4. **Handle errors explicitly** with `try/catch` in the handler and return
+   `4xx/5xx` responses with clear `error` messages.
+
+5. **Update this document** to describe the new endpoint (URL, purpose,
+   parameters, sample responses, error cases).
+
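Steps 2 to 4 above can be sketched as a single handler. To keep the example runnable outside Next.js, the storage helper is passed in as an argument instead of being imported. `createYearsHandler` is a hypothetical name, and the sketch assumes a runtime that provides the Web `Request` and `Response` globals. In a real `route.js` file the returned function would be exported as `GET`.

```javascript
// Hypothetical factory; `listYears` stands in for the helper from lib/storage.
function createYearsHandler(listYears) {
	return async function GET(request, ctx) {
		// Dynamic route parameters are awaited, as described in step 2.
		const { branch } = await ctx.params;

		if (!branch) {
			return Response.json(
				{ error: "branch parameter is missing" },
				{ status: 400 }
			);
		}

		try {
			const years = await listYears(branch);
			return Response.json({ branch, years });
		} catch (error) {
			// Storage errors are not swallowed below, so they are mapped here.
			return Response.json(
				{ error: String(error?.message ?? error) },
				{ status: 500 }
			);
		}
	};
}

// Usage with a stubbed storage helper:
const yearsHandler = createYearsHandler(async () => ["2023", "2024"]);
yearsHandler(new Request("http://localhost/api/branches/NL01/years"), {
	params: Promise.resolve({ branch: "NL01" }),
}).then(async (res) => console.log(res.status, await res.json()));
```

Injecting the helper also makes the handler easy to test with a stub that throws, which exercises the `500` branch without a NAS.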
+---
+
+## Future Extensions
+
+- **Authentication & Authorization**
+
+  - Enforce branch‑level access control (branch user vs. admin).
+  - Likely implemented using JWT stored in cookies and a shared helper
+    (e.g. `lib/auth`) plus a `middleware.js` or per‑route checks.
+
+- **Search Endpoints (Qsirch)**
+
+  - Integrate with QNAP Qsirch via its HTTP API.
+  - Provide endpoints like `GET /api/search?branch=&query=&from=&to=`.
+
+- **File Download / Preview**
+
+  - Add endpoints for streaming PDF content from the NAS to the browser
+    with appropriate `Content-Type` and `Content-Disposition` headers.
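A download endpoint that accepts a client-supplied path would need to make sure the path cannot leave `NAS_ROOT_PATH`, since `path.join` resolves `..` segments rather than rejecting them. The guard below is a minimal sketch under that assumption; `isSafeRelativePath` is hypothetical and not part of `lib/storage` in this commit.

```javascript
// Hypothetical guard for a future download endpoint.
function isSafeRelativePath(relativePath) {
	if (typeof relativePath !== "string" || relativePath.length === 0) {
		return false;
	}
	// Reject backslashes and NUL bytes outright.
	if (relativePath.includes("\\") || relativePath.includes("\0")) {
		return false;
	}
	// Every segment must be a real name: no empty, "." or ".." segments.
	// A leading "/" produces an empty first segment and is rejected too.
	return relativePath
		.split("/")
		.every((segment) => segment !== "" && segment !== "." && segment !== "..");
}

console.log(isSafeRelativePath("NL01/2024/10/23/Stapel-1_Seiten-1_Zeit-1048.pdf")); // prints true
console.log(isSafeRelativePath("NL01/../../etc/passwd")); // prints false
```

A stricter variant could also check each segment against the branch, year, month, and day patterns documented above.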

+ 106 - 30
lib/storage.js

@@ -1,76 +1,152 @@
 // lib/storage.js
-import fs from "fs/promises";
-import path from "path";
+// -----------------------------------------------------------------------------
+// Central abstraction layer for reading files and directories from the NAS
+// share mounted at `NAS_ROOT_PATH` (e.g. `/mnt/niederlassungen`).
+//
+// All access to the branch/year/month/day/PDF structure should go through
+// these functions instead of using `fs` directly in route handlers.
+//
+// - Read-only: no write/delete operations here.
+// - Async only: uses `fs/promises` + async/await to avoid blocking the event loop.
+// -----------------------------------------------------------------------------
 
+import fs from "node:fs/promises"; // Promise-based filesystem API
+import path from "node:path"; // Safe path utilities (handles separators)
+
+// Root directory of the NAS share, injected via environment variable.
+// On the Linux app server, this is typically `/mnt/niederlassungen`.
+// Locally you can point this to any folder; if it is unset, every helper throws.
 const ROOT = process.env.NAS_ROOT_PATH;
 
+// Build an absolute path below the NAS root from a list of segments.
+// Example: fullPath("NL01", "2024", "10", "23")
+//        → "/mnt/niederlassungen/NL01/2024/10/23"
 function fullPath(...segments) {
+	// Failing fast if ROOT is missing makes configuration errors obvious.
 	if (!ROOT) {
-		throw new Error("NAS_ROOT_PATH ist nicht gesetzt");
+		throw new Error("NAS_ROOT_PATH environment variable is not set");
 	}
+
+	// Always use `path.join` instead of manual string concatenation to avoid
+	// issues with slashes on different platforms.
 	return path.join(ROOT, ...segments.map(String));
 }
 
+// Compare strings that represent numbers in a numeric way.
+// This ensures "2" comes before "10" (2 < 10), not after.
 function sortNumericStrings(a, b) {
 	const na = parseInt(a, 10);
 	const nb = parseInt(b, 10);
+
 	if (!Number.isNaN(na) && !Number.isNaN(nb)) {
 		return na - nb;
 	}
-	return a.localeCompare(b, "de");
+
+	// Fallback to localeCompare if parsing fails
+	return a.localeCompare(b, "en");
 }
 
+// -----------------------------------------------------------------------------
+// 1. Branches (NL01, NL02, ...)
+// Path pattern: `${ROOT}/NLxx`
+// -----------------------------------------------------------------------------
+
 export async function listBranches() {
+	// Read the root directory of the NAS share.
+	// `withFileTypes: true` returns `Dirent` objects so we can call `isDirectory()`
+	// without extra stat() calls, which is more efficient.
 	const entries = await fs.readdir(fullPath(), { withFileTypes: true });
-	return entries
-		.filter(
-			(e) =>
-				e.isDirectory() &&
-				e.name !== "@Recently-Snapshot" &&
-				/^NL\d+$/i.test(e.name)
-		)
-		.map((e) => e.name)
-		.sort((a, b) =>
-			sortNumericStrings(a.replace("NL", ""), b.replace("NL", ""))
-		);
+
+	return (
+		entries
+			.filter(
+				(entry) =>
+					entry.isDirectory() && // only directories
+					entry.name !== "@Recently-Snapshot" && // ignore QNAP snapshot folder
+					/^NL\d+$/i.test(entry.name) // keep only names like "NL01", "NL02", ...
+			)
+			.map((entry) => entry.name)
+			// Sort by numeric branch number: NL1, NL2, ..., NL10
+			.sort((a, b) =>
+				// Strip the prefix case-insensitively, matching the filter above.
+				sortNumericStrings(a.replace(/^NL/i, ""), b.replace(/^NL/i, ""))
+			)
+	);
 }
 
+// -----------------------------------------------------------------------------
+// 2. Years (2023, 2024, ...)
+// Path pattern: `${ROOT}/${branch}/${year}`
+// -----------------------------------------------------------------------------
+
 export async function listYears(branch) {
 	const dir = fullPath(branch);
 	const entries = await fs.readdir(dir, { withFileTypes: true });
+
 	return entries
-		.filter((e) => e.isDirectory() && /^\d{4}$/.test(e.name))
-		.map((e) => e.name)
+		.filter(
+			(entry) => entry.isDirectory() && /^\d{4}$/.test(entry.name) // exactly 4 digits → year folders like "2024"
+		)
+		.map((entry) => entry.name)
 		.sort(sortNumericStrings);
 }
 
+// -----------------------------------------------------------------------------
+// 3. Months (01–12)
+// Path pattern: `${ROOT}/${branch}/${year}/${month}`
+// -----------------------------------------------------------------------------
+
 export async function listMonths(branch, year) {
 	const dir = fullPath(branch, year);
 	const entries = await fs.readdir(dir, { withFileTypes: true });
-	return entries
-		.filter((e) => e.isDirectory() && /^\d{1,2}$/.test(e.name))
-		.map((e) => e.name.padStart(2, "0"))
-		.sort(sortNumericStrings);
+
+	return (
+		entries
+			.filter(
+				(entry) => entry.isDirectory() && /^\d{1,2}$/.test(entry.name) // supports "1" or "10", we normalize below
+			)
+			// Normalize to two digits so the UI shows "01", "02", ..., "12"
+			.map((entry) => entry.name.trim().padStart(2, "0"))
+			.sort(sortNumericStrings)
+	);
 }
 
+// -----------------------------------------------------------------------------
+// 4. Days (01–31)
+// Path pattern: `${ROOT}/${branch}/${year}/${month}/${day}`
+// -----------------------------------------------------------------------------
+
 export async function listDays(branch, year, month) {
 	const dir = fullPath(branch, year, month);
 	const entries = await fs.readdir(dir, { withFileTypes: true });
+
 	return entries
-		.filter((e) => e.isDirectory() && /^\d{1,2}$/.test(e.name))
-		.map((e) => e.name.padStart(2, "0"))
+		.filter(
+			(entry) => entry.isDirectory() && /^\d{1,2}$/.test(entry.name) // supports "1" or "23"
+		)
+		.map((entry) => entry.name.trim().padStart(2, "0"))
 		.sort(sortNumericStrings);
 }
 
+// -----------------------------------------------------------------------------
+// 5. Files (PDFs) for a given day
+// Path pattern: `${ROOT}/${branch}/${year}/${month}/${day}/<file>.pdf`
+// -----------------------------------------------------------------------------
+
 export async function listFiles(branch, year, month, day) {
 	const dir = fullPath(branch, year, month, day);
 	const entries = await fs.readdir(dir);
 
-	return entries
-		.filter((name) => name.toLowerCase().endsWith(".pdf"))
-		.sort((a, b) => a.localeCompare(b, "de"))
-		.map((name) => ({
-			name,
-			relativePath: `${branch}/${year}/${month}/${day}/${name}`,
-		}));
+	return (
+		entries
+			// We only care about PDF files at the moment
+			.filter((name) => name.toLowerCase().endsWith(".pdf"))
+			.sort((a, b) => a.localeCompare(b, "en"))
+			.map((name) => ({
+				// Just the file name, e.g. "Stapel-1_Seiten-1_Zeit-1048.pdf"
+				name,
+				// Relative path from the NAS root, used for download URLs etc.
+				// Example: "NL01/2024/10/23/Stapel-1_Seiten-1_Zeit-1048.pdf"
+				relativePath: `${branch}/${year}/${month}/${day}/${name}`,
+			}))
+	);
 }