Proof DB Admin Panel
+Maintain the archives table, OpenSearch status, admin accounts, API docs, and script-level maintenance actions here.
+diff --git a/.env b/.env
index 1568105..ff8a035 100644
--- a/.env
+++ b/.env
@@ -7,6 +7,7 @@ LLM_METADATA_ENABLED="true"
LLM_METADATA_MODEL="glm-4.7-flash"
LLM_METADATA_MAX_TOKENS=2480
LLM_METADATA_TEMPERATURE=0.1
-OPENSEARCH_HOST="http://localhost:9200"
+OPENSEARCH_HOST="https://localhost:9200"
OPENSEARCH_USERNAME="admin"
-OPENSEARCH_PASSWORD="proofdb"
\ No newline at end of file
+OPENSEARCH_PASSWORD="proofdb"
+ARCHIVE_CASK_URL="https://archive-cask.example.com"
diff --git a/.version b/.version
new file mode 100644
index 0000000..6c6aa7c
--- /dev/null
+++ b/.version
@@ -0,0 +1 @@
+0.1.0
\ No newline at end of file
diff --git a/apidoc/README.md b/apidoc/README.md
new file mode 100644
index 0000000..b90938d
--- /dev/null
+++ b/apidoc/README.md
@@ -0,0 +1,49 @@
+# API Documentation Overview
+
+The documents in `apidoc/` are split by API domain:
+
+- [importapi.md](/www/proofdb/apidoc/importapi.md): archive import API
+- [adminapi.md](/www/proofdb/apidoc/adminapi.md): admin authentication and back-office maintenance APIs
+- [searchapi.md](/www/proofdb/apidoc/searchapi.md): full-text, vector, and hybrid search APIs
+- [evidenceapi.md](/www/proofdb/apidoc/evidenceapi.md): chunk detail and evidence APIs
+
+## Currently Implemented Endpoints
+
+```http
+POST /api/articles/import
+POST /api/admin/login
+POST /api/admin/logout
+GET /api/admin/me
+GET /api/admin/archives
+GET /api/admin/archives/{archive_uid}
+PATCH /api/admin/archives/{archive_uid}
+DELETE /api/admin/archives/{archive_uid}
+GET /api/admin/opensearch/status
+GET /api/admin/opensearch/documents
+GET /api/admin/users
+POST /api/admin/users
+PATCH /api/admin/users/{id}
+GET /api/admin/docs
+GET /api/admin/docs/{name}
+GET /api/admin/scripts
+GET /api/admin/scripts/{name}
+POST /api/admin/scripts/run
+POST /api/search/fulltext
+POST /api/search/vector
+POST /api/search/hybrid
+GET /api/chunks/{chunk_uid}
+GET /api/evidence/{chunk_uid}
+```
+
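+## Current API Layers
+
+- Import layer: parses Markdown archives into archive / chunk records and writes them to PostgreSQL.
+- Admin layer: admin login, session identification, `archives` table management, OpenSearch status, user management, document viewing, and maintenance script execution.
+- Retrieval layer: runs BM25, vector, and hybrid retrieval against OpenSearch.
+- Evidence layer: resolves a `chunk_uid` to its citation, page numbers, and evidence text.
+
+## Notes
+
+- In the search APIs, `hits` always means "the array of candidate results returned for the current request", not a full export of the database.
+- `fulltext`, `vector`, and `hybrid` all support `limit`.
+- For `hybrid`, `total` is the total number of candidates after fusion; finer-grained per-source counts are in the `sources` field.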
diff --git a/apidoc/adminapi.md b/apidoc/adminapi.md
new file mode 100644
index 0000000..c963f63
--- /dev/null
+++ b/apidoc/adminapi.md
@@ -0,0 +1,355 @@
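+# Admin Back-Office API
+
+## Overview
+
+These endpoints serve the Proof DB admin maintenance panel, covering:
+
+- Admin login and session lookup
+- `archives` table management
+- OpenSearch status inspection
+- Admin user management
+- APIDOC document viewing
+- Maintenance script execution
+
+The admin web entry points are still:
+
+- `GET /`
+- `GET /admin/login`
+- `GET /admin`
+
+## Admin Authentication
+
+### Admin Login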
+
+```http
+POST /api/admin/login
+```
+
+`Content-Type: application/json`
+
+| Field | Type | Required | Description |
+| --- | --- | --- | --- |
+| `username` | string | Yes | Admin username |
+| `password` | string | Yes | Admin password |
+
+### Admin Logout
+
+```http
+POST /api/admin/logout
+```
+
+### Current Admin Session
+
+```http
+GET /api/admin/me
+```
+
+When no admin is logged in, the endpoint returns:
+
+```json
+{
+ "code": 401,
+ "message": "Admin session not found."
+}
+```
+
+## archives Table Management
+
+### List Archives
+
+```http
+GET /api/admin/archives
+```
+
+### Query Parameters
+
+| Field | Type | Required | Description |
+| --- | --- | --- | --- |
+| `query` | string | No | Fuzzy search over `archive_uid`, `title`, `summary`, `author`, `source`, `series` |
+| `page` | integer | No | Page number, default `1` |
+| `page_size` | integer | No | Items per page, default `20`, maximum `100` |
+
+### Request Example
+
+```bash
+curl '
| ' . $this->renderInline($cell) . ' | ', $table['headers'])) . + '
|---|
| ' . $this->renderInline($cell) . ' | ', $row)) . + '
' . htmlspecialchars(implode("\n", $codeLines), ENT_QUOTES, 'UTF-8') . '';
+ $codeLines = [];
+ $inCodeBlock = false;
+ } else {
+ $inCodeBlock = true;
+ }
+ continue;
+ }
+
+ if ($inCodeBlock) {
+ $codeLines[] = $line;
+ continue;
+ }
+
+ $trimmed = trim($line);
+ if ($trimmed === '') {
+ $flushParagraph();
+ $flushList();
+ $flushTable();
+ continue;
+ }
+
+ if (preg_match('/^(#{1,6})\s+(.+)$/', $trimmed, $matches)) {
+ $flushParagraph();
+ $flushList();
+ $flushTable();
+ $level = strlen($matches[1]);
+ $html[] = sprintf('' . $this->renderInline($matches[1]) . '';
+ continue;
+ }
+
+ if (preg_match('/^---+$/', $trimmed)) {
+ $flushParagraph();
+ $flushList();
+ $flushTable();
+ $html[] = '
' . htmlspecialchars(implode("\n", $codeLines), ENT_QUOTES, 'UTF-8') . '';
+ }
+
+ $flushParagraph();
+ $flushList();
+ $flushTable();
+
+ return implode("\n", $html);
+ }
+
+ private function renderInline(string $text): string
+ {
+ $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
+ $text = preg_replace('/`([^`]+)`/', '<code>$1</code>', $text) ?? $text;
+ $text = preg_replace('/\*\*([^*]+)\*\*/', '<strong>$1</strong>', $text) ?? $text;
+ $text = preg_replace('/\*([^*]+)\*/', '<em>$1</em>', $text) ?? $text;
+ $text = preg_replace('/\[(.+?)\]\((.+?)\)/', '<a href="$2">$1</a>', $text) ?? $text;
+
+ return $text;
+ }
+
+ private function isTableDelimiter(string $line): bool
+ {
+ return (bool) preg_match('/^\|?[\s:-]+\|[\s|:-]*$/', $line);
+ }
+
+ private function tableCells(string $line): array
+ {
+ $line = trim($line);
+ $line = trim($line, '|');
+ return array_map(static fn (string $cell): string => trim($cell), explode('|', $line));
+ }
+}
diff --git a/app/service/AdminConsole/OpenSearchAdminService.php b/app/service/AdminConsole/OpenSearchAdminService.php
new file mode 100644
index 0000000..98557ff
--- /dev/null
+++ b/app/service/AdminConsole/OpenSearchAdminService.php
@@ -0,0 +1,153 @@
+ [
+ 'hosts' => $config['hosts'] ?? [],
+ 'ssl_verify' => (bool) ($config['ssl_verify'] ?? true),
+ 'index_name' => $indexName,
+ ],
+ 'database' => [
+ 'archives_total' => (int) Db::table('archives')->count(),
+ 'chunks_total' => (int) Db::table('chunks')->count(),
+ 'embedded_chunks' => (int) Db::table('chunks')->where('embedding_status', 3)->count(),
+ 'indexed_chunks' => (int) Db::table('chunks')->where('search_index_status', 3)->count(),
+ ],
+ 'opensearch' => [
+ 'reachable' => false,
+ 'index_exists' => false,
+ 'cluster_name' => null,
+ 'health' => null,
+ 'docs_count' => 0,
+ 'mapping_fields' => [],
+ 'error' => null,
+ ],
+ ];
+
+ try {
+ $client = (new OpenSearchClientFactory())->make();
+ $health = $client->cluster()->health();
+ $status['opensearch']['reachable'] = true;
+ $status['opensearch']['cluster_name'] = $health['cluster_name'] ?? null;
+ $status['opensearch']['health'] = $health['status'] ?? null;
+
+ $exists = (bool) $client->indices()->exists(['index' => $indexName]);
+ $status['opensearch']['index_exists'] = $exists;
+
+ if ($exists) {
+ $stats = $client->indices()->stats(['index' => $indexName]);
+ $mapping = $client->indices()->getMapping(['index' => $indexName]);
+ $status['opensearch']['docs_count'] = (int) (($stats['_all']['primaries']['docs']['count'] ?? 0));
+ $status['opensearch']['mapping_fields'] = array_keys($mapping[$indexName]['mappings']['properties'] ?? []);
+ }
+ } catch (Throwable $exception) {
+ $status['opensearch']['error'] = $exception->getMessage();
+ }
+
+ return $status;
+ }
+
+ public function documents(string $query = '', int $size = 20): array
+ {
+ $size = min(50, max(1, $size));
+ $indexName = config('opensearch.indices.chunks', 'proofdb_chunks');
+ $client = (new OpenSearchClientFactory())->make();
+
+ if (!(bool) $client->indices()->exists(['index' => $indexName])) {
+ return [
+ 'index_name' => $indexName,
+ 'items' => [],
+ 'total' => 0,
+ ];
+ }
+
+ $body = [
+ '_source' => [
+ 'includes' => [
+ 'chunk_uid',
+ 'archive_uid',
+ 'chunk_index',
+ 'page_start',
+ 'page_end',
+ 'title',
+ 'summary',
+ 'source',
+ 'author',
+ 'year',
+ 'series',
+ 'tags',
+ 'text',
+ 'embedding_model',
+ 'embedding_dimensions',
+ 'created_time',
+ 'updated_time',
+ ],
+ ],
+ 'size' => $size,
+ 'sort' => [
+ ['updated_time' => ['order' => 'desc']],
+ ],
+ ];
+
+ $query = trim($query);
+ if ($query === '') {
+ $body['query'] = ['match_all' => (object) []];
+ } else {
+ $body['query'] = [
+ 'multi_match' => [
+ 'query' => $query,
+ 'fields' => ['text^3', 'title^2', 'summary^2', 'source', 'author', 'tags'],
+ 'type' => 'best_fields',
+ ],
+ ];
+ }
+
+ $response = $client->search([
+ 'index' => $indexName,
+ 'body' => $body,
+ ]);
+
+ $hits = $response['hits']['hits'] ?? [];
+
+ return [
+ 'index_name' => $indexName,
+ 'total' => (int) (($response['hits']['total']['value'] ?? 0)),
+ 'items' => array_map(function (array $hit): array {
+ $source = $hit['_source'] ?? [];
+ $text = trim((string) ($source['text'] ?? ''));
+ return [
+ 'score' => $hit['_score'] ?? null,
+ 'chunk_uid' => $source['chunk_uid'] ?? ($hit['_id'] ?? null),
+ 'archive_uid' => $source['archive_uid'] ?? null,
+ 'chunk_index' => $source['chunk_index'] ?? null,
+ 'page_start' => $source['page_start'] ?? null,
+ 'page_end' => $source['page_end'] ?? null,
+ 'title' => $source['title'] ?? null,
+ 'summary' => $source['summary'] ?? null,
+ 'source' => $source['source'] ?? null,
+ 'author' => $source['author'] ?? null,
+ 'year' => $source['year'] ?? null,
+ 'series' => $source['series'] ?? null,
+ 'tags' => $source['tags'] ?? [],
+ 'text_preview' => mb_substr($text, 0, 320),
+ 'embedding_model' => $source['embedding_model'] ?? null,
+ 'embedding_dimensions' => $source['embedding_dimensions'] ?? null,
+ 'created_time' => $source['created_time'] ?? null,
+ 'updated_time' => $source['updated_time'] ?? null,
+ ];
+ }, $hits),
+ ];
+ }
+}
diff --git a/app/service/AdminUserRepository.php b/app/service/AdminUserRepository.php
new file mode 100644
index 0000000..17952ae
--- /dev/null
+++ b/app/service/AdminUserRepository.php
@@ -0,0 +1,108 @@
+orderByDesc('id')
+ ->get()
+ ->all();
+
+ return array_map(fn (object $row): array => $this->toArray($row), $rows);
+ }
+
+ public function findByUsername(string $username): ?array
+ {
+ $row = Db::table('admin_users')
+ ->where('username', $username)
+ ->where('is_active', true)
+ ->first();
+
+ return $row ? $this->toArray($row) : null;
+ }
+
+ public function findById(int $id): ?array
+ {
+ $row = Db::table('admin_users')
+ ->where('id', $id)
+ ->where('is_active', true)
+ ->first();
+
+ return $row ? $this->toArray($row) : null;
+ }
+
+ public function findAnyById(int $id): ?array
+ {
+ $row = Db::table('admin_users')->where('id', $id)->first();
+ return $row ? $this->toArray($row) : null;
+ }
+
+ public function findAnyByUsername(string $username): ?array
+ {
+ $row = Db::table('admin_users')->where('username', $username)->first();
+ return $row ? $this->toArray($row) : null;
+ }
+
+ public function touchLastLogin(int $id): void
+ {
+ Db::table('admin_users')
+ ->where('id', $id)
+ ->update(['last_login_at' => Db::raw('CURRENT_TIMESTAMP')]);
+ }
+
+ public function create(string $username, string $password, ?string $displayName = null): array
+ {
+ $id = Db::table('admin_users')->insertGetId([
+ 'username' => $username,
+ 'display_name' => $displayName,
+ 'password_hash' => password_hash($password, PASSWORD_DEFAULT),
+ 'is_active' => true,
+ 'last_login_at' => null,
+ ]);
+
+ return $this->findAnyById((int) $id) ?? [];
+ }
+
+ public function updateUser(int $id, array $fields): ?array
+ {
+ $updates = [];
+
+ if (array_key_exists('display_name', $fields)) {
+ $displayName = $fields['display_name'];
+ $updates['display_name'] = $displayName === null ? null : trim((string) $displayName);
+ }
+
+ if (array_key_exists('password', $fields) && trim((string) $fields['password']) !== '') {
+ $updates['password_hash'] = password_hash((string) $fields['password'], PASSWORD_DEFAULT);
+ }
+
+ if (array_key_exists('is_active', $fields)) {
+ $updates['is_active'] = (bool) $fields['is_active'];
+ }
+
+ if ($updates !== []) {
+ Db::table('admin_users')->where('id', $id)->update($updates);
+ }
+
+ return $this->findAnyById($id);
+ }
+
+ private function toArray(object $row): array
+ {
+ return [
+ 'id' => (int) $row->id,
+ 'username' => (string) $row->username,
+ 'display_name' => $row->display_name,
+ 'password_hash' => (string) $row->password_hash,
+ 'is_active' => (bool) $row->is_active,
+ 'last_login_at' => $row->last_login_at,
+ 'created_time' => $row->created_time,
+ 'updated_time' => $row->updated_time,
+ ];
+ }
+}
diff --git a/app/service/ArchiveRepository.php b/app/service/ArchiveRepository.php
index 43a2c8d..c8222b6 100644
--- a/app/service/ArchiveRepository.php
+++ b/app/service/ArchiveRepository.php
@@ -75,6 +75,102 @@ class ArchiveRepository
return implode("\n\n", array_map(fn ($chunk): string => (string) $chunk->text, $chunks));
}
+ public function findChunk(string $chunkUid): ?array
+ {
+ $row = Db::table('chunks')
+ ->join('archives', 'chunks.archive_uid', '=', 'archives.archive_uid')
+ ->where('chunks.chunk_uid', $chunkUid)
+ ->first([
+ 'chunks.chunk_uid',
+ 'chunks.archive_uid',
+ 'chunks.chunk_index',
+ 'chunks.page_start',
+ 'chunks.page_end',
+ 'chunks.text',
+ 'chunks.length',
+ 'chunks.embedding_status',
+ 'chunks.embedding_ref',
+ 'chunks.embedding_model',
+ 'chunks.embedding_error',
+ 'chunks.search_index_status',
+ 'chunks.search_index_error',
+ 'archives.title',
+ 'archives.summary',
+ 'archives.year',
+ 'archives.author',
+ 'archives.source',
+ 'archives.series',
+ 'archives.tags',
+ 'archives.metadata',
+ ]);
+
+ if (!$row) {
+ return null;
+ }
+
+ return [
+ 'chunk_uid' => (string) $row->chunk_uid,
+ 'archive_uid' => (string) $row->archive_uid,
+ 'chunk_index' => (int) $row->chunk_index,
+ 'page_start' => $row->page_start === null ? null : (int) $row->page_start,
+ 'page_end' => $row->page_end === null ? null : (int) $row->page_end,
+ 'pages' => $this->pages($row->page_start, $row->page_end),
+ 'text' => (string) $row->text,
+ 'length' => $row->length === null ? null : (int) $row->length,
+ 'embedding_status' => (int) $row->embedding_status,
+ 'embedding_ref' => $this->decodeJson($row->embedding_ref ?? null, null),
+ 'embedding_model' => $row->embedding_model,
+ 'embedding_error' => $row->embedding_error,
+ 'search_index_status' => (int) $row->search_index_status,
+ 'search_index_error' => $row->search_index_error,
+ 'archive' => [
+ 'archive_uid' => (string) $row->archive_uid,
+ 'title' => $row->title,
+ 'summary' => $row->summary,
+ 'year' => $row->year === null ? null : (int) $row->year,
+ 'author' => $row->author,
+ 'source' => $row->source,
+ 'series' => $row->series,
+ 'tags' => $this->decodeJson($row->tags ?? null, []),
+ 'metadata' => $this->decodeJson($row->metadata ?? null, []),
+ ],
+ ];
+ }
+
+ public function findArchiveChunks(string $archiveUid): array
+ {
+ $rows = Db::table('chunks')
+ ->join('archives', 'chunks.archive_uid', '=', 'archives.archive_uid')
+ ->where('chunks.archive_uid', $archiveUid)
+ ->orderBy('chunks.chunk_index')
+ ->get([
+ 'chunks.chunk_uid',
+ 'chunks.archive_uid',
+ 'chunks.chunk_index',
+ 'chunks.page_start',
+ 'chunks.page_end',
+ 'chunks.text',
+ 'chunks.length',
+ 'chunks.embedding_status',
+ 'chunks.embedding_ref',
+ 'chunks.embedding_model',
+ 'chunks.embedding_error',
+ 'chunks.search_index_status',
+ 'chunks.search_index_error',
+ 'archives.title',
+ 'archives.summary',
+ 'archives.year',
+ 'archives.author',
+ 'archives.source',
+ 'archives.series',
+ 'archives.tags',
+ 'archives.metadata',
+ ])
+ ->all();
+
+ return array_map(fn (object $row): array => $this->chunkRowToArray($row), $rows);
+ }
+
public function updateMetadata(string $archiveUid, array $fields, array $aiMeta): void
{
$archive = $this->findArchive($archiveUid);
@@ -136,4 +232,68 @@ class ArchiveRepository
'chunks' => json_decode($archive->chunks ?? '[]', true) ?: [],
];
}
+
+ private function chunkRowToArray(object $row): array
+ {
+ return [
+ 'chunk_uid' => (string) $row->chunk_uid,
+ 'archive_uid' => (string) $row->archive_uid,
+ 'chunk_index' => (int) $row->chunk_index,
+ 'page_start' => $row->page_start === null ? null : (int) $row->page_start,
+ 'page_end' => $row->page_end === null ? null : (int) $row->page_end,
+ 'pages' => $this->pages($row->page_start, $row->page_end),
+ 'text' => (string) $row->text,
+ 'length' => $row->length === null ? null : (int) $row->length,
+ 'embedding_status' => (int) $row->embedding_status,
+ 'embedding_ref' => $this->decodeJson($row->embedding_ref ?? null, null),
+ 'embedding_model' => $row->embedding_model,
+ 'embedding_error' => $row->embedding_error,
+ 'search_index_status' => (int) $row->search_index_status,
+ 'search_index_error' => $row->search_index_error,
+ 'archive' => [
+ 'archive_uid' => (string) $row->archive_uid,
+ 'title' => $row->title,
+ 'summary' => $row->summary,
+ 'year' => $row->year === null ? null : (int) $row->year,
+ 'author' => $row->author,
+ 'source' => $row->source,
+ 'series' => $row->series,
+ 'tags' => $this->decodeJson($row->tags ?? null, []),
+ 'metadata' => $this->decodeJson($row->metadata ?? null, []),
+ ],
+ ];
+ }
+
+ private function decodeJson(mixed $value, mixed $fallback): mixed
+ {
+ if ($value === null) {
+ return $fallback;
+ }
+
+ if (is_array($value)) {
+ return $value;
+ }
+
+ if (!is_string($value) || trim($value) === '') {
+ return $fallback;
+ }
+
+ $decoded = json_decode($value, true);
+ return $decoded === null && json_last_error() !== JSON_ERROR_NONE ? $fallback : $decoded;
+ }
+
+ private function pages(mixed $pageStart, mixed $pageEnd): array
+ {
+ if (!is_numeric($pageStart) || !is_numeric($pageEnd)) {
+ return array_values(array_filter([$pageStart, $pageEnd], static fn ($value): bool => $value !== null && $value !== ''));
+ }
+
+ $start = (int) $pageStart;
+ $end = (int) $pageEnd;
+ if ($end < $start) {
+ $end = $start;
+ }
+
+ return range($start, $end);
+ }
}
diff --git a/app/service/ArticleImportService.php b/app/service/ArticleImportService.php
index d676df9..c515d47 100644
--- a/app/service/ArticleImportService.php
+++ b/app/service/ArticleImportService.php
@@ -70,6 +70,16 @@ class ArticleImportService
}
}
+ public function normalizeArchiveContentString(string $content): ?string
+ {
+ return $this->nullableClean($this->cleanMarkdownPage($content));
+ }
+
+ public function normalizeArchiveRawString(string $content): ?string
+ {
+ return $this->nullableClean($content);
+ }
+
private function validate(array $payload): array
{
$errors = [];
@@ -182,8 +192,8 @@ class ArticleImportService
'tags' => is_array($payload['tags'] ?? null) ? array_values($payload['tags']) : [],
'summary' => $this->nullableClean($payload['summary'] ?? null),
'metadata' => $payload['metadata'] ?? [],
- 'content' => $this->nullableClean($payload['content_url'] ?? $payload['content_path'] ?? null),
- 'raw' => $this->nullableClean($payload['raw_url'] ?? $payload['raw_path'] ?? null),
+ 'content' => $this->normalizedArchiveContent($payload),
+ 'raw' => $this->rawArchiveContent($payload),
];
}
@@ -200,6 +210,57 @@ class ArticleImportService
return $this->pageBlocksFromItems($payload, preg_split('/\R{2,}/u', $payload['content']));
}
+ private function normalizedArchiveContent(array $payload): ?string
+ {
+ if (isset($payload['pages']) && is_array($payload['pages'])) {
+ $parts = [];
+ foreach ($payload['pages'] as $page) {
+ if (!is_array($page) || !isset($page['content']) || !is_string($page['content'])) {
+ continue;
+ }
+
+ $content = $this->cleanMarkdownPage($page['content']);
+ if ($content !== '') {
+ $parts[] = $content;
+ }
+ }
+
+ return $this->nullableClean(implode("\n\n", $parts));
+ }
+
+ if (isset($payload['paragraphs']) && is_array($payload['paragraphs'])) {
+ $parts = [];
+ foreach ($payload['paragraphs'] as $paragraph) {
+ $content = is_array($paragraph) ? ($paragraph['content'] ?? '') : $paragraph;
+ if (!is_string($content)) {
+ continue;
+ }
+
+ $content = $this->clean($content);
+ if ($content !== '') {
+ $parts[] = $content;
+ }
+ }
+
+ return $this->nullableClean(implode("\n\n", $parts));
+ }
+
+ if (isset($payload['content']) && is_string($payload['content'])) {
+ return $this->normalizeArchiveContentString($payload['content']);
+ }
+
+ return null;
+ }
+
+ private function rawArchiveContent(array $payload): ?string
+ {
+ if (isset($payload['content']) && is_string($payload['content'])) {
+ return $this->normalizeArchiveRawString($payload['content']);
+ }
+
+ return null;
+ }
+
private function pageBlocksFromPages(array $payload): array
{
$pageBlocks = [];
diff --git a/app/service/Search/ChunkSearchIndexRepository.php b/app/service/Search/ChunkSearchIndexRepository.php
index 8233a7d..404a408 100644
--- a/app/service/Search/ChunkSearchIndexRepository.php
+++ b/app/service/Search/ChunkSearchIndexRepository.php
@@ -7,6 +7,22 @@ use support\Db;
class ChunkSearchIndexRepository
{
+ public function resetEmbeddedChunksToPending(?string $archiveUid = null): int
+ {
+ $query = Db::table('chunks')
+ ->where('embedding_status', EmbeddingStatus::EMBEDDED);
+
+ if ($archiveUid !== null && trim($archiveUid) !== '') {
+ $query->where('archive_uid', trim($archiveUid));
+ }
+
+ return $query->update([
+ 'search_index_status' => SearchIndexStatus::PENDING,
+ 'search_index_error' => null,
+ 'search_index_updated_at' => null,
+ ]);
+ }
+
public function queuePendingArchiveTasks(int $limit): array
{
$statuses = [
@@ -63,6 +79,7 @@ class ChunkSearchIndexRepository
'chunks.created_time',
'chunks.updated_time',
'archives.title',
+ 'archives.summary',
'archives.source',
'archives.author',
'archives.year',
@@ -105,6 +122,7 @@ class ChunkSearchIndexRepository
'page_start' => $row->page_start === null ? null : (int) $row->page_start,
'page_end' => $row->page_end === null ? null : (int) $row->page_end,
'title' => $row->title,
+ 'summary' => $row->summary,
'source' => $row->source,
'author' => $row->author,
'year' => $row->year === null ? null : (int) $row->year,
diff --git a/app/service/Search/OpenSearchChunkIndex.php b/app/service/Search/OpenSearchChunkIndex.php
index f0cc659..90d2b9c 100644
--- a/app/service/Search/OpenSearchChunkIndex.php
+++ b/app/service/Search/OpenSearchChunkIndex.php
@@ -16,6 +16,7 @@ class OpenSearchChunkIndex
$index = $this->indexName();
if ($client->indices()->exists(['index' => $index])) {
+ $this->ensureProperties($client, $index);
return;
}
@@ -64,6 +65,7 @@ class OpenSearchChunkIndex
'page_start' => ['type' => 'integer'],
'page_end' => ['type' => 'integer'],
'title' => $this->textWithKeyword(),
+ 'summary' => ['type' => 'text'],
'source' => $this->textWithKeyword(),
'author' => $this->textWithKeyword(),
'year' => ['type' => 'integer'],
@@ -93,6 +95,31 @@ class OpenSearchChunkIndex
return $this->client ?? (new OpenSearchClientFactory())->make();
}
+ private function ensureProperties(Client $client, string $index): void
+ {
+ $mapping = $client->indices()->getMapping(['index' => $index]);
+ $existing = $mapping[$index]['mappings']['properties'] ?? [];
+ $desired = $this->mapping()['mappings']['properties'] ?? [];
+ $missing = [];
+
+ foreach ($desired as $field => $definition) {
+ if (!array_key_exists($field, $existing)) {
+ $missing[$field] = $definition;
+ }
+ }
+
+ if ($missing === []) {
+ return;
+ }
+
+ $client->indices()->putMapping([
+ 'index' => $index,
+ 'body' => [
+ 'properties' => $missing,
+ ],
+ ]);
+ }
+
private function indexName(): string
{
return config('opensearch.indices.chunks', 'proofdb_chunks');
diff --git a/app/service/Search/OpenSearchSearchService.php b/app/service/Search/OpenSearchSearchService.php
index d2f84ae..962cca7 100644
--- a/app/service/Search/OpenSearchSearchService.php
+++ b/app/service/Search/OpenSearchSearchService.php
@@ -39,6 +39,7 @@ class OpenSearchSearchService
'fields' => [
'text^4',
'title^3',
+ 'summary^2',
'source^2',
'author^2',
'series^2',
@@ -219,6 +220,7 @@ class OpenSearchSearchService
'page_start' => $source['page_start'] ?? null,
'page_end' => $source['page_end'] ?? null,
'title' => $source['title'] ?? null,
+ 'summary' => $source['summary'] ?? null,
'source' => $source['source'] ?? null,
'author' => $source['author'] ?? null,
'year' => $source['year'] ?? null,
@@ -322,6 +324,7 @@ class OpenSearchSearchService
'page_start',
'page_end',
'title',
+ 'summary',
'source',
'author',
'year',
diff --git a/app/view/admin/dashboard.html b/app/view/admin/dashboard.html
new file mode 100644
index 0000000..0840982
--- /dev/null
+++ b/app/view/admin/dashboard.html
@@ -0,0 +1,835 @@
+
+
+
+
+
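 Maintain the archives table, OpenSearch status, admin accounts, API docs, and script-level maintenance actions here.
++ Archive Storage  Tag Processing  Vector Storage  Full-Text Search +
Choose an entry path
Tips: the "Proof" in ProofDB refers to alcohol proof.
Sign in to the Proof DB admin console.
Enter the maintenance panel
Please sign in with your username and password.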
+