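# Chunk and Evidence API

## Overview

These endpoints resolve search results into readable evidence objects.

- `GET /api/archives/{archive_uid}` returns archive-level details.
- `GET /api/archives/{archive_uid}/chunks` returns the list of chunks under that archive.
- `GET /api/archives/{archive_uid}/evidence` returns the archive's evidence list, suited to citation and AI consumption.
- `GET /api/chunks/{chunk_uid}` is lower-level and returns chunk details together with information about the parent archive.
- `GET /api/evidence/{chunk_uid}` is geared toward citation and display and returns the citation, page label, and evidence text.

The archive endpoints are keyed by `archive_uid`; the other two are keyed by `chunk_uid`.
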
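The five paths above can be collected into a small URL helper. This is a minimal sketch: `ENDPOINTS`, `build_url`, and the `https://api.example.com` base URL are illustrative names, not part of the API, and the real API domain is not specified in this document.

```python
from urllib.parse import quote

# Path templates copied from the endpoint list above.
ENDPOINTS = {
    "archive": "/api/archives/{uid}",
    "archive_chunks": "/api/archives/{uid}/chunks",
    "archive_evidence": "/api/archives/{uid}/evidence",
    "chunk": "/api/chunks/{uid}",
    "evidence": "/api/evidence/{uid}",
}


def build_url(base_url: str, endpoint: str, uid: str) -> str:
    """Return the full URL for one of the five endpoints.

    `base_url` stands in for the deployment's API domain (an assumption).
    """
    # Percent-encode the identifier so it is safe as a single path segment.
    path = ENDPOINTS[endpoint].format(uid=quote(uid, safe=""))
    return base_url.rstrip("/") + path
```

For example, `build_url("https://api.example.com", "chunk", chunk_uid)` yields the chunk details URL for a given `chunk_uid`.
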
## Archive Details

```http
GET /api/archives/{archive_uid}
```

### Example Request

```bash
curl <APIdomain>/api/archives/01KQHVREB6XPYF604RVZAP9NNY
```

### Success Response

Status code:

```http
200 OK
```

Example response:

```json
{
  "code": 0,
  "message": "Archive loaded.",
  "data": {
    "archive_uid": "01KQHVREB6XPYF604RVZAP9NNY",
    "title": "1.test",
    "summary": "This directive, signed by Brent Scowcroft ...",
    "year": 1992,
    "author": "Brent Scowcroft",
    "source": "test/1.test.md",
    "series": null,
    "tags": ["National Security", "Policy"],
    "metadata": {
      "ai_enrichment": {
        "provider": "bigmodel"
      }
    },
    "chunks": [
      "01KQHVREB6XPYF604RVZAP9NNY_1_39003",
      "01KQHVREB6XPYF604RVZAP9NNY_2_12345"
    ],
    "chunk_count": 14
  }
}
```

Notes:

- `chunks` is the list of `chunk_uid` values associated with the current archive.
- `chunk_count` lets callers gauge the size of the archive quickly without counting the array themselves.

### Error Responses

#### Archive does not exist

Status code:

```http
404 Not Found
```

```json
{
  "code": 404,
  "message": "Archive not found.",
  "errors": {
    "archive_uid": "missing_archive_uid"
  }
}
```

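The success and error samples share one envelope shape: `code`, `message`, and either `data` or `errors`. A client could unwrap it as sketched below. `ApiError` and `unwrap` are hypothetical names, and treating `code == 0` as the only success value is an assumption drawn from the samples in this document rather than a documented rule.

```python
import json


class ApiError(Exception):
    """Raised when the response envelope carries a non-zero `code`."""

    def __init__(self, code, message, errors):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.errors = errors


def unwrap(body: str) -> dict:
    """Return `data` from a success envelope, or raise ApiError.

    Assumption: `code == 0` marks success, as in the samples shown here.
    """
    envelope = json.loads(body)
    if envelope.get("code") != 0:
        raise ApiError(
            envelope.get("code"),
            envelope.get("message"),
            envelope.get("errors", {}),
        )
    return envelope["data"]
```

With this shape, a caller can catch `ApiError` and inspect `errors` (for example `{"archive_uid": "missing_archive_uid"}`) instead of parsing the message text.
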
## Chunk List for an Archive

```http
GET /api/archives/{archive_uid}/chunks
```

### Example Request

```bash
curl <APIdomain>/api/archives/01KQHVREB6XPYF604RVZAP9NNY/chunks
```

### Success Response

Status code:

```http
200 OK
```

Example response:

```json
{
  "code": 0,
  "message": "Archive chunks loaded.",
  "data": {
    "archive_uid": "01KQHVREB6XPYF604RVZAP9NNY",
    "title": "1.test",
    "summary": "This directive, signed by Brent Scowcroft ...",
    "source": "test/1.test.md",
    "author": "Brent Scowcroft",
    "year": 1992,
    "series": null,
    "tags": ["National Security", "Policy"],
    "chunk_count": 14,
    "chunks": [
      {
        "chunk_uid": "01KQHVREB6XPYF604RVZAP9NNY_1_39003",
        "archive_uid": "01KQHVREB6XPYF604RVZAP9NNY",
        "chunk_index": 1,
        "page_start": 1,
        "page_end": 1,
        "pages": [1],
        "text": "chunk text...",
        "length": 300,
        "embedding_status": 3,
        "embedding_ref": {
          "provider": "bigmodel",
          "model": "embedding-3",
          "dimensions": 2048
        },
        "embedding_model": "embedding-3",
        "embedding_error": null,
        "search_index_status": 3,
        "search_index_error": null,
        "archive": {
          "archive_uid": "01KQHVREB6XPYF604RVZAP9NNY",
          "title": "1.test",
          "summary": "This directive, signed by Brent Scowcroft ...",
          "year": 1992,
          "author": "Brent Scowcroft",
          "source": "test/1.test.md",
          "series": null,
          "tags": ["National Security", "Policy"],
          "metadata": {}
        }
      }
    ]
  }
}
```

Notes:

- This endpoint is lower-level and suited to bulk-reading complete chunk data for an archive.
- `chunks` is returned in ascending `chunk_index` order.

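Because each chunk carries its own `pages` array, the list can be regrouped by page on the client. The sketch below assumes `data` is the `data` object of the response above; `chunks_by_page` is an illustrative helper, and the sort is only a defensive step given that the endpoint already returns chunks in `chunk_index` order.

```python
from collections import defaultdict


def chunks_by_page(data: dict) -> dict:
    """Map each page number to the chunk_uids that touch it, in chunk order."""
    pages = defaultdict(list)
    # Defensive sort; the endpoint is documented to return ascending chunk_index.
    for chunk in sorted(data["chunks"], key=lambda c: c["chunk_index"]):
        for page in chunk["pages"]:
            pages[page].append(chunk["chunk_uid"])
    return dict(pages)
```

A chunk that spans several pages appears under every page it touches, which is convenient for a page-by-page reading view.
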
## Archive-Level Evidence List

```http
GET /api/archives/{archive_uid}/evidence
```

### Example Request

```bash
curl <APIdomain>/api/archives/01KQHVREB6XPYF604RVZAP9NNY/evidence
```

### Success Response

Status code:

```http
200 OK
```

Example response:

```json
{
  "code": 0,
  "message": "Archive evidence loaded.",
  "data": {
    "archive_uid": "01KQHVREB6XPYF604RVZAP9NNY",
    "title": "1.test",
    "summary": "This directive, signed by Brent Scowcroft ...",
    "source": "test/1.test.md",
    "author": "Brent Scowcroft",
    "year": 1992,
    "series": null,
    "tags": ["National Security", "Policy"],
    "chunk_count": 14,
    "evidence": [
      {
        "chunk_uid": "01KQHVREB6XPYF604RVZAP9NNY_1_39003",
        "chunk_index": 1,
        "page_start": 1,
        "page_end": 1,
        "pages": [1],
        "page_label": "p. 1",
        "citation": "1.test | Brent Scowcroft | 1992 | p. 1 | test/1.test.md",
        "quote": "chunk text...",
        "length": 300,
        "embedding_model": "embedding-3",
        "embedding_status": 3,
        "search_index_status": 3
      }
    ]
  }
}
```

Notes:

- This endpoint is higher-level and suited to AI, RAG, citation building, and evidence lists in the front end.
- Each item in `evidence` keeps the page numbers and quoted text needed for a citation.

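As one possible use of this list in a RAG pipeline, the `quote` and `citation` fields can be joined into a numbered context block for a prompt. This is a sketch under assumptions: `build_context`, the `Source:` label, and the `max_chars` budget are illustrative choices, not something the API prescribes.

```python
def build_context(data: dict, max_chars: int = 4000) -> str:
    """Join evidence quotes and citations into a numbered context block.

    Stops before the first item that would push the total past `max_chars`.
    """
    blocks = []
    used = 0
    for number, item in enumerate(data["evidence"], start=1):
        block = f"[{number}] {item['quote']}\nSource: {item['citation']}"
        if used + len(block) > max_chars:
            break
        blocks.append(block)
        used += len(block)
    return "\n\n".join(blocks)
```

Numbering the blocks lets a model refer back to an item as `[1]`, `[2]`, and so on, while the citation string stays attached to the text it supports.
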
## Chunk Details

```http
GET /api/chunks/{chunk_uid}
```

### Example Request

```bash
curl <APIdomain>/api/chunks/01KQHVREB6XPYF604RVZAP9NNY_14_97554
```

### Success Response

Status code:

```http
200 OK
```

Example response:

```json
{
  "code": 0,
  "message": "Chunk loaded.",
  "data": {
    "chunk_uid": "01KQHVREB6XPYF604RVZAP9NNY_14_97554",
    "archive_uid": "01KQHVREB6XPYF604RVZAP9NNY",
    "chunk_index": 14,
    "page_start": 8,
    "page_end": 8,
    "pages": [8],
    "text": "NSD 45 20 AUG 90 U.S. Policy in Response to the Iraqi Invasion of Kuwait ...",
    "length": 148,
    "embedding_status": 3,
    "embedding_ref": {
      "provider": "bigmodel",
      "model": "embedding-3",
      "dimensions": 2048
    },
    "embedding_model": "embedding-3",
    "embedding_error": null,
    "search_index_status": 3,
    "search_index_error": null,
    "archive": {
      "archive_uid": "01KQHVREB6XPYF604RVZAP9NNY",
      "title": "1.test",
      "summary": null,
      "year": 1992,
      "author": "Brent Scowcroft",
      "source": "test/1.test.md",
      "series": null,
      "tags": [],
      "metadata": {}
    }
  }
}
```

### Error Responses

#### Chunk does not exist

Status code:

```http
404 Not Found
```

```json
{
  "code": 404,
  "message": "Chunk not found.",
  "errors": {
    "chunk_uid": "missing_chunk_uid"
  }
}
```

## Evidence Details

```http
GET /api/evidence/{chunk_uid}
```

### Example Request

```bash
curl <APIdomain>/api/evidence/01KQHVREB6XPYF604RVZAP9NNY_14_97554
```

### Success Response

Status code:

```http
200 OK
```

Example response:

```json
{
  "code": 0,
  "message": "Evidence loaded.",
  "data": {
    "chunk_uid": "01KQHVREB6XPYF604RVZAP9NNY_14_97554",
    "archive_uid": "01KQHVREB6XPYF604RVZAP9NNY",
    "title": "1.test",
    "source": "test/1.test.md",
    "author": "Brent Scowcroft",
    "year": 1992,
    "series": null,
    "tags": [],
    "page_start": 8,
    "page_end": 8,
    "pages": [8],
    "page_label": "p. 8",
    "citation": "1.test | Brent Scowcroft | 1992 | p. 8 | test/1.test.md",
    "quote": "NSD 45 20 AUG 90 U.S. Policy in Response to the Iraqi Invasion of Kuwait ...",
    "chunk": {
      "chunk_index": 14,
      "length": 148,
      "embedding_model": "embedding-3",
      "embedding_status": 3,
      "search_index_status": 3
    }
  }
}
```

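In the sample above, `citation` reads as `title`, `author`, `year`, `page_label`, and `source` joined by ` | `. The sketch below rebuilds the string on that assumption, which can help when a client wants to verify or re-render a citation locally. The field order is inferred from the samples in this document only, `format_citation` is a hypothetical helper, and how the server treats `null` fields is not documented.

```python
def format_citation(evidence: dict) -> str:
    """Rebuild a citation string from an evidence object's own fields.

    Assumption: the order title | author | year | page_label | source,
    as observed in the sample responses.
    """
    parts = [
        evidence["title"],
        evidence["author"],
        evidence["year"],
        evidence["page_label"],
        evidence["source"],
    ]
    return " | ".join(str(part) for part in parts)
```

Prefer the server-provided `citation` where it is available; a local rebuild like this is mainly useful as a consistency check.
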
### Error Responses

#### Evidence does not exist

Status code:

```http
404 Not Found
```

```json
{
  "code": 404,
  "message": "Evidence not found.",
  "errors": {
    "chunk_uid": "missing_chunk_uid"
  }
}
```