Schema Idea

2024-10-30 21:07:54 +01:00
parent 786a230e78
commit f2f9819d27
10 changed files with 1452 additions and 674 deletions


@@ -38,3 +38,8 @@ from ums.utils.types import (
from ums.utils.request import ManagementRequest
from ums.utils.functions import list_shared_data, list_shared_schema
from ums.utils.schema import (
    ExtractionSchema,
    ExtractedData
)


@@ -13,15 +13,73 @@
The types are implemented using [pydantic](https://docs.pydantic.dev/).
It provides validation, allows JSON serialization, and works well with [FastAPI](https://fastapi.tiangolo.com/), which is used internally for the HTTP requests between the agents and the management.
**This is work in progress!**
"""
from enum import Enum
from typing import List, Any, Dict
from pydantic import BaseModel
class ExtractionSchema(BaseModel):
    """
    This is the basic class used as superclass for all extracted information from data items.

    For every `ExtractionSchema` it is required that the data can be serialized to JSON.
    Thus, mostly only default data types like `int, str, bool, list, dict, tuple`, as well as `ExtractionSchema` and `RiddleInformation`, can be used here!
    """
class ExtractedContent(ExtractionSchema):
    """
    An extracted content item.
    """
    type : str
    """
    The type as a string; the actual value depends on the extraction agent.
    """
    content : str | Any
    """
    The extracted content.
    """
class ExtractedPositions(ExtractionSchema):
    """
    A position (like time, coordinates, ...) where something was extracted (each position should belong to a content item).
    """
    type : str
    """
    The type as a string; the actual value depends on the extraction agent.
    """
    position : str | int | Any
    """
    The position; its format also depends on the extraction agent.
    """
    description : str | Any = None
    """
    An optional description for more details.
    """
class ExtractedData(ExtractionSchema):
    """
    Contains the extracted items from a data file.
    """
    contents : List[ExtractedContent] = []
    """
    The extracted contents (e.g., transcriptions); each item here should belong to the position item at the same index.
    """
    positions : List[ExtractedPositions] = []
    """
    The positions of the extracted contents; each item here should belong to the content item at the same index.
    """
    other : Dict[str, Any] = {}
    """
    Possibly more data. Use a keyword (depending on the agent) and store the data there.
    """
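
The docstrings above ask for two things: `contents` and `positions` should line up index by index, and everything should survive JSON serialization. The following is a minimal sketch of how an agent might fill the schema, not part of this commit. It assumes pydantic v2 (`model_dump_json`, `model_validate_json`), redefines trimmed stand-ins for the classes so it runs on its own, and introduces a hypothetical helper `is_aligned`. The type strings, timestamps, and the `language` key are made-up example values.

```python
from typing import Any, Dict, List, Optional

from pydantic import BaseModel


# Trimmed stand-ins for the classes in this diff, so the sketch runs on its own.
class ExtractionSchema(BaseModel):
    pass


class ExtractedContent(ExtractionSchema):
    type: str
    content: Any


class ExtractedPositions(ExtractionSchema):
    type: str
    position: Any
    description: Optional[Any] = None


class ExtractedData(ExtractionSchema):
    contents: List[ExtractedContent] = []
    positions: List[ExtractedPositions] = []
    other: Dict[str, Any] = {}


def is_aligned(data: ExtractedData) -> bool:
    """Hypothetical helper: checks that contents and positions have the same length."""
    return len(data.contents) == len(data.positions)


# Made-up example values; real type strings depend on the extraction agent.
data = ExtractedData(
    contents=[
        ExtractedContent(type="transcription", content="Hello"),
        ExtractedContent(type="transcription", content="World"),
    ],
    positions=[
        ExtractedPositions(type="time", position="00:00:01"),
        ExtractedPositions(type="time", position="00:00:04", description="second speaker"),
    ],
    other={"language": "en"},
)

# Round trip through JSON, which is what the serialization requirement is about.
as_json = data.model_dump_json()
restored = ExtractedData.model_validate_json(as_json)
```

The schema itself does not enforce the index pairing, so a length check like `is_aligned` is only one possible way for an agent to verify its output before sending it.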