ums.utils.schema

This module defines the basic types used to represent information extracted from the data. The types are implemented using pydantic, which provides validation, allows JSON serialization, and works well with FastAPI, which is used internally for the HTTP requests between the agents and the management.

This is work in progress!

# Agenten Plattform
#
# (c) 2024 Magnus Bender
# 	Institute of Humanities-Centered Artificial Intelligence (CHAI)
# 	Universitaet Hamburg
# 	https://www.chai.uni-hamburg.de/~bender
#
# source code released under the terms of GNU Public License Version 3
# https://www.gnu.org/licenses/gpl-3.0.txt

"""
	This module defines the basic types used to represent information extracted from the data.
	The types are implemented using [pydantic](https://docs.pydantic.dev/),
	which provides validation, allows JSON serialization, and works well with [FastAPI](https://fastapi.tiangolo.com/), which is used internally for the HTTP requests between the agents and the management.

	**This is work in progress!**
"""

from typing import List, Any, Dict

from pydantic import BaseModel

class ExtractionSchema(BaseModel):
	"""
		This is the basic class used as superclass for all extracted information from data items.

		For every `ExtractionSchema`, the data must be serializable to JSON.
		Thus, mostly only default data types such as `int, str, bool, list, dict, tuple`, as well as `ExtractionSchema` and `RiddleInformation`, can be used here!
	"""

class ExtractedContent(ExtractionSchema):
	"""
		An extracted content item.
	"""

	type : str
	"""
		The type, as a string; the actual string depends on the extraction agent.
	"""

	content : str | Any
	"""
		The extracted content.
	"""

class ExtractedPositions(ExtractionSchema):
	"""
		A position (like time, coordinates, ...) where something was extracted (each position should belong to a content item).
	"""

	type : str
	"""
		The type, as a string; the actual string depends on the extraction agent.
	"""

	position : str | int | Any
	"""
		The position; its form also depends on the extraction agent.
	"""

	description : str | Any = None
	"""
		An optional description for more details.
	"""

class ExtractedData(ExtractionSchema):
	"""
		Contains the extracted items from a data file.
	"""

	contents : List[ExtractedContent] = []
	"""
		The extracted contents (e.g., transcriptions); each item here should correspond to the position item at the same index.
	"""

	positions : List[ExtractedPositions] = []
	"""
		The positions of extracted contents; each item here should correspond to the content item at the same index.
	"""

	other : Dict[str, Any] = {}
	"""
		Possibly more data. Use a keyword (depending on the agent) and store the data under it.
	"""
class ExtractionSchema(pydantic.main.BaseModel):

This is the basic class used as superclass for all extracted information from data items.

For every ExtractionSchema, the data must be serializable to JSON. Thus, mostly only default data types such as int, str, bool, list, dict, and tuple, as well as ExtractionSchema and RiddleInformation, can be used here!
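
The JSON requirement can be illustrated with a minimal sketch. It assumes pydantic v2 is installed; `ExtractionSchema` is re-declared locally as a stand-in so the snippet runs on its own, and `ImageSize` with its fields is a hypothetical subclass invented for illustration, not part of this module.

```python
from typing import List

from pydantic import BaseModel


# Local stand-in for the documented base class (assumption: it adds no fields).
class ExtractionSchema(BaseModel):
    pass


# Hypothetical subclass that uses only JSON-friendly field types.
class ImageSize(ExtractionSchema):
    width: int
    height: int
    tags: List[str] = []


size = ImageSize(width=640, height=480, tags=["photo"])

# Serialize to a JSON string, then parse the string back into a model.
as_json = size.model_dump_json()
restored = ImageSize.model_validate_json(as_json)
```

Because every field is a plain JSON-compatible type, the restored model carries the same values as the original one.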

Inherited Members
pydantic.main.BaseModel
BaseModel
model_extra
model_fields_set
model_construct
model_copy
model_dump
model_dump_json
model_json_schema
model_parametrized_name
model_post_init
model_rebuild
model_validate
model_validate_json
model_validate_strings
dict
json
parse_obj
parse_raw
parse_file
from_orm
construct
copy
schema
schema_json
validate
update_forward_refs
class ExtractedContent(ExtractionSchema):

An extracted content item.

type: str

The type, as a string; the actual string depends on the extraction agent.

content: str | typing.Any

The extracted content

class ExtractedPositions(ExtractionSchema):

A position (like time, coordinates, ...) where something was extracted (each position should belong to a content item).

type: str

The type, as a string; the actual string depends on the extraction agent.

position: str | int | typing.Any

The position; its form also depends on the extraction agent.

description: str | typing.Any

An optional description for more details.

class ExtractedData(ExtractionSchema):

Contains the extracted items from a data file.

contents: List[ExtractedContent]

The extracted contents (e.g., transcriptions); each item here should correspond to the position item at the same index.

positions: List[ExtractedPositions]

The positions of extracted contents; each item here should correspond to the content item at the same index.

other: Dict[str, Any]

Possibly more data. Use a keyword (depending on the agent) and store the data under it.
