This article is the 23rd day article of Python Part 2 Advent Calendar 2020.
It's been almost eight years as a back-end engineer, my name is Kouki Hoshi. Although the range of defense is server side, it is a wide and shallow engineer who has experience in improving application, infrastructure, testing, and development environment. Recently I've been working with python for almost a year using a framework called chalice.
When creating the API, I sometimes want to map the JSON received as input to the intended class ... I didn't have a library like that in python, so I made it myself. The version of python is 3.8 (from typing import get_origin get_args/can only be 3.8 or later).
It seems that you can easily do it with json.loads of json, which is a general-purpose library. For that reason, I seemed to have to work hard to write the settings, and I felt that it would be difficult to write them universally.
Even in Java, there is a famous library called Jackson for a long time, so I thought that I could find it, but there is no such description. So, I made it by trying to make it better.
Dropping an instance of a class into some form of data is called serialization. Conversely, converting some data format into an instance of a class is called deserialization.
Here, let's assume the following usage.
@dataclass
class Hello:
hello: str
objectMapper = ObjectMapper()
instance = objectMapper.deserialize('{"hello": "mapper"}', Hello)
print(instance.hello) ##output: mapper
This time, we will make it with the following class structure.
import json
from typing import Type, TypeVar, List
from abc import ABCMeta, abstractmethod
class NotImplementedError(Exception):
def __init__(self, message):
super().__init__(message)
class JsonDeserializer(metaclass=ABCMeta):
@abstractmethod
def canDeserialize(self, json: object, mappingClass: type) -> bool:
pass
@abstractmethod
def deserialize(self, json: object, mappingClass: type) -> object:
pass
T = TypeVar('T')
class ObjectMapper:
deserializers: List[JsonDeserializer] = []
def deserialize(self, jsonText: str, mappingClass: Type[T]) -> T:
jsonData = json.loads(jsonText)
for deserializer in ObjectMapper.deserializers:
if deserializer.canDeserialize(jsonData, mappingClass):
return deserializer.deserialize(jsonData, mappingClass)
raise NotImplementedError(f'Cannot deserialize json({jsonData}) to class({mappingClass}).')
In this way, you can have the JsonDeserializer object in the ObjectMapper in list format. If you find a compatible JsonDeserializer (canDeserialize (json, mappingClass) == true), Use it to make it available for various classes.
First, write a unit test for the literal.
from main.json_deserializer import ObjectMapper
class TestObjectMapper:
def test_deserializeInt(self):
actual = ObjectMapper().deserialize('1', int)
assert actual == 1
def test_deserializeStr(self):
actual = ObjectMapper().deserialize('"1"', str)
assert actual == '1'
def test_deserializeFloat(self):
actual = ObjectMapper().deserialize('1.5', float)
assert actual == 1.5
def test_deserializeNull(self):
actual = ObjectMapper().deserialize('null', str)
assert actual is None
$ python -m pytest
================================================================ short test summary info =================================================================
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeInt - main.json_deserializer.NotImplementedError: Cannot deserialize json(1) to c...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeStr - main.json_deserializer.NotImplementedError: Cannot deserialize json(1) to c...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeFloat - main.json_deserializer.NotImplementedError: Cannot deserialize json(1.5) ...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeNull - main.json_deserializer.NotImplementedError: Cannot deserialize json(Null) ...
=================================================================== 5 failed in 0.21s ====================================================================```
I haven't implemented it yet, so of course I can do it. Implements JsonDeserializer for literals.
class ObjectMapper:
deserializers: List[JsonDeserializer] = [
LiteralDeserializer () ## added
]
class LiteralDeserializer(JsonDeserializer):
def canDeserialize(self, json: object, mappingClass: type) -> bool:
return type(json) in [int, float, str, type(None)]
def deserialize(self, json: object, mappingClass: type) -> object:
return json
collected 4 items
tests/test_object_mapper.py ....
=================================================================== 4 passed in 0.03s ====================================================================
###It's a literal, but if you want to make it a type such as date and time / Enum
Now add the following test.
from datetime import datetime, date
from enum import Enum
class Member(Enum):
JOHN = 'john'
BOB = 'bob'
----(abridgement)----
def test_deserializeDate(self):
actual = ObjectMapper().deserialize('"2020-12-23"', date)
assert actual == date(2020, 12, 23)
def test_deserializeNaiveDateTime(self):
actual = ObjectMapper().deserialize('"2020-12-23T03:00:00"', datetime)
assert actual == datetime(2020, 12, 23, 3, 0, 0)
def test_deserializeAwareDateTime(self):
actual = ObjectMapper().deserialize('"2020-12-23T03:00:00+0900"', datetime)
assert actual == datetime(2020, 12, 23, 3, 0, 0, tzinfo=timezone(timedelta(hours=+9), 'JST'))
def test_deserializeEnum(self):
actual = ObjectMapper().deserialize('"john"', Member)
assert actual is Member.JOHN
At daytime, Naive(No time zone), Aware(There is a time zone)there is. It may be confusing because it can not be calculated between different dates and times, so it may be better to use Aware for everything, but This time, we will make it possible to create the corresponding data in either case.
Test execution as a trial
================================================================ short test summary info =================================================================
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeDate - AssertionError: assert '2020-12-23' == datetime.date(2020, 12, 23)
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeNativeDateTime - AssertionError: assert '2020-12-23T03:00:00' == datetime.datetim...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeAwareDateTime - AssertionError: assert '2020-12-23T03:00:00+0900' == datetime.dat...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeEnum - AssertionError: assert 'john' is <Member.JOHN: 'john'>
============================================================== 4 failed, 4 passed in 0.12s ===============================================================
...That's why I implemented it.
class DateDeserializer(JsonDeserializer):
def canDeserialize(self, json: object, mappingClass: type) -> bool:
return mappingClass == date
def deserialize(self, json: object, mappingClass: type) -> object:
dt = datetime.strptime(json, '%Y-%m-%d')
return date(dt.year, dt.month, dt.day)
class DatetimeDeserializer(JsonDeserializer):
def canDeserialize(self, json: object, mappingClass: type) -> bool:
return mappingClass == datetime
def deserialize(self, json: object, mappingClass: type) -> object:
try:
return datetime.strptime (json,'% Y-% m-% dT% H:% M:% S% z') # Quite miscellaneous ..
except ValueError:
return datetime.strptime(json, '%Y-%m-%dT%H:%M:%S')
class EnumDeserializer(JsonDeserializer):
def canDeserialize(self, json: object, mappingClass: type) -> bool:
return issubclass(mappingClass, Enum)
def deserialize(self, json: object, mappingClass: type) -> object:
for enum in mappingClass:
if enum.value == json:
return enum
class ObjectMapper:
deserializers: List[JsonDeserializer] = [
DatetimeDeserializer (), ## added
EnumDeserializer (), ## added
DateDeserializer (), ## added
LiteralDeserializer()
]
collected 8 items
tests/test_object_mapper.py ........
=================================================================== 8 passed in 0.07s ====================================================================
###Handle container classes
Now we will deal with container classes. This time, create the following container class.
First, create a container common class. To do something like this This is because the container class has to request deserialization again for another internal structure. (It's a little tricky, but again for the internal structure ObjectMapper._I'm calling deserialize)
class ContainerDeserializer(JsonDeserializer):
def deserializeChild(self, json: object, mappingClass: type) -> object:
return ObjectMapper._deserialize(json, mappingClass)
class ObjectMapper:
deserializers: List[JsonDeserializer] = [
...JsonDeserializer
]
def deserialize(self, jsonText: str, mappingClass: Type[T]) -> T:
return self._deserialize(json.loads(jsonText), mappingClass)
@staticmethod
def _deserialize(jsonData: object, mappingClass: Type[T]) -> T:
for deserializer in ObjectMapper.deserializers:
if deserializer.canDeserialize(jsonData, mappingClass):
return deserializer.deserialize(jsonData, mappingClass)
raise NotImplementedError(f'Cannot deserialize json({jsonData}) to class({mappingClass}).')
And first, since we changed the structure, make sure that the existing structure is not broken.
collected 8 items
tests/test_object_mapper.py ........ [100%]
=================================================================== 8 passed in 0.07s ====================================================================
It looks okay, so add the test again.
@dataclass
class Person:
name: str
age: int
@dataclass
class Group:
name: str
leader: Person
## --- Add test ----
def test_deserializeRawList(self):
actual = ObjectMapper (). deserialize ('[{"age": 35, "name": "Suzuki"}, {"age": 21, "name": "Yamada"}]', list)
assert len(actual) == 2
assert actual[0].age == 35
assert actual [0] .name =='Suzuki'
assert actual[1].age == 21
assert actual [1] .name =='Yamada'
def test_deserializeTypedList(self):
actual = ObjectMapper (). deserialize ('[{"age": 35, "name": "Suzuki"}, {"age": 21, "name": "Yamada"}]', List [Person])
assert len(actual) == 2
assert type(actual[0]) == Person
assert actual[0].age == 35
assert actual [0] .name =='Suzuki'
assert type(actual[1]) == Person
assert actual[1].age == 21
assert actual [1] .name =='Yamada'
def test_deserializeRawDict(self):
actual = ObjectMapper (). deserialize ('{"ID1": {"age": 35, "name": "Suzuki"}, "ID3": {"age": 21, "name": "Yamada"}} ', dict)
assert len(actual) == 2
assert actual['ID1'].age == 35
assert actual ['ID1']. name =='Suzuki'
assert actual['ID3'].age == 21
assert actual ['ID3']. name =='Yamada'
def test_deserializeTypedDict(self):
actual = ObjectMapper().deserialize(
'{" ID1 ": {" age ": 35," name ":" Suzuki "}," ID3 ": {" age ": 21," name ":" Yamada "}}',
Dict[str, Person]
)
assert len(actual) == 2
assert type(actual['ID1']) == Person
assert actual['ID1'].age == 35
assert actual ['ID1']. name =='Suzuki'
assert type(actual['ID3']) == Person
assert actual['ID3'].age == 21
assert actual ['ID3']. name =='Yamada'
def test_deserializeRawObject(self):
actual = ObjectMapper (). deserialize ('{"name": "group", "leader": {"age": 35, "name": "Suzuki"}}', object)
assert actual.name =='group'
assert actual.leader.age == 35
assert actual.leader.name =='Suzuki'
def test_deserializeTypedObject(self):
actual = ObjectMapper (). deserialize ('{"name": "group", "leader": {"age": 35, "name": "Suzuki"}}', Group)
assert type(actual) == Group
assert actual.name =='group'
assert type(actual.leader) == Person
assert actual.leader.age == 35
assert actual.leader.name =='Suzuki'
Test execution as a trial...Naturally implemented n (ry
================================================================ short test summary info =================================================================
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeRawList - main.json_deserializer.NotImplementedError: Cannot deserialize json([{'...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeTypedList - main.json_deserializer.NotImplementedError: Cannot deserialize json([...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeRawDict - main.json_deserializer.NotImplementedError: Cannot deserialize json({'I...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeTypedDict - main.json_deserializer.NotImplementedError: Cannot deserialize json({...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeTypedObject - main.json_deserializer.NotImplementedError: Cannot deserialize json...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeRawObject - main.json_deserializer.NotImplementedError: Cannot deserialize json({...
============================================================== 6 failed, 8 passed in 0.36s ===============================================================
Implementation.
from inspect import signature, _ParameterKind
from typing import Type, TypeVar, List, Dict, Set, get_origin, get_args
class ListDeserializer(ContainerDeserializer):
def canDeserialize(self, json: object, mappingClass: type) -> bool:
return get_origin(mappingClass) == list or mappingClass == list
def deserialize(self, json: object, mappingClass: type) -> object:
genericParams = get_args(mappingClass)
hasGenericParams = genericParams is not None and len(genericParams) > 0
param = genericParams[0] if hasGenericParams else object
return [self.deserializeChild(el, param) for el in json]
class DictDeserializer(ContainerDeserializer):
def canDeserialize(self, json: object, mappingClass: type) -> bool:
return get_origin(mappingClass) == dict or mappingClass == dict
def deserialize(self, json: object, mappingClass: type) -> object:
genericParams = get_args(mappingClass)
hasKeyParam = genericParams is not None and len(genericParams) > 0
hasValueParam = genericParams is not None and len(genericParams) > 1
return {
self.deserializeChild(k, genericParams[0] if hasKeyParam else object)
: self.deserializeChild(v, genericParams[1] if hasValueParam else object)
for k, v in json.items()
}
class ObjectDeserializer(ContainerDeserializer):
def canDeserialize(self, json: object, mappingClass: type) -> bool:
return type(json) == dict
def deserialize(self, json: object, mappingClass: type) -> object:
if mappingClass == object:
return self.createRawObject(json)
else:
return self.createObject(json, mappingClass)
def createRawObject(self, args: Dict[str, object]) -> object:
className = ''.join(key.title() for key in args.keys()) + 'Obejct'
newInstance = type(className, (object,), {})()
for k, v in args.items():
setattr(newInstance, k, self.deserializeChild(v, object))
return newInstance
def createObject(self, args: Dict[str, object], mappingClass: type) -> object:
annotations = self.findAnnotations(mappingClass)
requireArgs = self.findInitRequireArgs(mappingClass)
result = object.__new__(mappingClass)
initArgs = {}
for k, v in args.items():
val = self.deserializeChild(v, annotations.get(k, object))
if k in requireArgs:
initArgs[k] = val
else:
setattr(result, k, val)
for req in requireArgs:
if req not in initArgs:
initArgs[req] = None
result.__init__(**initArgs)
return result
@staticmethod
def findAnnotations(mappingClass: type) -> Dict[str, type]:
if hasattr(mappingClass, '__annotations__'):
return mappingClass.__annotations__
for k, v in signature(mappingClass.__init__).parameters.items():
return {k:v.annotation for k,v in signature(mappingClass.__init__).parameters.items()}
@staticmethod
def findInitRequireArgs(mappingClass: type) -> Set[str]:
return {
k for k, v in signature(mappingClass.__init__).parameters.items()
if v.name != 'self' and v.kind == _ParameterKind.POSITIONAL_OR_KEYWORD
}
class ObjectMapper:
deserializers: List[JsonDeserializer] = [
DateDeserializer(),
DatetimeDeserializer(),
EnumDeserializer(),
LiteralDeserializer(),
ListDeserializer (), ## added
DictDeserializer (), ## added
ObjectDeserializer () ## added
]
Test run
collected 14 items
tests/test_object_mapper.py .............. [100%]
=================================================================== 14 passed in 0.18s ===================================================================
####Behavior description of Object deserializer
When it comes to mapping Objects, the difficulty suddenly increased...So I will explain it briefly.
Also, to deserialize the data contained in the object, You have to get the class of data, but there are two places where it may be stored:
-If class has a type definition(Group of code below, GroupWithoutDataclass)
__init__
When the type is defined by the argument of(GroupInitDef with the code below)@dataclass
class Person:
name: str
age: int
# pattern 1
@dataclass
class Group:
name: str
leader: Person
# Pattern 2
class GroupWithoutDataclass:
name: str
leader: Person
# Pattern 3
class GroupInitDef:
def __init__(self, name: str, leader: Person):
self.name = name
self.leader = leader
Also in pythonPerson('AA', 23)
And runPerson
class__init__
The method is executed.
Variadic arguments in this method(*args, **kwargs etc.)An error will occur if you do not specify any variables other than those defined in.
for that reason,def findInitRequireArgs(mappingClass: type) -> Set[str]
I am getting the variable name to be defined in.
And this time, when the JSON data and the mapping data have the following relationship, the following implementation is used.
-JSON data does not have the data required for mapping → Fill with None -JSON data has unnecessary data for mapping → Attribute is added with setattr
This time, I'm doing this, but in some cases it may be better to make an error.
Added a test just in case.(Group has been tested earlier, so omitted)
def test_deserializeTypedObjectWithoutDataclass(self):
actual = ObjectMapper (). deserialize ('{"name": "group", "leader": {"age": 35, "name": "Suzuki"}}', GroupWithoutDataclass)
assert type(actual) == GroupWithoutDataclass
assert actual.name =='group'
assert type(actual.leader) == Person
assert actual.leader.age == 35
assert actual.leader.name =='Suzuki'
def test_deserializeTypedObjectInitDef(self):
actual = ObjectMapper (). deserialize ('{"name": "group", "leader": {"age": 35, "name": "Suzuki"}}', GroupInitDef)
assert type(actual) == GroupInitDef
assert actual.name =='group'
assert type(actual.leader) == Person
assert actual.leader.age == 35
assert actual.leader.name =='Suzuki'
def test_deserializeTypedObjectEmptyJson(self):
actual = ObjectMapper().deserialize('{}', Group)
assert type(actual) == Group
assert actual.name == None
assert actual.leader == None
def test_deserializeTypedObjectRedundantData(self):
actual = ObjectMapper (). deserialize ('{"name": "group", "id": 2, "leader": {"age": 35, "name": "Suzuki", "role": "chief" }}', Group)
assert actual.name =='group'
assert actual.id == 2
assert type(actual.leader) == Person
assert actual.leader.age == 35
assert actual.leader.name =='Suzuki'
assert actual.leader.role =='Chief'
collected 18 items
tests/test_object_mapper.py .................. [100%]
=================================================================== 18 passed in 0.17s ===================================================================
###If you want to devise more...
Actually, there are some more complicated specifications, so it may be necessary to adjust the specifications according to the project.
__init__
To the method*args: T
, **kwargs: T
Variadic arguments like
-Correspondence to default value
-At the time of object initialization__init__I want you not to do it?OrderedDict
Is it possible to deal with things that have an order such as?However, this time, the purpose is to map the JSON received by API etc. to the class. JSON models usually don't need that complicated requirements.
##Finally
This time, I created an object mapper that instantiates a class from JSON. It's not a very difficult implementation, and you can handle anything by adding deserialization logic each time. I don't think maintenance is that hard.
Type definitions are becoming more and more convenient in python, so there may be something good in the future if you keep this mechanism in mind.
Recommended Posts