[Solved] Go deserialization when type is not known


TL;DR

Just use json.Unmarshal. You can wrap it lightly around your transport: call json.Unmarshal on your prebuilt JSON bytes and the v interface{} argument from your caller (or, with a json.Decoder instance d that reads from your transport, call d.Decode(v)).
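As a minimal sketch of the json.Decoder variant, assuming the transport can be exposed as an io.Reader (the point type and the decodeFrom and decodeString helpers are hypothetical names, not part of any library):

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// point is a hypothetical payload type used only for illustration.
type point struct{ X, Y int }

// decodeFrom reads one JSON value from r and stores it into v,
// which must be a pointer supplied by the caller.
func decodeFrom(r io.Reader, v interface{}) error {
	return json.NewDecoder(r).Decode(v)
}

// decodeString is a convenience wrapper for trying decodeFrom on a string.
func decodeString(s string, v interface{}) error {
	return decodeFrom(strings.NewReader(s), v)
}

func main() {
	var p point
	if err := decodeString(`{"X": 1, "Y": 2}`, &p); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(p.X, p.Y) // 1 2
}
```

The wrapper never needs to know the concrete type: it only passes v through.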

Somewhat longer, with an example

Consider how json.Unmarshal does its own magic. Its first argument is the JSON (data []byte), but its second argument is of type interface{}:

func Unmarshal(data []byte, v interface{}) error

As the documentation goes on to say, if v really is just an interface{}:

To unmarshal JSON into an interface value, Unmarshal stores one of these in the interface value:

bool, for JSON booleans
float64, for JSON numbers
string, for JSON strings
[]interface{}, for JSON arrays
map[string]interface{}, for JSON objects
nil for JSON null

but if v has an underlying concrete type, such as a pointer to type myData struct { ... }, it’s much fancier: it fills in that value directly. It only does the above if the value that v points to has type interface{}.
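The difference can be seen in a small sketch. The myData type and the describe and kindOf helpers are made up for this illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// myData is a hypothetical target type used only for illustration.
type myData struct {
	Field1 int
	Field2 string
}

// describe names the dynamic type that json.Unmarshal stored in v.
func describe(v interface{}) string {
	switch x := v.(type) {
	case nil:
		return "nil"
	case bool:
		return "bool"
	case float64:
		return "float64"
	case string:
		return "string"
	case []interface{}:
		return "[]interface{}"
	case map[string]interface{}:
		return "map[string]interface{}"
	default:
		return fmt.Sprintf("%T", x)
	}
}

// kindOf decodes data into a bare interface{} and reports what came out.
func kindOf(data []byte) (string, error) {
	var generic interface{}
	if err := json.Unmarshal(data, &generic); err != nil {
		return "", err
	}
	return describe(generic), nil
}

func main() {
	data := []byte(`{"Field1": 42, "Field2": "hello"}`)

	// Target is a bare interface{}: we get the generic representation.
	kind, err := kindOf(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(kind) // map[string]interface{}

	// Target is a concrete struct: the fields are filled in directly.
	var typed myData
	if err := json.Unmarshal(data, &typed); err != nil {
		panic(err)
	}
	fmt.Println(typed.Field1, typed.Field2) // 42 hello
}
```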

Its actual implementation is particularly complex because it’s optimized to do the de-JSON-ification and the assignment into the target object at the same time. In principle, though, it’s mostly a big type-switch on the underlying (concrete) type of the interface value.

Meanwhile, what you are describing in your question is that you will first deserialize into generic JSON—which really means a variable of type interface{}—and then do your own assignment out of this pre-decoded JSON into another variable of type interface{}, where the type signature of your own decoder would be:

func xxxDecode(/* maybe some args here, */ v interface{}) error {
    var predecoded interface{}

    // get some json bytes from somewhere into variable `data`
    if err := json.Unmarshal(data, &predecoded); err != nil {
        return err
    }

    // now emulate json.Unmarshal by getting field names and assigning
    // ... this is the hard part ...
    return nil
}

and you would then call this code by writing:

type myData struct {
    Field1 int    `xxx:"Field1"`
    Field2 string `xxx:"Field2"`
}

so that you know that JSON object key “Field1” should fill in your Field1 field with an integer, and JSON object key “Field2” should fill in your Field2 field with a string:

func whatever() {
    var x myData
    err := xxxDecode(..., &x)
    if err != nil { ... handle error ... }
    ... use x.Field1 and x.Field2 ...
}

But this is silly. You can just write:

type myData struct {
    Field1 int    `json:"Field1"`
    Field2 string `json:"Field2"`
}

(or even omit the tags, since the field names are the default JSON keys), and then do this:

func xxxDecode(..., v interface{}) error {
    ... get data bytes as before ...
    return json.Unmarshal(data, v)
}

In other words, just let json.Unmarshal do all the work by providing json tags in the data structures in question. You still transmit the JSON data bytes across your special transport: json.Marshal produces them on the sending side and json.Unmarshal consumes them on the receiving side. You do the transmitting and receiving. json.Marshal and json.Unmarshal do all the hard work: you don’t have to touch it!
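Here is a runnable sketch of that arrangement. It assumes the transport can be modeled as an io.Writer on one side and an io.Reader on the other; a bytes.Buffer stands in for the real transport, and xxxEncode and roundTrip are hypothetical helpers added for the demonstration:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

type myData struct {
	Field1 int    `json:"Field1"`
	Field2 string `json:"Field2"`
}

// xxxEncode marshals v and writes the bytes to w, which stands in
// for the sending side of the custom transport.
func xxxEncode(w io.Writer, v interface{}) error {
	data, err := json.Marshal(v)
	if err != nil {
		return err
	}
	_, err = w.Write(data)
	return err
}

// xxxDecode reads all bytes from r (the receiving side of the transport)
// and lets json.Unmarshal fill in whatever v points to.
func xxxDecode(r io.Reader, v interface{}) error {
	data, err := io.ReadAll(r)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, v)
}

// roundTrip sends in through an in-memory "transport" and decodes it again.
func roundTrip(in myData) (myData, error) {
	var wire bytes.Buffer
	var out myData
	if err := xxxEncode(&wire, in); err != nil {
		return out, err
	}
	err := xxxDecode(&wire, &out)
	return out, err
}

func main() {
	out, err := roundTrip(myData{Field1: 7, Field2: "seven"})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(out.Field1, out.Field2) // 7 seven
}
```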

It’s still fun to see how json.Unmarshal works

Jump down to around line 660 of encoding/json/decode.go, where you will find the thing that handles a JSON “object” ({ followed by either } or a string that represents a key), for instance:

func (d *decodeState) object(v reflect.Value) error {

There are some mechanics to handle corner cases (including the fact that v might not be settable and/or might be a pointer that should be followed), then it makes sure that v is either a map[T1]T2 or struct, and if it is a map, that it’s suitable—that both T1 and T2 will work when decoding the “key”:value items in the object.

If all goes well, it gets into the JSON key-and-value scanning loop starting at line 720 (for {, which will break or return as appropriate). On each trip through this loop, the code reads the JSON key first, leaving the : and value part for later.

If we’re decoding into a struct, the decoder now uses the struct’s fields (their names and json:"..." tags) to find a reflect.Value that we’ll use to store right into the field (see footnote 1). This is subv, found by calling v.Field(i) for the right i, with some slightly complicated goo to handle embedded anonymous structs and pointer-following. The core of this is just subv = v.Field(i), though, where i is whichever field this key names, within the struct. So subv is now a reflect.Value that represents the field within the actual struct instance, which we should set once we’ve decoded the value part of the JSON key-value pair.
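A stripped-down sketch of that lookup follows. It is not the library's code: fieldForKey is a hypothetical helper, and it ignores embedded structs, pointer-following, unexported fields, the "-" tag, and the case-insensitive matching that the real decoder performs.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

type myData struct {
	Field1 int `json:"field_one"`
	Field2 string
}

// fieldForKey returns the settable reflect.Value (the "subv") of the struct
// field that key names. ptr must be a non-nil pointer to a struct.
func fieldForKey(ptr interface{}, key string) (reflect.Value, bool) {
	v := reflect.ValueOf(ptr).Elem() // follow the pointer so fields are settable
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name := f.Name // default: the Go field name
		if tag, ok := f.Tag.Lookup("json"); ok {
			if tagName := strings.Split(tag, ",")[0]; tagName != "" {
				name = tagName // the tag overrides the field name
			}
		}
		if name == key {
			return v.Field(i), true
		}
	}
	return reflect.Value{}, false
}

func main() {
	var x myData
	if subv, ok := fieldForKey(&x, "field_one"); ok {
		subv.SetInt(42)
	}
	if subv, ok := fieldForKey(&x, "Field2"); ok {
		subv.SetString("hello")
	}
	fmt.Println(x.Field1, x.Field2) // 42 hello
}
```

Setting through subv changes x itself, because v was obtained from a pointer to x.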

If we’re decoding into a map, we will decode the value into a temporary first, then store it into the map after decoding. It would be nice to share this with the struct-field storing, but we need a different reflect function to do the store into the map: v.SetMapIndex, where v is the reflect.Value of the map. That’s why for a map, subv points to a temporary Elem.
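The map case can be sketched like this. The storeInMap helper is hypothetical and, to stay short, only handles a map[string]int target:

```go
package main

import (
	"fmt"
	"reflect"
)

// storeInMap sets m[key] = value through reflection.
// mapPtr must be a pointer to a map[string]int (an assumption of this sketch).
func storeInMap(mapPtr interface{}, key string, value int64) {
	v := reflect.ValueOf(mapPtr).Elem()
	if v.IsNil() {
		// Allocate the map on first use.
		v.Set(reflect.MakeMap(v.Type()))
	}
	// Map elements are not addressable, so fill in a temporary element first...
	subv := reflect.New(v.Type().Elem()).Elem()
	subv.SetInt(value)
	// ...then copy it into the map with SetMapIndex.
	v.SetMapIndex(reflect.ValueOf(key), subv)
}

func main() {
	var m map[string]int
	storeInMap(&m, "a", 1)
	storeInMap(&m, "b", 2)
	fmt.Println(len(m), m["a"], m["b"]) // 2 1 2
}
```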

We’re now ready to convert the actual value to the target type, so we go back to the JSON bytes and consume the colon : character and read the JSON value. We get the value and store it into our storage location (subv). This is the code starting at line 809 (if destring {). The actual assigning is done through the decoder functions (d.literalStore at line 908, or d.value at line 412) and these actually decode the JSON value while doing the storing. Note that only d.literalStore really stores the value—d.value calls on d.array, d.object, or d.literalStore to do the work recursively if needed.

d.literalStore therefore contains many switch v.Kind()s: it parses a null, a true or false, a number, or a string, then makes sure it can store the resulting value into v, and chooses how to store it based on the combination of what it just decoded and the actual v.Kind(). So there’s a bit of a combinatorial explosion here, but it gets the job done.
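To give a feel for that literal-type-by-Kind dispatch, here is a much smaller sketch. It is not the library's implementation: storeLiteral and storeInto are hypothetical, and they take an already-decoded Go value, whereas the real code works directly from the JSON bytes and covers many more kinds.

```go
package main

import (
	"fmt"
	"math"
	"reflect"
)

// storeLiteral stores an already-decoded JSON literal into v, choosing the
// setter from the combination of the literal's type and v.Kind().
func storeLiteral(lit interface{}, v reflect.Value) error {
	// An empty-interface target accepts any non-null literal as-is.
	if lit != nil && v.Kind() == reflect.Interface && v.NumMethod() == 0 {
		v.Set(reflect.ValueOf(lit))
		return nil
	}
	switch x := lit.(type) {
	case nil:
		// null zeroes nillable kinds and leaves everything else unchanged.
		switch v.Kind() {
		case reflect.Interface, reflect.Ptr, reflect.Map, reflect.Slice:
			v.Set(reflect.Zero(v.Type()))
		}
		return nil
	case bool:
		if v.Kind() == reflect.Bool {
			v.SetBool(x)
			return nil
		}
	case string:
		if v.Kind() == reflect.String {
			v.SetString(x)
			return nil
		}
	case float64:
		switch v.Kind() {
		case reflect.Float32, reflect.Float64:
			v.SetFloat(x)
			return nil
		case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
			// Only whole numbers that fit the target are accepted.
			if x == math.Trunc(x) && !v.OverflowInt(int64(x)) {
				v.SetInt(int64(x))
				return nil
			}
		}
	}
	return fmt.Errorf("cannot store %T (%v) into Go value of type %s", lit, lit, v.Type())
}

// storeInto is a convenience wrapper: ptr must be a non-nil pointer.
func storeInto(lit interface{}, ptr interface{}) error {
	return storeLiteral(lit, reflect.ValueOf(ptr).Elem())
}

func main() {
	var n int
	var s string
	if err := storeInto(float64(42), &n); err != nil {
		fmt.Println(err)
	}
	if err := storeInto("hi", &s); err != nil {
		fmt.Println(err)
	}
	fmt.Println(n, s) // 42 hi

	// A mismatched combination is rejected.
	fmt.Println(storeInto("oops", &n) != nil) // true
}
```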

If all that worked, and we’re decoding to a map, we may now need to massage the type of the temporary, find the real key, and store the converted value into the map. That’s what lines 830 (if v.Kind() == reflect.Map {) through the final close brace at 867 are about.


Footnote 1: To find fields, we first look over at encoding/json/encode.go to find cachedTypeFields. It is a caching version of typeFields, which is where the json tags are found and put into a slice. The result is cached in a map indexed by the reflect-type value of the struct type. So what we get is a slow lookup the first time we use a struct type, then a fast lookup afterwards, to get a slice of information about how to do the decoding. This slice of information maps from json-tag-or-field name to: field; type; whether it’s a sub-field of an anonymous structure; and so on: everything we will need to know to decode it properly, or to encode it on the encoding side. (I didn’t really look closely at this code.)
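The caching idea can be sketched as below. This only borrows the names typeFields and cachedTypeFields from the description above; the bodies, the fieldInfo layout, the choice of sync.Map, and the scan counter are assumptions made for the illustration, not the library's actual code.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"sync"
	"sync/atomic"
)

type myData struct {
	Field1 int `json:"field_one"`
	Field2 string
}

// fieldInfo is a reduced, hypothetical version of the per-field record.
type fieldInfo struct {
	name  string // JSON key: the tag name if present, else the Go field name
	index int    // position to pass to reflect.Value.Field
}

var (
	fieldCache sync.Map // reflect.Type -> []fieldInfo
	scans      int32    // how often the slow path ran (for the demo only)
)

// typeFields is the slow path: it walks the struct type and reads the tags.
func typeFields(t reflect.Type) []fieldInfo {
	atomic.AddInt32(&scans, 1)
	var fields []fieldInfo
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.PkgPath != "" {
			continue // skip unexported fields
		}
		name := f.Name
		if tagName := strings.Split(f.Tag.Get("json"), ",")[0]; tagName != "" {
			name = tagName
		}
		fields = append(fields, fieldInfo{name: name, index: i})
	}
	return fields
}

// cachedTypeFields is the fast path: one scan per struct type, then a map hit.
func cachedTypeFields(t reflect.Type) []fieldInfo {
	if f, ok := fieldCache.Load(t); ok {
		return f.([]fieldInfo)
	}
	f, _ := fieldCache.LoadOrStore(t, typeFields(t))
	return f.([]fieldInfo)
}

// fieldNames lists the JSON keys for the struct value v.
func fieldNames(v interface{}) []string {
	var names []string
	for _, f := range cachedTypeFields(reflect.TypeOf(v)) {
		names = append(names, f.name)
	}
	return names
}

// scanCount reports how many slow-path scans have happened so far.
func scanCount() int { return int(atomic.LoadInt32(&scans)) }

func main() {
	fmt.Println(fieldNames(myData{})) // [field_one Field2]
	before := scanCount()
	fieldNames(myData{})
	fmt.Println(scanCount() == before) // true: the second lookup hit the cache
}
```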
