
Generating a unique key structure from JSON


cindreta
Solved by requinix


Hi guys,

I've been struggling with reading a JSON document, decoding it, and then trying to extract a UNIQUE key structure from it.

 

So let me show you by example what I'm trying to do here. Let's take a look at this JSON:

{
   "status": true,
   "message": "Everything was ok",
   "posts":[
      {
         "id": 1,
         "post_type": "type_a",
         "posted_on": "",
         "updated_on":"",
         "user":{
            "id": 1,
            "name": ""
         },
         "num_comments": 2,
         "comments":[
            {
               "id": 1,
               "comment":"",
               "commented_on":"",
               "user":{
                  "id": 1,
                  "name": "",
                  "profile_image": ""
               }
            },
            {
               "id": "",
               "comment": "",
               "commented_on": "",
               "user":{
                  "id": "",
                  "name": "",
                  "profile_image":""
               }
            }
         ],
         "message": "",
         "images":[
            {
               "image":"",
               "image_large":""
            }
         ]
      },
      {
         "id": 2,
         "post_type": "type_b",
         "posted_on": "",
         "updated_on":"",
         "notes":{
            "id": 1,
            "name": ""
          },
         "user":{
            "id": 1,
            "name": ""
         },
         "num_comments":"0",
         "images":[
            {
               "image":"",
               "image_large":""
            }
         ]
      }
   ]
}

As you can see from the example, the structure of the two posts varies: one has a comments section and the other doesn't, one has a notes section and the other doesn't, and so on. What I would like to do is extract a UNIQUE JSON structure for this document. I only care about the KEYS, not the values, and I want to show the user what they can expect and what the possibilities are.

 

Here is my code so far:

    function keys_are_equal($array1, $array2, $key) {
        return !array_diff_key($array1, $array2) && !array_diff_key($array2, $array1);
    }

    function prettyParseJson($json) {
        $decoded = json_decode($json, true);
        if($decoded && is_array($decoded)) {
            return recursiveJsonRead($decoded);
        }
    }

    function recursiveJsonRead($data) {
        
        $previous_array = array();
        $unique_array = array();

        foreach ($data as $key => $val) {
              
            if(is_array($val)) {
                if(!empty($val)) {
                  
                    if(keys_are_equal($previous_array, $val, $key)) {
                     
                    } else {

                        $unique_array[$key] = gettype($val);
                        $unique_array[$key] = recursiveJsonRead($val);
                        $previous_array = $val;
                    }
                }

            } else {
                $unique_array[$key] = gettype($val);
            }
        }
        return $unique_array;
    }

As you can see, I'm decoding the JSON and walking through it to extract all keys and their types. Currently it only works if the structure is ALWAYS the same; if it varies, it creates duplicates of keys that are already defined in a previous array. I hope I'm explaining this correctly, but all I want is to load a JSON file and read out its UNIQUE key structure.

 

If someone can help me out it would be great, because I'm losing my mind over it.

 


The keys aren't too helpful without knowing what the data is behind them. Like, what is "post_type"? Is "posted_on" a timestamp or a date string? Which of these can have empty values and which cannot? For types, what about the occasional "id" that is an empty string or an integer?

 

What output are you trying to get? I'm figuring

{
	"status": "bool",
	"message": "string",
	"posts": [
		{
			"id": "int",
			"post_type": "string",
			"posted_on": "string",
			"updated_on": "string",
			"user": {
				"id": "int",
				"name": "string"
			},
			"num_comments": "int",
			"comments": [
				{
					"id": "int",
					"comment": "string",
					"commented_on": "string",
					"user": {
						"id": "int",
						"name": "string",
						"profile_image": "string"
					}
				}
			],
			"message": "string",
			"images": [
				{
					"image": "string",
					"image_large": "string"
				}
			],
			"notes": {
				"id": "int",
				"name": "string"
			}
		}
	]
}

I'll come back with code later (got stuff to do) but some talking points:

 

- Decoding using objects and not associative arrays - arrays and objects have different meanings in JSON and you'll need to know which is which

- Traverse the source and storage arrays at the same time, recursively

- If the source value is scalar then use its type in storage if there isn't already one

- If the source value is an array then set/reuse a [0] in storage and traverse each item [n] in the source subarray with storage's [0], as in

foreach ($source as $item) {
	recurse($item, $storage[0]); // each $source[$n] but always $storage[0]
}
- If the source value is an object then recurse as expected
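To illustrate the first point, here is a minimal sketch of how the two decoding modes differ. The sample JSON string and the variable names are made up for this example and are not taken from the data in the question.

```php
<?php
// Hypothetical sample input, chosen only to show the two container types.
$json = '{"user": {"id": 1}, "tags": ["a", "b"], "empty_obj": {}, "empty_list": []}';

// Decoding to associative arrays: JSON objects and JSON lists both become PHP arrays.
$asArrays = json_decode($json, true);
var_dump(is_array($asArrays['user']));   // bool(true)
var_dump(is_array($asArrays['tags']));   // bool(true)
// An empty object and an empty list are indistinguishable in this mode.
var_dump($asArrays['empty_obj'] === $asArrays['empty_list']); // bool(true)

// Decoding to objects (the default): JSON objects become stdClass, lists stay arrays.
$asObjects = json_decode($json);
var_dump(is_object($asObjects->user));       // bool(true)
var_dump(is_array($asObjects->tags));        // bool(true)
var_dump(is_object($asObjects->empty_obj));  // bool(true)
var_dump(is_array($asObjects->empty_list));  // bool(true)
```

With object decoding, is_object() and is_array() are enough to tell a JSON object from a JSON list, which is what the recursion needs.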

  • Solution
function parse_structure($source, $output = null) {
    if (is_object($source)) {
        // objects are easy: recurse over all the properties
        $output || $output = [];
        foreach (get_object_vars($source) as $key => $value) {
            $output[$key] = parse_structure($value, isset($output[$key]) ? $output[$key] : null);
        }
        return $output;
    } else if (is_array($source)) {
        // arrays should contain all of the same data type
        $output || $output = [];
        if (is_scalar($source[0])) {
            // foreach on the array would be wasteful - just check the first item
            isset($output[0]) || $output[0] = gettype($source[0]);
        } else {
            // recurse over all the items and put the merged structure into [0]
            isset($output[0]) || $output[0] = [];
            foreach ($source as $arrayitem) {
                $output[0] = parse_structure($arrayitem, $output[0]);
            }
        }
        return $output;
    } else {
        // if $output already has a type then keep it
        return $output ?: gettype($source);
    }
}

$output = parse_structure(json_decode($json) /* decode with objects */);

Just tested and it works GREAT! Thank you so much. I tested on two JSON documents, and on one of them I get a NOTICE on this line:

if (is_scalar($source[0])) {

saying:

Notice: Undefined offset: 0

But on the other example it worked great. Wow, this will help me so much. You are a true magician : D
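The notice most likely comes from an empty JSON list ([]) somewhere in the input, since $source[0] does not exist in that case. A possible workaround, sketched below, is to return early for empty arrays. The name parse_structure_guarded and the sample JSON are made up for this example; the early return is an addition and not part of the posted solution.

```php
<?php
// Variant of parse_structure() with one extra guard for empty lists.
// The guard is an assumption about the cause of the notice, not the original code.
function parse_structure_guarded($source, $output = null) {
    if (is_object($source)) {
        // recurse over all properties, merging into what is already known
        $output || $output = [];
        foreach (get_object_vars($source) as $key => $value) {
            $output[$key] = parse_structure_guarded($value, isset($output[$key]) ? $output[$key] : null);
        }
        return $output;
    } else if (is_array($source)) {
        $output || $output = [];
        if (!$source) {
            // empty list: there is no [0] to inspect, so keep what is already known
            return $output;
        }
        if (is_scalar($source[0])) {
            isset($output[0]) || $output[0] = gettype($source[0]);
        } else {
            // merge the structure of every item into [0]
            isset($output[0]) || $output[0] = [];
            foreach ($source as $arrayitem) {
                $output[0] = parse_structure_guarded($arrayitem, $output[0]);
            }
        }
        return $output;
    } else {
        // keep an existing type if there is one
        return $output ?: gettype($source);
    }
}

// Hypothetical input: the first post has an empty "comments" list.
$json = '{"posts":[{"id":1,"comments":[]},'
      . '{"id":2,"comments":[{"id":5,"text":"hi"}],"notes":{"name":"x"}}]}';
$structure = parse_structure_guarded(json_decode($json));
echo json_encode($structure, JSON_PRETTY_PRINT), "\n";
```

Note that gettype() reports "integer", "boolean" and "double" rather than "int", "bool" and "float", so the type names in the output differ slightly from the hand-written structure earlier in the thread.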

 

PS. While I was extensively Googling I found this tool: https://jsonschema.net/#/editor which goes a step further and generates a JSON schema based on the JSON input. That is also good, but from what I can see it uses some kind of API from Google which I can't seem to find.

