loadjson.m
data=loadjson(fname,opt)
or
[data, mmap]=loadjson(fname,'param1',value1,'param2',value2,...)
parse a JSON (JavaScript Object Notation) file or string and return a
MATLAB data structure with an optional memory-map (mmap) table
authors: Qianqian Fang (q.fang <at> neu.edu)
created on 2011/09/09, including previous works from
Nedialko Krouchev: http://www.mathworks.com/matlabcentral/fileexchange/25713
created on 2009/11/02
François Glineur: http://www.mathworks.com/matlabcentral/fileexchange/23393
created on 2009/03/22
Joel Feenstra:
http://www.mathworks.com/matlabcentral/fileexchange/20565
created on 2008/07/03
input:
fname: input file name; if fname contains "{}" or "[]", fname
will be interpreted as a JSON string
opt: (optional) a struct to store parsing options, opt can be replaced by
a list of ('param',value) pairs - the param string is equivalent
to a field in opt. opt can have the following
fields (first in [.|.] is the default)
SimplifyCell [1|0]: if set to 1, loadjson will call cell2mat
for each element of the JSON data, and group
arrays based on the cell2mat rules.
FastArrayParser [1|0 or integer]: if set to 1, use a
speed-optimized array parser when loading an
array object. The fast array parser may
collapse block arrays into a single large
array similar to rules defined in cell2mat; 0 to
use a legacy parser; if set to a larger-than-1
value, this option will specify the minimum
dimension to enable the fast array parser. For
example, if the input is a 3D array, setting
FastArrayParser to 1 will return a 3D array;
setting to 2 will return a cell array of 2D
arrays; setting to 3 will return a 2D cell
array of 1D vectors; setting to 4 will return a
3D cell array.
UseMap [0|1]: if set to 1, loadjson uses a containers.Map to
store map objects; otherwise use a struct object
ShowProgress [0|1]: if set to 1, loadjson displays a progress bar.
ParseStringArray [0|1]: if set to 0, loadjson converts "string arrays"
(introduced in MATLAB R2016b) to char arrays; if set to 1,
loadjson skips this conversion.
FormatVersion [3|float]: set the JSONLab format version; since
v2.0, JSONLab uses JData specification Draft 1
for its output format, which is incompatible with
all previous releases; if the old output is desired,
please set FormatVersion to 1.9 or earlier.
Encoding ['']: JSON file encoding; all encodings supported by
the fopen() function are accepted
ObjectID [0|integer or list]: if set to a positive number,
it returns the specified JSON object by index
in a multi-JSON document; if set to a vector,
it returns a list of specified objects.
JDataDecode [1|0]: if set to 1, call jdatadecode to decode
JData structures defined in the JData
Specification.
BuiltinJSON [0|1]: if set to 1, this function first attempts to call
jsondecode, if present (MATLAB R2016b or Octave
6). If jsondecode does not exist or fails,
this function falls back to the jsonlab parser
MmapOnly [0|1]: if set to 1, this function only returns mmap
MMapInclude 'str1' or {'str1','str2',..}: if provided, the
returned mmap will be filtered by only keeping
entries containing any one of the string patterns
provided in a cell
MMapExclude 'str1' or {'str1','str2',..}: if provided, the
returned mmap will be filtered by removing
entries containing any one of the string patterns
provided in a cell
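As a minimal sketch of the two calling conventions described above, the snippet below passes the same options once as a struct and once as a list of ('param',value) pairs. The variable names (jsonstr, opt, d1, d2) are illustrative only; the two results are expected to match if the two forms are equivalent, as stated above.

```matlab
jsonstr = '{"obj":{"string":"value","array":[1,2,3]}}';

% options stored as fields of a struct
opt = struct('SimplifyCell', 0, 'FastArrayParser', 0);
d1 = loadjson(jsonstr, opt);

% the same options given as ('param',value) pairs
d2 = loadjson(jsonstr, 'SimplifyCell', 0, 'FastArrayParser', 0);

% both forms should yield the same parsed data
disp(isequal(d1, d2));
```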
output:
data: the parsed data, where {...} blocks are converted into structs
(or containers.Map objects if UseMap is 1), and [...] are
converted to arrays or cell arrays
mmap: (optional) a cell array as memory-mapping table in the form of
{{jsonpath1,[start,length,<whitespace_pre>]},
{jsonpath2,[start,length,<whitespace_pre>]}, ...}
where jsonpath_i is a string in the JSONPath [1,2] format, and
"start" is an integer referring to the offset from the beginning
of the stream, and "length" is the JSON object string length.
An optional 3rd integer "whitespace_pre" may appear to record
the preceding whitespace length in case expansion of the data
record is needed when using the mmap.
The format of the mmap table returned from this function
follows the JSON-Mmap Specification Draft 1 [3] defined by the
NeuroJSON project, see https://neurojson.org/jsonmmap/draft1/
The memory-mapping table (mmap) is useful for quickly reading or
writing specific data records inside a large JSON file without
needing to load, parse, or overwrite the entire file.
The JSONPath keys used in mmap are largely compatible with the
upstream specification defined in [1], with a slight extension
to handle concatenated JSON files.
In the mmap jsonpath key, a '$' denotes the root object, a '.'
denotes a child of the preceding element; '.key' points to the
value segment of the child named "key" of the preceding
object; '[i]' denotes the (i+1)th member of the preceding
element, which must be an array. For example, a key
$.obj1.obj2[0].obj3
defines the memory-map of the "value" section in the
hierarchy below:
{
"obj1":{
"obj2":[
{"obj3":value},
...
],
...
}
}
Please note that "value" can be any valid JSON value, including
an array, an object, a string or numerical value.
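As a minimal sketch of how the [start,length] entries described above could be used, the snippet below parses a JSON string matching the hierarchy shown and prints the text segment behind every mmap key. The variable names (jsonstr, key, loc, segment) are illustrative only, and the code assumes that "start" is a 1-based index into the JSON text; this help text only describes it as an offset from the beginning of the stream, so please verify against your JSONLab version.

```matlab
jsonstr = '{"obj1":{"obj2":[{"obj3":[1,2,3]}]}}';
[dat, mmap] = loadjson(jsonstr);

for i = 1:length(mmap)
    key = mmap{i}{1};   % JSONPath key, e.g. a key ending in .obj3
    loc = mmap{i}{2};   % [start, length, <whitespace_pre>]
    % assumption: start is a 1-based index into jsonstr
    segment = jsonstr(loc(1):loc(1) + loc(2) - 1);
    fprintf('%s -> %s\n', key, segment);
end
```

The same [start,length] pair could, in principle, be used with fseek/fread to read a single record from a large file without parsing the rest of it.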
To handle concatenated JSON objects (including ndjson,
http://ndjson.org/), such as
{"root1": {"obj1": ...}}
["root2", value1, value2, {"obj2": ...}]
{"root3": ...}
we use '$' or '$0' for the first root object, '$1' for the
2nd root object (["root2",...]), '$2' for the 3rd root object,
and so on. Please note that this syntax is an extension to the
JSONPath syntax documented in [1,2]
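The following sketch illustrates the concatenated-JSON case using the three root objects listed above, with the "..." placeholders replaced by made-up numbers. The variable names (ndjson, keys, second) are illustrative only. The ObjectID call assumes that objects are counted from 1, which this help text implies ("positive number") but does not state explicitly.

```matlab
% three root objects, one per line (ndjson style)
ndjson = sprintf('%s\n%s\n%s', ...
    '{"root1":{"obj1":1}}', ...
    '["root2",2,3,{"obj2":4}]', ...
    '{"root3":5}');

[dat, mmap] = loadjson(ndjson);

% list the JSONPath keys, including the root-object prefixes
keys = cellfun(@(x) x{1}, mmap, 'UniformOutput', false);
disp(keys(:));

% assumption: ObjectID is 1-based, so 2 selects the ["root2",...] object
second = loadjson(ndjson, 'ObjectID', 2);
```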
[1] https://goessner.net/articles/JsonPath/
[2] http://jsonpath.herokuapp.com/
[3] https://neurojson.org/jsonmmap/draft1/
examples:
dat=loadjson('{"obj":{"string":"value","array":[1,2,3]}}')
dat=loadjson(['examples' filesep 'example1.json'])
[dat, mmap]=loadjson(['examples' filesep 'example1.json'],'SimplifyCell',0)
license:
BSD or GPL version 3, see LICENSE_{BSD,GPLv3}.txt files for details
-- this function is part of JSONLab toolbox (http://iso2mesh.sf.net/cgi-bin/index.cgi?jsonlab)