loadjson.m

  data=loadjson(fname,opt)
     or
  [data, mmap]=loadjson(fname,'param1',value1,'param2',value2,...)
 
  parse a JSON (JavaScript Object Notation) file or string and return a
  MATLAB data structure with an optional memory-map (mmap) table
 
  authors: Qianqian Fang (q.fang <at> neu.edu)
  created on 2011/09/09, including previous works from
 
          Nedialko Krouchev: http://www.mathworks.com/matlabcentral/fileexchange/25713
             created on 2009/11/02
          François Glineur: http://www.mathworks.com/matlabcentral/fileexchange/23393
             created on  2009/03/22
          Joel Feenstra:
          http://www.mathworks.com/matlabcentral/fileexchange/20565
             created on 2008/07/03
 
  input:
       fname: input file name; if fname contains "{}" or "[]", fname
              will be interpreted as a JSON string
       opt: (optional) a struct to store parsing options, opt can be replaced by
            a list of ('param',value) pairs - the param string is equivalent
            to a field in opt. opt can have the following
            fields (first in [.|.] is the default)
 
            SimplifyCell [1|0]: if set to 1, loadjson will call cell2mat
                          for each element of the JSON data, and group
                          arrays based on the cell2mat rules.
            FastArrayParser [1|0 or integer]: if set to 1, use a
                          speed-optimized array parser when loading an
                          array object. The fast array parser may
                          collapse block arrays into a single large
                          array similar to rules defined in cell2mat; 0 to
                          use a legacy parser; if set to a larger-than-1
                          value, this option will specify the minimum
                          dimension to enable the fast array parser. For
                          example, if the input is a 3D array, setting
                          FastArrayParser to 1 will return a 3D array;
                          setting to 2 will return a cell array of 2D
                          arrays; setting to 3 will return a 2D cell
                          array of 1D vectors; setting to 4 will return a
                          3D cell array.
            UseMap [0|1]: if set to 1, loadjson uses a containers.Map to
                          store map objects; otherwise use a struct object
            ShowProgress [0|1]: if set to 1, loadjson displays a progress bar.
            ParseStringArray [0|1]: if set to 0, loadjson converts "string arrays"
                          (introduced in MATLAB R2016b) to char arrays; if set to 1,
                          loadjson skips this conversion.
            FormatVersion [3|float]: set the JSONLab format version; since
                          v2.0, JSONLab uses JData specification Draft 1
                          for its output format, which is incompatible with
                          all previous releases; if the old output is desired,
                          set FormatVersion to 1.9 or earlier.
            Encoding ['']: JSON file encoding; supports all encodings
                          accepted by the fopen() function
            ObjectID [0|integer or list]: if set to a positive number,
                          it returns the specified JSON object by index
                          in a multi-JSON document; if set to a vector,
                          it returns a list of specified objects.
            JDataDecode [1|0]: if set to 1, call jdatadecode to decode
                          JData structures defined in the JData
                          Specification.
            BuiltinJSON [0|1]: if set to 1, this function first attempts to call
                          jsondecode, if present (MATLAB R2016b or Octave 7
                          and newer). If jsondecode does not exist or fails,
                          this function falls back to the jsonlab parser
            MmapOnly [0|1]: if set to 1, this function only returns mmap
            MMapInclude 'str1' or  {'str1','str2',..}: if provided, the
                          returned mmap will be filtered by only keeping
                          entries containing any one of the string patterns
                          provided in a cell
            MMapExclude 'str1' or  {'str1','str2',..}: if provided, the
                          returned mmap will be filtered by removing
                          entries containing any one of the string patterns
                          provided in a cell
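
The equivalence between an opt struct and a list of ('param',value) pairs described above can be sketched as follows. This is a minimal illustration that assumes JSONLab is on the path; the JSON string and the variable names are made up for this example, and the two calls are expected to give the same result.

```matlab
% Illustrative JSON string (not taken from the JSONLab examples folder)
str = '{"obj":{"string":"value","array":[[1,2],[3,4]]}}';

% Form 1: collect the parsing options in a struct
opt = struct('SimplifyCell', 0, 'FastArrayParser', 0);
dat1 = loadjson(str, opt);

% Form 2: pass the same options as ('param',value) pairs
dat2 = loadjson(str, 'SimplifyCell', 0, 'FastArrayParser', 0);

% Both forms describe the same parsing configuration
same = isequal(dat1, dat2);
```

The struct form is convenient when the same set of options is reused across several calls.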
 
  output:
       data: the parsed data, where {...} blocks are converted into structs
            (or containers.Map objects if UseMap is 1), and [...] blocks are
            converted to arrays or cell arrays
       mmap: (optional) a cell array as memory-mapping table in the form of
              {{jsonpath1,[start,length,<whitespace_pre>]},
               {jsonpath2,[start,length,<whitespace_pre>]}, ...}
            where jsonpath_i is a string in the JSONPath [1,2] format, and
            "start" is an integer referring to the offset from the beginning
            of the stream, and "length" is the JSON object string length.
            An optional 3rd integer "whitespace_pre" may appear to record
            the preceding whitespace length in case expansion of the data
            record is needed when using the mmap.
 
            The format of the mmap table returned from this function
            follows the JSON-Mmap Specification Draft 1 [3] defined by the
            NeuroJSON project, see https://neurojson.org/jsonmmap/draft1/
 
            The memory-mapping table (mmap) is useful for quickly reading or
            writing specific data records inside a large JSON file without
            loading, parsing, or overwriting the entire file.
 
            The JSONPath keys used in mmap are largely compatible with the
            upstream specification defined in [1], with a slight extension
            to handle concatenated JSON files.
 
            In the mmap jsonpath key, a '$' denotes the root object, a '.'
            denotes a child of the preceding element; '.key' points to the
            value segment of the child named "key" of the preceding
            object; '[i]' denotes the (i+1)th member of the preceding
            element, which must be an array. For example, a key
 
            $.obj1.obj2[0].obj3
 
            defines the memory-map of the "value" section in the hierarchy
            below:
              {
                 "obj1":{
                     "obj2":[
                        {"obj3":value},
                        ...
                     ],
                     ...
                  }
              }
            Please note that "value" can be any valid JSON value, including
            an array, an object, a string or numerical value.
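
As a rough sketch of how such a key can be used, the snippet below looks up '$.obj1.obj2[0].obj3' in a table of the documented {{jsonpath,[start,length]},...} form and slices out only that record. The table here is built by hand rather than returned by loadjson, the number 42 stands in for "value", and "start" is assumed to be 1-based so that it can be used directly as a MATLAB index; treat all three as assumptions of this example.

```matlab
% JSON text matching the hierarchy above, with 42 standing in for "value"
json = '{"obj1":{"obj2":[{"obj3":42}]}}';

% Hand-built table in the documented {{jsonpath,[start,length]},...} form.
% Assumption: "start" is 1-based and can be used as a MATLAB string index.
valstart = strfind(json, '42');
mmap = {{'$', [1, length(json)]}, ...
        {'$.obj1.obj2[0].obj3', [valstart, 2]}};

% Find the entry whose key matches, then slice out only that record
key = '$.obj1.obj2[0].obj3';
rec = '';
for i = 1:numel(mmap)
    if strcmp(mmap{i}{1}, key)
        pos = mmap{i}{2};
        rec = json(pos(1):pos(1) + pos(2) - 1);
        break
    end
end
disp(rec)   % prints 42
```

Only the bytes of the selected record are touched, which is the point of the table when the JSON document is large.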
 
            To handle concatenated JSON objects (including ndjson,
            http://ndjson.org/), such as
 
              {"root1": {"obj1": ...}}
              ["root2", value1, value2, {"obj2": ...}]
              {"root3": ...}
 
            we use '$' or '$0' for the first root-object, and '$1' refers
            to the 2nd root object (["root2",...]) and '$2' refers to the
            3rd root object, and so on. Please note that this syntax is an
             extension to the JSONPath syntax documented in [1,2]
 
            [1] https://goessner.net/articles/JsonPath/
            [2] http://jsonpath.herokuapp.com/
            [3] https://neurojson.org/jsonmmap/draft1/
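
The root-object prefix described above can be illustrated with a short helper loop that extracts the root index from a mmap key. This is a sketch of the naming rule only, not code from JSONLab; the sample keys and the variable names are hypothetical.

```matlab
% Sample keys following the '$', '$0', '$1', '$2' convention (hypothetical)
keys = {'$.root1.obj1', '$0.root1', '$1[3].obj2', '$2.root3'};

rootidx = zeros(1, numel(keys));
for i = 1:numel(keys)
    % Match '$' followed by one or more digits at the start of the key
    prefix = regexp(keys{i}, '^\$\d+', 'match', 'once');
    if isempty(prefix)
        rootidx(i) = 0;                          % a bare '$' is the first root object
    else
        rootidx(i) = str2double(prefix(2:end));  % '$1' -> 1, '$2' -> 2, ...
    end
end
disp(rootidx)   % prints 0 0 1 2
```

Here '$1[3].obj2' addresses the value of "obj2" inside the 4th member of the 2nd root object in the concatenated example.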
 
  examples:
       dat=loadjson('{"obj":{"string":"value","array":[1,2,3]}}')
       dat=loadjson(['examples' filesep 'example1.json'])
       [dat, mmap]=loadjson(['examples' filesep 'example1.json'],'SimplifyCell',0)
 
  license:
      BSD or GPL version 3, see LICENSE_{BSD,GPLv3}.txt files for details
 
  -- this function is part of JSONLab toolbox (http://iso2mesh.sf.net/cgi-bin/index.cgi?jsonlab)

Examples

       % loadjson can directly parse a JSON string if it starts with "[" or "{"; here is an empty object
       loadjson('{}')

       % loadjson can also parse complex JSON objects in a string form
       dat=loadjson('{"obj":{"string":"value","array":[1,2,3]}}')
       
       % if the input is a file name, loadjson reads the file and parses the data inside
       dat=loadjson(['examples' filesep 'example1.json'])

       % param/value pairs can be used following the 1st input to customize the parsing behavior
       dat=loadjson(['examples' filesep 'example1.json'],'SimplifyCell',0)

       % if a URL is provided, loadjson reads JSON data from the URL and returns the parsed results,
       % similar to webread, except that loadjson calls jdatadecode to decode JData annotations
       dat=loadjson('https://raw.githubusercontent.com/fangq/jsonlab/master/examples/example1.json')

       % using the 'BuiltinJSON' flag, one can use the built-in jsondecode in MATLAB (R2016b+)
       % or Octave (7.0+) to parse the JSON data for better speed; note that jsondecode encodes
       % key names differently from loadjson
       dat=loadjson('{"_obj":{"string":"value","array":[1,2,3]}}', 'builtinjson', 1)

       % when the JSON data contains long key names, one can use the 'UseMap' flag to request
       % loadjson to store the data in a containers.Map instead of a struct (field names are limited to 63 characters)
       dat=loadjson('{"obj":{"an object with a key longer than 63":"value","array":[1,2,3]}}', 'UseMap', 1)

       % loadjson can further download the linked data pointed by _DataLink_ tag, and merge with the parent
       dat=loadjson('{"obj":{"_DataLink_":"https://raw.githubusercontent.com/fangq/jsonlab/master/examples/example1.json"},"array":[1,2]}','maxlinklevel',1)

       % a JSONPath can be attached to the URL to retrieve a sub element
       dat=loadjson('{"obj":{"_DataLink_":"https://raw.githubusercontent.com/fangq/jsonlab/master/examples/example1.json:$.address.city"},"array":[1,2]}','maxlinklevel',1)

       % loadjson can optionally return a JSON-memory-map object, which defines each JSON element's
       % memory buffer offset and length to enable disk-map like fast read/write operations
       [dat, mmap]=loadjson('{"obj":{"key":"value","array":[1,2,3]}}')

       % if 'mmaponly' is set to 1, loadjson only returns the JSON-mmap structure
       mmap=loadjson('{"obj":{"key":"value","array":[1,2,3]}}', 'mmaponly', 1)