JSON Graph Entry Format (GEF)

How to interpret various JSON constructs as a graph, mapped to {cj}.

Date

2025-07-14

Editor

Dr. Max Völkel

Status

Working Draft

Git

https://github.com/Calpano/connected-json.git

1. Introduction

1.1. Goals and Motivation

Real-world JSON comes in many dialects and variants. This document defines how to interpret various JSON structures unambiguously as a graph of nodes and edges. For clarity, the interpretation as {cj} is given. Many formats in the wild are already quite close to {cj} anyway. Graph Entry Format (GEF) is a superset of JSON Graph Format.

Unlike GraphML and {cj}, this format lets data be attached directly to elements alongside the core graph structure: any element can carry arbitrary additional properties, and no intermediate data object is required.

As this format is meant for relaxed parsing of various CJ-like JSON dialects, it does not define rules for stream processing. See Notes on Streaming JSON and especially Streaming in CJ for details.

1.2. Example

Every {cj} file is also valid Graph Entry Format (GEF), so the example in the CJ specification applies here as well.

Here is a simple example showcasing some additional features:

Graph Entry Format (GEF) Example File
{
  "nodes": [
    // can use a number for an ID
    { "id":  12 },
    { "id":  "a",
      "ports": [ "a1",
                 { "id": "a2",
                   "ports": [ "a2-1", "a2-2" ]
    }]},
    { "id":  "b", "foo": "bar" },
    { "id":  "c" },
    { "id":  "d" },
    { "id":  "e" },
    { "id":  "f" }
  ],
  // Any JSON can be placed right here, without a `data` wrapper
  "hello": ["My data","can be","here"],
  "edges": [
    // source & target shortcuts
    { "source": 12, "target": "a"},
    // normal CJ endpoints
    { "endpoints":  [
      { "direction": "in", "node": 12},
      { "direction": "out", "node": "a", "port": "a2-1"}
    ]},

    // ports only available in endpoints syntax
    { "endpoints":  [
      { "direction": "in", "node": 12},
      { "direction": "in", "node": "a", "port": "a2-1" },
      { "direction": "out", "node": "d"},
      { "direction": "out", "node": "e"}
    ]},

    // mixed edges only available in endpoints syntax
    { "endpoints":  [
      { "direction": "in", "node": 12},
      { "direction": "in", "node": "a"},
      { "direction": "out", "node": "d"},
      { "direction": "out", "node": "e"},
      { "direction": "undir", "node": "f"}
    ]}
  ]
}

1.3. Change Log

2025-07-14: Version 5.0.0
  • Split spec into two parts: {cj} for writing strict files, where there is always exactly one way to encode a structure, and Graph Entry Format (GEF), which is much more liberal and flexible in parsing.

  • Moved edgeDefault to Graph Entry Format (GEF).

2. Overview

Graph Entry Format (GEF) uses the same conceptual model as {cj}, but it maps a broader range of JSON structures to the graph model. It is a superset of {cj}.

2.1. Conventions

Where a property is copied over from the {cj} specification with exactly the same semantics, we use the symbol → to indicate this, followed by a link to the documentation in the CJ spec, prefixed with CJ:.

2.2. Polymorphism

Graph Entry Format (GEF) uses a kind of 'JSON polymorphism': a property may accept several JSON types, and the interpretation of its value depends on the type used.

  • For example, in Graph Entry Format (GEF) the property label is either a JSON string or a JSON object (multilingual like in {cj}). Interpretation is defined per property.

  • Some properties allow for either a JSON object or a JSON array with multiple such objects. Example: graph and graphs. Their interpretation is described below.

2.2.1. Object vs. Array

When an object-vs-array property holds a single object, it is interpreted as an implicit array with exactly one member, the stated object.

Graph Entry Format (GEF) Input
{
  "graphs": {
    /* graph A */
  }
}
{cj} Interpretation
{
  "graphs": [
    {
      /* graph A */
    }
  ]
}
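
The normalization above can be sketched in a few lines of Python. This is only an illustration, not part of the spec: the helper name `as_array` is hypothetical, and treating a missing property as an empty array is an assumption.

```python
def as_array(value):
    """Normalize an object-vs-array property value to a list.

    Assumption: a missing property (None) is treated as an empty array.
    """
    if value is None:
        return []
    if isinstance(value, list):
        return value
    # A single object becomes an implicit array with one member.
    return [value]


gef_input = {"graphs": {"id": "graph-A"}}
cj_like = {"graphs": as_array(gef_input["graphs"])}
print(cj_like)
```

A value that is already an array is returned unchanged, so the helper can be applied to every polymorphic property without checking its type first.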

3. Elements

Every element can carry arbitrary JSON properties besides the ones interpreted by this spec. The interplay with the explicit data property is described in Data.

3.1. Document

For the default document structure, see {cj-document}.

Root Object as Graph Object

In Graph Entry Format (GEF), a graph can be stated at the root level, without using a graph property. Using a graph property is additionally allowed, effectively creating subgraphs. If any Graph-level attributes are used, a synthetic root graph is created. The graph-level properties are applied to this synthetic root graph. If subgraphs are created, those properties inherit downwards as usual.

Table 1. Property Table
| Property | Type | Description |
|---|---|---|
| baseUri | string(URI) | Optional. Document-level. Documented in {cj-document}. |
| connectedJson | object | Optional. Document-level. Documented in CJ Document Metadata. |
| compoundNode | boolean | Optional. Graph-level. Documented in Graph. |
| edgeDefault | string | Optional. Graph-level. Documented in Graph. |
| edges | array(Edge []) | Optional. Graph-level. Documented in {cj-graph}. |
| graph, graphs | object(Graph) or array(Graph []) | Optional. Document-level. Can also be used inside graphs. Documented in {cj-document} and {cj-graph}. |
| id | string or number (integer) | Optional. Graph-level. Documented in {cj-graph}. A number is converted to string to be used as ID. |
| label | object | Optional. Graph-level. Documented in {cj-graph}. |
| label | string | Optional. Graph-level. See Label. |
| connectedJson | object(Document Metadata) | Optional. Graph Metadata. |
| nodes | array(Node []) | Optional. Graph-level. Documented in {cj-graph}. |

graph and graphs

If both are used, a single array is constructed with the entry of graph first, followed by the entries of graphs. The whole array is then considered the value of graphs.

The pattern of allowing either (a) a single object or (b) an array of such objects is used throughout the spec.
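
As a rough sketch of the merge rule for graph and graphs (entries of graph first, then the entries of graphs), assuming a hypothetical helper named `merge_graphs` that works on an already parsed JSON object:

```python
def merge_graphs(obj):
    """Combine the `graph` and `graphs` properties into a single `graphs` array."""

    def as_list(value):
        # Hypothetical helper: single object -> one-element list, missing -> empty list.
        if value is None:
            return []
        return value if isinstance(value, list) else [value]

    # Entries of `graph` come first, followed by the entries of `graphs`.
    merged = as_list(obj.get("graph")) + as_list(obj.get("graphs"))
    result = {k: v for k, v in obj.items() if k not in ("graph", "graphs")}
    if merged:
        result["graphs"] = merged
    return result


doc = {"graph": {"id": "g1"}, "graphs": [{"id": "g2"}, {"id": "g3"}]}
print(merge_graphs(doc))
```

The function returns a new object and leaves the input untouched, which keeps the original GEF input available for warnings or debugging.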

3.2. ID

In all places where a {cj-id} can be used, Graph Entry Format (GEF) also allows the ID to be stated as a JSON number. This number is converted at parse-time into the corresponding string. E.g., the number 3 is converted to the string id "3". Floating-point numbers or negative numbers are not allowed.
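
A minimal sketch of this conversion, assuming the JSON has been parsed with Python's standard `json` module (so JSON integers arrive as `int` and floating-point numbers as `float`). The function name `normalize_id` and the choice to raise `ValueError` are assumptions for illustration; the spec does not prescribe error handling.

```python
def normalize_id(value):
    """Convert a GEF id (string or non-negative integer) to a string id."""
    if isinstance(value, str):
        return value
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        if value < 0:
            raise ValueError(f"negative numbers are not allowed as id: {value!r}")
        return str(value)
    raise ValueError(f"id must be a string or a non-negative integer: {value!r}")


print(normalize_id(3))
print(normalize_id("a2-1"))
```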

3.3. Label

Shorthand

A label definition like "label": "Hello, World" has no language information. It is interpreted as { "value": x } with x being the string value. In other words: A {cj-label} without language.
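
The label shorthand can be sketched as follows. The function name `normalize_label` is hypothetical; an object-valued label is assumed to be passed through unchanged.

```python
def normalize_label(label):
    """Expand the string shorthand for a label into { "value": x }."""
    if isinstance(label, str):
        # A plain string carries no language information.
        return {"value": label}
    # Assumption: anything else is already in its {cj} form.
    return label


print(normalize_label("Hello, World"))
```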

3.4. Graph

For the default graph structure, see Graph in {cj}. Graph Entry Format (GEF) adds some aliases.

Table 2. Property Table
| Property | Type | Description |
|---|---|---|
| compoundNode | boolean | Optional. Default: false. If true, and a graph directly contains another graph, the subgraph is interpreted as a synthetic compound node, using the graph id and label. The handling of compound nodes is application-specific. See Nested Graph as Compound Node below. This is similar to DOT syntax using cluster as a prefix for subgraphs: having a cluster prefix is like "compoundNode": true. |
| edgeDefault | string | Optional. Default is directed. Defines the default directedness for edges in this graph and all subgraphs. See Edge Direction. |
| id | string | → {cj-graph}.id: a {cj-id} |
| id | number (integer) | Is converted to string and interpreted as a {cj-id}. |
| label | object | → {cj-graph}.label: A {cj-label}. |
| label | string | Optional. See Label. |
| meta | object(Graph Metadata) | Optional. See Graph Metadata. |
| nodes | array(Node []) | 0 to n nodes. Default: Empty. |
| edges | array(Edge []) | → {cj-graph}.edges: An array of {cj-edge} |
| graph | object(Graph) | graph allowed as alias for graphs |
| graph | array(Graph []) | graph allowed as alias for graphs |
| graphs | object(Graph) | Polymorphic: object or array. See Object vs. Array. |
| graphs | array(Graph []) | → {cj-graph}.graphs: An array of {cj-graph}s. |

3.4.1. Nested Graph as Compound Node

The graph-level compoundNode property can be used on a graph (or document) to force processing as if the subgraph had been nested in a node.

Inheritance

The compoundNode setting is inherited downwards from parent to child graphs, for all kinds of nesting. Child graphs may override a parent setting; in that case, the overriding value is inherited downwards.

Table 3. Example for Nested Graph as Compound Node
Graph Entry Format (GEF) Input
{
  "graphs": [{
    "id": "graph-A",
    "nodes": [{ "id": "node-A1" }],
    "graphs": [{
        "compoundNode": true,
        "id": "graph-B",
        "label": {
          "en": "Graph B"
        },
        "nodes": [
          { "id": "node-B1" },
          { "id": "node-B2" }
        ]
    }]
  }]
}
{cj} Interpretation
{
  "graphs": [{
    "id": "graph-A",
    "nodes": [
      { "id": "node-A1" },
      { "id": "graph-B",
        "graphs": [{
            "compoundNode": true,
            "id": "graph-B",
            "label": {
              "en": "Graph B"
            },
            "nodes": [
              { "id": "node-B1" },
              { "id": "node-B2" }
            ]
        }]
      }
    ]
  }]
}
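
One possible reading of this transformation is sketched below. It is an illustration under assumptions, not the normative algorithm: the function name `apply_compound_nodes` is hypothetical, graph and graphs are assumed to be already merged into a graphs array, each subgraph is assumed to have an id, and the synthetic node carries only the id (as in Table 3), so copying the label is left out.

```python
def apply_compound_nodes(graph, inherited=False):
    """Wrap child graphs whose effective compoundNode is true in synthetic nodes.

    `inherited` is the effective compoundNode value of the parent graph.
    Returns a new object; node objects are shared with the input.
    """
    # A graph's own setting overrides the inherited one.
    own = graph.get("compoundNode", inherited)
    result = {k: v for k, v in graph.items() if k != "graphs"}
    nodes = list(graph.get("nodes", []))
    kept_graphs = []
    for child in graph.get("graphs", []):
        child_effective = child.get("compoundNode", own)
        converted = apply_compound_nodes(child, own)
        if child_effective:
            # Synthetic compound node, reusing the subgraph id (assumed present).
            nodes.append({"id": child["id"], "graphs": [converted]})
        else:
            kept_graphs.append(converted)
    if nodes:
        result["nodes"] = nodes
    if kept_graphs:
        result["graphs"] = kept_graphs
    return result


gef = {
    "graphs": [{
        "id": "graph-A",
        "nodes": [{"id": "node-A1"}],
        "graphs": [{
            "compoundNode": True,
            "id": "graph-B",
            "label": {"en": "Graph B"},
            "nodes": [{"id": "node-B1"}, {"id": "node-B2"}],
        }],
    }]
}
print(apply_compound_nodes(gef))
```

Passing the parent's effective value down the recursion is what implements the inheritance rule: a child without its own compoundNode falls back to whatever its nearest ancestor stated.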

3.5. Node

For the default node structure, see Node in {cj}.

Table 4. Property Table
| Property | Type | Description |
|---|---|---|
| id | string | → {cj-node}.id: a {cj-id} |
| id | number (integer) | Is converted to string and interpreted as a {cj-id}. |
| label | string | Interpreted as { "value": x } with x being the string value. In other words: A {cj-label} without language. |
| ports | array(Port []) | → {cj-node}.ports: an array of {cj-port} |
| graphs | array(Graph []) | → {cj-node}.graphs: An array of {cj-graph}s. |
| graph | object(Graph) | graph allowed as alias for graphs |
| graph | array(Graph []) | graph allowed as alias for graphs |
| graphs | object(Graph) | Polymorphic: object or array. See Object vs. Array. |

3.6. Port

For the default port structure, see Port in {cj}.

| Property | Type | Description |
|---|---|---|
| id | string | → {cj-port}.id: a {cj-id} |
| id | number | See ID |
| label | object | → {cj-port}.label: A {cj-label}. |
| label | string | See Label |
| ports | array(Port []) | → {cj-port}.ports: An array of {cj-port}s. |

3.7. Edge

For the default edge structure, see Edge in {cj}.

| Property | Type | Description |
|---|---|---|
| id | string | → {cj-edge}.id: a {cj-id} |
| id | number | See ID |
| directed | boolean | Default: true. See Edge Direction and Graph.edgeDefault. |
| label | object | → {cj-edge}.label: A {cj-label}. |
| label | string | See Label |
| type | string | → {cj-edge}.type |
| typeUri | string | → {cj-edge}.typeUri |
| typeNode | string | → {cj-edge}.typeNode |
| typeNode | number (integer) | Interpreted as the ID of the Node that defines the type of edge. Like {cj-edge}.typeNode. |

Defining Endpoints
| Property | Type | Description |
|---|---|---|
| source | string | A single node id interpreted as endpoint with direction in. See below. |
| source | number | A single node id (converted to string) interpreted as endpoint with direction in. See below. |
| source | array(node id []) | An array containing either strings or integers (converted to strings) interpreted as node id in an endpoint with direction in. See below. |
| sources | string | A single node id interpreted as endpoint with direction in. See below. |
| sources | number | A single node id (converted to string) interpreted as endpoint with direction in. See below. |
| sources | array(node id []) | An array containing either strings or integers (converted to strings) interpreted as node id in an endpoint with direction in. See below. |
| target | string | A single node id interpreted as endpoint with direction out. See below. |
| target | number | A single node id (converted to string) interpreted as endpoint with direction out. See below. |
| target | array(node id []) | An array containing either strings or integers (converted to strings) interpreted as node id in an endpoint with direction out. See below. |
| targets | string | A single node id interpreted as endpoint with direction out. See below. |
| targets | number | A single node id (converted to string) interpreted as endpoint with direction out. See below. |
| targets | array(node id []) | An array containing either strings or integers (converted to strings) interpreted as node id in an endpoint with direction out. See below. |
| endpoints | array(Edge Endpoint []) | This is the canonical way to express edges. |

Nested Graphs
| Property | Type | Description |
|---|---|---|
| graph | object(Graph) | graph allowed as alias for graphs |
| graph | array(Graph []) | graph allowed as alias for graphs |
| graphs | object(Graph) | Polymorphic: object or array. See Object vs. Array. |
| graphs | array(Graph []) | → {cj-edge}.graphs: An array of {cj-graph}s. |

All endpoint-generating properties (source, sources, target, targets, and endpoints) are evaluated and generate endpoints.

Source(s)

Shortcut syntax: All created endpoints are interpreted as incoming. For example, "source": "n17" has the same effect as
"endpoint": { "node":"n17", "direction": "in" }. Ports are only available in the endpoints property.

Target(s)

Shortcut syntax: All created endpoints are interpreted as outgoing. For example,
"target": "n12" has the same effect as
"endpoint": { "node":"n12", "direction": "out" }.

3.7.1. Bi-Edges

Figure 1. Simplified Structure for Bi-Edges, without Ports, without Nested Graphs

A bi-edge always has exactly 2 endpoints. A generic hyperedge can have 0 to n endpoints.

Example: Directed Bi-Edge
{
  "source": "n12",
  "target": "n17"
}

is the same as

{
  "endpoints": [
    { "node": "n12",  "direction": "in"  },
    { "node": "n17",  "direction": "out" }
  ]
}

or

{
  "source": "n12",
  "target": "n17",
  "directed": true
}
Example: Undirected Bi-Edge
{
  "source": "n12",
  "target": "n17",
  "directed": false
}

is the same as

{
  "endpoints": [
    { "node": "n12",  "direction": "undir" },
    { "node": "n17",  "direction": "undir" }
  ]
}
Example: Directed Hyper-Edge
{
  "source": ["n12","n3","n123"],
  "target": ["n17","n100"]
}

is the same as

{
  "endpoints": [
    { "node": "n12",  "direction": "in"  },
    { "node": "n3",   "direction": "in"  },
    { "node": "n123", "direction": "in"  },
    { "node": "n17",  "direction": "out" },
    { "node": "n100", "direction": "out" }
  ]
}
Example: Undirected Hyper-Edge
{
  "source": ["n12","n3","n123"],
  "target": ["n17","n100"],
  "directed": false
}

is the same as

{
  "endpoints": [
    { "node": "n12",  "direction": "undir" },
    { "node": "n3",   "direction": "undir" },
    { "node": "n123", "direction": "undir" },
    { "node": "n17",  "direction": "undir" },
    { "node": "n100", "direction": "undir" }
  ]
}
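
The four examples above can be reproduced with a small Python sketch. It is deliberately narrow and rests on assumptions: the function name `shortcut_endpoints` is hypothetical, only source and target are handled (aliases such as sources, targets, from, and to are left out), and the graph-level edgeDefault is assumed to be directed.

```python
def shortcut_endpoints(edge):
    """Expand the `source` and `target` shortcuts of an edge into endpoints."""

    def node_ids(value):
        # Accept a single id or an array of ids; numbers are converted to strings.
        if value is None:
            return []
        items = value if isinstance(value, list) else [value]
        return [str(item) for item in items]

    # "directed": false turns every generated endpoint into an undirected one.
    undirected = edge.get("directed") is False
    endpoints = []
    for key, direction in (("source", "in"), ("target", "out")):
        for node_id in node_ids(edge.get(key)):
            endpoints.append({
                "node": node_id,
                "direction": "undir" if undirected else direction,
            })
    return endpoints


print(shortcut_endpoints({"source": ["n12", "n3", "n123"], "target": ["n17", "n100"]}))
```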

3.8. Edge Endpoint

For the default endpoint structure, see Endpoint in {cj}.

| Property | Type | Description |
|---|---|---|
| direction | string | → {cj-endpoint}.direction. See also Edge Direction. |
| node | string | → {cj-endpoint}.node: A {cj-id} |
| node | number (integer) | Required property. Value may be string (see above) or number (auto-converted to string). Defines the endpoint node: The node to which the endpoint of the edge is attached. |
| port | string | → {cj-endpoint}.port |
| port | number (integer) | Optional. Interpreted as a {cj-id} to refer to a Port. |
| type | string | → {cj-endpoint}.type |
| typeUri | string(URI) | → {cj-endpoint}.typeUri |
| typeNode | string | → {cj-endpoint}.typeNode |
| typeNode | number (integer) | Auto-converted to a string and interpreted like {cj-endpoint}.typeNode. |

4. Features

4.1. Data

In {cj}, extended data can only be attached to elements via the data property.

In Graph Entry Format (GEF), data handling is much more relaxed: each element defined in this spec can carry arbitrary additional JSON properties, i.e., any property whose name is not defined in this spec. All such additional properties are interpreted as data attached to the structural element and are copied into the data property as user data. Values already within the data property remain. Nested JSON is allowed.

Table 5. Moving User-Defined Properties to data
Graph Entry Format (GEF) Input
{
  "id": "node123",
  "label": "Apple",
  "model": "MacBook Pro",
  "data": {
    "insurance": false
  }
}
{cj} Interpretation
{
  "id": "node123",
  "label": "Apple",
  "data": {
    "model": "MacBook Pro",
    "insurance": false
  }
}

Edge case: If an object (e.g., a node) uses a direct JSON property and, at the same time, the same property key with a different value inside the data object, then both properties are 'shifted outwards': the direct property moves into data, and the conflicting entry from data moves into a nested data object.

Table 6. Edge Case: Conflicting Properties
Graph Entry Format (GEF) Input
{
  "id": "node123",
  "label": "Apple",
  "model": "MacBook Pro",
  "insurance": true,
  "data": {
    "insurance": false,
    "foo": "bar",
    "data" : {
      "insurance": 7
    }
  }
}
{cj} Interpretation
{
  "id": "node123",
  "label": "Apple",
  "data": {
    "foo": "bar",
    "insurance": true,
    "model": "MacBook Pro",
    "data": {
      "insurance": false,
      "data": {
        "insurance": 7
      }
    }
  }
}
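
The following Python sketch reproduces Tables 5 and 6. It reflects one reading of the two examples rather than a normative algorithm: the function name `move_to_data` and the `structural` parameter (the set of property names the spec defines for the element type) are hypothetical, and the handling of a pre-existing nested data key is inferred from Table 6 only.

```python
def move_to_data(element, structural):
    """Move user-defined properties of `element` into its `data` object."""
    extras = {k: v for k, v in element.items()
              if k not in structural and k != "data"}
    old = element.get("data", {})
    # Same key stated directly and inside data, but with a different value.
    conflicts = {k: v for k, v in old.items() if k in extras and extras[k] != v}

    data = dict(extras)
    for key, value in old.items():
        if key not in conflicts and key != "data":
            data[key] = value
    if conflicts:
        # Conflicting entries move one level deeper, together with any
        # data object that was already nested there (inferred from Table 6).
        nested = dict(conflicts)
        if "data" in old:
            nested["data"] = old["data"]
        data["data"] = nested
    elif "data" in old:
        data["data"] = old["data"]

    result = {k: v for k, v in element.items() if k in structural}
    if data:
        result["data"] = data
    return result


NODE_PROPERTIES = {"id", "label", "ports", "graph", "graphs"}  # assumed set
node = {"id": "node123", "label": "Apple", "model": "MacBook Pro",
        "data": {"insurance": False}}
print(move_to_data(node, NODE_PROPERTIES))
```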

4.2. Edge Direction

The edge direction can be stated in three locations: (1) on the graph, (2) on each edge, and (3) on each endpoint. The precedence is as follows: endpoint > edge > graph.

  1. The Graph property edgeDefault can be used to set a default direction for all edges in the graph. The two valid values are directed (the default) and undirected. This only has an effect if neither the edge nor an endpoint overrides this setting.

  2. An Edge may state a directed property. If set to true, the edge is directed. If set to false, the edge is undirected. This raises the question of how to interpret a hyperedge with, e.g., 4 endpoints as directed: in Graph Entry Format (GEF), the first endpoint is interpreted as in and all others as out.

  3. Each Edge Endpoint can use a direction property to explicitly state its direction. The valid values are in, out, and undir (undirected). This option is the only one supported by {cj}.

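Table 7. Examples
| Input: endpoint direction | Input: edge directed | Input: graph edgeDefault | Result: endpoint direction |
|---|---|---|---|
| in | Ignored, endpoint is stated | Ignored, endpoint is stated | in |
| out | Ignored, endpoint is stated | Ignored, endpoint is stated | out |
| undir | Ignored, endpoint is stated | Ignored, endpoint is stated | undir |
| (not stated) | true | Ignored, edge is stated | in/out (see above) |
| (not stated) | false | Ignored, edge is stated | undir |
| (not stated) | (not stated) | directed | in/out (see above) |
| (not stated) | (not stated) | undirected | undir |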
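
The precedence endpoint > edge > graph can be sketched as below. The function name `resolve_directions` and its parameters are hypothetical. Two assumptions are made: the graph default is directed when edgeDefault is not stated (as given in the Graph property table), and "the first endpoint" means position 0 of the endpoints array.

```python
def resolve_directions(endpoints, edge_directed=None, edge_default="directed"):
    """Fill in missing endpoint directions using edge and graph settings.

    edge_directed: the edge's `directed` property, or None if not stated.
    edge_default:  the effective graph `edgeDefault` ("directed"/"undirected").
    """
    if edge_directed is None:
        # The edge states nothing, so the graph-level default applies.
        edge_directed = (edge_default == "directed")
    resolved = []
    for index, endpoint in enumerate(endpoints):
        direction = endpoint.get("direction")
        if direction is None:
            if not edge_directed:
                direction = "undir"
            else:
                # Directed: first endpoint is `in`, all others are `out`.
                direction = "in" if index == 0 else "out"
        resolved.append({**endpoint, "direction": direction})
    return resolved


print(resolve_directions([{"node": "a"}, {"node": "b"}, {"node": "c"}],
                         edge_directed=True))
```

An explicitly stated endpoint direction is never touched, which is what makes the endpoint level the highest in the precedence order.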

4.3. Alias Properties

Some properties, such as Edge Endpoint dir, are an alias for another property (here: direction).

  • If both an alias and the original property are stated, the alias values are ignored (see exceptions below), and a parser warning is emitted if the values differ.

Exceptions
  • Arrays: When one is a single value and the other is an array, the single value is prepended to the array, and the resulting merged array is used. If both are arrays, they are concatenated: first the aliases in lexicographic order, then the original property.

Table 8. Single and Multi-Value Aliases
Graph Entry Format (GEF) Graph Input
{
  "id": "graph-1",
  "nodes": [
    { "id": "node-123" },
    { "id": "node-456" }
  ],
  "node": "node-789"
}
{cj} Interpretation
{
  "id": "graph-1",
  "nodes": [
    { "id": "node-789" },
    { "id": "node-123" },
    { "id": "node-456" }
  ]
}
  • If multiple aliases are defined, but no original property, the lexicographically first such alias is used, and a parser warning is emitted if the values differ.

  • Endpoints: All endpoint properties generate endpoints of an edge. So source, for example, is not overwritten by endpoints; both contribute endpoints.
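
The alias rules above can be sketched with two small helpers. Both names (`resolve_scalar_alias`, `merge_array_aliases`) are hypothetical, and the alias lists are passed in by the caller. The sketch does not cover the conversion of a plain string such as "node-789" into a node object, which Table 8 shows but the rules do not spell out.

```python
def resolve_scalar_alias(obj, original, aliases):
    """Pick the value of a single-valued property that has aliases.

    Returns (value, warn): the original property wins; otherwise the
    lexicographically first alias is used. `warn` is True if the stated
    values differ, i.e. when a parser warning would be emitted.
    """
    stated = [key for key in [original] + sorted(aliases) if key in obj]
    if not stated:
        return None, False
    value = obj[stated[0]]
    warn = any(obj[key] != value for key in stated[1:])
    return value, warn


def merge_array_aliases(obj, original, aliases):
    """Merge array-valued aliases: sorted aliases first, then the original."""
    merged = []
    for key in sorted(aliases) + [original]:
        if key in obj:
            value = obj[key]
            merged.extend(value if isinstance(value, list) else [value])
    result = {k: v for k, v in obj.items() if k != original and k not in aliases}
    if merged:
        result[original] = merged
    return result


graph = {
    "id": "graph-1",
    "nodes": [{"id": "node-123"}, {"id": "node-456"}],
    "node": {"id": "node-789"},
}
print(merge_array_aliases(graph, "nodes", ["node"]))
print(resolve_scalar_alias({"dir": "out", "direction": "in"}, "direction", ["dir"]))
```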

Appendix A: Reserved Property Names

The following property names are used by {cj} in certain places.

| Property | Usage |
|---|---|
| baseUri | CJ: Graph base URI for RDF interpretation |
| description | Suggested for Node, Edge, Graph, Port, Edge Endpoint description. Will end up in data. |
| dir | Alias for Edge Endpoint direction |
| directed | Edge directedness. Graph directed: true → edgeDefault: directed. Graph directed: false → edgeDefault: undirected. |
| direction | CJ: Edge Endpoint direction (in/out/undir) |
| edgeDefault | Graph default edge direction |
| edge | Alias for Graph edges |
| edges | CJ: Graph edges |
| endpoint | Alias for Edge endpoints |
| endpoints | CJ: Edge endpoints |
| from | Alias for Edge source |
| graph | Alias for Graph graphs |
| graphs | CJ: Node nested graphs, Edge nested graphs |
| hyperedges | Alias for Graph edges. For JSON Graph compatibility. |
| id | CJ: Node id, Edge id, Graph id, Port id |
| label | CJ: Node, Edge, Graph, Port |
| language | CJ: Label |
| node | CJ: Edge Endpoint referenced node id. Alias for Graph nodes. |
| nodes | CJ: Graph nodes |
| port | CJ: Edge Endpoint referenced port id |
| ports | CJ: Node ports |
| source | Edge |
| sources | Alias for Edge source |
| target | Edge |
| targets | Alias for Edge target |
| to | Alias for Edge target |
| type | CJ: Edge, Edge Endpoint |
| typeNode | CJ: Edge, Edge Endpoint |
| typeUri | CJ: Edge, Edge Endpoint |
| value | CJ: Label |

Table 9. Values for Endpoint direction
| Value | Is Alias For | Usage in |
|---|---|---|
| incoming | in | Endpoint |
| none | undir | Endpoint |
| outgoing | out | Endpoint |
| undirected | undir | Endpoint |
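
The value aliases of Table 9 amount to a simple lookup. A minimal sketch, with the hypothetical names `DIRECTION_ALIASES` and `normalize_direction`; values that are not aliases are assumed to pass through unchanged:

```python
# Alias value -> canonical endpoint direction, taken from Table 9.
DIRECTION_ALIASES = {
    "incoming": "in",
    "none": "undir",
    "outgoing": "out",
    "undirected": "undir",
}


def normalize_direction(value):
    """Map an aliased direction value to in, out, or undir."""
    return DIRECTION_ALIASES.get(value, value)


print(normalize_direction("incoming"))
```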