Tree representation

The tree representation that sits between the TensorFlow protobuf and the ONNX protobuf is shown in the following diagram.

The code is defined in the directory /tf_onnx/core_/.

But why do we need a tree representation for what is just another conversion?

The reasons are as follows.

  1. Both TensorFlow and ONNX assume the graph is constructed as a tree, with nodes referring to one another by name and input attributes (in TensorFlow) or inputs and outputs (in ONNX).
  2. There are differences between TensorFlow and ONNX at multiple levels. Most of these differences are so trivial, and still so unstable, that they should not be hard-coded, for the sake of code maintainability and flexibility. Applying higher-order functions with your own custom pure functions over a tree is the best way to handle such dirty jobs (a minimal sketch follows this list).
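
To make the second point concrete, here is a minimal sketch of running a custom pure function over every node of a tree with a higher-order function. This is not the actual tf_onnx API; the Node class, the tree_map helper, and the rename table are all illustrative assumptions:

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A minimal node: just an op name and references to child nodes (assumption).
@dataclass
class Node:
    op: str
    children: List["Node"] = field(default_factory=list)

def tree_map(f: Callable[[Node], Node], node: Node) -> Node:
    """Apply a pure function to every node, children first."""
    mapped = [tree_map(f, c) for c in node.children]
    return f(Node(node.op, mapped))

# Example: a trivial renaming rule, kept in a table rather than hard-coded.
rename = {"BiasAdd": "Add"}  # illustrative mapping, not an exhaustive one
tree = Node("BiasAdd", [Node("MatMul"), Node("Const")])
fixed = tree_map(lambda n: Node(rename.get(n.op, n.op), n.children), tree)
print(fixed.op)  # Add
```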

ONNX supports nearly 100 operators so far, but they are seemingly not classified into categories that would make them easier to reason about. The intermediate tree representation is somewhat more organized, as it classifies operators by their number of arguments and by the features of each function.

Structure of the tree

You can take a look at the source code, guided by its documentation.

But to give you an overview, I will cover the essence of it in this section.

The basic tree structure is composed of a set of nodes, each of which refers to other nodes.

To keep the characteristics of each node distinct and organized, nodes are categorized through inheritance.

The first specialization (de-abstraction) is whether a node refers to anything at all.

If a node has no reference to another node, it is a leaf.

If a node refers to one or more nodes, it is called an inner node.

If a node is a leaf, then it is either

Const (statically initialized with a value that was set in a read-only manner)

or

Placeholder (statically declared, but carrying no value)

and there are no further specializations.

TensorFlow also has Variable, but I do not yet let it join the tree as a component, assuming variables are converted to Const beforehand.
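
As an aside, one common way to obtain such a Variable-free graph is to freeze it with TensorFlow's own utilities. A rough sketch using the tf.compat.v1 API (the variable, graph, and node names here are made up for illustration):

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

v = tf.compat.v1.get_variable("v", initializer=tf.constant(2.0))
x = tf.compat.v1.placeholder(tf.float32, name="x")
y = tf.multiply(x, v, name="out")

with tf.compat.v1.Session() as sess:
    sess.run(tf.compat.v1.global_variables_initializer())
    # Replace every Variable with a Const node holding its current value.
    frozen = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ["out"])

print([n.op for n in frozen.node])  # the variable now appears as Const
```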

If a node is not a leaf (an inner node), another de-abstraction comes into play and asks for its number of references to child nodes.

If that number is 1, the node is an Arg1 instance; if it is 2, an Arg2; and if it is 3 or more, an ArgMany.
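
Put together, a minimal sketch of this hierarchy might look as follows. The names Const, Placeholder, Arg1, Arg2, and ArgMany come from the text above; the Leaf/Inner base classes and the constructor fields are my own assumptions:

```python
class Node:
    """Base class for every node in the intermediate tree."""

class Leaf(Node):
    """A node with no reference to any other node."""

class Const(Leaf):
    """Statically initialized with a read-only value."""
    def __init__(self, value):
        self.value = value

class Placeholder(Leaf):
    """Statically declared, but carrying no value."""
    def __init__(self, shape):
        self.shape = shape

class Inner(Node):
    """A node referring to one or more child nodes."""

class Arg1(Inner):
    """Exactly one reference to a child node."""
    def __init__(self, arg: Node):
        self.arg = arg

class Arg2(Inner):
    """Exactly two references to child nodes."""
    def __init__(self, lhs: Node, rhs: Node):
        self.lhs, self.rhs = lhs, rhs

class ArgMany(Inner):
    """Three or more references to child nodes."""
    def __init__(self, args):
        self.args = list(args)

# Example: Relu(Add(Placeholder, Const)) would nest as
tree = Arg1(Arg2(Placeholder(shape=[2, 3]), Const(value=1.0)))
```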

Drawing a line between Arg1 and Arg2 turns out to be useful for generating the ONNX format, because the attributes of each ONNX operator are well categorized by its number of arguments.

You might say that ArgMany could subsume Arg1 and Arg2, and even the no-argument case.

But the idea that every operator holds a list of references is not well suited to lower layers, since you do not know the register allocation until just before the operator is called from a calling function.
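
To see why the arity split maps nicely onto ONNX generation, here is a minimal sketch using the onnx package's helper.make_node; the tensor names are made up for illustration:

```python
from onnx import helper

# One-argument operator: a single input name.
relu = helper.make_node("Relu", inputs=["x"], outputs=["y"])

# Two-argument operator: two input names; in older opsets, broadcasting
# attributes also lived here (see the Arg2 section below).
add = helper.make_node("Add", inputs=["a", "b"], outputs=["c"])

print(relu.op_type, len(relu.input))  # Relu 1
print(add.op_type, len(add.input))    # Add 2
```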

The following specializations are made.

On Arg1

  • Arithmetic

    Arithmetic with Arg1 contains, for instance, (reduced) sum and (reduced) mean. If the provided value is a scalar, there is no computation; if it is a vector, it is reduced according to information about which axes to reduce (a small numpy illustration follows this list). At the instruction level, these are composed of element-wise operations.

  • Logic

    Logic so far contains only Not, but it should also contain (reduced) And, Or, and Xor.

  • Compare

    Max, Min, ArgMax, ArgMin, Relu, etc. are contained here, which means they need a basic comparator combined with reduce operations.

  • Cast

    It contains Floor, Abs, Sign, and Round. Even when the data type is not actually cast, operations of a cast-like nature are included here.

  • F_shape

    A tensor carries three pieces of type information: element type, rank, and shape. F_shape is a function that modifies rank and shape without changing the underlying values, such as Split, Transpose, Reshape, etc.
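
Here is the numpy illustration promised in the Arithmetic item above; reduce_sum is a hypothetical helper, not part of tf_onnx. A scalar passes through untouched, while a tensor is reduced over the given axes:

```python
import numpy as np

def reduce_sum(x, axes=None):
    """Scalar passes through; tensors are reduced over the given axes."""
    x = np.asarray(x)
    if x.ndim == 0:  # scalar: no computation
        return x
    return x.sum(axis=tuple(axes) if axes is not None else None)

print(reduce_sum(3.0))                        # 3.0
print(reduce_sum(np.ones((2, 3)), axes=[1]))  # [3. 3.]
```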

On Arg2

  • Arithmetic

    It contains (element-wise) addition, subtraction, etc. The second argument can be broadcast using given axis information (see the sketch after this list). See the ONNX documentation for more on tensor broadcasting.

  • Logic

    Logic contains And, Or, and Xor. They can be broadcast, too.

  • Compare

    Equal, Greater, Less, etc. You name it!

  • Cast

    Cast is basically a single-argument operation, so nothing lives here.

  • F_shape

    So far this contains only Concat, where two tensors are provided as arguments.
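
Here is the sketch promised in the Arithmetic item above: a numpy reading of axis-based broadcasting for a two-argument operator. broadcast_add is a hypothetical helper; see the ONNX documentation for the authoritative rule:

```python
import numpy as np

def broadcast_add(a, b, axis):
    """Align b's dimensions with a starting at `axis`, then add element-wise.
    One reading of the axis-based broadcasting described above (assumption)."""
    a, b = np.asarray(a), np.asarray(b)
    # Pad b with trailing singleton dims so its shape lines up at `axis`.
    shape = b.shape + (1,) * (a.ndim - axis - b.ndim)
    return a + b.reshape(shape)

a = np.zeros((2, 3, 4))
b = np.array([1.0, 2.0, 3.0])             # shape (3,), aligned at axis 1
print(broadcast_add(a, b, axis=1).shape)  # (2, 3, 4)
```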

On ArgMany

  • I have not formalized these yet, since in my view they should somehow be decomposed into one- or two-argument operations.