Module Spn

module Spn: sig .. end
The SPN library provides data structures and methods for learning sum-product networks (SPNs).


Data structures

type data_t = int array array 
type local_schema_t = (int * int) array 
type schema_t = int array 
type spn_node_type = 
| TimesNode
| PlusNode
| NullNode (*Used for dummy nodes.*)
| LeafNode
Different node types in SPNs
type spnode = {
   id : int; (*Each node of an SPN has a unique, auto-incremented ID.*)
   mutable parents : spnode array; (*Parents of this node in the SPN.*)
   mutable children : spnode array; (*Children of this node in the SPN.*)
   mutable params : float array; (*If the spnode is a PlusNode, then params holds the weight of each child in the children array.*)
   mutable nodetype : spn_node_type;
   mutable schema : local_schema_t; (*Range and variable id for each variable in the scope of this node.*)
   mutable data : data_t; (*Subset of the training data used for learning the parameters of this node.*)
   mutable acname : string; (*If the node is a LeafNode, then acname is the filename of a tractable graphical model that represents the leaf distribution.*)
   mutable ac_index : int; (*If the node is a LeafNode, then Comps[ac_index] holds the leaf distribution after it has been loaded from file.*)
   mutable final : bool; (*Used during learning to mark a node as final, meaning it cannot be modified.*)
   mutable infcached : bool; (*True if cached inference results are available.*)
   mutable logpTemp : float array; (*Used for caching inference results for faster inference.*)
}
Each node in an SPN is a spnode.
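To illustrate how nodetype, children, and params relate, here is a minimal sketch of bottom-up evaluation in log space. It uses a reduced stand-in record (mini_node) rather than spnode, a hypothetical leaf_logp field in place of a leaf model loaded from acname, and it assumes params stores plain (not log) weights. It is not the library's inference code.

```ocaml
(* Simplified stand-in for spnode: only the fields needed for evaluation. *)
type spn_node_type = TimesNode | PlusNode | NullNode | LeafNode

type mini_node = {
  nodetype : spn_node_type;
  children : mini_node array;
  params : float array;   (* assumed: plain weights of the children of a PlusNode *)
  leaf_logp : float;      (* hypothetical: precomputed leaf log-probability *)
}

(* Numerically stable log (sum_i exp x_i). *)
let logsumexp xs =
  let m = Array.fold_left max neg_infinity xs in
  if m = neg_infinity then neg_infinity
  else m +. log (Array.fold_left (fun acc x -> acc +. exp (x -. m)) 0.0 xs)

(* Log-value of a node: product nodes add child log-values,
   sum nodes take a weighted mixture of them. *)
let rec logp n =
  match n.nodetype with
  | LeafNode -> n.leaf_logp
  | TimesNode -> Array.fold_left (fun acc c -> acc +. logp c) 0.0 n.children
  | PlusNode ->
    logsumexp (Array.mapi (fun i c -> log n.params.(i) +. logp c) n.children)
  | NullNode -> 0.0 (* assumption: dummy nodes contribute nothing *)

(* Example: an equal-weight mixture of two leaves with values 0.2 and 0.6. *)
let leaf p =
  { nodetype = LeafNode; children = [||]; params = [||]; leaf_logp = log p }

let mix =
  { nodetype = PlusNode; children = [| leaf 0.2; leaf 0.6 |];
    params = [| 0.5; 0.5 |]; leaf_logp = 0.0 }

let mix_value = exp (logp mix)
```

Working in log space avoids underflow when many small probabilities are multiplied, which is presumably why the record caches results in a field named logpTemp.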
exception HSplitNotPossible
Raised when data clustering is not possible.
exception VSplitNotPossible
Raised when variable clustering is not possible.
exception NodeTypeError
val node_array : spnode array Pervasives.ref
Stores the nodes of an SPN network.

Operations

val create_node : spn_node_type ->
spnode array ->
spnode array ->
float array -> data_t -> local_schema_t -> spnode
create_node t pt c pm d s creates a new SPN node and adds it to Spn.node_array. t is the node type, pt is the array of parents, c is the array of children, pm is the array of parameters, d is the data, and s is the schema.
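The registration behaviour described above can be sketched as follows. This is an illustration only: node is a reduced stand-in for spnode, the local node_array only mirrors the role of Spn.node_array, and the rule "next id = current registry length" is an assumption about how auto-incremented IDs might be assigned.

```ocaml
(* Reduced stand-in for spnode; the real record has more fields. *)
type node = {
  id : int;
  mutable parents : node array;
  mutable children : node array;
}

(* Global registry, mirroring the role of Spn.node_array. *)
let node_array : node array ref = ref [||]

(* Assumption: the next id is the current length of the registry. *)
let create_node parents children =
  let n = { id = Array.length !node_array; parents; children } in
  node_array := Array.append !node_array [| n |];
  n

(* Example: two nodes created in sequence get ids 0 and 1. *)
let root = create_node [||] [||]
let child = create_node [| root |] [||]
```

With this scheme a lookup by id is a plain array index, at the cost of copying the array on every insertion.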
val create_plus : data_t -> local_schema_t -> spnode
Creates a plus node for the given data and schema.
val create_times : data_t -> local_schema_t -> spnode
Creates a times node for the given data and schema.
val create_leaf : data_t -> local_schema_t -> spnode
Creates a leaf node for the given data and schema.
val create_null : unit -> spnode
Creates a dummy node.
val get_type_name : spn_node_type -> string
Returns a string value corresponding to the node type.
val get_node_by_id : int -> spnode
Returns the spnode with the given id from node_array.
val get_spn_size : unit -> int
Returns the size of an SPN network (the length of node_array).
val add_child : spnode -> spnode -> unit
Adds a child to the node and updates the parameters based on the number of samples in the child node.
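One plausible reading of "updates the parameters based on the number of samples" is that each child's weight becomes its share of the training rows. The sketch below implements that reading on a reduced stand-in record (node); the normalisation rule is an assumption, not confirmed behaviour of Spn.add_child.

```ocaml
(* Reduced stand-in for spnode. *)
type node = {
  mutable children : node array;
  mutable params : float array;
  data : int array array;   (* one row per training sample *)
}

(* Append the child, then set each weight to that child's share of all
   samples held by the children (assumed rule). *)
let add_child node child =
  node.children <- Array.append node.children [| child |];
  let counts =
    Array.map (fun c -> float_of_int (Array.length c.data)) node.children in
  let total = Array.fold_left (+.) 0.0 counts in
  if total > 0.0 then
    node.params <- Array.map (fun k -> k /. total) counts

(* Example: children holding 1 and 3 rows get weights 0.25 and 0.75. *)
let mk rows = { children = [||]; params = [||]; data = rows }
let parent = mk [| [|0|]; [|1|]; [|1|]; [|1|] |]
let () =
  add_child parent (mk [| [|0|] |]);
  add_child parent (mk [| [|1|]; [|1|]; [|1|] |])
```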
val set_parent : spnode -> spnode -> unit
Sets the parent of a node, and also updates the children of the parent node.
val print_node : Pervasives.out_channel -> spnode -> unit
Writes the information of a single SPN node into an output file.
val print_model : Pervasives.out_channel -> spnode -> unit
Writes the SPN structure into the output file.
val print_spn : spnode -> unit
Prints SPN information. Used for debugging.
val load_model : string -> unit
Constructs an SPN structure from the given model file.
val h_split : spnode -> int -> float -> int -> float -> spnode list
h_split node num_part lambda concurrent_thr ratio learns a sum node using sample clustering for conditional distributions. num_part is the maximum number of clusters. lambda is a penalty on the number of clusters. concurrent_thr determines the number of concurrent processes used for clustering. ratio is no longer in use.
Raises HSplitNotPossible if the number of found clusters is one.
val h_split_force : spnode -> int -> float -> int -> float -> spnode list
Similar to h_split, but never raises HSplitNotPossible, so it always finds a split. (Useful for the spnlearn algorithm.)
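A caller of h_split has to handle HSplitNotPossible. The sketch below shows one way to do that, converting the exception into an option so the learner can fall back to another kind of split. h_split_stub is a hypothetical stand-in that splits rows on their first value; it does not reproduce the library's clustering.

```ocaml
exception HSplitNotPossible

(* Hypothetical stand-in for Spn.h_split: groups rows by their first value
   and raises when every row lands in the same cluster. *)
let h_split_stub (rows : int array array) : int array array list =
  let all = Array.to_list rows in
  let zeros = List.filter (fun r -> r.(0) = 0) all in
  let others = List.filter (fun r -> r.(0) <> 0) all in
  if zeros = [] || others = [] then raise HSplitNotPossible
  else [ Array.of_list zeros; Array.of_list others ]

(* Caller pattern: turn the exception into an option so the learner can
   try a different split (or a forced split) when clustering fails. *)
let try_h_split rows =
  try Some (h_split_stub rows) with HSplitNotPossible -> None

let no_split = try_h_split [| [|0; 1|]; [|0; 0|] |]
let two_parts = try_h_split [| [|0; 1|]; [|1; 0|] |]
```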
val v_split : spnode -> float -> spnode list
v_split node cut_thr learns a product node using variable clustering. node is the spnode to split. Two nodes are assumed to have no edge between them if their mutual information is less than cut_thr.
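The thresholding idea can be sketched as follows: compute the empirical mutual information for each pair of variables, link pairs whose value reaches cut_thr, and take connected components as the variable clusters. The function names, the restriction to binary variables, and the use of natural logarithms are assumptions made for illustration; the library's implementation may differ.

```ocaml
(* Empirical mutual information (in nats) between binary columns i and j. *)
let mutual_info (data : int array array) i j =
  let n = float_of_int (Array.length data) in
  let joint = Array.make_matrix 2 2 0.0 in
  Array.iter (fun row ->
    joint.(row.(i)).(row.(j)) <- joint.(row.(i)).(row.(j)) +. 1.0) data;
  let p_i a = (joint.(a).(0) +. joint.(a).(1)) /. n in
  let p_j b = (joint.(0).(b) +. joint.(1).(b)) /. n in
  let mi = ref 0.0 in
  for a = 0 to 1 do
    for b = 0 to 1 do
      let p = joint.(a).(b) /. n in
      if p > 0.0 then mi := !mi +. p *. log (p /. (p_i a *. p_j b))
    done
  done;
  !mi

(* Label each variable with the smallest index in its connected component,
   linking i and j whenever their mutual information is at least cut_thr. *)
let components data nvars cut_thr =
  let label = Array.init nvars (fun v -> v) in
  let rec find v = if label.(v) = v then v else find label.(v) in
  for i = 0 to nvars - 1 do
    for j = i + 1 to nvars - 1 do
      if mutual_info data i j >= cut_thr then begin
        let ri = find i and rj = find j in
        if ri <> rj then label.(max ri rj) <- min ri rj
      end
    done
  done;
  Array.init nvars find

(* Example: columns 0 and 1 are identical, column 2 is independent of both. *)
let data = [| [|0; 0; 0|]; [|0; 0; 1|]; [|1; 1; 0|]; [|1; 1; 1|] |]
let clusters = components data 3 0.1
```

If all variables end up in a single component, no product split exists at that threshold, which corresponds to the situation VSplitNotPossible describes.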
val v_split_gtest : spnode -> float -> spnode list
v_split_gtest node cut_thr learns a product node using variable clustering. node is the spnode to split. Two nodes are assumed to have no edge between them if their G-test value is less than cut_thr.
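For reference, the standard G statistic for a pair of binary variables is G = 2 * sum over cells of O * ln(O / E), where O is the observed count and E the count expected under independence. The sketch below computes it. Whether the library compares this statistic or a derived p-value against cut_thr is not stated here, so treat g_test as an illustrative helper rather than the library's function.

```ocaml
(* G statistic for independence of binary columns i and j:
   G = 2 * sum_{a,b} O_ab * ln (O_ab / E_ab), with E_ab = row_a * col_b / n. *)
let g_test (data : int array array) i j =
  let n = float_of_int (Array.length data) in
  let obs = Array.make_matrix 2 2 0.0 in
  Array.iter (fun row ->
    obs.(row.(i)).(row.(j)) <- obs.(row.(i)).(row.(j)) +. 1.0) data;
  let g = ref 0.0 in
  for a = 0 to 1 do
    for b = 0 to 1 do
      let row_total = obs.(a).(0) +. obs.(a).(1) in
      let col_total = obs.(0).(b) +. obs.(1).(b) in
      let expected = row_total *. col_total /. n in
      (* Empty cells contribute nothing (limit of x ln x as x -> 0). *)
      if obs.(a).(b) > 0.0 then
        g := !g +. 2.0 *. obs.(a).(b) *. log (obs.(a).(b) /. expected)
    done
  done;
  !g

(* Example: columns 0 and 1 are identical, column 2 is independent of both. *)
let sample = [| [|0; 0; 0|]; [|0; 0; 1|]; [|1; 1; 0|]; [|1; 1; 1|] |]
let g_dependent = g_test sample 0 1
let g_independent = g_test sample 0 2
```

The statistic equals 2n times the empirical mutual information in nats, so unlike a raw mutual-information threshold it grows with the number of samples.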