#359 Proposal: Data Model 3.0 - Kind

Brian Frank Mon 11 Jan 2016

This is one component of the Data Model 3.0 proposal.

Haystack tries to strike a balance between dynamic typing and static typing. We want enough formalism to create useful data models that can be parsed and exchanged. For example, we use a much richer type system than JSON to formally define Ref, DateTime, etc. But we also keep things fairly dynamically typed for flexible modeling, which I believe suits the complexity of modeling typical IoT systems.

However, increasingly I believe we need to add a bit more formalism to the type system used to define tags. With nested structures we now have the potential to type tags as a List, Dict, or Grid.

In the case of Lists, 90% of the time you want to type the list as a specific type. For example, under this proposal it would be nice to redefine the enum tag to be a list of strings.

In the case of Ref or Dict, we are often expecting a specific type of entity. For example, the ahuRef tag is expected to dereference to an entity that has the ahu tag, which then implies other semantics (mostly informal).

Capturing these concepts would provide a degree more formality to the tag definitions. But it would also give tooling a lot more room to do a better job. For example, if I add an ahuRef tag on an entity, I would like my tool to let me pick from the ahu entities already defined in the system.
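As a rough illustration of the tooling case, here is a minimal Python sketch of how a tool might build that pick list. It assumes entities are held as plain dicts and that a marker tag is stored under its tag name; `MARKER`, `candidates_for_ref`, and the sample records are hypothetical names invented for this example, not part of Haystack.

```python
# Hypothetical stand-in for a Haystack marker value.
MARKER = object()

def candidates_for_ref(entities, target_tag):
    """Return the entities that carry target_tag, i.e. the valid
    targets for a Ref expected to point at that kind of entity."""
    return [e for e in entities if target_tag in e]

# Made-up sample records for illustration only.
entities = [
    {"id": "ahu-1", "dis": "AHU 1", "ahu": MARKER, "equip": MARKER},
    {"id": "vav-1", "dis": "VAV 1", "vav": MARKER, "equip": MARKER},
    {"id": "ahu-2", "dis": "AHU 2", "ahu": MARKER, "equip": MARKER},
]

# Choices a tool could offer when the user adds an ahuRef tag.
choices = [e["id"] for e in candidates_for_ref(entities, "ahu")]
```

The point is only that once the expected target tag is part of the tag definition, the filter needs no hard-coded knowledge of ahuRef.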

I would like to propose the following syntax for Kind:

Ref        // simple type
Ref[]      // list of Refs
Ref<ahu>   // Ref to an entity with ahu tag
Ref<ahu>[] // list of Refs to ahu entities
Dict       // generic nested dict
Dict<ahu>  // nested dict with ahu entity
Obj        // any value type
Obj[]      // list of Obj, same as List

<kind>      := <base> | <paramRef> | <paramDict> | <list>
<list>      := <kind> "[]"
<paramRef>  := "Ref" <tag>
<paramDict> := "Dict" <tag>
<tag>       := "<" <tagName> ">"
<tagName>   := standard tag name rules
<base>      := "Obj" | "Marker" | "NA" | "Bool" | "Number" | "Str" | 
               "Ref" | "Date" | "Time" | "DateTime" | "Coord" |
               "List" | "Dict" | "Grid"
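To make the grammar concrete, the following is a minimal Python sketch of a parser for it, not a reference implementation. It assumes that "standard tag name rules" means a lowercase ASCII letter followed by ASCII letters, digits, or underscores, and it reads the recursive `<list>` rule as allowing any number of trailing `[]` pairs. The names `Kind` and `parse_kind` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

BASE_KINDS = {
    "Obj", "Marker", "NA", "Bool", "Number", "Str", "Ref", "Date",
    "Time", "DateTime", "Coord", "List", "Dict", "Grid",
}

# base name, optional <tag>, then zero or more "[]" suffixes.
# The tag-name pattern is an assumption about the "standard" rules.
_KIND_RE = re.compile(r"([A-Za-z]+)(?:<([a-z][A-Za-z0-9_]*)>)?((?:\[\])*)")

@dataclass(frozen=True)
class Kind:
    base: str                  # one of BASE_KINDS
    tag: Optional[str] = None  # only valid for Ref and Dict
    list_depth: int = 0        # number of trailing "[]" pairs

    def __str__(self) -> str:
        tag = f"<{self.tag}>" if self.tag is not None else ""
        return self.base + tag + "[]" * self.list_depth

def parse_kind(text: str) -> Kind:
    """Parse a Kind string such as 'Ref<ahu>[]' into a Kind value."""
    m = _KIND_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not a valid kind: {text!r}")
    base, tag, brackets = m.groups()
    if base not in BASE_KINDS:
        raise ValueError(f"unknown base kind: {base!r}")
    if tag is not None and base not in ("Ref", "Dict"):
        raise ValueError(f"{base} cannot take a tag parameter")
    return Kind(base, tag, len(brackets) // 2)
```

For example, `parse_kind("Ref<ahu>[]")` yields `Kind(base="Ref", tag="ahu", list_depth=1)`, and `str()` of that value gives back the original string.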

Another option would be to make it look more like parameterized types and make List work the same way:

List<Ref>       // Ref[]
List<Ref<ahu>>  // Ref<ahu>[]
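The two spellings carry the same information, so one can be rewritten into the other mechanically. A small Python sketch of that mapping follows; the function names are hypothetical, and the code does plain string rewriting without validating the inner kind.

```python
def generic_to_suffix(text: str) -> str:
    """Rewrite 'List<X>' style into 'X[]' style, e.g. List<Ref<ahu>> -> Ref<ahu>[]."""
    text = text.strip()
    depth = 0
    # Peel off each enclosing List<...> wrapper and count it.
    while text.startswith("List<") and text.endswith(">"):
        text = text[len("List<"):-1]
        depth += 1
    return text + "[]" * depth

def suffix_to_generic(text: str) -> str:
    """Rewrite 'X[]' style into 'List<X>' style, e.g. Ref<ahu>[] -> List<Ref<ahu>>."""
    text = text.strip()
    depth = 0
    # Strip each trailing "[]" and count it.
    while text.endswith("[]"):
        text = text[:-2]
        depth += 1
    return "List<" * depth + text + ">" * depth
```

A kind with no list part, such as `Ref<ahu>` or a bare `List`, passes through both functions unchanged.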

But I personally prefer making list syntax a special case - it's much easier to read IMO.

Kevin Kelley Mon 11 Jan 2016

Wondering if Interval ought to be a fundamental type...

There's Date and DateTime, but often needed is a representation of an interval, either a pair of DateTime, start to end; or DateTime start plus duration (a number with a Time unit).
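The two representations mentioned here are interchangeable, which a short Python sketch can show. `Interval` below is a hypothetical illustration built on the standard `datetime` module, not a proposed Haystack encoding.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class Interval:
    start: datetime
    end: datetime

    @classmethod
    def from_duration(cls, start: datetime, duration: timedelta) -> "Interval":
        """Build an interval from a start plus a duration."""
        return cls(start, start + duration)

    @property
    def duration(self) -> timedelta:
        """Length of the interval, derived from the two endpoints."""
        return self.end - self.start

# Same interval, written both ways.
a = Interval(datetime(2016, 1, 11, 0, 0), datetime(2016, 1, 11, 1, 0))
b = Interval.from_duration(datetime(2016, 1, 11, 0, 0), timedelta(hours=1))
```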

It certainly gets used a lot, and there's no standard way to represent it here.

In fact (thinking out loud) it seems like this ties to the problem of time-series data intervals and whether the sample with timestamp represents a sample ending at the timestamp, starting at timestamp, or centered...

If the ts column of such an interval-data grid were typed as Interval instead of DateTime, then that could be captured.

Brian Frank Mon 11 Jan 2016

Wondering if Interval ought to be a fundamental type..

It is definitely commonly used. In SkySpark we have two key types, DateSpan and DateTimeSpan, which are effectively two dates or two datetimes (we also support relative versions such as today, yesterday, etc). I think tackling those is best done using the XStr proposal - in fact, standardizing a string encoding for DateSpan is one of my main use cases.

Kevin Kelley Mon 11 Jan 2016

best done using the XStr proposal

... so in your <base> Kind above, "Str" would mean '"Str" | "XStr"', letting us "extend" the Kind system with non-standardized Kinds?

I can see that; just meant that maybe this one might be more fundamental. Just a thought. In general I like this proposal a lot.
