Schema Components

Note

The following is a sample from a previous documentation project for a backend machine learning service. Names and identifying content have been changed.

If you’ve exported a JSON file of your schema, or if you’re building one yourself, this page describes the components you will find in it.

The Basics

Each schema is broken into five parts:

  • Schema flag
  • User object
  • Objects
  • Attributes
  • Interactions

Schema Flag

The initial key, schema, flags the document as a schema file for our training algorithm. This key is mandatory and must be the only top-level item in the file. It should look like this:

{
    "schema": {}
}
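The rule above can be checked before a file is submitted. The following is a minimal sketch in Python, not part of the service itself; the function name has_valid_schema_flag is hypothetical and chosen only for illustration.

```python
import json


def has_valid_schema_flag(text: str) -> bool:
    """Return True if `schema` is the only top-level key and holds a JSON object."""
    document = json.loads(text)
    return (
        isinstance(document, dict)
        and list(document.keys()) == ["schema"]
        and isinstance(document["schema"], dict)
    )


print(has_valid_schema_flag('{"schema": {}}'))              # True
print(has_valid_schema_flag('{"schema": {}, "extra": 1}'))  # False
```

The check only covers the top level. It says nothing about whether the contents of schema are valid.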

Objects

Objects are specific tags relevant to the machine learning algorithm. There are two types of objects: user objects and general objects.

User Objects

User objects define the aspects of a user, such as user name, date of birth, and location. The user object is mandatory and must be included in every schema. It looks like this:

{
    "schema": {
        "user": {}
    }
}

General Objects

General objects define the other objects in the schema that the user interacts with (books, magazines, libraries, events, meetings, pages, game items, etc.). Defining these objects enables you to use them as sources to run your algorithm.

General objects are optional, but must be placed below the user object.

As an example, below is a custom object for a book.

{
    "schema": {
        "user": {},
        "book": {}
    }
}
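The ordering rule can be verified mechanically. This sketch assumes that "placed below the user object" means user must be the first key inside schema. The function name user_object_is_first is hypothetical.

```python
import json


def user_object_is_first(text: str) -> bool:
    """Return True if `user` exists and is the first key under `schema`."""
    schema = json.loads(text)["schema"]
    keys = list(schema)  # json.loads keeps keys in document order
    return bool(keys) and keys[0] == "user"


print(user_object_is_first('{"schema": {"user": {}, "book": {}}}'))  # True
print(user_object_is_first('{"schema": {"book": {}, "user": {}}}'))  # False
```

Note that json.loads rejects trailing commas, so a file with a comma after the last item in an object fails to parse before this check runs.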

Labels

Labels are used to classify objects. In our example, the user object has two labels, fake and not_fake. They are presented like this:

{
    "schema": {
        "user": {
            "labels": ["fake", "not_fake"]
        },
        "book": {}
    }
}

You can have any number of labels under any name, as long as they are in snake_case.
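To catch naming mistakes early, the snake_case requirement can be tested with a regular expression. The sketch below assumes snake_case means lowercase letters and digits, starting with a letter, with words separated by single underscores. The service may apply a stricter or looser rule. The helper name non_snake_case is hypothetical.

```python
import re

# Assumed definition of snake_case: lowercase words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def non_snake_case(names):
    """Return the names that do not match the assumed snake_case pattern."""
    return [name for name in names if not SNAKE_CASE.fullmatch(name)]


print(non_snake_case(["fake", "not_fake", "NotFake", "is-fake"]))
# ['NotFake', 'is-fake']
```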

Attributes

Attributes allow you to add more detail to your objects. They are composed of a name and a type. Certain types allow for more parameters.

The types available are:

  • id
  • boolean
  • lat/lng location
  • number (includes optional parameters for minv, maxv, and num_slices)
  • date (includes a parameter for min_date)
  • category (includes required options for category names)
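The list above can be read as a table of which parameters each type accepts. The sketch below encodes that table in Python and reports parameters that do not belong to an attribute's type. Several names are assumptions: the type strings "location" and "category" and the key "options" are guesses at how those types are written, and unknown_parameters is a hypothetical helper. Only "date", minv, maxv, num_slices, min_date, type, and map_to come from this page.

```python
# Parameters each type accepts, following the list above.
# ASSUMPTION: the strings "location", "category", and "options" are illustrative guesses.
TYPE_PARAMETERS = {
    "id": set(),
    "boolean": set(),
    "location": set(),
    "number": {"minv", "maxv", "num_slices"},
    "date": {"min_date"},
    "category": {"options"},
}

# Keys allowed on every attribute; map_to appears in a later example on this page.
COMMON_KEYS = {"type", "map_to"}


def unknown_parameters(attribute: dict) -> set:
    """Return the keys of an attribute that its type does not accept."""
    allowed = TYPE_PARAMETERS[attribute["type"]] | COMMON_KEYS
    return set(attribute) - allowed


print(unknown_parameters({"type": "number", "minv": 0, "maxv": 120, "num_slices": 12}))  # set()
print(unknown_parameters({"type": "date", "minv": 0}))  # {'minv'}
```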

They appear in your JSON file like this:

{
    "schema": {
        "user": {
            "labels": ["fake", "not_fake"],
            "date_of_birth": {}
        },
        "book": {}
    }
}

You can have any number of attributes under any name, as long as they are in snake_case.

{
    "schema": {
        "user": {
            "labels": ["fake", "not_fake"],
            "date_of_birth": {},
            "location": {},
            "last_book_read": {}
        },
        "book": {},
    }
}

Optional parameters are added to the associated attribute like so:

{
    "schema": {
        "user": {
            "labels": ["fake", "not_fake"],
            "date_of_birth": {
                "type": "date",
                "min_date": "1960-01-01",
                "map_to": "user_date_of_birth"
            }
        },
        "book": {},
        "interactions": {}
    }
}
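If you build the schema in code rather than by hand, serializing it with a JSON library avoids syntax slips such as trailing commas. The sketch below rebuilds the example above in Python. It assumes min_date is an ISO 8601 date, as the sample value suggests, and date_attribute is a hypothetical helper, not part of the service.

```python
import json
from datetime import date


def date_attribute(min_date: str, map_to: str) -> dict:
    """Build a date attribute; raises ValueError if min_date is not a valid ISO 8601 date."""
    date.fromisoformat(min_date)  # validation only, the string is stored unchanged
    return {"type": "date", "min_date": min_date, "map_to": map_to}


schema = {
    "schema": {
        "user": {
            "labels": ["fake", "not_fake"],
            "date_of_birth": date_attribute("1960-01-01", "user_date_of_birth"),
        },
        "book": {},
        "interactions": {},
    }
}

# json.dumps keeps insertion order, so `user` stays ahead of the other objects.
print(json.dumps(schema, indent=4))
```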

Interactions

Interactions help to describe connections between the user, other users, and other objects in your schema. The interaction comprises a name and optional attributes.

They appear in the JSON file like this:

{
    "schema": {
        "user": {
            "labels": ["fake", "not_fake"],
            "date_of_birth": {
                "type": "date"
            }
        },
        "book": {},
        "interactions": {}
    }
}

Further Reading

Creating a JSON Schema