Models

Model Creation and Category Selection

Thanks to straightforward access to quality normalized data, model contributors can start creating models without a large barrier to entry. This eases onboarding and makes it easy for students and future model contributors to gain access to testing data.

For example, let’s break down a potential category:

Uniswap v3 (the Web3 application)
API3/ETH Liquidity Pool (specific liquidity pool in Uniswap)
Optimal Ranges For Best APY (what to optimize for)
Low Risk/Medium Risk/High Risk (the categories for the output schema)

Thus, a model created for this category must optimize the Uniswap v3 ranges(14) for the API3(15)/ETH(16) liquidity pool, finding the ranges in which users can provide liquidity for the most profit at their chosen risk level (low/medium/high). The model must provide all three outputs as defined by the output schema (see below for more detail) and weight them as evenly as it can, so that it optimizes for all of them at once.
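As a rough illustration, the category breakdown above and its three-part output could be represented as follows. This is a minimal sketch, not Credmark's actual schema: the names Category, PriceRange, and validate_output are hypothetical, and the price bounds are placeholder values.

```python
from dataclasses import dataclass

# Risk categories named in the example category's output schema.
RISK_LEVELS = ("low", "medium", "high")


@dataclass(frozen=True)
class Category:
    """Hypothetical description of a model category."""
    application: str            # the Web3 application, e.g. "Uniswap v3"
    pool: str                   # the specific liquidity pool, e.g. "API3/ETH"
    objective: str              # what the model optimizes for
    risk_levels: tuple = RISK_LEVELS


@dataclass(frozen=True)
class PriceRange:
    """Hypothetical liquidity range recommended by a model."""
    lower: float
    upper: float


def validate_output(category: Category, output: dict) -> None:
    """Raise ValueError unless output has one valid range per risk level."""
    missing = set(category.risk_levels) - set(output)
    extra = set(output) - set(category.risk_levels)
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for level, rng in output.items():
        if not rng.lower < rng.upper:
            raise ValueError(f"{level}: lower bound must be below upper bound")


category = Category("Uniswap v3", "API3/ETH", "Optimal Ranges For Best APY")

# Placeholder numbers for illustration only, not real market data.
output = {
    "low": PriceRange(0.0010, 0.0030),
    "medium": PriceRange(0.0015, 0.0025),
    "high": PriceRange(0.0018, 0.0022),
}
validate_output(category, output)  # passes: all three outputs are present
```

A model that returned only one or two of the three ranges would be rejected by this check, which mirrors the requirement that every model in the category provides all three outputs.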

These categories can be quite specific, allowing models to optimize on a per-category basis. This means that the set of potential categories will grow larger and more complex with time; Uniswap alone can theoretically provide an unlimited number of liquidity pools. To attract the first contributors, the initial set of categories will be small so that contributors can grow accustomed to the platform. Further categories will be added by governance based on demand.

Once the contributor chooses which category they wish to build a model with, the Credmark platform will provide them with relevant data. It will then be up to the creator to use their domain knowledge to create the model itself, using industry-standard tools accepted on the Credmark platform.

Model Submission Process

Once a model contributor has created their model for a given category, they have the ability to submit it to the platform to be run and potentially earn income.

Because the models must be run and assessed 24/7, the model submission process also requires staking CMK tokens. This helps prevent Sybil attacks on the model submission process and stops bad actors from abusing the system by submitting useless models that run endlessly with no hope or intention of being accurate.

Such attacks attempt to waste resources and break the protocol. CMK staking creates a small barrier to entry, which reduces the economic incentive to attack the protocol and increases token demand. As model contributors realize the opportunity to earn income from the Credmark platform, the demand for CMK to submit models will continue to grow.

When a model is submitted, contextual information and metadata, together with the staked CMK tokens, are sent to the on-chain model attestation registry, which acts as an authoritative on-chain reference of the current set of actively running models. Extra metadata is added to streamline integration by giving web3 applications the information they require. This includes the category of the model, the model ID used to make requests for specific models, and the post-process schema of the data. These attributes are key to ensuring both transparency and composability with other DeFi infrastructure.
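The submission step can be sketched with an in-memory stand-in for the registry. This is only an illustration of the data described above: the real registry is on-chain, and the class names, field names, minimum-stake value, and identifiers below are all assumptions, since the text does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """Hypothetical record sent to the registry with a submission."""
    model_id: str          # used to make requests for a specific model
    category: str          # category the model competes in
    output_schema: tuple   # post-process schema of the data
    staked_cmk: int        # CMK tokens staked with the submission
    submitter: str         # placeholder for the contributor's address


class AttestationRegistry:
    """In-memory stand-in for the on-chain model attestation registry."""

    def __init__(self, min_stake: int):
        # Hypothetical parameter: the source does not state a stake amount.
        self.min_stake = min_stake
        self._active = {}

    def submit(self, att: Attestation) -> None:
        """Register a model if it carries enough stake and a fresh ID."""
        if att.staked_cmk < self.min_stake:
            raise ValueError("insufficient CMK stake")
        if att.model_id in self._active:
            raise ValueError("model ID already registered")
        self._active[att.model_id] = att

    def get(self, model_id: str) -> Attestation:
        return self._active[model_id]

    def active_models(self, category: str) -> list:
        """Return the currently active models for one category."""
        return [a for a in self._active.values() if a.category == category]


registry = AttestationRegistry(min_stake=100)
registry.submit(Attestation(
    model_id="model-001",
    category="uniswap-v3/API3-ETH/optimal-ranges",
    output_schema=("low", "medium", "high"),
    staked_cmk=150,
    submitter="0xSUBMITTER",
))
print(len(registry.active_models("uniswap-v3/API3-ETH/optimal-ranges")))  # 1
```

Rejecting under-staked submissions at the point of registration is what makes spamming the platform with useless models costly.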

Once a model has been successfully submitted to the model attestation registry, it can be accepted by the Credmark platform and start being tested using the latest data every day. As a result, the model contributor who submitted the model now has the opportunity to earn income if their model is one of the best in the category.

Transparent Verifiable Model Execution Platform

Every model that has been attested on-chain and conforms to the standards set out by Credmark is run on Credmark’s transparent and verifiable model execution platform.

This platform builds on infrastructure that provides reproducibility and trustlessness, which is made possible by using Nix as the foundation of the platform architecture.

Nix(17) is a package manager and build system that enables the creation of reproducible builds across immutable systems. Packages built with Nix have unique hashes that refer to the installed software. If even one line of code is changed in any part of the package, the entire hash changes, producing immutable builds that can be verified.
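The property that any change alters the hash can be illustrated with an ordinary content hash. The sketch below uses SHA-256 from Python's standard library over a package's files; it is an analogy only and does not reproduce how Nix actually computes its hashes. The function name package_hash and the file contents are hypothetical.

```python
import hashlib


def package_hash(files: dict) -> str:
    """Hash every file name and its contents in a deterministic order."""
    h = hashlib.sha256()
    for name in sorted(files):
        data = files[name]
        encoded_name = name.encode()
        # Length prefixes keep name/content boundaries unambiguous.
        h.update(len(encoded_name).to_bytes(8, "big"))
        h.update(encoded_name)
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


original = {"model.py": b"def predict(x):\n    return x * 2\n"}
modified = {"model.py": b"def predict(x):\n    return x * 3\n"}

print(package_hash(original) == package_hash(original))  # True: same input, same hash
print(package_hash(original) == package_hash(modified))  # False: one character changed
```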

Nix is key to minimizing the trust required when running models. Every model attestation submitted on-chain to the attestation registry will include the Nix hash of the model, as well as its underlying dependencies. This provides transparency into exactly what is being run on Credmark’s execution platform.

With the Nix hash accessible to everyone, any user can read it from the attestation registry, verify a local environment, acquire the data from the platform, feed it into the model, and finally verify that the results produced by Credmark’s execution platform are accurate.
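The verification steps in the preceding paragraph can be sketched as a single check. This is a simplified outline under stated assumptions: the function name verify_execution, the toy model, and the hash strings are hypothetical placeholders, and obtaining the attested hash, the local build hash, and the platform data is left outside the sketch.

```python
def verify_execution(attested_hash, local_build_hash, model, data, published_result):
    """Return True only if the local build matches the attestation and
    re-running the model on the same data reproduces the published result."""
    if local_build_hash != attested_hash:
        return False  # the local environment is not the attested one
    return model(data) == published_result


# Toy stand-ins for illustration; a real model and real data would replace these.
def toy_model(prices):
    return {"low": min(prices), "high": max(prices)}


data = [3, 1, 2]
published = {"low": 1, "high": 3}

print(verify_execution("abc123", "abc123", toy_model, data, published))  # True
print(verify_execution("abc123", "def456", toy_model, data, published))  # False
```

The exact equality check relies on the build being reproducible: the same attested environment and the same input data are expected to yield the same output.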

Rather than merely trusting that the execution platform is doing its job properly, Credmark pushes the adage “don’t trust, verify” to its natural limit for model execution.

This same infrastructure performs validation on the performance of competing models in a given category, allowing the best models to rise to the top and to provide risk analysis recommendations to end users.

Performance validation occurs via three methods:

Historical performance validation.
Edge case performance validation.
Live performance validation.

Historical performance validation is the simplest of the three. When a model is submitted to the platform, it’s run against all available historical data for its given category. This establishes an initial baseline for the accuracy of the model.

However, this initial performance validation should not be trusted on its own or weighted too heavily, because it’s easy for model contributors to create models that overfit the training data. In other words, such a model works seamlessly on historical data but is too specialized, and thus has inadequate predictive capability in both edge cases and real-world scenarios. These weaknesses are addressed by the following two performance validation methods.

Edge case performance validation tests models against preselected edge cases for their given category (the edge cases are defined when the category is created). This is weighted separately from the historical performance validation to ensure that the model can properly account for novel scenarios such as runs and bear markets. It further selects for models that best account for all scenarios, rather than only maximizing bull market gains. These edge case metrics allow users to measure model performance on heavily skewed data and outliers.

Lastly, we have live performance validation. While it’s useful to take into account the model’s effectiveness on past data, there is no substitute for running the model in real-time and capturing its performance as it runs on live data. After a model has been submitted, it goes through live performance validation for four weeks to establish the effectiveness and robustness of the model.

Live performance validation is where all over-fitted models face reality while the cream of the crop are elevated for use by end users. Four weeks is an ample amount of time in the cryptocurrency space, with its wide and rapid fluctuations, and will be a sufficient run-up period to evaluate the performance of new models.
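One way to picture how the three validation methods combine is a weighted score that only counts once the four-week live period is complete. The sketch below is an assumption-laden illustration: the text does not give weights or a scoring formula, so the WEIGHTS values, the 0-to-1 score scale, and all names and scores here are hypothetical.

```python
from dataclasses import dataclass

LIVE_VALIDATION_DAYS = 28  # four weeks of live validation, per the text

# Hypothetical weights: the source only says the methods are weighted
# separately and that historical results should not count too heavily.
WEIGHTS = {"historical": 0.2, "edge_case": 0.3, "live": 0.5}


@dataclass(frozen=True)
class ValidationScores:
    historical: float  # score in [0, 1] on all available historical data
    edge_case: float   # score in [0, 1] on the category's edge cases
    live: float        # score in [0, 1] on live data so far
    live_days: int     # days of live validation completed


def combined_score(s: ValidationScores):
    """Return the weighted score, or None while live validation is incomplete."""
    if s.live_days < LIVE_VALIDATION_DAYS:
        return None
    return (WEIGHTS["historical"] * s.historical
            + WEIGHTS["edge_case"] * s.edge_case
            + WEIGHTS["live"] * s.live)


def rank(models: dict) -> list:
    """Order eligible model names from best to worst combined score."""
    scored = {name: combined_score(s) for name, s in models.items()}
    eligible = {name: v for name, v in scored.items() if v is not None}
    return sorted(eligible, key=eligible.get, reverse=True)


# Illustrative scores only, not real results.
models = {
    "overfit": ValidationScores(0.99, 0.40, 0.35, live_days=28),
    "robust": ValidationScores(0.80, 0.75, 0.78, live_days=28),
    "new": ValidationScores(0.90, 0.90, 0.90, live_days=5),
}
print(rank(models))  # ['robust', 'overfit']
```

With these placeholder weights, the overfit model's near-perfect historical score cannot make up for its weak edge case and live results, and the newly submitted model is not ranked until its run-up period ends.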

