Concepts

Models are the core resources of AI workloads, and inference is the core AI workload type on Neutree.

Model registry

A model registry stores models. Neutree supports public model registries of the Hugging Face type and private model registries based on a file system.

Model catalog

The model catalog compiles best-practice parameters for commonly used models, with preset inference engines, resource specifications, and runtime parameters. It aims to standardize model deployment and reduce operational costs.

Engine

An engine is a built-in code framework within Neutree for running models. It includes multiple configurable parameters and provides functions such as model loading, accelerator adaptation, inference optimization, and OpenAI-compatible APIs.

Endpoint

An endpoint is the specific deployment entity for a model inference service. Each endpoint corresponds to an independent inference service that provides OpenAI-compatible API interfaces to external clients. An endpoint contains a complete inference service runtime environment, including model selection, inference engine configuration, and inference parameter settings.