Basic Concepts

Models are the core resources for AI workloads, and inference is the primary AI workload type in Neutree.

A model registry stores models. Neutree supports public registries of the Hugging Face type as well as private registries backed by a file system.

The model catalog provides best-practice configurations for commonly used models, with preset inference engines, resource specifications, and runtime parameters. It standardizes model deployment and reduces operational cost.
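As a rough sketch of the idea, a catalog entry bundles the three preset sections described above. The field names and values below are hypothetical illustrations, not Neutree's actual schema:

```python
# Hypothetical model catalog entry: keys and values are illustrative
# assumptions, not Neutree's real schema.
catalog_entry = {
    "model": "llama-3-8b-instruct",    # assumed model name
    "engine": {
        "name": "vllm",                # assumed engine identifier
        "version": "latest",
    },
    "resources": {
        "accelerator": "gpu",
        "accelerator_count": 1,
        "memory": "24Gi",
    },
    "runtime": {
        "max_model_len": 8192,
        "tensor_parallel_size": 1,
    },
}

def validate_entry(entry: dict) -> bool:
    """Check that an entry carries the three preset sections a
    catalog is expected to standardize."""
    return all(k in entry for k in ("engine", "resources", "runtime"))

print(validate_entry(catalog_entry))  # -> True
```

The point of such an entry is reuse: the same vetted configuration is applied every time the model is deployed, instead of re-deriving engine and resource choices per deployment.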

An inference engine is a built-in framework in Neutree for running models. It exposes configurable parameters and provides features such as model loading, accelerator adaptation, inference optimization, and OpenAI-compatible APIs.

An endpoint is a deployed instance of a model inference service. Each endpoint corresponds to an independent inference service that exposes OpenAI-compatible API interfaces. An endpoint encapsulates the complete runtime environment for that service: model selection, inference engine configuration, and inference parameter settings.
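Because each endpoint serves an OpenAI-compatible API, any standard OpenAI client can talk to it. A minimal sketch using only the Python standard library, where the endpoint URL and model name are assumptions to substitute with your deployment's values:

```python
import json
import urllib.request

# Assumed values -- replace with your endpoint's URL and deployed model name.
ENDPOINT_URL = "http://localhost:8000/v1/chat/completions"
MODEL_NAME = "llama-3-8b-instruct"

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat-completions request for the endpoint."""
    body = {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Build (but do not send) a request; sending requires a running endpoint.
req = build_chat_request("Hello!")
payload = json.loads(req.data)
print(payload["messages"][0]["content"])  # -> Hello!
```

In practice you would send `req` with `urllib.request.urlopen(req)` (or use the `openai` client library pointed at the endpoint's base URL) against a running endpoint.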