Concepts
Models are the core resources of AI workloads, and inference is the core AI workload type on Neutree.
Model registry
Section titled “Model registry”A model registry stores models. Neutree supports public model registries of the Hugging Face type and private model registries based on a file system.
Model catalog
Section titled “Model catalog”The model catalog compiles best-practice parameters for commonly used models, with preset inference engines, resource specifications, and runtime parameters. It aims to standardize model deployment and reduce operational costs.
Engine
Section titled “Engine”An engine is a built-in code framework within Neutree for running models. It includes multiple configurable parameters and provides functions such as model loading, accelerator adaptation, inference optimization, and OpenAI-compatible APIs.
Endpoint
Section titled “Endpoint”An endpoint is the specific deployment entity for a model inference service. Each endpoint corresponds to an independent inference service that provides OpenAI-compatible API interfaces to external clients. An endpoint contains a complete inference service runtime environment, including model selection, inference engine configuration, and inference parameter settings.