Basic Concepts
Models are the core resources for AI workloads in Neutree, and inference is the primary AI workload type.
Model Registry
A model registry stores models. Neutree supports Hugging Face-style public model registries and file-system-based private model registries.
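The distinction between the two registry types can be sketched as a small resolution helper. This is illustrative only, under the assumption that file-system registries are addressed by local paths and Hugging Face registries by `org/name` identifiers; the function name and logic are not part of Neutree.

```python
import os

def resolve_registry_type(ref: str) -> str:
    """Classify a model reference as belonging to a file-system-based
    private registry or a Hugging Face-style public registry.
    Illustrative sketch; not Neutree's actual resolution logic."""
    # An absolute or locally existing path points at a private,
    # file-system-based registry.
    if os.path.isabs(ref) or os.path.exists(ref):
        return "filesystem"
    # Treat "org/name" identifiers as public Hugging Face models.
    if "/" in ref:
        return "huggingface"
    raise ValueError(f"unrecognized model reference: {ref}")
```

For example, `resolve_registry_type("meta-llama/Llama-3-8B")` classifies as a Hugging Face reference, while `resolve_registry_type("/models/llama-3-8b")` classifies as a file-system reference.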
Model Catalog
The model catalog provides best-practice configurations for commonly used models, with preset inference engines, resource specifications, and runtime parameters. It aims to standardize model deployment and reduce operational costs.
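A catalog entry bundles the three preset groups named above. The sketch below is hypothetical: every field name, the engine name, and the values are assumptions for illustration and do not reflect Neutree's actual catalog schema.

```python
# Hypothetical catalog entry; field names and values are illustrative
# and are NOT Neutree's actual schema.
CATALOG_ENTRY = {
    "model": "meta-llama/Llama-3-8B-Instruct",            # model identifier (assumed)
    "engine": {"name": "vllm", "version": "0.6"},         # preset inference engine (assumed)
    "resources": {"gpu": 1, "cpu": 4, "memory_gb": 32},   # resource specification
    "runtime": {"max_model_len": 8192, "dtype": "bfloat16"},  # runtime parameters
}

def is_complete_entry(entry: dict) -> bool:
    """Check that an entry presets all the config groups the catalog
    standardizes: model, engine, resources, and runtime parameters."""
    return all(key in entry for key in ("model", "engine", "resources", "runtime"))
```

Standardizing on one entry shape like this is what lets a catalog reduce per-deployment decisions to picking an entry rather than hand-tuning each field.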
Inference Engine
An inference engine is a built-in framework in Neutree for running models. It exposes configurable parameters and provides features such as model loading, accelerator adaptation, inference optimization, and OpenAI-compatible APIs.
Endpoint
An endpoint is a deployed instance of a model inference service. Each endpoint corresponds to an independent inference service that exposes OpenAI-compatible API interfaces. An endpoint carries the complete runtime environment for the service: model selection, inference engine configuration, and inference parameter settings.
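Because an endpoint speaks the OpenAI-compatible API, a standard `/v1/chat/completions` request works against it. The sketch below uses only the Python standard library; the base URL and model name are placeholders, not real Neutree values.

```python
import json
import urllib.request

def chat_completion_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a standard OpenAI-compatible chat-completions request.
    base_url and model are placeholders for a real deployed endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending the request requires a running endpoint, e.g.:
# with urllib.request.urlopen(chat_completion_request(
#         "http://my-endpoint.example.com", "my-model", "Hello")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Any OpenAI-compatible client library can be pointed at the endpoint the same way, by overriding the client's base URL.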