Neutree Documentation

What's in this release

  • Supports configuring external endpoints.
    • Supports proxying external APIs (such as OpenAI) through the AI gateway with unified authentication and usage statistics.
    • Compatible with the Anthropic protocol.
  • Adds the following features to endpoints:
    • Adds a replica selector to the monitoring page for viewing metrics by replica.
    • Adds auto-refresh for endpoint logs.
    • Compatible with the Anthropic protocol.
  • Static node clusters support running different engine versions simultaneously.
  • Supports online cluster version upgrades for both static node clusters and Kubernetes clusters, without recreating the clusters.
  • Supports login with username in addition to email.
  • Supports NFS model cache integrity verification to ensure model data consistency.
  • Supports search and batch deletion functions on the resource list page.
  • Supports importing engine version metadata from a standalone manifest.yaml file without downloading the full engine image.
  • Adds a quick-start wizard that guides new users through core features on first login.
  • Adds the following parameter and commands to the CLI tool:
    • Adds the --registry-project parameter to specify the image registry project name.
    • Adds apply, get, wait, delete, and cleanup commands for declarative resource management.
    • Adds the engine remove-version command to delete custom engine versions.
  • Adds vLLM engine versions v0.11.2 and v0.17.1 as built-in options.
  • Upgrades Ray to v2.53.0 for static node clusters.
  • Cluster status, which is determined by spec hash comparison, now includes the Updating, Upgrading, and Deleting states for more accurate status determination.
  • Enhances Grafana dashboard theme styling with custom CSS injection.
  • Displays token usage in compact notation (K/M/B).
  • The endpoint list is sorted by running status by default.
  • Supports dynamic browser tab icon updates when customizing the platform appearance.
  • Supports automatic selection of dependent permissions when assigning permissions to roles.
  • Restricts PostgREST anonymous role permissions and optimizes container security configurations to enhance overall system security.
  • Reduces Ray Object Store memory from 30% to 10%.
  • Switches GPU detection from nvidia-smi to lspci to avoid driver loading race conditions.
  • When using the vLLM engine with multiple accelerators configured, the system automatically sets the engine variable tensor_parallel_size to the number of accelerators, eliminating the need for manual configuration.
  • Optimizes the file descriptor limit (ulimit nofile) configuration for control plane containers to improve system stability.
  • The handling of JSON values in engine_args for Kubernetes and SSH/Ray paths was inconsistent. The issue has been resolved in this release.
  • The race condition in Ray Serve concurrent deployment has been resolved in this release.
  • Pod-level labels caused duplication of DCGM metrics. The issue has been resolved in this release.
  • GGUF model files in subdirectories were not discovered. The issue has been resolved in this release.
  • The file filter was applied to non-GGUF models in downloaders. The issue has been resolved in this release.
  • Model name and version were not correctly updated during push. The issue has been resolved in this release.
  • The detection of unhealthy endpoints was inaccurate. The issue has been resolved in this release.
  • The image registry URL caused an exception when the URL included a scheme prefix. The issue has been resolved in this release.
  • Accelerator data format validation was missing during endpoint import. The issue has been resolved in this release.
  • Repeated image extraction occurred when uploading images with the CLI tool. The issue has been resolved in this release.
  • PostgreSQL pods were recreated during minor version upgrades. The issue has been resolved in this release.
  • The SSH cluster node recovery status was not written correctly. The issue has been resolved in this release.
  • A nil map panic occurred when DeploymentOptions was unset. The issue has been resolved in this release.
  • The form retained the previous template configuration after switching the model catalog template for an endpoint. The issue has been resolved in this release.
  • Worker nodes could not be added when editing a static node cluster. The issue has been resolved in this release.
  • The accelerator count was displayed as a negative number. The issue has been resolved in this release.
  • The workspace filter criteria were unexpectedly changed when editing resources. The issue has been resolved in this release.
  • Available resources were displayed as exceeding the total amount. The issue has been resolved in this release.
  • Automatic recovery was not triggered when the Raylet process on the head node of a static node cluster exited while the dashboard remained accessible. The issue has been resolved in this release.
  • Temporary files overwrote each other when the CLI tool import command was run concurrently. The issue has been resolved in this release.
  • Endpoints failed to start because the engine could not recognize engine variables. The issue has been resolved in this release.
  • After deleting an endpoint in a Kubernetes cluster, a re-created endpoint with the same name was skipped. The issue has been resolved in this release.
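
For readers unfamiliar with the compact notation (K/M/B) mentioned above for token usage, the following is a minimal sketch of how such formatting might work. The function name compact_tokens, the single decimal place, and the choice to truncate rather than round are assumptions made for illustration; the release notes do not specify how the platform formats these values.

```python
def compact_tokens(count: int) -> str:
    """Format a non-negative token count in K/M/B notation.

    Hypothetical helper: keeps at most one decimal digit and truncates
    instead of rounding, so 999_999 stays below the next unit.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    units = ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K"))
    for threshold, suffix in units:
        if count >= threshold:
            # Integer arithmetic avoids floating-point surprises.
            tenths = count * 10 // threshold
            whole, frac = divmod(tenths, 10)
            number = str(whole) if frac == 0 else f"{whole}.{frac}"
            return f"{number}{suffix}"
    # Values below one thousand are shown unchanged.
    return str(count)


if __name__ == "__main__":
    for sample in (999, 1_500, 2_000_000, 1_234_567_890):
        print(sample, "->", compact_tokens(sample))
```

Truncation is used here so that a value just under a unit boundary is never displayed as the next unit; a real implementation may round differently.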