Configure Zhipu AI
Zhipu AI is a leading cognitive intelligence large model company in China, providing the GLM-4 series of large language model services. GLM-4 supports ultra-long context, multimodal understanding, high-performance reasoning, and other features, widely used in dialogue, text generation, code assistance, and other scenarios.
1. Get Zhipu AI API Key
1.1 Access AI Open Platform
Visit the AI Open Platform and log in: https://open.bigmodel.cn/

1.2 Go to API Management Page
Click the user avatar in the upper right corner, select API Management.

1.3 Create a New API Key
Click the Create API Key button.

1.4 Set API Key Information
In the popup dialog:
- Enter a name for the API Key (e.g., CueMate)
- Select applicable models
- Click the Confirm button

1.5 Copy API Key
After successful creation, the system will display the API Key.
Important: This is the only time you can see the complete API Key, please copy and save it securely immediately.

Click the copy button, and the API Key will be copied to your clipboard.
2. Configure Zhipu AI Model in CueMate
2.1 Go to Model Settings Page
After logging into the CueMate system, click Model Settings in the dropdown menu in the upper right corner.

2.2 Add New Model
Click the Add Model button in the upper right corner.

2.3 Select Zhipu AI Provider
In the popup dialog:
- Provider Type: Select Zhipu AI
- Click to automatically proceed to the next step

2.4 Fill in Configuration Information
Fill in the following information on the configuration page:
Basic Configuration
- Model Name: Give this model configuration a name (e.g., Zhipu GLM-4 Plus)
- API URL: Keep the default
https://open.bigmodel.cn/api/paas/v4(OpenAI compatible format) - API Key: Paste the Zhipu AI API Key you just copied
- Model Version: Select the model ID to use, common models include:
glm-4-plus: Most powerful model, suitable for complex reasoning and deep analysis (max output 8K)glm-4-long: Ultra-long text processing, supports 1M context (max output 8K)glm-4-air: Lightweight efficient version, fast response (max output 8K)glm-4-airx: Ultra-fast version, ultra-low latency (max output 8K)glm-4-flash: Lightning response, real-time dialogue (max output 8K)glm-4: Standard version, balances performance and cost (max output 8K)glm-4v: Multimodal model, supports image understanding (max output 8K)glm-4v-plus: Multimodal enhanced version, supports 2-hour video, 4K images (max output 8K)glm-3-turbo: Affordable version, daily dialogue (max output 4K)

Advanced Configuration (Optional)
Expand the Advanced Configuration panel to adjust the following parameters:
Parameters adjustable in CueMate interface:
Temperature: Controls output randomness
- Range: 0-1
- Recommended Value: 0.7
- Function: Higher values produce more random and creative output, lower values produce more stable and conservative output
- Usage Suggestions:
- Creative writing/brainstorming: 0.8-0.95
- Regular conversation/Q&A: 0.6-0.8
- Code generation/precise tasks: 0.3-0.5
- Note: Zhipu AI's temperature range is 0-1, different from OpenAI's 0-2
Max Tokens: Limits single output length
- Range: 256 - 8192 (depending on model)
- Recommended Value: 4096
- Function: Controls the maximum word count per model response
- Model Limits:
- GLM-4 series: Max 8K tokens
- GLM-3-turbo: Max 4K tokens
- Usage Suggestions:
- Brief Q&A: 1024-2048
- Regular conversation: 4096-8192
- Long text generation: 8192 (maximum)

Other advanced parameters supported by Zhipu AI API:
While the CueMate interface only provides temperature and max_tokens adjustments, if you call Zhipu AI directly via API, you can also use the following advanced parameters (Zhipu AI uses OpenAI compatible API format):
top_p (nucleus sampling)
- Range: 0-1
- Default: 0.7
- Function: Samples from the smallest candidate set whose cumulative probability reaches p
- Relationship with temperature: Can be used together
- Usage Suggestions:
- Maintain diversity: 0.7-0.95
- More conservative output: 0.5-0.7
do_sample
- Type: Boolean
- Default: true
- Function: Enables random sampling (set to false for greedy decoding)
- Use Cases:
- Creative tasks: true (enable sampling)
- Deterministic tasks: false (greedy decoding)
stop (stop sequences)
- Type: String or array
- Default: null
- Maximum: 4 strings
- Function: Stops when generated content contains specified strings
- Example:
["###", "User:", "\n\n"] - Use Cases:
- Structured output: Use delimiters to control format
- Dialogue systems: Prevent model from speaking for user
stream (streaming output)
- Type: Boolean
- Default: false
- Function: Enables SSE streaming return, returns as it generates
- In CueMate: Automatically handled, no manual setting needed
tools (tool calling)
- Type: Object array
- Function: Defines tools/functions the model can call
- Use Cases: Function Calling, Agent applications
- Example:json
{ "tools": [ { "type": "function", "function": { "name": "get_weather", "description": "Get weather for specified city", "parameters": { "type": "object", "properties": { "city": {"type": "string"} } } } } ] }
Zhipu AI Special Parameters:
- request_id
- Type: String
- Function: User-provided unique ID for request tracking
- Usage Suggestion: Pass unique identifier for tracking and debugging
| No. | Scenario | temperature | max_tokens | top_p | do_sample | stop |
|---|---|---|---|---|---|---|
| 1 | Creative Writing | 0.8-0.95 | 4096-8192 | 0.9 | true | null |
| 2 | Code Generation | 0.2-0.5 | 2048-4096 | 0.7 | true | null |
| 3 | Q&A System | 0.6-0.8 | 1024-2048 | 0.7 | true | null |
| 4 | Summarization | 0.3-0.5 | 512-1024 | 0.7 | true | null |
| 5 | Deterministic Tasks | 0 | 2048 | 1.0 | false | null |
2.5 Test Connection
After filling in the configuration, click the Test Connection button to verify if the configuration is correct.

If the configuration is correct, a success message will be displayed along with a sample response from the model.

If the configuration is incorrect, an error log will be displayed, and you can view specific error information through log management.
2.6 Save Configuration
After successful testing, click the Save button to complete the model configuration.

3. Use the Model
Go to the system settings page through the dropdown menu in the upper right corner, and select the model configuration you want to use in the LLM provider section.
After configuration, you can select this model in features like interview training and question generation. You can also select this model configuration for a specific interview in the interview options.

4. Supported Model List
| No. | Model Name | Model ID | Max Output | Use Case |
|---|---|---|---|---|
| 1 | GLM-4 Plus | glm-4-plus | 8K tokens | Most powerful version, complex reasoning |
| 2 | GLM-4 Long | glm-4-long | 8K tokens | Long text processing, supports 1M context |
| 3 | GLM-4 Air | glm-4-air | 8K tokens | Lightweight, fast response |
| 4 | GLM-4 AirX | glm-4-airx | 8K tokens | Ultra-fast version, ultra-low latency |
| 5 | GLM-4 Flash | glm-4-flash | 8K tokens | Real-time dialogue, lightning response |
| 6 | GLM-4 | glm-4 | 8K tokens | Standard version, technical interviews |
| 7 | GLM-4V | glm-4v | 8K tokens | Multimodal, supports image understanding |
| 8 | GLM-4V Plus | glm-4v-plus | 8K tokens | Multimodal enhanced, supports 2-hour video |
| 9 | GLM-3 Turbo | glm-3-turbo | 4K tokens | Affordable, regular dialogue |
5. FAQ
5.1 Invalid API Key
Symptom: API Key error when testing connection
Solutions:
- Check if API Key is completely copied
- Confirm API Key has not expired or been disabled
- Check if account has available credits
5.2 Request Timeout
Symptom: Long wait time with no response when testing connection or using the model
Solutions:
- Check if network connection is normal
- Check firewall settings
- Confirm Zhipu AI service status is normal
5.3 Insufficient Quota
Symptom: Quota exhausted message
Solutions:
- Log in to Zhipu AI platform to check account balance
- Top up or request more quota
- Optimize usage frequency
