Configure Zhipu AI

Zhipu AI is a leading cognitive intelligence large model company in China, providing the GLM-4 series of large language model services. GLM-4 supports ultra-long context, multimodal understanding, high-performance reasoning, and other features, widely used in dialogue, text generation, code assistance, and other scenarios.

1. Get Zhipu AI API Key

1.1 Access AI Open Platform

Visit the AI Open Platform and log in: https://open.bigmodel.cn/

Access AI Platform

1.2 Go to API Management Page

Click the user avatar in the upper right corner, select API Management.

Go to API Management Page

1.3 Create a New API Key

Click the Create API Key button.

Click Create Button

1.4 Set API Key Information

In the popup dialog:

Enter a name for the API Key (e.g., CueMate)
Select applicable models
Click the Confirm button

Set API Key Information

1.5 Copy API Key

After successful creation, the system will display the API Key.

Important: This is the only time you can see the complete API Key, please copy and save it securely immediately.

Copy API Key

Click the copy button, and the API Key will be copied to your clipboard.

2. Configure Zhipu AI Model in CueMate

2.1 Go to Model Settings Page

After logging into the CueMate system, click Model Settings in the dropdown menu in the upper right corner.

Go to Model Settings

2.2 Add New Model

Click the Add Model button in the upper right corner.

Click Add Model

2.3 Select Zhipu AI Provider

In the popup dialog:

Provider Type: Select Zhipu AI
Click to automatically proceed to the next step

Select Zhipu AI

2.4 Fill in Configuration Information

Fill in the following information on the configuration page:

Basic Configuration

Model Name: Give this model configuration a name (e.g., Zhipu GLM-4 Plus)
API URL: Keep the default https://open.bigmodel.cn/api/paas/v4 (OpenAI compatible format)
API Key: Paste the Zhipu AI API Key you just copied
Model Version: Select the model ID to use, common models include:
- glm-4-plus: Most powerful model, suitable for complex reasoning and deep analysis (max output 8K)
- glm-4-long: Ultra-long text processing, supports 1M context (max output 8K)
- glm-4-air: Lightweight efficient version, fast response (max output 8K)
- glm-4-airx: Ultra-fast version, ultra-low latency (max output 8K)
- glm-4-flash: Lightning response, real-time dialogue (max output 8K)
- glm-4: Standard version, balances performance and cost (max output 8K)
- glm-4v: Multimodal model, supports image understanding (max output 8K)
- glm-4v-plus: Multimodal enhanced version, supports 2-hour video, 4K images (max output 8K)
- glm-3-turbo: Affordable version, daily dialogue (max output 4K)

Fill in Basic Configuration

Advanced Configuration (Optional)

Expand the Advanced Configuration panel to adjust the following parameters:

Parameters adjustable in CueMate interface:

Temperature: Controls output randomness
- Range: 0-1
- Recommended Value: 0.7
- Function: Higher values produce more random and creative output, lower values produce more stable and conservative output
- Usage Suggestions:
  - Creative writing/brainstorming: 0.8-0.95
  - Regular conversation/Q&A: 0.6-0.8
  - Code generation/precise tasks: 0.3-0.5
- Note: Zhipu AI's temperature range is 0-1, different from OpenAI's 0-2
Max Tokens: Limits single output length
- Range: 256 - 8192 (depending on model)
- Recommended Value: 4096
- Function: Controls the maximum word count per model response
- Model Limits:
  - GLM-4 series: Max 8K tokens
  - GLM-3-turbo: Max 4K tokens
- Usage Suggestions:
  - Brief Q&A: 1024-2048
  - Regular conversation: 4096-8192
  - Long text generation: 8192 (maximum)

Advanced Configuration

Other advanced parameters supported by Zhipu AI API:

While the CueMate interface only provides temperature and max_tokens adjustments, if you call Zhipu AI directly via API, you can also use the following advanced parameters (Zhipu AI uses OpenAI compatible API format):

top_p (nucleus sampling)
- Range: 0-1
- Default: 0.7
- Function: Samples from the smallest candidate set whose cumulative probability reaches p
- Relationship with temperature: Can be used together
- Usage Suggestions:
  - Maintain diversity: 0.7-0.95
  - More conservative output: 0.5-0.7
do_sample
- Type: Boolean
- Default: true
- Function: Enables random sampling (set to false for greedy decoding)
- Use Cases:
  - Creative tasks: true (enable sampling)
  - Deterministic tasks: false (greedy decoding)
stop (stop sequences)
- Type: String or array
- Default: null
- Maximum: 4 strings
- Function: Stops when generated content contains specified strings
- Example: ["###", "User:", "\n\n"]
- Use Cases:
  - Structured output: Use delimiters to control format
  - Dialogue systems: Prevent model from speaking for user
stream (streaming output)
- Type: Boolean
- Default: false
- Function: Enables SSE streaming return, returns as it generates
- In CueMate: Automatically handled, no manual setting needed

tools (tool calling)

Type: Object array
Function: Defines tools/functions the model can call
Use Cases: Function Calling, Agent applications

Example:

json

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather for specified city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string"}
          }
        }
      }
    }
  ]
}

Zhipu AI Special Parameters:

request_id
- Type: String
- Function: User-provided unique ID for request tracking
- Usage Suggestion: Pass unique identifier for tracking and debugging

No.	Scenario	temperature	max_tokens	top_p	do_sample	stop
1	Creative Writing	0.8-0.95	4096-8192	0.9	true	null
2	Code Generation	0.2-0.5	2048-4096	0.7	true	null
3	Q&A System	0.6-0.8	1024-2048	0.7	true	null
4	Summarization	0.3-0.5	512-1024	0.7	true	null
5	Deterministic Tasks	0	2048	1.0	false	null

2.5 Test Connection

After filling in the configuration, click the Test Connection button to verify if the configuration is correct.

Test Connection

If the configuration is correct, a success message will be displayed along with a sample response from the model.

Test Success

If the configuration is incorrect, an error log will be displayed, and you can view specific error information through log management.

2.6 Save Configuration

After successful testing, click the Save button to complete the model configuration.

Save Configuration

3. Use the Model

Go to the system settings page through the dropdown menu in the upper right corner, and select the model configuration you want to use in the LLM provider section.

After configuration, you can select this model in features like interview training and question generation. You can also select this model configuration for a specific interview in the interview options.

Select Model

4. Supported Model List

No.	Model Name	Model ID	Max Output	Use Case
1	GLM-4 Plus	`glm-4-plus`	8K tokens	Most powerful version, complex reasoning
2	GLM-4 Long	`glm-4-long`	8K tokens	Long text processing, supports 1M context
3	GLM-4 Air	`glm-4-air`	8K tokens	Lightweight, fast response
4	GLM-4 AirX	`glm-4-airx`	8K tokens	Ultra-fast version, ultra-low latency
5	GLM-4 Flash	`glm-4-flash`	8K tokens	Real-time dialogue, lightning response
6	GLM-4	`glm-4`	8K tokens	Standard version, technical interviews
7	GLM-4V	`glm-4v`	8K tokens	Multimodal, supports image understanding
8	GLM-4V Plus	`glm-4v-plus`	8K tokens	Multimodal enhanced, supports 2-hour video
9	GLM-3 Turbo	`glm-3-turbo`	4K tokens	Affordable, regular dialogue

5. FAQ

5.1 Invalid API Key

Symptom: API Key error when testing connection

Solutions:

Check if API Key is completely copied
Confirm API Key has not expired or been disabled
Check if account has available credits

5.2 Request Timeout

Symptom: Long wait time with no response when testing connection or using the model

Solutions:

Check if network connection is normal
Check firewall settings
Confirm Zhipu AI service status is normal

5.3 Insufficient Quota

Symptom: Quota exhausted message

Solutions:

Log in to Zhipu AI platform to check account balance
Top up or request more quota
Optimize usage frequency

Configure Zhipu AI

1. Get Zhipu AI API Key ​

1.1 Access AI Open Platform ​

1.2 Go to API Management Page ​

1.3 Create a New API Key ​

1.4 Set API Key Information ​

1.5 Copy API Key ​

2. Configure Zhipu AI Model in CueMate ​

2.1 Go to Model Settings Page ​

2.2 Add New Model ​

2.3 Select Zhipu AI Provider ​

2.4 Fill in Configuration Information ​

Basic Configuration ​

Advanced Configuration (Optional) ​

2.5 Test Connection ​

2.6 Save Configuration ​

3. Use the Model ​

4. Supported Model List ​

5. FAQ ​

5.1 Invalid API Key ​

5.2 Request Timeout ​

5.3 Insufficient Quota ​

Related Links ​

1. Get Zhipu AI API Key

1.1 Access AI Open Platform

1.2 Go to API Management Page

1.3 Create a New API Key

1.4 Set API Key Information

1.5 Copy API Key

2. Configure Zhipu AI Model in CueMate

2.1 Go to Model Settings Page

2.2 Add New Model

2.3 Select Zhipu AI Provider

2.4 Fill in Configuration Information

Basic Configuration

Advanced Configuration (Optional)

2.5 Test Connection

2.6 Save Configuration

3. Use the Model

4. Supported Model List

5. FAQ

5.1 Invalid API Key

5.2 Request Timeout

5.3 Insufficient Quota

Related Links