Configure Alibaba Cloud Bailian
Alibaba Cloud Bailian is an enterprise-level large model service platform launched by Alibaba Cloud, providing the Tongyi Qianwen (Qwen) series of models. It supports multilingual, multimodal, and ultra-long-context capabilities and is suited to intelligent customer service, content creation, code assistance, and other scenarios.
1. Get Alibaba Cloud Bailian API Key
1.1 Access Alibaba Cloud Bailian Platform
Visit the Alibaba Cloud Bailian (Tongyi Qianwen) platform and log in: https://dashscope.aliyun.com/

1.2 Go to API-KEY Management Page
After logging in, click the user avatar in the upper right corner and select API-KEY Management.

1.3 Create New API Key
Click the Create New API-KEY button.

1.4 Set API Key Information
In the popup dialog:
- Enter the API Key name (e.g., CueMate)
- Click the Create Key button

1.5 Copy API Key
After successful creation, the system will display the API Key.
Important: Please copy and save it immediately. The API Key starts with sk- and is 32 characters long.

Click the copy button to copy the API Key to the clipboard.
2. Configure Alibaba Cloud Bailian Model in CueMate
2.1 Go to Model Settings Page
After logging in to the CueMate system, click Model Settings in the dropdown menu in the upper right corner.

2.2 Add New Model
Click the Add Model button in the upper right corner.

2.3 Select Alibaba Cloud Bailian Provider
In the popup dialog:
- Provider Type: Select Alibaba Cloud Bailian
- After you click it, the dialog automatically proceeds to the next step

2.4 Fill in Configuration Information
Fill in the following information on the configuration page:
Basic Configuration
- Model Name: Give this model configuration a name (e.g., Tongyi Qianwen 3-Max)
- API URL: Keep the default https://dashscope.aliyuncs.com/compatible-mode/v1 (OpenAI-compatible format)
- API Key: Paste the Alibaba Cloud Bailian API Key you just copied
- Model Version: Select or enter the model ID to use. Common models include:
  - qwen3-max: Latest and most powerful model, maximum output 65K
  - qwen-plus: High cost-performance version, maximum output 32K
  - qwen-flash: Fast response version, maximum output 16K
  - qwen-max: Classic flagship version, maximum output 8K
  - qwen-turbo: Fast version, maximum output 8K
  - qwen3-235b-a22b: Super large parameter version, maximum output 8K
  - Other Qwen3 series models (0.6b/1.7b/4b/8b/14b/32b/30b-a3b)
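If you want to verify the API URL, API Key, and model ID outside of CueMate, a minimal sketch using the OpenAI Python SDK (assuming it is installed; the key below is a placeholder) looks like this:

```python
from openai import OpenAI

# Placeholder key: use the sk-... key copied in step 1.5.
client = OpenAI(
    api_key="sk-your-bailian-api-key",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

# Minimal chat completion against one of the model IDs listed above.
response = client.chat.completions.create(
    model="qwen3-max",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```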

Advanced Configuration (Optional)
Expand the Advanced Configuration panel to adjust the following parameters:
Parameters adjustable in the CueMate interface (a combined example follows this list):
Temperature: Control output randomness
- Range: 0-2
- Recommended Value: 0.7
- Function: Higher values produce more random and creative output; lower values produce more stable and conservative output
- Usage Recommendations:
- Creative writing/brainstorming: 1.0-1.5
- General conversation/Q&A: 0.7-0.9
- Code generation/precise tasks: 0.3-0.5
Max Tokens (max_tokens): Limit single output length
- Range: 256 - 65536 (depending on the model)
- Recommended Value: 8192
- Function: Control the maximum number of tokens in a single model response
- Model Limitations:
- qwen3-max: Maximum 65K tokens
- qwen-plus: Maximum 32K tokens
- qwen-flash: Maximum 16K tokens
- qwen-max/qwen-turbo/qwen3 series: Maximum 8K tokens
- Usage Recommendations:
- Short Q&A: 1024-2048
- General conversation: 4096-8192
- Long text generation: 16384-32768
- Ultra-long documents: 65536 (qwen3-max only)
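For reference, these two interface parameters map directly onto fields of an OpenAI-compatible request. A minimal sketch (placeholder key, illustrative values for a general conversation scenario):

```python
from openai import OpenAI

client = OpenAI(api_key="sk-your-bailian-api-key",
                base_url="https://dashscope.aliyuncs.com/compatible-mode/v1")

response = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Summarize the benefits of unit testing."}],
    temperature=0.7,   # 0-2; higher = more creative, lower = more stable
    max_tokens=8192,   # must stay within the selected model's output limit
)
print(response.choices[0].message.content)
```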

Other Advanced Parameters Supported by Alibaba Cloud Bailian API:
Although the CueMate interface only exposes temperature and max_tokens, if you call Alibaba Cloud Bailian directly through its API you can also use the following advanced parameters (Alibaba Cloud Bailian uses the OpenAI-compatible API format):
top_p (nucleus sampling)
- Range: 0-1
- Default: 1
- Function: Sample from the smallest set of candidate tokens whose cumulative probability reaches p
- Relationship with temperature: Usually adjust only one of the two
- Usage Recommendations:
- Maintain diversity but avoid absurdity: 0.9-0.95
- More conservative output: 0.7-0.8
frequency_penalty (frequency penalty)
- Range: -2.0 to 2.0
- Default: 0
- Function: Reduce the probability of repeating the same words (based on word frequency)
- Usage Recommendations:
- Reduce repetition: 0.3-0.8
- Allow repetition: 0 (default)
presence_penalty (presence penalty)
- Range: -2.0 to 2.0
- Default: 0
- Function: Reduce the probability that tokens which have already appeared are generated again (based on whether they appeared, not how often)
- Usage Recommendations:
- Encourage new topics: 0.3-0.8
- Allow topic repetition: 0 (default)
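As an illustration of a direct API call (not something you set in CueMate), the three sampling parameters above can be passed like this, with values following the recommendations listed:

```python
from openai import OpenAI

client = OpenAI(api_key="sk-your-bailian-api-key",
                base_url="https://dashscope.aliyuncs.com/compatible-mode/v1")

response = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Write three distinct taglines for a travel app."}],
    top_p=0.95,             # nucleus sampling; usually tuned instead of temperature
    frequency_penalty=0.5,  # discourage repeating the same words
    presence_penalty=0.5,   # encourage moving on to new topics
)
print(response.choices[0].message.content)
```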
stop (stop sequences)
- Type: String or array
- Default: null
- Function: Stop when generated content contains specified strings
- Example: `["###", "User:", "\n\n"]`
- Use Cases:
- Structured output: Use delimiters to control format
- Dialogue systems: Prevent model from speaking for the user
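A short sketch of a direct call with stop sequences (the delimiters are just examples):

```python
from openai import OpenAI

client = OpenAI(api_key="sk-your-bailian-api-key",
                base_url="https://dashscope.aliyuncs.com/compatible-mode/v1")

# Generation halts as soon as any of these strings would be produced,
# e.g. to keep the model from writing the user's next turn.
response = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Continue this dialogue as the assistant only."}],
    stop=["###", "User:", "\n\n"],
)
print(response.choices[0].message.content)
```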
stream (streaming output)
- Type: Boolean
- Default: false
- Function: Enable SSE streaming return, generate while returning
- In CueMate: Automatically handled, no need to set manually
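If you call the API directly (CueMate handles this for you), a minimal streaming sketch looks like this:

```python
from openai import OpenAI

client = OpenAI(api_key="sk-your-bailian-api-key",
                base_url="https://dashscope.aliyuncs.com/compatible-mode/v1")

# Stream tokens as they are generated instead of waiting for the full reply.
stream = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Explain SSE in one paragraph."}],
    stream=True,
)
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta.content
    if delta:  # some chunks carry no text content
        print(delta, end="", flush=True)
print()
```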
tools (tool calling)
- Type: Array of objects
- Function: Define tools/functions that the model can call
- Use Cases: Function Calling, Agent applications
- Example:
```json
{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather information for specified city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string", "description": "City name"}
          }
        }
      }
    }
  ]
}
```
Alibaba Cloud Bailian Exclusive Parameters:
- enable_search (internet search)
- Type: Boolean
- Default: false
- Function: Enable real-time internet search to enhance answer timeliness
- Use Cases: Q&A tasks requiring latest information
- Note: Only supported by some models (such as qwen-max, qwen-plus)
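Because enable_search is a Bailian-specific field rather than a standard OpenAI parameter, when using the OpenAI SDK it is typically passed through extra_body; a sketch under that assumption:

```python
from openai import OpenAI

client = OpenAI(api_key="sk-your-bailian-api-key",
                base_url="https://dashscope.aliyuncs.com/compatible-mode/v1")

# Assumption: compatible mode accepts enable_search as an extra request field.
response = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "What are today's top technology headlines?"}],
    extra_body={"enable_search": True},
)
print(response.choices[0].message.content)
```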
Parameter Combination Recommendations:
| No. | Scenario | temperature | max_tokens | top_p | frequency_penalty | presence_penalty | enable_search |
|---|---|---|---|---|---|---|---|
| 1 | Creative writing | 1.0-1.2 | 4096-8192 | 0.95 | 0.5 | 0.5 | false |
| 2 | Code generation | 0.2-0.5 | 2048-4096 | 0.9 | 0.0 | 0.0 | false |
| 3 | Q&A system | 0.7 | 1024-2048 | 0.9 | 0.0 | 0.0 | false |
| 4 | Summary | 0.3-0.5 | 512-1024 | 0.9 | 0.0 | 0.0 | false |
| 5 | Real-time news | 0.7 | 2048-4096 | 0.9 | 0.0 | 0.0 | true |
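If you drive the API directly, the rows of this table translate naturally into reusable presets; an illustrative sketch (preset names and exact values are examples, not fixed settings):

```python
from openai import OpenAI

client = OpenAI(api_key="sk-your-bailian-api-key",
                base_url="https://dashscope.aliyuncs.com/compatible-mode/v1")

# Illustrative presets mirroring the table above.
PRESETS = {
    "creative_writing": {"temperature": 1.1, "max_tokens": 6144, "top_p": 0.95,
                         "frequency_penalty": 0.5, "presence_penalty": 0.5},
    "code_generation":  {"temperature": 0.3, "max_tokens": 4096, "top_p": 0.9},
    "qa_system":        {"temperature": 0.7, "max_tokens": 2048, "top_p": 0.9},
}

response = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Write a short poem about autumn."}],
    **PRESETS["creative_writing"],
)
print(response.choices[0].message.content)
```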
2.5 Test Connection
After filling in the configuration, click the Test Connection button to verify that the configuration is correct.

If the configuration is correct, a test success message will be displayed, along with a model response example.

If the configuration is incorrect, a test error log is displayed; you can view the specific error details in log management.
2.6 Save Configuration
After successful testing, click the Save button to complete the model configuration.

3. Use Model
From the dropdown menu in the upper right corner, go to the System Settings page and select the model configuration you want to use in the large model provider section.
After configuration is complete, you can select this model in interview training, question generation, and other features, or choose this model configuration for a specific interview in the interview options.

4. Supported Model List
4.1 Qwen3 Series (Latest)
| No. | Model Name | Model ID | Max Output | Use Cases |
|---|---|---|---|---|
| 1 | Qwen3-Max | qwen3-max | 65K tokens | Latest most powerful model, complex reasoning |
| 2 | Qwen3-235B-A22B | qwen3-235b-a22b | 8K tokens | Super large-scale tasks |
| 3 | Qwen3-32B | qwen3-32b | 8K tokens | Large-scale tasks |
| 4 | Qwen3-30B-A3B | qwen3-30b-a3b | 8K tokens | Professional domains |
| 5 | Qwen3-14B | qwen3-14b | 8K tokens | Medium-scale tasks |
| 6 | Qwen3-8B | qwen3-8b | 8K tokens | Medium-scale tasks |
| 7 | Qwen3-4B | qwen3-4b | 8K tokens | Small-scale tasks |
| 8 | Qwen3-1.7B | qwen3-1.7b | 8K tokens | Lightweight applications |
| 9 | Qwen3-0.6B | qwen3-0.6b | 8K tokens | Lightweight applications |
4.2 Tongyi Qianwen Series
| No. | Model Name | Model ID | Max Output | Use Cases |
|---|---|---|---|---|
| 1 | Tongyi Qianwen-Plus | qwen-plus | 32K tokens | General scenarios, high cost-performance |
| 2 | Tongyi Qianwen-Flash | qwen-flash | 16K tokens | Fast response, real-time conversation |
| 3 | Tongyi Qianwen-Max | qwen-max | 8K tokens | Technical interviews, complex reasoning |
| 4 | Tongyi Qianwen-Turbo | qwen-turbo | 8K tokens | Fast response, simple conversations |
5. FAQ
5.1 Invalid API Key
Issue: API Key error when testing connection
Solutions:
- Check that the API Key starts with sk-
- Confirm the API Key is 32 characters long
- Check for extra spaces
- Confirm the API Key has not expired or been disabled
5.2 Request Timeout
Issue: No response for a long time when testing the connection or using the model
Solutions:
- Check that the network connection is working
- Confirm the API URL is correct
- Check firewall settings
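When calling the API directly (outside CueMate), timeouts and automatic retries can also be configured on the client itself; a sketch assuming the OpenAI Python SDK:

```python
from openai import OpenAI

# Fail after 30 seconds per request and retry transient errors twice.
client = OpenAI(
    api_key="sk-your-bailian-api-key",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    timeout=30.0,
    max_retries=2,
)
```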
5.3 Insufficient Quota
Issue: Quota used up or insufficient balance
Solutions:
- Log in to Alibaba Cloud Bailian platform to check account balance
- Top up or apply for more quota
- Check API call frequency limit
