Configure Kimi

Kimi is a large language model service developed by Moonshot AI, renowned for its ultra-long context processing capability (up to 2 million characters). It supports long-document reading, file analysis, and web search, and is particularly well suited to scenarios involving large amounts of text.

1. Obtain Kimi API Key

1.1 Access Kimi Open Platform

Visit the Moonshot Open Platform and log in: https://platform.moonshot.cn/

Access Kimi Platform

1.2 Navigate to API Key Management

After logging in, click API Key Management in the left sidebar menu.

Navigate to API Key Management

1.3 Create a New API Key

Click the Create New button in the upper right corner.

Click Create Button

1.4 Set API Key Information

In the dialog that appears:

  1. Enter the key name (e.g., CueMate)
  2. Click the Confirm button

Set API Key Information

1.5 Copy API Key

After successful creation, the system will display the API Key.

Important: Copy and save it immediately. The API Key starts with sk-.

Copy API Key

Click the copy button to copy the API Key to your clipboard.
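
If you want to confirm the key works before configuring CueMate, the sketch below sends a minimal request to the OpenAI-compatible endpoint. It assumes the `openai` Python package is installed and that the key has been saved in a `MOONSHOT_API_KEY` environment variable (the variable name is only an example).

```python
# Minimal check that the copied API Key is accepted by the Moonshot endpoint.
# Assumes `pip install openai` and the key stored in MOONSHOT_API_KEY (example name).
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["MOONSHOT_API_KEY"],   # the sk-... key copied in step 1.5
    base_url="https://api.moonshot.cn/v1",
)

# Listing models is a cheap way to verify authentication.
print([model.id for model in client.models.list().data])
```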

2. Configure Kimi Model in CueMate

2.1 Navigate to Model Settings

After logging into CueMate, click Model Settings in the dropdown menu at the top right corner.

Navigate to Model Settings

2.2 Add New Model

Click the Add Model button in the upper right corner.

Click Add Model

2.3 Select Kimi Provider

In the dialog that appears:

  1. Provider Type: Select Kimi
  2. After you select it, the dialog automatically proceeds to the next step

Select Kimi

2.4 Fill in Configuration Information

Fill in the following information on the configuration page (a sketch for verifying these values outside CueMate follows the basic configuration below):

Basic Configuration

  1. Model Name: Give this model configuration a name (e.g., Kimi-128K)
  2. API URL: Keep the default https://api.moonshot.cn/v1 (OpenAI-compatible format)
  3. API Key: Paste the Kimi API Key you just copied
  4. Model Version: Select the model ID you want to use. Common models include:
    • moonshot-v1-128k: 128K ultra-long context, max output 65K, suitable for ultra-long document understanding and multi-turn conversations
    • moonshot-v1-32k: 32K long context, max output 16K, suitable for long document processing and complex reasoning
    • moonshot-v1-8k: 8K standard context, max output 4K, suitable for regular conversations and quick responses

Fill in Basic Configuration
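
If you want to double-check the API URL, API Key, and model ID outside CueMate, here is a minimal sketch of an equivalent request against the OpenAI-compatible endpoint. The model ID, prompt, and `MOONSHOT_API_KEY` environment variable are illustrative values.

```python
# Verify API URL, API Key, and model ID together with one small request.
# Model ID, prompt, and MOONSHOT_API_KEY are example values.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["MOONSHOT_API_KEY"],
    base_url="https://api.moonshot.cn/v1",    # the default API URL from step 2
)

resp = client.chat.completions.create(
    model="moonshot-v1-128k",                 # the model version selected in step 4
    messages=[{"role": "user", "content": "Reply with 'ok' if you can read this."}],
    temperature=0.3,                          # recommended default (see Advanced Configuration)
    max_tokens=64,
)
print(resp.choices[0].message.content)
```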

Advanced Configuration (Optional)

Expand the Advanced Configuration panel to adjust the following parameters (a small sketch showing their effect in a direct API call follows this list):

Parameters Adjustable in CueMate Interface:

  1. Temperature: Controls output randomness

    • Range: 0-1 (Note: Kimi's temperature limit is 1, unlike OpenAI's 2)
    • Recommended Value: 0.3
    • Effect: Higher values produce more random, creative outputs; lower values produce more stable, conservative outputs
    • Usage Recommendations:
      • Creative writing/brainstorming: 0.7-0.9
      • Regular conversation/Q&A: 0.3-0.5
      • Code generation/precise tasks: 0.1-0.3
      • Long document analysis: 0.2-0.4
  2. Max Tokens: Limits single output length

    • Range: 256 - 65536 (depending on the model)
    • Recommended Value: 8192
    • Effect: Controls the maximum number of tokens in a single model response
    • Model Limits:
      • moonshot-v1-128k: Max 65K tokens
      • moonshot-v1-32k: Max 16K tokens
      • moonshot-v1-8k: Max 4K tokens
    • Usage Recommendations:
      • Short Q&A: 1024-2048
      • Regular conversation: 4096-8192
      • Long text generation: 16384-32768
      • Ultra-long documents: 65536 (128k model only)

Advanced Configuration
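
To see the effect of temperature concretely, the sketch below sends the same prompt at a low and a high setting through the API directly. The prompt, model ID, and `MOONSHOT_API_KEY` variable are illustrative.

```python
# Compare low vs. high temperature on the same prompt.
# Prompt, model ID, and MOONSHOT_API_KEY are illustrative values.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["MOONSHOT_API_KEY"],
                base_url="https://api.moonshot.cn/v1")

prompt = "Suggest a name for a mock-interview practice app."
for temperature in (0.1, 0.9):
    resp = client.chat.completions.create(
        model="moonshot-v1-8k",
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,   # Kimi accepts 0-1
        max_tokens=256,            # cap the response length
    )
    print(f"temperature={temperature}: {resp.choices[0].message.content!r}")
```

Low settings should give similar, conservative answers across runs; high settings should vary noticeably more.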

Additional Advanced Parameters Supported by Kimi API:

Although the CueMate interface only exposes temperature and max_tokens, if you call Kimi directly through its API (which uses an OpenAI-compatible format), you can also use the following advanced parameters; a short sketch follows the parameter list below:

  1. top_p (nucleus sampling)

    • Range: 0-1
    • Default Value: 1
    • Effect: Samples from the smallest set of candidates whose cumulative probability reaches p
    • Relationship with temperature: Usually adjust only one of the two, not both
    • Usage Recommendations:
      • Maintain diversity but avoid extremes: 0.9-0.95
      • More conservative output: 0.7-0.8
  2. frequency_penalty

    • Range: -2.0 to 2.0
    • Default Value: 0
    • Effect: Penalizes tokens in proportion to how often they have already appeared, reducing verbatim repetition (based on frequency)
    • Usage Recommendations:
      • Reduce repetition: 0.3-0.8
      • Allow repetition: 0 (default)
  3. presence_penalty

    • Range: -2.0 to 2.0
    • Default Value: 0
    • Effect: Penalizes any token that has already appeared at least once, regardless of how often (based on presence)
    • Usage Recommendations:
      • Encourage new topics: 0.3-0.8
      • Allow repeated topics: 0 (default)
  4. stop (stop sequences)

    • Type: String or array
    • Default Value: null
    • Effect: Stops generation as soon as any of the specified strings appears in the output
    • Example: ["###", "User:", "\n\n"]
  5. stream

    • Type: Boolean
    • Default Value: false
    • Effect: Enables streaming responses via server-sent events (SSE)
    • In CueMate: Handled automatically, no manual setting required
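
A minimal sketch of a direct API call that uses these parameters, in both non-streaming and streaming form. The model ID, prompt, and `MOONSHOT_API_KEY` variable are illustrative, and the parameter values are just one reasonable combination.

```python
# Direct API call using the advanced parameters listed above.
# Model ID, prompt, and MOONSHOT_API_KEY are illustrative values.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["MOONSHOT_API_KEY"],
                base_url="https://api.moonshot.cn/v1")

params = dict(
    model="moonshot-v1-32k",
    messages=[{"role": "user", "content": "Summarize the key ideas of REST in three bullet points."}],
    temperature=0.3,
    max_tokens=1024,
    top_p=0.9,               # nucleus sampling
    frequency_penalty=0.0,   # no repetition penalty
    presence_penalty=0.0,    # no new-topic pressure
    stop=["###"],            # stop if this marker is generated
)

# Non-streaming: the full answer arrives in one response object.
resp = client.chat.completions.create(**params)
print(resp.choices[0].message.content)

# Streaming (stream=True): tokens arrive incrementally over SSE.
for chunk in client.chat.completions.create(stream=True, **params):
    delta = chunk.choices[0].delta.content if chunk.choices else None
    if delta:
        print(delta, end="", flush=True)
print()
```
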
Recommended parameter combinations by scenario:

| No. | Scenario | temperature | max_tokens | top_p | frequency_penalty | presence_penalty |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Creative Writing | 0.7-0.9 | 8192-16384 | 0.95 | 0.5 | 0.5 |
| 2 | Code Generation | 0.1-0.3 | 2048-4096 | 0.9 | 0.0 | 0.0 |
| 3 | Q&A System | 0.3-0.5 | 1024-2048 | 0.9 | 0.0 | 0.0 |
| 4 | Document Analysis | 0.2-0.4 | 8192-32768 | 0.9 | 0.0 | 0.0 |
| 5 | Ultra-long Documents | 0.3 | 32768-65536 | 0.9 | 0.0 | 0.0 |

2.5 Test Connection

After filling in the configuration, click the Test Connection button to verify the configuration is correct.

Test Connection

If the configuration is correct, a test success prompt will be displayed, along with a sample response from the model.

Test Success

If the configuration is incorrect, a test failure message will be displayed; you can view the detailed error information in log management.

2.6 Save Configuration

After successful testing, click the Save button to complete the model configuration.

Save Configuration

3. Use the Model

Open the dropdown menu in the top right corner, navigate to the system settings page, and select the model configuration you want to use in the large model provider section.

Once configured, this model can be used in interview training, question generation, and other features. You can also select a model configuration individually for a specific interview in the interview options.

Select Model

4. Supported Model List

4.1 Moonshot v1 Series

| No. | Model Name | Model ID | Context Length | Max Output | Use Cases |
| --- | --- | --- | --- | --- | --- |
| 1 | Moonshot v1 128K | moonshot-v1-128k | 128K tokens | 65K tokens | Ultra-long document understanding, multi-turn conversations |
| 2 | Moonshot v1 32K | moonshot-v1-32k | 32K tokens | 16K tokens | Long document processing, complex reasoning |
| 3 | Moonshot v1 8K | moonshot-v1-8k | 8K tokens | 4K tokens | Regular conversations, quick responses |

5. Common Issues

5.1 Invalid API Key

Symptom: API Key error prompt during connection test

Solution:

  1. Check if the API Key starts with sk-
  2. Confirm the API Key is completely copied
  3. Check if the account has available quota

5.2 Request Timeout

Symptom: No response for a long time during connection test or use

Solution:

  1. Check if the network connection is normal
  2. Confirm the API URL address is correct
  3. Check firewall settings

5.3 Insufficient Quota

Symptom: Prompt indicating quota exhausted or insufficient balance

Solution:

  1. Log in to the Moonshot platform to check account balance
  2. Recharge or apply for more quota
  3. Choose an appropriate model version
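
When debugging these issues outside CueMate, the three symptoms above roughly correspond to distinct exception classes in the OpenAI-compatible Python SDK. The sketch below is one way to tell them apart; the model ID and `MOONSHOT_API_KEY` variable are illustrative, and Moonshot's exact error payloads may differ.

```python
# Roughly map the three common issues above to openai SDK exceptions.
# Model ID and MOONSHOT_API_KEY are illustrative values.
import os
import openai
from openai import OpenAI

client = OpenAI(api_key=os.environ.get("MOONSHOT_API_KEY", "sk-invalid"),
                base_url="https://api.moonshot.cn/v1",
                timeout=30)  # fail fast instead of hanging on network problems

try:
    resp = client.chat.completions.create(
        model="moonshot-v1-8k",
        messages=[{"role": "user", "content": "ping"}],
        max_tokens=8,
    )
    print("OK:", resp.choices[0].message.content)
except openai.AuthenticationError:
    print("Invalid API Key (5.1): check the sk- prefix and that the key was copied completely.")
except (openai.APITimeoutError, openai.APIConnectionError):
    print("Timeout or network problem (5.2): check connectivity, the API URL, and firewall settings.")
except openai.RateLimitError:
    print("Quota or rate limit issue (5.3): check balance and quota on the Moonshot platform.")
```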
