Integrating the DeepSeek API with AWS Bedrock: A Comprehensive Overview

In this article, I download the DeepSeek model deepseek-ai/DeepSeek-R1-Distill-Llama-8B, upload it to S3, and deploy it with Amazon Bedrock.


Here’s a step-by-step guide to downloading the model with the Hugging Face CLI, staging it in an S3 bucket, and deploying it with Amazon Bedrock. The model is available at https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B

Step 1: Launch an EC2 Instance

  1. Log in to your AWS account.

  2. Navigate to EC2 > Instances and click Launch Instance.

  3. Choose an Amazon Linux 2 or Ubuntu AMI.

  4. Select an appropriate instance type (e.g., t3.medium for general use) and allocate enough EBS storage for the model files (the 8B model’s weights are roughly 16 GB).

  5. Configure security groups to allow SSH access (port 22).

  6. Launch the instance and connect via SSH (a scripted alternative using boto3 is sketched just below this list).
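
If you prefer to script the launch instead of clicking through the console, here is a minimal boto3 sketch of the same steps. The AMI ID, key pair name, and security group ID are placeholders; substitute values from your own account, and make sure the security group allows inbound SSH (port 22).

# Hypothetical boto3 equivalent of the console launch steps above.
# ImageId, KeyName, and SecurityGroupIds are placeholders; replace them with your own.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",            # an Ubuntu or Amazon Linux 2 AMI in your region
    InstanceType="t3.medium",
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",                      # an existing EC2 key pair for SSH access
    SecurityGroupIds=["sg-0123456789abcdef0"],  # a security group that allows inbound port 22
)
print("Launched instance:", response["Instances"][0]["InstanceId"])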


Step 2: Install Required Dependencies on EC2

  1. Update your package lists:

     sudo apt update && sudo apt upgrade -y  # For Ubuntu
     sudo yum update -y  # For Amazon Linux
    
  2. Install Python and pip:

     sudo apt install python3-pip -y  # For Ubuntu
     sudo yum install python3-pip -y  # For Amazon Linux
    
    If pip cannot install the Hugging Face CLI in Step 3, install pipx and use it instead:

     sudo apt update && sudo apt install pipx -y
    

    Then, ensure pipx is ready to use:

     pipx ensurepath
    

    Restart your terminal or run:

     source ~/.bashrc
    
  3. Install Git and other dependencies:

     sudo apt install git -y
     sudo yum install git -y
    

Step 3: Install and Configure Hugging Face CLI

  1. Install the Hugging Face CLI (if this fails, fall back to the pipx method from Step 2):

     pip3 install --upgrade huggingface_hub
    
  2. Authenticate with Hugging Face:

     huggingface-cli login 
    
     Enter your Hugging Face access token (create one at https://huggingface.co/settings/tokens).
     Ensure the token has read access.
    

    For a better experience, install the CLI with the [cli] extra:

     pip install -U "huggingface_hub[cli]"
    

    Once installed, you can check that the CLI is correctly set up:

     huggingface-cli --help
     usage: huggingface-cli <command> [<args>]
    
     positional arguments:
       {env,login,whoami,logout,repo,upload,download,lfs-enable-largefiles,lfs-multipart-upload,scan-cache,delete-cache,tag}
                             huggingface-cli command helpers
         env                 Print information about the environment.
         login               Log in using a token from huggingface.co/settings/tokens
         whoami              Find out which huggingface.co account you are logged in as.
         logout              Log out
         repo                {create} Commands to interact with your huggingface.co repos.
         upload              Upload a file or a folder to a repo on the Hub
         download            Download files from the Hub
         lfs-enable-largefiles
                             Configure your repository to enable upload of files > 5GB.
         scan-cache          Scan cache directory.
         delete-cache        Delete revisions from the cache directory.
         tag                 (create, list, delete) tags for a repo in the hub
    
     options:
       -h, --help            show this help message and exit
    

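As an alternative to huggingface-cli login, you can authenticate from Python with the same library, which is handy inside a setup script. The token below is a placeholder for your own read-scoped token from https://huggingface.co/settings/tokens.

# Minimal sketch: programmatic login with huggingface_hub instead of the CLI.
# The token string is a placeholder; never hard-code a real token in shared scripts.
from huggingface_hub import login

login(token="hf_xxxxxxxxxxxxxxxxxxxx")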

Step 4: Download DeepSeek-R1-Distill-Llama-8B

  1. Download the model from Hugging Face

     huggingface-cli download deepseek-ai/DeepSeek-R1-Distill-Llama-8B --local-dir deepseek-llama
    

    This will download the model files into the deepseek-llama directory (a Python alternative using huggingface_hub is sketched after this step).

  2. Verify the downloaded files

     ls deepseek-llama
    

    The output should list the downloaded model files (config, tokenizer, and the safetensors weight shards).
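
If you prefer Python over the CLI, the same download can be done with huggingface_hub's snapshot_download; this minimal sketch writes to the same deepseek-llama directory used above.

# Minimal sketch: download the model with the huggingface_hub Python API.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    local_dir="deepseek-llama",   # same directory as the CLI command above
)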


Step 5: Upload Model to AWS S3

1. Install AWS CLI (If not installed)

sudo apt install awscli -y  # For Ubuntu
sudo yum install awscli -y  # For Amazon Linux

2. Configure AWS CLI

aws configure
  • Enter your AWS Access Key ID

  • Enter your AWS Secret Access Key

  • Set Default region (e.g., us-east-1)

  • Leave output format as default (json)

You can create an access key from the IAM console (select your IAM user → Security credentials → Create access key).

3. Compress the Model Files (Optional)

To reduce the number of files and speed up the upload:

tar -czvf deepseek-llama.tar.gz deepseek-llama

4. Upload to Your S3 Bucket (mine is sarunthapa)

aws s3 cp deepseek-llama.tar.gz s3://sarunthapa/

5. Verify the Upload

aws s3 ls s3://sarunthapa/
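
The same upload and verification can also be scripted with boto3; here is a minimal sketch using the bucket name from above (adjust it for your own account).

# Minimal boto3 sketch: upload the archive and list the bucket to verify.
import boto3

s3 = boto3.client("s3")

# Upload the compressed model archive created in the previous step
s3.upload_file("deepseek-llama.tar.gz", "sarunthapa", "deepseek-llama.tar.gz")

# List the bucket contents to confirm the object landed
for obj in s3.list_objects_v2(Bucket="sarunthapa").get("Contents", []):
    print(obj["Key"], obj["Size"])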

Step 6: Deploying the Model in Amazon Bedrock

Amazon Bedrock allows deploying custom models stored in an S3 bucket and using them via API or the Bedrock Playground.


1. Opening Amazon Bedrock in AWS Console

  1. Logged in to the AWS Management Console.

  2. In the AWS search bar, searched for Amazon Bedrock and opened it.

  3. Clicked on Models & Endpoints from the left panel.


2. Creating a Custom Model Deployment

  1. Clicked on Create Model.

  2. Selected Custom Model Deployment.

  3. Clicked Next.


3. Configuring the Model Deployment

  1. Entered Model Name:

    • Set the model name as sarun.
  2. Selected Model Source:

    • Chose Amazon S3 as the source.

    • Entered the S3 path where the model archive from Step 5 is stored:

        s3://sarunthapa/deepseek-llama.tar.gz
      
    • Clicked Validate to ensure that AWS Bedrock can access the S3 bucket.

  3. Selected Model Instance Type:

    • Chose a GPU-based instance for DeepSeek-R1-Distill-Llama-8B, selecting:

      • ml.g5.12xlarge (for standard performance)

      • ml.g5.24xlarge (for high performance)

  4. Configured IAM Role:

    • Created a new IAM role with permissions to read from S3:

        {
          "Effect": "Allow",
          "Action": ["s3:GetObject"],
          "Resource": ["arn:aws:s3:::sarunthapa/*"]
        }
      
    • Attached this role to the model deployment (a boto3 sketch of creating such a role appears after this list).

  5. Set Environment Variables (if needed):

    • Entered any required configurations for the model.
  6. Clicked Deploy to start the deployment process.
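
For reference, here is a hedged boto3 sketch of creating the same S3-read role from a script. The role name is hypothetical, and the trust-policy service principal bedrock.amazonaws.com is an assumption; confirm both against the role the Bedrock console generates for you.

# Sketch only: create an IAM role Bedrock can assume to read the model from S3.
# The role name is hypothetical and the service principal is an assumption.
import json
import boto3

iam = boto3.client("iam")

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "bedrock.amazonaws.com"},  # assumed service principal
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName="bedrock-deepseek-s3-read",   # hypothetical role name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

iam.put_role_policy(
    RoleName="bedrock-deepseek-s3-read",
    PolicyName="read-model-artifacts",
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::sarunthapa/*"],
        }],
    }),
)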


4. Verifying Deployment Status

  1. Navigated to Models & Endpoints.

  2. Located the model (sarun).

  3. Checked the Status:

    • In Progress → Model is still being deployed.

    • Ready → Deployment is successful, and the model is available.


Step 7: Testing the Model in Bedrock Playground

1. Opening the Bedrock Playground

  1. Opened Amazon Bedrock Console.

  2. Clicked on Playground from the left panel.

2. Selecting the Model

  • In the Select a Model dropdown, selected sarun.

3. Entering a Test Prompt

  • Typed a sample prompt:

      What is the capital of Australia?
    
  • Clicked Run.

4. Reviewing the Output

  • If everything was set up correctly, a generated response appeared from the model.

  • If any errors occurred, checked CloudWatch Logs to debug the issue.


Step 8: Using the Model via API in Amazon Bedrock

After completing the Bedrock Playground integration, I now focused on interacting with the model programmatically through the Amazon Bedrock API. Below are the steps I followed to make the API calls:

8.1. Ensure Model Deployment

Before moving to the API, I made sure that my model, sarun, was successfully deployed in Amazon Bedrock. I went to the Models & Endpoints section in the Amazon Bedrock console, verified that my model was listed, and checked that it was in the "Ready" state.


8.2. Install and Configure the AWS CLI (already done in my case)

To make API calls, I first set up the AWS CLI. I ran the following command to install it:

pip install awscli

After installation, I ran the aws configure command to set up my AWS credentials:

aws configure

I entered the necessary details:

  • AWS Access Key ID: <My_Access_Key_ID>

  • AWS Secret Access Key: <My_Secret_Access_Key>

  • Default region name: us-east-1 (or the region where the model is deployed)

  • Default output format: json


8.3. Install the Python SDK (Boto3)

Next, I installed the Boto3 SDK, which is used to interact with AWS services in Python:

pip install boto3

8.4. Initialize the Bedrock Client in Python

After installing Boto3, I created a Python script to interact with my sarun model via the API. I started by importing the necessary libraries:

import boto3
import json

Then, I initialized the Amazon Bedrock client with the following code:

client = boto3.client("bedrock-runtime", region_name="us-east-1")

8.5. Define the Input Payload

I needed to prepare the input data for my model. For this example, I wanted to send a simple text prompt to sarun. I defined the input payload as follows:

input_payload = {
    "inputText": "What are the uses of AI in healthcare?"
}

I then converted this input to a JSON string:

payload_str = json.dumps(input_payload)

8.6. Invoke the Model via the Bedrock API

Now, I was ready to invoke the model using the Bedrock API. I used the invoke_model function to make the API call:

response = client.invoke_model(
    modelId="sarun",  # The model name used in Amazon Bedrock (for imported/custom models this is typically the model ARN shown in the console)
    body=payload_str  # The input data sent as a JSON body
)

8.7. Read and Parse the Response

After sending the request, I received a response from the API in bytes. To decode the response and extract the model's output, I used the following code:

response_body = response["body"].read().decode("utf-8")
response_json = json.loads(response_body)
print("Model Response:", response_json["outputText"])

The response from sarun was now ready to be displayed. For example, if my input was:

What are the uses of AI in healthcare?

The output is:

AI is increasingly being used in healthcare for automating tasks, diagnosing diseases, and improving patient care.
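
Putting the fragments together, here is the full script in one place. The model identifier "sarun" and the inputText/outputText field names follow the article; depending on how your model was imported, the modelId may need to be the model's ARN and the request/response schema may differ, so treat both as assumptions to verify against your deployment.

# Consolidated sketch of the API call built from the fragments above.
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Build the request body (field names follow the article and may differ for your model)
input_payload = {"inputText": "What are the uses of AI in healthcare?"}

# Invoke the deployed model
response = client.invoke_model(
    modelId="sarun",                # may need to be the model ARN for imported models
    body=json.dumps(input_payload),
)

# Decode and print the model's output
response_body = json.loads(response["body"].read().decode("utf-8"))
print("Model Response:", response_body["outputText"])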

With that, I successfully deployed and tested the DeepSeek model (sarun) in AWS Bedrock!