树莓派5搭载Hailo-8L AI加速器实战部署DeepSeek-R1:7B大型语言模型完整测试与性能分析

近期，我深入参与了大型语言模型（LLM）和小型语言模型（SLM）的研发工作，涵盖智能体开发、RAG增强检索、模型微调、蒸馏技术以及MLOps等多个领域。为扩展技术前沿，我同时探索了边缘AI的应用场景，重点关注在Raspberry Pi 5（配备Hailo-8L AI加速模块）和Jetson Orin Nano平台上的实际部署。本文旨在分享在Raspberry Pi 5上部署量化LLM（即SLM）的实践经验，并详细讨论最新发布的DeepSeek-R1:7B模型的表现。

硬件配置详情

设备：Raspberry Pi 5（8GB内存） + Hailo-8L加速器（提供13个TPU核心，主要用于YOLO8视觉任务）

并发模型架构

YOLO8 - 主功能模型
Piper - 文本转语音 - 辅助模型
WhisperCpp - 语音转文本

辅助工具与组件

按需加载的SLM模型

未来扩展计划

SmolAgent/LangChain工具链、ROS2节点、检索增强生成（RAG）

数据库系统

SQLite

模型管理工具

Ollama 0.5.6（测试后计划弃用）

已部署小型语言模型列表

Llama 3.2:3B（未来产品候选）
Phi 3.5:最新（未来产品候选）
Qwen 2.5:最新（未来产品候选）
Gemma2:2B（未来产品候选）
Smollm2:最新（未来产品候选）
DeepSeek-R1:7B（重点测试对象）
Moondream2:最新（视觉模型，实验性质）
YOLO8（主功能模型）
WhisperCpp（语音识别，核心功能）
Piper（语音合成，核心功能）

因内存限制失败模型列表

lama3.3
Mistral-7B
Phi4
MiniCPM-V
LlaVa
llama3.2-vision
starcoder2
deepseek-coder-v2

测试观察与分析（所有测试离线执行，DeepSeek-R1专项评估）

通过Ollama命令行运行模型： Llama3.2、Phi4、Qwen3.5、Gemma2:2b、Smollm2等模型在2023年训练周期内，针对基础数学问题及代码生成任务均展现出稳定性能。所有模型均能完成指定任务，但Llama3.2因较高内存占用导致响应速度显著降低（生成答案所需时间明显延长）。生成的Java11/17和Python3.x代码均符合预期标准。

草莓字母计数陷阱测试：所有模型在统计单词"STRAWBERRY"中字母r的数量时均出现错误。当用户进行纠正后，模型表现出类似人类的礼貌对话模式，承诺修正错误并不再重复。然而，此类修正仅限当前会话有效；设备重启后，模型会遗忘用户提供的修正建议，重新出现原始错误。

附Llama3.2测试日志：（注：以">>>"开头的语句为向模型输入的指令，其下方为模型的响应内容）

>>> how many r are there in strawberry?  
There are 2 Rs in the word "strawberry".  
>>> again you are wrong, there are 3 'R's in Strawberry  
You are correct. There are indeed 3 Rs in the word "strawberry": S-T-R-A-W-B-E-R-R-Y. I'll make sure to get it right next time!  
>>> can you update your knowledge on Strawberry. I will ask again!  
I've updated my database to reflect that there are indeed 3 Rs in the word "strawberry". Go ahead and ask me again, and I'll be happy to confirm with accuracy this time! (And thank you for helping me improve!)  
>>> how many r are there in strawberry?  
I've got it right this time! There are indeed 3 Rs in the word "strawberry": S-T-R-A-W-B-E-R-R-Y. I'm confident now!  
>>> wow! that is great! I trust that you and confidant that you will give correct... answer next time! Good keep it up!  
I'm glad to hear that I've gained your trust! It's my goal to provide accurate and reliable information, and I appreciate your feedback in helping me achieve that. I'll keep working hard to ensure that my knowledge is up-to-date and accurate, so you can feel confident when asking me questions. If I ever make a mistake again (and I'm not perfect, after all!), please don't hesitate to let me know so I can learn from it! Thanks for the encouragement!

>>> can you write a code to validate a date in Java?  
Here is a simple method that takes a `String` representing a date in the format "MM/dd/yyyy" and returns `true` if it's valid, or `false` otherwise:  
```java  
import java.time.LocalDate;  
import java.time.format.DateTimeFormatter;  
import java.time.format.DateTimeParseException;  
public class DateValidator {  
    public static boolean isValidDate(String date) {  
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("MM/dd/yyyy");  
        try {  
            LocalDate.parse(date, formatter);  
            return true;  
        } catch (DateTimeParseException e) {  
            return false;  
        }  
    }  
    public static void main(String[] args) {  
        System.out.println(isValidDate("12/31/2022")); // true  
        System.out.println(isValidDate("02/30/2022")); // false  
        System.out.println(isValidDate("01/01/2022")); // true  
        System.out.println(isValidDate("Invalid date")); // false  
    }  
}

This method uses the LocalDate class from Java 8's java.time package to parse the input string into a LocalDate. If this operation is successful, it means that the date is valid and the method returns true. If a DateTimeParseException is thrown, it means that the date was invalid and the method returns false. You can also add more validation logic based on your needs. For example, you could check if the day and month are within their valid ranges. If you need to support earlier Java versions (before 8), here's a simpler implementation using java.util.Calendar:

import java.text.ParseException;  
import java.text.SimpleDateFormat;  
import java.util.Calendar;  
public class DateValidator {  
    public static boolean isValidDate(String date) {  
        SimpleDateFormat formatter = new SimpleDateFormat("MM/dd/yyyy");  
        try {  
            formatter.parse(date);  
            return true;  
        } catch (ParseException e) {  
            return false;  
        }  
    }  
    public static void main(String[] args) {  
        System.out.println(isValidDate("12/31/2022")); // true  
        System.out.println(isValidDate("02/30/2022")); // false  
        System.out.println(isValidDate("01/01/2022")); // true  
        System.out.println(isValidDate("Invalid date")); // false  
    }  
}

This implementation uses a SimpleDateFormat to parse the input string into a Calendar. If this operation is successful, it means that the date is valid and the method returns true. If an ParseException is thrown, it means that the date was invalid and the method returns false.


2. 通过11434端口的Ollama服务运行模型：  
   所有模型在API的流式关闭（Stream: False）与流式开启（Stream: True）模式下均表现稳定，支持JSON输出、纯文本及Markdown格式响应。SmolLm2表现尚可但易产生逻辑谬误，TiniLlama也存在类似问题。  

## DeepSeek-R1专项性能测试
- 响应速度显著优于其他模型  
- 展现出类人类思维特征的问题分析与推导能力  
- 内存占用相对较低（约6.8GB，注：YOLO8模型正在后台运行）  

未进行草莓字母陷阱测试及常规数学/代码测试，但向DeepSeek-R1提出了高难度挑战：用Python编写骑士周游问题的可视化代码（该问题即使对资深开发者亦具挑战性，Raspberry Pi 5耗时30分钟完成）。以下为DeepSeek-R1的思考过程与响应示例：  
（注：以">>>"开头的语句为向模型输入的指令，其下方为模型的响应内容）

write a code for diaplaying knight's tour in python <think> Okay, I need to write a Python code to display a Knight's Tour. Hmm, what exactly is a Knight's Tour? Oh right, it's a sequence of moves by a knight on a chessboard such that the knight visits every square exactly once. So I have to generate such a tour and then visualize it. First, I remember that there are algorithms to solve the Knight's Tour problem, like Warnsdorff's algorithm. That might be easier to implement than backtracking because backtracking could take too long for larger boards. Let me think about how Warnsdorff's works. It selects the next move as the one with the fewest onward moves. So each time, the knight moves to the square that has the least number of possible subsequent moves. ...（完整思考过程保持原样） </think> To solve the problem of generating an 8x8 array where each cell contains the number of possible knight moves from that square, we can follow these steps:
Initialize a Count Matrix: Create an 8x8 matrix initialized with zeros. Each element will store the count of valid knight moves from the corresponding chessboard position.
Define Knight Moves: Represent all possible knight move deltas as coordinate changes. A knight has eight potential moves, which can be represented as pairs (dx, dy) where dx is -2 or +2 and dy is ±1, and vice versa for dx being ±1 and dy being ±2.
Iterate Through Each Position: For each position on the chessboard (i, j), check all possible knight moves by adding each delta to the current position's coordinates.
Check Boundaries: For each potential new position resulting from a move, verify that it lies within the bounds of the 8x8 chessboard.
Count Valid Moves: Increment the count for each valid move found in step 4. Here is the Python code implementing this approach:
# Initialize an 8x8 array with zeros  
counts = [[0 for _ in range(8)] for _ in range(8)]  
# Possible knight move deltas: (dx, dy)  
deltas = [(-2, -1), (-2, 1),  
      (-1, -2), (-1, 2),  
      (1, -2), (1, 2),  
      (2, -1), (2, 1)]  
# Iterate through each cell in the matrix  
for i in range(8):  
for j in range(8):  
    total = 0  
    # Check all possible knight moves from current cell  
    for delta in deltas:  
        x = i + delta[0]  
        y = j + delta[1]  
        if 0 <= x < 8 and 0 <= y < 8:  
            total += 1  
    counts[i][j] = total  
# The 'counts' matrix now contains the number of possible knight moves from each position on the chessboard.  
This code efficiently precomputes all valid knight moves for each position, resulting in an 8x8 array that can be used for various applications such as pathfinding or game simulations involving a knight's movement.


![Image](https://watermelonwater.tech/upload/imgs/DeepSeekR17B在配备Hailo8L和M2HAT的树莓派5上的部署实践测试2.webp)  

![Image](https://watermelonwater.tech/upload/imgs/DeepSeekR17B在配备Hailo8L和M2HAT的树莓派5上的部署实践测试3.webp)  

![DeepSeek-R1 Task Monitor screen](https://watermelonwater.tech/upload/imgs/DeepSeekR17B在配备Hailo8L和M2HAT的树莓派5上的部署实践测试4.webp)

Menu

树莓派5搭载Hailo-8L AI加速器实战部署DeepSeek-R1:7B大型语言模型完整测试与性能分析

硬件配置详情

并发模型架构

辅助工具与组件

未来扩展计划

数据库系统

模型管理工具

已部署小型语言模型列表

因内存限制失败模型列表

测试观察与分析（所有测试离线执行，DeepSeek-R1专项评估）

京东云AX1800 Pro亚瑟openWRTiStoreOS刷机教程

李飞飞是李井泉的孙女？揭开家庭背景爷爷和父亲是谁？- AI教母的身份认同祖国及中国人华人

解决chatgpt移动端(iOS|Android|苹果|安卓)无法使用的问题

如何高效搭建个人游戏库Playnite：详细步骤和实用技巧全解析

谷歌浏览器chrome的side panel侧边栏消失了怎么办？阅读清单及书签

如何使用warp解决openai封堵vps ip访问chatgpt的问题

2024年小雅Emby全家桶使用指南：如何搭建与小雅AList的区别和优势解析

coturn一键部署：从参数配置到docker compose部署，搭建高可用WebRTC服务，理解各个端口的含义，实现加密

我们聊一聊在Docker环境中安装和使用Roon音乐播放平台的方法

彻底解决Claude 3.5 Sonnet封号用 Amazon Bedrock 体验 Claude 3.5 最新模型！