node-llama-cpp - Node.js bindings for llama.cpp

Inside your Node.js project directory, run this command:

bash

npm install --save node-llama-cpp


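node-llama-cpp comes with pre-built binaries for macOS, Linux and Windows.

If binaries are not available for your platform, it falls back to downloading a release of llama.cpp and building it from source with cmake. To disable this behavior, set the environment variable NODE_LLAMA_CPP_SKIP_DOWNLOAD to true.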

ESM usage

node-llama-cpp is an ES module, so you can only load it with import, not with require.

To use it in your project, make sure your package.json file contains `"type": "module"`.

Metal: Metal support is enabled by default on macOS. If you are using a Mac with an Intel chip, you might want to disable it.

CUDA: To enable CUDA support, see the CUDA guide.

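Getting a model file

We recommend getting a GGUF model from TheBloke on Hugging Face.

We recommend starting with a small model that has few parameters to make sure everything works, so try a 7B parameter model first (search for models with both 7B and GGUF in their name).

To speed up the download, you can use ipull to download the model: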

bash

npx ipull <model-file-url>


Validating the model

To validate that the model you downloaded works, run the following command to chat with it:

bash

npx --no node-llama-cpp chat --model <path-to-a-model-file-on-your-computer>


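Try telling the model "Hi there" and see how it reacts. If the response looks odd or does not make sense, try a different model.

If the model does not stop generating output, try a different chat wrapper. For example: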

bash

npx --no node-llama-cpp chat --wrapper llamaChat --model <path-to-a-model-file-on-your-computer>


Usage

Chatbot

typescript

import {fileURLToPath} from "url";
import path from "path";
import {LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

// ES modules have no built-in __dirname, so derive it from import.meta.url
const __dirname = path.dirname(fileURLToPath(import.meta.url));

const model = new LlamaModel({
    modelPath: path.join(__dirname, "models", "codellama-13b.Q3_K_M.gguf")
});
const context = new LlamaContext({model});
const session = new LlamaChatSession({context});

const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const a1 = await session.prompt(q1);
console.log("AI: " + a1);

// The session keeps the chat history, so the model can refer to its previous answer
const q2 = "Summarize what you said";
console.log("User: " + q2);

const a2 = await session.prompt(q2);
console.log("AI: " + a2);


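To use a custom chat prompt wrapper, see the chat prompt wrapper guide.

Chatbot with JSON schema

To force the model to generate output according to a JSON schema, use the LlamaJsonSchemaGrammar class.

It forces the model to generate output that conforms to the JSON schema you provide, and it does so at the text generation level.

It supports only a small subset of the JSON schema specification, but that is enough to generate useful JSON objects with a text generation model.

Note

To learn how to use grammars correctly, read the grammar guide.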

typescript

import {fileURLToPath} from "url";
import path from "path";
import {
    LlamaModel, LlamaJsonSchemaGrammar, LlamaContext, LlamaChatSession
} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const model = new LlamaModel({
    modelPath: path.join(__dirname, "models", "codellama-13b.Q3_K_M.gguf")
});

// "as const" keeps the literal types so the parsed result can be typed from the schema
const grammar = new LlamaJsonSchemaGrammar({
    "type": "object",
    "properties": {
        "responseMessage": {
            "type": "string"
        },
        "requestPositivityScoreFromOneToTen": {
            "type": "number"
        }
    }
} as const);
const context = new LlamaContext({model});
const session = new LlamaChatSession({context});

const q1 = "How are you doing?";
console.log("User: " + q1);

const a1 = await session.prompt(q1, {
    grammar,
    maxTokens: context.getContextSize()
});
console.log("AI: " + a1);

// Parse the generated text into an object that matches the schema
const parsedA1 = grammar.parse(a1);
console.log(
    parsedA1.responseMessage,
    parsedA1.requestPositivityScoreFromOneToTen
);
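To illustrate what "output that matches the schema" means for the example above, here is a minimal sketch in plain TypeScript that does not use node-llama-cpp at all. The interface name PositivityResponse and the function parsePositivityResponse are hypothetical names introduced for this illustration, and the sketch assumes both properties are always present in the output. It is not how LlamaJsonSchemaGrammar works internally; it only shows the shape check you might apply to a raw string from some other source.

```typescript
// Hypothetical type mirroring the schema used in the example above.
interface PositivityResponse {
    responseMessage: string;
    requestPositivityScoreFromOneToTen: number;
}

// Parse a JSON string and check that it has the expected shape.
// Returns null instead of throwing when the text does not match.
function parsePositivityResponse(text: string): PositivityResponse | null {
    let value: unknown;
    try {
        value = JSON.parse(text);
    } catch {
        return null; // not valid JSON at all
    }

    if (typeof value !== "object" || value === null)
        return null;

    const record = value as Record<string, unknown>;
    const message = record["responseMessage"];
    const score = record["requestPositivityScoreFromOneToTen"];

    // Assumption: both properties are required for this illustration.
    if (typeof message !== "string" || typeof score !== "number")
        return null;

    return {
        responseMessage: message,
        requestPositivityScoreFromOneToTen: score
    };
}
```

A check like this runs after generation, whereas a grammar constrains the model while it generates, so malformed output should not be produced in the first place.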


Raw

typescript

import {fileURLToPath} from "url";
import path from "path";
import {
    LlamaModel, LlamaContext, LlamaChatSession, Token
} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const model = new LlamaModel({
    modelPath: path.join(__dirname, "models", "codellama-13b.Q3_K_M.gguf")
});

const context = new LlamaContext({model});

const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const tokens = context.encode(q1);
const res: Token[] = [];
for await (const modelToken of context.evaluate(tokens)) {
    res.push(modelToken);

    // It's important not to concatenate the results as strings,
    // as doing so breaks characters (such as some emojis) that consist of multiple tokens.
    // By keeping an array of tokens, we can decode them together correctly.
    const resString: string = context.decode(res);

    // Stop once the model starts writing the next user turn
    const lastPart = resString.split("ASSISTANT:").reverse()[0];
    if (lastPart.includes("USER:"))
        break;
}

const a1 = context.decode(res).split("USER:")[0];
console.log("AI: " + a1);
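The comment in the loop above warns against decoding tokens one by one and joining the resulting strings. The following sketch shows the underlying problem without using node-llama-cpp: raw UTF-8 byte chunks stand in for tokens, which is an assumption made for illustration only, and the function names decodeSeparately and decodeTogether are hypothetical. A single emoji is split across two chunks, then decoded both ways.

```typescript
// Illustration only: byte chunks stand in for tokens; no node-llama-cpp API is used.
const encoder = new TextEncoder();

// The thumbs-up emoji is one character encoded as 4 UTF-8 bytes.
const emojiBytes: Uint8Array = encoder.encode("👍");

// Hypothetical "tokens": each one carries half of the character's bytes.
const pieces: Uint8Array[] = [emojiBytes.slice(0, 2), emojiBytes.slice(2)];

// Problematic: decode each piece on its own, then concatenate the strings.
function decodeSeparately(parts: Uint8Array[]): string {
    return parts.map((part) => new TextDecoder().decode(part)).join("");
}

// Safe: concatenate the raw pieces first, then decode once.
function decodeTogether(parts: Uint8Array[]): string {
    const total = parts.reduce((sum, part) => sum + part.length, 0);
    const merged = new Uint8Array(total);
    let offset = 0;
    for (const part of parts) {
        merged.set(part, offset);
        offset += part.length;
    }
    return new TextDecoder().decode(merged);
}

// Decoding together restores the emoji.
console.log(decodeTogether(pieces));

// Decoding separately yields U+FFFD replacement characters instead.
console.log(decodeSeparately(pieces).includes("\uFFFD"));
```

This is why the raw example keeps a Token[] array and calls context.decode on the whole array each time, rather than appending decoded fragments to a string.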

