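This post looks at ImageToText: extracting the information inside an image and putting it to use. In the full example a user uploads an image, and AI judges whether the uploaded image is the right one.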
<ItemGroup>
<PackageReference Include="Microsoft.SemanticKernel" Version="1.6.2" />
</ItemGroup>
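The code below recognizes the image. Previously I always passed the question and the image together as input, and the results were unstable. This time I adjusted the approach: first ask the model which features the image has, and once the features come back, ask a second, text-only question to make the judgment. The results are somewhat better; it seems the tokens that need to be spent cannot be saved.
The back-end C# code is as follows: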
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel;
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.UseStaticFiles();
// The OpenAI API key is read from a local file.
var key = File.ReadAllText(@"C:\GPT\key.txt");
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion("gpt-4-vision-preview", key)
    .Build();
var chatGPT = kernel.GetRequiredService<IChatCompletionService>();
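var systemMessage = @"You are a very careful assistant who is excellent at recognizing images in detail.";
var prompt = "Briefly describe the features of this image.";
app.MapGet("/iscard", async (string name) =>
{
    app.Logger.LogInformation("Image name: {Name}", name);
    // Create the history per request: a shared instance would mix concurrent requests,
    // and calling Clear() on it would also drop the system message.
    var chatHistory = new ChatHistory(systemMessage);
    var ImageUri = $"https://github.com/axzxs2001/Asp.NetCoreExperiment/blob/master/Asp.NetCoreExperiment/SemanticKernel/GPTVision/{name}.png?raw=true";
    // Round 1: send the text prompt and the image together and ask for a description.
    chatHistory.AddUserMessage(new ChatMessageContentItemCollection
    {
        new TextContent(prompt),
        new ImageContent(new Uri(ImageUri))
    });
    var reply = await chatGPT.GetChatMessageContentAsync(chatHistory, new OpenAIPromptExecutionSettings { MaxTokens = 1000 });
    var message = reply.Content;
    // Round 2 starts from a clean history that still carries the system message.
    chatHistory = new ChatHistory(systemMessage);
    app.Logger.LogInformation("{Message}", message);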
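    var newprompt = @$"Based on the features below:
--------------------------
{message},
--------------------------
answer the following question with “Yes” or “No”; if you cannot tell, answer “No”.
Question: is this image a “Residence Card”?
A “Residence Card” has the following features:
It contains the text “Residence Card”
It contains the text “GOVERNMENT OF JAPAN”
It contains the text “DATE OF BIRTH”
It contains the text “ADDRESS”
It contains the text “STATUS”
";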
    // Round 2: text-only question built from the description returned in round 1.
    chatHistory.AddUserMessage(new ChatMessageContentItemCollection
    {
        new TextContent(newprompt),
    });
    reply = await chatGPT.GetChatMessageContentAsync(chatHistory, new OpenAIPromptExecutionSettings { MaxTokens = 100 });
    app.Logger.LogInformation("{Reply}", reply.Content);
    return reply.Content;
});
app.Run();
The front-end code is as follows:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>Upload</title>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/js/bootstrap.bundle.min.js" integrity="sha384-ka7Sk0Gln4gmtz2MlQnikT1wXgYsOg+OMhuP+IlRH9sENBO0LRn5q+8nbTov4+1p" crossorigin="anonymous"></script>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-1BmE4kWBq78iYhFldvKuhfTAU6auU8tT94WrHftjDbrCEXSU1oBoqyl2QvZ6jIW3" crossorigin="anonymous">
<script src="https://code.jquery.com/jquery-3.7.1.min.js"
integrity="sha256-/JqT3SQfawRcv/BIHPThkBvs0OEvtFFmqPF/lYI/Cxo="
crossorigin="anonymous"></script>
</head>
<body>
<div class="container">
<div class="row">
<div class="col">
<img onclick="isziliucard('a')" src="https://github.com/axzxs2001/Asp.NetCoreExperiment/blob/master/Asp.NetCoreExperiment/SemanticKernel/GPTVision/a.png?raw=true" width="330" />
</div>
<div class="col">
<img onclick="isziliucard('zlk')" src="https://github.com/axzxs2001/Asp.NetCoreExperiment/blob/master/Asp.NetCoreExperiment/SemanticKernel/GPTVision/zlk.png?raw=true" width="330" />
</div>
<div class="col">
<img onclick="isziliucard('c')" src="https://github.com/axzxs2001/Asp.NetCoreExperiment/blob/master/Asp.NetCoreExperiment/SemanticKernel/GPTVision/c.png?raw=true" width="330" />
</div>
</div>
<div class="row">
<span id="result"></span>
</div>
</div>
<script>
function isziliucard(name) {
$('#result').css('color', 'black');
$("#result").html("Checking…");
$.ajax({
url: '/iscard?name=' + name,
type: 'GET',
success: function (data) {
if (data.includes('No')) {
$('#result').css('color', 'red');
} else {
$('#result').css('color', 'green');
}
$("#result").html('Result: ' + data);
},
error: function (xhr, status, error) {
alert(error)
}
});
}
</script>
</body>
</html>
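One caveat about the page above: the check `data.includes('No')` turns the result red for any reply that merely contains the letters "No", for example "Yes. Note that the photo is blurred." The sketch below shows a stricter way to read the reply. `parseYesNo` is a hypothetical helper and is not part of the original page; it assumes the model starts its answer with "Yes" or "No", as the prompt asks.

```javascript
// Hypothetical helper (not in the original page): interpret the model's reply.
// Returns true for "Yes", false for "No", and null for anything else.
function parseYesNo(reply) {
  // Skip leading whitespace or quotes, then require a whole word "yes" or "no".
  const match = /^\W*(yes|no)\b/i.exec(String(reply ?? ''));
  if (!match) {
    return null; // unexpected wording: leave it undecided instead of guessing
  }
  return match[1].toLowerCase() === 'yes';
}
```

Inside the `success` callback, the color could then be chosen with `parseYesNo(data) === true ? 'green' : 'red'`, so an undecided reply is shown in red, the same way the prompt treats an unrecognizable image as "No".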
Run results:
Below is the back-end output during the queries: a is the name of the first image, zlk the name of the second, and c the name of the third.