Java OCR

17 Mar 2025 | 6 min read

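What is Tesseract OCR?

Tesseract OCR is an optical character recognition engine developed at HP Labs starting in 1985 and released as open source in 2005. Since 2006, its development has been sponsored by Google. Tesseract supports Unicode (UTF-8) and can recognize more than 100 languages "out of the box", so it can be used to build scanning software for different languages. Tesseract 4 added a new neural network (LSTM) based OCR engine that focuses on line recognition, while still supporting the legacy Tesseract engine, which works by recognizing character patterns.

With the rapid progress of artificial intelligence and machine learning, robust image processing is increasingly needed. Tesseract, used through the Tess4J API, lets us perform this kind of processing in Java.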

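How does OCR work?

Tesseract OCR can be downloaded for all major operating systems, such as Windows, macOS, and Linux. To understand how OCR works, consider the following steps in order:

Preprocess the image data, for example: convert to grayscale, smooth, deskew, and filter.
Detect lines, words, and characters.
Generate a ranked list of candidate characters based on a trained data set. (The setDatapath() method is used here to set the path of the trained data.)
Post-process the recognized characters, choosing the best character based on the confidence from the previous step and on language data. Language data includes a dictionary, grammar rules, and so on.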
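
The grayscale conversion named in the first step can be sketched in plain Java. This is a minimal illustration, not Tesseract's own preprocessing: the class name `GrayscaleSketch`, the method `toGray`, and the luma weights 0.299, 0.587, 0.114 are assumptions chosen for this example.

```java
import java.awt.image.BufferedImage;

public class GrayscaleSketch {

    // Convert an RGB image to 8-bit grayscale.
    // The weights 0.299 / 0.587 / 0.114 are a common luma formula,
    // assumed here for illustration only.
    public static BufferedImage toGray(BufferedImage src) {
        BufferedImage gray = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                int luma = (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
                // write the gray level straight into the single band of the raster
                gray.getRaster().setSample(x, y, 0, luma);
            }
        }
        return gray;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0xFFFFFF); // white pixel
        img.setRGB(1, 0, 0x000000); // black pixel
        BufferedImage gray = toGray(img);
        System.out.println(gray.getRaster().getSample(0, 0, 0));
        System.out.println(gray.getRaster().getSample(1, 0, 0));
    }
}
```

The result can be passed on to later steps such as smoothing or thresholding, which expect one intensity value per pixel.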

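How to use Tesseract OCR?

To use Tesseract OCR in Java, follow these steps:

Download the Tess4J API.
Extract the downloaded archive.
Open any IDE and create a new project.
Link the jar file to your project.
The jar file is located at the path "..\Tess4J-3.4.8-src\Tess4J\dist".

Once the jar file is linked to the project, the Tesseract engine is ready to use.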
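
A quick way to confirm that the jar was linked correctly is to ask the class loader for the Tess4J entry class. The sketch below uses only the standard library; `ClasspathCheck` and `isOnClasspath` are hypothetical names, and `net.sourceforge.tess4j.Tesseract` is the class imported by the examples in this tutorial.

```java
public class ClasspathCheck {

    // Returns true when the named class can be located on the classpath.
    // The class is looked up without being initialized.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String tess = "net.sourceforge.tess4j.Tesseract";
        System.out.println(tess + " found: " + isOnClasspath(tess));
    }
}
```

If the check reports that the class was not found, the project's build path most likely does not include the Tess4J jar yet.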

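Performing OCR on a clear image

Now that the jar file is linked, we can start coding. The following code reads an image file, performs OCR on it, and prints the recognized text to the console.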

OCR.java

import java.io.File;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class OCR {
    public static void main(String[] args) {
        // create a Tesseract instance
        Tesseract tesseract = new Tesseract();
        try {
            // path of the tessdata folder inside the extracted Tess4J folder
            tesseract.setDatapath("D:/Tess4J/tessdata");
            // run OCR on the image that has to be read
            String text = tesseract.doOCR(new File("image.jpg"));
            // print the text recognized in the image
            System.out.print(text);
        } catch (TesseractException e) {
            e.printStackTrace();
        }
    }
}

Input

image.jpg

Output

Sometimes, this simply isn't possible. Sometimes, we wish to automate a task of rewriting text from an image with our own hands.
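
Many failures of a program like OCR.java come from a wrong tessdata path or a missing image rather than from recognition itself. The sketch below checks both before any OCR call is made. It uses only the standard library; `OcrInputCheck` and `check` are hypothetical names, and the language code "eng" is an assumption (Tesseract language data is stored in files named `<language>.traineddata`).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OcrInputCheck {

    // Returns null when the inputs look usable, otherwise a short
    // description of the first problem found.
    public static String check(Path tessdataDir, String language, Path image) {
        if (!Files.isDirectory(tessdataDir)) {
            return "tessdata directory not found: " + tessdataDir;
        }
        Path trained = tessdataDir.resolve(language + ".traineddata");
        if (!Files.isRegularFile(trained)) {
            return "missing language data: " + trained;
        }
        if (!Files.isReadable(image)) {
            return "image not readable: " + image;
        }
        return null;
    }

    public static void main(String[] args) {
        // paths taken from the OCR.java example; adjust them to your machine
        String problem = check(Paths.get("D:/Tess4J/tessdata"), "eng", Paths.get("image.jpg"));
        System.out.println(problem == null ? "inputs look fine" : problem);
    }
}
```

Running this check first turns a stack trace from the native library into a readable message about what is missing.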

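Reading an unclear image with OCR

Note that the image chosen above has a very high resolution and a consistent font, which is not usually the case. More often we get an unclear or distorted image, which leads to distorted output. To deal with this, we need to apply some image processing steps to the image first.

Tesseract works best when the text is cleanly separated from the background. In practice, ensuring a good separation can be quite challenging. If the image has an unclear or distorted background, Tesseract may fail to produce high-quality output for a number of reasons. In such cases, we need to know how the image should be processed.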
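
One common way to separate text from background is to binarize the image with a global threshold. The sketch below uses Otsu's method to pick that threshold. This technique is not part of the tutorial's own code and is shown only as one possible approach; `OtsuSketch`, `otsuThreshold`, and `binarize` are hypothetical names.

```java
import java.awt.image.BufferedImage;

public class OtsuSketch {

    // Pick the gray level that maximizes the between-class variance
    // of a 256-bin histogram (Otsu's method).
    public static int otsuThreshold(int[] histogram) {
        long total = 0;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) {
            total += histogram[i];
            sumAll += (long) i * histogram[i];
        }
        long weightDark = 0;
        long sumDark = 0;
        double bestVariance = -1.0;
        int bestThreshold = 0;
        for (int t = 0; t < 256; t++) {
            weightDark += histogram[t];
            sumDark += (long) t * histogram[t];
            long weightLight = total - weightDark;
            if (weightDark == 0) continue; // no dark pixels yet
            if (weightLight == 0) break;   // no light pixels left
            double meanDark = (double) sumDark / weightDark;
            double meanLight = (double) (sumAll - sumDark) / weightLight;
            double diff = meanDark - meanLight;
            double variance = (double) weightDark * weightLight * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }

    // Pixels at or below the threshold become black, all others white.
    public static BufferedImage binarize(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        int[] gray = new int[w * h];
        int[] histogram = new int[256];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                gray[y * w + x] = luma;
                histogram[luma]++;
            }
        }
        int threshold = otsuThreshold(histogram);
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                out.setRGB(x, y, gray[y * w + x] > threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }
}
```

A single global threshold works best on evenly lit scans; images with shadows or gradients usually need a local (adaptive) threshold instead.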

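Here we create a small heuristic program that reads the RGB value of the pixel at the center of the image, enlarges the image to create a zoom effect, and rescales its pixel values (brightness and contrast) before running OCR.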
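
To see what the rescaling does to individual pixel values, here is a minimal sketch built on `java.awt.image.RescaleOp`, the class that ReadingImage.java relies on. Each color sample is multiplied by a scale factor, an offset is added, and the result is clamped to the 0 to 255 range. `RescaleSketch` and `rescale` are hypothetical names, and the sample values are chosen only for illustration.

```java
import java.awt.image.BufferedImage;
import java.awt.image.RescaleOp;

public class RescaleSketch {

    // Apply sample * scaleFactor + offset to every color sample of the image.
    public static BufferedImage rescale(BufferedImage src, float scaleFactor, float offset) {
        RescaleOp op = new RescaleOp(scaleFactor, offset, null);
        // passing null as destination lets RescaleOp allocate a compatible image
        return op.filter(src, null);
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(1, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, (100 << 16) | (50 << 8) | 200); // R=100, G=50, B=200
        int rgb = rescale(img, 1.5f, -10f).getRGB(0, 0);
        System.out.println("R=" + ((rgb >> 16) & 0xFF)
                + " G=" + ((rgb >> 8) & 0xFF)
                + " B=" + (rgb & 0xFF));
    }
}
```

A scale factor above 1 stretches contrast, while the offset shifts overall brightness up or down.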

ReadingImage.java below is the sample code that processes the image according to its RGB content.

ReadingImage.java

// importing all the required packages
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.awt.image.RescaleOp;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class ReadingImage {
    public static void processImg(BufferedImage inputImage, float scaleFactor, float offset)
            throws IOException, TesseractException {
        // Create an image buffer of 1050 x 1024 pixels, with the same
        // type as the input image, to hold the zoomed image
        BufferedImage outputImage = new BufferedImage(1050, 1024, inputImage.getType());
        // Create a 2D graphics context for drawing on the buffer
        Graphics2D grp = outputImage.createGraphics();
        // Draw the input image scaled to 1050 x 1024, starting at (0, 0);
        // the last argument is the ImageObserver, which is not needed here
        grp.drawImage(inputImage, 0, 0, 1050, 1024, null);
        grp.dispose();
        // RescaleOp multiplies every color sample by scaleFactor and then
        // adds offset, which adjusts contrast and brightness
        RescaleOp rescaleOutput = new RescaleOp(scaleFactor, offset, null);
        // Apply the rescaling and write the result to a .jpg file
        BufferedImage finalOutputimage = rescaleOutput.filter(outputImage, null);
        ImageIO.write(finalOutputimage, "jpg",
                new File("C:/Users/Yukta Malhotra/Desktop/pico.jpg"));
        // Create the Tesseract instance that will be used to perform OCR
        Tesseract tesseractInstance = new Tesseract();
        tesseractInstance.setDatapath("C:/Users/Yashneet/Desktop/Tess4J/tessdata");
        // Perform OCR on the processed image and store the result in 'str'
        String str = tesseractInstance.doOCR(finalOutputimage);
        System.out.println(str);
    }

    public static void main(String[] args) throws Exception {
        File f = new File("C:/Users/Yashneet/Desktop/pic3.jpg");
        BufferedImage inputImage = ImageIO.read(f);
        // Read the packed RGB value of the pixel at the center of the image
        double d = inputImage.getRGB(inputImage.getWidth() / 2,
                inputImage.getHeight() / 2);
        // Compare the value against fixed ranges to choose the
        // scale factor and offset that RescaleOp will use
        if (d >= -1.4211511E7 && d < -7254228) {
            processImg(inputImage, 3f, -10f);
        }
        else if (d >= -7254228 && d < -2171170) {
            processImg(inputImage, 1.455f, -47f);
        }
        else if (d >= -2171170 && d < -1907998) {
            processImg(inputImage, 1.35f, -10f);
        }
        else if (d >= -1907998 && d < -257) {
            processImg(inputImage, 1.19f, 0.5f);
        }
        else if (d >= -257 && d < -1) {
            processImg(inputImage, 1f, 0.5f);
        }
        else if (d >= -1 && d < 2) {
            processImg(inputImage, 1f, 0.35f);
        }
        // Values below -1.4211511E7 match no range, so such images are not processed
    }
}

Input

Output

Time taken to search elements keep increasing as the number of elements were increased.
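
The large negative numbers compared in ReadingImage.java are packed ARGB values. `BufferedImage.getRGB` returns one int per pixel with alpha in the top byte, followed by red, green, and blue; for an opaque pixel the alpha byte is 255, which sets the sign bit and makes the int negative. The sketch below unpacks such a value; `PackedRgbSketch` and `unpack` are hypothetical names.

```java
import java.util.Arrays;

public class PackedRgbSketch {

    // Split a packed ARGB int into {alpha, red, green, blue}.
    public static int[] unpack(int argb) {
        return new int[] {
            (argb >>> 24) & 0xFF,
            (argb >> 16) & 0xFF,
            (argb >> 8) & 0xFF,
            argb & 0xFF
        };
    }

    public static void main(String[] args) {
        // range boundaries used in ReadingImage.java
        int[] boundaries = { -14211511, -7254228, -2171170, -1907998, -257, -1 };
        for (int value : boundaries) {
            System.out.println(value + " -> " + Arrays.toString(unpack(value)));
        }
    }
}
```

For example, -1 unpacks to alpha 255 and red, green, blue all 255, which is opaque white, so the ranges in the example run roughly from dark pixels to light ones.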

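Advantages

The advantages of OCR are as follows:

It improves the efficiency of office work.
The ability to search content quickly is very useful, especially in office environments that handle large volumes of scanned documents or document input.
OCR is fast and keeps the document's content intact, which saves time.
Workflows improve because employees no longer spend time on manual work and can work faster and more efficiently.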

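Disadvantages

The disadvantages of OCR are as follows:

OCR is limited to the languages it has been trained to recognize.
Creating data for different languages and implementing them takes a lot of effort.
More work is also needed on image processing, because it is the part that matters most for OCR performance.
Even after all this work, no OCR engine provides 100% accuracy; after OCR we still have to resolve unknown characters with neighbouring machine learning methods or fix them by hand.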
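
One simple way to repair misread characters after OCR is to replace each unknown token with the closest word from a dictionary. The sketch below uses Levenshtein edit distance for that. It is a toy illustration under stated assumptions, not a method prescribed by Tesseract or Tess4J; `OcrCorrectionSketch`, `editDistance`, `closestWord`, and the sample word list are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class OcrCorrectionSketch {

    // Classic Levenshtein distance: the number of single-character
    // insertions, deletions, and substitutions needed to turn a into b.
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Return the dictionary word with the smallest edit distance to the token;
    // on a tie the earlier dictionary entry wins.
    public static String closestWord(String token, List<String> dictionary) {
        String best = token;
        int bestDistance = Integer.MAX_VALUE;
        for (String word : dictionary) {
            int distance = editDistance(token, word);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = word;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("possible", "simply", "image");
        System.out.println(closestWord("po5sible", dictionary));
    }
}
```

In practice a correction like this should only be applied when the distance is small relative to the word length, otherwise it can replace correct but unusual words.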
